Archives for June 2008

Two search tidbits

At SMX a couple weeks ago Eric Enge and I did a 20-25 minute interview. The interview transcript is now out in case you want to read through it. We discuss some of the ways to get links that are likely to stand the test of time:

Those links are typically given voluntarily. It is an editorial link by someone, and it’s someone that’s informed. They are not misinformed, they are not tricked; there is no bait and switch involved. It’s because somebody thinks that something is so cool, so useful, or so helpful that they want to make little sign posts so that other people on the web can find that out.

I mentioned link-generating methods from original research to case studies, a service, or even an open-source product. We also discussed widgetbait and some of the criteria on whether Google would consider something spammy or not. We talked about when reciprocal links can get excessive. We discussed a good rule of thumb of when to link out to other sites (when it’s good for your users).

At the very end of the interview, I took the opportunity to send some props toward SocialSpark. As opposed to some services where paid posts pass PageRank, SocialSpark posts require nofollow so that any paid links don’t pass PageRank. If paid posts respect that requirement from SocialSpark, they’d be within our webmaster guidelines. I’ve noticed once or twice where an advertiser tried to get an extra non-nofollow’ed link in a SocialSpark post, but when I’ve mentioned those 1-2 examples, IZEA has taken action to correct that. So we’ll continue to keep an eye on things, but I wanted to mention the progress that I saw in SocialSpark.
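If you publish paid posts yourself, one way to double-check the nofollow requirement is to scan a post's HTML for links that lack rel="nofollow". The sketch below is only an illustration using Python's standard html.parser module; it is not how Google or IZEA checks anything, and the function name and the sample markup are made up for the example.

```python
from html.parser import HTMLParser


class FollowedLinkFinder(HTMLParser):
    """Collects the href of every <a> tag whose rel attribute lacks 'nofollow'."""

    def __init__(self):
        super().__init__()
        self.followed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return  # anchors without an href are not links
        # rel may hold several space-separated tokens, e.g. "nofollow noopener"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" not in rel_tokens:
            self.followed.append(href)


def links_without_nofollow(html_text):
    """Hypothetical helper: return hrefs of links that would pass PageRank."""
    finder = FollowedLinkFinder()
    finder.feed(html_text)
    finder.close()
    return finder.followed


# Made-up sample post: the first link is nofollow'ed, the second is not.
sample_post = (
    '<p><a href="http://sponsor.example/" rel="nofollow">sponsor</a> and '
    '<a href="http://extra.example/">extra link</a></p>'
)
print(links_without_nofollow(sample_post))
```

Any href this prints would be worth a second look before the post goes live.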

By the way, we’re currently caught up on paid link reports, so if you know of sites (maybe in your search niche) that appear to be selling or buying paid links that pass PageRank, it’s a great time to let us know. Use the authenticated paid link spam report form and someone will investigate the report. We’ll be concentrating primarily on the sellers, but if you send us a site that appears to be buying links that pass PageRank it’s trivial for us to look up all the backlinks for that site to find potential sellers and work from there. That feedback will also help us improve our algorithms, so thanks in advance for any feedback you want to provide.

What’s the second search tidbit? Last Thursday I was a guest host for the Daily SearchCast. I thought it was going to be a slow news day, but when I woke up that morning Jeremy Zawodny had just announced that he was leaving Yahoo. So Danny Sullivan and I ended up having plenty to talk about.

Check your search box for XSS exploits

Just a quick reminder that websites should check for XSS holes on their site, especially freeform text input such as search boxes. Even big sites can have these issues with XSS and escaping user input. (Note: don’t click on these search results.)
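As a minimal sketch of what "escaping user input" means for a search box: if your results page echoes the query back to the visitor, escape it before it goes into the HTML. The example uses Python's standard html.escape; the function name and the page fragment are hypothetical, and a real site would normally lean on its template engine's auto-escaping rather than hand-built strings.

```python
import html


def render_search_heading(query):
    """Return an HTML fragment that echoes the user's query safely."""
    # html.escape replaces &, < and >, and with quote=True also " and ',
    # so injected markup is displayed as text instead of being executed.
    safe_query = html.escape(query, quote=True)
    return f"<h1>Results for: {safe_query}</h1>"


malicious = '<script>alert("xss")</script>'
print(render_search_heading(malicious))
```

With the escaping in place, the injected script tag shows up as harmless text in the page source.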

If you’ve noticed that your rankings in Google seem to be affected, you might consider a few searches on your site to see if anyone has injected spammy or porn content on your site. If your domain was example.com, you might want to run a few queries such as [site:example.com porn] or [site:example.com biaxin] or [site:example.com viagra] to see whether you run across unexpected results.
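If you look after several domains, a tiny helper can generate those queries for you to paste into the search box. This is just a convenience sketch; the function name is made up, and the default terms are simply the three examples above.

```python
def build_site_queries(domain, terms=("porn", "biaxin", "viagra")):
    """Build one site: query per suspicious term for the given domain."""
    return [f"site:{domain} {term}" for term in terms]


for query in build_site_queries("example.com"):
    print(query)
```

Add whatever other spammy terms are common in your niche to the list.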

The Google security blog has written about XSS holes and exploits before and how to protect yourself. We’ve also written about protecting your site and cleaning up a hacked site before.

Added: (Switching from XSS to pure hacked sites for a moment.) Make sure to change your admin password if you update (say) your WordPress installation. Sometimes hackers are smart enough to save your password and come back even after you’ve fully patched your system. I tend to change my admin password at least every time I upgrade my version of WordPress.

Don’t end your urls with .exe

Sometimes at a conference people will ask me “Does it matter what extension I use for my pages? Does Google prefer .php over .asp, or .html over .htm?” And my answer is “We’re happy to crawl all of these file extensions. It doesn’t matter what you choose between any of those.”

Usually I also try to insert a reminder at the end of my reply such as “But there are some file extensions that are mostly binary data, such as .exe, where the vast majority of the time the data would be meaningless blobs, so there are a few extensions to avoid. If your files are named example.dll or example.bin and you don’t see Google crawling pages with that file extension, I’d recommend changing your file extension to something else.”

There’s a simple way to check whether Google will crawl things with a certain filetype extension. If you do a query such as [filetype:exe] and you don’t see any urls that end directly in “.exe” then that means either 1) there are no such files on the web, which we know isn’t true for .exe, or 2) Google chooses not to crawl such pages at this time — usually because pages with that file extension have been unusually useless in the past. So for example, if you query for [filetype:tgz] or [filetype:tar], you’ll see urls such as “papers.ssrn.com/pape.tar?abstract_id” that contain “.tar” but no files that end directly in .tar. That means that you probably shouldn’t make your html pages end in .tar.

The SEOmoz folks stumbled across this when they had a url that ended with “/web2.0”. It looks like previously they had a url that looked like “/web2.0/” (note the trailing slash), which we were happy to crawl/index/rank. But when their linkage shifted enough that “/web2.0” became their preferred url, Google wouldn’t crawl urls ending in “.0”, so the page became uncrawled.
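To see why the trailing slash matters, here is a rough sketch of how a program might read the "extension" off a url path. It uses Python's standard urlsplit and posixpath.splitext and is only an assumption about how such a check could work, not a description of Google's crawler; the example.com urls are placeholders.

```python
import posixpath
from urllib.parse import urlsplit


def url_extension(url):
    """Return the extension of the last path segment, or '' if there is none."""
    path = urlsplit(url).path  # drops the query string and fragment
    return posixpath.splitext(path)[1]


print(url_extension("http://example.com/web2.0"))     # .0
print(url_extension("http://example.com/web2.0/"))    # (empty string)
print(url_extension("http://example.com/page.html"))  # .html
```

Without the trailing slash, “/web2.0” looks like a file with a “.0” extension; with it, the path ends in a directory and there is no extension to judge.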

Even though urls ending in “.0” are often binary and therefore end up getting dropped later in our indexing pipeline, it’s always good to revisit old decisions and respond to feedback by running new tests. So just in the last day or so, we switched it so that Google is willing to crawl pages that end in “.0”. This will help the small number of pages out on the web that want to serve up HTML pages with a “.0” extension.

You can see the results trickling into Google with a bunch of “X hours ago” fresh results:

[Image: .0 file extension]

So my quick takeaways would be:
– Why Google doesn’t crawl some filetype extensions (when we’ve seen good evidence that the extensions are mostly binary or otherwise not-very-indexable files).
– An easy way to use the filetype: operator, so that you can decide whether to avoid a particular filename extension yourself.
– Google is willing to revisit old decisions and test them again, which is what we’re doing with the “.0” filetype extension.

I hope that helps a few people who are considering unusual filetype extensions of their own. 🙂

Jeremy leaves Yahoo!

Jeremy Zawodny is leaving Yahoo!. That’s pretty huge news.

While Jeremy and I have playfully jousted in the past, I have nothing but respect for Jeremy, to the point where we joked for April Fool’s Day a couple years ago about switching blogs. I’ve enjoyed being on search panels with him before, and he’s been a fantastic communicator on Yahoo’s behalf. I picked up several blogging tips from listening to him, including how to avoid getting a “talking to” by your company. The secret, according to Jeremy’s presentation a few years ago, was not to talk much about your CEO. 🙂

I think a ton of good karma will follow Jeremy wherever he goes next. It sounds like it has something to do with open-source, which is great for him and (I’m guessing) for open-source as well. My hunch has always been that Jeremy pushed hard for Yahoo’s open strategy behind the scenes, and I hope that Yahoo continues that effort. Jeremy, good luck in whatever comes next, and keep us posted!

Review: Stud-4-Sure stud finder

I wanted to do a few short blog posts about products that I really like and that work well. First up is a stud finder that I found. It’s called the Stud4Sure stud finder (not an affiliate link). It’s more accurate than fancy electronic studfinders and much more accurate than knocking on the wall.

This little sucker is essentially some magnets embedded in a piece of plastic. Sometimes the best solution is the easiest, eh? This thing is simplicity itself. Just rub it over a wall and it will “stick” over the nails that you’ll find in studs. Here’s what it looks like sticking to one of my walls:

[Image: Stud finder]

It costs 15 bucks, which is pretty cheap in my book. Highly recommended.
