One of the wonderful things about a search conference like SMX Advanced is that it gives us a chance to finish a lot of things we’d been meaning to do. Google just added a bunch of nice documentation in various places. We even did it in official places, which is much better than doing it on my personal blog. Here are a few of the things that I know we’ve done recently:
One of the things that I like about robots.txt and the Robots Exclusion Protocol (REP) is that it’s well-supported by all the major search engines and has been for years. But more documentation is a good thing, and several of the major search engines recently did blog posts about how they support robots.txt and REP. You can read Google’s robots.txt/REP post, the Microsoft post, or the post from Yahoo.
By the way: if you haven’t seen it, Google also produced a really nice booklet about robots.txt for publishers (PDF link). This PDF is perfect for regular folks who don’t live and breathe search 24 hours a day.
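If you'd like to see how a crawler reads robots.txt rules, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The rules, paths, and the `example.com` URLs are made up for illustration. Also note that this module implements the basic exclusion rules and may not match every extension that a particular search engine supports.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for an example site (not taken from any real site).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /drafts/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Googlebot has its own group, so only that group's rules apply to it.
print(parser.can_fetch("Googlebot", "https://example.com/drafts/post.html"))
print(parser.can_fetch("Googlebot", "https://example.com/private/page.html"))

# Any other crawler falls back to the "*" group.
print(parser.can_fetch("SomeOtherBot", "https://example.com/private/page.html"))
```

One design point worth noticing: a crawler that matches a specific `User-agent` group follows that group and ignores the `*` group, which is why the hypothetical Googlebot group above does not inherit the `/private/` rule.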
We do appreciate getting suggestions and feedback from users, webmasters, and SEOs. I’m especially interested when people want to report spam, including paid text links. Google’s position on paid links that pass PageRank is well-known, because we’ve been pretty clear on the subject.
In a blog post earlier today, Reid Yokoyama put out a renewed call for spam reports. He gave a peek into the numbers of how many reports Google receives and how we prioritize (here’s a hint: our authenticated spam report form gets higher priority). Read his entire blog post if you’d like to hear more about webspam, paid links, and user feedback.
Just one additional note: we accept spam reports not just in English, but in many languages. For example, I’d love to get spam reports in Russian, spam reports in Turkish, spam reports in Romanian, or even spam reports in Arabic.
People ask me about cloaking software and technology all the time, usually to find out how risky it is to use a cloaking script when Googlebot visits (the short answer: it’s very risky). We did a blog post (and a video!) to describe the difference between IP delivery (serving different content to users based on their IP address) and geolocation (serving different content based on the user’s location). IP-based geolocation is a specific type of IP delivery that is within Google’s quality guidelines. Then we describe cloaking, which is serving different content to users than to Googlebot. I highly recommend that you read the post and watch Maile’s video for more information. If you’re interested in herding search engine bots in a whitehat, low-risk way, that post will tell you what Google considers cloaking.
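To make the distinction concrete, here is a rough sketch in Python. Everything in it is hypothetical: the IP ranges (taken from the reserved documentation blocks), the country mapping, and the function names are mine, not anything from Google's post. It also simplifies cloaking down to a user-agent check, although real cloaking scripts can key on crawler IP addresses too.

```python
import ipaddress

# Hypothetical IP-range-to-country table, purely for illustration.
COUNTRY_BY_NETWORK = {
    ipaddress.ip_network("203.0.113.0/24"): "FR",
    ipaddress.ip_network("198.51.100.0/24"): "US",
}

GREETINGS = {"FR": "Bonjour", "US": "Hello"}


def country_for(ip: str) -> str:
    """Look up the visitor's country from their IP address (default: US)."""
    address = ipaddress.ip_address(ip)
    for network, country in COUNTRY_BY_NETWORK.items():
        if address in network:
            return country
    return "US"


def serve_geolocated(ip: str, user_agent: str) -> str:
    """IP-based geolocation: the response depends only on location.

    The user agent is ignored, so a crawler coming from a given IP sees
    exactly what a regular user from that IP would see.
    """
    return GREETINGS[country_for(ip)]


def serve_cloaked(ip: str, user_agent: str) -> str:
    """Cloaking (the risky thing): the crawler gets special treatment."""
    if "Googlebot" in user_agent:
        return "Special page shown only to the crawler"
    return GREETINGS[country_for(ip)]
```

The test I'd apply to any setup like this: hold the IP address fixed and swap the user agent. If the response changes only because the visitor is a search engine bot, that is the pattern the post describes as cloaking.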
Earlier this year, Li Evans pinged us with a good observation. We’ve answered a ton of questions about nofollow in various places around the blogosphere. Li asked us to distill the important bits about nofollow into a single page and place it in Google’s HTML documentation. We just pushed that live, so you can read more about the nofollow attribute if you’re interested. Thanks for suggesting that, Li.
Better definition for doorway pages
Michael Martinez was a little less polite than Li. He essentially said that Google’s documentation had a pretty sucky definition of a doorway page. Fair point. So we revamped the definition of a doorway page to be clearer:
Doorway pages are typically large sets of poor-quality pages where each page is optimized for a specific keyword or phrase. In many cases, doorway pages are written to rank for a particular phrase and then funnel users to a single destination.
I think that definition is much better than our old definition of a doorway page. Thanks for the suggestion, Michael.
Primarily for users
One of our quality guidelines used to say:
Make pages for users, not for search engines. Don’t deceive your users or present different content to search engines than you display to users, which is commonly referred to as “cloaking.”
We recently clarified that guideline to say “Make pages primarily for users, not for search engines” (emphasis mine). Why add “primarily”? As one of the main authors of those quality guidelines, I can tell you that the intent of that guideline was mainly to discourage cloaking (which is doing something different for search engines than for regular users). Some people have misinterpreted that guideline as “You can’t do a single thing for search engines that you wouldn’t do for your users,” and that was not my intent when I wrote that guideline. Instead, the spirit of that guideline is that users should be the primary consideration. But it is fine to do some things that don’t affect users but do help search engines.
I’ll run through four quick examples of things that are perfectly okay to do for search engines, but that you wouldn’t automatically do for users:
- Adding a nofollow attribute to a link doesn’t affect users, but can serve as a useful indicator to search engines that you don’t necessarily want PageRank to flow through that link.
- Adding a meta description. When a user visits a web page, their browser doesn’t show the meta description data in any way. But you can suggest to search engines to show a particular snippet by using the meta description wisely.
- You can tell Google your preference on www vs. non-www. Again, that’s probably not something that users see or that directly affects them, but it’s still a smart thing to do.
- Submitting a Sitemap to Google or making it available to other search engines is not an action that you’d take for users, because users don’t see Sitemaps. But it can be a smart move because search engines can do better if you provide that information.
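Here is a small sketch of what three of those examples look like in practice, written in Python with only the standard library. The page markup, the URLs, and the names `SearchSignals` and `build_sitemap` are all hypothetical. The point is that the meta description, the `rel="nofollow"` attribute, and the Sitemap are easy for a program to read even though a visitor never sees them.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical page: the meta description and rel="nofollow" are invisible
# to a visitor but easy for a crawler to read.
PAGE = """<html><head>
<meta name="description" content="A short summary a search engine may use as the snippet.">
</head><body>
<a href="https://example.com/partner" rel="nofollow">Partner</a>
<a href="https://example.com/about">About</a>
</body></html>"""


class SearchSignals(HTMLParser):
    """Collect the meta description and any nofollow links from a page."""

    def __init__(self):
        super().__init__()
        self.description = None
        self.nofollow_links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content")
        elif tag == "a":
            rel_values = (attributes.get("rel") or "").lower().split()
            if "nofollow" in rel_values:
                self.nofollow_links.append(attributes.get("href"))


def build_sitemap(urls):
    """Build a minimal Sitemap document listing the given URLs."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


signals = SearchSignals()
signals.feed(PAGE)
print(signals.description)
print(signals.nofollow_links)
print(build_sitemap(["https://example.com/", "https://example.com/about"]))
```

A real Sitemap can carry optional fields such as a last-modified date for each URL. This sketch keeps only the required `loc` element.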
Just to be clear: Users are vitally important. I still recommend that you keep your users in mind at all times as you design and create a site. We added the word “primarily” to indicate that people can do additional things that users don’t see but that help search engines do better at crawling/indexing/serving your site.
I can’t believe I just wrote 350+ words about a one-word change to our quality guidelines. But I hope that gives some background and context.
No search engine is perfect, and everyone will have different opinions about what a search engine should focus on. But I appreciate the feedback that we get from users, webmasters, and SEOs. I know that the suggestions that we get help to make Google a better search engine. If you see me at SMX Advanced, please walk right up and say hello. I promise that I’m not frightening, and I’d like to hear where you think Google needs to improve. There will also be a bunch of other Googlers at the conference — don’t be shy about approaching them, either.