Okay, this post will be colossally boring to some people. But I wanted to give you a peek at debates behind the curtain in Google’s search quality group. Here’s a policy discussion about NOINDEX and how Google should treat the NOINDEX meta tag. First, you’ll want to read this post about how Google handles the NOINDEX meta tag. You may also want to watch this video about how to remove your content from Google or prevent it from being indexed in the first place. Here’s the conclusion from my earlier blog post:
So based on a sample size of one page, it looks like search engines handle the “NOINDEX” meta tag:
– Google doesn’t show the page in any way
– Ask doesn’t show the page in any way
– MSN shows a url reference and Cached link, but no snippet. Clicking the cached link doesn’t return anything.
– Yahoo! shows a url reference and Cached link, but no snippet. Clicking on the cached link returns the cached page.
The question is whether Google should completely drop a NOINDEX’ed page from our search results, show a reference to the page, or do something in between. Let me lay out the arguments for each:
Completely drop a NOINDEX’ed page
This is how Google has behaved for the last several years, and webmasters are used to it. The NOINDEX meta tag gives a good way (in fact, one of the only ways) to completely remove all traces of a site from Google; another way is our url removal tool. That’s incredibly useful for webmasters. The only corner case is that if Google sees a link to a page A but doesn’t actually crawl the page, we won’t know that page A has a NOINDEX tag and we might show the page as an uncrawled url. There’s an interesting remedy for that: currently, Google allows a NOINDEX directive in robots.txt and it will completely remove all matching site urls from Google. (That behavior could change based on this policy discussion, of course, which is why we haven’t talked about it much.)
Webmasters sometimes shoot themselves in the foot by using NOINDEX, but if a site’s traffic from Google is very low, the webmaster will be motivated to diagnose the issue themselves. Plus we could add a NOINDEX check into the webmaster console to help webmasters self-diagnose if they’ve removed their own site with NOINDEX. The NOINDEX meta tag serves a useful role that’s different from robots.txt, and the tag is far enough off the beaten path that few people use it by mistake.
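If you want to run that kind of self-check on your own pages today, here is a minimal sketch in Python. It is not how Google or the webmaster console does it; the names `RobotsMetaParser` and `has_noindex` are made up for illustration, and it only looks at `<meta name="robots">` tags (it assumes no crawler-specific meta names and no HTTP headers are in play).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list such as "noindex, follow".
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html_text):
    """Return True if the page carries a NOINDEX robots meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    parser.close()
    return "noindex" in parser.directives


# Hypothetical pages: one left "under construction" with the tag, one without.
unfinished_page = (
    '<html><head><meta name="robots" content="NOINDEX, FOLLOW"></head>'
    "<body>Coming soon</body></html>"
)
normal_page = "<html><head><title>Home</title></head><body>Hello</body></html>"

print(has_noindex(unfinished_page))
print(has_noindex(normal_page))
```

Running a check like this over a site’s key pages before launch is one cheap way to catch a forgotten tag.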
Show a link/reference to NOINDEX’ed pages
Our highest duty has to be to our users, not to an individual webmaster. When a user does a navigational query and we don’t return the right link because of a NOINDEX tag, it hurts the user experience (plus it looks like a Google issue). If a webmaster really wants to be out of Google without even a single trace, they can use Google’s url removal tool. The numbers are small, but we definitely see some sites accidentally remove themselves from Google. For example, if a webmaster adds a NOINDEX meta tag while finishing a site and then forgets to remove the tag, the site will stay out of Google until the webmaster realizes what the problem is. In addition, we recently saw a spate of high-profile Korean sites not returned in Google because they all had a NOINDEX meta tag. If high-profile sites like these aren’t showing up in Google because of the NOINDEX meta tag, that’s bad for users (and thus for Google).
Some middle ground in between
The vast majority of webmasters who use NOINDEX do so deliberately and use the meta tag correctly (e.g. for parked domains that they don’t want to show up in Google). Users are most discouraged when they search for a well-known site and can’t find it. What if Google treated NOINDEX differently if the site was well-known? For example, if the site was in the Open Directory, then show a reference to the page even if the site used the NOINDEX meta tag. Otherwise, don’t show the site at all. The majority of webmasters could remove their site from Google, but Google would still return higher-profile sites when users searched for them.
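To make the three options concrete, here is a rough sketch of them as a single decision function. This is purely illustrative and not Google’s implementation: the `Policy` names, the `"url-only"` label, and the `site_is_well_known` flag are all assumptions, with the flag standing in for whatever signal would be used (for example, an Open Directory listing).

```python
from enum import Enum


class Policy(Enum):
    DROP_COMPLETELY = "drop"       # option 1: no trace of the page
    SHOW_REFERENCE = "reference"   # option 2: always show a bare reference
    MIDDLE_GROUND = "middle"       # option 3: depends on how well-known the site is


def result_for_noindexed_page(policy, site_is_well_known):
    """Decide what to show for a page that carries a NOINDEX meta tag.

    Returns None when nothing is shown, or "url-only" when only a
    reference to the page (no snippet, no cached copy) is shown.
    """
    if policy is Policy.DROP_COMPLETELY:
        return None
    if policy is Policy.SHOW_REFERENCE:
        return "url-only"
    # Middle ground: well-known sites keep a reference, everyone else is dropped.
    return "url-only" if site_is_well_known else None


# A parked domain (not well-known) versus a high-profile site, both using NOINDEX.
for policy in Policy:
    parked = result_for_noindexed_page(policy, site_is_well_known=False)
    high_profile = result_for_noindexed_page(policy, site_is_well_known=True)
    print(policy.name, "parked:", parked, "high-profile:", high_profile)
```

Only the middle-ground option treats the two sites differently, which is exactly the trade-off under discussion.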
What do you think?
That’s the internal discussion that we’ve been having about NOINDEX meta tags. Now I’m curious what you think. Here’s a poll:
I’d also be interested in (constructive) suggestions in the comments about how Google should treat the NOINDEX meta tag. Try to step into both a regular user’s shoes as well as the position of a site owner before leaving a comment.