Archives for February 2008

An easy way to add new features to Google

Have you ever wanted to add a new feature to Google’s search results? There’s a really nice way to do it right now. If you’re not familiar with this functionality, it’s called a Subscribed Link, and it lets you “create custom search results that users can add to their Google search pages. You can display links to your services for your customers, provide news and status information updated in near-real-time, answer questions, calculate useful quantities, and more.” That page has a whole list of different ways to add new features to Google’s search results:

* Create search results specific to your product, service, or expertise.
* Design a basic version in minutes to see how it works.
* Build a dynamic version using XML, TSV, or RSS files or feeds.
* Include images in your Subscribed Links.
* Include Google Gadgets in your Subscribed Links.
* Test your Subscribed Links interactively and get debugging messages.
* Define query patterns using lists of keywords or regular expressions.
* Invoke the calculator to help construct your results.

I like that Google provides an open system to add functionality to our search results. If this sounds interesting to you, check out this blog post by Google OS (an unofficial blog), read through the Subscribed Links developer guide, or check out the Subscribed Links FAQ.

Let’s walk through an example. I often need to know what my IP address is. Usually I go to Google, search for [ip address], and click on one of the top results. That works okay, but I discovered that there’s an even easier way. Go to this page and click on the “Subscribe” button.

Now when you go to Google and type a query like [my ip], you’ll see the answer right in your search results, like this:

(Image: Find my ip address)

I painted out my actual IP address, but you get the idea. Now if only aruljohn.com would add the query [ip address] to the list of queries that trigger a subscribed link, that would let me be lazy and continue doing the query that I’m used to. 🙂
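To make the idea of trigger queries concrete, here is a minimal Python sketch of matching queries against a list of regular expressions. It is not the Subscribed Links pattern syntax, and the patterns themselves are hypothetical: the post doesn't give the actual list of queries that the aruljohn.com subscribed link uses.

```python
import re

# Hypothetical trigger patterns; the real list used by the aruljohn.com
# subscribed link is not given in the post.
TRIGGER_PATTERNS = [
    re.compile(r"^(what\s+is\s+)?my\s+ip(\s+address)?$", re.IGNORECASE),
]


def triggers(query, patterns=TRIGGER_PATTERNS):
    """Return True if the whitespace-normalized query matches any pattern."""
    normalized = " ".join(query.split())
    return any(p.search(normalized) for p in patterns)


# Adding one more pattern would cover the [ip address] query as well.
EXTENDED_PATTERNS = TRIGGER_PATTERNS + [
    re.compile(r"^ip\s+address$", re.IGNORECASE),
]
```

With these assumed patterns, `triggers("my ip")` is true and `triggers("ip address")` is false until the extra pattern in `EXTENDED_PATTERNS` is passed in.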

If you’d like to add some new functionality to Google, why not try it for yourself today? I made a simple subscribed link that looks like this:

(Image: Example subscribed link)

in about a minute. You can make a subscribed link out of feeds very quickly, and it looks like you can even add your own flexible gadget to Google’s search results, like this:

(Image: Example gadget in search results)

By the way, I originally wrote this post a little while ago focusing on how to find out your IP address with a specific subscribed link. After Yahoo announced their “SearchMonkey” project tonight (congrats to the Yahoo folks!), I figured I’d add in some details about Google’s Subscribed Links and how to make a rich snippet result using Subscribed Links.

What should NOINDEX do?

Okay, this post will be colossally boring to some people. But I wanted to give you a peek at debates behind the curtain in Google’s search quality group. Here’s a policy discussion about NOINDEX and how Google should treat the NOINDEX meta tag. First, you’ll want to read this post about how Google handles the NOINDEX meta tag. You may also want to watch this video about how to remove your content from Google or prevent it from being indexed in the first place. Here’s the conclusion from my earlier blog post:

So based on a sample size of one page, it looks like search engines handle the “NOINDEX” meta tag:
– Google doesn’t show the page in any way
– Ask doesn’t show the page in any way
– MSN shows a url reference and Cached link, but no snippet. Clicking the cached link doesn’t return anything.
– Yahoo! shows a url reference and Cached link, but no snippet. Clicking on the cached link returns the cached page.
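For anyone who wants to check their own pages, here is a minimal Python sketch that detects a NOINDEX robots meta tag in a page's HTML using only the standard library. It is an illustration, not how any search engine actually parses pages: it only looks at `<meta name="robots">`, ignores crawler-specific meta names and HTTP headers, and the names `RobotsMetaParser` and `has_noindex` are made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

Calling `has_noindex(page_source)` on a page's HTML is a quick way to spot a tag that was left in place by accident.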

The question is whether Google should completely drop a NOINDEX’ed page from our search results, show a reference to the page, or do something in between. Let me lay out the arguments for each:

Completely drop a NOINDEX’ed page

This has been our behavior for the last several years, and webmasters are used to it. The NOINDEX meta tag gives a good way (in fact, one of the only ways) to completely remove all traces of a site from Google; another way is our url removal tool. That’s incredibly useful for webmasters. The only corner case is that if Google sees a link to a page A but doesn’t actually crawl the page, we won’t know that page A has a NOINDEX tag and we might show the page as an uncrawled url. There’s an interesting remedy for that: currently, Google allows a NOINDEX directive in robots.txt and it will completely remove all matching site urls from Google. (That behavior could change based on this policy discussion, of course, which is why we haven’t talked about it much.)
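As a rough illustration of the robots.txt idea, here is a Python sketch that checks whether a url path matches a NOINDEX line. The post doesn't describe the exact syntax or matching rules, so this assumes lines of the form `Noindex: /path` with simple prefix matching and ignores user-agent groups entirely; the function names are hypothetical.

```python
def noindex_prefixes(robots_txt):
    """Collect path prefixes from lines of the assumed form 'Noindex: /path'."""
    prefixes = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "noindex" and value.strip():
            prefixes.append(value.strip())
    return prefixes


def is_noindexed(path, robots_txt):
    """Return True if the path starts with any assumed Noindex prefix."""
    return any(path.startswith(p) for p in noindex_prefixes(robots_txt))
```

Unlike the meta tag, a rule like this could apply to a url that was never crawled, which is what makes it a remedy for the corner case above.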

Webmasters sometimes shoot themselves in the foot by using NOINDEX, but if a site’s traffic from Google is very low, the webmaster will be motivated to diagnose the issue themselves. Plus we could add a NOINDEX check into the webmaster console to help webmasters self-diagnose if they’ve removed their own site with NOINDEX. The NOINDEX meta tag serves a useful role that’s different than robots.txt, and the tag is far enough off the beaten path that few people use the NOINDEX tag by mistake.

Show a link/reference to NOINDEX’ed pages

Our highest duty has to be to our users, not to an individual webmaster. When a user does a navigational query and we don’t return the right link because of a NOINDEX tag, it hurts the user experience (plus it looks like a Google issue). If a webmaster really wants to be out of Google without even a single trace, they can use Google’s url removal tool. The numbers are small, but we definitely see some sites accidentally remove themselves from Google. For example, if a webmaster adds a NOINDEX meta tag while finishing a site and then forgets to remove the tag, the site will stay out of Google until the webmaster realizes what the problem is. In addition, we recently saw a spate of high-profile Korean sites not returned in Google because they all have a NOINDEX meta tag. If high-profile sites like

http://www.police.go.kr/main/index.do (the National Police Agency of Korea)
http://www.nmc.go.kr/ (the National Medical Center of Korea)
http://www.yonsei.ac.kr/ (Yonsei University)

aren’t showing up in Google because of the NOINDEX meta tag, that’s bad for users (and thus for Google).

Some middle ground in between

The vast majority of webmasters who use NOINDEX do so deliberately and use the meta tag correctly (e.g. for parked domains that they don’t want to show up in Google). Users are most discouraged when they search for a well-known site and can’t find it. What if Google treated NOINDEX differently if the site was well-known? For example, if the site was in the Open Directory, then show a reference to the page even if the site used the NOINDEX meta tag. Otherwise, don’t show the site at all. The majority of webmasters could remove their site from Google, but Google would still return higher-profile sites when users searched for them.
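The three options can be written down as a small decision function. This Python sketch only restates the policies as described in this post; the names `Policy`, `Listing`, and `listing_for` are made up for illustration, and `is_well_known` stands in for whatever signal would be used (for example, being listed in the Open Directory).

```python
from enum import Enum


class Policy(Enum):
    DROP = "completely drop NOINDEX'ed pages"
    REFERENCE = "show a reference to NOINDEX'ed pages"
    MIDDLE_GROUND = "show a reference only for well-known sites"


class Listing(Enum):
    FULL = "full result"
    REFERENCE_ONLY = "url reference only"
    HIDDEN = "not shown"


def listing_for(has_noindex, is_well_known, policy):
    """Return how a page would be listed under the given policy."""
    if not has_noindex:
        return Listing.FULL
    if policy is Policy.DROP:
        return Listing.HIDDEN
    if policy is Policy.REFERENCE:
        return Listing.REFERENCE_ONLY
    # Middle ground: only well-known sites keep a reference.
    return Listing.REFERENCE_ONLY if is_well_known else Listing.HIDDEN
```

Under the middle-ground policy, a parked domain with NOINDEX stays hidden, while a well-known site that uses the tag still gets a url reference.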

What do you think?

That’s the internal discussion that we’ve been having about NOINDEX meta tags. Now I’m curious what you think.

I’d also be interested in (constructive) suggestions in the comments about how Google should treat the NOINDEX meta tag. Try to step into both a regular user’s shoes as well as the position of a site owner before leaving a comment.
