Archives for May 2009

What was new at Searchology?

Google launched 3-4 new features at Searchology today. You can read about Search Options, Google Squared, Rich Snippets, or Sky Map in my previous post. But I also pay attention to the small things that Google said. I noticed several tidbits that I don’t think we’ve said in public before.

– Pat Riley mentioned a couple internal code names for spell-check features. There’s the normal “Did you mean:” spellcheck link in red at the top of the search results. Then there’s a more aggressive feature (internal Google codename: “Chameleon”) that does mid-page suggestions:

chameleon result for labor

Finally, there’s an even more aggressive feature (internal Google code name: “Spellmeleon”) for when we really think the user messed up. In that case, we’ll include a couple results for the corrected query first, then results for the user’s original query. Take the query [ipodd] for example. Our algorithms strongly suggest that the user meant to type “ipod” so we’ll include those search results first.

spellmeleon result for ipodd

By the way, if you’re a power searcher and you really did want “ipodd” then you can do the query [+ipodd] with a ‘+’ character in front of the word that you want to match exactly. Let me just say that Spellmeleon makes life *so* much better for my webspam team. Tons of spammers target typos and misspelled queries all the time. If users see a couple of valid results before they see results for a misspelled/typo query, well, let’s just say that users are exposed to a lot less webspam in Google. I’m a big fan of Spellmeleon. 🙂
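To make the Spellmeleon idea concrete, here is a minimal Python sketch of the blending behavior described above: a couple of results for the corrected query come first, followed by the results for what the user actually typed, and a leading ‘+’ turns correction off. This is only an illustration, not Google’s implementation. The function names, the 0.9 confidence threshold, and the toy index are all made up for the example.

```python
from typing import Callable, List, Tuple


def blend_results(
    query: str,
    search: Callable[[str], List[str]],
    suggest: Callable[[str], Tuple[str, float]],
    threshold: float = 0.9,   # hypothetical confidence cutoff
    lead_count: int = 2,      # "a couple" of corrected results go first
) -> List[str]:
    """Put a few results for a confident spelling correction ahead of
    the results for the query as typed."""
    # A leading '+' asks for an exact match, so skip correction entirely.
    if query.startswith("+"):
        return search(query[1:])

    original = search(query)
    corrected, confidence = suggest(query)
    if corrected == query or confidence < threshold:
        return original

    lead = search(corrected)[:lead_count]
    # Do not show the same result twice.
    return lead + [r for r in original if r not in lead]


# Toy stand-ins for a search backend and a spelling model (hypothetical data).
TOY_INDEX = {
    "ipod": ["apple.example/ipod", "reviews.example/ipod", "shop.example/ipod"],
    "ipodd": ["typo-site.example/ipodd", "reviews.example/ipod"],
}
TOY_CORRECTIONS = {"ipodd": ("ipod", 0.97)}


def toy_search(q: str) -> List[str]:
    return TOY_INDEX.get(q, [])


def toy_suggest(q: str) -> Tuple[str, float]:
    return TOY_CORRECTIONS.get(q, (q, 0.0))


print(blend_results("ipodd", toy_search, toy_suggest))
print(blend_results("+ipodd", toy_search, toy_suggest))
```

With the toy data, the [ipodd] query returns the two “ipod” results first and then the remaining typo result, while [+ipodd] returns only the results for the literal query.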

– Pat Riley also mentioned that if you do some of these search improvements in a naive way, the additional server load is equivalent to Germany and France appearing out of nowhere and sending all their daily searches to Google. So you have to do some smart things to make this search improvement viable.

– Scott Huffman revealed that mobile search results are blended between results from the mobile web and results from the regular/normal web. Makes sense, but not everyone knows that.

– Marissa Mayer mentioned that about 1 in 4 searches triggers a universal/blended search result.

– Marissa also mentioned that 40% of searches on any given day are repeat searches for that user (I’m not sure if that means repeated that day, or just repeated compared to past searches). She mentioned that to explain why SearchWiki can be useful, because if you’re repeating a search, you may want to customize the results to your taste. Marissa also said SearchWiki receives hundreds of thousands of annotations each day.

– Someone asked what Google is doing to crawl the deep web. My advice is to check out Jayant Madhavan’s paper to read more. Here’s a direct link to a copy (PDF) of the deep web crawling paper.

– Someone asked how important it is to search video with a text search query. Google did this for political videos during the election and I’d really like to see more in this area. Together with fellow Googler Wysz, I’ve made about 50 videos to answer common webmaster questions. Right now it’s a pain to create caption files for those videos. If Google could give me a rough speech-to-text transcript (with timecodes) and let me edit the transcript to correct errors, that would be fantastic. Then someone in Turkey could read my videos even if they didn’t understand English. I would love that.

– In answer to a question from Vanessa Fox, Kavi Goel mentioned that Rich Snippets will roll out slowly at first (probably beginning as a whitelist of trusted sites) but that over time, more and more sites could show up with rich snippets. You can read more about rich snippets on the Google webmaster blog or see example code. And if you’re really into RDFa or microformats or rich snippets, the folks at O’Reilly did a nice interview with two Googlers (Othar and Guha) involved with the project.

TechCrunch got some video of Google Squared. The whole video is interesting, but the part that I thought was funny was 4 minutes, 12 seconds into the video where the Google demo person signs into Google Squared and Michael Arrington does the polite “password lookaway” and looks at Steve Gillmor, who is also doing the polite password lookaway.

– Finally, Tara Calishain asks for a custom date range form (hear hear!) and then points out something I missed. Once you move into searching with date ranges, you can sort Google results by date. This opens up lots of options for power searchers. Here’s a search for [hubble telescope] with sort-by-date selected:

Sorting by date for hubble telescope

That’s pretty useful.

– If you want to see the slides from Searchology, it looks like Yvo Schaap took the time to snapshot each slide as it appeared. Until/unless Google releases the slide deck, that’s where you can see the slides unofficially. My favorite is slide #8.

Google Searchology 2009: Search Options, Google Squared, Rich Snippets

Google just finished its Searchology 2009 event. In previous years, Google has used Searchology to introduce Universal Search and Personalized Search. So what was new this year? Several things:

Google Search Options. Marissa Mayer referred to this as a handy “toolbelt” that lets you slice and dice your search results. You can do a search such as [cfl light bulb] and look above the search results for a “Show options…” link. Click on that to get a ton of useful ways to power search. You can see web results with images, like this:

Toolbelt for CFL light bulb search with images

My personal favorite is to sort by time (e.g. only show me results from the last week). That’s super-handy, and the option previously required clicking around in our Advanced Search. You can also search by genre, including forums and reviews. If you sort by reviews, Google will perform sentiment analysis and highlight interesting comments. You can also request longer snippets, see search results on a timeline, or explore more related search queries.

You can use a tool called “Wonder Wheel” to explore searches and see the results update as you click. For example, if you search for [matt cutts] then the Wonder Wheel will suggest that you might also be interested in search engine optimization. Click on that and the Wonder Wheel and the search results will change in real-time:

Wonder Wheel

Google Squared. You can type in any search and this Google Lab (scheduled to launch later this month) will try to build a useful “Square” that you can save. In the demo, if you typed in “small dogs” then Google would try to return types of small dogs, along with facts like how much they weigh. It’s easy to add a row to the Square, so you could add a row for Lhasa Apso and Google will try to infer the relevant facts from the web. You can also add new columns, e.g. if you type “energy level” then Google will look for corroborating facts across the web and try to guess the energy level of each type of dog. I can personally attest that Google Squared can be as fun as Google Maps–you can easily burn an hour just typing in random things to see what Google can do for that search.

Rich Snippets. See the official webmaster blog for more info, but this one is destined to be a favorite for webmasters. Essentially, you use some open standards (RDFa and microformats in the initial launch) to add some additional markup to your web pages. The markup is pretty simple and you don’t need to register with Google. Then when Google thinks it will help users, we show a “rich snippet” that has more information than a typical search snippet. Here’s a result on Yelp for a yogurt place, for example:

Rich snippets

Note a couple quick points. The markup annotates existing text that’s already on the page, and this richer markup exists out on the web. That means that any search engine can make their own rich snippets (there’s no proprietary data that only goes to Google). I like that Rich Snippets relies on open standards, that the markup is simple, and that the data is out on the web; it’s not locked up by Google in any way. I would expect Google to roll this out cautiously at first (much as we did with Sitelinks), with more sites seeing rich snippets over time.
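To show what “markup that annotates existing text” can look like, here is a small Python sketch. It is a hedged illustration, not Google’s parser: the sample page is invented, the class names (hreview, item, fn, rating, reviewer) follow the hReview microformat convention, and the ReviewExtractor class is a hypothetical helper built on Python’s standard html.parser module. The point is that the visible text stays the same and any crawler can read the annotations.

```python
from html.parser import HTMLParser

# Invented sample page: ordinary visible text wrapped in hReview-style classes.
SAMPLE_PAGE = """
<div class="hreview">
  <span class="item"><span class="fn">Example Yogurt Shop</span></span>
  Rated <span class="rating">4.5</span> out of 5 by
  <span class="reviewer">A. Customer</span>.
</div>
"""


class ReviewExtractor(HTMLParser):
    """Collect the text inside elements whose class names a known field.

    Assumes well-nested markup with no void elements such as <br>.
    """

    FIELDS = {"fn", "rating", "reviewer"}

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._open = []  # one entry per open tag: a field name or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        field = next((c for c in classes if c in self.FIELDS), None)
        self._open.append(field)

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        field = self._open[-1] if self._open else None
        if field and data.strip():
            self.fields[field] = data.strip()


extractor = ReviewExtractor()
extractor.feed(SAMPLE_PAGE)

# Build a one-line "rich" summary from the annotated text.
snippet = "{fn} - rated {rating}/5 - reviewed by {reviewer}".format(**extractor.fields)
print(snippet)
```

Nothing in the sketch depends on registering with anyone. The data lives in the page itself, which is the property the paragraph above is pointing at.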

Google SkyMap. Google also announced SkyMap, an Android app that lets you star gaze. With GPS, an accelerometer, and a compass, SkyMap can tell you what stars you’re pointing your phone at. You can also search for stars and the application will guide you until your phone points in the right direction.

If you’d like to read more, Techmeme has coverage and Danny Sullivan live-blogged the Searchology event.

On vacation for a week or so

Just to let you know, I’m on vacation for a week or so. If you send me email or tweet in my direction, don’t expect a reply for a while. 🙂

5 things you (probably) don’t know about me

(I was rooting around and found this leftover post from 2006 and figured I’d throw it out here.)

It looks like blog tag has come to the search bloggers. I’ve been tagged by so many people that I yield and surrender obscure facts about me.

  1. When I was growing up in Eastern Kentucky, there wasn’t always a lot to do. In high school, we once played Car Tag. In real tag, you chase people around until you can catch them. Car tag is played the same way. In order to win, you have to touch your car to the other person’s car. As I recall, I won at car tag. Please do not try this at home. Now we have things like the web to avoid boredom.
  2. My first computer was a Timex/Sinclair ZX81 that my Dad assembled from a kit. When we maxed out the 2K memory, he bought us a 16K expansion module. My second computer was a Commodore 64. I was a Commodore fan long after it was clear that IBM PCs would dominate that decade of computing.
  3. Growing up, my mother was an evangelical Christian and my father was a physics professor. As a result, I learned to have a healthy respect for people with different opinions and perspectives.
  4. In my freshman year of college, I was the eight-ball champion for my dorm. There was another guy who was better than me, but he had bad luck in the final game. On a good day at Google, I could sometimes beat Google Fellow Jeff Dean, who is a sharp guy with a pool cue. Now I haven’t played in years, so I probably suck big rocks at pool. Huh, Danny likes billiards too. Danny, we’re clearly just going to go on a bowling/pool frenzy when you make it back to the valley. 🙂
  5. One of my cats, Emmy, likes nooks and crannies. Her favorite is curling up in a box or bag:
     Emmy in a grocery bag

Note: Back in 2006 I was going to tag a few people, e.g. Jim Allchin, but Allchin has left Microsoft and probably has better things to do now. Those are the hazards of doing blog posts ~3 years too late!

Thinking about thunder

I read an interesting blog post by Mike Markson, VP of Marketing for Blekko, which is the working name for a new search engine planned to launch to the public in a few months.

The title of Mike’s post was “Google Likes To Steal Other’s Thunder,” and he mentions several anecdotes to back up that idea. I was going to leave a comment on the post, but then Barry wrote about it at Search Engine Land, so I thought I’d go ahead and do a full blog post. I have actual knowledge (gasp!) of some of these incidents, so I can probably clear up a few misconceptions. Let’s walk through Mike’s anecdotes:

* This past Tuesday, Wolfram Alpha announces its structured data search product. On the same day, Google announced its new structured data product.

I wasn’t familiar with this one, so I dropped an email to Ola Rosling, the Googler who wrote the blog post announcement. It turns out that there’s a straightforward reason for the timing: the blog post was planned for a different day, but an early/unexpected baby arrival resulted in this blog post being rescheduled.

By the way, Wolfram|Alpha launches later this month and it sounds like a terrific idea. Any website that can blend large-scale data with the processing power of something like Mathematica is just going to make the web a better place–I can’t wait to play with it.

* July 28, 2008, so called Google killer Cuil launched its search engine. It claimed that their index of 120B documents was 3x that of any search engine. Three days before though, Google announced it knew of 1 trillion URL’s.

I happened to be on the email thread that resulted in Google’s blog post, so I know that we passed one trillion urls seen and decided to do a blog post about it in early-to-mid June 2008, well over a month before Cuil launched and emphasized their index size.

* June 3, 2008, Wikia Search launched a feature that allows users to add and delete URL’s to search results. July 16, 2008, Google announced that it is bucket testing similar features. The features went live a few months later.

Sorry, but Google was testing our add/delete url feature months before Wikia. TechCrunch noticed Google’s add/delete feature as early as November 2007. Here’s an image of the feature from back in 2007:


* February 25, 2009, Cuil announced it is integrating longer snippets into its results. March 24, 2009, Google announced…you guessed it….longer snippets.

Sorry, but Google was doing longer snippets months before Cuil blogged about longer snippets. See for example this blog post from December 2008. Here’s an image from 2008:

Longer/adaptive snippets

Don’t get me wrong: I think Google can move quickly, and I don’t think that’s a bad thing. But it doesn’t seem fair to say we’re trying to steal someone’s thunder if (for example) our longer snippets came first. 🙂

Update: Mike added an update to his post. I’m looking forward to Blekko, because Rich Skrenta and his crew are smart folks. Watch an interview with Rich from a few weeks ago for more info.