We made a video about how Google handles the robots.txt file. You can watch it if you want:
This answers a couple of questions, such as:
- Why is my URL showing up in Google when I blocked it in robots.txt? Did you fetch that URL?
- How do I make that URL disappear from Google?
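The short version, as I understand it: robots.txt controls whether a crawler may fetch a page, not whether the bare URL can show up as a reference. Here is a minimal sketch of the "did you fetch that URL?" check using Python's standard `urllib.robotparser`. The robots.txt contents and the example.com URLs are made up for illustration, and this is not Google's actual crawler logic.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site (an assumption for illustration).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A well-behaved crawler skips the blocked URL and never sees its contents.
blocked = may_crawl(ROBOTS_TXT, "Googlebot", "https://example.com/private/page.html")
allowed = may_crawl(ROBOTS_TXT, "Googlebot", "https://example.com/public/page.html")
print(blocked, allowed)
```

Because a blocked page is never fetched, a crawler also never sees a `noindex` meta tag on it. If you want a URL gone entirely, the page generally has to stay crawlable so the `noindex` can be read.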
Googler Douwe Osinga has a great personal project that demonstrates the Google Chart API. Just by clicking a few boxes, you can make an image to show the countries (or states in the USA) that you’ve been to. Here’s where I’ve been in the United States:
Clearly I need to do a trip across the northern part of the country. If you run a website, the Google Chart API is a great, free way to add pretty charts to your website or dashboard easily. You can even make google-o-meters.
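To give a feel for how little work a chart takes, here is a rough sketch that builds a google-o-meter image URL. The endpoint and the parameter names (`cht`, `chs`, `chd`, `chl`) are written from my memory of the image-based Chart API, so treat them as assumptions. The helper name `google_o_meter_url` is made up, and the image service itself may no longer be available.

```python
from urllib.parse import urlencode

# Assumed endpoint of the legacy image-based Chart API; it may no longer respond.
CHART_BASE = "https://chart.googleapis.com/chart"


def google_o_meter_url(value: int, label: str, width: int = 250, height: int = 120) -> str:
    """Build an image URL for a google-o-meter pointing at value (0-100)."""
    if not 0 <= value <= 100:
        raise ValueError("value must be between 0 and 100")
    params = {
        "cht": "gom",                 # chart type: google-o-meter
        "chs": f"{width}x{height}",   # image size in pixels
        "chd": f"t:{value}",          # data in simple text encoding
        "chl": label,                 # label shown on the needle
    }
    return f"{CHART_BASE}?{urlencode(params)}"


meter_url = google_o_meter_url(70, "Hello")
print(meter_url)
```

You would drop the resulting URL into an ordinary `<img>` tag, with no JavaScript or server-side rendering needed.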
A few weeks ago we had a visitor at the Googleplex: Rob Hof, the Silicon Valley bureau chief at BusinessWeek. Rob talked to a bunch of Googlers and sat in on one of our weekly quality-leads meetings. The resulting story is out now. The first part of the story covers some of the challenges facing Google, but the second part gets into more detail than we normally get into.
What’s even more interesting is that BusinessWeek put up transcripts of some of the interviews. You can read interviews with:
- Udi Manber, vice-president of engineering and head of the search quality group
- Amit Singhal, head of Google’s core ranking team in the search quality group
- Scott Huffman, head of the group that evaluates quality in the search quality group
- me (Matt Cutts). I’m the head of the webspam team in the search quality group
Org-chart-wise, it looks like this:
Eric Schmidt would be at the top of the cloud, Udi would be the “Search Quality” box, I’d be in the webspam box, and Amit and Scott lead teams within the “Other groups” part.
The two interviews I liked the most were Amit’s and Scott’s. Amit sums up Google’s philosophy toward real-time, and he discusses our pragmatic (yet algorithmic) approach to search and our attitude toward our users:
Q: I think the criticism is: Where’s the money in those [non-search/ads parts of Google]?
A: The right way to look at it is not the money. Is there value to the users? If you bring value to the users, I think we will succeed in the long run. Some things make more money than others, but as long as we keep bringing value to the world, we will be successful.
I liked Scott’s interview because he goes into more detail about how we evaluate search quality than I’ve seen in the past. Evaluating search quality is really hard to get right. I also liked this quote:
But the other thing we always do is we go in and look in more detail at what are some of the individual positive and negative things that we’re getting out of this. Are the positive things really that positive, will they really make a difference to our users? And maybe more important, for the negative things, how important are they, can we live with them?
At the entrance to Google’s main cafe, there are three doors. Two are normal doors that you pull to open, and they always work. The other door is a spiffy automatic door that slides open for you–except that the automatic door seems to be broken about 5-10% of the time. When the automatic door works, it’s very cool and you’d definitely prefer to use it. But when the door is broken, you’re left standing in front of a glass door and you feel like a dork as you wave your hands, move around, and generally try to get the “automatic” door to open for you. I’ve noticed that many people have stopped using the sometimes-broken automatic door and instead always go straight to the reliable doors.
Search can be kind of like that door in a lot of ways. Spiffy features are great, but if they’re wrong or don’t trigger in some reasonable way that your mind can predict, the failure is worse somehow. The same holds true with the organic search results: a catastrophic search failure can stick in your mind much more than the 200 searches that worked well. Search quality evaluation is tricky because you need to take that factor plus hundreds more into account. It’s taken years for Google to really evaluate our quality well, and we still continue to learn important new things.
If you really want to understand more about how Google thinks, I highly recommend Amit’s and Scott’s interviews. They’re a great reminder to me that we have a very deep bench of smart, well-spoken people in the search quality group and in Google in general. I would love to see more Googlers talking about their work.
And finally, on the subject of Googlers talking about their work, a whole bunch of Googlers will be at the Search Marketing Expo East in New York this week. Joachim Kupke will talk about duplicate content, Ari Bezman will talk about maps, Jack Menzel will talk about what’s next in search and universal search, Jeremy Hylton will talk about real-time search, Maile Ohye will talk about best practices for search, Matthew Liu will talk about YouTube, and Frederick Vallaeys will answer questions about AdWords.
Also, don’t miss Bruce Johnson and Kathrin Probst from Google. They’ll be on the “CSS, AJAX, Web 2.0 & SEO” panel. If you’re at SMX East, I think you’ll enjoy that panel.
[I wrote this in January 2008 but never posted it. I think people might still want to read this, so I'm posting it now.]
In an election year, everybody gets a little more sensitive about politics, so I wanted to write a pre-emptive post in case anyone accuses Google of political bias in our search results sometime this year.
This is my personal opinion, but in my way of looking at the world, search quality > politics. That is, preserving the quality and accuracy of our search results is the best way we can help our users, while skewing our search algorithms to espouse a particular political party’s viewpoint would be anathema. This month I finish my eighth year at Google and begin my ninth (geez, I’m old), and in that entire time I can’t remember even the tiniest suggestion to bias Google’s search results toward any political party. The trust of our users is important, and in my opinion it would be an abuse of that trust to skew our search results toward any particular political view. I suspect that if you checked with old-timers at other search engines, they’d say similar things.
[A couple things to note: 1. This is a purely personal blog post--like other blog posts I do, I haven't run it by anyone else at Google. 2. I'm writing it quickly because I have a lot of work to do. If I get something wrong, please let me know and I'll correct it.]
ABOUT two-thirds of Americans object to online tracking by advertisers — and that number rises once they learn the different ways marketers are following their online movements, according to a new survey from professors at the University of Pennsylvania and the University of California, Berkeley.
So naturally I clicked to see who the co-authors were. One of the study’s co-authors was Chris Jay Hoofnagle. Hoofnagle has served as the Senior Counsel and Director of the West Coast Office of the Electronic Privacy Information Center (EPIC). You haven’t heard of EPIC? EPIC was the group that in 2004 argued that Gmail should be shut down: “In a letter sent to California Attorney General Bill Lockyer on Monday, the Electronic Privacy Information Center argued that Gmail must be shut down because it ‘represents an unprecedented invasion into the sanctity of private communications.’ ”
I can guess what you’re saying. “That was five years ago. People didn’t know then how useful Gmail was going to be.” Okay, then did you know that EPIC lobbied the government to shut down Google Apps earlier this year? Here’s the article from March 2009:
A privacy advocacy group has asked the Federal Trade Commission to pull the plug on Gmail, Google Docs, Google Calendar, and the company’s other Web apps until government-approved “safeguards are verifiably established.”
If the FTC grants the request, hundreds of millions of Internet users would be unable to access their e-mail or documents until the agency’s formidable collection of lawyers in Washington, D.C., became satisfied with the revised applications. The outage would extend to businesses that pay for access to Google Apps.
The Electronic Privacy Information Center submitted the far-reaching request to the FTC in a letter from its director, Marc Rotenberg, on Tuesday.
Most people know that the choice of questions in a study can make a huge difference to the outcome. To fully inform the people who read the study, do I wish Chris Jay Hoofnagle had mentioned his connection to EPIC in the paper’s bio section? Yeah, I kinda do. At least when I checked Techmeme, not a single story mentioned Hoofnagle as a Principal Investigator on the grant and co-author on the study, or Hoofnagle’s connections with EPIC.
I’m sure that EPIC has done plenty of fine work to improve privacy on the web. I certainly disagree with some of their opinions: EPIC may have wanted to shut down Gmail five years ago and wanted to shut down Google Apps earlier this year, but I believe that would be a bad idea. I don’t think a majority of people want their Gmail or Google Apps accounts shut down by the government. And maybe this most recent study will be received as completely impartial–but I wish that Hoofnagle’s connections to EPIC had been disclosed in the bio section.
Don’t get me wrong. I welcome criticism of Google (or other companies’ practices) from all corners of the web. From that criticism it’s important to look for ways to improve. People love (and hate) Google enough to give us passionate criticism, and I truly appreciate the feedback. It’s when Google’s features and products are greeted with indifference or apathy that I’ll really be worried.