Archives for June 2007

Google improves search for fresh documents

Here’s a power-searcher tip that didn’t get much attention the first time around, so I wanted to mention it. Tara Calishain recently wrote about changes to Google’s date search features.

Previously, I believe Google estimated the age of a url as the last time that we fetched that page. Given how quickly Google refreshes its main index, that didn’t mean quite as much recently. Now for date-based search, Google estimates url ages by the first date that we saw a url.
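To make the difference concrete, here is a minimal sketch in Python. It is purely illustrative and says nothing about how Google's index is actually built; the `UrlRecord` type, its field names, the `filter_recent` helper, and the example urls and dates are all hypothetical. It only shows why filtering on the first-seen date gives fresher results than filtering on the last-fetched date when pages are refetched often.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class UrlRecord:
    """Hypothetical record: one url plus the two dates discussed above."""
    url: str
    first_seen: date    # first time the crawler ever saw the url
    last_fetched: date  # most recent time the page was fetched


def filter_recent(records, today, days=90, use_first_seen=True):
    """Return urls whose estimated age falls within the last `days` days.

    use_first_seen=True  estimates age from the first-seen date.
    use_first_seen=False estimates age from the last-fetched date.
    """
    cutoff = today - timedelta(days=days)
    if use_first_seen:
        return [r.url for r in records if r.first_seen >= cutoff]
    return [r.url for r in records if r.last_fetched >= cutoff]


# Made-up data: an old page that is refetched constantly, and a new page.
records = [
    UrlRecord("http://example.com/old", date(1998, 11, 11), date(2007, 6, 29)),
    UrlRecord("http://example.com/new", date(2007, 5, 15), date(2007, 6, 28)),
]
today = date(2007, 6, 30)

# Last-fetched dating: both urls look "recent" because both were refetched.
by_fetch = filter_recent(records, today, use_first_seen=False)

# First-seen dating: only the url that is actually new survives the filter.
by_first_seen = filter_recent(records, today, use_first_seen=True)
```

With a fast-refreshing index, nearly every url has a recent last-fetched date, so that field stops separating old pages from new ones. The first-seen date does not change on a refetch, which is why it works better as an age estimate in this sketch.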

Here’s an example. A normal search for [Google] looks like this:

[Screenshot: normal Google search]

If you restrict the results to the last three months, the same search for Google gives more recent urls:

[Screenshot: recent Google search]

Those results are all things that launched in the last three months. And if I try it with something like [toronto], I see recent discussion of SES Toronto instead of older urls.

Just a handy search tip for power users. Thanks to the Googlers who improved date-based search. 🙂

SMX Seattle wrap-up

Gah. I’ll never catch up on email; might as well blog a little bit. 🙂

I had a really good time in Seattle. I got to meet many more members of the Google Kirkland office, go up in the Space Needle, visit the Science Fiction Museum, walk around Pike Place Market, and even see a little bit of West Seattle. West Seattle was especially fun. I ate delicious macaroni and cheese at West 5 and picked up the new album from The National. It’s called Boxer and it sounds a little like Lou Reed and The Feelies had a love child. I like it.

Also, there was a search conference. 🙂 This was the first SMX conference, and it was a blast. I think Danny did the right thing by pitching the first conference to an advanced audience. It made for a really friendly, laid-back atmosphere. I had a lot more people just walk up to say hello than I usually do, and that was wonderful. I met nice people from Austin, Isla Vista, and all over the world. If you look at the end-of-conference picture, you can just barely see me at the very top left (I’m wearing my green Ale-8 T-shirt). I talked to several Googlers during and after the conference, and we all got good feedback and ideas from the attendees.

As you may know, SMX joins two other major search conferences: Search Engine Strategies and PubCon. I don’t know how the competition between the industry conferences will turn out, but it’s good that each conference is ready to offer what they believe is the best experience for SEOs and webmasters. Adam Lasnik spoke on three panels at SES Toronto last week, and I’m looking forward to sitting down with him and going over the feedback from the last couple weeks. Please be patient though; I’m still digging out from my vacation too.

A few highlights of the conference:
– Seeing all the familiar SEO faces at the MSN party the first night (and getting to meet a few new folks!).
– Starting out the conference with a Q&A session with Danny, me, and the audience. Danny and I got to use these little headset mikes as if we were a boy band. 🙂 I pretty much despise PowerPoint-heavy presentations, so it was fun to just talk search with a few hundred webmasters. We covered everything from supplemental results (and how they’re indexed) to Wikipedia to Stephen Colbert. You can read the write-ups in live-blogging style, a slightly more cleaned up “Susan Esparza” style, an abbreviated summary, or even a write-up in haiku format. Some write-ups don’t mention that to keep the conference fun and casual, Danny agreed to strip from his normal suit into shorts and red Vans.
– Patrick from feedthebot.com asked why the webmaster guidelines aren’t more detailed, and then Riona and Vanessa managed to launch more detailed webmaster guidelines in time for the second day of the conference.
– If you like video, Mike McDonald caught me for a video interview later that day to recap the topics that had come up at the conference so far. I also got Matt-jacked into a dark room by Rand for the “SMX Diaries” video interviews. And WebProNews had some snippets of me from the Q&A session.
– If that’s not enough Matt-video for you, I just realized while doing this write-up that someone posted the video of the keynote Q&A session that I did at SES London a few months ago. That’s pretty funny.
– A nice Yahoo mixer, complete with Yahoo ice cubes (they’re purple plastic and light up when you put them in liquid). I saved a couple as prizes for my team.
– A good dinner with other search folks.
– Meeting lots and lots of cool people at the Google party. I never made it onto the dance floor until the party was winding down, but I did get to see the vintage arcade games that were set up and the cool Google ice sculpture. Thanks to the Googlers who helped organize the party or manned tables to answer questions there!
– Meeting Satya Nadella and hearing his Q&A with Danny Sullivan.
– Lots of other fun panels the second day, including the “Give it up!” session.
– The end-of-conference photo and the general niceness of the Third Door team to take care of speakers.
– Woohoo for wifi! I really don’t want to attend any conference without wifi at this point.
– Hitting the SEOmoz SMX party. It was a bit of an off-night in terms of my ability to represent Google well. I bowled a 133, which is okay but not spectacular. Plus Rebecca “Bec” Kelley and Cameron Olthuis beat me and a guy from Microsoft at eight-ball. I’d love to blame the MSFT fellow, but in truth I started the game trying to sink the wrong team’s balls. Of course, Cameron let me sink a ball before mentioning that. 😉 On the bright side, Danny and I got to rumble in pool. He brought his A-game, and I barely managed to eke out a win. Add in the chance to compare notes on how Google is doing with a couple blackhat spammers, and overall I’d call it a great party. Well done, SEOmozzers.

I know I’m missing a few things, but those are some of the things that stand out looking back a couple weeks later.

Wishing Vanessa the best

In case you haven’t seen it, Vanessa Fox has decided to leave Google. I’m really sad to see her go, but the work she’s done has really helped webmasters and Google. In the years that Vanessa has been at Google, she’s helped to launch and improve the webmaster console, communicate policy to the outside world and get feedback from it, and bring search engines together on the Sitemaps standard. She’s done a bunch of things to make Google a better company and search engine.

As a result of Vanessa’s influence, Google does a lot more webmaster communication than we did even a couple years ago, from conferences to blogs to user forums. I’ll miss her dearly, but the communication she began will be continued at Google. When I visited the Kirkland office a couple weeks ago, I was struck by just how many high-quality people the webmaster central team had. And as Vanessa mentioned yesterday, the team of people who communicate with webmasters continues to grow.

Vanessa, thanks for all the wonderful things you’ve done at Google, and I wish you the best at Zillow or anywhere else you go in the future. Let me know if you ever need a glowing recommendation. Or I guess you can also just ping some of the tons of webmasters you’ve helped over the years. 🙂

Google responds to E.U. Working Party letter

Two or three weeks ago, the European Union Article 29 Working Party sent Google a letter asking about some of Google’s privacy practices. Google responded to the letter in a blog post today and made its entire response letter available (PDF link).

The two pieces of news I see are:
– Google previously committed to anonymize its server logs after 18-24 months. Today Google announced that it plans to anonymize server logs after 18 months.
– Google is considering reducing its cookie expiration time:

We are considering the Working Party’s concerns regarding cookie expiration periods, and we are exploring ways to redesign cookies and to reduce their expiration without artificially forcing users to re-enter basic preferences such as language preference. We plan to make an announcement about privacy improvements for our cookies in the coming months.

Cool. Those are good changes in my book.

Why I disagree with Privacy International

Sigh. Google as a company takes privacy very seriously. I personally feel strongly about protecting our users’ privacy. So I’m frustrated by a recent study that Privacy International did, and I want to know if I’m off-base in my reaction. I got back home from SMX and I’m surfing the web when I see this AP article entitled “Watchdog group slams Google on privacy”:

In a report released Saturday, London-based Privacy International assigned Google its lowest possible grade. The category is reserved for companies with “comprehensive consumer surveillance and entrenched hostility to privacy.”

None of the 22 other surveyed companies — a group that included Yahoo Inc., Microsoft Corp. and AOL — sunk to that level, according to Privacy International.

So I surf over to Privacy International (PI) to read the actual report, and I have to be honest with you — it made me mad. But I try not to blog when I’m angry, so I decided to sleep on it. After sleeping on it, I’m still pretty frustrated with Privacy International’s conclusions. Here’s my take.

Google didn’t leak user queries

In this past year, AOL released millions of raw queries from hundreds of thousands of users. Within days, a journalist had determined the identity of an AOL user from the queries that AOL released. But AOL got a better grade than Google.

Google didn’t give millions of user queries to the Dept. of Justice

In 2005/2006, the Department of Justice sent subpoenas to 34 different companies requesting users’ queries and other data. In fact, the original subpoena requested all queries done by users for two full months. AOL, Microsoft, and Yahoo all gave some amount of users’ queries to the Department of Justice. Google fought that subpoena (full disclosure: I filed a declaration in that case). The judge sided with Google; no queries from Google users were given to the DOJ. But Yahoo, Microsoft, and AOL got better grades in this report than Google.

Google will anonymize query logs

In March, Google announced that it would begin anonymizing its logs after 18-24 months. Google has continued to communicate on the issue, including a post on the Google blog in May discussing the reasoning behind that decision. In fact, we talk a lot about privacy, from blog posts to Op-Ed pieces in the Financial Times. To the best of my knowledge, no other major search engine has followed suit in a plan to anonymize user logs.

Misc bits

Other parts of the study just baffle me. The report claims (I am not making this up) that “Every [Google] corporate announcement involves some new practice involving surveillance.” I know that my years of working at Google may bias me, but does that sound impartial? Let’s test that claim. Here’s a Google corporate announcement we made on our blog in March. Google expanded our support for open-source in our third annual “Summer of Code”:

Last year we paid 630 students from 450 schools in 90 countries $4,500 each to work on open source software projects. These projects, selected by some 100 open source mentoring organizations from over 6,000 applications, provided students with invaluable real-world programming experience.

That’s over three million dollars in open-source development last year, with even more money set aside for this year. The program introduces students to open-source programming. In return the open-source community and regular users benefit from students’ projects. Does Google’s Summer of Code program have anything to do with surveillance? Nope, not even close.

Conclusions

Sigh. Okay, take deep breaths, Matt. My spleen is vented. 🙂 Personally, I think Privacy International should feel remorse about walking right past several other companies to single out Google for their lowest rating. But I think that there’s a larger danger here too. I believe this report could corrode earnest efforts to improve privacy at companies around the internet. Why? Because the bottom-line takeaway message that I got from the report is that a company can work hard on privacy issues and still get dragged through the mud. Consider: in the last year or so, other companies gave users’ queries to the government, leaked millions of raw user queries, or even sold user queries and still came off better than Google did.

Wait — someone sold my data?

If I ran a privacy group, I would *find out which ISPs sell their user data*. While Privacy International was conducting its six-month-long study, credit bureau Experian committed to buy Hitwise for $240 million. From the press release:

Hitwise collects and aggregates information from Internet Service Providers (ISPs) on how over 25 million consumers use and search the Internet in the US, UK, Australia and other countries in Asia Pacific.

If you check Hitwise’s most recent blog post about UK site Gumtree, they discuss collecting user queries: “Hitwise captured 4,201 unique terms sending visits to the website.” Did those queries come from opted-in users, or from ISPs? If I ran a privacy organization, I’d want to know which ISPs sell user data. I’ve pointed out before that ISPs have a superset of data on a user compared to almost any other online company. Some have suggested that ISPs sell user data for as little as 40 cents per month per user. It looks like Privacy International didn’t include any ISPs in its study of online companies. Luckily, some other folks are looking into it. A Wired blog enlisted readers and started to get some answers on the topic.

If Privacy International really wants to focus on Google rather than digging into companies that are, you know, actually buying and selling user data, that’s their choice. 🙂 Note that I have nothing against Hitwise, Compete, or ISPs at all; I just think it’s unwarranted to call out Google when user data is being bought, sold, given to the government in the millions, or being leaked — by other companies. And I think Privacy International missed the mark badly by giving those companies a better rating than Google, or by not including the right online companies in their study.

Now it’s your turn. Am I off-base on this issue? Or did this study miss the mark? (I’m going to bed now, so I’ll approve comments in the morning.)
