Archives for May 2008

Helping hacked sites

(I’m taking my wife somewhere really soon, so I’m just going to dash out a quick post.)

There was a Techmeme discussion this weekend about whether Microsoft should chase Google in search or find their own “Big Hairy Audacious Goal.” Into that discussion came a post by Ryan Stewart about being removed from Google’s index. It turns out that Ryan’s blog had been hacked, and Google does remove hacked sites from our index to protect our users. I left a comment at Ryan’s blog, but while I wait for it to be approved I thought that I’d post it here as well:

Hi Ryan, my name is Matt Cutts and I’m a software engineer at Google. Sorry to hear that your blog got hacked. I know that it’s disappointing if you don’t show up in Google, but there’s another way to look at it. It looks like your blog was hacked to show “buy pharmacy”-type links, but what if the hackers had hosted malware on your site? Then every user to your site might have gotten infected just by visiting your site. That danger to Google users is one of the reasons that we temporarily remove hacked sites from Google.

I’m glad that things look clean now and I’ve revoked the “hacked site” flag for your domain. I’d expect your domain to return to Google within 48 hours, if not sooner.

By the way, we did try to contact you. We sent an email to contact [at] digitalbackcountry.com, info [at] digitalbackcountry.com, support [at] digitalbackcountry.com, webmaster [at] digitalbackcountry.com, and a gmail.com address on May 19th at 21:25:23 with a subject line of “Removal from Google’s index.” I believe if you had logged into our webmaster console at google.com/webmasters and proved that you owned digitalbackcountry.com, we would have left a message waiting for you there as well. That webmaster console is the primary way to request reconsideration in case your blog has been hacked.

We do try to communicate with the owners of hacked blogs where we can, and we also write blog posts to help site owners prevent hacks and recover from them. Some example posts that we’ve done in the past:

http://googlewebmastercentral.blogspot.com/2007/09/quick-security-checklist-for-webmasters.html
http://googlewebmastercentral.blogspot.com/2008/04/my-sites-been-hacked-now-what.html
https://www.mattcutts.com/blog/how-google-handles-malware-a-historical-overview/

The last point I’d make is that users tell us loud and clear that they don’t want to be sent to hacked sites, because of the potential danger that they represent. Even though it’s stressful to be removed from Google, I hope you understand why Google might not want to send users to a hacked blog.

Again, thanks for cleaning up your site and you should return to Google’s index soon.

How Google should handle hacked sites is a tough question, but personally I think Google does a better job than other search engines of protecting our users and communicating with site owners about hacked sites. For example, here is an excerpt of the email that we sent to Ryan on May 19th:

Dear site owner or webmaster of blog.digitalbackcountry.com,

While we were indexing your webpages, we detected that some of your pages were using techniques that are outside our quality guidelines, which can be found here: http://www.google.com/webmasters/guidelines.html. This appears to be because your site has been modified by a third party. Typically, the offending party gains access to an insecure directory that has open permissions. Many times, they will upload files or modify existing ones, which then show up as spam in our index.

The following is some example hidden text we found at blog.digitalbackcountry.com:

Acyclovir Adderall Adipex Alprazolam Ambien Ativan Biaxin Bontril Bupropion Butalbital Carisoprodol Celexa Cheap Phentermine Cialis Online Cialis Cipro Clonazepam Codeine Darvocet Diazepam Didrex Diflucan Effexor Ephedrine Fioricet Flexeril Generic Viagra Glucophage Hydrocodone Online Hydrocodone Levitra Lexapro Line Xanax Lipitor Lorazepam Lortab Meridia Nexium Norco Viagra Tramadol Soma Phentermine Valium Norvasc Buy Acyclovir Buy Adderall Buy Adipex Buy Alprazolam Buy Ambien Buy Ativan Buy Biaxin Buy Bontril Buy Bupropion Buy Butalbital Buy Carisoprodol Buy Celexa Buy Cheap Phentermine Buy Cialis Online Buy Cialis Buy Cipro Buy Clonazepam Buy Codeine Buy Com Lvivhost Online Viagra Buy Darvocet Buy Diazepam Buy Didrex Buy Diflucan Buy Effexor Buy Ephedrine Buy Fioricet Buy Flexeril Buy Generic Viagra Buy Glucophage Buy Hydrocodone Online Buy Hydrocodone Buy Levitra Buy Lexapro Buy Line Xanax Buy Lipitor Buy Lorazepam Buy Lortab Buy Meridia Buy Nexium Buy Norco Buy Norvasc Buy Online Xanax Buy Oxycontin Buy Paxil Buy Percocet Buy Phentermine Online Buy Phentermine Buy Propecia Buy Provigil Buy Prozac Buy Renova Buy Seroquel Buy Soma Buy Tadalafil Buy Tamiflu

[…]

In order to preserve the quality of our search engine, we have temporarily removed some of your webpages from our search results.

(The rest of the email goes on to describe how long the blog will be out of Google, and where to go in order to get back into Google’s index faster.)

Getting hacked is not fun. It’s just not. But I think Google does the right thing for our users by removing hacked sites from our index temporarily. I also think we do a pretty good job of trying to alert site owners that they’ve been hacked — more than any other search engine does. We alert many webmasters about hacked sites not only via email but also with our webmaster console.

Do I want more competition in search? Absolutely, because it keeps everyone on their toes and working hard for our users. But I think Ryan’s specific situation actually shows that Google is trying to do the right thing for site owners and users. Ryan, I hope there are no hard feelings that your site was removed from our index after being hacked, and now that it’s clean you should be back soon.

Stupid Google Tricks: Get a calendar from the search box

I spend a lot of time in my browser. So much time, in fact, that I notice when I drop down to a command-line to type things. I wanted to look up a day later this year, so I typed “cal 2008” into a Unix terminal window. I caught myself thinking, “Hey, why doesn’t Google add a onebox shortcut for searches like ‘cal’ or ‘cal 2008’?”

On one hand, I could bug someone at Google with my request. To be honest, not many people would benefit from a feature like this. Then I realized that I could still solve the issue for myself with Google Subscribed Links. It takes 2-3 minutes to define a shortcut that says “When the user types query X, show a link to page Y in the search results.”

So I ran the command “cal 2008” and copy/pasted the output into a file on my domain. Then I made a simple subscribed link in 2-3 minutes. The interface looks like this:

(Image: Just a pointer to my calendar)

If you’d like to add this subscribed link to Google too, you can subscribe to my calendar subscribed link with one click.

Anyone who is subscribed can search for [cal] or [calendar] or [cal 2008] and will see a link like this:

(Image: Calendar link)

And clicking it will take you to my calendar page.

You could have more fun with this, but I’ve already spent more time writing about it than the original hack took. Other thoughts:
– Google Subscribed Links can do more powerful things (e.g. use a feed file), but I didn’t need that power for this simple hack.
– I could have made a script to dynamically show the current year instead of 2008. But compared to the time to copy/paste a text file, I’d almost rather just change the text file once a year.
– If you wanted some practice with Google App Engine, an app to show a calendar for the current year would be a pretty good starter project.
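
For anyone who does want the dynamic version, here is a minimal sketch of a script that prints a calendar for the current year, roughly what “cal” followed by the year would show. This is an illustration, not the setup described in this post: it assumes Python and its standard calendar module instead of the Unix cal command, and the function name year_calendar_text is made up for the example. Serving the output at a URL (on App Engine or anywhere else) is left out.

```python
import calendar
import datetime


def year_calendar_text(year=None):
    """Return a plain-text calendar for `year` (hypothetical helper).

    If no year is given, use the current year so the page never
    needs a manual update.
    """
    if year is None:
        year = datetime.date.today().year
    # Start weeks on Sunday, which is the usual layout of Unix cal.
    cal = calendar.TextCalendar(firstweekday=calendar.SUNDAY)
    return cal.formatyear(year)


if __name__ == "__main__":
    # Print the whole year; redirect to a file to publish it as a static page.
    print(year_calendar_text())
```

Running it from a yearly cron job and redirecting the output to the text file would keep the static-file approach while removing the once-a-year edit.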
