How Google handles malware: a historical overview

Normally I like Nick Carr a lot, but the headline on his most recent article (“Google preparing to police web”) didn’t strike me as accurate. If Nick needs some background on how Google handles urls that potentially spread malware, maybe other people would benefit as well. I dropped a comment on Nick’s post that I’ll echo here, with minor edits and more hyperlinks:

(Disclosure: I’m a software engineer at Google.)

Nick, I normally love your posts, but your headline (“Google preparing to police web”) isn’t very accurate, because we’ve been tackling malware for quite a while. Here’s some historical context.

Almost exactly a year ago, Google and other search engines were raked over the coals for exactly the opposite reason: allowing users to get infected with malware from search engine results. See
http://www.mattcutts.com/blog/siteadvisor-study/
for more background. At the time, we were already anticipating the issue and had added “Don’t create pages that install viruses, trojans, or other badware.” to our webmaster guidelines.

Google’s response when we believed malware was present was to warn the user via an interstitial when they clicked on a search result that might infect their computer. See
http://www.mattcutts.com/blog/info-about-malware-warnings-and-how-to-appeal-them/
for an example post about this process and how to appeal it if you have removed the malware or believe there was an error.

Users liked the malware protection a lot, so we added some annotation to listings for sites that could potentially infect a machine. See
http://googlesystem.blogspot.com/2007/02/google-flags-pages-that-install.html
for more info.

Of course, it’s important to help regular webmasters who might have been hacked and not even know that they were infecting their users. To that effect, we added sample urls with suspected malware to our webmaster console. See
http://www.mattcutts.com/blog/got-malware-google-will-help-you-find-it/
for more details.

I’ve highlighted Niels Provos’ fantastic work on my blog before, but Provos also provides free tools at http://www.spybye.org/ to help webmasters scan their own sites for malware.

All in all, I think Google does a pretty good job of protecting users from getting infected, while at the same time providing tools that assist webmasters in detecting and correcting hacked urls that could spread malware. Certainly compared to other search engines I think we provide more notice to users about potential malware urls, and we provide more info to webmasters about potentially hacked urls. So I think Google’s response to this issue balances the needs of users and webmasters pretty well.

I hope that helps give a little more context and historical background. Certainly I’ve seen emails from both sides of this issue, but I think Google strikes a pretty good balance.

Update: I forgot to mention that once you have all this historical background, then you’ll enjoy reading the USENIX paper “Ghost in the Browser” by Niels Provos and several other Googlers. It’s got a lot of useful information for people interested in malware.

54 Responses to How Google handles malware: a historical overview

  1. Doug Cress

    I find it an appropriate response, provided one becomes familiar with the Webmaster console which, given Google’s dominance in search, is a necessity.

    As an aside, I was also bothered by Carr’s statement: “Anything that makes people wary of visiting web sites or clicking on links stands as a big threat to Google’s business.”

    Big threat? Come on – it’s not like people are going to stop using the internet. Protecting the integrity of people’s computers is a worthwhile goal in and of itself.

  2. KJW

    Matt,

    I would agree with your stance on the topic. I think that Google has done a very good job from a malware standpoint. Being a webmaster myself, I find the tools that are provided to me by the big G to be outstanding for all aspects of the web… malware included.

    Keep doing your thang.

    -KJW

  3. Well-said, Matt. Who would’ve thought flagging dangerous malware-spreading websites would be seen as monopoly behavior?

  4. Great post Matt!

    One of my websites received this label.
    “This site may harm your computer”.

    The only reason I can imagine why that would happen is that I allowed the web hosting to expire. They replaced the home page with something else for about a week.

    (I got lazy and didn’t update my contact or billing info because the site is semi-abandoned)

    When I paid the bill and got my website back, the Google listing says “This site may harm your computer” between the page title and description.

    I can guarantee there is absolutely nothing harmful about the site.
    So… How do I get rid of this?

    Thanks for your help.

  5. I totally agree with you Matt. Google is doing a great job and is protecting many an unsuspecting user. Letting the webmaster know via the console is excellent.

  6. I think Google’s doing the right thing, they’re not “blocking people from going to those sites”, they’re just “warning people it could be a bad site”.

    Search engines help webmasters get additional traffic, but they’re not obligated to do anything. When a user uses a search engine, it’s assumed that the user obeys the search engine rules and is willing to accept the results search engines give (SERPs).

    All in all, Google has every right to do anything it wants on its own site (google.com), including the way its SERPs are displayed, how its ads are handled, etc. But if they want to keep being #1, they have to be competitive on “what users want”.

  7. Google should go a step further, I would say.

    Delete the pages that are infected, send the webmaster a notification and see if they clean up the mess.

  8. Tonnie, I don’t think that sending emails would be possible considering the number of “malicious” websites out there nowadays. I’ve seen the protection in action many times and I find it very useful, but I personally wouldn’t want to see any further actions from Google as a search engine… now that I think about it, a free Google anti-virus would be nice =0)

  9. Hawaii SEO, I would check in the webmaster console and we’ll tell you the specific pages that we flagged as potentially hosting malware. One of the links in my post (http://www.mattcutts.com/blog/info-about-malware-warnings-and-how-to-appeal-them/) tells how to appeal the decision to stopbadware.org, but the short summary is to go to http://www.stopbadware.org/home/contact_general and follow the directions there.

    Paul Zhao and Tonnie, I think that you’re both correct. Anyone can still go directly to the pages in question. When a site is hacked, showing an interstitial can be one of the best ways of alerting the site’s owners. At the same time Tonnie, if we feel that a page’s only purpose is to serve malware, or that the page is abusive/deceptive or spammy, we do reserve the right to remove the pages from our index entirely. A good example is that I saw some spam with tons of typos (keyword stuffing) trying to get bad spellers to a page, and then the page would very deliberately install a virus/trojan. Stuff that bad is way outside of our quality guidelines and we make it clear that doing that level of spam can result in pages being removed from our index.

  10. Talk about clearing the air. I think google is doing a great job keeping their listing clean of junk!

    I rarely see any spam … have never gotten a page with a virus from google … and the results are as relevant as they can possibly get!

    Look no one or site is perfect … however google is just about 99%!

    Keep up the great work!

    Darin

  11. Jay Cross, I have seen the annoyance that webmasters encounter if urls on their site are flagged, but I do think that’s counter-balanced by how bad it is for users to get infected with malware. We have improved the process by showing example urls to webmasters if they want to investigate via the webmaster console. I think that service is more than any other search engine does, and I like that spybye.org offers a similar service for free if people don’t want to sign up for our webmaster console. I believe you can download the source code from spybye.org as well.

  12. Dave (Original)

    Matt, must you keep spoiling stories with facts :)

  13. Yeah, Matt! Stop being logical and start telling the world what it wants to hear so that we can all go down the road to ruin in a state of infinite bliss and ignorance.

    Seriously, Matt, why answer that stuff? The only purpose an article like that serves is to prey on the emotions of the highly suggestible and naturally paranoid.

  14. Dave (Original)

    I believe the other term for its existence is “link bait”. Most like to believe that all big businesses are out to get them, particularly ones who say “do no evil”. Often known as the “tall poppy syndrome” and there is no cure.

    Grab your tin-foil hats guys, Google is coming to get ya!

  15. Another option would be to add a text only preview or snapshot popup to that suspected link so users could inspect it without having to click.

    But it is really sad that a competitor could potentially hack and add malware to a website and ruin its rankings without its Webmaster even being aware :-(

    Before Google began communicating with Webmasters and getting feedback (a-hem) ;-) – imagine how many sites have been ruined over the years and no one knew why…..a classic example would be the recent case with Jennifer Convertibles.

  16. I applaud Google’s efforts to reduce malware; however, it could easily be taken much further. A very easy way to spread malware is by offering cracked programs to download. If Google banned sites that host cracks from its index it would both help the software community and help reduce malware. If I search in Google for many popular applications, within the first page or two there are sites with titles like ‘appname CRACK SERIALZ CRACKZ APPZ’. Surely it would be easy for Google to detect such pages and de-index these sites.

  17. Google hates spam & malware; that’s the reason why I leave WordPress for Blogspot!!! Un the googlers.

    indie music rulz

  18. Sam I Am

    Hi Matt,

    Could you shed any light on if, at this moment in time, this kind of warning note also means a drop in ranking or a penalty of some sort? It kind of annoys me when for example on page two of a relatively competitive search term like ‘travel forums’ you’ll find notices like this (last week it was actually page one, along with quite a few other not so decent sites!)….

    Speaking of sites that target misspellings, I count the same site 37 times in the top 50 for the misspelling ‘buget hotels oslo’…. (I actually found this thinking I had searched for buDget, so was totally freaked out at first by the spam in the regular serps!!! :) ).

    Have a good holiday; don’t post too much!

    Sam

  19. I agree that Google has made some big efforts to handle malware sites and help users detect the sites and even help to keep the users’ browsing experience safe. We all know what it feels like to have to format because of some computer infection. I applaud Google for their efforts and Nick should have been more informed and less biased. Nice facts Matt :)

  20. I agree with the comment that G should delete the url. Why would they want it in their index anyway? If they find a site that’s not appropriate then sandbox it or supplement index it or better still don’t even send a bot over it.

  21. I have to just agree with Matt, Google does a perfect job here.

    * Google doesn’t ban questionable sites from the index, but makes you ensure you are looking for the correct thing, and that you know the potential dangers. I would think this is much better than banning them.

    * Google (aka Ms. Fox) runs a very good system for webmasters to become aware of any problems and how to fix them (webmaster console)

    * I support the decision not to explicitly censor searches such as “Crackz”, people who search for that kind of thing should know what they are getting themselves in for, and the kind of people who run the sites they are visiting. That is no reason to ban them from the index, Google needs to do as much as possible to stop it from becoming a defining part of what information there is, and stay as neutral as possible (hopefully personalisation is the first step towards this)

  22. I have to disagree with MikeB. I love crack websites because they allow me to circumvent the frustrating DRM imposed on me by greedy software publishers who end up punishing their legitimate customers with needless copy protection. I’ve used software cracks for 15 years and I don’t think I’ve ever gotten a virus from one. Crackers are perpetually in competition with each other to release the highest-quality code.

    There’s no greater risk of infection from a software crack than there is from any other freeware. I applaud Google for examining sites individually rather than making blanket assumptions and banning the lot.

  23. I agree with Brian’s last statement. Google shouldn’t police cracks any different than they should police freeware, imo.

  24. OK, I’m impressed. I ran a scan of a known massively infected shared hosting company and all of the domains that my script kicked out as being infected were already flagged in Google as bad.

    I’m sure Matt knows which host I’m referring to, but I still think you’re running risks indexing ANY sites on this host as someone appears to run a script that randomly infects domains on these shared servers.

    This host has been having this issue for over a year now with articles and blogs written about it, yet it persists. If I were Google I’d flag all their domains until they got this fixed because you still run a major risk that any site on that host could be infected at any time after it was cleanly crawled.

  25. So now we’re going to have to learn how to optimize pages to avoid being flagged as malware?

  26. TerryG, I think it makes sense to try to distinguish between truly malicious/spammy pages (where it makes the most sense to remove) and valid pages that have been hacked to host malware (where it makes the most sense to help the webmaster recover from that situation).

    IncrediBILL, when we see large webhosts with a significant number of infections, we often do try to contact the hosting company and see if we can give them advice/help. Sometimes that works well, but not always.

    Michael Martinez, it’s easier to never host malware; then you don’t need to worry about this. :)

  27. I just thought of two IT-related questions on this:

    1) Either in addition to or in replacement of an interstitial, would it be possible in the future to see a “Safe Search” variant whereby sites that include malware can be removed by the searcher by his/her own control?

    2) Can the interstitial be included in the Toolbar at some point? This would allow users of the toolbar fair warning if they click on a malware link that didn’t come from the Google search engine.

  28. Hmm, well the difference between a crack and freeware is that a crack is copyright violation and illegal. Publishers have the right to use what copy protection mechanism they want, if you don’t like it, don’t use their software.

  29. People who run websites that have software that tracks your every move or installs links the owner didn’t want there should definitely be flagged as malware and you should warn more people about them.

  30. Dave (Original)

    I love crack websites because they allow me to circumvent the frustrating DRM imposed on me by greedy software publishers who end up punishing their legitimate customers with needless copy protection.

    Say what??

  31. When you are on top, you are the target for all who envy or hate. After all, ignorance is bliss.

    If Google does not police the trash that can be found on the net, they become no different than all of the other Internet destinations that I no longer visit. If you want the freedom to stroll through seedy locales, it is out there. Personally, business is better for me in a respectable neighborhood.

  32. “People who run websites that have software that tracks your every move or installs links the owner didn’t want there should definitely be flagged as malware and you should warn more people about them.”

    I completely agree, graywolf. Besides having my own businesses, I also work in a corporate environment and the IT department has a complete heart attack if something makes it past the firebox. The IT department assumes that the users will use common sense, but we all know the human brain periodically malfunctions. Keep up the good work Matt!

  33. Matt,

    I’m not too worried about malware specifically, because we seem to have a pretty good host and I haven’t seen evidence that we have an issue, but what steps can I take to make sure that search engines don’t have any other issues with my sites? I know there are a lot of factors, but when traffic trails off a lot you tend to get concerned. The site I’m wondering about is fairly new, so I’m wondering if it just needs more time to come back.

  34. I fail to see what the controversy is. Google isn’t blocking/banning the website, only giving users a warning and allowing them to make an informed decision. The fact that they supply the necessary resources to correct the problem makes this a valuable resource to the internet community as a whole.

    I for one would love to see some statistics on how Google’s new policy has affected the proliferation of malware worldwide… And when the new Google Internet Security Suite will be available for download.

  35. JB

    The BBC have a link to this article from their News homepage – http://news.bbc.co.uk/2/hi/technology/6645895.stm – if anyone’s interested in reading it.

  36. Jan

    My question, Matt, is how does Google treat a website that was (note WAS) infected, clears the virus, but finds a problem? In my case there were pages and pages of porn content, and on these pages were scripts so people landing on these pages would be taken off to an actual porn site. So here is my site sitting with all this content (now deleted) and my keyword / key phrase listing has gone down the tubes because of this. HOW do I report something like this to Google and get back in the search ranks for what I was listed for and still should be listed for?

    Jan

  37. Harith

    Matt

    I see you enjoying a very “active” vacation :)

    Just a reminder of tomorrow…. Mother’s Day. And in your case maybe Mother-In-Law’s Day too :)

    Sooo. what are you sending to Kentucky and Nebraska this year?

  38. Hi Matt & co

    Should Google be used as a service which rids the net of all that is evil? It can certainly act as a tool to make us more aware about the sites we visit and for that it does a pretty good job but it’s a difficult line to cross. Once upon a time Google was a search engine. Now, it’s not only that but a whole host of other things too. My concern is one which has probably been echoed around your offices as the malware notification function was implemented: “What if we flag a legitimate website as having malware installed?” A company’s ability to survive could hang by a thread on this one. However, it has been dealt with well. It is one thing to put up a notification like this but it is quite another to actually dump the listing into the supplemental index or remove it completely…

    Thankfully Google has not gone this far and I would hope it stays that way. There is only so much one should interfere and I would hate to think Google may one day overstep the mark, currently however they are doing a damn fine job!

  39. It’s good to see the development of this protection. Keep up the good work Google :p

  40. yes I think that sorts it out. Nice post

  41. Agree with TerryG. Google can stop crawling such sites, or either sandbox them or send them to the supplemental index. But obviously if they don’t send a bot there then they have to make a system for that, so later on when the site is fine they can reindex them.

    As for Brian’s comment, agree. but what to do for that??
    I think G can categorise such sites for indexing…..just an idea :)

    I want to read all comment and comment on them but cannot.

    just thinking forward what G will do next for this.

  42. Dog

    I too agree that Google is doing a pretty good job. Great post Matt.

  43. I believe that any website which very deliberately installs a virus/trojan should be deleted from the index of Google. I applaud Google for doing a great job.

  44. Great job by Google on sorting the bad websites. Thanks!

  45. I agree with Scented Candles, great job GOOG!

    “All in all, I think Google does a pretty good job of protecting users from getting infected, while at the same time providing tools that assist webmasters in detecting and correcting hacked urls that could spread malware.”

    I would also agree with this. I’ve never had a problem when searching with Google as I have with others in the past. Though this issue seems to be taken care of now on most other engines, I’ve always considered Google + my Norton to keep my PC relatively safeguarded from Malware/Spyware.

  46. cookies

    Brian – it’s the crack sites that are causing software developers to implement harsher DRM and anti-theft measures.

    I’ve even had potential customers tell me that they will see if they can find a crack instead of buying our software !!!

  47. Norman Diamond

    I’m pretty glad to see this kind of protection offered to searchers. These warnings are probably the second-most important kind of protection in its class. I hope that the most important kind is also being offered, but oddly didn’t see it mentioned.

    Around 2 years ago, old, and hopefully obsolete, here’s the kind of situation where searchers really most needed protection. I typed in some identifying information about a hardware device, hoping to find a device driver for it. Search results included a page that looked promising, it looked like the manufacturer’s page which would offer a download of the driver. (In fact a few days later I could confirm that it really was the correct page.) The manufacturer’s site was down. That’s nothing unusual, lots of manufacturers still have occasional outages. So I clicked on the link for Google’s cache of the page, and the cache looked pretty promising too, it looked like the real page (and confirmed it a few days later). I went back to the search results page and clicked on another link to download the driver from Google’s cache. I got the download! Ohhhh… so _that’s_ why the manufacturer’s site was down. But before they went down, Google had cached the virus.

    I hope Google takes equal care in its caching as it does in warning users.

  48. Perhaps I’m in the minority, but I’m in favor of Google doing at least some “policing”. It seems to me as an outsider that part of the business of running a search engine that millions of people use is creating trust among users. I think this works two ways:

    1. No one will use Google if they are not generating useful search results. So if Google suddenly penalized legitimate sites, it has a direct influence on their wallet size.

    2. If people feel that they are being jipped by Google’s malware “policing”, they won’t use Adwords or any of the other fancy tools that Google makes a killing from.

    In general, I would have no problem if Google openly admitted tomorrow to policing the web, because Google only holds power as long as people use it, and they won’t use it if it’s not useful.

  49. Hello Matt.
    Why don’t you add antivirus to the Google Toolbar? That way, it would not only report infected pages when you do a search but also warn you when you arrive from other sites.
    When you come to Spain, Spain is different!!!

  50. Can you point me in the right direction…
    I have been on your site a few times, mainly when the Shi…. hits the fan on our site.

    We were penalized by Google on Monday. We are not knocked off Google, but when we used to get 25% of all our traffic from Google search results and now we get 5%, something is wrong. I know you can’t comment on my site, but a general idea on where to look would be helpful. It happened once before and we found a link on our site to a “bad neighbor”; when we removed it everything was fine. I have gone over all the rules the webmaster page has on Google and I cannot understand why this happens.

    Would you be able to post a response, generally telling people what else to look at if you are penalized?

    Thank you

  51. Last week I was pretty horrified to see in the serps that one of my main sites was marked as a malware site! I checked up and sure enough it had been hacked and some nasty code added to the index file. We fixed the site, blocked the security hole and informed the really nice guys at stopbadware.org and a couple of days later they reported back that the site was clean again.

    If the site hadn’t been marked as a danger source in the first place it could have been months before I noticed the problem so I could have lost a whole lot of business. A big well done and thanks to Google!

  52. Hey,
    a similar project to SpyBye is the “Web Exploit Finder”, created by me and some friends from school last summer. Maybe you’ll like it:
    http://www.xnos.org/security/web-exploit-finder.html
    Contact me if you want to know more. benni.
    -SDG-

  53. I totally agree with you Matt. Google is doing a great job and is protecting many an unsuspecting user. Letting the webmaster know via the console is excellent

  54. yes, malware has been quite the pain for a while now and quite a few times I have managed to get it removed by myself, so I suppose I have been lucky, but it would be nice if the search engines protected us from it.
