Generic Malware Debunking Post

Yup, I’m about to do another blog post where someone says that a website is clean but it doesn’t look like it to us. I did a very similar post in January 2007, and in that post I said

I’ve checked out quite a few “we don’t have any malware” reports at this point, and I’ve yet to see a false positive — the sites in question have each had some malware on them.

Would you believe that a year and a half later, that’s still true for me? It may be possible that our malware flagging system has false positives, but I can’t recall a single case that I’ve seen where there wasn’t some security hole or malware that was a true issue for the website owner. If you want to know why, read Google’s white paper about how we detect such stuff — it’s called The Ghost In The Browser: Analysis of Web-based Malware and it was written by Niels Provos and several other Googlers.

In fact, just last week I handled a very similar case where Google proactively reached out to a website that had a scripting security flaw. The déjà vu from my January 2007 post plus the situation last week made me want to write a generic malware debunking post. 🙂 Are you ready? Here we go:

$ACCUSER = Brett Glass
$FORUM = Dave Farber’s Interesting People mailing list, specifically this email.
$LONG_ACCUSATION = (I’m going to quote Brett’s whole email here, just for context)

Everyone:

Google has been a strong supporter of the agenda of Free Press, an
inside-the-Beltway lobbying group which has spent hundreds of
thousands of dollars lobbying for regulation of the Internet under
a regime known as “network neutrality.” While some of the tenets
included in this agenda are not reasonable, one of those that IS
reasonable is the notion that large corporations such as Comcast
should not block content with which they disagree.

However, Google — itself a large corporation — appears to be
blocking a site which expresses opinions with which it does not
agree on this very issue. When one does a search for the terms
“neutrality” and “site:pff.org” (the link

http://www.google.com/search?hl=en&q=neutrality+site%3Apff.org&btnG=Google+Search

will perform this search for you), many of the pages and documents
on the site — in particular, white papers expressing views with
which Google disagrees — are tagged with a warning that “This site
may harm your computer.” One cannot click through to the documents
and pages in Google’s search results without cutting the URL from
the page and manually pasting it into one’s browser.

The Web site, operated by a group known as the “Progress and
Freedom Foundation,” does not appear to contain any malware. When
one queries Google as to why the site was blacklisted, it claims
that “Part of this site was listed for suspicious activity 1
time(s) over the past 90 days.” Yet, we could find no malware or
other exploits in the blacklisted PDF files, some of which contain
very well presented and cogent arguments against the agenda which
Google has been actively supporting.

Could it be that Google (whose motto is, reportedly, “Don’t be
evil”) is saying, “Do as I say, not as I do”?

–Brett Glass

P.S. — What’s especially interesting is that if one queries Google
using just the term, “site:pff.org” (you can use the link

http://www.google.com/search?hl=en&q=site%3Apff.org&btnG=Search

to do this query), one can see that the majority of the supposedly
dangerous site is not blocked. But most or all of the documents
expressing viewpoints on “network neutrality” are.

$SHORT_ACCUSATION = “Google blocked a site with opinions that it disagrees with. Worse, the query [site:pff.org] seems to show that only urls under pff.org/issues-pubs/ are labeled as potentially harmful, and that is the directory where many of the documents that disagree with Google are.”

Given what we have so far, my generic debunking would begin like “Dear $ACCUSER, I saw on $FORUM where you mentioned that Google is flagging a website as malware. You said that $SHORT_ACCUSATION. I wanted to give you a little more background and context to let you know that Google did see an actual malware attack via a real security hole. The other thing you need to know is that Google flagged the site because of the security hole, not because Google agrees or disagrees with any particular content on the site.”
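Since this post is literally built on variable substitution, here is the same mad-lib as a tiny sketch in Python, just for fun. (This uses the standard library’s string.Template; the placeholder names are the ones above, and the values are whatever the next incident supplies.)

    from string import Template

    # The generic debunking letter, with the same placeholders used above.
    debunk = Template(
        "Dear $ACCUSER, I saw on $FORUM where you mentioned that Google is "
        "flagging a website as malware. You said that $SHORT_ACCUSATION. "
        "I wanted to give you a little more background and context."
    )

    print(debunk.substitute(
        ACCUSER="Brett Glass",
        FORUM="Dave Farber's Interesting People mailing list",
        SHORT_ACCUSATION="Google blocked a site with opinions that it disagrees with",
    ))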

Then I’d give a little background history on all the different ways that Google helps users and webmasters avoid malware. Most of the background would come from this overview post. Since that post was published in mid-2007, Google has done even more to protect users:

– Niels Provos and his colleagues published another technical report with more details about the malware detection framework and what it discovered (more info here).

– Google launched a Safe Browsing API so that third party applications can benefit from Google’s list of malware and phishing urls. If you appreciate that Firefox 3 has better security, one of the reasons is that Firefox 3 utilizes the Safe Browsing API.

– More recently, the anti-malware folks at Google launched a Safe Browsing Diagnostic page where you can enter a url and get a ton of really useful information.

The last one is especially impressive. For example, check out the Safe Browsing Diagnostic page for pff.org:

[Screenshot: the Safe Browsing diagnostic page for pff.org]

That page gives a ton of helpful info to site owners and anyone else who is interested in why a particular site or url was flagged as potentially harmful.
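The diagnostic page is meant for humans rather than as a formal API, but since the url pattern is just a query parameter, you can pull the raw report for a handful of sites with a short script. Here is a minimal sketch in Python (standard library only; treat the HTML as best-effort scraping, since the page layout isn’t guaranteed):

    import urllib.request

    DIAG = "http://www.google.com/safebrowsing/diagnostic?site={site}"

    def diagnostic_html(site):
        # Fetch the Safe Browsing diagnostic report page for one site.
        url = DIAG.format(site=site)
        with urllib.request.urlopen(url) as resp:
            return resp.read().decode("utf-8", errors="replace")

    for site in ("http://pff.org", "http://www.metrotimes.com"):
        report = diagnostic_html(site)
        print(site, "->", len(report), "bytes of report HTML")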

All that would go quite far to reply to people who had questions about their site being flagged for malware. But this post is getting quite long, so let’s get back to the specific report in this case. The original person who reported this situation had already noticed that not all of pff.org was flagged. If you do a site: query on Google, you only see warnings for pff.org/issues-pubs/.

If you visit pff.org/issues-pubs/, you’ll see that it’s a web form. It looks like pff.org stored their data in a SQL database but didn’t correctly sanitize/escape input from users, which led to a SQL injection attack where regular users got exposed to malicious code. As a result, normal users appear to have loaded urls like hxxp://www.ausbnr .com/ngg.js and hxxp://www.westpacsecuresite .com/b.js <--- Don’t go to urls like this unless you are 1) a security researcher or 2) want to infect your machine.

Notice that even in this case, Google didn’t flag the entire pff.org site, just the one directory on the site that appeared to be dangerous for users. I never like it when people accuse Google of flagging a site as malware just because we don’t like it for some reason. The bright side of this incident is that pff.org will find out about a security hole on their site that was hurting their users (it looks like pff.org has disabled the search on the vulnerable page in the last few hours, so it appears that they’re responding quickly to this issue).

Flagging malware on the web doesn’t earn any money for Google, but it’s clearly a Good Thing for users and for the web. I’m glad we do it, even if it means that sometimes we have to write a generic malware post to debunk misconceptions.
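P.S. For webmasters wondering what “sanitize/escape input” means in practice: the standard fix is to stop pasting user input directly into SQL strings and to use parameterized queries instead. Here is a minimal before/after sketch (Python with sqlite3 as a stand-in; I don’t know what stack pff.org actually runs, so treat the details as purely illustrative):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (title TEXT, body TEXT)")
    conn.execute("INSERT INTO docs VALUES ('neutrality paper', '...')")

    user_input = "x' OR '1'='1"  # hostile "search term" typed into a web form

    # VULNERABLE: the input is pasted straight into the SQL string, so the
    # quote characters above rewrite the query itself and match every row.
    rows = conn.execute(
        "SELECT title FROM docs WHERE title = '" + user_input + "'"
    ).fetchall()
    print("string concatenation:", rows)   # returns every document

    # SAFER: a parameterized query treats the input purely as data,
    # so the hostile string matches nothing.
    rows = conn.execute(
        "SELECT title FROM docs WHERE title = ?", (user_input,)
    ).fetchall()
    print("parameterized query:", rows)    # returns []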

44 Responses to Generic Malware Debunking Post

  1. Thanks Matt!

    Well written, informative post. Must read for webmasters.

  2. It’s always funny to see paranoid people jump to conclusions about why Google wanted to harm or de-list their site. Watch next time Page Rank updates and the intervals change. People move in PR through no fault of their own but immediately put together wild speculation to try to explain the change. It’s sorta like early men worshiping forces of nature that they didn’t understand… lol. I almost said Cavemen right there but ever since the Geico commercials I am more sensitive to the feelings of Cavemen 😛

  3. http://thedailywtf.com/Articles/The_Great_Google_Banner_Ad_Conspiracy_.aspx

    You have to be forgiving though. When people are afraid of something and just wait for it to happen – anything they see will be “the ultimate proof” for them.

  4. now, i am curious.

    http://www.google.com/safebrowsing/diagnostic?site=http://www.google.com

    Has this site hosted malware?

    Yes, this site has hosted malicious software over the past 90 days. It infected 1 domain(s), including hoyem.org.

    http://www.google.com/safebrowsing/diagnostic?site=http://www.craigslist.org

    What happened when Google visited this site?

    Of the 22 pages we tested on the site over the past 90 days, 0 page(s) resulted in malicious software being downloaded and installed without user consent.

    what i found interesting about the safe browsing diagnostic tool, is that the question “What happened when Google visited this site?” and its responses lead me to believe that malware tests aren’t conducted each time googlebot visits. it told me “Google has not visited this site within the past 90 days.”

    which isn’t true. it probably means google hasn’t tested my site for malware within the past 90 days.

  5. Can we run that diagnostic ourselves on our own site or sites we visit?

    There’s a local weekly metro news site here in Detroit (I’m not affiliated with it in any way, I just normally read it) and lately my firefox browser has been complaining that it’s got malware. This has been going on for weeks if not months, so I haven’t gone to it. I’d just be curious to run the diagnostic to see if it’s actually infected (it’s http://www.metrotimes.com for what it’s worth)

  6. But Matt, Google does flag sites though. In other ways. Not just with a “malware” notice.

    Picking up a small thread in the “$ACCUSER’s” email and running with it…

    Here is part of Google’s Net Neutrality stance:
    “Network neutrality is the principle that Internet users should be in control of what content they view and what applications they use on the Internet. The Internet has operated according to this neutrality principle since its earliest days. Indeed, it is this neutrality that has allowed many companies, including Google, to launch, grow, and innovate. Fundamentally, net neutrality is about equal access to the Internet. In our view, the broadband carriers should not be permitted to use their market power to discriminate against competing applications or content. Just as telephone companies are not permitted to tell consumers who they can call or what they can say, broadband carriers should not be allowed to use their market power to control activity online. Today, the neutrality of the Internet is at stake as the broadband carriers want Congress’s permission to determine what content gets to you first and fastest. Put simply, this would fundamentally alter the openness of the Internet.”

    Google is -not- neutral when it suits them.

    For example… Google’s Page Rank data is freely available if you use a tool bar or site operator in a Google search. Not too unlike Google visiting a publicly available site and indexing the publicly available content. I’m sure that you have a TOS or something about how Page Rank data is used but I’m sure that Google itself does not read and follow every TOS that is on every site out there. I’m sure there are a few that have TOS that say Google must pay them thousands of dollars if republished… or some similar bit.

    But as this video shows:
    http://blogoscoped.com/archive/2008-07-08-n14.html
    You’ve said that Google can feed misinformation in rank checks if you don’t like what people are pulling from Google…

    Yet you’ve said again and again that in regards to the average joe, cloaking is NOT ok. That Googlebot should see the same thing as a normal visitor. Yet Google obviously cloaks in its own way, when it suits them.

    In a previous post to Google Groups I asked about your automation policy and part of your comment included:
    “One thing I would *not* recommend is that if a tool is blocked for bad behavior, trying to make the tool more “sneaky” (e.g. trying to make the tool
    look closer to a web browser).”

    Why do you think that automated tools try to look more like a browser? Because Google can cloak the information that it serves up or disable your IP address if you -are- straightforward. Case in point: your own comment about feeding bad data to someone checking Google.

    Why does Google want to have it both ways? They can freely pick up, index and make money off of everyone else’s publicly available content but you want to disallow others from doing the same of freely available, publicly available content from the Google site.

    Google uses an automated tool to discover and index content but wants to disallow others to use automated tools. You can promote Net Neutrality when it suits you but not when it’s inconvenient.

  7. netmeg, great question. The Safe Browsing diagnostic page will give info on any page/site you want to query for. Just change the sitename in the url parameter. For example, http://www.google.com/safebrowsing/diagnostic?site=http://metrotimes.com reports

    Of the 2596 pages we tested on the site over the past 90 days, 136 page(s) resulted in malicious software being downloaded and installed without user consent. The last time Google visited this site was on 07/14/2008, and the last time suspicious content was found on this site was on 06/17/2008.

    Malicious software includes 154 trojan(s), 46 exploit(s), 38 scripting exploit(s). Successful infection resulted in an average of 6 new processes on the target machine.

    Malicious software is hosted on 15 domain(s), including heihei117.cn, woai117.cn, dota11.cn.

    11 domain(s) appear to be functioning as intermediaries for distributing malware to visitors of this site, including heihei117.cn, fengnima.cn, nihao112.com.

    So yes, it sounds like metrotimes.com has some security issues to look into. I believe if the owner of metrotimes.com registers their site in our webmaster console, we will show them example urls from their site that have issues. Then after the site has been cleaned up, they can use the webmaster console to request a review of the site and (I think) get an automated answer back.

    Scott, my short answer is that Google’s PageRank indicator shows some of our opinion of the reputation of a site. Our opinion of the reputation of a site can be negatively affected if we see it violating our guidelines by (for example) selling links that pass PageRank. In the specific instance you’re talking about, someone was trying to scrape an enormous amount of PageRank values from Google by pretending to be the Google Toolbar. Google wasn’t doing anything special or different for Googlebot (i.e. we weren’t cloaking to Googlebot), we were trying to get a high-volume spammer to stop scraping us, much in the same way that we turn off general Google search against high-volume scrapers, worms, and trojans as we’ve mentioned here: http://googleonlinesecurity.blogspot.com/2007/07/reason-behind-were-sorry-message.html . That has nothing to do with cloaking, given that none of these anti-scraper actions involved Googlebot or even regular users in any way.

    Scott, I’ll close by quoting one sentence:

    They [Google] can freely pick up, index and make money off of everyone else’s publicly available content but you want to disallow others from doing the same of freely available, publicly available content from the Google site.

    For the first part of the sentence, I’ll just say that anyone can block Google in robots.txt and we’ll happily not index that site. All reputable search engines abide by robots.txt. For the second part of the sentence, PageRank is not freely available from Google by using the site: operator in Google. If you want our PageRank opinion for the value of a specific url, you have to install the Google Toolbar, and it’s natural that we protect ourselves against automated scrapers.
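    (As an aside, checking robots.txt programmatically takes only a few lines. A rough sketch using Python’s standard urllib.robotparser, purely as an illustration of the convention and not Googlebot’s actual code:)

        import urllib.robotparser

        # A well-behaved crawler consults robots.txt before fetching anything.
        rp = urllib.robotparser.RobotFileParser("http://www.example.com/robots.txt")
        rp.read()

        # If example.com's robots.txt contains:
        #   User-agent: Googlebot
        #   Disallow: /
        # then this prints False, and a reputable bot skips the site entirely.
        print(rp.can_fetch("Googlebot", "http://www.example.com/any/page.html"))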

    Gary Schubert, I’d never seen that — thanks for the laugh. 🙂

  8. “Doesn’t make any money [for protecting users from malware]”

    I would be a lot less likely to use Google if the search results included malicious pages. I am really glad you sanitize your search results. There’s nothing wrong with making money by offering a valuable service, but let’s be clear: Google gives away free services to increase advertising inventory.

  9. Jonathan Hochman, I don’t think of it as “Google gives away free services to increase advertising inventory.” That doesn’t explain why Google decided in 2002 or so to really try hard so that we didn’t partner with badware/malware. It doesn’t much explain why we decided that we would not show pop-up ads on Google at around the same time. Or why we actively try to show fewer but more high-quality ads. All of those steps work to reduce Google’s advertising inventory.

    Let me suggest a more general theory: when surfing on the net is better (more fun, pleasurable, useful), that’s good for users and indirectly good for Google. Therefore if you press for an open net where users have choice and enjoy and use the net more, that may be good for Google somewhere down the line, but it will definitely be better for users now. That theory nicely aligns the incentives of our users with Google’s incentives.

    The two theories are pretty close — yours is a touch more cynical than mine 🙂 — but they differ slightly. I can tell you that the latter theory is closer to how the Googlers I know think. Showing fewer ads, getting people off of Google’s web search to other properties quickly, the development of Android, the white spaces proposal — I see all of that as part of a single effort: making the net a better, more fun, more useful place to be. Sure, people will search more and that may be good for Google. But it’s also just a good thing in general.

  10. Thanks! I sent this on to them.

  11. (And by the way, at the very time I was reading this, I was interrupted by a client who was infected by malware from some website, and I spent the better part of an hour walking her through removing it from her system – hopefully we got it all. So if Google wants to toss up notices about potentially bad sites, I’m all for it)

  12. hi,

    I am just in the middle of a crisis with my web site (mrak.org), which is also flagged as dangerous.

    I am running a simple wordpress installation with standard plugins and just about everything seems fine. i have checked the source code of the site and it is not corrupted. i have tried to check the source code of the suspicious pages (48 of them) and could not find a single line of code which I did not expect there.
    malicious links are everywhere on my site (main index, tag links, category links, archive links)

    the google message states that I am linking to, or that malware is coming from, 85.255.118.0, which is a “russian business network” related site; I am not aware that I have linked to it

    I have tried to use exploit prevention labs scanner which could not find anything…

    the point is that to the very best of my knowledge i was completely unable to find where the problem lies

    is there a chance that the malware is visible only to some IP addresses and possibly not to my country?

    so, if you can give me any clue on how to proceed or what to check I would be more than grateful

    while all of the advice here is nice, I cannot grasp anything useful on how to track the problem on my site

  13. This seems to work well but when can we expect to receive a warning from Google that our websites might have been hacked?

    I would assume that a website with years of trust in Google that suddenly posts hundreds of Spammy links to [whatever] would raise a red flag that could trigger an email to warn the webmaster of a possible de-listing.

    It would be great if there was an early warning system in place versus the current process of…

    1) Discovering the website has disappeared in Google, (Horror)
    2) Investigating and discovering the spam & links
    3) Removing the spam
    4) Groveling to be re-included.

    Thanks.

  14. @Scott,

    Not to presume to know how Matt would respond, but have you checked out what Google does when it picks up your content? It checks your robots.txt file and honors it if you tell it “don’t index this directory.” It gives you options in the webmaster tools for slowing down the indexing, removing URLs, and other things.

    Google isn’t having it both ways. It’s asking people to show it the same respect it shows webmasters. Most automated access tools it has problems with are aggressive and stupid. They don’t slow their roll or follow rules. Google usually has to take measures to block or disable them so that a few rogue automated processes aren’t sucking up so much processor time and bandwidth that it degrades Google’s ability to serve data to legitimate applications/users.

    If someone was hammering your site, you’d block them or throttle them. You’d do it just to stop them from monopolizing your resources. You’d feel it was your right to protect your site. Why should Google be denied that right?

  15. mrak, you might need to set your referrer so that it looks like you’re coming from a Google search. Check for recently changed files, too, and your .htaccess file. We saw stuff like hxxp://85.255. 118.252/ind.php?src=411&surl=www.mrak.org&sport=80&suri=%2F <-- don’t go there. (A rough sketch of the referrer trick is at the end of this comment.)

    Dave, the problem is this: if Google just found out that a site is hosting malware, how long should we wait and let the web server infect users before flagging it as malware? As a user, you’d want to know about a potentially dangerous page as soon as Google could reliably detect it.
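    Here is that referrer trick as a rough sketch (Python standard library; mrak.org is used since that’s the site in question, and keep in mind that malware can also key off the User-Agent or IP address, so an identical response doesn’t prove a page is clean):

        import urllib.request

        def fetch(url, referer=None):
            # Fetch a page, optionally pretending we clicked through from Google.
            req = urllib.request.Request(url)
            if referer:
                req.add_header("Referer", referer)
            with urllib.request.urlopen(req) as resp:
                return resp.read()

        plain = fetch("http://www.mrak.org/")
        via_google = fetch("http://www.mrak.org/",
                           referer="http://www.google.com/search?q=mrak")

        if plain != via_google:
            print("Response differs with a Google referrer -- inspect the diff.")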

  16. Great stuff Matt. One thing I should have posted on the article about spam cleanup: I would like to see older content moved down the Google rankings a bit. Sometimes when I search for a particular keyword I get CNN 2001. I know CNN is a great authority website but sometimes things get outdated. Just a heads up.

    I’m a frequent visitor to your site. Hope everything is well.

  17. Just an example I searched for autism articles

    I got this http://www.time.com/time/covers/1101020506/scautism.html

    from May 6, 2002

  18. Why would anyone tell Google they are being targeted by them? It doesn’t sound like a good way to get any help.

  19. Hey Matt,
    I don’t know about you, but sometimes I type slower than I think…
    I was trying to say that there are many free bits of info available from Google and elsewhere, built on info that comes from Google. There happens to be free info available directly from Google via the site operator, the toolbar, etc., as well as through various free searches that end up getting data from Google. Many of them are found here:
    http://www.google.com/search?q=how+to+check+page+rank+from+a+google+search
    http://www.google.com/search?q=how+to+check+page+rank
    Going back to my ongoing question that you never answer, I don’t see any of these guys listed in the Google guidelines… How is it neutral to list one company and not others? Why does Google list us rather than take a more neutral stance against automated tools?

    As you admit, you were trying to discourage the high volume scraper from scraping from Google. To do this, you served up information that was different from what is normally expected. The means may be different than traditional cloaking, but as you note, he was easy enough to identify that you served up something different from what would normally have come up. A rose is a rose. This is cloaking, but with a different name and different application than what is commonly used.

    Additionally, I feel that if you can make the argument that this action is not cloaking, then I have to take issue with how you use the word “scraper”.

    Want to see real “scrapers” in action and put a bit more oomph into the “we don’t index search results pages” stance? Check this out…
    http://www.google.com/search?q=inktomi.com/click%3Fu%3Dhttp://clickserve
    http://www.google.com/search?q=yahoo.com%2Fclick%253Fu
    http://www.google.com/search?q=clickserve.dartsearch.net%2Flink
    I’m sure you are more creative than what I’ve got in my searches… but most of the sites found are scrapers. You might write a filter to remove some of those from the search results…

    WP is a tool. I would say that it is a more advanced web browser if anything. It is not a -scraper-. It is also not a search engine bot. It is a tool that is designed to help one to perform searches like a web browser. I’ve even seen some rudimentary browser add-ons that do some of the same types of things. Yet I don’t see Google calling out Firefox, or those tool authors. If someone creates a rank checking tool in Yahoo Pipes or a similar service are you going to name Yahoo Pipes?

    And Google can’t kvetch about the bandwidth/resources taken up when they want neutrality from the telcos/cable companies who want the same thing, but only from sites like Google. I want to support Net Neutrality but it’s hard when sites like Google speak out of both sides of their figurative mouths.

    The ISPs feel that sites like Google hog bandwidth and so they want to create a tiered internet where if bandwidth hogs don’t pay, they get in the slow lane or slow tier. Google feels that IP addresses that hog bandwidth should be served into the slow lane or cut off. Or in some cases, as you made clear, get served misinformation (fake PageRank numbers). This isn’t a huge leap from:
    http://www.mattcutts.com/blog/confirmed-isp-modifies-google-home-page/

    But overall Matt, is WebPosition “banned” from Google? And why list it only when there are other programs and services that do as much or much worse, let alone actual scraper software (not WP) that pollutes the search results with scraped information and often MFA pages?

  20. matt, sorry to bother but…
    my .htaccess is clear (default wordpress)
    .php files seem intact
    on the whole site, including the database, I cannot find the string you quoted

    i have installed an http sniffer and could not find the string anywhere

    I really, really wish to resolve this but simply do not understand where to look next

  21. David Stansbury

    I applaud the extra step, above-and-beyond good faith efforts of Google to help protect internet users, and making diagnostic info available is certainly a huge benefit to unwary webmasters. I wonder if arbitrarily providing website insecurity info to any curious visitor is such a good idea, as it may inadvertently provide pre-qualified security breach data to the “bad guys”. Or perhaps the message is sanitized enough to not provide too much information? Detailed security problems should certainly be available to authenticated Webmaster Tools users. Good stuff as always. 🙂

  22. Would you believe that a year and a half later, that’s still true for me? It may be possible that our malware flagging system has false positives

    Here’s a challenge for you: http://www.google.pt/search?hl=pt-PT&q=psd&meta=cr%3DcountryPT . That’s the main opposition party, which used to be headed by the now-president of the EU Commission. Just imagine the Republican party website flagged by Google. I’ve looked at it and have yet to find fault.

    Now, I do agree that this is a very positive thing for web users and, in the long term, webmasters, and, as such, it creates value for Google.

  23. As many beefs as I have with Google these days, I think it’s great that they flag malware sites. It’s one of the things I think they still do from the pre-public days without thinking about profit.

    Nothing I hate worse than to follow a link and have my virus scanner go bonkers and lock up the page.

  24. I was really ignorant of this capacity of Google until I read this post. Amazed. I think it provides a chance for the webmasters also to correct the problem, if they want the site to be indexed in the search result. I do not think Google should tolerate sites spreading Malware forever but obviously webmasters should get a chance to rectify it.

    Very considerate approach to caution all parties involved.

  25. Thanks Matt.

    I see your point. It is good to remove the malware or hacker spam threat from the Google index ASAP. But… A courtesy email to the webmaster at the time of removal would help the webmaster diagnose and remove the malware or hacker spam in a matter of hours versus weeks.

    BTW – If a website or blog has been hacked and stuffed with spammy links… Does the Google penalty increase with each day the hacker’s spam goes unnoticed by the webmaster?

  26. Dave, I believe that in many cases we do email the webmaster. I doubt that we do it in all cases, since we don’t always have contact info for a website.

  27. Someone may wish to let the BBC know they are serving up malware 😉

    http://www.google.com/safebrowsing/diagnostic?site=http://news.bbc.co.uk

    However, I don’t see an alert in the search results:
    http://www.google.co.uk/search?q=bbc+news&sourceid=navclient-ff&ie=UTF-8&rls=GGGL,GGGL:2006-11,GGGL:en

    Though I may be looking at the wrong results on that second link, as it is likely a sub-page; either way, the BBC are serving malware.

  28. Well written, informative post. Must read for webmasters.

  29. Matt, i’m curious about what Corey pointed out. On Sunday, i found out that GSB was reporting that Google infected 1 site (hoyem.org) during the last 90 days. Corey wrote that here on Monday.

    Now, GSB isn’t reporting that anymore. What changed?

  30. chris Jangelov

    I checked a few sites in the .se, .no and .org domains. Some of them from a major software vendor. None of them were reported to have been visited within the last 90 days.
    Interesting.

  31. The more I see the more I believe Google isn’t being malicious. Just wish they would come up with a way for me to re-include my business url in G-Maps. Thanks Matt Cutts.

  32. Matt, I was referred to stopbadware.org by Google, and a couple of days on their mailing list convinced me that Google’s badware stuff is pretty wacky.

    After some fun descrambling JavaScript on one of our self publishing services, I discovered it was just a referral script with a referrer ID to a competing search engine, done to spam search engines. There didn’t appear to be any malware involved anywhere. In these cases I accept the content isn’t useful to the net, but it wouldn’t seem to qualify as “badware” by any normal definition.

    Also the diagnostics are insufficient for webmasters. If you are going to tell people not to visit a page, you need to give webmasters much clearer reasons than some random IP addresses that have nothing to do with their site. The explanation read like “we don’t like you because” “asparagus” “moon rock”, with no way for the user to get to “asparagus” other than to try and discover what Google found on their site. Please provide the referring URL, and ideally any other information gleaned on the referral or method. Often the webmasters aren’t gurus and they are turning up at stopbadware.org asking for very basic diagnostic information – like which pages are bad, and what the error means.

    PS: I also noticed that we delivered the dodgy content as text/plain – this doesn’t seem to stop browsers executing it as if it were JavaScript – hmm, surely some mistake in browser implementation here. I don’t think our self publishing system can inspect all the plain text and make a reasonable assessment of whether plain text is going to be mishandled by the end user’s software; otherwise we’d never serve anything to certain operating systems just to be sure.

  33. “But overall Matt, is WebPosition “banned” from Google?”

    Scott, the reason that WebPosition is mentioned in our guidelines is because of the sheer quantity of unwelcome queries that WP Gold used to send to Google to scrape Google’s rankings. I think we already covered this though.

  34. Hey Matt,

    We actually got hit by an attack yesterday; the site has been classified as distributing malware by Google. We have removed everything we could and submitted a review request.

    Can we expect to see lasting ranking drops because of this or does Google allow for the fact that things like this are out of our control?

    More importantly, I wanted to comment on this particular thread to support Google’s implementation of the malware blocking system; without GMWT and G Search to let us know, this malware could have been on the site for weeks causing problems for our clients.

    All systems have problems and I am sure there are genuine cases out there of this system screwing people over a few times but the greater good is that this has helped users and webmasters alike to increase overall web security.

  35. I didn’t know about the details of checking sites or particular web pages that are being flagged for malware before I read this post of yours and tried the link that you gave in reply to one of the readers: “The Safe Browsing diagnostic page will give info on any page/site you want to query for. Just change the sitename in the url parameter.” For example:

    http://www.google.com/safebrowsing/diagnostic?site=http://metrotimes.com

    Thanks a lot for sharing the information.

  36. Matt,

    For the record 🙂

    Sometimes your title is mentioned as:

    The head of WebSpam at Google.

    And sometimes:

    The head of Spam Detection at Google.

    Are the above two different teams? Are you the head of two teams now? Got promoted without telling us or something 🙂

  37. Speaking of spam detection, I couldn’t believe what a story I read on digg today. Confirmed it a couple of hours ago and it was there.

    http://www.pcbugsquad.com/2008/07/googles-grandcentral-blog-has-been-hacked/

    It appears GrandCentral’s blog was hacked and now Google, themselves, are pushing spam links for buying drugs and online pharmacies.

    I just checked it out and these spam links are still in the html! You would have thought the algos would have caught this one on your own site? 🙂

  38. Head of webspam? No… that spot is reserved for adwords and pay per click guys. aka the spam on the right side team. 🙂 aka buy links and not get penalized team. (sorry but truth is truth)

  39. Matt has replied on Twitter to my above-posted question:

    I normally say “Head of the webspam team at Google” unless people don’t know what webspam is. But I usually only work on webspam.

  40. @Matt:

    When you say you are head of webspam, it sounds like you are in charge of creating spam rather than fighting it.

  41. Matt:

    Your account above does omit some things. For example, many of the links flagged as pages containing malware were not Web pages at all, but rather PDFs — just documents. And they were perfectly clean. What’s more, only SOME of the documents on the site were flagged as malware — ones that came out at the top of Google’s search results for items related to “network neutrality” on the site. So, at best, the PDF files that were flagged as having malware were false alarms due to overzealousness. At worst, there may actually have been some “activist” inside Google flagging them, though I hope not.

  42. Hello Sir Matt Cutts,

    Thank goodness I finally found your blog.

    This topic, “Generic Malware Debunking Post”, seems to be just what I’ve been looking for in terms of trying to comprehend what’s going on with my site.

    After viewing some of your teaching videos on Youtube and seeking help regarding my issue from the Google Webmasters Forum with no successful results, I decided that you were definitely “THE ONE” who possessed the knowledge to best assist me.

    It appears as if my Blog has been Penalized because Google apparently thinks it’s Spam…

    or….

    URLs from my site (more than 300) are being blocked for some other reason.

    None of them are Duplicates, Spam, Porn, Paid links or 404s, so why are they being restricted?

    My Traffic Ranking also keeps going UP and then dropping really LOW for no logical reason.

    This is happening despite the fact that my Traffic is pretty Good.

    Ex: From like 51,000 to 27,000 in one week.

    After submitting a sitemap, I was being indexed then suddenly that appeared to stop as well.

    When I checked my Diagnostics and Robots.txt file I found this statement:

    “Allowed
    Detected as a directory; specific files may have different restrictions”

    I initially thought that someone was perhaps sabotaging my site from another server, etc., however now after reading your post entitled “Generic Malware Debunking Post”, it looks as if Google thinks my Blog is a Spammer, which most definitely isn’t true!

    I sought reconsideration for Indexing and was assured that I am, but I don’t see any new posts showing up in Google.

    My Blog is currently being hosted on Blogger and while I do plan to change platforms, I’d like to fix the errors first or at least find out what’s wrong.

    In a nutshell HELP!

    Please

    Thanks

    RB

    http://votemecool.blogspot.com

  43. Hi Matt,

    Just saw this and thought it was interesting/funny. I used the Safe Browsing Diagnostic page on google.com itself and got back info that you guys had a trojan on your system and that it infected 1 other site.

    Glad to see it didn’t affect other sites and resulted in no new processes on the target machine!

    You can view the details at: http://www.google.com/safebrowsing/diagnostic?site=http://google.com

    Thanks for letting us know about the useful tool Matt.
    – Sean

  44. Another self congratulatory, “aren’t we wonderful” blog from Google!

    Simple fact – you do sometimes block websites without justification. Everyone makes mistakes and you and your systems are just as capable of making errors as we all are. Time to stop being so Godlike, yes?

    You blocked my website last Friday and your own systems have not worked as they should have done on this occasion.

    The diagnostic page offered no information whatsoever – it implied there was no evidence of malware! The webmaster tools added no further information, and incidentally – the only content within my website which is externally controlled is from Google!

    Four sets of independent checks were carried out and no malware was discovered; neither was there any identifiable evidence of a security vulnerability.

    The assessment page Google published on the Saturday (less than 24 hours after the warning went up) even confirmed that the site was clean and the warnings were being removed. They were still in place on Monday so I requested another review with a comment stating that the warnings were supposed to have been removed a few days ago. My comments were ignored but a second review was undertaken, once again showing there was nothing wrong with my site. The warnings are still in place today (Wednesday).

    DIAGNOSTIC PAGE
    What is the current listing status for http://www.removed.ie?
    Site is listed as suspicious – visiting this web site may harm your computer.

    What happened when Google visited this site?
    Of the 139 pages we tested on the site over the past 90 days, 0 page(s) resulted in malicious software being downloaded and installed without user consent. The last time Google visited this site was on 09/29/2008, and suspicious content was never found on this site within the past 90 days.

    Has this site acted as an intermediary resulting in further distribution of malware?
    Over the past 90 days, http://www.removed.ie/ did not appear to function as an intermediary for the infection of any sites.

    Has this site hosted malware?
    No, this site has not hosted malicious software over the past 90 days.

    How did this happen?
    In some cases, third parties can add malicious code to legitimate sites, which would cause us to show the warning message.

    FROM WEBMASTER TOOLS
    Status of the latest badware review for this site: A review for this site has finished. The site was found clean. The badware warnings from web search are being removed. Please note that it can take some time for this change to propagate.*

    * How long is “some time”? I am timing this from the first “all clear” given on Saturday 27th September.

    You are quite right that website owners should be more thorough in their checks, but you are ignorant of how your actions can affect innocent people and you should show them a little more respect than you do. Your process is harsh and flawed. Google should remove suspicious entries from its index allowing full and thorough investigations behind closed doors.

    As it is, you have already lost me at least two clients and you have provided no evidence whatsoever to show your warnings were even justified.

    Have a nice day.
