“Fetch as Googlebot” tool helps to debug hacked sites

One of the most tenacious blackhat webspam techniques we continue to see is hacked sites. I wanted to remind site owners that our free “Fetch as Googlebot” tool can be a really helpful way to see whether you’ve successfully cleaned up a hacked site.

For example, recently a well-known musician’s website was hacked. The management firm for the musician wrote in to say that the site was clean now. Here’s the reply I sent back:

Unfortunately, when our engineers checked this morning, the site was still hacked. I know the page looks clean to you, but when we sent Googlebot to fetch www.[domain].com this morning, we saw

<title>Generic synthroid bad you :: Canadian Pharmacy</title>

on the page. What the hackers are doing is sneaky but unfortunately pretty common. When you surf directly to the website, you see normal content. But when a search engine (or a visitor from a search engine) visits the website, they see hacked drug-related content. The reason that the hackers do it this way is so that the hacked content is harder to find/remove and so that hacked content stays up longer.

The fix in this case is to go deeper to clean the hack out of your system. See http://support.google.com/webmasters/bin/answer.py?hl=en&answer=163634 for some tips on how to do this, but every website is different.

One important tool Google provides to help in assessing whether a site is cleaned up is our “Fetch as Googlebot” feature in our free webmaster console at http://google.com/webmasters/ . That tool lets you actually send Googlebot to your website and see exactly what we see when we fetch the page. That tool would have let you know that the website was still hacked.

I hope that helps give an idea of where to go next.

Something I love about “Fetch as Googlebot” is that it’s self-service: you don’t even need to talk to anyone at Google to diagnose whether your hacked site looks clean.
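
If you want a rough local spot check for the kind of cloaking described in the reply above, here is a minimal Python sketch. It requests the same URL as a direct visitor, as a visitor arriving from a search result, and with a Googlebot-style user agent, then compares the page titles. The profile names, the browser user agent string, and the function names are all assumptions made up for illustration. It is not a substitute for Fetch as Googlebot: a hack that cloaks by IP address will serve this script the clean page, because the requests do not come from Google.

```python
import re
import urllib.request

# Illustrative header values (assumptions, not taken from the post).
# Real Googlebot traffic also comes from Google IP ranges, which a
# script like this cannot imitate.
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Hypothetical request profiles: the same URL fetched three different ways.
PROFILES = {
    "direct visit": {"User-Agent": BROWSER_UA},
    "search click": {"User-Agent": BROWSER_UA, "Referer": "https://www.google.com/"},
    "crawler": {"User-Agent": GOOGLEBOT_UA},
}

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html):
    """Return the whitespace-normalized text of the first <title>, or None."""
    match = TITLE_RE.search(html)
    return " ".join(match.group(1).split()) if match else None


def fetch_html(url, headers, timeout=10):
    """Fetch a URL with the given request headers and return the body as text."""
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def titles_by_profile(url, fetch=fetch_html):
    """Map each profile name to the page title served to that profile."""
    return {name: extract_title(fetch(url, headers)) for name, headers in PROFILES.items()}


def looks_cloaked(titles):
    """True when the profiles did not all receive the same title."""
    return len(set(titles.values())) > 1
```

To try it, call titles_by_profile() with your own site's URL and pass the result to looks_cloaked(). A mismatch is a strong hint that something is still serving hacked content; matching titles prove much less, so confirm with Fetch as Googlebot either way.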

22 Responses to “Fetch as Googlebot” tool helps to debug hacked sites

  1. We had the exact same problem with our website. It looked clean, but it was not. Matt didn’t reply to my email though ;).

  2. I don’t even understand why hackers would do this. Are the hacked pages even making any money for them? Do the hacked pages rank for a period of time before Google catches them? It seems that Google catches them pretty quick, so I just don’t see the point.

  3. Fetch as Googlebot is a very useful tool. When I’m launching a site, I always use it, and once the new site has been fetched there’s the feature to submit it straight to the Google index. Very useful.

  4. Hi Matt,

    Very good point. Another way a user can emulate Googlebot is by using curl and doing:

    $ curl -D - -A "Googlebot" site.com (for example)

    That shows whether anything is hidden. Other types of spam/malware only show up if the user agent is Windows but the referer is Google.com.

    We have a quick ( and free) site check that looks for some of those variations along with a long malware/spam signature list: http://sitecheck.sucuri.net

    Thanks!

  5. Thank you for this information. Are there any tips for making a website secure and making it less likely that someone hacks your site?

    Also, the other day I was at my friend’s house and we were using Google to shop. When we clicked one of the results, it redirected us to a site that was clearly different and full of spam. When we went back to the results and re-clicked the link, it took us to the correct site. Was this site hacked, or was it some other element that caused the redirect?

  6. I want to say great job on the webmaster tools, thanks. I have a small question: When does the “URL and linked pages submissions remaining” amount increase? Does it ever refresh?

  7. I was never quite sure what “fetch as Googlebot” was about before. Thanks.

  8. Useful information as usual Matt. I think it should be restated that people can see if Google has detected Malware on their site via the WMT console. Like this: http://cl.ly/image/0H3G031k093l

  9. I have personally tested this tool and it is very useful. Great job Google Webmasters 🙂

  10. How exactly do you know, Matt, if your site has been hacked after the fetch? I’ve never tried this tool, but I want to clarify that first.

  11. Hi Matt,

    Are there any drawbacks to using ‘fetch as googlebot’ regularly?

  12. It’s nice to see that there is an easy way for webmasters to check this. Hopefully this helps to reduce the success rates of hackers to the stage where it’s not even worth doing.

  13. Thanks for the update. This has been driving me crazy for days now. I think Google does have a limit of 50 submissions per week for just the URL. One of the great features of this tool is that it lets you see your site as Google’s search bot sees it.

  14. I am dealing with this right now for several of my sites. Visitors send me the error messages, my host scans and says I am clean and so does my Webmaster tools.

    For another site people don’t send messages, host and webmaster tools tell me that domain is not clean.

    Very frustrating and I have been surfing all morning and just found this about the Googlebot tool. Yea – a new toy!

  15. A couple things to note in the 12-step recovery process 😉
    – Some hackers do heavy spam & light spam at the same time. You might clean out the heavy spam & think the job is done while the lighter, integrated spam remains.
    – In addition to cloaking spam to Googlebot, some hackers install files elsewhere on the server, outside of WordPress, so you can clean up the issue & then have it return days later. If it returns, then in addition to updating salts, WordPress passwords, FTP & database passwords, and cleaning spam out of the database and spam files off the server, you might also need to change servers.
    – Some hackers also embed spam in some of the static files on the site. That stuff tends to use display:none, or to be embedded so deeply in the content area that it is hard to notice at a glance.

    @Brett yes those hackers do it to make money & make lots of it. They tend to target some of the most profitable areas, like prescription drugs and such.

  16. I think this will be welcomed by a lot of business owners when you consider the jeopardy this can cause to businesses and how disruptive this can be to their day to day operations. I think this could also be a wake up call for a number of web developers out there.

  17. Doesn’t this basically show you the HTML code view of your website? What would be the difference between this and right-clicking the page in Firefox and hitting view source?

  18. I use the Fetch as Googlebot occasionally. I forgot that it will let you see how Googlebot actually sees a page. I think most people tend to use it for submitting a new page or post to Google. I know there is the Malware feature in Google Webmaster Tools as well. I guess I figured if a site was hacked then the Malware would be the place to start checking.

  19. After un-hacking a site like this one it is incredibly important to update WordPress code, all the plugins and the themes. If you don’t, the site will just get hacked again, often within weeks!

  20. Hey Matt,

    As wonderful as Fetch as Googlebot is, it is not useful for sites that use SSL/TLS. There is no option for fetching an HTTPS page using Fetch as Googlebot, so for secure sites all the requests result in a 301 response.

    It’s worth noting that a certain other search engine allows webmasters to fetch as bingbot any url, including HTTPS urls, just saying.

    Ivan

  21. nice i saw it earlier but never used it

    i’ll try now

  22. I echo Matt’s question from August – does the number of submissions available ever reset?

    Working in a vertical market the limit of 10 soon gets used up when you have new clients coming online all the time. I guess the answer would be to create a WMT Account for each client but that then means having to remember which account and password to use, and as my clients have no interest in looking at the various bits of information in the WMT Accounts (they are busy earning a living looking after their own clients) it seems to be an extra level of complexity.

    I can understand if you are a single website or even someone that looks after a few accounts but we are one of the leaders in our market and have over 1,00 clients!
