Suggest what webspam should work on next

Today is July 1st, which is a special day because it marks the beginning of the second half of the year. Just in the last day or so there have been a couple of pieces of good news: better indexing of Flash, and we rewrote our “What is an SEO?” guide to improve the tone, then asked for more suggestions on how to improve it.

July 1st is also a good time to sit down and ask the question “What do I want to accomplish during the rest of this year?” I’ve been talking to various people on my team about which projects to tackle next, and I wanted to ask for your feedback too.

In the comments, feel free to suggest projects that you think Google should work on next in webspam. I have a comment policy and I’ll reserve the right to prune comments that don’t contribute to the discussion. But if you have a constructive, polite suggestion then I’d be interested to hear it.

The one other thing I would ask is to please think about your suggestion before reading the other comments. If people read the other feedback first, the suggestions won’t be as independent.

153 Responses to Suggest what webspam should work on next

  1. You’ve already come up with your suggestion before reading this comment, right? 🙂

    One thing I’d like us to do is read through the rest of our webmaster documentation and look for other articles where we could improve our tone or where the information is stale.

    I’d also like us to make a concerted effort to look at the most popular posts on the Google webmaster blog or my blog and see if those posts should be updated or perhaps turned into HTML documentation with more information than just the blog post has.

  2. I’d love to see you guys clean up Google Blog Search. There’s certainly a lot of junk in there.

    I’d also like to see you take on search results in the index. This is self interested on my part, but I see a ton of job boards that have their search results indexed. This wouldn’t be a problem if the content was useful, but often the search results just include the name of a company, and don’t even have jobs from that company. I consider this SPAM, even if it’s big name companies putting it out. I’m sure it occurs in other areas too, but this just happens to be the industry that I’m looking at every day.

  3. The Digital Point Network – that little 1pixel gif that identifies sites in the network should be pretty easy to flag up as spammers trying to manipulate the rankings.

    Also, splogs – made for adsense blogs – this will probably be easiest to tackle by tightening up Google publisher criteria, rather than algorithmic changes.

  4. Hi Matt

    Nice post. I have been working hard on my website over the last couple of months. Not sure if it was ever confirmed, but there was a big shift in the algorithm and many listings got lost; I believe this was called the Dewey update. Can you give any more info on this? My second question is: why is there so much spam in the search results? I have seen so many sites that have no relevance or power, not to get started on the linking. Are there going to be any changes soon to get rid of this, and can we expect the rankings to settle down? It does kill off small businesses like mine when Google makes big changes. It’s a very frustrating time, seeing poor-quality sites rank better.


  5. Clean-up blogspot. A significant # of the blogspot links that I click on are spammy redirects to completely irrelevant pages.

  6. My major suggestion would be probably what a lot of others will say, and that’s to look at spam from the Adsense and Adwords standpoints. If it’s spammy from a search point of view, it should be spammy from an Adsense/Adwords point of view as well. In other words, use the same ranking factors for Adsense/Adwords that you would use for search.

    Personally, I’d love to see you guys start with sites that are spammy and automatically harvest content, such as entertainment blogs (not that I read any…but every time someone has one, they seem to appear on web design forums asking why they “don’t rank”) and YP-style directory sites that present the most basic of publicly available information and nothing of any real use (the network of mini-sites comes to mind there).

    WRT the blog posts/content, my only thought is that if you’re going to update that content, start providing specific examples of things that are right and things that are wrong, and try to do it 50/50 (although with the amount of wrong ideas that are out there, I haven’t got the foggiest idea how you balance that scale out). There are a number of people out there who are being fed lies like the whole “links are what it’s all about” line (you don’t even need a link of any kind to receive referral traffic for at least some keyphrases), and nowhere near enough people talking about things like how to structure a site for both users and SEs. You guys touch on the subject briefly in the Webmaster Help Center, but there could be more of a tutorial-style explanation with examples for newbies, with an explanation of why this sort of thing is so important.

    That’s about all I’ve got for quick thoughts.

  7. Maybe Google could put some more resources into identifying the scraper blogs on Blogger. I find a lot of my stuff gets picked up full-body by pseudo-SEO blogs. They are NOT getting the content from my RSS feed.

    If there were a way we could certify content so that search engines know it is only authorized on one site, that might help.

    I don’t know how you could do that, short of embedding some sort of authentication string in your text.
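    One possible shape for that authentication string, purely as a sketch (the secret key, the normalization step, and the whole registry idea are invented here): the publisher keys a hash of the article body, registers the resulting token somewhere trusted, and a crawler could then check any copy it finds against that token.

```python
import hashlib
import hmac

# Hypothetical publisher-held secret; in any real scheme this would
# never leave the publisher (or a trusted registry acting for them).
SECRET_KEY = b"publisher-private-key"

def content_token(text: str, key: bytes = SECRET_KEY) -> str:
    """Produce a short authentication string for an article body.

    Whitespace and case are normalized first, so trivial reformatting
    by a scraper doesn't change the fingerprint of the genuine copy.
    """
    normalized = " ".join(text.split()).lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()[:16]

def is_authorized_copy(text: str, registered_token: str, key: bytes = SECRET_KEY) -> bool:
    """Check a found copy against the token registered by the publisher."""
    return hmac.compare_digest(content_token(text, key), registered_token)
```

    The hard part, of course, is not the hashing but getting search engines and publishers to agree on who holds the registry.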

  8. Hi Matt,

    I’m sure your webspam team has looked into this already, and I apologize if this question is like a broken record, but will the webspam team look into improving the detection of these so-called “link networks”?

    We’ve seen sites over the past 12 months that rank for highly competitive financial keywords in the UK and are associated with these networks. The networks, in some cases, have the ability to generate millions of links from completely irrelevant sites.

    Is this something the webspam team is considering clamping down on, or has that work already begun and this is now old news?

  9. Hi Matt,

    Thanks for asking!

    1) We are an advertiser and get lots of clicks from random “AdSense” repurposers, pages full of AdSense links. Please try to eliminate those sites that add no value from your AdSense network AND push them way down in your search results.

    Here’s an example:

    2) As a review services provider, it’s frustrating to see a lot of sites that have “blank” pages of reviews (i.e. they have pages and pages full of “be the first to write a review” listings…but very few with an actual review). How can you identify those with actual content vs. a page waiting for reviews. I don’t consider this “spam”…but very poor relevance. And it encourages creating millions of shell pages with no content vs. creating only high-quality pages.

    Here’s a simple example of

  10. 1. I hear that Blogspot users are having a lot of wishes in regards to better spam protection. Not sure if it’s related to Google’s webspam department, but it is related to Google and spam. Ionut Alex. Chitu knows more about the problems I think.

    2. Hand out clearer guidelines as to what is allowed in terms of buying pages on a site (like buying a YouTube channel) in relation to one’s non-nofollowed links pointing to that page, and in relation to non-nofollowed outgoing links on that site.

    3. Not sure if it’s related, but it would be nice to get an “adult content level” indicator as part of Webmaster Tools. For instance, I see far fewer images when doing a Google Image site search of my site with the Safe option enabled than without, and I would like to figure out why.

    4. There is no four, please skip directly to 5. Thank you for your understanding.

    5. Sometimes, seemingly obvious keyword repetition spam is ranking high in Google results (you know, like “buy rolex rolex clocks rolex rolex rolex by rolex new rolex”). Wonder why?

    6. Sometimes, large-networked sites are ranking above what I (certainly subjectively) would consider their earned ranking. A good example is that I’m often looking for PHP functions, and, while not spam and sometimes helpful, it is often ranking above the official definition.

    7. All in all I have to say I don’t see spam that often during my usually English-query web searches. So I guess this list will end here. Suffice to say spam is a huge problem in e.g. Gmail (perhaps not huge in relation to other email clients, which might fare worse, but just in absolute numbers it’s still a major headache). It’s also a huge problem for blog owners in general… why doesn’t Google release a service to help any kind of blogger? Say, a “comment spam level” indicator REST API. When someone submits a comment at say, I can then quickly send the comment as parameter to and get back a number between 1 and 10, everything above 5 being spam (or if I want to lower or increase the threshold, I can). Then my comment software can decide whether or not to block this, or whether to bring up a captcha etc.
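    A rough sketch of how the blog side of that “comment spam level” API might look. The remote endpoint is imaginary, so a toy keyword/link heuristic stands in for the actual scoring call, and the 1–10 scale and thresholds are my own invention:

```python
import re

def spam_score(comment: str) -> int:
    """Return a spam level from 1 (clean) to 10 (certain spam).

    Stand-in for the remote service: a real client would POST the
    comment to the API and read the score from the response.
    """
    score = 1
    links = len(re.findall(r"https?://", comment))
    score += min(links * 2, 6)  # link-stuffed comments look spammy
    if re.search(r"(viagra|casino|cheap rolex)", comment, re.I):
        score += 3              # classic payload keywords
    return min(score, 10)

def handle_comment(comment: str, threshold: int = 5) -> str:
    """The blog software's decision: accept, challenge, or block."""
    level = spam_score(comment)
    if level <= threshold:
        return "accept"
    return "captcha" if level <= 8 else "block"
```

    The point is that the blog software never needs to understand spam itself; it just picks a threshold and decides what to do with the number that comes back.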

  11. I would like to see the ranks of expert answers styled sites reduced if they don’t provide anything but the questions. One example

    The sites which just list lots of links to ebay results are also really annoying. Very often they don’t even have links which are up to date. I see these very often in German search result.

    I would also love to have the “filter” functionality back in personalized search. It made it possible to block these kind of sites even before I see them in the search result. I guess this information would also be useful to the webspam team if enough people would use it.

    I also noticed the blogspot spam redirects recently.

  12. Don’t clean up blogspot, just crawl it frequently, find all the sites the blogs link to, and remove and ban 99% of them from the search index. Actually, just gradually whitelist the search index so that no site not explicitly approved gets in. That’d make my searches 100% relevant.

  13. In local search we are still seeing large industry portals that outrank local websites. The whole point of a local search is to feature local sites, right? Some of these portals in the travel, real estate, and other sectors do not even feature local sites; instead they feature sites that are further afield, mostly there to scrape local results.

    The same thing happens in the ecommerce sector: search queries return large shopping sites that don’t even carry the product or brand but advertise something else completely. A lot of them are basically holding pages with affiliate links, with no real useful content or the product you may be searching for. Basically, most of them just keyword-spam on anything and everything.

  14. Is there a relation between Google spam & Gmail spam? This morning I received a spam message in my inbox (really, too much uncaught spam these last few days…), and I googled the URL of the spamming website; it seems this site is clearly known by Google to be spam.

    And cleaning up Blogspot’s spam might be a good thing too 🙂

  15. Still, huge non-relevant multi-way link exchanges are the biggest problem in several fields. Google can’t do anything about exchanged links (usually in the footer) if the network is big enough and if it’s done smartly:

    1. the linked sites change from site to site
    2. sites do not link to every other site, only to a small part of the network
    3. the links always change their anchor text a bit or a lot

    I spam-reported this problem, but I guess it’s too hard for you to deal with it algorithmically.

    If someone is interested in an example, follow the footer links on this website:
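    Even so, the three-point pattern above leaves a statistical footprint: sets of sites whose outgoing link targets overlap far more than independent sites ever would, even when no single site links to the whole network. A toy sketch, assuming we already have each site’s outlink set (the 0.3 threshold is invented):

```python
def jaccard(a: set, b: set) -> float:
    """Overlap of two sites' outgoing-link targets (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def suspicious_pairs(outlinks: dict, threshold: float = 0.3):
    """Yield site pairs whose link targets overlap suspiciously.

    `outlinks` maps a site to the set of domains it links to.
    Independent sites rarely share many targets; network members
    do, even with rotating anchors and partial interlinking.
    """
    sites = sorted(outlinks)
    for i, s in enumerate(sites):
        for t in sites[i + 1:]:
            if jaccard(outlinks[s], outlinks[t]) >= threshold:
                yield (s, t)
```

    Clusters of such pairs, rather than any individual link, would be what gives the network away.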

  16. Matt,

    I came up with this BEFORE I read your comment… I swear! Like you said… some of the official information from Google gets stale, so it would be nice to see you guys being a little more proactive in maintaining the quality of that information. As I’m sure you’re aware, the SEO community tends to interpret that information literally, so you should try to make sure that everything is as clear as possible and can’t be misinterpreted.

    Also, maybe you could set up some kind of feed or alerts that notify us when pages in the Webmaster Help Center have been updated.

    (For your efforts to proofread the information from Google, I’ll give you a headstart: On the page “Google 101: How Google crawls, indexes, and serves the web” the indexing section needs to be updated to reflect the improved indexing of Flash files.)

    As a continuation of Michael Martinez’s suggestion, I’d like to suggest that Google provides an option to “register” content before it is published. This would allow webmasters and content publishers to submit content to Google, wait for Google to verify that the content doesn’t exist already, and then assign the content to a specific URL.

    I also suggest that the data in the Google Webmaster Tools is updated more frequently, and that all data is time stamped. Certain data already has this, but there are many instances where some data is much older than the rest, and it is difficult to determine when Google has updated only a specific section but not all sections.

    I suggest you provide a way to monitor rankings or suggest 3rd party applications that you approve of.

    Lastly, I suggest that you provide some way for webmasters and SEOs to test their content for spam violations. Maybe you could incorporate this information into the GWT, similar to how you report things like duplicate title tags and uncrawlable pages. Obviously, I don’t expect you to get too granular with this information, but maybe you could report things like hidden text or invisible links–things that commonly occur due to coding mistakes, as opposed to “evil” intent.

    To summarize my suggestions: provide more ways for the honest websites to work with Google. After all… we are working towards the same goal: improve the quality of our content. Thanks, Matt!

    – Darren Slatten

  17. Matt — As you requested, I thought of my idea before reading comments. In fact, to keep from getting distracted, I still haven’t read the other comments yet, so forgive me if this is a duplicate idea.

    My focus is building & optimizing websites for small local businesses who are merely trying to compete on a local level. Let me use a completely fictional example. A local shoe store doesn’t care about ranking for “shoes” in Google’s global results, but when local consumers here in Ashland, Ohio search for “ashland ohio shoe store” — they had BETTER be right at the top.

    Trouble is — there are national portal sites, perhaps in my shoe example it would be (??) They make up a dynamically generated page on their site about Ashland Ohio Shoe Stores – which outranks the mom & pop shop.

    My opinion for an improvement? You need to give greater weight to the geography of the business. Google already has this data, both from Google Local results, and from the info about where the IP is registered. So if is headquartered in Boise, ID — then it shouldn’t be outranking the mom & pop shop on the term “ashland ohio shoes” when the mom & pop shop has been in Ashland for 50 years.

    The blended results with business listings has helped this, but the local small businesses deserve to do better in the traditional organic results too.

    Generally speaking, keep up the great work. You guys rock.

  18. “And why beholdest thou the mote that is in thy brother’s eye, but considerest not the beam that is in thine own eye?” — Jesus

    I agree with Ken Chan — Google’s own web properties are spammy. Like “college student installed PHPBB on a .edu site, then graduated” spammy. Blogger/Blogspot is full of random word salad pulled out of legit blogs and other sources. Does the red flag do anything?

  19. Well, July 1st is also Canada Day 😉

    For web spam, I’d like to see Google be quicker to remove offending sites that have been reported or are obviously spam. I know you like scalable solutions but manually removing some sites could keep people from trying the same techniques while you develop the scalable solution.

    Keyword repetition spam as well as paid link sites still rank way too high forcing us to also consider buying links.

    Real estate seems to still be getting away with crappy links and even crappier sites. There are companies out there that offer template sites and then interlink them for good results for their clients. It’s not fair to some of the other sites with good content. I just see a lot of bad results in the real estate industry so that’s one area that could use work.

  20. For a fresh startup, the process of getting the word out is very hard and tedious. In a simple world, if the logic of “organic search results” were to hold true, the mere mention of search engine optimization would be an oxymoron. Having said that, here are a few of my qualms:
    1. Why can’t G look at the content of a page as a more core signal to establish the rank for that page? You can do a number of very popular searches like “cheap air fare”, “cheap tours”, “cheap web designs”, and in almost all of these searches you just have to look at the top 10 websites to determine whether the techniques a website uses are spammy or not. The whole notion of “content is king” seems to have vanished.
    2. If a user-centric approach is what we are aiming at, and if all we want is for publishers to publish and for G to distribute just the information, then why is the web community at large so quiet when it comes to optimization by virtue of Web 2.0? G should come up with some specific guidelines with a primary focus on Web 2.0.

  21. As some have mentioned, I too would like to see an improvement in the placeholder filtering. Not sure if that is the correct term either, so here’s an example:

    You search for a game, or software, or hardware, or something else. You get a ton of results, but unfortunately many of those, when followed, lead to a page that is almost empty: it has the name of the game, for example, and that’s it.

    I understand that this is basically just the system those sites use: add a title to the database and the system dynamically creates the page. But sadly, most of the time those pages stay empty of content like that for months, even years.

    Placeholder hits? Dumb hits? Not sure what to call these, but I really wish Google were able to push them way back to page nnn or something. They certainly do not belong among the first 10 search result pages.

  22. A lot of “unique” content can be created with a Markov chain generator from relatively small input.

    If you could somehow analyze a site’s content and determine whether it makes any sense, or how artificial it appears… But maybe you guys already do this?
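    To show how little input such a generator needs, here is a minimal bigram version. Note that every adjacent word pair in its output occurs verbatim somewhere in the source text, which is exactly the kind of statistical handle a detector could grab onto:

```python
import random
from collections import defaultdict

def build_chain(text: str) -> dict:
    """Map each word to the list of words that follow it in the input."""
    words = text.split()
    chain = defaultdict(list)
    for a, b in zip(words, words[1:]):
        chain[a].append(b)
    return chain

def generate(chain: dict, start: str, length: int, seed: int = 0) -> str:
    """Walk the chain to emit 'unique' but locally plausible text."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:
            break  # dead end: the last word never had a successor
        out.append(rng.choice(followers))
    return " ".join(out)
```

    A detector could run the same analysis in reverse: score how many of a page’s word pairs also occur verbatim in some other indexed document.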

  23. I’d like to see Google take action against these shady companies who promise to have a special relationship with Google and to get you on the first page if you’d only pay them X amount. These companies should be kicked out of the SERPs, so that innocent webmasters who are not SEO experts don’t get taken advantage of.

  24. Add a “This result is spam” link to every search result shown (next to “Cached” and “Related Pages”). It would send an AJAX query to Google and then dim/disable that result in the listing to show you that you’ve marked it as spam.

    What happens with the AJAX query is up to you. I’m not saying it should actually directly affect PageRank, but there could be something where if you get a bunch of spam reports for the same domain, it would alert your team and give you a nice little report showing that User A searched for “Aruba” and got as hit #5 and didn’t like it, and User B searched for “pliers” and got as hit #7 and didn’t like it, etc, and then you can go through and figure out what’s up.

    Also, what everyone else said about “be the first to review this product” pages. DO NOT WANT!

  25. Not exactly webspam, but do companies like Amazon and other giant retailers need to be that high up in the SERPs just because they are selling products?

    They seem to instantly get high rankings based on the Titles and URLs of their new pages.

    Also Wikipedia is becoming too dominant at the very top of the SERPS

    It would be great to see more variety on page one when doing searches – not just the all too common sites

    Universal search is a super great idea – but those types of links and videos should be on the right side with the Adwords links, perhaps above them

  26. I think the easiest spam to clean up would be image spam. Image search rarely returns what I am looking for in the top 10, 20, or sometimes 30 results. So little attention is given to the name of an image. If it is something like DSC00032.jpg, you know it is just a generic name. However, if it is MATT-CUTTS-CAT.jpg and that jibes with the alt and maybe title text, you know it is probably a good picture of your cat. After that stage, you would then compare the incoming links.
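    That filename/alt-text agreement could be scored roughly along these lines (the generic-name pattern and the scoring are just a sketch, not how any engine actually does it):

```python
import re

def name_tokens(filename: str) -> set:
    """Split an image filename into lowercase word tokens."""
    stem = filename.rsplit(".", 1)[0]
    return {t for t in re.split(r"[-_\s]+", stem.lower()) if t}

def looks_generic(filename: str) -> bool:
    """Camera-default names like DSC00032.jpg carry no signal."""
    stem = filename.rsplit(".", 1)[0].lower()
    return bool(re.fullmatch(r"(dsc|img|image|photo)?\d+", stem))

def name_alt_agreement(filename: str, alt_text: str) -> float:
    """Fraction of filename tokens that also appear in the alt text."""
    tokens = name_tokens(filename)
    if not tokens or looks_generic(filename):
        return 0.0
    alt = set(alt_text.lower().split())
    return len(tokens & alt) / len(tokens)
```

    A high agreement score would then be one positive signal, to be weighed against the incoming links the commenter mentions.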

  27. How about a feature to block/not count specific inbound links? Especially when these are unsolicited, totally irrelevant to my site’s subject, and most likely created with the intent of harming my rankings?

    I also do like the idea to “register” content before it is published/ submitted

    Thank you for the opportunity

  28. I’ve been thinking about this a lot lately. Obviously I’ll second (… third … fourth … ) the Blog Search web spam (which, admittedly, has gotten much better).

    But I’m mostly curious about paid directory inclusions. Working for a largeish web hosting company, I get to see some interesting stuff. For instance, we’ve partnered with Best of the Web to offer it to our customers. Of course, the main reason people would want to be included in BOTW is to get the link juice.

    Now, the paid-link bit in the Google guidelines is pretty clear: a link has to be editorial to pass on the juice. It’s fairly obvious BOTW is not editorial.

    That’s caused my company to want to get into the paid directory game (I’m not personally a fan). The word I heard through the grapevine is that sometimes Google gives paid directories a pass, even if they’re not editorial.

    Is that true? If so, what *are* the guidelines? Why would any paid, obviously non-editorial directory ever have *any* value in determining a SER?

  29. Inside Webmaster Tools, you could tell a webmaster: “Hey! This page is spammy, please rewrite it or we’ll flush it in a month. Thanks.”

  30. I’d like to see you focus on spammers who target micro-blogging sites such as Twitter, Tumblr, Pownce, etc. The end goal of these spammers is to get their content into Google search results. It would be great if you could do something to filter those results out. Most of their content includes shortened URLs (TinyURL, SnipURL, etc.) pointing to their splogs or affiliate pages. So maybe you could drill into those URLs and exclude any Twitter/Tumblr/Pownce pages that include links to known splogs.

  31. I suggest you re-examine your definition of webspam (as I’m sure you do constantly).

    For example, is it right that certain large organisations submit thousands of pages to Google, which Google happily indexes, and when we mere Earthicans go visit, we are told to “Log In or Bog Off”.

    While technically, this may not be cloaking, in effect it is, because Google is obviously seeing the page, but we Earthicans are denied that view.

    With some searches, 50% of the first page is effectively a lie.

    To Google: acceptable. To me: webspam.

  32. … sorry, missed point …

    If you feel you must include the ‘log in pages’ (despite no related content to the search), surely you should at least give it 20-point penalty, to give those sites with freely available quality content a boost over those who charge a fee for their content?

    This is an area where Google is visibly kowtowing to big companies.

  33. Matt,

    One of the things that would be really helpful is to reduce the time that Google needs to take spam results out of the index. The less time that a spammer can benefit from using spam, the less interesting spamming becomes in the first place.

    That also requires a much faster response time to reinclusion requests I think, or even better, not having to need reinclusion requests, but automatically reinclude when the spam has been removed. An identified spam site can remain in the “hidden” (for lack of a better word) index and you can keep revisiting to verify if the spam has been removed. This of course for relatively mild forms of spam which are more likely to be caused by ignorance than being intentional.

    Also it may be interesting to consider looking at spam from a purely page point of view, instead of punishing a whole website at once. It happens quite often I think that part of a site uses spam techniques. This takes out the “shall we remove or not debate”. Then you can simply take the bad pages out and leave the pages that don’t violate the guidelines in. The impact is much smaller and it makes it a lot easier to act without being sued all the time.

    Also interesting would be to do something about 301 redirect link spam, though perhaps that’s already fixed. I’m not really paying attention to it, I just know it exists.

  34. Matt,

    Not sure if this is your department or not, but I would love to hear your thoughts on a Google SEO certification. Much like Google trains and ‘anoints’ certain PPC companies with an AdWords certification, I believe it’s time for some of the search engines to start offering a training program and some sort of certification for SEO firms. I’m unsure how much spam it would knock out today, but doing your best to drive out black-hat tactics before they get into the SERPs might do wonders in the future.

    A side note here: I do believe your reworked “What is an SEO” page will be very helpful for those who know how to read into it. Unfortunately, most SEO firms have multiple definitions for, as an example, cloaking (‘pearly white’) or doorways (‘information pages’), and it’s simply very confusing to any buyer. Going back to my above thought, a simple ‘Google SEO certification’ would go a long way toward cleaning the black-hat firms out of the SEO industry and saving site owners potentially thousands of dollars.

    I’m sure there is a litany of reasons why Google can’t do this ethically – but I like the idea 😉

  35. Take five and go infiltrate the Gmail spam team and make them do something about the Russian spam that keeps on coming to my inbox although I’ve marked this crap as spam hundreds if not thousands of times…

  36. My suggestion is to target scraper sites. They always have thousands of pages indexed, but none of the pages make sense to a human. Can you pull phrases out of a web page and see how many times they appear elsewhere on the Internet? If a site gets too many of these phrase-match hits, it could be flagged.
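    The phrase-match idea can be sketched with word shingles: treat every k-word phrase on a page as a fingerprint and measure how many of them appear verbatim on another page (k = 5 is an arbitrary choice here):

```python
def shingles(text: str, k: int = 5) -> set:
    """All k-word phrases from a page, normalized to lowercase."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def phrase_overlap(page: str, corpus_page: str, k: int = 5) -> float:
    """Fraction of the page's k-word phrases seen verbatim elsewhere.

    A score near 1.0 suggests the page is a scrape; near 0.0, that
    it is original (at least with respect to this corpus page).
    """
    own = shingles(page, k)
    if not own:
        return 0.0
    return len(own & shingles(corpus_page, k)) / len(own)
```

    At web scale you would hash the shingles and compare fingerprints rather than raw phrases, but the flagging logic is the same.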

  37. Great post Matt,

    Recently we have seen a lot of duplicate sites come up in the results, which is a little weird.

    As you can see, both plumbingworld and plumbingstore are basically the same site with different logos.

    Another issue, which I am sure you hear about a lot, is scraper sites; we have had a few issues with sites grabbing our RSS feeds or just scraping our content for their own websites. Obviously that’s a complex issue, but it’s fresh in my mind because I just noticed someone scraping our blog this morning. If you have any recommended defenses or items we might want to pay attention to, that would be great.


  38. Honestly, what about responding better to the numerous spam reports?
    I can understand that there are a lot of them, but some feedback would be nice.

  39. Matt,

    I would like to see more attention on the listings from the Local Business Center; there are many businesses that use different titles to promote the same location (only changing the phone number and using fictitious URLs), winning at least 20% of the space, while other companies can hardly make it to the top.

    If I remember more later on, I will post a new entry here.

    Good luck with that.

  40. Duplicate content:
    Right now, a site with more authority can copy your content and rank above you.
    The best way to identify the original content is to use pings.

    Link exchanges:
    Exchanged links benefit the person who has the most friends, not the best content.
    All exchanged links have one characteristic in common: the external links are consecutive.

  41. Focus on the quality of landing pages for pages in search results. For a few years now, when I searched for answers to my technical problems in Google, the first result was often a website called

    Whenever I opened this site, I closed it. I could not see why Google would rank a site where I had to log in and pay to see the answer to the question. I considered this site spam.

    The problem is that, except for the question, you need to scroll down 8 pages before you see the content you searched for. I never realised that the content was on the page. This happens for me on a page like

    What I think should be considered is:

    Search result landing page quality – Having to scroll 8 pages down is bad

    Bookmark near content – If there is a bookmark close to the matched content, Google should take me to the bookmark, not just the page.

    Content requiring logins should be considered annoying (which to me is close enough to spam) and should rank lower in search results.

  42. Agreed, clean up Blogspot. I have seen many spammers set up a free Blogspot blog. Perhaps proof of identity should be required. In addition, no whois information from registered domain names or web hosting is tracked; perhaps that should be a requirement.

  43. Hi Matt,

    I see a lot of comments about scraper sites here, and while they’re a big problem in my industry (travel), I’m NOT finding that Google is ranking them highly any more.

    Definitely the “placeholder” filter gets a vote from me–something that could rank a page partly based upon the volume of content unique to that page (within the site, i.e. try to detect what’s “template”). Of course, as someone mentioned you’ll have to watch for generated tofu content 🙂

    Now, my own contribution: chuck out the dependence on anchor text! Just because it’s easy for the algorithm doesn’t mean you should use it :-p

    Real, “editorial vote” links are often (maybe MOSTLY?) written with the company name as the anchor text, and the relevant term merely in nearby text, e.g. “There’s a great new company making left-handed wingflots called”. Look at real independent review sites. Look at newspaper stories.

    The “votes” you SHOULD be counting the most aren’t going to use the keywords as anchor text (ok, I grant you the special case of a user searching for a company by name). In fact, links with something OTHER than the company name as anchor text should be an INDICATOR of spam.

  44. Clarify the difference between full duplex link exchange and editor based link exchange for the end user. There is still a ton of misinformation and confusion circulating among webmasters in part because Google has not differentiated between the two with the exception of the word “excessive”.

  45. Hi Matt,

    I think fighting THIS spam would be very nice:

  46. I’d put money on the fact that this is a feature, but it drives me nuts that when I do a search for computer help on a specific topic, oftentimes many of the top results are from a site like meaning that I can’t actually see the content Google indexed without paying. I want my search results to be immediately accessible, without pay barriers in my way.



  47. Hi,

    I’ve been lurking for a while and this is my first comment.

    I have to agree with the poster above: we need a way of verifying that pages aren’t being considered spam due to coding accidents rather than evil intent.

    I’ve already complained about a month ago about a set of scraper sites which have simply copied a load of links (including my site) on extremely similar out of the box layouts. The contact button on all these sites doesn’t work and the domain names have all been registered through the same company.

    I’ve emailed the hosting provider, who were useless and blatantly couldn’t care less… I’ve heard nothing back from Google… It’s really frustrating and makes it so much harder to trawl through external links in the webmaster toolkit.

    A faster, sleeker way to indicate that a site is suspect, possibly backed by an automated spam detector. Spammy sites can be notified to Google, automatically checked and those that are suspect can be dealt with swiftly.

    (Matt, if you want the list of sites and all the details, let me know or help yourself to my Webmaster Tools account.)

  48. Back in the day of altavista, excite, and infoseek I know directly that AltaVista had very strict penalties for search engine spam. When I was very new to the internet and with my first site I had listened to what I now know was a blackhat SEO who gave me what he called a tip in order to do well in AltaVista. He told me to put in a list of my keyword phrases in one paragraph that was just slightly off the color of my background color. Not totally hidden, but hidden for most visitors. He told me since it was not exact that AltaVista would not mind at all. He was wrong. I was banned for one full year for the hidden text. AltaVista was good back then and actually wrote me saying that is why I was banned. That opened my eyes in a big way and is why I am like I am today. 🙂

    Anyway: get much more strict on spam. Right now it’s like a slap on the wrist, with the actual SEO getting off scot-free. The owner gets penalized and the SEO gets nothing. Not right. Further, if a site is caught, then depending on the severity of the infraction, make the penalty or ban actually mean something. Right now it’s worth blackhats’ while to stay black, since all they have to do is “clean up” and back into the index they go. Not good. Not right. It’s not a good motivator for getting rid of spam. If you make the penalty or ban mean something, the blackhat will think twice first, or even more than twice.

    I know this isn’t a specific suggestion, but I think it’s an overall approach that needs to be addressed in this day and age of blackhats boasting about how they do this and that, and how they are experts, etc.

  49. On a loosely related subject, I would love to see Google Canada actually display a LOGO that supports the fact that it is CANADA DAY today.

    How insulting that this got overlooked.

  50. How about trying to figure out a better way to rank websites with less influence from backlinks and more from the content of the page. There has to be a better way.

    Also, I would hope to see many more advantages for those of us using GWT compared to the rest of the world.

  51. I agree with Doug; clients are being told “Cheat like this and we’ll have you on the first page of the SERPS. Don’t worry, if Google drops the site, we’ll have you reincluded in a week. You may lose a couple of weeks trade each year, but you’ll beat the honest guys every time.”

    I’m hearing more enquiries about how re-inclusion works than any other single item.

    Is this REALLY what Google wants?

  52. Matt,

    Use links from non-traditional sources (Google Academic, Google Books, Google Patents) to help identify quality sites. The advantage is that links from these sources aren’t easy to generate, since academic papers, books and patents are all juried to some extent, and while end-runs might be possible, it would hardly be worth the time or expense just to generate a link.


  53. When a site gets in trouble for some reason with the Googlebot and receives a penalty, and the site is set up in the webmaster tools, use the tool to inform the owner of the site what the problem is, even in general terms. Right now people are just shooting in the dark if it’s a minor issue.


  54. Bring back web neighborhoods. Allow webmasters to face the Algorithm God as a group instead of as individual sites. Small groups of webmasters can police the quality of their neighborhood and either keep webspam out or drop in the ranks as a whole.

    Make the webspam-fighting knowledge base a tool kit. Casual web makers and users will get more from a cute embeddable ‘webspam fighting tip of the day’ or a ‘rate your knowledge about webspam’ test than from GWT or the blog, which they may not even know about.

    A badge identifying a webmaster as Google Guidelines Aware, or something like that.

  55. Better protection for legitimate websites.

    This whole war on links, PageRank, webspam and so on is throwing many legitimate sites and pages out of the index and ruining people. I think you need to take a step back and refine these algos instead of adding more imperfect weapons to your arsenal.

    I have one several-year-old forum: thousands of pages, thousands of natural backlinks, vBSEO installed so it’s pretty difficult to get the SEO wrong, unique user-generated content, member links nofollowed, and I never engaged in link buying or selling. It went from thousands of uniques a day from Google to none, with every single page de-indexed completely overnight.

    I’m getting swamped with emails from members saying the site is gone from Google, so there is a notice on the site and I’m sending thousands of people to Yahoo if they want to find pages in the site using an engine instead of the mediocre vB search.

    My other site has “Weekend Pages”, a few weeks ago tons of my better high traffic pages went from top 5 to nowhere. Now they come back to the top 5 for a few days, then they will be “Not in Top 100” for a few days to a week then back again.

    Seriously, my pages can’t be considered some of the best and most relevant on the internet some days and completely irrelevant the next without any variables changed on my end.

    This has caused me great stress and hardship, and at this point I have completely lost faith in Google. I’m exploring other options for my search, email, revenue streams etc. and liking things so far.

    So I think webspam should work on rewarding the good sites more instead of a “nail the baddies with a scattergun” approach; natural selection will order the index favorably for the user without taking out valuable pages and sites in the process.


  56. Sorry for my English. I am Michael Janik from Hamburg, Germany.

    My suggestion: make the PageRank of a site a secret. The PageRank display is one of the reasons the link-buying industry grew so quickly. If a potential link buyer does not know the PageRank, he will not be tempted to get a link on that site. Most link buyers look for PageRank, and they would not buy if they were uncertain about the effectiveness of a link placement. The best sales argument of a link seller is a high PageRank. Just take away their best sales argument by not displaying PageRank anymore.

  57. Better detection and removal of malware sites from the SERPs. I would imagine that infections would turn your users off even more than irrelevant & spammy results.

  58. Matt –

    I would like to see fewer of the “old favorites” in the top 10 both for commercial searches and for informational and a greater mix of “pretty good” sites (even if the SERPs changed every time I re-ran the query!).

    Two points: 1) duplication of similar content from major sites that syndicate their product content and paid merchant listings swamps the top SERPs, and 2) major sites get all the visibility and so naturally get all the links. Both mean it’s a losing battle for anyone but the top dogs.

    I know (and you do, too) that the big companies out there do a lot of work to make sure they do all the right things to make Google happy. Search for many products and you’ll find Amazon, Shopping, CNet, PriceGrabber and all the rest amongst the top results. They have lots of content, well-established sites, and zillions of links (all “natural” of course :-).

    Amazon, and several others are indeed good product sites, but there are a lot of other sites that are specialized and might have more relevant information. And yes, one of my two jobs is at just such a company, and we’re getting squeezed out, more and more over time by the big guys.

    In particular, numerous sites syndicate the same product information from and the same list of merchants, and frequently they each get a separate listing for the same warmed over content. Heck, some of them are the same company! (Try “samsung ln-t4061f”, for example).

    The problem is, people link to sites that rank well in Google. When we were ranking well, we got lots of links. Yet today our site is much, much better than it was (and still better than the competition), but we’re hard to find any more as we get squeezed out. Now relatively few sites are linking to us naturally.

    I have the same problem on my personal web site — entirely non-commercial and an exercise in “doing the right thing” (on several levels, including SEO). I am not getting links there, either. Sure, there are other sites as good or better than mine, and I am not begging for links. But in obscurity, and without a staff of people doing link-baiting, link “acquisition”, deals, and so on, I have no chance of being found by enough people that the 1 in 100 who reads the site will also link to me.

    So the established sites have become the “new spammers”, in a kind of way. The little guy, or even the medium-sized guy cannot effectively break through any more. The big guys are entrenched at the top.


  59. Thanks for all the suggestions, everybody. I’ve been reading through them and pondering them (other than Philipp’s #4 suggestion). I also like what Todd Mintz posted about where he’d like Google to go:

    I hope to be able to respond to some of these comments individually, but I really do appreciate all of the comments.

  60. How about working on something to detect what country/province/state a page is talking about and ranking that page higher for people in that area.

    Coming from New Zealand and working for an NZ-hosted site publishing international content, we often rank a lot better for searches than for searches. In my opinion, if the content is about something happening in the States, it shouldn’t matter that the website hosting the content is based outside the USA.

    More of an algo suggestion than a spam-related one. But it’s frustrating to see that less relevant content gets ranked ahead of our USA content in the USA simply because Google’s figured out we’re from NZ.

    Since the States has around 100 times more people/Google users than New Zealand, it’s effectively an out-of-towner’s tax. We could set the country in GWT, but it’s not clear how this would affect our rankings in other countries.

  61. Another thing I’d like to mention, although slightly outside the scope of webspam, is the re-inclusion process.

    Currently Google is very transparent in terms of what you cannot do: if you cloak you will be removed, if you buy and sell links you can be removed, and so on. But if you are removed or penalized despite complying with the Webmaster Guidelines, due to an algorithm change or circumstances outside your control, the situation is much different. If you are a “known” blogger, such as Ryan Stewart or Dazzlindonna getting PageRank back, you simply post and the problem is fixed in record time.

    For the rest of us, we are left with a form to “confess our sins” and post into a seemingly black hole, then wait for weeks or months; the process is anything but transparent.

    Sometimes the issue may be a simple mistake on the webmaster’s part. If Dazzlindonna, who has been doing SEO for years, cannot even find the problem on her own SEO site and needed intervention from Matt to point out the issues, what hope has the average webmaster got?

    I have seen mentioned Google can’t provide feedback because it will give black hats ammunition, fair enough.

    Why can’t Google Groups have a dedicated reinclusion section where people can post their sites, and if a couple of Googlers click a “This site had a raw deal” button, it would get flagged in the reinclusion queue as high priority and enable some feedback?

    I can’t see cloakers and black hats lining up to have their sites looked at, and Google pays people as “Quality Raters” to remove sites, so what about reincluding sites?

    Despite what goes on behind the scenes, from where we sit everything is well documented and transparent about being removed from the index but being reincluded is guesswork and a message to what may well be a catchall spam mailbox.

  62. Dave (original)

    Matt, shouldn’t your webspam team always be working on the webspam that causes the most grief for Google end users?

    I would hazard a guess that most suggestions given by webmasters and SEOs are self-serving.

    End users vote with their feet, whereas webmasters and SEOs are golden-handcuffed to Google.

  63. Richard Hearne

    On documentation – the Sitelinks documentation is far too obscure IMO. Even the comments on the recent JuneTune were woefully inadequate. (Not really spam-related, sorry.)

    On spam – with paid links, apply the same rules to the F500 as you do to mom-and-pop sites. Unless you start punishing the serious big spenders (or change your entire algo), your war on paid links is destined to fail. As long as you’re more worried that BrandX not appearing in your index is more harmful than letting them get away with breaking the rules, I think the problem will only get worse.

    On GWG – go help JohnMu out more. (Has he gotten that pay-rise yet?)


  64. Anyway: get much more strict on spam. Right now it’s like a slap on the wrist, with the actual SEO getting off scot-free. The owner gets penalized and the SEO gets nothing. Not right. Further, if a site is caught, then depending on the severity of the infraction, make the penalty or ban actually mean something. Right now it’s worth blackhats’ while to stay black, since all they have to do is “clean up” and back into the index they go. Not good. Not right. It’s not a good motivator for getting rid of spam. If you make the penalty or ban mean something, the blackhat will think twice first, or even more than twice.

    Matt: I know you said to comment independently, but I already did that…so now I can second a notion. 😉 And this one is a damn good one…punish the truly guilty (most of them leave a pretty obvious footprint anyway) and warn/educate the innocent.

    As long as the webmaster wasn’t an unwitting participant (e.g. the webmaster’s site was hacked), start the ban at a minimum of one year. And don’t just make the ban a search results ban, either. Cut ’em off from AdWords, AdSense, Analytics… hell, cut ’em off their Gmail accounts (most SEOs are using them to spam their crap anyway). If they’re tied into a Google service in any way, shape or form… cut ’em off!

    From a financial perspective, cutting them off everywhere you can in a blanket form (which would be pretty easy to do) eliminates the financial drain associated with having to deal with SEOs who choose to screw around. Chances are that any “income” they’d provide that big G couldn’t otherwise get their hands on would be more than offset by the losses associated with having to deal with their crap, and the crap of those that follow the “leader” because it worked once, so therefore it has to work a million times.

    If that doesn’t work, I’d be willing to work as a pro bono executioner. 😉

  65. I’d like to see something done about some of the “brand spam” sites. Experts-Exchange is one of the worst offenders. Whenever I search for something technical like an error message, they always show up, yet have no real content other than repeating the exact same question I have (with all the answers blurred out and behind a pay wall). is another big offender. How is it not hit by the duplicate content filter?

  66. I’d like two features added to Webmaster Tools that would avoid unintended “spamming” from complex sites.

    #1: a method (like is offering) to tell search engines which parts of dynamic URLs should be ignored (or rewritten to the correct URL). Of course we can tell SEs by submitting Sitemaps what a well-formed URL looks like, but this does not work for dynamic pages created by user interactions etc.

    #2: a shortcut between the “Delete URL” and “Crawl errors” tools… it should be easy to add a check box or set up rules (again, Yahoo is doing it already) for which URLs should be deleted.

    In a perfect world #1 and #2 shouldn’t be necessary but if you fight with legacy contents and publishing systems it would be very helpful.

    This would reduce “noise” in Google’s index, and that way the results would look less “spammy”.
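    Suggestion #1 above amounts to URL canonicalization: collapse dynamic URLs that differ only in declared-irrelevant parameters down to one form. A minimal sketch, where the parameter names are hypothetical examples a webmaster might declare, not any real Google or Yahoo feature:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical set of query parameters the webmaster declares as irrelevant
IGNORED_PARAMS = {"sessionid", "sort", "ref"}

def canonicalize(url, ignored=IGNORED_PARAMS):
    """Drop declared-irrelevant query parameters so dynamic URLs
    that serve the same content collapse to one canonical form."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in ignored]
    kept.sort()  # stable ordering so parameter order doesn't matter either
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))
```

    With this, `...?id=5&sessionid=abc` and `...?sessionid=xyz&id=5` index as the same page, which is exactly the duplicate-URL noise the commenter wants removed.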

  67. Some people use long title tags which look like spam. Try searching for ‘web host india’ and you’ll know. I am not sure we can classify it as spam, but the title tag looks like it’s been made to tamper with SE algorithms, and doing something like that is not acceptable to Google, right? I hope you find this suggestion useful.

  68. I want to echo Andy Chen’s idea. I would like to see some kind of change toward those sites, with the sweet domain names that they will never sell because they simply scrape content from other websites, and use a “frameset” to pull from search engines and databases only to reap adsense revenue.

    That kind of behavior must make y’all at Google go a little nuts.

    Smiles. Love your blog by the way!

  69. The thing that really annoys me (not quite webspam, but similar) is sites with so much scripting that the pages take ages (several, maybe up to 10, seconds) to render. Half the time I can’t even navigate away from them until they’ve rendered, but I guess this is “hard to do” and also very subjective.
    Page load time (I guess you can only measure the raw HTML download time when crawling) would be an interesting metric and nice to use in the site weighting – or added to the SERP, like page size.

    Following on from some other themes above (I know…), I get constantly frustrated by “you need to register or sign up” sites – I agree that sites with free commentary should rank above those that require a log-in. An alternative would be a “log-in required” label next to the SERP entry – at least let me avoid them. I also second having a “last update date” on each page of the webmaster tools – it’s so good, yet sooo frustrating at times.


  70. A positive rather than a negative strategy, i.e. “how can we better rank and reward sites that are adding value to the Internet” rather than “how can we penalise and remove sites from the Internet”.

    The Internet pages available today are no better than they were 4 years ago in terms of getting “new” information from new “smart thinking”, “hard working” webmasters (who instead spend that talent on backbreaking, soul-destroying work, sending email after email to try and get a link). We all know this is because of the spammers, but how do we tackle it from the positive side, to credit the webmasters who are prepared to work hard? At the moment, if a site invested 100K into producing high-quality written and video content, it would get indexed but would probably not rank for 2 years. It’s a non-starter – ironically, measures originally employed to deter the spammers also killed the development of the net in the crossfire.

    If managed, we would once again see a BOOM in content development like the net experienced between 2002 and 2005.

  71. Dave (Original), while wanting your site to appear in the position within the index that it “should” may seem self-serving, it’s also very frustrating to the end user. When they craft very specific queries to find you and can’t, they will vote with their feet and search in other engines.

    Many end users don’t always bookmark a site but remember what they searched last time to find you; it doesn’t make sense to be #1 one day and nowhere the next for queries with your domain in them. I’m not an SEO, but I’ve unfortunately needed to become one, even though it’s the last thing I want to do.

    One more Webspam Team Suggestion:

    Google Trends Spam, it’s currently being rendered almost worthless by spammers. Example:

    Sad story, but this one highlights the point: the domains are obviously done by the same person and stuffed with keywords, leaving nothing worthwhile to read under recent blog stories. I keep seeing the same domains scraping content based off the Trends feed; some are even on a subdomain called, yes you guessed it, //trends. And going by some of their Alexa stats, it’s working.

    Now the worst part: 3 times in the last week I’ve had my browser crashed and/or taken over by malicious sites targeting Trends, because they can use it to get around the “This site may harm your computer” warnings.

    As a user, I like using Trends to see what’s hot and read stories, but currently I’m too afraid of clicking a URL for fear of being infected with spyware. Plus it’s hard finding real blog posts amongst the spam, which is rendering the service unusable.

  72. IMO the last suggestion, by Steve, is by far the best one. You do not need to ban sites; you just need to reward sites that make an effort and gain organic links. I am working on a curtain site at the moment, and the owner and I have been creating articles for it for about the last 6 months. The site now has at least 100 true editorial links (100% organic). The site is getting beaten by sites with spammy domain names and no editorial links whatsoever. One of the sites beating it has never had a single Technorati reaction.

    Technorati is the best way to spot the sites using paid links. The ones that are just buying will either have no reactions or they will just have links going to their homepage with the right anchor text.

  73. Look harder at partial duplicate content, which is often scraped or taken from RSS.

    One way you can identify spammy directories which aren’t exercising editorial control is by looking at their percentage of partial duplicate content. Usually most of the website entries will be duplicated somewhere else on the net, because the editor won’t have edited the text to fit with the directory’s guidelines and a lot of submitters send the exact same text to hundreds of add-url forms. So rather than checking each page for duplication, you also need to look at each paragraph.

    Take a closer look at the sense of what is written. Can you check the grammar or spelling for the document’s declared language? How many of the sentences are complete? How many are just lists, and how many of them are chopped up? What is the reading age?

  74. Matt,
    Thanks for always including the community in your research.

    I would love to see some additions to personalized search.

    The scenario:
    I am researching the prices of satellite TV with intent to switch from cable.

    How personalized search works currently:
    Right now personalized search will readjust each time I do a similar query as it tries to guess if I am looking for information on how satellite TV works or if I am looking for a transactional component.

    Additional option to personalized search:
    Allow me to click a radio button to remove a listing from my future search results (options to reset of course). This way, sites that I found to be relevant to my research (transactional) will be easier to find in the future. I can actually force results from page 3 or 5 onto page 1 for easier analysis.

    Sites can be removed on an individual basis without any changes to the global search economy. Allowing personalized spam filtering may help both sides of the ball. People looking for informative sites and not looking for the transactional side will actually dig deeper into the SERP and visit pages that may hold value but suck in SEO. It would also help the mom and pop transactional sites who have the best value but can’t get past page 6 or 7.
    Anyway just random brainstorming..

    P.S. I’ve been telling people for years that the “P.S.” is the most read part of any communications so I just wanted to see if it’s still true.
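    The per-user block/promote idea above is just a re-ranking layer on top of the global results. A minimal sketch of how it might behave (function and parameter names are made up for illustration):

```python
def personalized_filter(results, blocked, promoted):
    """Re-rank one user's result list with per-user block/promote sets,
    leaving the global ranking untouched for everyone else."""
    kept = [r for r in results if r not in blocked]       # drop blocked listings
    boosted = [r for r in kept if r in promoted]          # user-pinned results first
    rest = [r for r in kept if r not in promoted]         # then the normal order
    return boosted + rest
```

    So a result the user blocked disappears from their future searches, and a result they promoted from page 3 surfaces at the top, without any change to the global search economy.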

  75. Definitely work on the LYRIC SITES!!!!

    Lyrics are crazy popular online and yet the results are FULL of spammy sites that just make for a horrid user experience.

    Please . . . I beg of you to work on this.


  76. I know Google works hard to control spam. I also know things have improved greatly over the years.

    Expanding on what Adam and Andrew wrote above: I don’t know if it would be possible to identify the SEO who helped a website spam. Here’s a solution to think about:

    Any site being helped by an outside consultant or designer would have to disclose that fact to Google. It could be as simple as registering the site with Google and then putting in a meta tag like this:

    <meta name="DesignerSEO" content="ihelpyou, Inc">
    <meta name="SEO" content="ihelpyou, Inc">
    depending on whether the outside agency is just doing SEO or is acting as both.

    Then a little icon signifying this is displayed in each listing on the SERP.


    If an agency isn’t doing anything they wouldn’t want Google to know about, they’d have no problem disclosing the fact that they are helping the site, right? If the SEO or site owner does not want to make the info public, then either of them should have to pay a fee to Google for keeping it private from the public. Only Google would know the info.

    This would accomplish many, many things as I’m very sure is obvious.

  77. Oh, and if the fee is paid to Google to make the info private, the tag would look like this:

    <meta name="SEO" content="private">

    but Google would know who from the registration.

  78. Matt,

    Personally I would like to see more interaction with webmasters that bother to post spam reports.

    A simple rating system would be good.

    5 – Excellent report, logged.
    4 – Good report, lacking detail, logged
    3 – average, already in system – no log update
    2 – Poor, insufficient detail, insufficient evidence, do not agree with assertion that website is spamming
    1 – Total waste of our time

    Other than that, I would like to see a big update that kicked out a lot of the detritus that seems to still be living off paid links…


  79. Some spammers are not hard to find.

    Here’s an Adsense ad I copied just two minutes ago:

    “Blackhat Seo – Top 5 Ranking In Under 30 days Fully Functional 7 Day Trial Versio”

    Maybe the novel spelling of ‘version’ fooled the spam detectors? ;o)

  80. I like Philipp Lenssen’s and Christof’s suggestions. First, sites that require you to become a member to view answers, like Experts-Exchange, shouldn’t be given any preference; they facilitate spam in my opinion. I kind of feel the same way about sites like “”, but I doubt there’s a way to handle a NYT property gracefully.
    Lenssen’s idea about a Google API to battle comment spam sounds great to me. It would give Google access to blog comments, IP addresses, URLs and ripped/poorly generated content, to further build better spam detection.

    As for my own idea: I was using the Google experimental search with dates and found problems with posts being post-dated (to the future). It would be nice if this were cleaned up, since I think people have caught on that they can post blogs/sites using future dates, like 2050, and be listed at the top of the results.

  81. Hi Matt,

    I would like to see more features in Google Webmaster Tools for webmasters to control older spam submissions. Let’s say that after the BIG UPGRADE in September last year, many of us have a lower link profile due to bad links.

    What I suggest is a spam report like Yahoo! has, so every webmaster is more in control of his own backlinks. Maybe an option to select preferred backlinks for Google, or something like that.

    Thanks for letting us share our concerns.

  82. Federico Muñoa, that’s an interesting suggestion. The webmaster tools group is more of a sister group (they aren’t a part of webspam), but I’ll pass that feedback to them.

  83. Having recently seen a few of my own managed sites completely dropped from the google index, I would love to know the actual reasons why.

    I received one email stating one of my sites violated an adsense policy, but had other sites, not even running adsense, dropped at the same time.

    As recommended, I participated in discussion on the webmaster forum, I tried several ways to find out why, including an email to you directly, yet all I ever hear is the same canned response, read the webmaster guidelines.

    Having been a webmaster for more than 14 years and this being the first time I have had a site dropped from your index… the response is very weak as it is nothing but gray area!

    I know and realize you are not here to provide specifics as to what we should or should not be doing, but when we DO follow the guidelines and still get dropped, there IS a specific reason and that reason should be shared with the webmaster, or ANYONE else who may want to know why.

    Saying: “Read the Guidelines” is doing nothing more than inviting more ways to skirt the guidelines.

    Saying: “Your site was dropped because it DID THIS” would instantly clean up your index.

    Thanks for the pulpit, even if only for a few seconds…


  84. 1. Clean up Blogspot! Lots of scraper sites that take up space and sometimes good subdomains. Sometimes people also just register subdomains and don’t ever plan to use them; I’ve noticed some that have stayed INACTIVE for a couple of years. Re-allow people to register those subdomains.

    2. In webmaster tools, show the PR of each site our sites get a backlink from AND the anchor text the link used. I know there is anchor display in the webmaster tools section too, but it doesn’t tell you where those anchors came from. I would like to see what page linked to me, using what anchor text, and what its PR was, all in one glance.

    3. Have more frequent Q&A sessions like you did some time ago. More feedback between Google and webmasters is ALWAYS good.


  85. Matt, there are two areas close to home that you might look at.

    One, people are now using Google Notes as a place to post their spam content that they then link to from their spam comments.

    Two, people are now creating new Google Groups and then using its “invite” feature (or something like that) to send their spam, pointing, of course, to a Google Group filled with their spam rubbish.

    I’d also like to concur with the above posters: the Blogger/Blogspot spam is still out of hand. I get Google Alerts (!) pointing to it for my preferred keywords every day (but then, I get hundreds of alerts a day). Much of it contains only rubbish content, but a great deal of it contains indiscriminate RSS feed content stolen from unsuspecting sources. You can only mark so much of this stuff as spam, or report it in the Webmaster Tools, before you throw your hands up and say, “Why bother?” There’s got to be a better way.

  86. Google needs to seriously address the scum that are using domains as the entire URL. Sites like, which avoid the “nearly two-link” limitation for host domains, flood the search results for a term across their thousands of domain variants. A URL like:
    cannot be perceived as anything but URL keyword stuffing.
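A back-of-the-envelope illustration of how such keyword-stuffed hostnames might be flagged. The keyword set, example domain, and scoring rule are invented for this sketch; a real system would be far more sophisticated:

```python
import re

def hostname_stuffing_score(url, keywords):
    """Crude heuristic: what fraction of a hostname's hyphen-separated
    tokens are dictionary search keywords."""
    host = re.sub(r"^https?://", "", url).split("/")[0].lower()
    labels = host.split(".")
    # Drop the final label (TLD) naively and treat the rest as one token run.
    body = "-".join(labels[:-1]) if len(labels) > 1 else host
    tokens = [t for t in body.split("-") if t]
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in keywords)
    return hits / len(tokens)

kw = {"cheap", "hotel", "deals", "best", "discount"}
score = hostname_stuffing_score(
    "http://cheap-hotel-deals-best-discount.example.com/", kw)
print(round(score, 2))  # → 0.83: most of the hostname is stuffed keywords
```

A score near 1.0 across thousands of domain variants registered by the same owner would be the kind of pattern worth a closer look.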

  87. On the topic of content registration, I think that would cause more harm than it could possibly help. The 99.999% of webmasters and site owners who are NOT using GWT would get a raw deal out of it, because their content would soon be “registered” to a site scalper. And there would be no resolution to it, since the content *was* “registered”, after all, right?

    Registration is a nice concept, but clearly not a solution to the scalper problem.

  88. I’ve got a potentially interesting one that I’m about to submit further details on via the webspam form. I just wanted to sync both the post and the submission together so that you could possibly keep an eye out for it.

    The issue in question involves non-web documents (e.g. PDFs) and the indexing of content that would likely not be indexed if it were saved as HTML. I’m seeing a few documents show up in SERPs with this particular issue.

    I’m sending you a non-blackhat example via the spam report form as well (just after I finish this post). If you look for the term “PDF exploit” in your spam reports, you’ll see what I’m talking about. Sorry to be so vague…but this is something that would be exploited left, right and center by the great unwashed.

  89. 2 things really get my goat.

    1 – Pay barriers or membership-only content. There are two examples I can think of. WebmasterWorld has many pages indexed in Google, but when clicking on a link, I’m told I need to log in to get the content. This means I have to pay. The same goes for the Delia Smith website. Many recipes are “premium content” but can be found in Google. Surely this is cloaking, and your current webmaster guidelines say this is unacceptable.

    2 – Non-existent or deceptive downloads. For example, when searching for a driver, many sites seem to send the user round in circles without ever offering the download. Great for showing AdSense ads, but no good for the user. It shouldn’t be too difficult to verify the download algorithmically. Another example is websites selling otherwise free software: when searching for MP3 downloads, a website advertises a fantastic system with customer support, blah blah, and after the user parts with money, he finds out he has just bought LimeWire or Kazaa, for instance.

  90. @Robert (second comment of this post)

    You must have submitted while I was typing – I didn’t want to name drop and used the term “link network” instead.

    I was also referring to Digital Point.


    Can I ask what your opinion is on Digital Point? Some sites are generating millions of links from this network, from sites which have 3-5 links at the bottom of their page that change at random on every refresh, perhaps an attempt to make the network harder for Google to find. But as Robert mentions, every associated site contains a tracking pixel that surely could be traced with a bit of “Google magic”!

  91. Mark (the one whose site was banned),

    I had a site banned recently due to the fact that it was hacked (nothing I had done personally…just some random keyword-stuffing hacker idiot) and I found out through Google Webmaster Tools. If you sign up there, add your site, and verify it (you don’t need to submit a Sitemap…just verify that the site’s yours), you may or may not get a detailed answer there.

  92. Phantombookman

    Boxxet !
    Every time they decide to scrape one of my blog posts you rank them and dump mine, presumably as a duplicate.
    This despite the fact that they actually credit me and link to me!
    One aspect for your SERPs as well: a lot of the articles ‘expire’, so when your users click on them they get a 404, and meanwhile the original article is gone forever.

    It’s strange, as I have had blog posts in your index and ranking within 5 minutes, so it’s hard to see how Boxxet beat you to it.
    That said, the owner does boast on his blog that part of their success is due to the efficiency of their crawl system and getting content into Google first. Sadly, it’s somebody else’s.
    Cheers Matt

  93. I have a couple of suggestions for you.

    1. Look into cleaning out worthless doorway sub-domain pages. Certain verticals, for instance the city+apartments search, fill the SERPs with them. These are pure and simple landing pages whose only purpose is to increase rankings. They are clearly violating your TOS.

    2. Institute some sort of penalty for flagrant violations. If hiding text or cloaking only gets you kicked out of the SERPs for a day, why would the spammers not do that?

    3. Fight scrapers. It’s frustrating to create good content and see it attributed to a scraper selling ads.

    4. Lastly, as someone who loves his pets like you do your cat, I have a rather silly personal request. My dog’s site is being incorrectly filtered by the Google SafeSearch filter after above mentioned scraper was hotlinking his photos, and I can’t get it corrected, even with multiple re-inclusion requests. He’s #1 in images for Yellow Lab (for well over a year), but since getting scraped, he’s only there if you turn off safe search. There’s nothing on his site but really cute pictures of Buddy. I’ve submitted 3 re-inclusion requests and started a topic on the webmaster group…what else can I do? Maybe feedback in Google webmaster tools about filter triggers would be helpful?

    Happy 4th everybody!

  94. Matt, I think Google is already doing a great job of reducing the spam that shows up in the Google results, but there are still many things that I think Google needs to do in order to make Google a perfectly spam-free search engine. First of all, Matt, I don’t understand why sites that are marked as “potentially harmful sites” still show up in the Google search results.

    I wonder why Google still shows these sites although Google knows that they contain viruses or spyware, etc. Other than that, there are a few more things I think Google should work on. For example, there are many websites that are scams. People write about these sites on different forums and in other places, and they tell how these sites scammed them.

    If you take a good look at such sites, you will see for yourself that they offer no services; they just want to take other people’s money and give nothing in return. I have reported such sites to Google many times, but I still see them at the top of Google search results. Matt, I suggest you do a little research on these scam sites and you will find out the truth about them.

  95. Forgot to mention earlier, but regarding scraper sites…

    Often people duplicate the content of a legit site, make several Blogspot sites, and post it there. Then when Google indexes the content, it sees all these dupes and it counts against the original site. Surely there must be some way of recognizing scraper sites versus sites that were online long before them.
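For what it’s worth, one classic technique for spotting near-duplicate pages like this is word shingling plus Jaccard overlap. Here’s a minimal sketch; the sample texts and shingle size are made up, and a real engine would also use first-crawled timestamps to decide which copy is the original:

```python
def shingles(text, k=4):
    """Set of k-word shingles from normalized text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Overlap of two shingle sets: 1.0 means identical content."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

original = "the quick brown fox jumps over the lazy dog every single day"
scraped = "the quick brown fox jumps over the lazy dog every single day indeed"
print(jaccard(shingles(original), shingles(scraped)))  # → 0.9
```

A similarity near 1.0 between two pages means one is almost certainly a copy; the open question the commenter raises is attribution, i.e. which of the two was online first.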

  96. Hi Matt,

    Frankly speaking, I had the intention of finding your most recent ‘spam’ post and putting my comments under it, so forgive me if I’m a bit off topic here.

    I wrote a blog post ( on my website about recent changes in Google Turkey ( which was read by thousands of webmasters on other websites. I basically described the changes and updates that you pointed out during recent seminars and conferences. However, a big question remains and nobody knows anything about it. The first shock came on May 26th: my website dropped to a fraction of the visitors sent by Google. I was getting more than 3K unique visitors, and on May 26th it dropped to 600. I checked with other webmasters; some were down to 1/12, some to 1/5. While everybody was still trying to understand, another shock came on June 26th and it affected tons of other Turkish websites.

    The worst thing is there is no SEO coverage or information about what is happening with Google Turkey, the way there is for Google America or Google Europe. None of the webmasters have a clue what they have done wrong or what made their sites lose all their visitors.

    My feeling is that you left the Google Webmaster Guidelines hanging until May 26th to explain where you stand and let everybody know there would be some big changes in the algorithm. So you took the first action on May 26th and there were some optimizations to it on June 26th. However, I don’t know, and nobody else knows.

    So questions are:
    - First off, is this a temporary deal or will it be the same forever? The SERPs are changing on a daily basis and keywords seem to be fluctuating a lot.
    - Is Google Turkey the starting point or “scapegoat” for a bigger change worldwide, or is it something affecting only Google Turkey?
    - Indexing seems to be much slower than it used to be. Are you taking action against duplicate content, or are the Google bots staying away from new content to learn more?
    - There seems to be a big spam storm ( ref) which is affecting only Turkish websites; are you planning to take any action to clean those from your index?

    Thanks and happy July 4th.

  97. AbilityDesigns

    Do NOT fight spam alone !

    1) “Introduce a bulleted, comprehensive spam report feature” in Google Webmaster Central.

    2) “Incentivize honest webmasters who report a certain benchmark number of ‘genuinely’ spammy results.” Reinvest a portion of your massive profits in this ; )

    3) Incentivize the creation of honest and transparent websites in the commercial segment (about us, contact us, industry accreditations, privacy, disclaimer) and discourage sites that are not transparent on these parameters.

    4) “Dis-incentivize spammers harder.” Go all out after them: dig up their WhoIs info, IP, etc.

    5) “Prepare and publish a ‘Spam Index’, just like the regular index and supplemental index.” Publicly shame the spammy sites in front of the web community.

    I’m sure you have the resources to achieve all of the above algorithmically to the largest extent possible.

    Happy Independence Day and keep up the good work !

  98. Matt,

    I join the consensus here in saying that getting rid of scraper sites, whether on Blogspot or using the WordPress automatic scraper plugin, would be the most useful improvement.

    I’ve got a 5-year-old site that gets dozens of quality inbound links every day and yet am not appearing on the first page of Google because of these scrapers. Indeed, I’ve got scrapers scraping my scrapers now and they rank ahead of me, too! Further, I’ve gone from PR 7 to PR 4 in the process despite 22,000 pages of quality, original content, 14,500 Technorati inbound links, and 94,000 plus RSS subscribers. We’re even indexed in Google News! But the scrapers are killing me.

  99. Doesn’t Google have enough behavioral data to find out which sites should rank better (or worse)? I mean if users repeat a search or always go to a different search result, doesn’t that show where there might be SPAM?
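A toy version of the “pogo-sticking” signal this commenter is describing might look like the following. The click log, dwell times, and 10-second threshold are all invented for illustration; whether and how Google actually uses such behavioral data is not public:

```python
from collections import defaultdict

# Toy click log of (result_url, dwell_seconds). A very short dwell followed
# by a return to the results page ("pogo-sticking") can hint at low quality.
clicks = [
    ("site-a.example", 4), ("site-a.example", 6), ("site-a.example", 180),
    ("site-b.example", 240), ("site-b.example", 300),
]

QUICK_RETURN = 10  # seconds; this threshold is invented for the sketch

stats = defaultdict(lambda: [0, 0])  # url -> [quick_returns, total_clicks]
for url, dwell in clicks:
    stats[url][1] += 1
    if dwell < QUICK_RETURN:
        stats[url][0] += 1

for url, (quick, total) in sorted(stats.items()):
    print(url, round(quick / total, 2))
# site-a.example 0.67  (users bounce back quickly: suspicious)
# site-b.example 0.0   (users stay: probably satisfying the query)
```

Of course, a single aggregate like this is easy to game with fake clicks, which is presumably one reason behavioral signals alone can’t settle what is spam.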

  100. Matt,

    “The one other thing I would ask is to please think about your suggestion before reading the other comments.”

    And that’s what I’m doing, at the risk of repeating an already suggested idea 🙂

    – I wish to see the Google Quality Guidelines available to download as a PDF, or a kind of e-book which the Google webspam team would keep updated.

  101. How about going after sites that use link manipulation, abusing the nofollow tag and redirects to outrank the source of their content (such as does)? (Link manipulation for ranking purposes is against Google’s TOS.)

    The nofollow tag was created to stop spam in blog comments and inform Google that you don’t vouch for the source the link points to, not to be used as a clever maneuver to outrank the very source of your content.

  102. Since I had the first comment, I clearly didn’t let anyone else’s opinion seep into my comments. Now that I’ve read the comments, I have another suggestion.

    A lot of people are asking for more severe/permanent penalties for blatant disregard of Google’s guidelines. I actually think that this is the wrong way to go. It creates a greater incentive for throwaway sites. Spammers are all about automation. Once they find a loophole/edge/technique, they repeat it as many times as possible. Long-term penalties won’t hurt them; they’ll just create new sites. The more likely they are to get shut down for marginal techniques, the less likely they will be to actually invest time and effort in creating a valuable site. Harsher penalties will just make spammers more brazen. Penalties also hurt good webmasters who make mistakes, whether they accidentally hide text, hire a bad SEO, or decide to try a blackhat technique out of frustration.

    I don’t know much about how penalties actually work, but I think Google needs to move more towards a filter based system. If it looks like spam, it doesn’t show up. Not because it’s not in the index, but because it is deemed not relevant based on the techniques the webmaster used. This would be just like the way duplicate content filters work (that’s an assumption on my part, but there seems to be a consensus on that from a lot of smart people in the industry). This should create an incentive to go whitehat. Say a spammer realizes that his techniques aren’t working any more. He owns some domain names that are actually decent. If he’s penalized, he’ll never invest in creating content on those domains again. He’ll probably just create spam on new domains. If he can get back in the SERPs within a week or two by going whitehat and generating quality content, that seems like a better incentive to create. I know it sucks for those of us who are playing by the rules that spammers would essentially get off scot free if they become “born again,” but the end results might be better for everyone.

    Maybe I’m just dreaming, but I don’t see harsher penalties helping anyone. Google needs to get even better at identifying spam, and then keeping it out of the SERPs. Kicking sites out of the index for a set period of time will just lead to more bad sites. And what about false positives? The question is how does Google create incentives that reward quality content the most?

    Long term penalties for sites that are seeking short term gains don’t really make sense. Eliminating the rewards of spamming (which is different from penalization) is the only way to incentivize quality, in my opinion.

  103. How about adding a spam rating (or replacing the PageRank indicator) in the Google toolbar?
    I’m suggesting this because more and more sites, with often innocent owners, seem to be penalized for linking to sites that contain spam, while those spammy sites are found in the top 10 for a given search phrase.

    Surely, simply finishing what you already started would be good enough for me. The oldest types of spam, hidden links and invisible or very small text, can still be found in prominent positions from where I’m looking (mainly Dutch-language sites), despite being reported using Webmaster Tools.

  104. Willy,

    You’re right in that spammers are all about automation…but that’s precisely why stiffer penalties need to be put into place. Having the sites in the index on any level presents the possibility that these sites get found. People will search for anything under the sun and 100 things above it, and the marginal possibility that these sites can be found under a query such as “sheep entrails shoved up a monkey’s anus” is worse than no possibility of them being found at all. And since these sites rely on automation, they often pick up hundreds, thousands, millions of these phrases. It’s crime and punishment logic. The objective is to set the punishment at such a level as to prevent occurrence or recurrence of the crime (not that SE spam is criminal…but it’s somewhat similar in concept.)

    This is why you need to go beyond just the SERPs and into other things. Cut off the Adsense. Cut off Adwords. Cut off their GMail (come to think of it…just cut off GMail and let it die before 100,000 more so-called SEOs join the ranks of the spammer community and piss us all off). Cut off anything that could possibly be used for a profit motive. On top of that, make it that much harder to get in…and more importantly, back in.

    This is also why there are some of us who have long been advocates of “opt-in” logic with search engines (including at least one very well-known member of the web community, who shall remain nameless unless he chooses to reveal himself…that’s up to him, though). There is more than enough content out there, and the dinosauric logic of spidering everything that could possibly be spidered is simply no longer tenable. Force people, as much as is reasonably possible, to stand up and be counted. If it can be done in 10 minutes with SSL certificate issuance to a reasonable degree of certainty as far as identity is concerned, there’s no reason the same can’t be said for search engines and related products and services. Follow the Comodo lead on this one, people.

    If anything, though, that’s what webspam should work on next…increases in manual items to eliminate full automation, and more accountability.

  105. You changed the answer to “What is an SEO?”. Does this mean Google changed the algo? If so, are there big differences? Should we expect big changes in our SERP rankings?

  106. This isn’t directly related to webspam, but here’s an example of something webmasters may want to be nice and report privately.

    This page is missing a closing anchor tag </a> after “Webmaster Guidelines”. That’s something I’d personally like to report on the down-low. I’ve made the same mistake (a few times), and so have many others. I wouldn’t want to publicly call out someone for something like this.

    The problem is that there isn’t any form (besides the spam report) that can easily be accessed from the front. It would be nice if there were forms for each of the individual services and areas of the site so that things like this could be reported privately without going through the spam report form. That would make your lives a lot easier too, I’d imagine; you wouldn’t have to deal with all the non-spam issues and be able to focus on the spam issues.

    In particular, I’d personally like to see forms for non-spam issues in Webmaster Tools. There’s a particular question I have about apparent data inconsistency within Analytics and Webmaster Tools, but I don’t want to raise it publicly. I don’t want to use the spam report, either, because it’s not spam-related. But that’s just me with a wish list.

  107. How about rewarding the ‘Good Guys’ instead of always going after the ‘Bad Guys’? Those sites that year in and year out play the game and stick by the rules, only to see sites that openly flout the rules rise to the top.

    Let’s have Brownie Points for having original sites, not joining spammy link farms, or creating hundreds of ‘bogus’ sites simply to get inbound links, for not buying links or hacking vulnerable blogs to get them.

    I’ve reported a site I considered was hacking for links and it was pointed out that a competitor could easily have done this – so let’s have the power to obliterate/discount links we haven’t requested or are deemed to be ‘wrong’ – that would soon sort out the Good Guys from the Bad Guys – and save the Google Spam team a lot of time!

  108. Hi Matt,

    My wish, and probably that of many other self-employed people, site owners, and companies who detect spam, is that Google work more closely with them, so that people need not fear that Google could misunderstand something.

    The spam report could be divided up into more criteria and could be filed only from Webmaster Tools; in return, each website owner would have permission to examine suspicious websites in greater detail when necessary, without the fear that Google could conclude that his spam detection was achieved using tools Google does not permit.

    I have learned my lesson from my misconduct and realised that the only real solution for fighting spam would be closer interaction between Google and website owners, if Google policy allowed it. Of course, my own website must comply with the Guidelines before I complain about another site.

    Google could post a request on all Google pages: “Register in Webmaster Tools, benefit from the advantages, and fight spam with us if you discover that your competition is causing harm”, or something like it. Competitors causing harm, or let us say “spammers”, would see this request, and it would make things very much more difficult for them in the future.

    What would be the advantage? Every registered website owner who tries to keep the results clean in his area could help Google stop and prevent spam while acting, at the same time, in his own interest. If a competitor does not stick to the rules, this is reported to Google.

    It would also be a preventive measure, so to speak; website owners would no longer dare to create link farms, for example, because they would know that any competitor registered in Webmaster Tools acts, to a certain extent, as Google’s eyes. Probably many will now say that this is taking place anyway, as competitors are the ones sending the most reports. However, if Google were to make it official that a new type of report was being introduced, this would lead to a situation where:

    1° Reports that until now were unnecessary would no longer be sent for something minor, as registered webmasters have to comply with the Guidelines. The webmaster has to be sure that there is a breach of the rules and should have a genuine reason, i.e. the website owner is put at a serious disadvantage by the competitor. It would also be helpful if a confirmation checkbox were added at the end of the spam report to expressly confirm that one had read the Guidelines and that there was genuinely a breach. If a website owner notices that his website has no chance at all without spam, he will not register in Webmaster Tools. On the other hand, the honest website owner who registers will get to know the Guidelines perfectly well, will be supported by specialised SEOs and Google employees, and will gain a feeling that Google is just. In that case, he acts not just for himself but for Google as a whole.

    2° The spam report department would have fewer complaints to deal with. Google could also be sure that each report did actually describe misconduct.

    3° No website owner would be able to use prohibited practices, because he would be aware, even if only one of his competitors is registered in Webmaster Tools, that this competitor knows the rules that have to be kept. As already said, Google could ideally release self-designed tools which would make detection even easier. Everybody knows that normal search is frequently incapable of detecting such abuse. That would save a lot of time for Google employees fighting spam.

    It would be helpful if Google were to develop a program that does not yet exist, one that puts other programs in the shade, and make it available to each member of Webmaster Tools to fight spam. With a slogan like “Don’t give spammers a chance: use our tool in Webmaster Tools”, there would be more talk about it, bringing more people onto the right side and getting them to register.

    A huge company like BMW was probably punished because competitors like Mercedes-Benz, Volkswagen, or another automobile giant considered themselves to be at a serious disadvantage and, for this reason, worked very hard to prepare a well-founded report, leading to a reasonable punishment until things were fixed by BMW.

    4° Ever more proper companies, both commercial and non-commercial, would register in Webmaster Tools, and automatically more SEOs would offer help, so that the Guidelines would be increasingly respected; apart from that, the community would grow constantly and, therefore, fight spam ever more effectively.

    For people who cannot speak English or Spanish, the company could put out a call for bilingual support in the relevant languages.

    Google knows exactly which criteria are the most important for fighting spam and punishing incorrect behavior; for this reason, this post is just an idea thrown into Google’s big space for ideas in this thread, which I really like a lot. But this post is also a small cry for justice for all those people who clearly see that some competitors unfairly enjoy good positions despite their misconduct. It is not easy to have to suppress a feeling of unfairness and to have to ask yourself why all this is happening.

    I am aware that Google is working on an endless number of cases in many departments and that not everything can be done at once. That gives spammers the chance to continue doing prohibited things, even after a report has been sent, at least until the report is acted on. Others are not even discovered, because nobody notices them or people ignore the misconduct. If ideas or approaches from my suggestions could be implemented, spammers would have no success, and over time there would be ever more quality information. Honest website owners would not feel disadvantaged, because they would be increasingly aware of the Guidelines and because they would be offered better protection against spammy webmasters.

    Matt, it would be great if any of these suggestions or approaches could be helpful to you in fighting spam. It would also be in the interests of all self-employed people, webmasters, and companies that adhere to the Guidelines, and above all of the website owners who know only 90% but not 100% of when a breach of the rules takes place, due to lack of experience rather than bad intent, if Google somehow found a way of motivating them more and encouraging them to register and eventually find help in Webmaster Tools.

    As I indicated at the beginning of this post, I am writing as a person who is personally affected. With the penalty, my most important keywords have been as good as lost since June 6th, and I now know how it feels to make a really stupid mistake.

    What has that got to do with fighting spam and suggestions for improvement?

    If I had registered in Webmaster Tools right from the start and had taken advantage of the support from SEOs, I probably would not have made the mistake that has put me in the present situation. I would also like to take this opportunity to apologise to you, Matt, for my misconduct, as described in my reconsideration request sent on June 6th. I have since corrected things that I had first overlooked, with the help of SEOs.


    Juan Padilla Sánchez

  109. OK, a lot of comments here. I wanted to go through them all before posting to make sure i am not double posting, but here is my 2c.

    As a webmaster I find it extremely hard to test my website against Google’s guidelines. I mean, I make sure my code validates at W3, and do the obvious about no hidden text, etc., but how can I test this? Could Google post an “approved” list of 3rd-party solutions for testing your website and code for anything that Google deems bad?

    I had a job where I needed to try to clean up a website’s code and make sure there were no bad SEO tactics, and I found the best way to do this was to go through each page. This can suck on a huge website.

    Searching Google for tools is crazy, there are way too many and who knows how up to date they are.

    ~ Jim

  110. Get the spam that’s flagged by the Akismet plugin in WordPress automatically deleted from Google’s index, or at least flagged for manual deletion.

    Deletion, not just PageRank back to zero, and the spammers banned from using AdSense immediately.

    And get this done quickly versus a 1-month delay, which would reduce the incentives to spam blogs and clog up Google’s index.

    I’m sure Matt from Automattic would be happy to work with you on this.

    Maybe someone from Google could get paid to hang out on the blackhat blogs and forums (surfing from outside the Googleplex, of course), learn the spammers’ tricks, and act a little quicker in countering them.

    Take websites that sell spamming tools and remove them from the index, so they’re not findable via Google.

  111. I’m frequently frustrated by searches returning resellers etc. and not the original site. Any search for [Hotel ] used to be a good example, but it seems you have improved (there are now two real hotel websites on the first page, which is better than none on the first 5 pages as it used to be).

    The issue is still quite relevant for the software industry, though. The hundreds of download sites offer no value at all.

  112. Hi Matt,

    I must say you and the webspam team have done a great job the last few years. I believe the results really do get better in most cases. Still, there is one theme that really annoys me: shopping sites, or websites where people can sell their own stuff. I do not know if this ‘problem’ occurs worldwide, but I live in the Netherlands, and we have a lot of those websites.

    The problem with shopping sites or websites selling 2nd-hand products is that the results or product pages often do not exist anymore, because the product has already been sold.

    Do you happen to know if more users experience these kind of results with low user value?

  113. Matt,
    This is not exactly spam, but I’d like to see more clarification from Google regarding how Google ranks domains on the various Google country sites. For example, it is unclear to me why we rank #60 on and #11 on for the same term. If we are relevant for Canadians, why not for Americans? We have customers from all over the world and it would be nice to understand how to target all of them without creating a specific website for each country.

  114. The Google search mechanism can be greatly improved in my opinion if you take into account different security settings using popular browsers.

    By that I mean: what if a visitor blocks cookies and/or JavaScript or any other form of active scripting when he or she enters a site from the Google results? There are plenty of sites where different content is displayed (if any) depending on whether cookies or JavaScript are on or off.

    I understand cases where an account will be created or a checkout process must take place, but in terms of content, sites that come up on the top results of Google, should support all possible browser configurations.

    Therefore, if my current browser security settings, say, block JavaScript, I still want to see relevant content from a site I enter using Google. Seeing the content first, like price tags, model, or description, lets me decide whether or not I will buy something and, in general, whether or not I can trust the site.

    In many cases all I see is a blank page and so I need to go back and pick up the next result. What is the purpose of bringing sites to the top if a viewer cannot see anything under these circumstances?

    For obvious reasons a viewer cannot simply drop the security settings of the browser when searching for a product or an article and Google should take this into consideration.

  115. Matt:

    I have noticed that in Google Maps, many businesses are allowed to submit multiple listings for the same business at the SAME ADDRESS, so long as they are careful to create a different web site URL that is “rebranded” with a false business name and a different phone number.

    This enables them to have three or more listings in the 10 pack and to dominate the 1 boxes.

    You will recall my listing was completely removed after I spent a fortune paying some a-hole who did multiple bulk uploads for my business without my approval. I went on Mike Blumenthal’s forum and you sided with Maps and insinuated I was some mere “fellow” getting “free airtime”.

    I agreed it was wrong, but in mitigation I asked that I be given a chance at re-inclusion due to the vague and ambiguous Maps rules and my circumstances (I asked you to exercise your subjective reasoning).

    I pointed out several instances of an unfair application of the Maps Rules that allows MapSpamming to continue and none of those businesses were removed.

    My request is that if you are going to enforce the Maps spam rules, they be enforced equally and fairly. As it stands, the rules are not enforced fairly, and that will lead (and has led) to problems with public perception (e.g. Google being “evil” and favoring some over others).

    Please look at these posts and you will see what I mean. My request is that you enforce the rules in a manner that won’t ultimately force the FTC to act. I hate government intervention and want to keep the internet free.

    Google is begging for the government to step in when it does things in an unfair manner.

  116. I’d like to see action taken against many of the ‘SEO’ companies out there. Google staff could pose as their customers to learn more about how they operate, and then penalise them and their content. Too many of them are basically paid-for links. For example, companies like Itscoldoutside in the UK have created a mass of their own websites, to which they then add a whole bunch of paragraph content with a link to their customer’s website. They add these links progressively over 3–4 months so as not to be too obvious, so Google sees their customer gaining a steady increase in one-way inbound links; what has actually happened is that the customer has paid good money to this company to create all those links, and the created content is not useful to people.

    I’m sick of constantly being called by these people. I quickly learn that how they propose to ‘get me on the 1st page of Google’ is through created, paid-for links, so I decline; but then they approach my competitors, who go ahead and increase their links. The end result is that my competitor gets placed higher than me, when I’ve done my best not to partake in any dodgy practice.

    It’s a more subtle form of spam, but it should be seen as spam nonetheless. You could then publish a ‘blacklist’ of SEO companies to avoid.

  117. @Robert:

    I am not so sure you are right. If relevant, helpful content is being added and it also contains a link to your site, how is that selling links?

    I write articles and post them on ezine sites, for example. Are you suggesting that is selling links? Please explain. How are you so sure these other sites aren’t just adding relevant content to help their clients gain ranking?

    Isn’t that what SEO webmasters are supposed to do?

  118. Dear Matt,

    In the past our site was punished by Google because some things on our page were against the guidelines.
    It was a heavy hit for us, as traffic from Google is as important to us as it is to almost any web company.
    I found it extremely frustrating how Google treated us: not communicating at all, and not saying clearly what we did wrong. We had no intention of doing anything wrong, and we would have loved it if someone, or some tool, could have pointed out at least which guideline was leading to the punishment.
    Believe it or not, not every site that is filtered out by the spam filters belongs to gangsta-like spammers. The people working at my company are serious, ambitious, and just lovely people with families they work for. Somehow, Google kicked our butts for a tiny reason and failed completely to communicate that reason. Google destroys the jobs of honest people by doing so, and I hope that is not your intention.

    So, if you ask me what you should work on next, I would ask you to find a way to communicate about whatever you find wrong on a site, so that honest people who are no SEO gurus can fix it and everybody is happy.

    I also cannot believe that a multi-billion-dollar business like Google is unable to answer requests for re-inclusion in a timely manner. It generally takes more than 4 weeks to get any response, and most of those responses are standard emails that leave us more in uncertainty than they help. For small businesses that’s just deadly, and IMO Google has to take responsibility and should treat other businesses with more respect than it does now.

    I understand that you might fear giving too much know-how to spammers by doing so, but compared to destroying honest companies and workplaces, I would say that is the better option.
    I have discussed this with a lot of other (honest!) webmasters, and almost everybody I speak to who has once been “caught and punished” is angry about the lack of communication, so I guess you will get a lot of new fans if you manage these points.

  119. It is obvious that major changes have been occurring in Google since February; for some they are major and for others minor. As always, it will take time to understand what Google is trying to accomplish. This time the WMC isn’t replying, no Google blogs are explaining the changes, and even here it seems quiet except for this post asking about spam.
    It may be a minor change, but it seems strange for a site to rank #1 for a term and then no longer be listed in the top 100 for it; visitors still reach the site through long-tail searches, but it is wrong not to communicate. The WMC doesn’t report problems or issues.
    Spam was definitely rampant through June, but now it has been pushed down to the 30+ spots. A new form of spam seems to trickle to the top fast and stay for 3–4 days; in the past it was gone within a day. The sites in the top 30 are much less relevant than they used to be for popular terms. It seems that 2–5 sites of the top 10 are high-media marketing sites, which seem to get an SEO pass or gain value from AdSense or other marketing.
    I know you are asking about spam and not relevancy. Spam will always be an issue. Put more value on sites that have been around, or ranking, for 5+ years; that would stop all the fly-by-night spam sites that easily rank now. Be aware that any site that has been in the top 10 for the past 2–5 years will be targeted by other sites, so learn what spam looks like from the sites doing the targeting. Learn what is obvious link baiting and what are attempts to hurt those sites. It seems to be happening more, and some sites in the top positions are using new tactics that are not part of the guidelines, and are definitely not relevant or updated sites.
    The most important thing the spam team could do is become a part of the WMC, instead of “passing that off to the other team.”

  120. Operation 2008.5:

    1) Identification of paid blog posts – they’re on the rise
    2) Devalue “donation” links from open source type projects that aren’t relevant (if not already done, as it’s easy to catch)
    3) Devalue sitewide juice passed within authority link/partner networks (read: not farms). Example:, and
    4) Re-evaluate links from article submission and directory sites. These sites aren’t used by real people for the most part.
    5) Holy smokes, don’t show PR in the toolbar or directory! (I had to try :p)
    6) Officially tell everyone that comment / forum / press release link spam is near worthless. This hopefully would discourage and stop a lot of spamming and webmaster grief.

    Operation 1984:

    1) Ninjas with budgets to penetrate text link advertiser (aka “paid links”) networks.
    2) Evil alliance with Google Toolbar and Analytics to track bounce rates as a measurement of quality and trust
    3) Can you say blog & forum honey pots? Pooh likes honey, so should you 🙂
    4) Punish the PR hoarders! Identify sites that don’t pass PR naturally: extremely low inbound:outbound ratios, unnatural external site:page link ratios, too much nofollow. (FYI, I don’t argue that this alone should constitute spam, but rather that it should be applied as a 2nd/3rd-pass filter once the first has already been triggered.)

  121. I almost forgot:

    Make diff comparisons of cached deep pages vs fresh pages. If there’s some new and juicy SEO links in the most recent version, that should be a smoking gun right there for a deep link buy.
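    The diff idea above could be sketched roughly like this; a toy illustration using only the standard library, where the two HTML snapshots and the premise that newly appearing links signal a link buy are the commenter’s assumptions, not anything Google has confirmed:

    ```python
    from html.parser import HTMLParser

    class LinkExtractor(HTMLParser):
        """Collect (href, anchor text) pairs from an HTML document."""
        def __init__(self):
            super().__init__()
            self.links = set()
            self._href = None
            self._text = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                self._href = dict(attrs).get("href")
                self._text = []

        def handle_data(self, data):
            # Only accumulate text while inside an open <a> tag.
            if self._href is not None:
                self._text.append(data)

        def handle_endtag(self, tag):
            if tag == "a" and self._href is not None:
                self.links.add((self._href, "".join(self._text).strip()))
                self._href = None

    def new_links(cached_html, fresh_html):
        """Return links present in the fresh crawl but absent from the cached copy."""
        old, new = LinkExtractor(), LinkExtractor()
        old.feed(cached_html)
        new.feed(fresh_html)
        return new.links - old.links
    ```

    Any link in the set difference is a candidate for closer inspection; whether it is actually a paid link would still need human review.
    
    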

  122. @David:

    Yeah, I noticed a crummy site that spends a ton on AdWords and uses multiple mirror sites, all pointing to the same URL, ranking number 1 in natural-language search.

    At least one of these sites has no authority, and contrary to the news interview I saw with Matt, it truly looks like this particular site is somehow benefiting from what it gives AdWords in order to stay at number one.

    I figure this site is probably forking over at least $3,000 per month. My guess is that spam sites that spend a lot on AdWords are not only getting a pass, they are actually getting a boost.

    Can someone explain, or advise as to whether my theory is sound or flat wrong?

  123. Any chance you guys could work on the crack download sites? One thing they come up for is their previous searches, so even if there are no cracks, they pop up everywhere for your software’s name. It’s a real bind if you’re trying to make money by writing and selling your app.

    It hasn’t affected me, really – but I just read today of an author who has a crack site on P1 when you search for their app – it’s very harmful to their business.

  124. I would like to know why the heck VIAGRA ads show up in the PPC results for SEO searches.
    Of all the people I speak with, most think the organic listings are the real results and the stuff on the side is the spam. I wonder why? So, as a Google advertiser, I sent an email and got back the generic “we don’t care” response.

    Um, pardon me, but isn’t that Google’s own revenue model we are trashing by allowing spammers to trash revenue-producing search terms?

    As for the organic listings, there is nothing worse than keyword squatting; the MySpace listings being all junk is a prime example.

  125. Take a look at Blogspot, Matt. I feel like the garden isn’t being tended over there: tracking and penalizing scrapers (OK, “penalizing” is a subjective word; you come up with the Google-safe adjective of choice).

  126. I noticed that the Googlebomb of “failure” and “miserable failure” is starting to yield results of sites like and again. Again, IMO, these sites are not authoritative on their own for that phrase, and if you look at the inbound links, you’ll see that they’re from the same kinds of sites that started the first Googlebomb, but now they’re altering their anchor text, which evidently is all they need to do to get around whatever you did to “fix” the problem.

    As an independent, I don’t have a horse in the race politically (although as an American I admit I am a little tired of the name-calling both ways).

    But for Google’s sake, since this is the most famous example of Googlebombing there is, I would really recommend getting a handle on it, or Google will look pretty foolish for declaring the phenomenon “fixed”.

  127. Not sure if this is a Google spam problem or an algorithm problem, but over the past 2 months WebmasterWorld has been dominated by complaints from long-standing, well-established sites that receive tens if not hundreds of thousands of visitors a day.

    The complaint is basically this. Over the past 2 months these sites are seeing massive fluctuations in google organic traffic, something on the order of 90% up down up and down again.

    This is borderline irresponsible on the part of Google. If a site is well established enough to gain that much traffic there should be checks in place within WMT to help these webmasters if and when they have issues.

    Receiving tens or even hundreds of thousands of visitors a day requires a significant investment in hardware, software, and manpower, an investment most webmasters are willing to make if Google is generous enough to send the traffic. However, if this trend keeps up, it could be disastrous not only for Google users (who won’t consistently be sent to sites that can handle the traffic) but also for webmasters unwilling to build out infrastructure to handle it.

    It could be that spammers have managed to pollute the SERPs and cause this problem; no one on WebmasterWorld seems to have nailed down the cause. For now we sit and wait.

  128. In the past couple of months I have seen spammy sites come out of nowhere to dominate rankings for lots of my keyword targets. Many of these sites are new with little history. Some don’t even include contact information beyond an online form. What they do have are thousands of back links from reciprocal link schemes or triangular linking (mini-nets). One site in particular has something like 200,000 back links, all through reciprocal linking, directory submissions and link farms. Another has seven nearly identical web sites (all eyewear) linking to each other – site 1 links to site 2, site 2 links to site 3…all the way around back to site 1.

    So, definitely take a look at link spamming and mini-nets. I understood that reciprocal linking doesn’t work like it used to, but I guess when you have 200,000, it still does.

    And, I can understand someone having multiple domains with different content that link to each other, but all selling the same thing? Smells like Spam to me!

  129. Matt,

    A new post with some conclusions would be very much appreciated!

  130. I’m not sure if this actually qualifies as spam, but I’m pretty sick of 10 or so comparison shopping site results every time I’m looking for products.

    What I would like to see is Google starting to use the Google Product feeds in SERPs like how Google Maps are displayed for local search results, where the results are filtered by location. That way users could get results for local vendors (based off the business address for the feed) instead of a million billion overseas results from comparison shopping sites.

  131. Hi Matt, thanks for asking!
    Here’s my 2 cents – you should work on –
    # reducing the weight given to spam links (reciprocal links etc.); in fact, reduce the weight given to links as a ranking criterion itself. Your focus on links and more links has given rise to link spam everywhere. What I mean is reduce the weight, not do away with it!

    more later 😉

  132. Maria Vanessa Souza

    It would be great if the Google team found a way to effectively fight those people who use social bookmarking spam to get higher ranks for their sites. They use automated bookmarking software like Bookmarking Demon, and some of them even write blog posts to persuade more people to spam social bookmarking sites as a way to build more backlinks.

    By the way, those who teach search-engine spamming techniques to their readers should be penalized as well. An example would be, but there are many other blogs like this out there. Of course everyone is free to write whatever they want (free speech should always be respected), but if people teach others how to spam Google, then they don’t deserve to have their sites listed in Google’s search results pages.

  133. Pay more attention not only to your search users, but to people on the internet in general.

    – Social ‘digging’ websites: what’s hot and what’s not? Sure, people can spam these too, but it would be good if you guys looked into social networks like this in more depth. Possibly an algorithm that crosses over multiple sites, so you know there is a set quality standard?

    – I think you guys are already doing this, but also look more in depth at how people fiddle your results by setting up IP proxies and then trying to mimic natural searches, targeted at certain keywords and websites, to help themselves rank better…

  134. I feel that there must be a serious unresolved flaw in the Google algorithm, causing a site which is merely “large” to appear to be more relevant than a small site that has real expertise on a subject. For example, a Wikipedia entry written by a person who knows little about a subject will almost always be higher in Google (#1 or #2) than the writings of a known expert.

    One instance is where searching for Martin Luther King gives you Wikipedia first and the really reliable sources after it.

    Another instance is in searching for Aesthetic Realism in Google. A very large site by a Michael Bluejay–about the subjects of cycling, vegetarianism, nudism, economizing on electricity and so on–really has comparatively few pages, all consisting of terrible smears, on the subject of Aesthetic Realism. Yet it is #3 in Google–higher than the writings of Eli Siegel, who founded the philosophy. The only Google-worthy attribute of the Bluejay website is that it has many pages, and many incoming links to them, although every one of these pages is on some other subject.

    Surely all this is the result of some unresolved flaw in the Google algorithm, in which size and incoming links matter too much and authenticity and real knowledge matter too little.

  135. Make links worth less in the ranking algorithm. Google really let the cat out of the bag when it became known that links are gold to Google. All the linking schemes, link brokering, and all that wouldn’t happen if links weren’t king. I don’t know what you’d do instead, but I’m very skeptical of most sites that have tons of links, especially with anchor text in the link. People don’t link like that; they link with a URL: go visit for great search results!

  136. 1. You really don’t have the manpower to handle the massive volume of spam sites in the results.
    There needs to be a positive drive to do something about this. Google should start to use community resources here, every other industry does now. One way to do this would be to set up user groups in each market segment, to assist you. Of course there are potential problems here, but what is the alternative? A few teething troubles or the continued monster levels of low-quality sites in the results? I know about the outsourced teams but this would be different – regional or national groups who know their markets and the players in them. People very close to the action.

    2. Do something about the reinclusion request disaster area.
    This is one of the biggest blots on Google’s reputation currently: mega businesses with inside contacts can easily get their problems resolved, but small, honest businesses are being broken, as there is no way for them to resolve technical issues they cannot fix without your advice. If you don’t like an aspect of somebody’s site (one that is perfectly OK with other search engines), and they need help fixing it and ask you repeatedly for help, then what exactly is the problem? Staffing levels? The cost of actioning these communication requests? No one knows, but it leaves a very bad impression concerning both morals and efficiency.

    3. Get rid of the link value of site-wide footer links.
    What can the possible value be of site-wide links, especially cookie-cutter footer links? Many sites that top the rankings for no apparent reason turn out to have tens of thousands of links, all cloned; there may be hundreds of links or more from any one site. Anything beyond a set number of links from any one site should be disallowed. Don’t tell me the algo does this now, because the evidence directly contradicts it.

    4. A form within WMT to report problems that may be associated with your own site that have been generated elsewhere.
    For example, if you are attacked by someone creating ten thousand scripted links to your site; or generating a bunch of links from real bad neighbourhoods to your site. A Search Quality Issues form for your own site.

    5. Pay-access websites appearing in search results.
    If it’s hidden, it’s cloaked. They can appear on page 2 or 3 maybe, but not on page 1, where people want a route to quick, honest answers. It seems people are dishonestly promoting paid services as public information, which cannot be right. I don’t know how hidden information can appear in search results, but by definition it is wrong. Does this mean that if we spoof Googlebot as the user-agent, we could perhaps get to see the information we are being directed to? Hmm…

    Chris Price (the Kent, UK one – please note there are several)
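    The per-site cap suggested in point 3 above could be sketched as follows; a toy illustration with a made-up link-graph format and an arbitrary cap, not a description of how Google actually counts links:

    ```python
    from collections import defaultdict

    def capped_link_counts(link_graph, cap=5):
        """Count links to each target page, but count at most `cap` links
        from any single source site, so ten thousand cloned footer links
        contribute no more votes than a handful of editorial ones.

        `link_graph` is an iterable of (source_site, target) pairs.
        """
        per_pair = defaultdict(int)
        for source_site, target in link_graph:
            per_pair[(source_site, target)] += 1
        totals = defaultdict(int)
        for (source_site, target), n in per_pair.items():
            totals[target] += min(n, cap)
        return dict(totals)
    ```

    With a cap of 5, a site-wide footer link replicated across 10,000 pages counts the same as five ordinary links, which is the spirit of the suggestion.
    
    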

  137. Nice post Chris (the Kent, UK one)

  138. I think that everyone should stop harping about blogger. I think that the gmail situation should be worked on.

  139. Hey everybody, thank you very much for the suggestions. There are too many suggestions for me to reply to each, but I’ve distilled the suggestions so far down to about two pages of requests. There are a lot of good suggestions for core webspam (e.g. more efforts on artificial/gibberish text), but I was also struck by the number of votes for stronger spam action on other Google properties. Historically my group only worries about spam in Google’s web index–other groups normally handle their own spam/abuse issues–but this tells me that it would help to assist other Google properties more than we do now.

  140. Google should work on improving its customer service. If I email Google, I’d like a response back from a human.

    Currently I get nothing back 99.9% of the time.

  141. Google should have a “Report Spam” button on its web browser / Firefox toolbar, much like Gmail’s, visible only when a user is logged into his or her Google account.

    Whenever a user is surfing a website and never wants to see it again because of extremely irrelevant content, and gets extremely frustrated with it, the user marks that website/webpage as spam so that Google will never show it to the user again, unless the user un-spams it.

    Google SERP pages should not have the spam button, to prevent abuse; it should be enabled only after the user enters the site and decides it is extremely irrelevant.

    After many users mark the same site as spam, Google may consider looking at the site, and should it really be irrelevant, sandbox it.

    Just my 2 cents. There are too many rubbish websites nowadays; time flies when I surf the net, busy clicking the back button.

  142. 1. AdWords spam. As someone who pays $$$ on AdWords, I hate competing with spammers whose landing page is itself full of ads.

    2. Please do not pay so much attention to domain names. Squatters have registered top-level domains, and these domains do not necessarily contribute to the user experience. Searching for “blue widgets” should bring back only if it is not a spammy site, no matter how the keyword blue widgets ranks.

    3. Start concentrating on e-commerce sites. Many e-commerce sites do not naturally get any inbound links unless they have a very, very (may I repeat myself) very (this ain’t keyword stuffing, is it?) very 🙂 unique product. So they either build links through reciprocal links or… may I say it out loud… buy them :p …

    4. Place more emphasis on local search and clean it up! My wife is a dentist. When she, who has no idea what the difference between a and a is, wants to promote her website (designed by yours truly), she should have a way to do it other than by using AdWords; and no, she ain’t selling male-enhancement potions or …. She just wants to know how to promote her business in her local area with a more-than-10-year-old website whose green Google thingy (PageRank) shows a value of… drum roll… 1. And when she searches for relevant keywords in her local area, her site is probably on page 200K+, with the first page filled with all the big guns she cannot compete against. So please practice what you preach, and Do Not Be E…

    5. Maybe it is time to remove that green colored bar. Green reminds me of $$, and if we take the Big G’s word that it is just for entertainment… then I am fully entertained. Please remove it from the toolbar. Otherwise I will be confused as to what to believe: whether to pay $1000 a month for a site whose green thingy says it has a PR of 8, or to listen to you saying I should never buy links!

    enough said

  143. The ability to flag in Google Webmaster Tools which keywords are NOT related to my website.

  144. I have revisited this post because I’m increasingly coming across a form of spam which I thought was dead but am starting to see again.

    Basically what I’ve been seeing is webmasters using ‘frame’ and ‘noframe’ to show one version of content to bots and another to humans. Generally speaking they will have a short, catchy piece of content for human users and a very long copy with keywords repeated throughout.

    My assumption was that the competing sites were very weak, but they appeared to be okay and in one case the competition was very fierce.

    I have reported a couple of instances, but I can still see one of them in the SERPs for that result. It is particularly blatant and I’m really surprised it’s still there. I’d be happy to provide a sample of what I’m talking about via email.

  145. In my last post you can’t see it, but I was trying to show ‘frame’ and ‘noframe’ elements, so it should read –

    Basically what I’ve been seeing is webmasters using ‘frame’ and ‘noframe’ to show one version of content to bots and another to humans.
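    A crude check for this kind of noframes stuffing could be sketched along these lines; a heuristic sketch with an arbitrary density threshold, not anything Google is known to use:

    ```python
    import re
    from html.parser import HTMLParser

    class NoframesAudit(HTMLParser):
        """Separate the text inside <noframes> from the rest of the page."""
        def __init__(self):
            super().__init__()
            self.in_noframes = False
            self.noframes_text = []
            self.other_text = []

        def handle_starttag(self, tag, attrs):
            if tag == "noframes":
                self.in_noframes = True

        def handle_endtag(self, tag):
            if tag == "noframes":
                self.in_noframes = False

        def handle_data(self, data):
            # Route text to whichever bucket we're currently in.
            (self.noframes_text if self.in_noframes else self.other_text).append(data)

    def looks_stuffed(html, keyword, max_density=0.05):
        """Flag a page whose <noframes> block repeats a keyword far more
        densely than natural fallback prose plausibly would."""
        audit = NoframesAudit()
        audit.feed(html)
        words = re.findall(r"\w+", " ".join(audit.noframes_text).lower())
        if not words:
            return False
        return words.count(keyword.lower()) / len(words) > max_density
    ```

    A density threshold like this would of course need tuning against real pages; the point is only that keyword-stuffed noframes blocks are mechanically detectable.
    
    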

  146. Dear Matt,

    I refer to my last comment of 7 July, which appears on this thread, and to my proposal regarding spam and competition.

    In this connection I’d like to say that I wrote to the Google department and reported a misconduct, sending it from the webmaster tool on 7 October.
    If you look at my report, you’ll see that the URLs specified (more than 30, mostly repeating the same link text on each site) are linking to a main target site. Before I informed Google I got in touch with the responsible webmaster, as I thought the owner wouldn’t know of the incident, but it seems they know it quite well. Nothing has changed so far.

    Perhaps my report is still underway, but I am not sure whether I should have sent it directly; I preferred to use the main tool, just as I prefer to use the main webmaster help, because it has many more threads and problem solutions.
    Furthermore, my former reconsideration request was processed using the English version.

    What I’d definitely like to avoid is a situation involving a mixture of information in different languages, or the same information arriving in different departments.

    Is there anything we can do to prevent further misconduct and protect affected competitors, something similar to what I describe in my proposal of 7 July?

    P.S. I’d like to thank you for finding the time to deal with my previous concern, and for your understanding.

  147. Dear Matt.

    In the past few years I have read a lot about your work to enhance Google listings and give users a better surfing experience. So: good work until now 😎

    I’m writing here because, as a programmer, I get a lot of headaches from Google results. Many times, and I mean MANY times… the site of is turning up in the SERPs, and it seems that they have just the solution I am looking for.

    A few years ago they were hiding text and always asking for that subscription, which is how they actually make money. To me this was very clear cloaking, as I never saw the results that I had seen in the SERPs.

    Then, some months ago, they really did post their solutions. They were at the end of their pages, but when I scrolled down I could find the correct text I was looking for. That was OK, as I only had to skip the massive amount of self-promotion 😎

    But today they are cloaking again… and funnily enough, they even tell you that a page is cloaked 8-))))) They do this via their own footer line, which looks like “20071212-EE-VQP-20 – Google – Hierarchy / EE_QW_2_20070628” for the Googlebot and like “20080716-EE-VQP-32 – Hierarchy / EE_QW_2_20070628” for humans. 😎

    To prove this cloaking I did some PDF printing, but unfortunately Firefox and IE do not print the whole HTML document well, so I made an MHT file from IE (save as archive). I also made a PNG screen copy to show the JS boxes that were closed in the print. Look at that URL; it is a link to my public Media-Center, which I get from my email provider. The link is open for 30 days.
    Sorry for the German GUI, but that was the easiest way to publish the documents. Click on “GMX Media-Center starten”. You can view the documents in the browser, or download them under “Datei – Download”.

    I personally think it is time to ban these guys. They have fooled me many, many times with their cloaking in the past, and I don’t know how many million other programmers…

    Have a nice day.

  148. The main problem I find frustrating is looking for an answer to a question, only to find loads of pages where people have already asked the question and no one has answered or provided a solution.

  149. I would like to see Google work on faster removal of deleted webpages from websites. This is especially relevant to e-commerce stores with discontinued products on their dynamic websites.

  150. Not sure if this constitutes web spam, but I was wondering what, if anything, Google does about all the “work from home” scams that are really aimed at Google. For example, is supposedly a get-rich-from-Google work-from-home thing that looks like a Google Business Kit, but really what happens is they charge your credit card $70 per month, so they are the only ones making money. I can’t believe how many people are fooled by this junk. My dad called me today to say he heard about this great opportunity and thought that, since I work on the web, I would be interested… Yeah, I’m interested… in getting them shut down! There are gobs of these scams listed in Google search results:

  151. I desperately want a pair of buttons next to each search result. One button flags a domain as spam, the other flags an individual page as spam.

    I don’t care whether my reports are ever investigated, but I want the power to tell Google “never ever ever show me this domain/page again.”

  152. I think spamming, blogging, AdSense will always exist. Well-written blogs are a goldmine for spammers! If only they could stop the spammers who ruin it for everybody who wants to do backlinking the normal, educated way…

  153. Stop spamming, that would be the greatest but most impossible job of all times!