Webspam projects in 2010?

About a year and a half ago, I asked for suggestions for webspam projects for 2009. The feedback that we got was extremely helpful. It’s almost exactly the middle of 2010, so it seemed like a good time to ask again: what projects do you think webspam should work on in 2010 and beyond?

Here are the instructions from an earlier post:

Based on your experiences, close your eyes and think about what area(s) you wish Google would work on. You probably want to think about it for a while without viewing other people’s comments, and I’m not going to mention any specific area that would bias you; I want people to independently consider what they think Google should work on to decrease webspam in the next six months to a year.

Once you’ve come up with the idea(s) that you think are most pressing, please add a constructive comment. I don’t want individual sites called out or much discussion; just chime in once with what you’d like to see Google work on in webspam.

Add your suggestion below, and thanks!

365 Responses to Webspam projects in 2010?

  1. 1. A Google Chrome webspam report tool extension.
    a. Upon hitting a spammy Google SERP, click a button to launch a mini widget window with the form fields from the spam report.
    2. Set up an email account where I can forward those spammy link-building emails I get, instead of forwarding to webmaster@google.com and getting a no-reply “fill out the form” message. (If I have to fill out the form, I’m just going to delete the email and move on, but if you have someone chasing down these peeps, then make the email address public or have them send me an email, etc.)
    3. Informative blog posts with examples of spam, to better help us report it.
    4. A URL form for scraper sites, as in: here is a Google result; see all these worthless pages using a client’s name in 7 of the top 10 results, with inaccurate information that leads the searcher astray?

    Thanks Matt! PLEASE, PLEASE keep up the good work. =)

  2. Hmmm this isn’t an idea per se, but please give us an idea on the roadmap for changes to Google Webmaster Tools… I did a seminar a few months ago to explain some of its core concepts and it was April 15th! The same day you guys changed everything in it with the more Google Analytics type reports… I love it! But MAN I looked like an idiot briefly with my client. 😀 At least push out a warning a week ahead?! -Tom

  3. Michael Hoskins

    Please work on penalizing scraping sites that purport to be useful. As an example, say I aggregate specific forum topics (such as JavaScript programming) and present this content on my site. Without adding any real value, I present the content surrounded by my ads, and a link to the scraped content. What I am essentially doing is making myself the middleman and skimming ad revenue off the top of what Google would normally do, without providing value to the user.

    I have seen numerous sites do this, and they seem to keep climbing the ranks in search results, despite being directly against both the spirit and letter of Google quality guidelines.

  4. I think 2010 should be a good year for Google to start penalizing non-English websites as well. For 6 years I’ve been a member of an online community that fights email/SEO spam and other types of “bad conduct”; however, in the SEO spam area, Google has not been much support for us. All the sites that community members reported over time for SEO spam are doing just great, and sites that follow the white hat guidelines and so on… are left behind in the “dark”.

    Do you have any plans to improve this any time soon? 🙂

    Thanks and have an excellent “new year” in your fight against web spam.

  5. When I get Alerts, at least half of the results are posted by “admin” using machine generated content. If you wanted to make that service much more useful, Google could figure out a way to filter out the blog-spam.

  6. One thing that I’m amazed still seems to work is publishing many pages of identical content with the exception of a city or town name. It’s not easy convincing clients not to do this sort of thing by telling them that Google is bound to crack down on the practice one of these days.

  7. Well, I don’t know if it is strictly a spam problem, but it is definitely a search quality problem: when you search “nameofaproduct review”, the results will very often be filled with pages that are optimized for “review” but are not actual reviews. These pages frequently link to reviews or automatically compile some review excerpts but do not provide useful information. This is really a pain, as the user needs to go back to the SERPs or click through these pages to access the real stuff. So I wish Google could find some algorithm to boost actual reviews, based perhaps on reputation analysis and time spent on pages (as a real review should show longer reading time).

  8. Malware Malware Malware. Every day I get Google Alerts pointing to stuff that’s scraping my content for their snippets, and serving up spam and/or malware. EVERY DAY. I know the footprints by sight by now, so Google sure oughta. I’m still fighting that fireworks/July crap that’s flooding Google right now.

    Not to mention dealing with clients who unwittingly click on stuff that causes smoke and fire and jello to come out of their computers.

  9. I would be glad to see “Google Me” – that would be my preferred project 😉

  10. +1 to Michael Hoskins suggestions. I also see it as a big problem.

    Also: I’d love to see spam controlled in Google Groups.

    And, it probably doesn’t count as webspam per se, but the spam rampant in reviews for apps in the Android Market is pretty embarrassing. And it’s often promoting sites that pirate paid apps!

  11. Presuming that most (if not all) URLs flagged in blog comments by the likes of Akismet are genuinely spam – as they are manually verified by bloggers as such…

    …would Google consider taking in information from “accredited sources” and use that data to downgrade websites that have been reported as spamming blogs?

    Currently, Google can’t see comment spam links that are deleted by blog owners, so the only penalty to a spammer is to have wasted a few seconds of time. By tracking the deleted, aka, “invisible” comment spam links, Google becomes aware of which websites are behaving in a manner which is naughty and can react appropriately.

    This takes a passive approach (some inbound links = poor ranking) and turns it into a proactive one (some inbound links BUT people are also reporting the site as a blog spammer, so inbound links might be suspect).

    Such an approach might (laughs) even help kill off the comment spam industry as they learn that spamming blogs no longer results in just a waste of effort as the links are deleted – but could actually harm their site ranking as such deleted links are flagged to the search engines.

    Obviously, much tighter quality control of blog spam reporting would be needed to be accredited as a spam source reporter and prevent malicious reports, which in itself would benefit the industry.

  12. Make it possible to follow a spam report. I’ve collected a list with about 150 spamblogs but since I don’t get any response on my current reports I don’t see any point in continuing spending my time reporting the rest of them.

  13. For me the most important thing is: Gmail’s spam filter isn’t effective anymore. I’ve been testing it, and the result is I get 2 spam messages every day. Also, make the spam report more user-friendly, not just reporting the phishing message. The Google Webmaster Team should fix these problems.

    Thanks if you reply to my feedback.

  14. Good question, thanks for asking.

    There are 2 areas that I think you’re already working on, but would likely benefit from more attention.

    The first is sites that simply copy content from other sites and present it as their own. I’m sure you understand why that is a problem, but will elaborate upon request.

    The 2nd is sites that present one thing to you (Google) and another to the actual visitor. I know you already frown on that, but it happens often enough that I think it’s worth more attention.

  15. Providing a regional targeting option within GWT would be great (Asia Pacific, Europe, North America, etc). I have some domains that I cannot geo-target because they are relevant to an entire region, but since I can’t regionally target, I have Europe results displaying in Asia.

    I know concern has been expressed about allowing multiple geo-targets because you don’t want webmasters claiming they are relevant in every country, but allowing us to choose a single region would help keep the un-targeted results from showing up as noise.

  16. Abnormal link profile cleanups. I’ve got White Hat clients. I come across competitor after competitor using blindingly obvious dofollow undeclared paid backlinks with keyword rich anchor text, ranking highly – techniques from years and years ago. If the link profile is inspected, they have keyword laden backlinks on sites that are shared with many other unrelated keyword stuffed backlinks. Report the sites, and the competitors may drop rank or disappear for a while, and then they’re back with a new set of rubbish links in weeks.

    It’s pretty disheartening, and has clients saying “why do we have to do this work to get what you call good quality backlinks, when we can find 40 sites in an hour with undeclared paid backlinks on keyword rich anchor text, supporting our competitors, and in Google’s index? It’s clearly not important to Google. Why can’t we do the same?”

    Part of the problem is that the sites hosting the undeclared paid backlinks appear in search results and often host AdSense; zero-weighting the spammy links isn’t enough. The sites, or at least the pages, offering the spammy backlinks should be removed from the index, and AdSense usage revoked. It’d be a pretty big inducement to stay clean, and would encourage businesses to at least *pretend* to have a normal link profile.

  17. Paid links of course. This is still not fixed. Obviously that is the hardest one but here’s one idea that can at least lower the volume of link trade.

    State in your terms that linking to link brokers is equal to spamming yourself. I mean crack down on the people that spread the word and promote services like textlinkads. Sure that would not stop them from operating but they are somehow dependent on referrals and advertising. Knowing that linking to such sites is against Google’s T&C, people would stop promoting them directly at least. Once the links to them are removed they would also stop ranking for related queries and will only rely on direct hits and word of mouth. This would also act like some market entrance barrier to new brokers.

    Of course there’s no way to fight the personal link deals but you can at least remove the large scale players.

  18. Clean up Google News; some of the things in there are very spammy.

  19. I’d also like to see less of the scraper/aggregator sites, less of the MFA sites and less of those pages that are stuffed with content cut, copied & pasted from “Press Release” sites. The dead giveaway is often a poorly-designed site (e.g. a bog-standard WordPress template) crammed full of stuff that a quick look at Copyscape will show up to be one of many duplicates…

  20. I think Google needs to better examine the value of links, especially those links being built automatically via spamdexing: forum profile links, automated blog commenting, etc. I have seen the quality of results go down and more people succeeding at boosting their rankings. Frankly, I think caffeine has a vulnerability in this respect.

    A good example is the search “P90X Workout”. Those of us who monitor that keyword have seen something alarming in the last 2 months: websites and URLs less than a few months old building thousands of links and taking over the SERPs. And what’s even worse, they seem to stay there. Most of what’s on pages 1 and 2 is a good example of spamdexing. Some point to their own sites, others to pages on Web 2.0 properties like HubPages, Squidoo, free article sites, etc.

    Note: the results for this search term include a listing removed under the DMCA. That site (Google techs will be able to look it up – I won’t name the domain) built an average of 10-20k links per month and stayed in the #2 spot for months until it was removed.

    I can share more details and examples if anyone cares to know more.

  21. Better detection of spammy content injected by hackers into legitimate websites, and discounting of it. Better distinguish between legitimate and injected content – penalize/discount only the illicit content. Make it not worthwhile for hackers to spoil the lives of normal webmasters.

    +1 for Ian Mansfield’s suggestion on utilizing data from “accredited source”.

  22. Regarding Michael Hoskins’s comment: isn’t alltop.com doing the same thing? Would Google ever ban alltop.com because they scrape content via APIs and RSS? I personally think that these aggregator sites do not provide much use to the overall online user 90% of the time, but can Google mark all of them as “spam”?

  23. Matt,

    This is great – whether it helps or not, here, just two of the areas on my wish list…

    1. Article Marketing. It’s just gotten completely out of hand and I can’t see that there’s any really high quality content out there. It’s one of the lamest concepts on earth. Personally I don’t fear it from an SEO perspective as it relates to my work and my clients, so much as I see it as a pollution problem when I’m a consumer looking for quality information as it applies to my own life.

    2. Brute-Force volume of noise.
    Example markets include health topics, insurance topics, and legal topics, to name a few….

    Article marketing spam as just one aspect.

    Pseudo Non-Profit or “news” sites run by and for the purpose of lead conversion and sending links back to their parent site, or for AdSense.

    These are then combined with very low quality high volume “news articles” posted to blogs and given “news” RSS feeds (which themselves end up polluting Google Alerts).

    In some markets, (legal and insurance for example) these are then combined with the “free page counter” sites that many of them generate for the sole purpose of embedding links to one or more of their sites, generating thousands of low quality inbound links.

    By themselves, any of these tactics would probably be discounted, however I think the sum-total of all these efforts drives up really low quality results in the Google SERP. It’s like the combination makes for brute-force weight and it’s why I think I see so much garbage on the 1st page of Google in some markets.

  24. Matt, one thing I am seeing working very well for a vertical we are in is this:

    A site is buying expired domains, mostly on education topics (i.e. with a lot of inbound links), and then instead of 301’ing them over they are putting the old content back up – and making themselves the “sponsor” of the site with banner ads… nothing wrong so far, right? But on every domain there is always one in-content link on the homepage that does not have the affiliate code and always uses anchor text. This is working very well for them. I was able to find out how vast their network of bought domains was by searching for the phone number on their pages. They did a VERY good job of covering their tracks. I think that by putting mostly affiliate-looking links on the sites they bought, it would look to a human like a rogue affiliate is linking to them, but the fact that the one anchor-text link on the homepage always flies through was an interesting find.

    I don’t know what you guys can do about it, but man it sucks watching this fly.

  25. I can live with the quality of results g currently has. What drove me to use BING for the last year was all the changes to the layout. If I have to learn a new system, I might as well learn something entirely new like Bing.

    There is no spam. There are only bad search engine algos.

    > wish Google would work on.

    1) SERP design Spam: I’d like to go back to the clean interface we knew and loved. Call it “Classic” or “Traditional”, but I wish it worked like it used to.

    Specifically:

    – Universal Spam : I don’t want “universal” spam results. I don’t care if it just happened and is in the news. I want to look for websites. If I want news, I will click the news tab.
    – Video Image Spam : like uni spam, I can do without the images. It just slows down the display and is overly prominent.
    – AdWords Spam : Can’t tell the diff between adwords and universal spam.
    – Layout Spam : I really miss the classic interface. The left side column is totally useless page spam.

    2) WebmasterTools.

    – The quality of data in these utils is bad. Particularly the keyword and referral data. I am going to stop there before I say something I mean.

    3) Search Results:
    – Seem to be in good shape. When I generally go to Google, I know enough to skip the first 2 results (outside of deep long tail, I find they are most often not what I was looking for, so I skip over them to start).
    – I’d like 2 modes of searching, “research mode” and “purchase mode” where I would get either heavy news/edu/wikipedia results or “purchase mode” where the results would be ecom/spam heavy.

    Jon 😉

  26. Some of the autogenerated spam stuff that is plaguing the top 10 serps, or stuff piggybacking off of high authority domains. I know you said you didn’t want specific sites, but for some example queries see [better mortgage], which has a parked domain ranking and had a shine.yahoo.com spam site for ages (still in top 20, Yahoo removed the page, G didn’t automatically catch it). I also found lots of long tail stuff recently that was ranking that shouldn’t be, sometimes 4 or 5 out of the top 10 being fluff pages. I know that Google might place less “importance” on these non-popular queries, but I think it definitely speaks to G’s ability to determine quality.

  27. Matt, personally I’d like to see you guys work on making rules and guidelines apply to big sites as well as small. Large sites that have empty pages, low quality questions or just all scraped content continue to pollute the SERPs while the average webmaster is penalized or banned for the same infractions.

    We see sites like Mahalo (a great, but by no means the only, example) making no doubt millions in AdSense revenue for themselves as well as Google, while guilty of countless infractions against Google’s guidelines, and I believe it sends the wrong message to webmasters.

    Along with sending the wrong message, it also often frustrates webmasters to the point that they’re willing to break the rules themselves just to compete with this nonsense.

  28. Link buying.

    If Google can improve their ability to sniff out sites that engage in link buying/selling and penalize them accordingly, that would be great for new sites that are trying to earn PageRank, keyword search juice, honest link-backs, etc.

    Over time, I know that link buying doesn’t really work, but in the short-term, I have seen newer sites suffer, even though they have consistent relevant content, because they don’t accrue 500 links in month one.

    Just my $.02. Thanks!

  29. I want to suggest using Akismet or other community monitored spam detection services, but that would make blog spamming into a Google-Bowling activity. If it became known that submitting links to blogs was cause to get suspicious about a site, then falsely submitting sites in blogspam would become more common.

    However, the biggest risk appears to be sites with blogs/forums that have no comment protection and aren’t using nofollow for unmoderated UGC. The minuscule fraction that do allow the crap through causes the blogspamming industry to keep churning. Like email, it only takes a few buyers/dofollows to justify the effort.

    So… use Akismet and other sources to zero-weight pages that host spammy comments, and report those pages to site owners in the webmaster tools. It’ll help site owners understand that they need to moderate UGC, and will decrease the value of blogspam more effectively than anything else I’ve thought of… though there’s lots of stuff I haven’t thought of.

    Marmite coated weasels trained to attack blog spammers. Just thought of that… Damn but I’m good. Trained sharks, with dry suits and caterpillar treads, armed with skunk/chilli spray, suitable for that PacRim spamming industry? Whew, on a roll! Nah. Maybe some report in the webmaster tools that your pages are filled with essence of spam might be better, and the implication that the more spam you got, the lower the weight…

  30. Jonathan Mills

    I wish Google would do a better job of cracking down on SEOs who are using Google’s name in promoting their business and making guarantees. (master google.com) for example

  31. An easy way to report webspam, something that can be integrated into the browser, like a Chrome extension?

  32. Morten Bock’s comment from last year still rings very true. I realize it’s in Google’s interest to show the adsense-saturated sites as opposed to the official site, but hunting through the search results for the authoritative result shouldn’t be part of the Google search experience.

  33. I’d have to agree with letting us know about Webmaster Tools changes. My boss, who knows nothing about SEO/SEM, asked me to show him what exactly I was doing. I had trouble navigating through Webmaster Tools and it was kind of embarrassing. A heads-up would be great.

  34. I’d love to be able to (as a user) rank or penalize sites. When I get results that turn out to be articles saying approximately nothing and surrounding themselves in ads, I always wish that I could do something to push them down in the rankings.

    Is there a way users could let you know whether the information was useful?

    This would also be helpful for many of my clients who have bad content on their sites: having a wake up call in the form of dropping in the rankings might be just what they need. Of course, I’d also want to be able to notify Google when we change their content so that they could climb back up.

  35. What I call 3rd Party website content spam, which includes posting meaningless, yet coherent, duplicated keyword spam content articles across many (hundreds of) free blog sites with a link to the e-commerce site in question, should be looked at more closely for spam filtering. These typically also have a laundry list of keyword stuffed anchor text links to the same page or site as part of the links section of said blog post page.

  36. I think Google’s number one webspam priority should be to address the high SERP rankings being attained by sites built around keywords for no real reason other than to generate AdSense revenue.

    The adsense component leads to the appearance that there’s a conflict of interest for Google so it hurts your reputation.

  37. Google needs to realize that it is their algorithm which has created the link-spam problem. The algorithm is too narrow and it places too much emphasis on links. It is due to this emphasis that link-spam has become so prevalent.

    Google has created this beast and it will never slay it. There are an infinite number of pages created every day through automated content creation tools for the purpose of generating links. With time, the tools will just become better and better, and Google will need to constantly upgrade its spam detection methods. It is a battle that will never end.
    Your job is very secure Mr. Cutts!

    I think we need to stop focusing on links so much. Links in the wild are no longer natural. There is not a webmaster on the web who is not aware of the “value” of a link. What kind of crazy Internet has Google created? Nobody links any more without at the very least being aware that they are passing link juice.

    So what is the answer? Facebook probably has the answer. Social sharing. The universal Like Button. If real people are liking it and sharing it then it is good content. How to verify that they are real people? I’ll leave that to you Mr. Cutts as you will no longer be needed to head the link-spam department because link-spam will no longer exist once links are no longer part of the algorithm.
    Cheers !

  38. I think in the next 6 months Google should work on taking out of the index websites that are there JUST to make money via AdSense on the page. These websites have no value for a user and are definitely not what the user is looking for in his search.

  39. Matt, something I am sick of seeing in the top results is nonsensical pages. Obviously they are scraping random content or simply having non-English speakers generate it. But it seems to me that if MS Word can detect incorrect grammar when writing a document, why can’t Google detect pages chock full of random words that have no grammatical value at all? I mean, I know I may not use words correctly all the time, but some of these sites are obvious. They aren’t even close to proper grammar. Where is the value for the user here? I would love to see those go away through some sort of grammar-checking technology similar to MS Word or something.

  40. I think that given the whole Buzz thing where people I blocked were able to re-follow me the next day, maybe you should beef up the security there rather than focus entirely on webspam. Just sayin’…

  41. One thing I often find is that, as much as Google says it discourages it, keyword stuffing still seems to work for at least some websites that do it. For instance, If you search for info about a car in India (by typing any car model name in the search box) the first site you get (I won’t use the name here) obviously does a lot of keyword stuffing, unnecessary bolding etc.

    Of course, it might be the case that Google is actually penalizing it a bit for the same, but the other ranking related factors are so strongly in its favor that it still comes out tops. But it certainly is far from the best site for finding car related info/reviews etc, so I am suspicious about how it manages to rank highly.

    So I think you could investigate this particular case, and in general just put a lot of energy into improving your keyword stuffing detection algorithms.

    All said and done, you guys still do a better job of these things than your competitors by far, but still there is scope for improvement. Hope this suggestion helps.

    Best wishes.

  42. Integrate a tool into Webmaster Tools to see if your website is blocked in various blocklists and why. Then make it easy for the user to fix the problems and remove themselves from that blocklist.

    This could also be done with email spam as well. Have a tool which shows if that domain is blocked.

  43. Add a settings page on GMAIL to filter on accepted languages. I should only receive emails in English and Spanish. Any other language is SPAM to me.

  44. The biggest issues I can think of are:

    1) Web pages that don’t deliver what they promise. The page doesn’t have what the title and description say it has.

    2) Pages that don’t have real content. Whether they are scraper sites, or empty review pages, or computer generated tag pages with no content, etc.

  45. For our small company, we spend a tremendous amount of resources on fighting spam, and as an open publishing platform, we want to do more and would love Google’s help. Here are a few areas we would like help with that would allow us to do our part in cleaning Serps up – most of these suggestions could be delivered via WMTs.

    1. I’d like a service that identifies content on our site where people are spamming other sites to promote it so we can take down the content.

    2. I’d like to get a list of links from our site that are to low quality domains or part of link spam.

    3. I’d like notices of pages that were removed from the index. We’ve seen some pages appear to be permanently removed from the Google index. We would love to know why, so we can prevent similar pages from getting published and indexed in the first place.

  46. Matt,

    I’m sure this isn’t your specific area, but something needs to be done about Google Places content submissions, verification and maintenance. Unfortunately I don’t have any suggestions, but in recent months a lot of good profiles and accounts have been blocked and the owners are not being notified why. It would seem that there are issues with multiple people trying to own one business / location, but I cannot say for sure.

    Anyways, if you pass this along or shed some light on this matter it would be greatly appreciated.

  47. It would be great if Google implemented content signatures in Google Webmaster Tools, something like signed emails. For example, I create a private key and store it on my website; the same key I send to Google in Webmaster Tools.

    When I post a new article, blog post, or something like that, I can add a signature for the page using a meta tag:

    meta name=”google-site-content-signature” content=”XXXXXXXXXXXXXXXX”

    The signature can be created using MD5:

    md5($domain . $content . $dateTime . $privateKey);

    Now, when I create these signatures for some posts on my website, it would be great if I could set a parameter for my website in Google Webmaster Tools, for example
    “Prevent copying of signed content on websites outside this domain”.

    So when Googlebot finds scraped content on some other website, it will be obvious that the other website is useless.
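
    Just to make the idea concrete, here is a rough sketch of the signing step (only an illustration: the meta tag name and key handling are hypothetical, and a stronger hash than MD5 could be dropped in):

    import hashlib
    from datetime import datetime, timezone

    def content_signature(domain, content, date_time, private_key):
        # Hash the domain, the article text, the publish time and the site's
        # private key together, as described above.
        raw = (domain + content + date_time + private_key).encode("utf-8")
        return hashlib.md5(raw).hexdigest()

    def signature_meta_tag(domain, content, date_time, private_key):
        # Emit the (hypothetical) meta tag that Googlebot would read.
        sig = content_signature(domain, content, date_time, private_key)
        return '<meta name="google-site-content-signature" content="%s">' % sig

    print(signature_meta_tag("example.com", "My original article text...",
                             datetime.now(timezone.utc).isoformat(), "MY-PRIVATE-KEY"))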

  48. If we consider copycats as spammers, Google seems to have failed at handling them in my case. I have seen copycats ranking higher in the SERPs than me. Where should we report them? Should we go directly to the police? Please help. I even contacted some of the copycats to remove my content but got no reply.
    There are also some auto-bloggers who take content from many feeds, still remain at the top of the SERPs, and even serve AdSense.

  49. Replying to the 1st comment: I created a blog a year ago (not revealing the link) which contained email spam as examples to be aware of. But I was afraid Google would consider me a spammer and delete my account. Should I continue that blog?

  50. Local, Local, Local, Local, LOCAL, LOCAL, LOCAL….

    Seriously, it’s much better than it used to be but local business listings need to be steal-proof and as accurate as possible knowing that the bulk of local business owners will never register their business with Google.

  51. Discussions / Bulletin Boards on the Web

    I think Google should work on the way they handle discussions on the web. There are so many bulletin boards and mailing lists, but I really think Google does not handle discussions very well and lists a lot of useless stuff because of that.

    Google tries to list threads of bulletin boards with a special snippet in search. This works quite well for most software (phpBB, vBulletin, IPB), but I have seen some sites where this went completely wrong. Google often does not get the post count and the date right, and it seems to be hard to work out which is the next page of a thread.

    I miss ways for webmasters to tell Google that the content of a page is part of a discussion. I would love to see e.g. a microformat proposal for discussions and related discussions. For a start, Google could at least tell webmasters what format they should use for their custom forum to get a correct listing for their threads.

    Best Regards
    Tobias

  52. I think one of the biggest problems is the false negatives. You filter many pages out for more than one year (filter, sandbox), but those are often pages that are useful. Here you should do some fine-tuning.

    In general, I think your way of filtering or sandboxing is wrong. If you see that somebody buys links, if you find duplicate content on the page, or if you see that someone does blog spam, you drop his page down to the 10th or last SERP page. But it can be a really cool site. Instead of dropping it out of the rankings, you could reduce the score of the site, so that if the site is much cooler than the competitor sites it still has a chance of ranking.

  53. One word: Mahalo 🙂

  54. Hi Matt,

    I’m the editor of a large travel website. Since 1995, we’ve written and published thousands of articles about tourism destinations worldwide. What I’d like to see Google tackle, from a spam perspective, is the amount of plagiarism that abounds on the Internet today. Time and time again, I see other websites pilfering our content (articles, paragraphs, business listings, photos, you name it) and claiming it as their own. Often, the offending sites even have the gall to put a ‘copyright’ notice in their footers!

    My question is, can Google come up with a system that better recognizes pilfered VS original content — and rank accordingly. When another site steals our content, I’m always concerned that it looks like we’re publishing ‘duplicate content’, thereby hindering our organic search engine exposure. Some initial thoughts: Can Google somehow date-stamp content? Or can Google assign a higher trust value to established sites, while maybe better scrutinizing newer sites — somehow validating/cross checking info? I know this is a huge undertaking, but I just want to throw the idea out there.

    Thanks for the opportunity to chime in!

    Mike

  55. Made-for-AdSense sites are clogging up the web. The content is copied or purposely thin, and they just plaster the site with ads hoping to trick unsuspecting searchers into clicking. I see these types of sites quite often on three-word searches for popular terms. I know Google makes money off these spammers but it has to stop.

  56. Definitely filtering out results where sites have pages and pages of almost totally duplicated content that just has a different town or city name. For example, a carpet cleaning company that offers services in Southampton, Portsmouth and London having 3 pages each largely the same, but with Southampton, Portsmouth and London in the relevant places.

  57. +1 for Hunter… it’s exactly what I think about it!
    Hunter June 30, 2010 at 9:40 am
    … The algorithm is too narrow and it places too much emphasis on links. It is due to this emphasis that link-spam has become so prevalent…
    … I think we need to stop focusing on links so much. Links in the wild are no longer natural…

  58. Please clean spam in Google Hot Trends and in Google Trends-related SERPs…

  59. I’d like Google to devalue sites that present a signup/login page, or an obnoxious full-window overlay ad, or a redirect page, instead of the actual content that was suggested in the search results.

  60. Well, I didn’t read all the comments people left, but one thing you can do is avoid indexing “UNDER CONSTRUCTION” web sites.

  61. Must must must kill scraper sites! Those copying my content should never outrank me just because they are better at SEO. Ideally, they should not even be on the same results page.

  62. Hi Matt
    The solution for this webspam (and I am talking about spammers who post in blogs, websites, forums, and comment areas to gain backlinks to porn, Viagra, etc.) can in my opinion be managed by focusing on 2 tools.

    1- Make the Google spam reporting system available as a button/link “API” that bloggers or webmasters can integrate, sending 2 parameters: the username and the email used for the post (I think webmasters can manage to feed the email to the API link). That way either the webmaster or even visitors can click and report it back to the Google spam reporting server, where it is processed and stored in a database. The record is scored higher depending on the reporting (which will have to meet some criteria to be considered legitimate, like the IP address of the sender, etc.), and that increases the priority of the record being moved to the official spam database.

    2- Make an API that consults the database above, provided to webmasters to be integrated into the registration process of websites, blogs, forums… to prevent spammers from re-registering on other websites and causing further damage across the global network.
    This could benefit all the websites, forums, blogs, etc. The idea is not detailed, but I am sure you see where it is going: another menu in Webmaster Tools where members can find the 2 API codes to integrate into their websites. Good for WMT and good for webmasters.
    The database could be used for filtering in Gmail or other services too…
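
    Just to sketch the two calls I mean (the endpoints and field names here are made up for illustration, not a real Google API):

    import json
    import urllib.parse
    import urllib.request

    REPORT_URL = "https://example.com/spam-reports"       # hypothetical reporting endpoint
    CHECK_URL = "https://example.com/spam-reports/check"  # hypothetical lookup endpoint

    def report_spammer(username, email, reporter_site):
        # Tool 1: a blog/forum sends the username and email used by a spammy post.
        payload = json.dumps({"username": username, "email": email,
                              "reporter_site": reporter_site}).encode("utf-8")
        req = urllib.request.Request(REPORT_URL, data=payload,
                                     headers={"Content-Type": "application/json"})
        return urllib.request.urlopen(req).status

    def is_known_spammer(email):
        # Tool 2: during registration, a site asks whether this email is already
        # in the shared spam database and blocks the signup if so.
        req = urllib.request.Request(CHECK_URL + "?email=" + urllib.parse.quote(email))
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read()).get("spammer", False)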

    Cheers from KY

  63. 1) Paid links: too many top-ranked websites use paid links; the whole concept of links as an important ranking factor is being misused. I think Google should define more and better trust signals (social network reputation, branding factors, real authorities).

    2) Expired domains: an expired domain is expired. Why? Because the owner did not want to continue the project. The trust given to an expired domain is currently too big. That business is a million-dollar business, which leads to bad organic results.

    3) Feedback in GWT for reported domains: what happens to them?

    4) Feedback in GWT when your own projects are filtered or blocked.

  64. If webspam disappears, what will be the reason for keeping Matt Cutts at Google? 🙂
    Come on! You know how to fight spam better:
    1) Just READ the thousands of webspam reports we sent you, and take action.
    Or maybe you never received the French ones?
    2) Take on more people in your team to analyze the spam reports and to remove the spammy websites. I don’t want to work for Google now that I know people don’t spend 50% of their time beside the coffee machine.
    3) Have a coherent policy with clear communication. An engineer saying “we have no problems with duplicate content” is not understood by everyone.
    4) Buy Yandex; they are doing a great job with duplicate content.
    5) Analyze the information provided by a WHOIS better. Most spamdexers with duplicate websites do not even change the name of the registrant; it is easy to get a listing of their domains and realise it is the same content. Don’t make assumptions like ‘webspam happens with .com and not with .eu’.

  65. Replace all results for experts-exchange.com with stackoverflow.com. 😉

    Or more seriously, try and figure out when sites are being deceptive. E-E always says at the top of the page that you have to log in (and pay) to see the answer, but the answer is actually way down at the bottom of the page.

  66. I see a ton of web-developer-type sites that have scraped articles (from forums or other sites) and then just re-publish them and post ads all over the place. These sites seem to come up quite often. I do a lot of searches for various topics on JavaScript or PHP… so search for any “how to do x in javascript” type search phrase and you’ll see these sites.

  67. Scraping sites that surround their information with tons of ads and general junk but have such high PageRank that they get listed for lots of terms, i.e. Mahalo-type sites.
    ie: http://www.seobook.com/matt-cutts-eats-mahalo-spam

  68. I know these have been mentioned above but blog spam and scraping are exceptionally frustrating. I work hard to develop my content and ranking – I don’t want others benefiting from it to my detriment.

  69. You can’t ban spam. You have to facilitate it.

    One thing I would like to say is that captchas are definitely NOT the answer. Spammers use decaptcha services, which take advantage of people in less fortunate situations. Outsourcing is one thing, exploitation is another. So human challenges are definitely not an option.

    Furthermore, whatever measure you introduce to penalize spammers is actually a weapon you give them to use against their competition. So you can’t take action against domains where you can’t verify whether they are responsible for the spamming. Meaning you have to find a way to let people verify their links.

    But the core question is: why do spammers spam? Because they want to rank, obviously. But if they know they might get a penalty or “slap”, why do they still want to use the Xrumers, the SEnukes and the Allsubmitters of this world? It’s doubt. Search engines don’t provide them with a clear method to rank. But underground forums do. And usually they get pushed into these methods by peers confirming that they work.

    So like I said, if you can’t stop them, you need to facilitate them. There needs to be a standardized protocol for web commenting that is broadly carried throughout the web community, either by the W3C or a search engine collective.
    Containing:
    A method for users to verify, or vouch for, links.
    A standard web form including “name, email, website and anchor text”.
    And a clear guideline: “If you want to rank, only use the anchor text field on comments/contributions that are relevant to your niche.”

    You would effectively reduce spamming to nil. If you get a spam report and you’ve vouched for your links, you have a problem. If you don’t vouch for your links, treat them as nofollow. This would endorse linking to relevant content and leaving relevant comments. A rough sketch of what I mean follows.
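
    Something along these lines (the field names are just an example, not an existing standard):

    from dataclasses import dataclass

    @dataclass
    class CommentSubmission:
        # The standard fields the shared web-comment form would carry.
        name: str
        email: str
        website: str
        anchor_text: str
        vouched: bool = False   # has the author verified/vouched for this link?

    def link_rel(comment: CommentSubmission) -> str:
        # Unvouched links are treated as nofollow; vouched links can pass weight,
        # but a spam report against a vouched link becomes the author's problem.
        return "follow" if comment.vouched else "nofollow"

    print(link_rel(CommentSubmission("Jane", "jane@example.com",
                                     "https://example.com", "blue widgets")))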

  70. Paid blog posts are influencing rankings more and more. These posts are quite easy to identify, and with their followed links they have seen some of our competitors leapfrog over us for our core terms again and again.

    So Google either needs to:

    A) Allow sponsored reviews, which are impartial in nature across the board.
    B) Devalue all of these kinds of links more aggressively.

  71. I don’t know if you mean webspam as in the kind I get on my blog which is typically blocked by Akismet on WordPress or if you mean email spam.

    If you mean webspam then I think social media networks should have some kind of authenticated (OpenID) way of whitelisting people and blacklisting as well.

    If you mean email spam then you should consider finding some way to give brand new addresses a probation period before they can email more than one person at a time unless it’s a business domain. Most of the spam comes from free accounts such as Gmail and Yahoo! Even so with business domains, an email sent to new people outside the business should be considered a violation of probation and Google should flag these automatically. A new business email address has no business spamming.

  72. One of the most problematic issues I see regularly with Google search results is location based searches in places other than popular areas in the USA. A great example is Puerto Rico. It is virtually impossible to get relevant results for queries like “grocery store San Juan Puerto Rico” (in both English and Spanish btw). The lack of relevancy is significant. Of the first ten results for the above search one provides a real answer, and one other is somewhat relevant.
    It isn’t just grocery stores, however; just about anything else lacks relevancy as well when combined with “San Juan Puerto Rico”. These searches have clear intent, and the SERPs simply are not useful in this case.

  73. I think you should bring back the cross beside Google search results to remove them, and take note of which sites users are removing; if enough users click the cross beside a result, it should be deranked or permanently removed (or some kind of manual review triggered).

    That way you can crowdsource removing spam rather than relying on algorithms (and tweak the algorithms based on crowdsourced results)

  74. I also agree with what Sébastien Billard wrote, review spam is getting horrible. Nearly the entire first page is sites saying ‘buy product xxx and read reviews’

    I (and most people searching for reviews) would rather see proper cnet style reviews of products rather than these spammy sites.

  75. A Google Chrome or Firefox or IE webspam report tool extension could work if you did it by invitation only to a select group of invited participants, e.g. school teachers and librarians, from grade schools up through special and corporate libraries. There are 300,000 master’s-degreed librarians in the US alone. If you let anyone install the extension, you’ll just get company A reporting competitor B for spam where there is none. It’d be a free-for-all. At the same time, it takes more expertise than you might expect to spot a scraper or MFA page. We in the biz spot the spam instantly, but most folks can’t.

  76. Find a way to manage SSL certificates online. For example, allow for upload of different certificates by different people, and Google compares those certificates to the ones online. Validate the strength of those certificates as well (rejecting insecure algorithms such as MD5).

    Google could also store the certificates when it crawls the web, making your validation process more efficient when crawling (Time/memory trade off).

    This could eventually be implemented in Chrome, whereby dangerous certificates can be blocked automatically by updating Chrome, much like an anti-virus updates for viruses.

  77. I’d definitely like to see websites with a lot of adsense and no content be taken down. For some reason these sites always seem to appear on page one of a search result even though they have no real content of their own. Very annoying.

  78. I would rather see Google concentrating on Webmaster Tools… and spam handling capabilities. I would also insist it should work on its email service, Gmail, because it still has some loopholes exposed to hackers…… 🙂

  79. Website reviews are starting to become popular: people just sign up for an affiliate program, create a website review, and use the site’s own content. For example, when you search for web hosting,
    4 – 5 out of 10 results are website reviews which have the same content.

    Like someone mentioned last year, Google should prioritize the official site. I notice more and more spammy websites doing website reviews, which are not relevant anymore because those reviews are mostly created by the company itself, and the review site is really an affiliate site.

    I think this is a new way to run an affiliate site without Google considering it an affiliate site, since it’s a review, but they are fake reviews.

  80. I’d like to see spammy blogs dealt with and those that are stealing copyrighted content to put on their spammy sites. They’ve both been mentioned before, but I think these are major issues, not just in terms of SEO, but also in terms of brand reputations.

  81. I think paid links are as prevalent as ever.
    I see paid links every day – especially on business sites.

  82. Not strictly spam related, but it would be nice to see some work from Google on blog comment algorithms. Let’s say you can associate an email address with a Google profile, then bring all comments into Google profiles. Positive and negative sentiments left on blogs could reflect a user’s mood in their profile.

    Perhaps this could be extended out in a more social way to help cut down on comment spam, e.g. discount followed links in comment names, especially if they don’t match the user’s profile name, and let the community police things, i.e. useful comments pass maximum weight.

  83. Hi Matt,

    There are many websites which display content based on the USER-AGENT.
    If a bot visits the site the content is one thing, and if a human visitor visits the site the content is something different.

    If there is any way Googlebot can crawl the site not as a bot but as a human visitor and then compare the difference, it would help a lot.
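
    For example, something along these lines (a simple sketch; real cloaking detection would have to allow for ads, personalization and other legitimate differences):

    import difflib
    import urllib.request

    GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    BROWSER_UA = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36"

    def fetch(url, user_agent):
        # Fetch the same URL while presenting a different User-Agent header.
        req = urllib.request.Request(url, headers={"User-Agent": user_agent})
        with urllib.request.urlopen(req) as resp:
            return resp.read().decode("utf-8", errors="replace")

    def cloaking_score(url):
        # How similar are the bot view and the human view? (1.0 = identical)
        as_bot = fetch(url, GOOGLEBOT_UA)
        as_human = fetch(url, BROWSER_UA)
        return difflib.SequenceMatcher(None, as_bot, as_human).ratio()

    if cloaking_score("http://example.com/") < 0.5:
        print("Content served to the bot differs a lot from the human version")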

    Keep up the good work, “Lets Make The Web A Better Place (:”

    – Joydeep Deb

  84. I’d like to be able to follow up and see the progress of spam reports. I can appreciate that the Google team will get a lot of them, but I work in a highly competitive keyword niche and big, well-known brands rank 80% on paid links. The deterrent to buying PR links is far, far too low. If each competitor can report paid links on the others, that’ll quickly increase the deterrent, as it would amplify the short-term, high-risk nature of the strategy. By adding a status for paid-link reports, it’ll communicate the effectiveness of reporting links and certainly make buying links far less attractive. Thanks for your ongoing efforts.

  85. 1. The major “spam” factor that impairs my experience as a Google Web Search user is actually from Google itself, with the inclusion of other search results in Web Search. The penny dropped for me with the inclusion of Twitter results, but for Twitter read real time, I guess. It’s highly unlikely that I would ever value a real-time result in this way over a page/site that has gained reputation over a period of time. In short, let me choose whether I want you to include results from specific search types in web search results – I already know (and love) that I can choose from lots of other types of results.
    2. Paywalls. There seem to be some sites with paywalls that have figured a way to rank well for content that the searcher cannot reach, and I suspect this will increase with more newspapers going subscription only. Most frustrating. A valid response to me would be marking results as paid-for as much as delisting.

  86. Couple of issues:-

    1. Google local in the UK needs a ‘claims management company’ category. At the moment claims management companies categorise themselves as ‘solicitors’ because there is no other suitable category. But claims management companies are not solicitors and cannot run claims, they simply sign clients up and refer the case to a solicitor for a referral fee. But when looking for a solicitor I want a solicitor not a third party who can only refer me to a solicitor.

    2. Paywalls – I search for content I want to be able to see, so flagging results that are subject to a paywall would be useful, so searchers don’t waste time clicking through to a site they can’t actually read.

    Thanks

  87. It seems to me that Google is working with a hidden agenda and blacklisting legitimate sites. I started an SEO campaign in March with a recognized SEO company; the campaign went well for 3 weeks and the results were fantastic. Then my site was attacked by hackers, which took 12 hours to repair, and since then I have lost 99% of my rankings on Google. So Google has penalized me for being hacked and has wasted all the money that I have spent on SEO. The interesting part is that my rankings on Bing and Yahoo are doing well and have not been affected. My concept is to rate and review by country, and I have this in all my domains, e.g. RateYourCourse.ie, RateYourCourse.co.uk, RateYourCourse.com.au, RateYourCourse.co.nz, RateYourCourse.co.za; the difference is the content. All the websites are owned by local business people but are being hosted in the UK due to speed issues in some of the other countries. So my business model can’t be used in its current form.

  88. It would be nice to have a tool to send all your mobile information to (tel/email/contact information), so you can change/add/delete information on the website and synchronize it back to any phone: new, old, or a friend’s.

    Now I always have to copy it somehow to my SIM and then into the new phone and back again. It would then also be a good backup for if your phone is stolen or broken.

  89. Hi Matt,
    I think Google should be more strict towards following points:
    - Implementation of black hat techniques such as hidden text and cloaking (again, hidden text comes under this), and misuse of the noscript tag
    - Websites with lots of high-PR paid links (inbound or outbound), which are actually ruining the organic rankings
    - Healthcare and adult are some of the industries which need more attention, like filtering for genuine results
    - Google bombing still exists in some of the less competitive industries, where webmasters are attempting to rank a site well for a specific keyword without covering it anywhere on the website

    There were some other suggestions I had in mind, but they were recently cleared up with Google Caffeine.
    Thumbs up for the Caffeine update… looking forward to faster and more accurate results.

    Thanks

  90. Sometimes you put in a search in Google which yields results that look useful, but when you actually go to a site it is just another type of search engine (or link site) with results for your search phrase. Very frustrating! It’d be great if that could be tackled.

  91. I don’t have a problem with mail spam in gmail anymore but I get a lot of spam comments in blogger. Sometimes I’m not even sure if a comment is spam or just a little off-topic. To find out, I google the phrase and see whether it occurs on other blogs frequently word-for-word. I’d like to see blogger come out with a spam probability system.

    For example: if more than 50 people rate something as spam, then it is automatically rejected as a comment (if you’ve turned on the anti-comment-spam function in Blogger). If only a few people have suggested that a comment is spam, then it still comes through for approval/rejection or for marking as spam – but with a “probable spam” warning label.
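
    In code the rule would be as simple as something like this (the 50-vote threshold is just the example above):

    def classify_comment(spam_votes, auto_reject_threshold=50):
        # More than the threshold of readers flagging it: reject automatically.
        if spam_votes > auto_reject_threshold:
            return "rejected"
        # A few flags: let it through for moderation, but with a warning label.
        if spam_votes > 0:
            return "probable spam - needs review"
        return "ok"

    print(classify_comment(3))    # probable spam - needs review
    print(classify_comment(120))  # rejected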

  92. I am still seeing people with locations and keywords repeated 40 times in the footer area doing well. I see sites with stolen content from various sites and keyword-stuffed pages doing better than real sites. I also see people building sites with 4,000 pages of nothing but a link in the footer, just to get tons of links, and Google responds to that. They have mass numbers, but not diverse linking. All of this bugs me!

    Lastly, real estate webmasters…waiting for 3 years to see what the big Goog will do with them.

  93. Actually start penalizing sites in the SERPs for paid links; just have a look at the .com.au “Car Insurance” SERPs – 99% paid links.

  94. Matt,

    1. Detect scraped sites. These are the bane of Google and it looks like they are coming back in a big way on caffeine.

    2. If you aren’t already, detect and ignore link chains. These are all the rage right now (plenty of “gurus” out there recommending them apparently). Not a week goes by where someone doesn’t ask us about them. They are trivial to detect anyway, so this is low hanging fruit.

    3. Paid links – I don’t think you guys are doing a good enough job here. And I know you know the main offenders (on the sell side)…

    4. Universal search – if I had my way, I would get rid of it entirely. Lots of spam creeps in here. And from a usability standpoint, if I want to see images, news, or whatever, I’ll just click the link and get 100x better results than what you can provide on the primary search results page.

  95. Without looking at anyone else’s comments (although I’m almost positive someone already mentioned this..)

    There is way too much repetitive content showing up these days. I’m not sure how to pick said content out from the rest in an automated fashion – but there are some characteristics that stand out.

    For one, these are usually found in the form of text articles with no image and all as one paragraph (sometimes two, if you’re lucky). As a matter of fact, you actually pointed one of these sites out in the Google I/O ’10 site review.

    What else?.. OH! I could be biased in this next idea, but I think my point is valid…

    SEO companies are usually web design/dev shops that start doing SEO (most of the time). These companies have control of 10’s if not 100’s (and maybe 1000’s?) of websites.

    I’ve seen these “web dev turned SEO” companies set up what they call directories – then link to these directories from the 100’s of client sites they already control (usually from the footer, without the permission of the client (who doesn’t know any better))…

    Then they turn around and link out to their clients that are paying them a monthly retainer for SEO… and the worst part: this works really well.

    I would be more than happy to provide an example of this “link farm” type activity for your review and comments… I would love to know what you think about these types of sites Matt.

    All the best, Arsham

  96. Allow logged-in users to mark sites for non-inclusion in future searches. Usually if you do a lot of searching you come across the same low-quality sites appearing high up in the rankings. We already know it isn’t worth clicking on them and don’t bother, but it would be nice to mark them so they never show again. I’m sure you could even use that data, but then you would have to rank the people doing the marking, etc.

  97. Robots create forum accounts just to leave 1-2 irrelevant posts in a random topic, with a link to the resource they want to push high in the results. Google could employ some technology to value links from individual forum members differently. When a forum moderator or a long-term user with a lot of posts links to a website, Google should value that link much more than one left by a new member with fewer than 10 posts.

    This measure would make forum spam much less effective, and hopefully its intensity would decrease. Forum admins would finally be able to breathe easy.
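
    One way to picture the suggestion above: a toy scoring function (the names and thresholds here are entirely hypothetical, not anything Google has confirmed) that scales a forum link's weight by how established the poster is.

```python
def forum_link_weight(poster_post_count, account_age_days, is_moderator=False,
                      base_weight=1.0):
    """Toy heuristic: scale a forum link's weight by how established the poster is."""
    if is_moderator:
        return base_weight
    # Brand-new accounts with only a handful of posts contribute almost nothing.
    if poster_post_count < 10 or account_age_days < 30:
        return base_weight * 0.05
    # Ramp up gradually toward the full weight for long-standing members.
    activity = min(poster_post_count / 500.0, 1.0)
    tenure = min(account_age_days / 365.0, 1.0)
    return base_weight * (0.25 + 0.75 * min(activity, tenure))

print(forum_link_weight(3, 2))        # ~0.05: fresh account, likely a spam profile
print(forum_link_weight(800, 1200))   # 1.0: long-term, active member
```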

  98. Here are my 2 ideas:
    1. I think it's time for Google to take stronger action on global spam; Google results in other countries and languages are somewhat spammy and not relevant to the country.
    2. For product search, try to favor websites that can actually serve the market the user is in. How good is a result for a laptop if the website won't ship to my country or accept my currency?

  99. Matt,
    With Caffeine, Google seems to be getting overrun with webspam. The Chinese are becoming increasingly adept at leveraging botnets to rank hundreds of sites with nearly identical content. I think your team should start looking at Google Analytics IDs, because spammers often use the same ones across multiple sites.

    For example:
    http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=juicy+couture

    Another big problem is that affiliate shopping engines are buying blog networks to generate lots of link spam from a diverse array of crap content domains. Soon the primary factor in ranking will be who has the biggest blog network.
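
    A rough sketch of the Analytics-ID idea, assuming you already have crawled HTML per domain (the domains, snippets and regex here are illustrative only): group domains by the tracking account they embed and flag accounts shared across many otherwise unrelated sites.

```python
import re
from collections import defaultdict

# Hypothetical crawl output: domain -> raw HTML of its home page.
pages = {
    "example-shop-1.com": 'ga("create", "UA-1234567-1");',
    "example-shop-2.com": "_gaq.push(['_setAccount', 'UA-1234567-2']);",
    "unrelated-site.com": 'ga("create", "UA-7654321-1");',
}

# Classic Analytics IDs look like UA-<account>-<property>; group by the account part.
GA_ID = re.compile(r"UA-(\d{4,10})-\d{1,4}")

networks = defaultdict(set)
for domain, html in pages.items():
    for account in GA_ID.findall(html):
        networks[account].add(domain)

for account, domains in sorted(networks.items()):
    if len(domains) > 1:
        print(f"UA-{account} appears on: {sorted(domains)}")
```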

  100. Hi Matt,

    This is my first post, but I think there are a couple of things you might consider to clean up the Google index. I am a physician, but my bachelor's degree included lots of computer science, since I have always loved that field. Second, as a researcher I will provide ideas based on my own observations gathered from loyal Google users and their opinions about the search engine.

    First: It is a fact that Google is the only search engine that provides users with the most relevant results on any given search. Its algorithm is very good. Just compare any result to Bing or Yahoo and you will see that Google is a first-class search engine.

    Second: Affiliate and small websites most of the time (I am not saying always, but if we did the research we might find p<0.05) do not provide users with what they are looking for. Plenty of times when searching for long-tail keywords you will see rankings dominated by those kinds of sites. Long-tail keywords are where most of the spam is taking place. For example, as a physician I may search for a long-tail keyword that includes a prescription medicine. When Google returns the results I get many sites offering crazy drugs to buy online. So Google could decrease the weight of documents on small websites (if the website has <100 pages then all documents inside the website get less weight) and move affiliate sites to the shopping tab in the left column of the search results. You could create a new tab for reviews.

    Third: PageRank and links are what make Google a first-class search engine. They provide the best weight for any document on the web. No matter how many spammy techniques a webmaster uses, gaining links from authority sites means a lot. Nobody here can argue that a link from cnn.com is spammy, or from any other high-PR page. It has only one meaning, and that is authority. I don't link to a website just to link to it. I link to websites because they provide more information or complement the information I am providing to my users. It is like writing a scientific paper: you provide references to the best published papers available that are pertinent to your original paper.

    Fourth: Twitter results are really annoying and not worth Google's time. That live feed in the middle of the search results is not useful. Stop it! I use Buzz all the time, but it is something social. If you think it is in Google's best interest as a business to include those, then you could do something like social.google.com where users can search Twitter, Buzz, Digg, Facebook, etc., just like Google Scholar. Once you have Google Social, other search engines will follow, but it will be too late for them; you will dominate social searches.

    Remember Google will be number one in our browsers as long as you keep doing the hard work of keeping the index as clean as possible.

  101. Steve’s first comment nails it on the head. Give me an email address where I can forward all the shady SEO emails I get. I inherited a former Internet Marketer’s position and the solicitations I get for link building are brazen about how they violate your TOS.

  102. I think a spam report extension is necessary but not one that is super easy to submit. People could potentially abuse the spam reporter if it was too easy to report. A well thought out spam reporter would greatly help to clean up the web.

  103. Provide some feedback on progress of reported spam within GWMT.

  104. “Parasite hosting” is probably the biggest problem in Google search at this moment. Depending on the niche, you'll see that most of the top ten rankings are redirected profiles or blogs on .edu/.org/.gov domains that are link-spammed like crazy, and Google takes months to take them out. You should find a way to prevent profiles and free blogs on reputable domains that all of a sudden build thousands of links from spammy comments and forum profiles from ranking. It doesn't make any sense that the same black-hat techniques used by spammers for years still work and you haven't been able to address this. You are more worried about complex things such as trying to discover paid links (which is virtually impossible) than about preventing simple black-hat strategies that are openly discussed in black-hat forums! Have a few members of your team join the known black-hat forums and figure out a way to block them, because it is a pain for honest webmasters to try to compete with them without doing anything sneaky. It's frustrating!

  105. I’d have to go with making sure product related searches turn up product pages. I did a search the other day and of the top 5 results 4 were SEO’ed content pages. Not even a product to be seen.

  106. Also, ignore domains with hidden WHOIS info.
    If they have something to hide, it is suspicious. The law in Europe even forces people who edit a website to clearly publish the name and address of the editor, the VAT number and so on.
    Some people are competing even without VAT numbers; they stay hidden and have a day job besides. For them, being banned is just part of the game, and if it happens, it is not a problem. Cheating Google should become a very dangerous game that nobody wants to play.

  107. BTW! I completely agree with Hunter also: you have always talked about quality content, but as you showed us in the Google SEO report, all you need are links to rank, even if the page is filled with errors and has no content. Not sure about the “like” idea, but as long as links are 99.9% of the 200+ ranking factors, there is always going to be a technique to fool Google. Why don't you lower the weight of links a little and increase the weight of content?

  108. Stolen rankings. I recently launched a new business website. After several weeks I found out my site was not correctly indexed. To investigate, I searched Google for web buzz media indiana and found another domain name ranking under my website. This domain had the same content as my site http://WWW.webbuzzmedia.com; the other site is http://WWW.cn026.com. The cn026 domain was registered in China, a couple of years before my domain was registered. Is this a common scam, or is there a flaw in Google's algorithm? I have found numerous posts describing similar scams.

  109. Hey Matt,
    A few things I suggest (many of these you probably have already implemented in some way). Note: as I type this, I realize the more I type about ways to prevent webspam in searches, the better understanding I have of good content. Weird.

    1. Put more weight on retweeted links on Twitter. You are more than likely doing this already in some way, but just to throw in my 2 cents: most people will not retweet something they personally don't approve of. Of course there will be spam accounts and retweet wheels, but you can easily discredit those by the number of followers, the history of the profile, and its strength (the current Twitter PR of each profile).

    Another thing with Twitter: maybe even take note of retweets that aren't necessarily links. If you see something being mentioned a lot (not an active link, just text), you could search through the user's previous or later tweets for a link to what they are talking about. There is a lot of good stuff people refer each other to on Twitter via retweets that don't include a link, but it should in some way send credit to that site.

    2. Associate, record, and apply a brand or company name to a set of keywords. I don't know if you guys do this or not, but many times when folks find something good they will try to search for it by the genre or niche it was in (general broad keyword terms). Often the rankings are cluttered with junk that wasn't what they were looking for. After they see the results aren't what they want, they re-enter a broad keyword term with an attempted spelling of the name or brand of the website they best remember. Even then it doesn't come up. If there were a way to associate brands and names with certain keywords, and Google could figure it out, that would filter out the junk and rank the pages people truly find better toward the front of the SERPs. Maybe Google could even credit brand/company name mentions on sites (even though they aren't links, just the word) so a mention acts like a link — not as much, but giving some credit back to the site it refers to.

    3. Less weight on useless mini-sites that are one-pagers with barely any info, unless it's a really good site or the best of a niche where there's hardly any info. I've seen numerous exact-keyword one-page blogs that outrank a ton of really good, content-filled sites.

    4. People are still trying to create link wheels using social bookmark sites, some are even succeeding with useless articles that help no one.

    5. Add weight to sticky threads on trustworthy forums. These things have a ton of good, helpful info and are highly moderated, but I sometimes see those useless one-page blogs come up before them.

    6. Less weight on exact-keyword domains. Yeah, even though I have one myself, these things are just eating up good space where better sites could show for searchers. Don't get me wrong, some (like mine) have great content and a great site built on them, so please don't discredit that, but a lot of these sites are just junk: one or two pages with keyword density out the wazoo and a $5 non-helpful article bought from someone in India that only describes the definition of that keyword and offers nothing more to benefit the reader. If people want summaries/definitions, Wikipedia is a lot better for this.

    But these mostly one-page mini-sites are just another kind of doorway page. And now with domains going for $1 in sales, it's super easy to set up hundreds of them. They just flood the top search space with spam, which is not nearly what Google users truly want.

    7. Paid links. You guys are definitely working hard on this, but some links are blatantly, obviously paid for. Those claiming to be text link brokers, etc. are all paid links in my honest opinion. These links carry a lot of weight, enough to push these spam sites to the top. The spam sites then make revenue and are able to pay for more links. So the rich get richer and the poor get poorer in the rankings, even though there are lots of sites out there that would help searchers a lot better. I am even at odds with Yahoo directory links; it's a paid link. People understand they will get no traffic and buy it purely for the SEO factor.

    Right now, the thing I have been approached with a lot is emails from companies wanting to host an “ad link” on their site. And when you ask whether it can be nofollow, of course they say their policy is no. And they pick the page with high PR. They stay low-key with their operations, though, so you have to find the sites that are all connected.

    8. Kind of a personal rant here, but less weight on older sites in the “software” niche. Almost everything is going web-based (as it should) and these older, crappier software sites still dominate search results. It doesn't give us a chance to offer the world far better software. I bet millions of searches are done to find software, and people give up because they can't find something 'good enough' that works great by today's standards (people expect great web apps nowadays). But these good, newer, better sites just sit behind the old, stagnant ones. For example, my site and web app employee-scheduling.com is light years better than what comes up when you search for employee scheduling software, yet old, stagnant, download-software sites show up instead. My site is all white hat (SEOmoz advice taken to heart, using an exact-keyword domain only because it's so easy for business managers to remember), with a great content blog and a site written for users, but it is having trouble climbing up. Granted, it's still early and hopefully things will change as I launch publicly (it's in private beta) and get more people talking. But I spent 8 full months making this good for the end user, so hopefully that pays off in the end and Google will eventually follow?

    Sorry for talking your ear off, Matt, but I hope this helps at least some. You and your team are probably highly aware of all this already, but maybe I (having spent an hour writing this!) added some helpful perspective and insight.

    Sincerely,
    James F.

  110. The ability for a user to block domains with the click of a button from the SERPs in the main google.com search. Similar to what can be accomplished with the excluded-sites feature of Google Custom Search, google.com/cse/.

  111. Take onsite factors into consideration. Spamdexers usually do not have time to spend on their websites. My website is fully valid W3C XHTML, has a robots.txt, has sitemaps, has canonical URLs and original content. My site loads fast (in less than a second). I should be rewarded for that.
    The site that has 545,000 backlinks does not smell very honest in my opinion. What can I do to compete with it? Cheat like it does? Make 600,000 artificial links? Spamdexing is like doping in sports: if there is no clear rule with sanctions, we will get cheaters, as we all know from sport.

  112. Playing the link game with rankings has become just that, a game. Counting a person's “friends” online vs. counting how many friends a person pretends to have… c'mon. The same goes for all this keyword density c$ap. This idea may be a little out there, but by now Google's crawlers ought to be able to read a page, see how good the English (or whichever language) is on that page, compare that to fifty or a thousand other sites to see how clear, complete, length-appropriate and readable the information truly is, and then rotate the best of those pages through the first few SERPs to see who clicks on what.

    Google has gone as far as it can with the statistical stuff, methinks. It's time to step up to the plate and measure quality in a qualitative manner. If the case is that “computers won't be able to do that for another fifty years”, then perhaps Google's job in this world should be to make that happen, somehow, in two years or five.

    My .02

    Mike

  113. Give a bonus to webmasters who are friendly to Googlebot (good communication through a Google Webmaster Tools account, good URL structure, a good robots.txt, a good sitemap, no suggestions in Webmaster Tools such as duplicate titles, a compressed sitemap because it uses less of Google's resources).
    URL rewriting should not be an obligation. One of my competitors has
    keywords.com/keywords1-keword2/keyword1-keyword2.html and I just find it ridiculous.
    My website also uses 100% green energy 🙂 This could be a parameter (not for webspam, but for the earth). Google can change the world with its algorithm!
    Bonuses + penalties can be a very good combination.

  114. Sites with duplicate content all ranking. A perfect example is the so-called consumer advocacy sites. Somebody posts a complaint, the other complaint sites receive the complaint in a feed, and before you know it your (often fake, but I won't go there) complaint is on 6 sites, all loaded up with AdSense and all ranking. They make it look like a real poster posted the content, with a handle like x3yv6 or similar computer-generated junk. They must think we are stupid. These sites are simply evil; they are all well SEO'd and rank highly. I just don't get how these sites pass Google's quality signals and why they all rank with the same content. They are totally abused by people who want to do you malicious harm, and it's reached the point where I believe nothing in a review; I have gone back to asking people I know, or I only use a site that allows dispute resolution. Bing and Yahoo do not give these sites the time of day; I never see them there, and there are at least 10 of them.

  115. 1. A better way to report 3-way link exchanges. I get maybe 30-40 of these a day. It would be good if we could just forward the email to Google for investigation. What about a specific form just to report 3-way exchanges?

    2. A better way to report sites that have lots of other sites they use for passing PageRank. In my niche we evaluate all our competitors' backlinks, and over time we see patterns of multiple sites that they control and use for PageRank. It's only because I know my niche so well that I am able to identify this. I'm not sure the Google algorithm is that clever; if it were, then they would not be top of the SERPs.

    I have an Excel sheet on one competitor that lists all the sites that are interlinking, all on different IPs, different hosts, different platforms, different WHOIS, etc. But what do I do with the Excel sheet?

    You could ask how I know they are all owned by the same firm: the three-way link exchanges, recurring SMTP patterns, and small bits of code behind the sites easily give it away after a while.

    We have filled out webspam and link-buying reports in the past, but never saw any evidence that action was taken, so we no longer bother.

  116. Hi Matt,

    I would like you guys to attend to a certain aspect of Google Israel. Today most of the academic and special domains (such as k12.il, ac.il, gov.il, etc.) don't have moderators for their forums.

    The result is major spam on these forums. As a professional SEO firm, we do not use these tactics to promote websites.

    Sadly, I see that Google Israel treats those links with respect, and sites that regularly create excessive academic forum links get pretty good rankings in the search results, even in competitive industries.

    I know that Google can distinguish forums (most of the time) from regular sites. What I suggest is this: since links from academic sites are treated as important and reliable, Google should work on identifying forums on these domains and lower the weight of their links to almost nothing.

    I'm sure this happens all around the world, and I don't know whether you have already handled it in other countries, but in GOOGLE ISRAEL I can tell you for sure, you haven't!

    Hope to hear you talking about this subject sometime.
    Thanks, stav from the DURAN SEO TEAM.

  117. @Peters, why does W3C validation have any bearing on anything? It's beloved of the sort of people who rave on about IPv6 but have very little knowledge of how things really work – like the sainted Stephen “iPhone” Fry 🙂

    And it's not a requirement to be VAT-registered to run a business (it makes sense to be, though), and would you like to quote the law that says editors have to publish their real name? Some of the more authoritarian European countries may still have laws about licensing newspapers that look draconian to UK and American eyes, so those laws probably violate a number of EU laws, as did Italy's actions over YouTube.

    There are good damn reasons for anonymous speech. It has some downsides, to be sure, but on balance I prefer the Anglo-Saxon model to the “you have been controlled” Germanic model.

    You're falling into the same trap Google sometimes does of not accounting for country-based differences. Yes, some European sites have an “Impressum”; it's a pity that, compared to the equivalent UK or US sites, they are very poor quality, even for major companies.

  118. Mr. Cutts,

    I was given the address of your blog in response to an inquiry, and it fortuitously happens that you’re most recently thinking about the same topic about which I had made an inquiry!

    Google News is an incredible tool, however, lately, I’ve noticed a number of results popping up in both my Alerts (“news-clipping”), as well as in searches via Google.com, which purport to be “news”, when, in fact, they are not. Often, the pages I am taken to are gobbledygook, looking as though they’ve been manufactured by a large committee of monkeys who’ve been handed a typewriter, though, just as often, the “news” that I read is easily discernible as a bad “re-write” of something that I’ve already read, elsewhere, from a reputable and credible news entity – often it’s even obvious that the piece has been cobbled together by someone (or some thing?), with no understanding or interest in the story, having apparently used a thesaurus to change a few descriptive words here and there or who has merely shifted some paragraphs around, or changed the order of some clauses – with results running the gamut from “the facts” now being simply wrong to important shades of meaning and tone having been lost in this “process”.

    The “clearly useless” entries that I'm sometimes taken to are much less troubling to me, due to the ease with which they can be spotted (and navigated away from), than those sites which masquerade as “official sounding” entities – and do not even have the courtesy to simply plagiarize real journalism, instead hacking at a quality piece that someone has put energy into until it sounds as though it's been written by an 8th grader on a book report deadline, racing to find new ways to, ever so slightly, not “directly” lift from the encyclopedia. In these cases, many of the offenders are denoted as “blogs”; however, many are not. In particular, I found a copy of a friend's story “rewritten” (as above, by either a harried middle schooler or an automated “thesaurus” program) on something billing itself as “American Banking News”, whilst recently researching a piece on financial reform.

    My initial thought, since the piece was in my “News” alerts, and since the name of the site had “News” in it, and certainly sounded rather official, was that this was perhaps the online arm of “American Banker” – a quite well established, long-running, and truly credible outlet.

    After finding the article to be a “rewritten” version of a piece by someone whom I know, I sent an email to one of our tech people (I've long since given up notifying legal when something like this occurs), asking how this kind of site could manage to gain entrance to news feeds, just out of curiosity. His answer was quick and to the point: “Uh, he probably just signed up.” – and was supported with a link to a page at Google News where apparently anyone can, in a few moments, become a news source worthy of inclusion in America's inbox and in searches for particular topics. (He also pointed out to me that this site had sister sites, which were doing similar hatchet jobs on real journalism over a variety of other topics – and that this was becoming a well known and rather common thing to find online.)

    I do not know what to do about this problem, but I do think that society would be better served, if a workable solution cannot be found, to err on the side of tolerating plagiarism, rather than providing opportunities for entities to make a buck on “dumbing down news” – and thereby the populace. The waste of time in clicking Back seems to me a lesser issue.

    Sincerely,
    Tom Brokaw

  119. Matt,
    Thanks for asking, it’s great to see Google asking its users how they’d like to see their product improved.

    Here's my tuppence worth…
    Some price comparison sites can feature highly in the SERPs, in some cases higher than regular places selling the product, or even the original equipment manufacturers, for some terms.

    In many of these comparison sites you have to pay to have your brand/product included or get listed on their site.

    The end result is those that pay feature higher up the natural SERPs than those who don’t.

    This seems to go against the idea of truly natural results
    But I do appreciate a change could put some comparison sites out of business.
    So that’s a tricky one for you – hmmn, how to do no evil ?!

  120. Also, ignore the domain with hidden whois info.
    If they have something to hide, it is suspicious.

    I will fight that one tooth and nail. I may lose, but I’ll sure make a big honking stink about it while I’m losing. There are PLENTY of valid reasons for keeping WHOIS information private. I’ve had domains since 1991. I’ve had to take out THREE personal protection orders over the years directly because of WHOIS information enabling stalkers and threats. I shouldn’t have to put my life at risk in order to have an internet presence.

  121. Awesome post, Matt. Good to know that Google’s actively checking with the public to see what issues they have.

    The biggest issue I have is one that I'm really not sure can easily be remedied with an algorithm: well-ranked information written by people purely aiming to rank for that keyword. The information is often not quality content. However, this could be argued subjectively, so I don't know that it will ever be completely fixed.

    Aside from that, on the same topic, there are often spam sites with duplicate content outranking sites with higher-quality content (which may be the original creators of that content). Not sure how well duplicate content and dates of creation can be tracked, but that would be nice.

    Aside from that, I’m pretty pleased overall with the updates Google does to keep results high quality. As for Caffeine, I’ll consider it still in *early* beta before making judgment calls on it.

  122. Paid links would be a big one. Also link acquisition velocity: I don't think it's fair that someone can acquire 500+ links in the course of two days.

  123. What a great question: For me it would be.

    1. Do not weigh a generic keyword in the domain name as highly as you do at the moment. (www.cheaphostelsrome.com is probably not the best hit when you search for cheap hostels rome…)

    2. I agree with people on giving some kind of status report for spam reports – not 1-to-1, but a tool in Webmaster Tools where you can see what you have reported and the status of each spam report. I know you like to automate everything, and I understand it. But for the foreseeable future I think you NEED the spam reports, and it is really disheartening to see nothing happening. (And when the spammy sites sometimes disappear, you don't know if it is because of your hard work in reporting.)

    3. My favourite hate at the moment is sites that use the ACE format in the domain name (a specific case of point 1 that you don't seem to handle at all). Say a site like http://www.lån.dk is registered as http://www.xn--ln-yia.dk; those sites rule supreme when you search for lån. (This is a made-up example.) See http://www.whois.biz/convert_ace.cgi?dn=l%E5n&type=toascii for an ACE converter. It really doesn't seem like you guys have caught up with this particular Scandinavian kind of spamming at all.
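
    For what it's worth, decoding the ACE form is cheap. A small sketch (using the xn--ln-yia example above) of normalising a label back to Unicode before any keyword-in-domain scoring:

```python
# Decode an ACE/punycode label back to Unicode so keyword checks see the same
# string a Scandinavian searcher types ("xn--ln-yia" decodes to "lån").
ace_domain = "xn--ln-yia.dk"
unicode_domain = ace_domain.encode("ascii").decode("idna")
print(unicode_domain)  # lån.dk

query = "lån"
if query in unicode_domain.split(".")[0]:
    print("exact-match keyword domain, hidden behind ACE encoding")
```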

  124. Arriving at sites with nothing but Google ads all over the place – mainly in and around the title. A sure indication that it's just a spam site.

  125. Add a report-spam console to G-WMT’s “Links to your site” so the webmaster can report specific sites we find spammy/distasteful/inappropriate.

    While the sites can be reported via other means, it must be hard for G to discern a real report versus competitors reporting each other’s sites.

    Reported from within WMT, the spam report would be the most reliable tool for identifying spammy sites: what self-respecting webmaster would report desirable sites?

    Would webmasters use the tool? I say yes. I am tired of tracking backlinks only to find the backlink came from spammy (scraper) sites that add little if any PR value to my site and potentially confuse my customers.

  126. Well, as I write this my site is getting spammed with nonsensical links like http://azgpewgacnvq.com/. Maybe there could be some sort of Google snippet we could add to our sites (like Analytics) that shows Google which sites are spamming in real time, and that would notice the 1,000 or more backlinks in less than an hour that some of these sites seem able to attain.

  127. @Maurice
    see here:
    http://eur-lex.europa.eu/smartapi/cgi/sga_doc?smartapi!celexapi!prod!CELEXnumdoc&lg=FR&numdoc=32000L0031&model=guichett

    the document is available in all languages.

    If you are anonymous, you are not serious. Otherwise, accept being
    on page 4 or 5 for a competitive keyword. How can Google know there is diversity if there are no clues? For very competitive keywords, we have to be serious.

    As I said before, if Google fights spamdexing on very competitive keywords and organises a sort of random ranking among the first 10/20 results (a kind of rotation), it can expect a HUGE increase in AdWords spend, because there will not be just 1 or 2 people willing and able to buy AdWords, but *20*.
    And this will be recurrent. It is a good assumption that the guy in position 1 does not have a big incentive to spend on AdWords, and that the guy in position 20 does not have the money to fight. Spamdexing and fixed rankings are not good for AdWords.
    Maybe I don't express myself well, but there is some truth in what I say.
    How can I plan an AdWords budget if Google always ranks me outside the first 10/20 results for weeks while my competitors get free clicks?

  128. My biggest problem with web spam comes from Google itself. Google search results are becoming polluted with results from specific niche sites like Twitter and YouTube. If I wanted to view scrolling spammy Twitter posts I would just go to Twitter itself, and if I wanted to see outdated YouTube videos about my search phrase I would just go to YouTube. (Feel free to check the dates on the videos you have chosen to mix into your search results. I keep seeing the same two videos dated from 2008 when I search for a particular search phrase.)

    The same thing goes for news results mixed into the search results. You already have a nice separate news section made up to show me the news of the world, including a search option where I can type in a phrase if I'm looking for specific news results related to my inquiry. There is no need to also show me news results in the main search results.

    Please wake up, Google, and realize that the reason people come to Google in the first place is to search for websites relating to their search phrases, not to see spam results from other niche websites you have chosen to include in your search results.

    Please return to showing straight-up website search results instead of all these results from specific niche websites, as it just looks plain old spammy on Google's part.

  129. Matt,
    It's me again. I just realized I wrote a crazy long comment that dealt more with frustration than with suggestions to stop spam, so I wanted to give you a good webspam example:

    There are tons of these link farms and networks of WordPress blog sites. These sites are like PR1 or PR2, and they just publish 2 or 3 posts a day on very, very random subjects. It's because these folks create a bunch of them on different C-class IPs, take paid subscriptions from buyers to post links, and then publish a quick nonsense post with their buyer's keywords and targeted anchors on each blog in the network. Most of the sites are bare-bones, with a basic template and a ton of non-relevant, non-matching posts.

    Not sure how to catch this, except maybe to give credit to blogs whose posts actually match in content and relevancy, while penalizing blogs whose posts never match at all.

    Another example is, of course, big, established, credible sites selling links under the table, so to speak, where they only deal with each other over email. But the more I think about this, a lot of the time the links they sell on their pages are highly relevant, and the sites they point to are actually somewhat good for readers. So is it really paying for a link, or really just “here's some money for your trouble and for doing this favor for me, because both of our users actually benefit”?

    James F.

  130. I’d love to see some sort of spam filtering which stops hotlinked images ranking higher than the source website.

    The algorithm in many cases seems to actually favor the hotlinker’s site – especially in the prized ‘preview images’ at the top of the web search.

    Could it be that because google likes fast load times, it sees these sites which use hotlinking and serve up images from multiple domains, as somehow being optimized for speed?

    Jeff
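
    A minimal sketch of how the mismatch Jeff describes might be spotted, assuming you already know which page embeds which image URL (the example hosts are made up; a real check would compare registrable domains so CDNs aren't flagged):

```python
from urllib.parse import urlparse

def is_hotlinked(page_url, image_url):
    """Toy check: the image is served from a different host than the embedding page."""
    page_host = urlparse(page_url).hostname or ""
    image_host = urlparse(image_url).hostname or ""
    return image_host != "" and image_host != page_host

print(is_hotlinked("http://scraper.example/post",
                   "http://photographer.example/photo.jpg"))   # True
print(is_hotlinked("http://photographer.example/gallery",
                   "http://photographer.example/photo.jpg"))   # False
```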

  131. I want Google to develop a pattern for detecting junk content. Content with 20 rows in a single paragraph is what everyone hates and nobody is going to read, and it's usually on sites that are made for AdSense and contribute very little value.

    As much as I am for making money and all that, I want people to do that decently and produce things that are of value for web visitors.
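
    A very rough proxy for the “20-row paragraph” complaint above, assuming the page text has already been extracted (the word threshold is arbitrary, purely for illustration):

```python
def has_wall_of_text(page_text, max_words_per_paragraph=250):
    """Flag pages containing at least one unbroken paragraph with hundreds of words."""
    paragraphs = [p for p in page_text.split("\n\n") if p.strip()]
    return any(len(p.split()) > max_words_per_paragraph for p in paragraphs)

sample = ("word " * 400).strip() + "\n\n" + "A short, readable closing paragraph."
print(has_wall_of_text(sample))  # True
```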

  132. I think the “nofollow” tag is a big mistake.
    I think all it does is give webmasters a way to manipulate link juice. It actually does exactly what Google says it doesn't want: present one thing to the surfer (a viable link) and another to the search engine.

    Specifically, I think that if you ignore the links in blog comments by default you may be missing out on some good quality control measures. Here’s what I’d do instead –

    1. Penalize sites linking out to bad neighborhoods.

    2. Do look at those links.

    3. Let the bloggers approve only the links that are good and legitimate in their eyes.

    Is it a perfect system? Probably not, but I think it's much better than what we have going on now. As was mentioned before, use the information from these blogs, via Akismet, and see who's really spamming blogs and who's a legit player interacting on the web. Let bloggers mark the bad commenters that pass through the Akismet filter as spam, and add a layer of human quality control.

  133. Steve Jefferies

    Control Amazon. Clients are beginning to ask if the first 1 – 2 spots in organic results and shopping results are reserved for Amazon regarding keywords that include a product name. While typically a good place to direct a user who is shopping, many times there are better, more informative sites that would provide a better experience.

  134. Okay, here’s my wish list for the web spam team.

    Splogs
    Spam blogs or Splogs generated from automated RSS feeds via news alerts persist and still seem to drive significant backlink value in the form of total links, link diversity and anchor text. While they may be updated often, and the snippets used subvert some duplicate detection, it would be great to eradicate this blight.

    Microsite Generators
    There are a number of fringe companies who run microsite generators, particularly in lead generation and local spaces. These folks churn out cookie cutter sites across verticals or geographies which still seem to help them obtain decent rankings. Just Google ‘Microsite Generators’ and you’ll find some of the bigger folks in the space.

    Article Marketing
    I’ll chime in with Alan on this one, but also link it to local directories. There is a coMpany (note the capitalization) that fuses article marketing and local distribution. The practice works but I can’t imagine this is the experience any of us really want for these types of services.

    Parent Company Link Rings
    I might be on an island with this one, but links from a parent company or sister company still seem to pass trust and authority. But should they really? I mean, it's a little like Google linking to Orkut. Does that PR10 link get full credit? With the amount of consolidation going on, these 'company' links seem to be skewing the link graph.

    Content Link Farms
    The content farms that are out there wouldn’t be doing nearly as well if it weren’t for their inherent ability to generate links, whether it is between their owned sites (see above) or through direct solicitation to build links from their content creators.

    Bookmarking
    If social bookmarking is still delivering any value it is overrun by more and more bookmark spam. Figuring out the real accounts from the automated or ‘managed’ would be nice.

    Exact Match Domain Shopping Conglomerates
    There are a few of these out there – buying up exact match domains for niche verticals and then slapping a (sometimes very nice) cookie cutter UI on each of them. Now, these guys are transparent – they tell you that they own say wooden-bar-stools.com and propane-gas-grills.com. So they’re not hiding anything. But the exact match seems to tip the scales substantially in favor of these sites – and the cross links between the sites could – once again – pervert the link graph.

    There’s probably a lot more but those are the ones that stick in my craw.

  135. How about getting legit websites that were accidentally penalized (or penalized by competitors) back into the index faster? Our website got penalized, and we MADE the brand, along with hundreds of other competitors. We filed a Google reinclusion request and received a reply that it had been reviewed, but it has been a month with no change at all, and fake fraudsters are ranking in our place, misleading our customers and hurting us very badly. I think a lot of other legit websites might find themselves in a similar situation with no recourse.

    I believe that if you are the creator of a particular brand you should never be penalized, because there are hundreds of fake website owners pretending to be you and they will try to destroy you. But if you are stuck there, so far there is no way that I can see to come back…

    Maybe we should improve that?

  136. Hi Matt,

    I hope you’re doing well. I think that it’s great that you’re asking for feedback. Here are few things that I would love to see in the next year:

    1. A feedback link for every web page result in search results, requiring a login to a Google Account, which allows someone to fill in a detailed report, along with a place that helps people track their reports.

    2. A duplicate content search engine that helps people identify where their content might be duplicated, as described in Google’s patent application 20080288509.

    3. Implementation of some of the processes in the patent Identifying Inadequate Search Content (20100138421), which you co-invented, such as showing searchers when the query they searched for has inadequate content, or creating an inadequate-content search engine that can be used to identify unserved and underserved queries and topics.

    4. A syndication meta tag, which someone can use to identify the URLs where they have allowed their articles and pages to be republished.

    5. A link rel value of “paid” to replace the use of “nofollow” when a link is a paid link. I think the use of “nofollow” is a point of confusion, especially since “nofollow” is also used in a somewhat different context within robot’s meta elements.

    Thanks.

  137. Hi Matt,

    I'd like Google to work on finding and banning “autoblogs” which take posts from RSS feeds and republish the content on their websites.

    I'd also like Google to make use of “TrustRank”. I've read a lot on the subject and it looks like a great idea. I asked on your Google Moderator page for the videos about what factors Google uses to calculate trust; I'm just interested in Google improving how it calculates trust (so sites which copy content can't rank high).

    So I pretty much want Google to give us a visual representation of our site’s trust (or authority) and I want autoblogs to be detected and penalized.

    Thanks.

  138. Hi Matt,

    “Surprise me” button
    Most of the time when people open their browser, they don't really know what to search for. I usually use StumbleUpon to find interesting content, and most of the time it's great content, but I would like suggestions that are even more user-targeted. Here is my proposition:

    When logged in, provide a “Surprise me” button beside “I'm Feeling Lucky”. This would display random Google results that fit the user's interests, based on their search history and the sites they have visited before. This might require a nice algorithm, but I know you guys are good at that 😉

    Keep doing your nice videos, love it! Thanks 🙂

  139. Act more quickly to stop people that are gaming pagerank. I can think of a dozen or more sites that are pagerank 7 or higher that clearly bought their pagerank and have been reselling it to the highest bidder for the last 3 months. They are making a lot of money, and it just encourages other people to follow in their footsteps. There needs to be a better method to help google employees to identify these people, and then remove or reduce the visible pagerank so they don’t keep selling links.

  140. Morris Rosenthal

    Matt,

    As a publisher who also publishes eBooks (without DRM), I hate seeing new rip-offs listed in Google every single day. There are entire networks of sites that exist to do nothing other than produce endless lists of pirated eBooks available on file-sharing networks.

    Morris

  141. Morris Rosenthal

    Matt,

    Oddly enough, I was drawn into a discussion about webspam at lunch today by a friend who has been so frustrated using Google lately that she now tries Wikipedia first whenever she’s looking for information. The reason she gave is that Google “wastes her time” with pages that anybody could have written given the question. I won’t give site names since you don’t want them, but she named a couple content mills. I asked why she didn’t just look at the URL under the search results and not click on them, since she knows who they are, but she said she just keeps getting sucked in by the headline match to her search query.

    I thought it was fascinating that she knew names of a couple content mills and quasi-directory sites (can’t think of how else to describe them) off the top of her head, but apparently for her style of query, questions about family health issues, etc, they always come out at the top.

    Morris

  142. Matt

    I like the idea of rotating the top ten results and, if possible, having the time spent on a page affect rankings for particular keyword searches.

    Pages that have a large part of their layout covered in ads should have their rankings affected negatively, as a lot of duplicate-content sites seem to have this characteristic.

    In my experience a large number of spam pages exist to get users to click on (or generate impressions of) their Google ads, so making this less profitable for web-spamming pages will probably decrease web spam. A factor in Wikipedia's and Facebook's increasing popularity is that they don't have any noticeable ads or web spam.

  143. It's time to detect spam that is hidden via CSS tricks.

  144. A 'report this' function for website owners that goes to a centralized Google database, maybe using Webmaster Tools.

  145. Do not show other websites' search results anymore.
    Explanation: if I'm searching for a mid-to-low popularity keyword on Google, it will sometimes return a set of websites that are simply running an internal search on their own sites and returning 0 results in most cases.
    That's annoying!

  146. 1. Google IPs have been used, and are still used, to send comment spam to blogs. It is *very* important, when you launch a service (Google Code in this instance, but this goes for all services), that you think about how this service can be abused and take action to prevent that abuse.

    2. When I made a complaint about these Google IPs to the App Engine department (I think that's the name), you did act fast and shut down access to the abused applications (thanks for that, I appreciate it), but I need to verify somehow that you can remove the spammers' customer websites (Canadian drug spam/porn) from the result pages as well. It would greatly help reduce spam overall. How do I know when a complaint about a spamming site has been acted upon?

    3. There is too much confusion as to where to report what. Make *one single place* to report everything. You have several places now. Have one home page only; you can offer a choice of complaint type on that page, but we should only have to remember one URL. I had to ask in a forum where to report abuse of Google Code, because I couldn't find it.

  147. Kick scrapers out of AdSense and save honest folks—advertisers, publishers and readers—from them!

    Cheers!

  148. Please clean up the spam in Google Hot Trends. Thanks.

  149. Google really needs to focus its attention on the website content-scraping issue. It's embarrassing to see a site which has blatantly scraped content from the site where it was originally posted, in order to generate ad revenue, rank higher than the site which owns the original content. There must be a way for Google to get a grip on this issue, with some sort of site/content verification via Google Webmaster Tools.
    Thanks
    Sam

  150. The problem with many of the spam reporting techniques, is that people can then use them to try and penalize their competitors. If all it took was a “spam report” or a bunch of “blog comment spam” to penalize a site, then what would stop someone from artificially doing it to hurt their competition? And if that was to happen, it would be pure mayhem IMO. I don’t think spam reports or penalizing sites is the answer.

  151. Many good ideas already presented:

    +1 Martokus (paid link & link brokers)

    +1 penalize scraper/aggregator sites *much* more than you do currently

    +1 MAS (Gmail accepted languages feature to filter out spam)

  152. Hi,

    There are a few things that I would love to see tackled, but I'm not sure how many of them really fall under webspam. A few pertain more to relevance being outpaced by the volume of information, as well as new directions for determining search relevance. But a few of the things I've thought about, I think, constitute or come close to “webspam” – though I'm not gearing my comments exclusively or specifically towards anything deliberate by webmasters.

    Content that gets repeated across pages seems very challenging. In some cases it may mark importance. In other cases it's distracting. One example – which may not be the best one at this late hour – is 50 states appearing in a sidebar, repeated on hundreds of pages (i.e. a sidebar on every or most pages on a site). I see many cases where including the name of a state in a search returns a site where there is either 1) no relevant information specific to the state or 2) a list of similar pages for all 50 states with no significant content value specific to any state. Somehow, I would love to see such “sidebar content” be excluded or, probably better, have its value reduced where it does not add anything.

    It seems like too many “aggregator sites” are appearing high in search results. I think this is a tough one too because there are some such sites that are useful. There are others that provide no value or service. Basically, some intend to provide a service, others are there for search engines. The latter simply clutter search.

    High rankings for specific terms found on large sites (in traffic and/or content) where the actual matching page is old, not updated and low-value is another issue I've seen. And I'm not thinking of the “top ten” sites. I can think of one particular site – which is actually a scam hurting many innocent people – that does very well in Google search results even though the specific pages returned in search results are very stale. Yet, because (it seems) the site itself has many pages and is updated often from accepting user submissions (very, very freely, with pages built in a very questionable way), these stale pages are shown as far more relevant than I think most would objectively consider deserved. And sadly, in one particular case at least, people are being harmed by a disguised scam. Relevant terms on a particular page on a high-page-volume or highly trafficked domain do not automatically equal search relevance. I hope this sounds fair – I know one size doesn't fit all, because there are examples where that works well too.

    Ok, I think I made this long enough for a blog comment – I’ll stop here 🙂

    Sincerely,

    Matt

  153. Better penalties for sites with links inside [noscript]. I didn't think that worked anymore, but indeed it looks like it does (I've got competitors that use this trick and rank very well). Many blog toplists use this approach to add links to their own affiliate sites.
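
    Detecting this particular trick seems straightforward; a self-contained sketch using only the standard library that lists links hiding inside noscript blocks:

```python
from html.parser import HTMLParser

class NoscriptLinkFinder(HTMLParser):
    """Collects href values of <a> tags that appear inside <noscript> blocks."""
    def __init__(self):
        super().__init__()
        self.noscript_depth = 0
        self.hidden_links = []

    def handle_starttag(self, tag, attrs):
        if tag == "noscript":
            self.noscript_depth += 1
        elif tag == "a" and self.noscript_depth > 0:
            href = dict(attrs).get("href")
            if href:
                self.hidden_links.append(href)

    def handle_endtag(self, tag):
        if tag == "noscript" and self.noscript_depth > 0:
            self.noscript_depth -= 1

page = '<p>visible content</p><noscript><a href="http://affiliate.example/">hidden</a></noscript>'
finder = NoscriptLinkFinder()
finder.feed(page)
print(finder.hidden_links)  # ['http://affiliate.example/']
```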

  154. Hi Matt,

    I hope you are doing well.

    I think the spam policy for Greek-language websites works terribly…

    If you try a search query with the keyword «γάμος» (it's the word “wedding” in Greek) (http://www.google.gr/search?sourceid=navclient&hl=el&ie=UTF-8&rlz=1T4GGLL_elGR335&q=%ce%b3%ce%ac%ce%bc%ce%bf%cf%82 ), all of the first results repeat this keyword in every possible way in their title and description… and they are still there!!! No ban for them…
    The same problem exists for almost all search queries with Greek keywords. I don't mention spam situations concerning the categories my website belongs to, because I don't want you to think I am reporting this just for my personal website. It is a general problem that needs your attention.

    I know that Greece is a tiny country and certainly not a priority market for Google, but I think that we all deserve better results. Spam results make my job as an SEO seem meaningless, and I cannot convince my clients that black-hat SEO tactics could lead to a penalty, because they keep seeing the same spammy results appear in the top 5 positions.

    There are many other problems worth mentioning regarding the “Greek search experience”, but I would be glad if you resolved just this one for the moment…

    I hope you will try your best as always.

    Stratos Dimopoulos

  155. Hi Matt,
    I would love to see better quality in the image results, this seems to be an area where very poor quality and sometimes virus ridden sites get very good results, even by hotlinking someone else’s images.

    It would also be great to see an improvement in the quality required to appear in the Google SERPs. I still see many “made for AdSense” spammy websites appear very high in the search results for longer queries, along with sites where you have to pay / log in to see the answer.

    In my opinion, the whole PageRank system is abused to the point of failure, and the perceived importance of links means that people are going further than ever to buy / acquire links. This means that often a site with lots of links but poor content is displayed far above sites with good content but fewer (or less “optimised”) links. A system that relies less on PageRank would be great to see, but I appreciate that it would be very difficult to implement.

    Lastly, a more visible “report this site” system would be great. The link appears to have been removed from Webmaster Tools, and this makes it quite inconvenient to actually report spam in the index; maybe if reporting were highly visible we would not see as many occurrences of spam in the index as we presently do.

  156. @Peters

    That's a “directive” and not a “law”, as much as the Eurocrats would like them to be considered laws. They are supposed to be enacted by national governments, but national considerations override them – the French farmers have a riot and burn down a town hall if they don't like something, for example, and the directive gets ignored.

    Its about as legitimate as an early day motion or a spoiling amendment put in by an opposition politician which will not pass.

    I would also draw your attention to para 14 in particular “this Directive cannot prevent the anonymous use of open networks such as the Internet.”

    I am sure that some would like the internet to be as closely regulated as the press and media are in Italy and France; that would have stopped those unfortunate expenses scandals, like the cigars a junior minister put on expenses (makes duck houses and moat cleaning look tame), eh 🙂

  157. Hi Matt,
    Really nice post. More information about web spam would be welcome, both on reporting it and on explaining it.
    A spam report page in Google Webmaster Tools would also be nice.

  158. Promote sites that are registered with Webmaster Tools / Google Local Business Center, as these sites have already been verified. The spam will then get pushed down the results.

  159. Paid links! – still a big problem, and I still see many sites using them. There are well-known companies supplying 'SEO' services that use (abuse) Google's name, saying they do legitimate work, but all they do is create hundreds of mini-sites that are useless to users and include links back to their clients. Become a 'client' of these companies to find out what they are doing, then ban all their work from the index.

    Useless mini-sites – one-page sites that are set up only for paid links or for AdSense and provide no real quality information to the user.

    Exact-keyword domains – these seem to have become the new spam favourite recently: a ton of new domains with only 1 or 2 pages that end up ranking well on long-tail keywords when there are actually better sites with lots of content under more traditional (non-keyword-specific) domains. There appears to be way too much ranking power given to the domain name at the moment.

    De-rank eHow and similar content sites that do not provide any proper or in-depth content on a subject, just brief summary articles to get themselves ranked.

  160. @Maurice

    It is in French law, and in the law of Belgium too. Speaking of laws, you could write thousands of pages and I don't want to discuss all of that here. It is in the law, it is a good law, and if you are not happy, you can vote for other political parties; that's democracy.

    I think you have a problem with democracy and transparency. If you want to stay anonymous, OK, but do not provide a service anonymously, because it is just ridiculous. You do not know who is behind it. For instance, if you manage a dating site and problems appear on the site, where is the contact? The same goes for classifieds; there are so many problems with such sites that you cannot be serious with an anonymous editor.

    If you want to hide, create a company and operate behind that company; this keeps you off the front line.

    Look at spam in email: the Sender Policy Framework and DomainKeys are based on exactly this anonymity problem. If there were no free mail services and SPF were applied, mail spam would be eradicated. But we are far from that. Last year, I realised Google was throwing emails with “SPF: pass” into the junk mail box (with no other spam signal, a SpamAssassin score of 0.5/100 and no blacklisted IP), while showing me emails with “SPF: fail” in the inbox :-/

  161. @maurice

    The question of self-regulation you mentioned is very important. If a sector of the economy is not able to regulate itself, then the politicians will do it for them, and sometimes that is a very bad idea, because politicians do not always write good laws.

    The question is: Will Google be able to auto-regulate and solve spamdexing problems ?

    I hope they will, and I hope they are aware of the situation.

  162. I’m getting really tired of sites that don’t have any real content showing up in SERPs. I’m talking about sites that either index message boards or mailing lists and the like, as well as sites that have a placeholder page for every topic imaginable but have no actual, useful content about those topics. IMO, any site that blindly indexes discussions and then just links to the other sites through an iframe should be considered malicious spam and removed from SERPs.

    At the very least, we should have the ability to log in to our Google account and block certain domains from appearing in our SERPs. These sites are taking over the SERPs more and more, and Google becomes less and less useful to me because of it.

  163. “I have seen numerous sites do this, and they seem to keep climbing the ranks in search results, despite being directly against both the spirit and letter of Google quality guidelines.”

    That is what aggravates me as well. While honest webmasters spend hours upon hours writing original content, they don’t get the kind of rankings they deserve, while those “parasite sites” get all the high rankings without any hard work at all. I sincerely hope Google does something about it. Even as a user, it is tedious to wade through page after page of bogus sites that rank for a given search term. The Mayday update was a good step forward in cleaning out the junk sites, but still more algo changes are needed.

  164. Instead of controlling the web, why not let it live, breathe and evolve? The majority of spam exists to move up in the Google SERPs. You could focus more on user trends and personalized search and sneak in results from new and related sites, i.e. try to determine the searcher’s path using past searches, and derive importance from user votes and involvement instead of what the webmaster/SEO has optimized for on the page. Anybody can put in a false title, heading etc., but they can’t control the user. This will stop people getting desperate/lazy and spamming the hell out of the web.

  165. Without a shadow of a doubt – PPC.

    An example would be “passport” in the UK... half a dozen adverts for premium-rate lines, probably all from the same company. Disgusting in my opinion; the same with benefits etc.

  166. My suggestions:
    Many businesses are not making it and are closing because times are tough. However, I can still find them in Google Local and Google Maps.

    For example, if you google “soccer store in vista California”, it will show you two stores, but I know for a fact that both of them are out of business and have been closed for quite a while.

  167. One more, sorry.

    I think Google is putting way too much focus on site history and age. I see websites coming up that have no recent or updated content that could be useful to me, and as a user I often find them misleading. I think that if a website doesn’t update its content at least monthly, it should drop in ranking. Everything changes every day; there is no point in keeping a site’s content static.

    I know Caffeine is geared toward faster indexing, but I haven’t seen much improvement in the rankings of those who have fresh quality content daily. Many of my clients provide good information weekly to their clients, yet they rank much worse than their competitors because of the domain age and/or website history of those competitors.

  168. Thierry Le Fort

    As I was mentioning to Stephane de Billy, there are example queries where a parked domain is ranking and where a shine.yahoo.com spam page sat for ages (still in the top 20; Yahoo removed the page, but Google didn’t automatically catch it). I have also found a lot of long-tail stuff recently that is ranking when it shouldn’t be, sometimes 4 or 5 out of the top 10 results being fluff pages.

    Additionally, spam is really frustrating

  169. I would like to see far fewer shopping-type sites on the first page, sites like Amazon, Nextag etc. A lot of the listings on these sites are identical, and if I am just looking for reviews from real users or some information about the product, then I have to forget about the first page of Google.

  170. After reading through most of these comments it is pretty clear that all of the “mechanical” solutions are flawed and can be gamed by clever spammers. To me the answer is pretty obvious: put more people on the job. No disrespect to you or your team, Matt, but there are more spammers than there are spam cops working for Google, and until the spammers’ efforts are matched in man-hours, it is a fight that Google, or any other search engine for that matter, can’t win. Now, we all know that it isn’t a question of money, as Google certainly has fairly big coffers, so it must be a case of pride. Perhaps it is time to admit that the problem is a bigger one than Google can easily solve with programs, and just hire more spam cops to weed out the obvious spam. No one will think less of the mighty Goog; people will more than likely appreciate the effort and the honesty in admitting that sometimes people need to help computers solve problems. Go for it and toss Lionbridge (Nasdaq: LIOX) some more money to help clean the indexes.

  171. Even with the issue of false positives... I agree with @steveplunkett and want to see a Chrome extension. I’ve already identified 10 different methods for hiding text on a page, but I’m still looking for more samples... so send me all the bad SEO on the web you can find! http://j.mp/cMkaCN (please submit samples here)
    I’m looking for instances of hidden text, malicious doorway pages and user-agent/IP-based cloaking.
    I’m thinking that because an extension has DOM access it may be easier to test for things, since it is actually interpreting all the JavaScript and building the DOM.
    Bother me on Twitter @cartercole
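
    A minimal sketch of the static half of such a check, assuming the third-party beautifulsoup4 package; it only flags inline styles (display:none, visibility:hidden, zero font size, large negative text-indent), whereas the DOM-based extension approach described above would also catch stylesheet- and script-applied hiding.

    # Flag elements whose inline style hides a non-trivial amount of text.
    # Static HTML only: external CSS and JavaScript-applied styles are not
    # inspected, which is exactly why an extension with DOM access sees more.
    import re
    from bs4 import BeautifulSoup  # third-party: beautifulsoup4

    HIDING_PATTERNS = [
        re.compile(r"display\s*:\s*none", re.I),
        re.compile(r"visibility\s*:\s*hidden", re.I),
        re.compile(r"font-size\s*:\s*0", re.I),
        re.compile(r"text-indent\s*:\s*-\d{3,}", re.I),
    ]

    def hidden_text_blocks(html, min_words=5):
        """Yield (style, text) pairs for inline-hidden elements with enough words."""
        soup = BeautifulSoup(html, "html.parser")
        for el in soup.find_all(style=True):
            style = el["style"]
            if any(p.search(style) for p in HIDING_PATTERNS):
                text = el.get_text(" ", strip=True)
                if len(text.split()) >= min_words:
                    yield style, text

    if __name__ == "__main__":
        sample = '<div style="display:none">cheap widgets buy widgets best widgets now</div>'
        for style, text in hidden_text_blocks(sample):
            print(style, "->", text)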

  172. Matt, thanks for giving this chance.

    You know dmoz.org (I’m an editor there too): people try to make the web better and expect nothing in return for doing it.

    So let’s create a project; the name could be the Open Clean Web Project. People could join as editors, and there would be a “Report Spam” link in the footer of the search results page. When visitors bury a search result, the editors could review it and choose “Mark This Web Site Spam”; after this, a request would be sent to your department. After that step, it would be your choice whether to clean up that result.

  173. @ Josh L
    Great plan!

    I think a button like the favorite button but the opposite (“unfavorite”) could prevent results from showing up on your pages.

    -> Maybe even importing/exporting these unfavorited sites (as lists) should be possible.
    -> If some group of users then continuously updated such a list, I would only need to subscribe to the ‘wehatenocontentwebsites’ list, and the hiding/updating could happen automatically.

  174. @Peters

    I think the political risk Google faces is unlikely to come from spam in its index; it’s more likely to come from political pressure from the usual EU or old-media suspects (but they are busy attacking the BBC at the moment) 🙂 or from a senior Googler doing a Ratner.

    For those of you who don’t know, Gerald Ratner was the CEO of a large UK company who made some off-the-cuff remarks at an IOD function:

    “We also do cut-glass sherry decanters complete with six glasses on a silver-plated tray that your butler can serve you drinks on, all for £4.95. People say, “How can you sell this for such a low price?”, I say, “because it’s total crap”.

    This almost caused the company to collapse, destroyed around half a billion pounds of value, and led to him resigning.

    Though for US readers, doing a “Steele” might be a more modern reference.

  175. The Chinese Spammers that at one time dominated the top 4 organic positions for polo shirts are back with the same companies using different URLs.

    Now they hide their backlinks so you can’t see that they are still using the same techniques: thousands of pages of content containing utter nonsense and gibberish, except for the text links that point back to them.

    I have seen this technique used in many other SERPs and as long as Google lets this continue, it just encourages other people to do the same.

    In addition, these sites engage in trademark infringement and sell counterfeit name brand polo shirts as well.

  176. Fixing phone-number lookup SERPs. Anytime a user searches for a phone number, i.e. “305-555-5555”, we are inundated with fake directory listings. These pages disrupt the user experience. When I search for a number I would prefer to see results that actually contain it, not just spam pages with every number imaginable.

    Thx

  177. Try to find people who register multiple variations of a domain name for their company (as random examples: site61.com, site1sixty1.com, sites6one.com), and then construct a website on each domain name that is just different enough so that the search engines don’t catch it, but all sites are selling the same service so they dominate results on an SERP. I see this quite a bit.

  178. Hey Matt,

    Foreign spammers were mentioned numerous times, and
    what about ALL THOSE SCAM SITES!?!? For instance, when you look up “green card lottery,” you get a page full.

    I think people should be able to “flag” sites in the search results or “digg” them if they’re relevant. That way you’ll also rely on user-generated search results (to some degree), rather than strictly depending on Google’s algorithm, which will never read things the way a human does.

    Either way, I think you should find a way to get users to play a role in search results.

    Best of luck!

  179. Today most SEOs place similar content in articles, press releases, blogs, and comments to get links from other sites. Individual websites are not able to detect this duplicate content, so they do not treat it as spam, but if Google tracked identical text carrying the same links across websites, it would be a great success for the Google spam team.

    thanks..

  180. Is this really Matt Cutts’ blog? Where are you, Matt? Webmasters are waiting 🙂

  181. Create a Google web spam team call centre per country and top city (toll free) to register complaints on the spot and give instant answers to everyone. If you do this, then based on the conversations and the types of complaints, you will undoubtedly come to know what your next move should be.

  182. An option to remove inactive social media pages from the SERPS would be nice. (Or perhaps all social media?)
    Nothing worse than searching for something, only to find a Vimeo page with one six month old video on it that just happens to have been tagged well!

  183. Hello Matt. If Google builds something like votes against spam links, be sure that the next day someone will open a small business where 1,000,000 people from different countries vote for or against someone for a cent each, for example. It is a very good idea, but not easy to implement at the same high level.

  184. I think one of the biggest areas that should be looked into is how spammy domain names are now ranking really well after the caffeine update. I see several domain names that are targetkeyphrase.com or targetkeyphrase.org that are created by larger companies for the sake of getting traffic for a money making phrase so that the searcher clicks through to their main site. Affiliates and made for adsense sites are abusing this too. This goes against Google’s best practices and still seems to be FAR too prevalent lately. I hope the webspam team is currently working on this because since caffeine and mayday happened, it has been a big problem. This doesn’t give the searcher a good experience and allows for lots of web spam.

  185. Why does Google index UTM tracking urls?

    I would have thought your algorithms would know to immediately discount them as being purely for tracking purposes.
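
    For what it is worth, canonicalising away tracking parameters takes only a few lines; a sketch using just the Python standard library, with the utm_ prefix from the comment above and gclid added as an assumed extra tracking key.

    # Strip utm_* (and similar) tracking parameters so URL variants collapse
    # to one canonical form for indexing and de-duplication.
    from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

    TRACKING_PREFIXES = ("utm_",)   # utm_source, utm_medium, utm_campaign, ...
    TRACKING_KEYS = {"gclid"}       # assumption: another common tracking key

    def canonicalize(url):
        parts = urlsplit(url)
        kept = [
            (k, v)
            for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if not k.lower().startswith(TRACKING_PREFIXES)
            and k.lower() not in TRACKING_KEYS
        ]
        return urlunsplit(parts._replace(query=urlencode(kept)))

    if __name__ == "__main__":
        print(canonicalize("http://example.com/page?id=7&utm_source=news&utm_medium=email"))
        # -> http://example.com/page?id=7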

  186. Results for individuals are including more and more high-ranking spammy aggregating/indexing sites that confidently purport to know everything about search-term-person, but know much less than Google.

  187. I would like to see Google give more recognition to websites that continually change and update their content. I have several competitors that haven’t changed a word on their websites in years yet rank higher than me in the SERPs. It is so frustrating.

  188. Set up a white-hat link exchange with quality control, like you already do with AdWords (landing page quality) and AdSense (approved sites + competitive sites filter). Why? There is a need out there for quality sites to link to other quality sites, but today this is very complicated. So if you organized this link exchange with a “google approved link” seal, it would help very much. This way you could reduce paid links and low-quality links very fast: noise reduction by design.

  189. How can Google possibly fight MFA sites when this is the example they set for webmasters: http://i.imgur.com/WWaUR.png

    9 ad results above the page fold and 3 results? How is adding more ads in more prominent positions and adding “something different” search terms helpful to the user? People who aren’t familiar with SERPS or internet advertising would have no idea the first 3 results were advertisements. Show that search result page to an internet illiterate person and see if she can distinguish the ads from the search results. This isn’t even the worst result page I’ve seen lately.

  190. Everybody is going to have their own gripe about what they want, and reading the many different replies I can see that. Many of us webmasters/site owners don’t like it when our content is scraped, which seems to happen a lot, and also dislike sites with no content that add little value to the user experience. The one thing I can say is that the results are so much better now than they were a few years ago.

  191. Hey Matt,

    I generally think you’re doing a really good job on SERPS right now – I’d suggest:

    * Knowledge Transfer within Google (Groups and Youtube spring to mind as needing some spam detection love)

    * Sharing knowledge about bad sites with other sites / companies – could you make any lists of compromised ip addresses / domains public? I’m sure a lot of us would love that data.

  192. Your team does some amazing work – I don’t think people realize how much & how many types of spam you fight. There’s a tendency for people to only look at their own situations and think to themselves, “it’s easy, all you have to do is this ____.” I think it would be interesting for you to share some numbers that would shed some light on the battles your team has fought.

    You may already do this, but I think you should take your Chrome data, toolbar data, and the 3rd-party data that you already subscribe to and look for sites that get 90-95% of their traffic from Google. It’s a clear sign of a site that is exploiting something specific to Google. A quality site should get direct traffic, traffic from other search engines, and traffic from other sites.

    New stuff to keep an eye on: real-time search spam. Occasionally I’ll come across a search that has universal search results with scrolling twitter updates or latest news and I’ll see Tweet spamming (usually multiple accounts) and news spamming (usually lower quality “news” sites) pushing out tweets or news pages to flood the results. I saw it a lot initially (single user tweeting over & over) which has been addressed, but there’s still opportunity to improve and real-time spam will only get worse unless sites like Twitter improve their anti-spam mechanisms.

  193. I would like to see Google adopt a more balanced approach to dealing with suspected spammy links. Unfortunately, there are plenty of people other than the owner of the site who can easily and quickly build these kinds of links. The way it is now, I’m almost afraid to rank too well for any given search. The way your system currently works, it is way too easy for a competitor to put you in a position where your site gets a ranking penalty by leaving a ton of comment spam or garbage links. I think it would be a much better idea to just ignore links that are suspected of being spam. That way the spammers don’t win by getting any boost in rankings while legitimate sites with quality content don’t have to fear showing up at the top of the search results where they are most likely to attract attacks directed at making them look like spam for the purpose of making them vanish.

  194. +1 on a few of the suggestions mentioned above for:

    Auto-generated Google Alerts that are utter garbage.

    Content change monitoring – per the comment from Carmen Brodeur. We work hard to give genuinely useful and valued content yet the reward, though there is one, is not what it should be when compared with other sites.

    Account for the fact that .co.uk domains cannot be extended beyond 2 year renewal dates so are unfairly treated in comparison to .com.

    Finally, I’d echo this from Bill Slawski:

    “1. A feedback link for every web page result in search results, requiring a login to a Google Account, which allows someone to fill in a detailed report, along with a place that helps people track their reports.”

  195. I think the comment about giving more emphasis to sites that continually update their content is fair. There are a lot of stale sites that have not updated in ages and continually rank high.
    Thanks for the good work.

  196. It’s amazing how many types of spam there are out there. Off the top of my head, I would suggest removing inactive sites (mainly social sites) from the SERPs.

  197. Google spam team should work on NOT SCREWING over the webmasters who have followed their guidelines and supported them since the start.

    The winds have shifted and many Google supporters have jumped ship myself included.

    You get paid the big bucks, so you figure out what Google needs to work on.

    Although I’m sure the lemmings here who follow you will be like little puppy dogs and give you all kinds of ideas.

  198. Malware notifications in webmaster tools are amazing because Google literally tells webmasters when it notices a problem with their website. I personally think Google should begin to add more features that give direct notifications of problems to webmasters.

    What if Google added a “suspicious external link activity” section to webmaster tools? If Google can detect malware, why couldn’t it detect spammy or “unusual links” placed on pages and notify webmasters of them. This would save webmasters tons of headaches in trying to diagnose lost rankings and also put more responsibility on quality sites to clean up their comment sections and abused sections of individual pages.

    But why stop there? Instead of dropping the ranking of sites Google doesn’t want showing up in its SERPs for whatever reason, wouldn’t it be better to directly tell webmasters what you DO want and suggest changes to help them rank better? Webmaster Tools already informs webmasters of missing title tags or short meta descriptions, but what about showing pages Google deems over-optimized, or pages Google feels are lacking content or could be improved in some way to help them rank better?

    I think Google employees should try to put themselves in the shoes of a webmaster trying to build a well ranking site with very little advice from Google but 100 trillion pieces of bad information provided by SEO blogs and forums which almost all encourage black hat activity. It’s hard to build a quality website when you have no idea what Google actually considers “quality”.

  199. I would like Google to create more algorithms that lower rankings and keep hush about what those are, other than “do what is right and you will be fine”.

  200. I have to agree with Boris. I run a Google News accredited site, and our content is being scraped all the time. Obviously I am happy when people link to our articles, but the wholesale scraping, often with no credit, is a terrible situation.

    Most of these sites are worthless, containing only scraped material. I’d like to see Google crack down on them. Obviously this is a complicated area, particularly in the news world. Many organizations have reciprocal arrangements; the issue is deciding when duplicate content is legitimate and when it is stolen.

  201. I personally do not understand how and why Google keeps websites with just ad links everywhere in the index, especially parked domains and the like. Take them out of the SERPs. This also involves the AdSense for Domains issue. Why index those pages? What’s the point?

  202. 1. Have Chrome/Firefox extensions to send a spam report with an (optional) explanation. More reports about a particular site and you’ll notice how spammy it can be.

    2. Focus on sites that scrape other sites’ info. Visiting those sites is utter garbage and can be dangerous.

    3. Focus even more on sites that buy links from hundreds of sites in bulk. These sites ruin the way links should be built naturally.

  203. Steven, you’re wrong; Matt can’t be telling you what to put on your page, it’s like with overtrained neural networks. Just write quality pages and God will reward you 🙂 Moreover, Google and black-hat webmasters are natural adversaries; why don’t you appeal to them to inform Matt in advance about their moves?

  204. Google News. How so many of those sites got accredited as news sources is beyond me. Most are simply scraping Google Trends and filling pages with nonsense scattered among AdSense, with absolutely no added value.

    Thank you in advance for your time.

  205. Google needs to come up with a system of judging the true value of content on websites that goes beyond links and other algorithms that can be gamed. With your budget, I would think you could almost pay a team of experts to read sites to evaluate whether they are pumping and dumping generic content or if the site is a genuine source of unique, valuable information.

    For example, the website ambienoverdose.org contains what is likely the Internet’s third best source of information on the sleeping pill Ambien. The site is a huge archive of stories from actual Ambien users. The site doesn’t rank in the top 100 for the term Ambien, despite the majority of the top 100 sites being fluff and the website being in existence for years.

    This occurs over and over in Google search. A computer might never be able to distinguish between an offshore content producer writing low grade answers on a Q/A site for $1.50/hour and a doctor providing a detailed analysis of a problem. On the other hand, I can generally identify a site with useless content pretty quickly. I’d like to see Google become better at this.

    Thanks!

  206. For very competitive keywords: the ability to file a complaint, then Google listens to the arguments of both parties and a decision is made. This way we would know whether a given technique is approved or not.
    I posted a spam report 8 months ago and the guy still has 3 websites in the first 16 results.
    I don’t want to do as he does and multiply my content across 3 domain names. I could; it would be very easy for me to take a template, buy a domain, and imitate this guy. I consider that he is cheating, and there is no reaction from Google. Spamdexing in French is very common. I believe Google is not aware of the problem.

  207. Would really be useful to reveal more forensic information for sites having difficulty, especially for sites that are penalized. We help in the recovery process and know that many sites that get penalized are actually themselves victims – of other seos, their own ignorance, etc., and that no malfeasance is intended. I understand that revealing too much is not wise, and that there are bad players. But for those who actually intend to be good web citizens but inadvertently triggered a Google penalty, there must be a better way.

    I’m particularly alarmed at the recent pullback on information available from within Webmaster Tools on inbound links. We used to be able to download the full list, and now we’re restricted to a tiny sample.

    Given Google’s emphasis on relevant, organic links, and the fact that Google penalizes sites with inappropriate linking, why would you intentionally neuter one of the most valuable tools used for the discovery of rank issues triggered by links? When attempting to diagnose a site’s ranking or penalty issues, we are no longer able to see the links that you are crediting the site with, only a minuscule sample, many of which are often multiple links from only a few domains, making this information useless.

    I strongly encourage you to revert the discovery metrics within WMT regarding inbound links, plus 2 other useful discovery items:

    Redirect Detector – sites get harmed by chaining redirects, but most don’t even know they’re doing it. Would be great to be able to see that from within WMT

    Penalty Confirmation – I know this is asking a lot. But it shouldn’t be. Can we at least know, from within WMT, are we penalized, yes or no?

  208. check recently expired domains and compare these to “new” websites that have a lot of outgoing links

  209. Do not rate the keyword in the URL so highly – it’s too easy for spammers.

  210. My case for making the Google Ranking Algorithm more open:

    Google keeps many details of its ranking algorithm as a relatively closely guarded secret. In theory, this makes it harder to game the algorithm because people don’t know what to do. But I think the actual effect is to punish website designers who focus on building good sites, because it rewards SEO lifers who spend massive amounts of time reverse engineering the algorithm. Some aspects of the algorithm may need to be kept secret, but I would at least like to see some discussion of opening it up.

  211. To be honest, Google should remove backlinks from the algo. It’s not worth it anymore and if it gets removed all backlinks will be done naturally!

    Sites should be ranked according to which has the most relevant content for the USERS; it’s the only way to win a lead/buyer. I have several sites that rank well and convert extremely well, and not one of them has any backlinks!

    Just my 2 cents!

  212. 1. Devalue article directories. Most of their articles, even on the big sites, provide little useful information, since they are mostly written to generate links. I hate coming across the same article on ten major sites.

    2. Better control of forums. Perhaps only index the first post in a thread, as that usually provides enough information to know what the thread is about. I dislike it when I find 10 posts indexed and appearing for the same search terms, because I already read the entire thread after finding it the first time. I don’t want to keep revisiting that thread.

  213. Definitely would like to see more effort on the linking front. Far too much link spamming going on, and as long as links continue to be a significant ranking factor, then folks are going to keep doing it.

  214. I don’t have any idea why I can still find those lovely “What you want, when you want it” pages. It doesn’t happen very often, but when it does it’s incredibly disheartening to find a page composed entirely of affiliate links in the index. It’s even worse when every link on the page points to the same e-commerce site. I’m not a programmer, but I feel something that blatant should be easy for Google’s algo to detect.

    And given that Google takes such a strong stance against paid links, I haven’t the slightest idea why all these fine upstanding sites are even indexed: Google Search

  215. TEMPLATE sites repeat canned information with slight modifications and clog the web for surfers trying to find real source data. Can you detect repetition and downgrade template sites?

    Similarly, sites republishing and repackaging stale second-hand DATA (like real estate listings) detract from the source sites (like local realtor databases) that enforce strict rules on collection, policing, and freshness. Even realtor.com republishes data collected from local databases. Other sites (like newspapers) publish data that is even more stale.
    Local realtors provide DIRECT access into the local database, packaged with local assistance, color, data, and accountability.

  216. Hi Matt, maybe the web spam team can work out the issue of local listings being hijacked or merged, which allows our listings to be taken over. If I may provide a little example here: I know most people here could care less whether the industry I build websites for has spam or not, but I do, and it’s frustrating because I never get a response when I report a problem. I don’t know if it’s because of the type of industry, but I could really use feedback/advice from you, Matt, or anyone else who might be able to point me in the right direction. I have a local listing that is being merged with my competitor’s, who is able to change the phone number, and now they’ve used the “tag” feature to add their site under mine; chances are people will hit the tag because it’s more visible, I’m guessing. If you do a search, our site is ranked #1 organically, and was #1 locally for a long time until, I believe, they somehow verified my listing. If you search for “Las Vegas Escorts” you see our site http://www.lasvegasescorts.com organically and locally, but they have their phone number there and now the tag. Is there a way to correct this? Isn’t this considered spam? I don’t know where to turn from here; any help would really be appreciated.

    John

  217. I know my thought might throw you off, but I think it would be amazing if you did something extra. Google presents results 1-10 for any query; I want to see 10-1. Just reverse the listing every month, because for business terms everyone should get some sort of fair chance to do business. The current system makes competition unfair: giant companies get easy business because they have far more revenue to invest. Everyone is going for the first-page listings; no one bothers about the 2nd, 3rd, 4th or even 10th page. I want Google to create a platform for everyone, not only for those who are doing huge amounts of SEO work and other online promotion.

    Amit,
    I love to think something “ExTrA”!

  218. Could you make a strong effort to remove some of the low-quality bulk backlinks? Free directory spam and forum profile spam are rife in the SEO community at the moment; it wouldn’t be so bad if these bulk bad submissions didn’t work, but they do, and it’s frustrating at best.

    Shoot me an email if you want to see any examples, I’ve got plenty!

  219. Within the industry that I work in, we’ve seen a huge shift towards companies buying exact match domains and being able to quickly rank well, often with little to no relevant content. We see sites littered with dashes, and non .com domains and it leaves us scratching our heads as to why Google will populate so many results with these mini-sites. As well, I think it shifts company focus towards domain acquisition and masking ownership when efforts could be better spent towards quality content.

  220. Navigational nightmares – whether they are scraping forum content and then linking via iframes, or scraping blogs where the only link is to another search based upon the title of the post, it becomes a really poor user experience, and in most cases the only way to escape is to click an AdSense ad.
    If you submit a report to the AdSense team for an AdSense ToS violation, they tell you to file a DMCA notice... but that isn’t the issue – people can legally take my content, and other content that has some kind of “free” license, and build all kinds of crap. The only weapons are spam reports or AdSense reports, not the DMCA.

    I am quite happy for people to take my full content, even to modify it, even without providing a link, as long as it still provides a good user experience which doesn’t reflect badly on me or pollute the web.

    Oh, and please... I saw a few suggestions to use Akismet (I know you asked people not to base things on others’ comments... but I peeped) – Akismet is terrible for false positives, and has privacy issues, especially cross-border.
    I know you would need a combination of signals, but Akismet is so easy to game.

  221. Hi Matt
    For almost 2 years, our site has remained under a penalty. We are victims of other SEOs and of our own ignorance, but for those who actually intend to be good web citizens and inadvertently triggered a Google penalty, there must be a better way.

    We have been trying to resolve our penalty using the inbound link information provided within Webmaster Tools, and we find that we cannot do so with it restricted to a tiny sample.

    To help us, could you not restore the link information provided within WMT, with the addition of:
    1. Telling us which links are bad so that we can deal with them, and
    2. Penalty confirmation – can we at least know, from within WMT, whether we are penalized, yes or no?
    Kind regards

    Harminder

  222. Apologies that I cannot think of a clearer way of showing the problem than naming the sites, but it’s pure and simple webspam, compounded by the fact that the sites are revolting and you guys are still ranking them all highly despite the fact that all the following sites are one company:
    .bigredl.co.uk/links.php – here’s the give-away!
    .intensives.co.uk
    .2pass.co.uk
    .learnerstuff.co.uk
    .driving-schools-directory.co.uk
    If you take a quick look it’ll take you straight back to the ’90’s!

  223. Some way of finding out why a site would suddenly drop from 1st place to 45th in the SERP for a given phrase.

    It doesn’t have to be a detailed report – but a reason why, to give direction of how to improve.

  224. Hey Matt,

    I am a webmaster for a very large company and we always monitor WMT backlinks to make sure no one is linking to us from a bad neighborhood. Only offering a sample of backlinks makes it impossible to vet undesirable links. Please rethink your strategy for showing incoming links; otherwise it makes it more difficult to remain compliant and probably almost impossible to recover from a penalty if the site gets penalized due to incoming links.

    Thanks for your time!

  225. I think Google should focus a little more on the following:

    1) Penalize content-scraping sites

    2) Penalize obvious paid links (especially those that have been reported!!!)

    3) Stop juggling with the results! Sometimes, sudden result changes simply make no sense. It doesn’t help visitors when low quality websites take over the spot of websites that fully comply with the guidelines.

    4) Reward authoritative links that denote the quality and safety of a website.

    5) Reward track record and quality!

    6) Being a bit more open about the algorithm would be great 😉

    7) keywords in the root domain URL shouldn’t be that important

    How can a recently-built, spammy looking website, whose backlinks come entirely from blog comments and forum posts out-rank well established websites that comply with Google guidelines??

  226. I agree with this guy (see below)

    Michael Hoskins June 30, 2010 at 8:28 am
    Please work on penalizing scraping sites purporting themselves to be useful. As an example, say I aggregate specific forum topics (such as Javascript programming), and present this content on my site. Without adding any real value, I present the content surrounded by my ads, and a link to the scraped content. What I am essentially doing is making myself the middle-man and skimming ad revenue off the top of what Google would normally do without providing value to the user.

    I have seen numerous sites do this, and they seem to keep climbing the ranks in search results, despite being directly against both the spirit and letter of Google quality guidelines.

    Here is an exact example that Google has ignored for over 6 months now:

    http://southbeachmanwhore.com/

    This site just scrapes content from Yahoo.

    Stop penalizing innocent sites, admit and fix Google mistakes and treat all people fairly.

  227. Please clean up Google Maps!!! In Phoenix, AZ, and the surrounding suburbs, there has been a home-based, one-man computer repair shop spamming Google Maps with 20+ fake listings. We’ve reported him for over 2 years and nothing. We even emailed Mike Blumenthal about it, and he said to send the info on the spam and, if it was good, he would forward it on. He agreed with the spam accusation, but Google only removed a few listings, and now tons more have shown up. This guy is listing 4-5 shops at one residence and another 10 or more at fake addresses, vacant office spaces, or on streets with no number listed, and Google is listing them all in the top 4-5 for some searches; he’s in the top 2 for every suburb in Phoenix. He’s even posting fake reviews, 400 on one listing, mostly by the same reviewers, and even posting one identical fake negative review on 15 or more other shops’ listings, and that review even tells users to use his repair shop and lists his address and phone!!! This kind of stuff should not go unanswered for 2 years while one scammer owns an industry in a city of 6 million people!!!

  228. 1) +1 to all suggestions to better identify and remove MFAs, scrapers and autogenerated microsites.

    I noticed several comments about abuse of exact match domains. IMO, those are only a tool, not the root of the problem. Improve your means of filtering out useless content and you’ve eliminated those abusers (and many, many more.)

    Want to get rid of a lot of the MFAs and mass-produced minisites in one sweep? Start cracking down on AdSense publishers without a privacy policy. I can’t begin to say how many of those kinds of sites don’t have one.

    2) Be more proactive about penalizing paid links

    3) Get rid of toolbar PR – perhaps substitute a simple yes (has PageRank) or no (has no PageRank) indicator. Again, I’m thinking about the cause of the abuse – toolbar PR has become, and continues to be, an economy unto itself and the incentive for many a paid link.

    4) Penalize spammy link wheels – plenty of those still out there.

    Thanks for the opportunity to sound off :)!

  229. very good list of suggestions here already, but personally I would like to see more action taken on websites that display scraped content. The majority of these websites seem to have suspicious javascript wiles which could be classed as malicious.

    Thanks,
    Scott Mc

  230. I think Google should work harder to fight duplicate content, as pages with essentially identical content are still ranked on page 1 for the same keywords. For example, if you search “aftermarket ink hp 5000” in Google you can find “notionage.com/blog/…/aftermarket-ink-hp-5000-the-seo-challenge/” and “www.seostudiosix.com/aftermarket-ink-hp-5000-–-the-seo-challenge/” ranked in 1st and 2nd position, and both have matching content.

    There are several other examples I can show to prove that Google is still pretty much ranking websites with duplicate content.
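
    A toy sketch of one standard way near-duplicate pages are detected, using word shingles and Jaccard similarity; production systems hash the shingles (MinHash/SimHash) over cleaned page text, but the idea is the same and the similarity threshold is a matter of tuning.

    # Near-duplicate detection sketch: compare two documents by the overlap
    # of their 5-word shingles using Jaccard similarity.
    import re

    def shingles(text, k=5):
        words = re.findall(r"[a-z0-9]+", text.lower())
        if len(words) <= k:
            return {" ".join(words)}
        return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

    def jaccard(a, b):
        if not a or not b:
            return 0.0
        return len(a & b) / len(a | b)

    if __name__ == "__main__":
        doc1 = "Aftermarket ink for the HP 5000: an SEO challenge article about duplicate content."
        doc2 = "Aftermarket ink for the HP 5000 - an SEO challenge article about duplicated content."
        sim = jaccard(shingles(doc1), shingles(doc2))
        print("similarity: %.2f" % sim)   # values near 1.0 suggest near-duplicates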

  231. It would be great to see a resolution to the problems with Google Images and people hijacking images. Lots of my photos have been hijacked :/
    Kudos to the webspam team though, you guys do an awesome job.

  232. Product searches are being taken over by large 3rd-party sellers such as Amazon, Buy.com and Overstock, so much so that Sears and Walmart are now letting outside sellers onto their sites the way Amazon does.
    I would like to see sites that sell and deliver goods themselves get back up to the top few search results. For example, our site, http://www.buymebeauty.com, is doing a great job but is losing much of its sales to Amazon’s monster ranking on all the products we both sell.

  233. The easiest one to tackle: keywords in URLs. In 2010, the example sites given above (keyword1-keyword2.com) should not rank well if that is their only trick. I admit things have got a lot better recently compared to 3 years ago, but keywords in the URL still carry way too much weight.

  234. When I’m searching for a product, I’m mainly interested in customer experiences and not in shopping locations.

  235. Whether or not Google runs spam projects, I still see a bunch of sites ranking due to paid links, which are moreover hidden on the page and obviously belong to the same networks, as the code looks exactly the same everywhere. Is anything being done about this except for having users report those links?

  236. Prevent YouTube spam – some people abuse YouTube by uploading nearly identical or useless videos and stuffing the description with tons of keywords, just like in the early days of webspam. Google of course currently ranks YouTube videos extremely well, and they “stick out” from the mass.

  237. Can you build an extension that shows a percentage (or ranking) of spam websites or sources of spam emails on a specific IP address? This way, users can be aware if they are traveling to a website that is on a known spam-laden server. If servers are reported to have a high level of spam content, they will likely lose traffic for all their clients. If this persists, these hosts will likely lose a fair bit of business for allowing spam content.

    I believe this would be a relatively small project for Google that would encourage a big change in spam throughout the internet.
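
    A rough sketch of the aggregation such a tool would need, assuming a list of (domain, is_spam) labels already exists from spam reports; it resolves each domain with the standard library and reports the spam share per IP. Shared hosting, CDNs and multi-IP sites make the real problem considerably messier.

    # Aggregate spam labels by hosting IP: the higher the share of
    # reported-spam domains on an address, the stronger the warning.
    import socket
    from collections import defaultdict

    def spam_share_by_ip(labeled_domains):
        """labeled_domains: iterable of (domain, is_spam) pairs."""
        counts = defaultdict(lambda: [0, 0])   # ip -> [spam, total]
        for domain, is_spam in labeled_domains:
            try:
                ip = socket.gethostbyname(domain)
            except socket.gaierror:
                continue
            counts[ip][0] += int(is_spam)
            counts[ip][1] += 1
        return {ip: spam / total for ip, (spam, total) in counts.items()}

    if __name__ == "__main__":
        # hypothetical labels coming from user spam reports
        reports = [("example.com", False), ("example.org", True), ("example.net", False)]
        for ip, share in spam_share_by_ip(reports).items():
            print(ip, "%.0f%% of reported domains flagged as spam" % (share * 100))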

  238. I guess here is as good a place as any to vent... You guys really need to work on all the spam websites being developed in China and the spam backlinks. Last year at about this time it occurred for a few months, and then these Chinese websites dropped off. Now they are back with a vengeance. I just got canned by a client who is in the designer clothes industry. For the past four months these Chinese websites have taken over the rankings within the designer jeans category. Any highly searched phrase is rife with webspam. C’mon, really: a domain registered in China and backlink-spammed to the hilt is now more relevant than a US site that is trying to play by the rules? My client was doing well and his ecommerce business was growing, but not now. Take for example the search phrase “true religion jeans”. You have in the top twenty:

    Domain name: jeans-classic.com
    Creation Date:2009-07-04
    Registrant City: putian
    Province: Beijing
    Country: China

    Domain Name: COOLTRUERELIGIONBRANDJEANS.COM
    Created on: 09-May-10
    Registrant:shanghai, shanghai 200080
    China

    Domain Name: jeanscvs.com
    Registrant:Mingwu, Fujian 345411
    China

    Domain Name : jeansonfire.com
    Creation Date : 2009-05-09
    Registrant:City : PUTIAN
    Province/State : FJ
    Country : cn

    Domain Name : ENIKESHOP.COM
    Creation Date : 2009-06-27
    Registrant:City : putian
    Province/State : Fujian
    Country : CN

    Domain Name: JEANSCLASSIC.COM
    Created on: 17-Dec-09
    Registrant:Xiamen, Fujian 350002
    China

    Domain Name : MYLIKE123.COM
    Creation Date : 2009-06-29
    Registrant:City : guangzhou
    Province/State : Guangdong
    Country : cn

    DOMAIN: HQTRUERELIGIONJEANS.COM
    created-date: 2010-03-01
    owner-city: Guangzhou
    owner-state: Guangdong Province
    owner-zip: 510000
    owner-country: CN

    DOMAIN: EDHARDYSHIPPING.COM
    owner-city: putian
    owner-state: Fujian
    owner-zip: 216001
    owner-country: CN

    And that is only the tip of the iceberg. Can you really tell me that these sites are more relevant than the US sites that have been ranking here for the past few years? And this is only one phrase; I have examples of hundreds of phrases where this occurs.

    The upsetting thing is that I have been telling my client since this happened last year that he needs to be patient and Google will correct it, and that there is no reason for us to do this backlink spamming. Now it has been 4 months, his traffic has dropped like a rock, and so has his online revenue. Boy, did I look like an idiot when, after 4 months, these Chinese sites are still ranking all over the place in the top ten. Now his online business is failing and I lost a trusted client. He is now searching for an SEO firm that will do these spam backlinks. You guys had better get your act together or you will start to lose the trust of searchers when this Chinese infiltration starts to affect other industries.

  239. Hi Matt. I can’t find a contact form, so I will write here.
    I know you are not the boss of Google, but maybe you can suggest this to the upper levels:
    what about one day per year without the “google search engine”? Could this be possible? A day when people can’t search with Google. 🙂

  240. I like Gigi’s idea of being able to mark a site as being useful or not

  241. Well, there are already a ton of filters, gadgets, etc. out there to battle spam. A better idea would be to create a bigger and better database, kind of like Spamhaus has, only with the ability to block entire network IP ranges or specific IPs as well. This just needs to be managed better, and making it a single, central solution is probably best.

  242. Please crack down on websites purporting to be info sites/directories which are plainly just link farms, and contain computer-translated content in terrible English, usually of Asian origin. They don’t encourage anyone to conduct their SEO legitimately.

  243. Here’s an easy one: Ban Cpedia.com. There is nothing original on this site, and it’s disturbing that their automatically generated wiki-style rubbish is clogging up Google’s search results and/or causing duplicate content problems for the sites they steal content from. Cpedia.com is nothing more than a spam blog, and they should be treated as such. Ban them from Google.

  244. I’m finding that many of my competitors are creating hundreds of sites with the sole purpose of creating backlinks. The content on these sites is identical, but it’s boosting their rankings dramatically. It’s very frustrating because we’ve tried to build our rankings honestly, but then these companies come along creating these fake sites and suddenly they’re dominating the search results.

  245. Please, faster and deeper implementation of the List-Unsubscribe header!

    Also, don’t make users think that clicking This is Spam means the same thing as Unsubscribe. They are totally different and ought to be treated that way by GMail.

    Perhaps tracking clicks in emails received would be helpful as well to be able to tell whether people take action on things they are receiving.

    And please, just plain block and delete the pharmaceutical spam. That’s what my GMail spam box gets filled with. It seems that phishing and other spam besides the pharma spam are no real problem anymore... apparently those were worked on and solved pretty well. Why can’t the pharma spam be dealt with the same way? Are there actually suckers out there who click on that crap? If so, how about you take my emails and send them on to the people who want them.

    Also in GMail, if I make a filter for something to get put into a folder, I would think that would count for something. I still find those things in the spam folder many times. If you have some sort of rule that adds up points for things, please double the value of when someone creates a filter for it to be not considered spam. I’d rather have false positives for things I’m looking for than not receiving what I should get because it was considered spam by a machine, when I’ve specifically told it to put it somewhere else.

  246. Ah, by webspam, you mean bad results in searches… not email spam. Perhaps you should have defined what webspam was in your post…

  247. An issue we see clearly in our business is that people reuse/steal content (images as well as text) and manage to rank higher for it than the original publisher. What would really help prevent spam is if Google were able to have the original content provider rank above the websites that publish copies of that content. I am more than willing to give you some examples of this issue if that could be helpful to your team. Just drop me an email and I will provide you with some in-depth examples of copied content that ranks better than the original.

  248. How about getting the ultimate social feedback on search results by allowing some kind of “it’s a dud” button next to each result in the search results?
    I often search for things and see results coming up from news articles from years ago – Google hasn’t been able to figure out the date of the article (often because it doesn’t include one), so a quick click on an “I vote we get rid of this” button would allow a consensus to influence Google results. Sure, it is open to abuse...

    It is similar to the idea of allowing social feedback on people’s driving. Everybody gets an electronic gun with a limited number of ‘shots’. If an idiot cuts them up or drives dangerously, you shoot them with your electronic gun and it registers a hit. At the end of every quarter the hits are added up; those that go over a threshold are clearly doing something wrong, so they get banned. OK, that has some flaws to it, but you’ve got to admit it is appealing 🙂

  249. Google should work on weeding out one-page made-for-AdSense sites. I see them everywhere. They buy the domain name of a keyword, write a single page of content, link to it a lot and presto, first-page ranking.

  250. I would love to see an extension for Chrome or even Firefox to identify what is spam and what is not, with a report function, so we can submit a report right away and save time too. Thanks for another heads-up on this.

  251. Hey Matt,

    Here’s a few things which in my honest opinion can be changed:

    Problem: Spam comments on blogs to generate backlinks and improve rankings.
    Identifying factor and my solution: WordPress is one of the most widely used platforms. The comments section is easy to identify while crawling due to the structure: 1) link, comment; 2) link, comment. It is easy to tell whether the anchor text is a name or a keyword; if it is a keyword, the comment can be classified as high risk or spam (a rough sketch of this heuristic follows this comment). Akismet says 83% of all comments are spam as of now.

    Problem: Hijacking the Google Places reviews section. Competitors post their own contact details and email IDs and promote their own business under the reviews section of a competitor’s listing.
    Identifying factor and my solution: It is very easy to detect contact details etc. in a review. Those reviews can be blocked even before they are posted.
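
    A crude sketch of the name-versus-keyword heuristic described in the first problem above; the word list and thresholds are invented for illustration, and a real classifier would use far richer signals (link target, commenter history, language models).

    # Heuristic: comment-author anchor text that reads like a keyword phrase
    # ("cheap car insurance quotes") is riskier than one that reads like a
    # personal name ("Jane Smith"). Word list and thresholds are illustrative.
    COMMERCIAL_HINTS = {"cheap", "buy", "best", "free", "online", "insurance",
                        "loans", "pills", "casino", "seo", "hosting"}

    def anchor_risk(anchor):
        words = anchor.strip().split()
        lowered = [w.lower() for w in words]
        if any(w in COMMERCIAL_HINTS for w in lowered):
            return "high"
        # Names tend to be short and title-cased; keyword phrases tend to be long.
        if len(words) <= 2 and all(w[:1].isupper() for w in words):
            return "low"
        return "medium"

    if __name__ == "__main__":
        for a in ("Jane Smith", "cheap car insurance quotes", "Travel Photography Tips"):
            print(a, "->", anchor_risk(a))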

  252. Hi Matt,
    this is Markus from Germany. I’ve got two suggestions.
    1st: For me it would be very interesting to find out whether my page is penalized or banned, and some suggestions on how to fix it would also be helpful.
    2nd: When I check the pages of my SEO customers I often see 301s to deeper folders, something like domain.de/ => domain.de/folder, sometimes even a second one from domain.de/folder to domain.de/folder/.
    Mostly these pages aren’t ranked in Google, but it’s hard to tell whether that is because of the 301 or for another reason.
    A little tool which shows me exactly what the Google spider sees would be great –
    or can you give me a short answer to this?
    Thanks from Germany

  253. Hello Friends,

    1. Google should take action on Twitter SPAM. Microblogging is supposed to be used for updates on friends’ activity, and nowadays it has become mostly SPAM.

    2. Link-selling and link-buying websites should be penalized, particularly high-PageRank links being sold for $$$ from various sources.

    3. Duplicate content should be controlled. Google could build a tool into the Google Toolbar to indicate duplicate content. Say, for example, someone copied content and placed it on his/her website; this tool should point out the duplicate content to the visitor when they open the page, so that the end user can understand the quality of the website.

    For now, these are my suggestions. 😀

  254. Spam and content theft are big issues for my SEO firm as well, since we put out press releases. Often, especially in foreign countries, sites will take everything I write right up to the end and then put in their own link.

    I think Google should penalize those who take others’ content and pass it off as their own. Spam is unfortunately a big issue, causing a lot of companies to be pushed down in the rankings rather than growing with good content. Hopefully the new rating and ranking systems coming now and beyond will help everyone trying to do SEO with white-hat techniques rather than spam and black hat.

  255. Google is doing an excellent job and I love the new look added to Google. I just wanted to say that the Google captcha at times really annoys me by saying that I must enter the captcha to keep continuing my search. And this becomes worse when Google simply says that too many automated queries are being generated by your computer and Google won’t search for you. Please don’t punish the majority for some sick-minded people.

    Thanks

    alam

    Lahore

  256. Matt,

    Don’t you consider copied websites, intentionally created to defame and demote reputable websites, a kind of super spam? Such sites DO come up in search results, and our main business website is a live example of the victim.

    Look at http://www.SEO-Factory.co.uk… a site that some sick mind created just to defame our website http://www.SEO-Peace.com
    The guy didn’t even spare the logo. When we reported the site to their hosting company, the guy moved it onto his own dedicated hosting.

    When we tried to make contact and proved the testimonials were fake, they removed the testimonials page. The affiliate page linked in the footer returns an error. Damn..

    They got us BANNED in Google, and they’re running live and standing firm in Google.

    There is no reason that could get us banned: no aggressive link building or SEO that would violate the Google webmaster guidelines, no spammy cold emailing or calling, no duplicate content... then WHAT?

    It’s just a big mess when my main business website is banned and life is messed up because of this sick-ass seo-factory.co.uk guy. Just look at it and you’ll see who the culprit is; how could Google blacklist us?

    Isn’t it spamming?

    I’m sorry but I just tried to share this here so you guys know, it’s not just spamming the web but involves defaming many genuine services. Genuine people fall prey to such spammers and they suffer. Spammers and Copiers still enjoy Google.

  257. Within the industry that I work in, we’ve seen a huge shift towards companies buying exact match domains and being able to quickly rank well, often with little to no relevant content. We see sites littered with dashes, and non .com domains and it leaves us scratching our heads as to why Google will populate so many results with these mini-sites. As well, I think it shifts company focus towards domain acquisition and masking ownership when efforts could be better spent towards quality content.

  258. Blake Acheson

    It’s unbelievable that websites like kosmix.com can get half a million pages indexed in Google.

    http://www.kosmix.com/topic/randomPhrase

    I could write an application to detect this endless crawl loop in 10 minutes, and Google can’t deal with sites like this over 10 years in....
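
    A small sketch of the kind of detector this describes: group crawled URLs by a path template and flag templates whose page count explodes. The template rule (keep the first path segment, wildcard the rest) and the thresholds are simplifications invented for illustration.

    # Flag URL "templates" that account for an implausibly large share of a
    # site's crawled URLs, e.g. /topic/<anything> generating endless pages.
    from collections import Counter
    from urllib.parse import urlsplit

    def template(url):
        parts = [p for p in urlsplit(url).path.split("/") if p]
        if not parts:
            return "/"
        return "/" + "/".join([parts[0]] + ["*"] * (len(parts) - 1))

    def exploding_templates(urls, min_pages=10000, min_share=0.8):
        counts = Counter(template(u) for u in urls)
        total = sum(counts.values())
        return [(tpl, n) for tpl, n in counts.most_common()
                if n >= min_pages and n / total >= min_share]

    if __name__ == "__main__":
        crawled = ["http://www.kosmix.com/topic/phrase-%d" % i for i in range(50000)]
        crawled += ["http://www.kosmix.com/about", "http://www.kosmix.com/contact"]
        print(exploding_templates(crawled))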

  259. I think the most valuable thing Google could do is to try to detect scam sites. But how can an algorithm do that? You would need human reviewers to manually classify sites. User activity signals would not be useful either, as those scam sites do scam some people!
    Maybe you could also avoid ranking awful blogs on free blog platforms that benefit from the ranking of the overall platform without having valuable content.
    That was my two cents 😉 !

  260. I am also a mod for a HUGE site with a LARGE amount of content, and I think Google should help the original publishers, such as my organization.

  261. Call out the big-brand bad guys... big-brand link buyers, sellers, and other big-brand spammy guys... Show us that nobody is too big to fail.

  262. I expect Google to implement better link evaluation metrics (particularly for paid links, scraper sites, blog spam, etc.) and to come down rather hard on spammers.

  263. Hi Matt!

    Maybe Google should invent an “iLike” button to help clarify what is good content versus useless junk. Yes it can be spammed but at least you will have user profiles to see who is spamming.

    Just a thought.

  264. Prasanna,
    The main issue is to scrap all the torrent and downloads websites, especially downloads sites where searching for a term automatically creates a page; when a user later types similar terms, it appears in Google results that the software is free, but when I click the link I see that it is paid software. These things have to be removed. Enable an easy way for users to report webspam in Sitemaps, and also a review team to confirm that it is webspam.

  265. Mike Berezowski, you are very, very correct – actually I am also facing the same problems, and the same question comes from me:

    I’m the editor of a large travel website. Since 1995, we’ve written and published thousands of articles about tourism destinations worldwide. What I’d like to see Google tackle, from a spam perspective, is the amount of plagiarism that abounds on the Internet today. Time and time again, I see other websites pilfering our content (articles, paragraphs, business listings, photos, you name it) and claiming it as their own. Often, the offending sites even have the gall to put a ‘copyright’ notice in their footers!

    My question is, can Google come up with a system that better recognizes pilfered VS original content — and rank accordingly. When another site steals our content, I’m always concerned that it looks like we’re publishing ‘duplicate content’, thereby hindering our organic search engine exposure. Some initial thoughts: Can Google somehow date-stamp content? Or can Google assign a higher trust value to established sites, while maybe better scrutinizing newer sites — somehow validating/cross checking info? I know this is a huge undertaking, but I just want to throw the idea out there.

    Thanks,
    Jyotika

  266. Actually start penalising SERPs for paid links – just have a look at the .com.au "Car Insurance" SERPs: 99% paid links.

  267. It is not possible for Google alone to control webspam. Millions of users come across it daily but don't find any link to report it to Google…
    I have seen many forums where people ask how to report a site to Google.
    So my advice is: Google should give a link on its home page to report any webspam.

  268. I think you guys should throw a little more geo-locating into your general results. It would help tremendously. I often find results from China (.cn) and the Netherlands (.nl) on just about any topic that are gaming their rankings. Also, 7 times out of 10 when I used to visit results from those domains, my Norton would fire up with obvious malware threats.

    Gaming the system and malware aside, think about the searchers. Isn't there a better chance that if someone from Chicago writes an article about something I'm interested in, and I'm from the mid-west, that the information would be more relevant to me? To some degree you guys do that already. But it would be great if you were able to give those more geographically local web pages a little boost over China.

    I have seen literally thousands of results over the past 2 years, searching for just about any topic you can think of. I can honestly say that no result from a .nl or .cn domain has ever been of any value to me.

  269. I think the biggest problem with Google search results is heavily scraped sites dominating because they have a strong site. Sites that just create pages filled with nothing unique at all. Just some dynamic text and scraped information from other sites. Many of these sites have strong domains so they get away with it.

    Basically what it leads to is low-quality articles/information beating out high-quality information. A brain surgeon could have his own site and write great content about his field, but he will always be beaten out by scraped content on Livestrong, eHow, and other sites of that type that either have no unique content or incredibly bad content. I honestly can't search for anything medical related without coming across those sites or some poorly written "answers" sites. I want to get the results from the brain surgeon, not the 14-year-old kid earning points on some "answers" site.

    Finding a way to determine the higher quality content from the scraped jumbo sites would go a long way to helping improve results.

  270. I dislike searching for something I'm looking to buy and finding the first page full of review sites that have no actual reviews, just loads of affiliate links to sites that often don't even sell the product.

    Trying to find reviews, trying to find information on products and trying to find places to purchase said products is completely and utterly useless in Google these days. It’s gonna start to hit your traffic eventually as we don’t need loads of pointless review sites like nextag, shopzilla, kelkoo, ciao, bizrate, reevoo, etc., etc.

    There’s no value offered, they’re not helping, they’re just getting in the way and clogging up search pages. If they were like cnet then they’d be helping, but they’re not.

    Then there are those massive forum scraper aggregation sites. They've just scraped loads of other sites, filled their pages with others' content, got loads of backlinks and then got to the top of Google. There was one enormous one that had started to dominate anything PHP/MySQL related, but fortunately you seem to have found that one out.

  271. It's unfortunate that most businesses or manufacturers of their own products or inventions have to battle the general public for natural ranking. The makers of a particular product or invention are undoubtedly the authority on the subject, or the most relevant to the products. Let's say, for instance, I invented a new type of Q-tip called Z-tips. Google should allow me to prove that I own the trademark or patent on Z-tips and place me as the most relevant search result. In addition, with the popularity of valuable data sources like Wikipedia and social media, the top position should be broken up into 3 horizontal sections, i.e.:

    Manufacturer's Website (|) Information (wiki) (|) Social Media (blogs, Twitter)

    Along with submitting proof of ownership to Google, there can be additional language that obligates the manufacturer's or owner's website to adhere to certain criteria to keep that position. This should reduce spam and undoubtedly provide the most relevant top results. The top categories in the search results would of course change depending on what was being searched for (person) (place) (thing); the submission would define what the object was.

  272. Change the way Google handles spam reports. Don't just use them for polishing the algorithm, but put actual people on those reports, so they can check what is going on and take immediate action – delete the duplicate-content pages, the visibility:hidden stuff, etc.
    And please give Wikipedia less weight; it's getting out of proportion.

  273. Get rid of those ugly CAPTCHAs!
    Learn to better assess users' risks and stop harassing us common people…

  274. Do you guys really think Google will remove scraper sites surrounded by AdSense that clearly make Google money…?

  275. There are lots of suggestions already placed in the comments field. I go with most of them. Nothing specific to say more. 🙂

  276. Matt, I think the biggest problem Google has is with the maps. You need to verify map listings through state databases where the business is registered to return a real result. It is so spammy and really misleading to the end user.

  277. In the last few years Chinese factories have begun to sell occasionwear direct to women in the UK and US. In order to do this they don’t rely on goodwill, marketing or branding. Their sales technique simply relies on very low prices and misleading photos.

    For these companies, treating their websites as disposable is not only possible; it's necessary. And it's the disposable nature of their sites which allows these companies to engage in long-term and successful black-hat strategies.

    In the last year the UK and US SERPs for phrases relating to women’s occasionwear have been dogged by Chinese-operated black hat websites. Unlike their competitors, these companies can use spam to propel a disposable site to the top of the SERPs without any fear of consequences. After a profitable 6+ weeks at the top of the SERPs their site is demoted. Their webmaster will then just take another site (out of an armory of template sites built months previously) and black hat it up the SERPs. Ad infinitum.

    Please bring it to an end. These sites are not worthy of the kind of consumer trust that a top spot engenders. I believe that this is a real problem for Google, worthy of investigation.

    Good luck with the climb Matt,

    Vic

  278. Blog comments 😉

  279. Can you do something to get rid of all of the spam Google is putting on top of the search results?

    On many screens, the top search result is now BELOW THE FOLD.

  280. Hello Matt,
    Why doesn't Google take two steps before kicking you off?
    Why doesn't it give you a warning by e-mail or by a message in your Webmaster Tools account to tell you what is wrong (because it surely knows what is wrong), and only then kick you off?

  281. I hate the automated related searches that some websites make.

    Like: softonic ( http://www.google.com/#hl=en&safe=off&&sa=X&ei=DrRYTOTKF4bEONuw0K8J&ved=0CBUQvwUoAQ&q=softonic&spell=1&fp=cd3e5ac2a81f3124 )

    And I especially don't understand why Google always puts that old SEO hack – keyword in subdomain – on top.

  282. The only one thing I do not like in Google search is:

    Suppose we are searching for something like cricket news about the latest series going on between two countries; all the top results we get are old posts. Some of them are 1-2 years old, but due to their ranking they still show at the top. I think for this type of search you should show only the latest posts and not weigh the ranking factor so heavily.

    Hope my suggestion makes some sense.

    Thanks for reading my suggestion.

  283. One more thing. Sorry for so many comments, Matt, but I keep coming across improvements, so I hope you don't mind. I would like to see the webspam team taking 301-redirect abuse more seriously.

    I see many websites popping up for very competitive keywords even though those sites lack good content or good links; they are able to do that simply because they redirect a bunch of sites to one site.

    That makes us white hatters that follow Google’s guidelines look bad to our clients because it is really hard to explain such things to them.

  284. Image spamming. Why is the algorithm for text search good at identifying who owns the content and ranking the correct website, while image spammers can take over the original owner's website image? Image ownership is broken…

    Fix the removal requests tool. Example: Google denies removal of this URL
    http://www.airseychelles.com/media/products/editorial/1/paris.jpg
    yet it 404s.

    DMCA complaints will only get the image removed but don't fix the problem of what Google sees in results. Image spamming is out of control. Look at picsdigger.com
    http://www.google.com/support/forum/p/AdSense/thread?tid=414d25ee256d5f73&hl=en

    The overall issue is that if AdSense didn't make it so easy for anyone to make money, there would be less incentive.

    Yeah, it's Google's bread and butter, and the underlying theme is that it just makes us all work harder while they use automation to control this. Automation will not fix the issue – people will – but I doubt people will be used to fix it.

    Blogspot and WordPress MFA sites. Too many to count..

  285. Fix the link results in GWT. My site went from 216 backlinks for the domain to over 1 million overnight. It has been like this for over two months now.

  286. Hi Matt,
    I hope you are doing well in Kilimanjaro!
    This is my second comment…
    I can understand that detecting spam in languages other than English is difficult. What I cannot understand is that Google seems not to pay much attention to our spam reports. I have spent a lot of time and effort reporting spammers, but I have not seen significant results (I have noticed tiny changes to the reported websites' positions, but I am not sure whether this was caused by the reporting or by other reasons). Furthermore, I never got a simple answer like "We investigated this website and penalized it" or "we didn't"… so I don't know if there is any point in continuing to send spam reports.
    So I have two things to propose:
    1. If you haven't done this already… why don't you find a way to distinguish good spam reporters, so you can pay more attention to someone who consistently reports real cheaters rather than somebody who reports websites without significant reason and wastes your time? After all, spam reporters help Google for free by making SERPs better. Give them a reason to keep doing this. You already have partners for other services in these countries (like certified Google AdWords partners such as me) that you may trust a little more than an anonymous spam reporter. Why don't you give us a more direct way to report spam results (let's say, to our account manager)? After all, we spend almost all our time searching in Google. It's easy for us, who know the language and the market well, to detect a misbehaving web page.
    2. Even better… why not hire people or find partners for your webspam team in every country? I can do this job in Greece if you want 🙂

  287. Of course my biggest frustration is taking the time to fill out a spam report and feeling like it just goes to the null file, since the spam sites keep dominating the SERPs.

    2010 Project suggestions:
    1. I still see random keyword stuffing at the bottom of pages in small font. Fix this algorithm already.
    2. Cookie-cutter doorway pages seem very effective. Register 20 URLs, all with your keyword in them somewhere. Scatter them on different IPs. Write 20 Copyscape-passing landing pages and point all the links to your main site. Works like a charm, Matt!

  288. Some want to report “junk results returned”. Try searching “employment law consultant birmingham” from google.co.uk and you are inundated with job sites, not many “employment law consultant birmingham” type people that can offer you services.

  289. +1 Will.Spencer

    All of the Google spam in the SERPs is frustrating. If I want news, shopping or other verticals, I can click on the tabs.

  290. My concern is regarding rankings in the Google SERPs and web spammers (spamdexers) who deliberately scrape newly published posts, taking the headline titles and some snippets from the original source's page. What happens now is that they rank higher in the SERPs, maybe because their PR is higher than mine. It happened to me, and most probably to everybody else too. I hope this specific problem will be fixed.

  291. Personally, I’d like to see better protection against click bombing.

    I’m new to web developing and AdSense but in my research to learn the ins and outs of AdSense, I came across five times more horror stories than success stories. Of course, I’m sure many are fabricated and I still became an advertiser because the ease of use is remarkable, but the fact still remains. I realize the #1 priority should be those that pay to advertise but, if that’s the case, the advertisers should be #1a. Simply put, without advertisers, there are no ads. Also, I realize you can “disable” your account so the ads show but the clicks don’t accumulate, but the advertiser (and, frankly, Google) didn’t set up AdSense to provide free advertising.

    Two reasons why I think click bombing is such a pressing issue:
    1. It can happen ANY time and, chances are, you are banned before you can even act.
    2. So many solutions exist that simply haven’t been implemented. For example, if a suspected case of click bombing seems to be occurring, permanently disable all following clicks from that IP address and refund all clicks previous to the suspected bombing. If the IP address is too personal and/or can’t be found, use the said strategy on the region or service provider. It’s not difficult and it saves everyone in the long run.

    Just a suggestion. Hope you take it into consideration and keep up the great work!
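
    To make suggestion 2 concrete, here is a minimal sketch in Python of the kind of throttle being described. The five-minute window, the five-click threshold and the per-(IP, publisher) keying are illustrative assumptions on my part, not AdSense's actual fraud logic:

      import time
      from collections import defaultdict, deque

      WINDOW_SECONDS = 300      # look-back window per (ip, publisher) pair
      MAX_CLICKS_IN_WINDOW = 5  # anything above this is treated as a bombing burst

      recent = defaultdict(deque)   # (ip, publisher) -> timestamps of recent clicks
      blocked = set()               # (ip, publisher) pairs that are no longer billable

      def record_click(ip, publisher, now=None):
          # Return 'billable', 'ignored', or 'refund_burst' for a single click.
          now = now if now is not None else time.time()
          key = (ip, publisher)
          if key in blocked:
              return "ignored"
          q = recent[key]
          q.append(now)
          while q and now - q[0] > WINDOW_SECONDS:
              q.popleft()
          if len(q) > MAX_CLICKS_IN_WINDOW:
              blocked.add(key)
              return "refund_burst"   # caller would refund the clicks still in q
          return "billable"

      if __name__ == "__main__":
          t = 1_000_000.0
          for i in range(8):
              print(record_click("203.0.113.7", "pub-123", now=t + i))

    A real system would look at far more than raw click rate, but even something this crude captures the "stop billing, then refund the burst" behaviour the comment asks for.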

  292. I like the idea presented above by Ian Mansfield. Akismet would be an amazing source of webspam information. If there were a way for Google to be alerted to sites that were flagged more than, say, 20 times in a month, that would really cut way down on blog comment spamming. The other alternative would, of course, be for Google to build a comment spam detection plugin for the major blog platforms. In that case, Google would get the spam notifications directly. Of course, Akismet is already there and seeing so much spam every day – it seems like an obvious partner for the webspam team.

  293. I would work on "trust rank" and downgrade all websites using SEO. We don't want SEO, because the very existence of SEO proves that search engines have failed somewhere. If Google were a perfect search engine, we webmasters could ignore SEO and focus on nothing but content. It's laughable that Google says it wants to fight spam when, on the other hand, SEO is so widely accepted. SEO is the main door to spam.

    Let's say it clearly: SEO is corruption. Webmasters using SEO are bad guys; they want to be at the top of the list without having the best content. Google chose to ignore keyword meta tags a few years ago, and that was a good start. Let's follow that trend. There's no white-hat and black-hat SEO – it's all black, and all webmasters using SEO should be penalized.

    I would particularly penalize websites with thousands of look-alike pages, where less than one percent of the content differs from one page to another.

  294. Hi Matt,

    Thanks for the opportunity.

    I came across two oddities when searching for “los angeles seo”. I see two results at the top, since MayDay, that are unexplainable.

    The first (#1) is a site with such obvious spamming and kw stuffing they even have descriptions and content that is not even “English” and they have their own internal link-farm of sites going that also have the same spammy incorrect language. These are linked to at the bottom of the pages.

    The #2 result is a 1-page result that refers to the site that is listed #4 below it, announcing that they’ve changed their site name. So effectively these folks have gotten 2 of the top 10 (#2 and #4) for essentially the same site and instead of using a 301 to refer or having Google get rid of the old referring URL and promote the new (referred-to) URL – Google is ranking both.

    It seems that after every update there is a settling-period and then the spammy sites that tend to rise up soon go away but I’m wondering, since it’s been so long since the update happened, if these are aspects that are holes in the update. If they are, they could really be taken advantage of by black-hatters.

    Not to mention that people contacting spammy sites to be their SEO providers will likely only get spammy SEO which further adds to the problem.

    Thanks and have a great trip to Mt. Kilimanjaro!!

    ~JC

  295. Matt, I’d love the ability to block a site.

    As a techie I’m constantly searching for fairly specific phrases and certain paid “expert exchange” sites come up as the first listings. Often they’re poorly phrased questions that are there for SEO pull but have no answers.

    The #1 thing that I'd love to see from Google is the ability to simply delete them from my results, so that I never have to see them again.

  296. I work in SEO and refuse to use spam tactics, firstly because I don't want to ruin the web that I grew up benefiting from, and secondly because I believe in Google's ability to devalue irrelevancy and automated junk, so I feel passionate about webspam. When my "peers" insist that meaningless linking will accomplish what I do with good design, analysis, and hard work, I would love to be able to point them to Google in some way confirming, finally, that their techniques are algorithmically seen for the polluting they seem to me to be.

    Here are some quick and poorly thought-out ideas – maybe not technically feasible, or maybe already in place, but hopefully at least a little help:
    1. When a site is determined, via manual review or algorithmically suspicious data, to be spammy, grab its link data and look for other sites with the same link profile. Many irrelevant backlinkers use the same list of sites for their "strategies."
    2. Devalue links in blog comments to nothing (if they're not already), except for special exceptions.
    3. Have some algorithmic way of estimating the "interestingness" of linked content, to see whether a sudden appearance of hundreds of deep links to a site could ever have arisen naturally.
    4. Give a link's passed PageRank varied longevity based on the quality and nature of its originating page.
    5. Treat squeeze-page e-book vendors like casinos and pharmaceuticals.
    6. Go to black-hat SEO forums, download the scraping programs they use, and then determine what kind of footprint those links leave that could be mitigated.
    Then, when some of this is done, please publish that fact, let the spammers verify its ineffectiveness, and in so doing remove the motive for spam 🙂
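
    To illustrate idea 1 above: here is a minimal sketch in Python of flagging sites whose backlink sources overlap heavily with a site already judged spammy. The .example domains, the Jaccard measure and the 0.6 cutoff are illustrative assumptions, not a known Google signal:

      def jaccard(a, b):
          # Overlap between two sets of linking domains, from 0.0 to 1.0.
          a, b = set(a), set(b)
          return len(a & b) / len(a | b) if a | b else 0.0

      def sites_sharing_profile(spam_backlinks, candidates, cutoff=0.6):
          # candidates maps site -> iterable of linking domains; return the suspicious ones.
          return {site: round(jaccard(spam_backlinks, links), 2)
                  for site, links in candidates.items()
                  if jaccard(spam_backlinks, links) >= cutoff}

      if __name__ == "__main__":
          known_spam = {"blogfarm1.example", "blogfarm2.example", "dir.example", "forum.example"}
          candidates = {
              "suspect-site.example": {"blogfarm1.example", "blogfarm2.example", "dir.example"},
              "ordinary-site.example": {"news.example", "friend.example"},
          }
          print(sites_sharing_profile(known_spam, candidates))  # {'suspect-site.example': 0.75}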

    Sorry I can’t put more time into thinking about this right now, but thanks for working on it!

  297. Keep killing that webspam. A few more killer ideas:

    Ding more duplicate content based upon duplicate images. Redundant articles tend to have duplicate photos.

    Ding sites that devote more space to advertising. I hate landing on websites with a postage stamp sized piece of content surrounded by AdJunk like a minefield. Sites with too much advertising space generally stink.

    Ding domains that people click on less (not pages, domains)… I certainly have learned in many of my favorite searches which websites are crap and I don’t click on them.

    Reward SERP domains/pages that engage visitors and have low bounce and high penetration rates. I bet most users back-out of junky/spammy websites more quickly when they do click on them.
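
    On the duplicate-images idea: one well-known technique is a simple average hash (aHash). The sketch below assumes the Pillow imaging library is installed, and the 8x8 hash size, the 5-bit Hamming cutoff and the file names are all illustrative assumptions rather than anything Google has said it uses:

      from PIL import Image

      def average_hash(path, size=8):
          # 64-bit perceptual hash: one bit per pixel, set when the pixel is
          # brighter than the image's mean brightness.
          img = Image.open(path).convert("L").resize((size, size))
          pixels = list(img.getdata())
          mean = sum(pixels) / len(pixels)
          bits = 0
          for p in pixels:
              bits = (bits << 1) | (1 if p > mean else 0)
          return bits

      def near_duplicate(hash_a, hash_b, max_bits_apart=5):
          # Treat two images as near-duplicates if their hashes differ in few bits.
          return bin(hash_a ^ hash_b).count("1") <= max_bits_apart

      if __name__ == "__main__":
          h1 = average_hash("original-article-photo.jpg")   # hypothetical file names
          h2 = average_hash("rewritten-article-photo.jpg")
          print(near_duplicate(h1, h2))

    Hashes like this survive resizing and recompression, which is exactly how scraped photos tend to get reused.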

  298. I couldn't possibly read all the comments, so I don't know if this has been mentioned. But we see people copying our site's text word for word and putting it on their sites. They even go further and add text links to other sites using our text. These sites need to be severely penalized for SEO malpractice.

  299. Hi Matt,

    Something to help adjust the spam filters in Gmail – almost all of my inbound mail is suddenly being flagged as spam.

    Whatever you did a few weeks ago, please undo it. I pull my Gmail down to my iPhone and don't get the spam folder – as a result my Gmail is unusable.

  300. Apparently, blatantly hiding links in scripts and widgets is a surefire way to rank well.

    http://www.mybana.com hides links with their video widget, probably without the customers even knowing.

    Would http://www.aiafla.org/ really be selling those links hidden in the code?

    I think not.

  301. This might sound silly, but I’ve noticed a lot of spammy sites have weird syntax issues. Here’s an example:

    “The internet business for sale will be old and will also be well known. Internet businesses for sale will be popular among the people and will have a lot of repeat customers. However it is necessary that you should have experience about running a business. This is because even a little mistake can destroy even a successful mistake.”

    You and I can read this and know that it’s nonsense. To a computer reading this, it might look OK because each sentence on its own is “good enough”, grammatically speaking.

    This isn’t a direct “please do this” solution, but maybe someday Google will be smart enough to grade English papers, and can then apply that to fight spammy sites. I have to believe that someone there uses their 10 hours a week to get the bot to comprehend instead of just read. Thanks for reading!
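
    A very crude sketch of that "grade the passage, not the sentence" idea, in Python: score how much vocabulary adjacent sentences share. The stopword list, the regex sentence splitting and any interpretation of the score are illustrative assumptions; real spam classification would need far more than this:

      import re

      STOPWORDS = {"the", "a", "an", "and", "is", "are", "will", "be", "it", "that",
                   "this", "you", "to", "of", "for", "have", "should", "can", "even"}

      def content_words(sentence):
          return {w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS}

      def adjacent_overlap(text):
          # Average Jaccard overlap of content words between neighbouring sentences.
          sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
          if len(sentences) < 2:
              return None
          scores = []
          for prev, cur in zip(sentences, sentences[1:]):
              a, b = content_words(prev), content_words(cur)
              scores.append(len(a & b) / len(a | b) if a | b else 0.0)
          return sum(scores) / len(scores)

      if __name__ == "__main__":
          sample = ("The internet business for sale will be old and will also be well known. "
                    "Internet businesses for sale will be popular among the people and will "
                    "have a lot of repeat customers.")
          print(round(adjacent_overlap(sample), 2))

    Spun text tends to score oddly at both extremes – sentences that share almost nothing, or that repeat the same noun phrase verbatim – so a number like this could only ever be a weak hint.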

  302. It’s straightforward to down-rate most junk sites. Analyze the site, see if it has commercial intent (which includes having ads), figure out who owns it, check out the business behind the web site, and down-rate the ones that don’t seem legitimate. The business directory information needed to do this is out there.

    Matching web sites to real-world businesses is quite feasible. The search engine for Google Maps is already doing it. The Maps engine uses more non-web data sources, which gives it some protection against “search engine optimization”. Try general searches with the Maps engine, and the results are often better than with the main engine. More of that needs to go into the web search engine.

  303. The first thing that comes to my mind is Google Alerts. I watch a couple of keywords related to my site and the things I am getting via Google Alerts are 95% spam. Maybe more.

    I know that it is hard to detect spam and deliver the news content as soon as possible, but something should be done on this field.

  304. Hi Matt,

    I would love for Google to get a handle on spam emails. Every single day I get them in my inbox and flag them, and then… what? What does Google do with them? Are the senders tracked down and penalized in some way?

    I would like to learn more about how Google deals with it and see the procedure being more transparent.

    Thanks!

  305. Keep killing that web spam. I love this project.

    Overall, I’m very content with the progress Google has made to keep Gmail as clean and spam safe as possible for most users.

    A few more killer ideas:
    Work on cleaning up Twitter crap from being indexed. If there is a way that Google can only pick up Educational information vs “I’m sitting at the airport waiting for my flight” that would be ideal.

    I would also like to see some work done on detecting duplicated content across different websites and bringing an end to that, so we can kill those article spinners. 🙂

  306. Matt,

    I used my real name – you can Google me :) Daniel T from your local team (he formerly worked at Endeca) suggested I post here. I also know the guys from Rusty Brick well.

    Feel free to contact me directly if you need any follow up info.

    There are companies that act under the radar by buying a couple of great links for established domains. Those are hard to detect, especially if purchased from domains that don't sell links to others. There is also less harm done by this, as it is on such a small scale that it just gives a little bit of a boost to a good site. It's like police patrolling my area when school lets out: it keeps things under control but won't stop everything.

    No one expects you to catch everything.

    What I have been shocked to discover over the last few days is what SEOmoz confirmed.
    http://www.seomoz.org/blog

    The spam team has stopped taking action upon receiving a verified webmaster report of paid links. Large scale link buying is going undetected. Sites are buying hundreds of anchor text links and SEO companies are building networks of hundreds of bloggers.

    How many people would use gmail or hotmail if penis enlargement offers were allowed to get through?

    Why is the spam team not detecting this blog that has anchor text for “penis enlargement” and a blog roll link for a competitor of mine that used “air purifier” anchor text?

    http://ehealthcounselor.com/

    My competitor purchased links from 500 domains and used 3 variations of anchor text. It is no longer necessary to even bother hiding. It is hard to understand how the spam team does not find 174 domains using the anchor text "allergy mattress protectors". Each of those domains has lots of spammy anchor text links. The only thing the network of blogs did to hide it was also sell links to buy.com (which is listed as a client of Triangle Media – the SEO company acting as agent for several hundred blogs).

    I have filed numerous reports over the past few months. Had anyone looked at the reports, the spam would easily have been detected. As SEOmoz and the comments on that post pointed out, spam reports are no longer being reviewed.

    While no one expects a response back, it is shocking that no matter how obvious the spam, the search results are unaffected.

    Two SEO companies that I had long conversations with each have hundreds of bloggers they work with. They get 100-200 anchor text links per month per client and concentrate on very few anchor texts. There is little variation.

    It is kind of hard to hide a network of hundreds of bloggers giving out lots of anchor text links. Especially as there is another side to the coin.

    For all the money these guys are making by manipulating the results, others are losing out. SEO companies need to show results and the white hat SEO firms and marketers trying to keep it clean feel frustrated as the spam reports are no longer being looked at seriously.

    If you were working at a SEO firm, what would you do? If you were at Endeca and your marketing team and sales were complaining that they need to compete on equal grounds what would you say?

    My son is getting Bar Mitzvahed next year and my wife is due to give birth. I have been blogging like crazy, tweeting and otherwise promoting my content all over.

    Hopefully, the manual end of spam removal is simply slowing down to test better software. As SEOmoz reported, though, it has been going on for a while. The network that I reported is selling lots of health-related anchor text. Allowing anchor text for diet pills, allergy relief, etc. can do a lot of damage.

    Even allowing anchor text for insurance and other financial phrases can cause a lot of grief to people.

    Using negative results (a "reserved" result shown above the rest of the SERPs) for Viagra and any other drug brand or generic name, as well as any health condition, is helpful but not enough.

    From my end, it is hard to comprehend. I have good analytical capabilities and can easily filter lots of data to find patterns that should be red flagged. I used to handle payroll and time keeping for large corporations of thousands of employees and was able to create filters to red flag data. Unions had lots of different pay rules that came into play and mistakes were easily made.

    Again, the main issue is that Google needs people with good analytical skills and has to use the reports that people send.

  307. You know I wish google would quit rewarding websites that are stealing with positive ranking.

    For example, http://www.mangafox.com, http://www.onemanga.com and about all of the top positions for the keyword "manga" supply illegal, plagiarized copies of manga that have overall brought the manga industry to a halt.

    These giant websites have turned stealing books into huge profits, and do the writers and artists get a dime of it? No. Because they have stolen so many titles, the manga industry is dwindling; comic book companies are having problems selling them, and yet every month these sites get the number one spot for the keyword – the best keyword – off of illegal merchandise. This is just one instance of where the bad guys are winning in the search engines.

    If you don't believe me, ask VIZ, Tokyopop, or any of the other manga companies who have sent countless cease-and-desist letters; it never works, because the hosts are in China or Taiwan, where you can't pursue copyright claims against them.

    Google is an American company that has spread out globally, and it should help enforce American laws, even if it's just a little thing like helping companies make sure their software, books and merchandise aren't being spread illegally through Google search terms. You can see more of it for TV programs, stolen downloadable books and many other products. Basically, if it's media, Google is helping people steal it. Another example would be "watch true blood" – the whole first page is nothing but bootleggers. If it's a form of media, you can find it stolen through Google, and I'm sure that's not what you all really want your search engine to be used for. It would be nice if Google had a complaint feature for licensed copyright holders, so the top slots don't go to people dealing in illegal merchandise, services and media.

  308. I could eliminate webspam at Google with a good team of programmers and some changes to the way iGoogle works. But I’m not going to tell you how to do it. Why should I? You guys run the world, and meanwhile I have to fight off competitors who are buying their way into the #1 spot every season (here’s a hint… look for people that “move up” just before a timed/seasonal spike in traffic…. then look at the new links they acquired). How would it really help me to teach you how to build a great clean search ranking improvement algorithm for free? No, I’d rather keep slogging away the way I’ve been doing. It’s not perfect, but it’s OK for now.

  309. J. Tristan brings up a point which may seem off topic, but penalizing sites for sending email spam is probably a very good way to prevent search spam. It's really the same kind of person… and the URLs they put in email are clean indicators of who they are.

    Of course this would open up the "can of worms" where someone could send spam to hurt a site… but it's not too hard to tell "legit" spam from fake. I did a lot of work with DNS-based blacklists (which are, by the way, rather well-maintained sources of spam-origination networks).

  310. Actually I think Google is failing on so many levels right now. A major, internal algorithmic change has to happen soon, or Google will get replaced by someone who does it right. You'd have to have a pretty short memory not to realize that internet search patterns can change very rapidly in a very short time.

  311. Hi Matt,

    I think you might have missed my comment on another post about how spammers get their sites ranked in top positions on Google's search result pages by stuffing keywords into each entry of their blogs and linking each entry to itself using those keywords, like some of them are doing here:

    http://www.musicgigg.com/mv-%E0%B9%80%E0%B8%9E%E0%B8%A5%E0%B8%87-%E0%B8%AD%E0%B8%A2%E0%B9%88%E0%B8%B2%E0%B9%84%E0%B8%94%E0%B9%89%E0%B9%81%E0%B8%84%E0%B8%A3%E0%B9%8C-%E0%B8%8B%E0%B8%B5%E0%B8%84%E0%B8%A7%E0%B8%B4%E0%B8%99/

    Amazingly, almost every entry of this blog has been ranked in the top 3 of Google, though there is nothing inside the entry but keywords and links. Could you take this kind of spamming into account so that Google can provide a better user experience? Thank you very much.

    Best Regards,
    Joe

  312. I suppose that every site can use as many terms as it wishes, even terms outside its field. I also suppose that Google's search mechanism depends on these terms.

    If the above is true, then I hope there is some way to limit the keywords (search terms) for sites/blogs/forums, or to limit which terms the search technique takes into consideration.

  313. Even some one-minute YouTube videos refer you to a website to "see the whole video"!
    I hope to find an easier way to flag them as spam.

  314. Also on YouTube, some uploaders put almost every word in the dictionary, all famous people and all global events in their descriptions!
    If I had a choice, I'd limit the description to 1,000 characters. I believe that if someone needs to describe more, they can reference a URL – and I believe they won't need to 🙂

    Sorry for these continuous replies. I think I was brainstorming 😀

  315. Don't re-include banned websites 2-3 times. I reported a guy who for many years has used massive forum spamming, repetitive keywords and fake social bookmark submissions to increase his position. Google has already thrown him out twice… but only for 1 month each time!

    Can't Google just make penalties similar to a car stereo code? If you try it once, 2 weeks out. If you try it twice, 3 months out. If you try it 3 times, a minimum of 6 months out, or even forever.

    Also, the emphasis on YouTube in SERPs is bad; it leads to users uploading ~300 useless videos, just with captions and directions to go to a specific website, and a clickable link in the description.

    As YouTube/Google have sophisticated and hard-to-circumvent detection algorithms for copyright violations, why not use these to detect the "originality" of videos, or whether a video largely consists of still images, etc.?

    Google should only show YouTube videos that, for example, are Featured or in other ways screened/peer reviewed. Here Google could also add a "reputation" to each user, based on account age, abuses, social activity patterns…

    I often see useless videos with spammy annotations superimposed and description full of keywords.

  316. Hi Matt,

    Big fan, by the way. (Matt Cutts is my SEO Hero)

    I came to your website today specifically to make known a trend of spammy behavior that I’d noticed, but wasn’t sure if Google had discovered it yet or not. Just coincidence that you happened to have this post soliciting spam-killing ideas 🙂

    Here's what I've observed. It's basically abuse of PR, a "bait and switch" scheme in which the spammer cons unsuspecting website owners into handing over one-way backlinks for nothing in return.

    They may do this by:
    1. Some spammer (an individual or an agency) wants to promote their site, or gets a client who pays them to "get backlinks".
    2. The spammer then searches for high-PR, deleting/recently expired domains that are relevant to the site they are looking to promote.
    3. They buy the deleting/expired domain and remove all the old content.
    4. A bare-bones WordPress blog is installed. They add some content to make it look like the blog is active.
    5. Meanwhile, Google still has the site at its previous high PR. Its Alexa rank may still be close to what it was for the old site, given how little time has passed since the domain was deleted, or the Alexa rank may be artificially inflated somehow.
    6. Robots then scour the web for businesses with "Partners" pages etc.
    7. Emails get sent to those businesses offering "high-PR backlink(s)".
    8. Small business owners are happy to do a link exchange to get the high-PR link, and all they have to do is link back to a third website – not the one they are getting a link from. This third site is an actual website with real content.
    9. It seems too good to be true, which it is – the pages the spammers are offering routinely have over 100 links on them already.
    10. Once a certain amount of time has passed, the spammy page is often deleted, leaving only a "one-way inbound link" from the unsuspecting business who agreed to the link exchange.
    11. The business owner may never check back to see if the spammy site is still there, so they go on wasting their site's PageRank authority for a deal that gives them nothing in return.

    (don’t know if others have mentioned this, couldn’t read all the posts above)

    Now I think there’s nothing wrong with honest reciprocal backlinks. But the links above were spam to the core.

    These three-way links look incredible at first – people offering you a PR3, 4, or even 5 backlink – and then the spammers want you to link to a separate page than the one you got a link from, "since this is better for Google than doing reciprocal links."

    The scheme is a little complicated and pretty crafty, and may rely on the fact that Google only updates PR every 3-6 months (or so I have heard).

    Example:
    One site that does this is wildontarioflowers.com.

    You go to that page and it’s “Just Another WordPress Blog”… i.e. no real work has been done to set it up properly. Some keyword-relevant articles are on the page. It’s running AdSense, and has a whole crop of backlinks to various sites, usually over 100 links on the page. Other sites by the same individual/group have followed the same pattern.

    If you search on domaintools.com, then you actually get to see an image of what the site looked like before – it was a real site about flowers.

    Thumbnails of how the page has historically looked over time are here.

    Screenshot of how the “new” page looks right now (in case it is deleted between now and the time you read this) is here.

    A website I work with has fallen for this trick, and I don’t intend to let another one slip past me if I can help it. If any agencies are offering this tactic as a “linkbuilding strategy” to their clients, I think it should be openly discouraged by Google since the practice is, as we say in Australia, “dodgy”.

    I can provide a little more information about the trend that I have observed on this if you want to look into it more.

    Thanks for reading, keep on being awesome.

    Best,
    Brandon M.
    Sydney

  317. It's frustrating that links from about 200 sites like bestbusinessespro.com, lightmarketingbusiness.com etc. – which have blanked-out PR and nothing in the way of an Alexa rating or value to any web user – can push a site up to the top in Google. I have now seen people in 2 different industries buy links from whoever is selling these and make it to number 1. Is this not the most basic spam online? Google doesn't seem able to deal with this, so I don't hold my breath for the slightly more sophisticated spammers. From a marketer's perspective this just encourages everyone else to follow suit and resort to the same tactics (demand encourages further supply) – unless, like me, you think this is short-sighted… or thought it was short-sighted 2 years ago but today still see the same sites profiting from this link buying and sitting in the no. 1 spot.

    It doesn't take a brainiac from Google to go to those sites, look at all the sites who have bought links from them, and take action, does it?

  318. I am in no way associated with Mahalo, but after searching for a few things it was the one site that answered my question. There is nothing wrong with content aggregation sites; they actually play a very important and useful part in helping users answer questions. Yes, I too hate the half-baked, poorly designed and rushed type of aggregated content that either does not answer your question or fails to send you somewhere that does – I find that with many software download sites. In the continual evolution of the web, content aggregation sites are an important animal that need not be exterminated.

    As we search, we can easily find ourselves caught up in one particular idea or phrase. Good content will help expand your understanding of the subject, and if it does not answer your question it will inspire new ways of looking at the subject and new queries to take back to Google. Content aggregation sites do this very well and send the user back out to surf the web.

  319. 🙂 Be real about paid backlinks.

    First of all, this includes the top online directories (the real ones with real offices, phone catalogues, etc.)

    And you have to be there, otherwise big G will think that you are a fake company… 🙁

  320. @Jacob, I totally agree with you. Things have gone downhill with spam, and paid links have gone crazy. I doubt the webspam team is even reading the reports.
    I know one site that went to the top of the search results only weeks after it started buying links; all of its quality links are paid – not a single one is organic.

    Google either needs to crack down on these, or let everyone pay for links. The situation now is just terrible: the cheaters are rewarded and the ones who play by the book are punished.

  321. Your spam filtering is very harsh and unforgiving once a website is flagged as spammy (either by a Google employee or by a competitor), although there are many other factors that will tell you whether the website is really spammy or not, for example:

    – RSS subscribers (and you can easily know if someone is faking RSS subscribers)
    – Twitter followers
    – Twitter lists
    – Tweets mentioning the website
    – Inbound links from decent websites
    - Having a human check the website, not just running a script that will tell you if the website is now OK or not (which will always tell you it's not OK, as most webmasters and sites don't know the exact reason[s] why their websites are blocked)
    – Comments on the website
    – Visits to the website (you have many ways to check for that, even if the person is not using Google analytics)

    My website has been penalized for a year and a half now, and I've been told that there is no way to get it back, although the website is prominent in its field.

    I wish you would take the above factors into consideration when someone appeals for reconsideration of a site. You don't seem to.

  322. It would be nice to tweak your sandbox and domain-age restrictions. While domain age is certainly important, the new .CO has had an extremely successful launch, and we will be waiting forever for the .CO domains to show some strength in the SEO market. More than one prospect has come to our site http://www.1st.Co and asked for SEO help… I tell them to expect to wait quite a while for any presence, and that costs me business, to be sure. If a new domain has 99% of the factors that you require, it would be nice if you placed less emphasis on the age of the site.
    Far too many top-ranked sites joined the lucky sperm club early, and playing catch-up is very difficult. I continually see these aged sites, circa 1999, that have zero new content, links or relevance, yet they remain top 3… thanks for the opportunity to vent.

  323. Use facebook to access user pages.

    Rank up sites that my friends and I link to, and rank up sites that those sites link to. If that's too computationally expensive to centralize, make it a plugin.

    Only personalization will fix search… and make webspam a failure.

  324. … I would really like to see Google Docs spreadsheets become "macro" enabled. I'm sure it's coming when DocVerse is fully integrated… whenever that will be?

  325. Hi Matt, I just wanna give you an update on the spammer (musicgigg.com) I mentioned above. It's now about 1 week since I reported the matter to the Google team with lots of evidence, but it seems no action was taken against the site; on the contrary, most of their entries have been ranked in the top 3, mostly No. 1, on Google's search result pages. This also applies to other cases of spamming that I have reported to Google. Therefore I think the problem is not that there are a huge number of spammers, but whether search engines – especially Google – can cope with all of this properly. Maybe my reports were not delivered to your search team, and hence the spammers still enjoy what they are doing? In my country, Google's search result pages are full of useless information.

  326. Hi Matt,

    A few suggestions

    1. Webmaster Tools

    Once authenticated, why not do away with the need to have 301s done by the webmasters and just have users set the preferred domain and get Google to pass the link juice to the “preferred domain”.

    2. Scraper Sites

    Surely there must be a fix based on link-to-content ratio, or something along those lines, to pick up on scraper sites.

    3. Google Places

    I recently edited some listings which were unverified. Once I received my postcard, the PIN wouldn't work and I had to wait till the next set of postcards came in. It seems that Google Places doesn't want users to edit listings unless they have been verified, which doesn't sound like the right way to do it. As a user, I should be allowed to make changes to an unverified listing as often as I want without interrupting the original verification process.

    4. Ability to search by date

    A user can conduct a search by date by using the "custom range" down the left-hand side; however, it seems that where the URL did not change and new content was added, the results do not show the web page as it existed within the desired date range. At least for large sites this should be addressed, so that users can see how specific pages have changed.

    Thanks

  327. Hello Matt. I don't remember ever commenting on your blog, though I have been following your writing for longer than I can remember. I don't know if this has much to do with webspam, but I would love it if the Google Directory would stop using results from DMOZ and either allow people to buy in, as at Yahoo!, or have Google edit it themselves. One quick question: where did the Google Webmaster channel disappear to on YouTube? Thanks Matt.

  328. Didn’t go through the list, but looks like a lot of feedback.

    The main thing that I would like to see is more of the stuff Google mentions actually implemented. A lot of the time Google states that you should not do something or you will be penalized, but the rankings do not reflect that happening.

    Also, I do not think I have been penalized, but when Google does penalize you for something, I think it would be nice to get an alert from Google saying you have been penalized. A lot of SEOs know exactly when they are doing something that could get them penalized, but a lot of other people are just trying to do something that they feel would be good for their site and do not realize they can get penalized for it.

    To be honest, by following google over the course of the last couple of years it seems like you all know what you need to change, it is just a matter of figuring out how to implement it.

  329. I haven’t been through all of the above comments, but I did do a quick page search for “comment spam” and there seems to be several comments mentioning this. If I’m just repeating the sentiments of previous commentators, please consider this a +1.

    There must be huge scope for the Google webspam team to work with services such as Akismet in order to get a better idea of where the “bad neighbourhoods” are on the web, more quickly and efficiently than they currently do.

    Of course, it would be great if blog administrators had a few less spam comments to delete each morning as a result, but that’s not my point here. Surely, it makes sense for the webspam team to start partnering with various social media sources (@spam, for example) to help determine new sources of webspam in realtime (or close to).

    I’d like to see the internet’s largest players work together better on reducing the effects of spam.

  330. I am experiencing a lot of new scraping of content. The people generally copy the whole article (using RSS feeds, I think); then, after they publish, they run a rewrite, cache/copy my images, and change any internal links of mine to point at their own pages. So they steal content and rewrite it. Later – I guess after indexing – they redirect the pages to affiliate sales pages (although often they just advertise the same products I do).

    I have done a little research and see that there are many programs that do this copy/rewrite, even a WordPress plugin!

    So, for the spam team: just get very, very good at spotting these people and don't let them play! I think their plan is to get the content indexed and then, when searches come in for it, redirect. Do not let them get the content indexed in the first place. Maybe even redirect the search to the rightful owner (this could be open to abuse by cunning spammers, I guess).

  331. Recalculate ranking algorithm to give less focus to the sites that buy up the rankings:
    1. Place less emphasis on keywords in domain names
    2. Place less emphasis on inbound links
    3. Add more focus on textual analysis of the content

  332. Moderator: what happened to my post? It is relevant yet it isn’t published.

    Topics for Google Webmaster Webspam

    1. Fake trade directories with high ranks due to brand spam.
    Use of high-profile brand names to create counterfeit trade directories that now have very high SERP positions. There are at least five such 'trade directories' (which appear to be offshore). Without naming names, you'll see them with the key words: wholesale jovani. Note that Jovani is the leading brand of prom and pageant dresses, which has nothing to do with these fake trade directories. You'll also see them with the key words: wholesale mori lee. Note that Mori Lee is a leading brand of prom and bridal dresses, which has nothing to do with these fake trade directories.

    2. Recent survey results:
    A survey of our hundreds of active, legitimate wholesale accounts, who buy for resale at their shops and boutiques, overwhelmingly identified the top two problems in using Google search when looking for new merchandise sources:
    a. Search results that feature websites who spam brand name key words for bait and switch. (Described in 1. above.)
    b. Search results that feature websites who spam the word, wholesale, when they are only discount retailers.

    The problem of poor search results is not limited to 'guy' topics.

    Thanks for including the issues of spam in searches of the women's fashion business.

  333. I think paid links are as prevalent as ever.
    Every day I see paid links – especially on business sites.

  334. Like the Sitelinks manager in Google Webmaster Tools, is it possible to imagine a "query" manager? If I appear in the results for an inappropriate query because of competitor spam (links with an inappropriate anchor), it could be useful.

  335. Hi Matt, and thanks for all your hard work. I would like to see Google do something more about plagiarized articles popping up ahead of the original article in search results, or filling the search results page with the same plagiarized article from 10 different sources. This has happened with many of my articles, the most egregious of which is a page that was stolen from my lake advice website, which I have reported to Google. I have also contacted the article directories using it, spinning it, and using it again, asking them to remove it; only one has done so. This is the search result page: http://www.google.com/search?q=aquatic+weed+killer+aquatic+herbicide+list&pws=0&hl=en&num=10
    This is just one example of many, although it is the worst I know of. When minor variations of this search are done, in many cases, the pirated versions appear and the original does not. This is a good example of what Jill Whalen was talking about in her recent open letter to Google I think. Is there some way to hit the directories themselves, or encourage them to do a better job of clearing the articles beforehand?

  336. I didn’t make any real suggestion on how to stop the duplicate content I mentioned, but after giving it some thought…
    1. Most articles are time-stamped in one fashion or another, and easily identifiable. The first one published is the right one, no matter how big and powerful the domain of the site which publishes it 2nd or 15th.
    2. Article directories seem to be the biggest offenders. They may not always be aware of it, but we are all still responsible for what we publish – make them so. First by dropping their PageRank across the board, second by eliminating any offending page completely from the results.

    These can be done with existing technology, nothing new needed at all.
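
    A minimal sketch, in Python, of suggestion 1: among near-identical pages, treat the earliest-published one as the original and demote the rest. The word-shingle test, the 0.8 cutoff and the .example URLs are illustrative assumptions, not how Google actually attributes content:

      from datetime import datetime

      def shingles(text, n=4):
          # Overlapping n-word windows: a cheap fingerprint of the text.
          words = text.lower().split()
          return {tuple(words[i:i + n]) for i in range(max(len(words) - n + 1, 1))}

      def near_duplicate(text_a, text_b, cutoff=0.8):
          a, b = shingles(text_a), shingles(text_b)
          return (len(a & b) / len(a | b) >= cutoff) if a | b else False

      def pick_original(pages):
          # pages: list of (url, published_datetime, text); return (original_url, copy_urls).
          ordered = sorted(pages, key=lambda p: p[1])
          original = ordered[0]
          copies = [p for p in ordered[1:] if near_duplicate(original[2], p[2])]
          return original[0], [p[0] for p in copies]

      if __name__ == "__main__":
          article = "aquatic weed killers compared a practical guide for pond owners " * 5
          pages = [
              ("http://lake-advice.example/weed-killers", datetime(2009, 4, 2), article),
              ("http://article-directory.example/copy", datetime(2010, 6, 18), article),
          ]
          print(pick_original(pages))

    The weak point, of course, is that a "published" date can itself be spoofed, so first-crawled dates or other external evidence would have to back it up.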

  337. I don’t think white hatters can win this war, at least right now.

    Why are sites using total spam comments and junk ranking number 1 for very competitive keywords?

    These comments are just a mix of numbers and letters that no one can understand, with the site URL embedded in them.

    Thanks.

  338. Hi, here is a question that has been bugging me for years:
    I've done 50+ websites (I'm a web developer), and I've hosted most of them on the same web hosting where my own website is hosted.
    Is this considered something like webspam?

    When I search on google.ro for link:zeusmedia.ro, it only shows a few results, while Google Webmaster Tools shows a lot more under "Your site on the web" >> "Links to your site".

  339. I found a problem related to content indexing; I don't know whether SEO professionals have noticed it or not. The problem is that Google shows a cache of a page on my site, but if you copy the first line of that cached page's content and paste it into the search bar, it returns no results. As I understand search technology, if a page is cached and its content is indexed, you should get a result for that search.
    If it is a penalty, then it should be mentioned in Google Webmaster Tools; there is no information about this penalty there either.

  340. I don't know if this is your specialty; if it isn't, perhaps you can pass it on. What is the deal with Google suspending AdWords accounts without two-way communication? It seems to me that it would be a win-win for everybody to send the advertiser an email and tell them what needs to be corrected, giving them a chance to fix the problem. If they correct the issue, Google gets to keep a paying client, and the advertiser doesn't lose a stream of revenue. If you haven't already, take a look at the AdWords forum; it is teeming with people who have had their accounts suspended and haven't been able to get any help from Google as to why. If you are an affiliate marketer, you really seem to get lumped into a general pile with get-rich-quick scams and the like. The big G is using a machete (on affiliates especially) when they should be using a scalpel. I may not be as educated as some of the minds at Google, but I understand customer service, customer value, and communication. I believe that you guys are going about this completely wrong. Your thoughts, Matt?

  341. Incorporate something like Nielsen ratings for TV shows, but for websites. Have a relatively small number of raters who would be asked to rate websites when they leave them. This could be intermittent so it does not become too much of a burden. It would give a real value, from an actual user, for the content of the website. Sites that are rated poorly would be dropped in the search results.

  342. I believe that Google is in over their heads at this point, trying to juggle too many factors within their organic algorithm. An improvement for some will mean a negative impact for others. If Google were to stop scraper spam from ranking highly within the SERPs, I'm sure that would open up another hole to plug.

    Google is vigorously monetizing their “Website” and funneling everyone over to PPC.

    All of these recent changes including “Instant Search” and “Google Local Places Improvements” are nothing more than conversion optimization changes.

    Difficulty in ranking organic = Increase in PPC account signups

    Google says that if we want any success online we have to build sites to exact Google specifications, yet when we do so, focusing on all the quality signals and trying to remain 100% white hat, this approach still fails to beat out the spammers.

  343. I would like Google to better discern the differences between keywords.

    For example, one of my clients is in the windshield business and the other is in the mirror/shower door glass business. Both businesses show up and compete in adwords, places, maps, etc.

    There is as much difference between the windshield/auto glass business and the flat glass/mirror business as there is between the words blond and blind.

  344. I’d simply like to be able to render search results pertaining to the last 6 months… not be presented with a PDF from 2002 or a page that introduced or originated a now outdated concept or article. A button that says “Just give me the latest and greatest”. I guess the other button could say “Time Travel” or “Forensics”.

    Ditto on Michael Hoskins’ comments.

    Also, after years of trying to educate the ADD and geriatrics on the value of Google Analytics data…Visits, Visitors, Custom Reports, Advanced Segments…etc… is there a formula to determine “Hits” from GA data so I can tell them 100 Million hits and send them on their happy way?

  345. I have noticed a rise in obviously faked reviews on Google Local listings. They appear to be coming out of non-English-speaking countries. I would hazard a guess that “local listing SEO specialists” are outsourcing these to India/Philippines etc.

    Surely Google can disallow reviews that do not come from the country where these local listings are? (And punish the SEO companies that are doing this.) It really irks me that local is being gamed in this way.
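
    A minimal sketch of the country-match rule this commenter is suggesting, in Python. The Review/LocalListing structures and the reviewer_country field are hypothetical (in practice the reviewer's country would have to be inferred from signals only Google holds); this illustrates the check, not how Google Local actually works.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Review:
    author: str
    reviewer_country: str  # hypothetical field, e.g. inferred from account or IP signals
    text: str

@dataclass
class LocalListing:
    name: str
    country: str
    reviews: List[Review] = field(default_factory=list)

def plausible_reviews(listing: LocalListing) -> List[Review]:
    """Keep only reviews whose inferred reviewer country matches the listing's country."""
    return [r for r in listing.reviews if r.reviewer_country == listing.country]

listing = LocalListing(
    name="Joe's Plumbing, Sydney",
    country="AU",
    reviews=[
        Review("alice", "AU", "Fixed my sink the same day, great service."),
        Review("bulkreviewer42", "PH", "Best plumber ever!!! Five stars!!!"),
    ],
)
print([r.author for r in plausible_reviews(listing)])  # -> ['alice']
```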

  346. I’d like to see more attention thrown at the webspam in the smaller countries’ local searches. I’m currently struggling with a staggering level of spam in Bulgarian, there are some people so bold that they are taking PAGES of results with doorways and so on. Sending reports isn’t as effective as I remember it to be, but I guess the volume of spam/reports has grown exponentially.

  347. Here’s what’s happening in some of the niches in the local Bulgarian Google search:
    One e-commerce website’s owner decided to spam his way to the first spots with his 4-month-old creation. He’s posting hundreds of his links daily on small classified-ad websites that don’t monitor what is posted or how much of it. He’s completely taken over the first page, or the first two pages, on hundreds of low- and moderate-traffic keywords. So if you really want to find what you’re looking for you have to go to page three to skip all the spam and find something meaningful. If someone (Matt) actually reads this – the webspam reports explaining all this were all sent with the Google account I used to make this post here. This problem is spreading across more and more niches and keywords. I guess the problem is that these results and pages are in a language not many people use, but I’ll be very happy to see a solution soon.

  348. I see a lot of suggestions about allowing users to report something. That is inherently wrong. People are inclined to judge based on moods and subjective opinions, usually without much thought put into it. Not everything is equally spam to everyone; it’s open to interpretation. Secondly, anything that is based on user interaction is subject to serious abuse.

    It is my opinion that you (google) are playing a middle man game. You are not willing to give up on “residual” traffic across the board that lands on adsense junk sites, yet you are selling the justness story to web masters. You want your pie and you want to eat it too. Do you want a second analogy? It would not be as polite.

    So, I agree with the comment above suggesting that you cooked up the spam problem to begin with. Links. There are a number of ways to measure the validity of a site and its quality, and it most certainly does not have to involve links. If anything, links are not and NEVER have been natural. Especially not “keyword anchored” links. That is a Google invention that now infests the web.

    The MOST natural links ever produced were “click here”, “here” and similar. In fact, a keyword-anchored link is far from natural.

    You could look into, for example, overly repetitive anchor text (that is not natural; only a fabricated campaign produces a high proportion of one exact anchor text); see the sketch after this comment.

    Keyword-based ranking should be based on the site structure and content logic of the site, as well as the content logic of the sites that reference it, with links or otherwise, and not on the actual links or anchor texts.

    I think I’d like to add my two cents to all of this.

    Your “relevancy” logic is flawed, and what’s more, the SEO-wannabe crowd is interpreting it in an even more flawed way. Let me explain briefly.

    Most people believe that in order to valuably position a “strawberry cake” page they have to get a link from a strawberry cake page. That logic is entirely twisted. Why would anyone do it?

    Now, if you could detect inconsistencies like that (which, given that you choose what to rank based on the keywords searched, you should be able to detect), I am pretty positive that you would eliminate large amounts of spam links.

    Unless of course they amount to 30% or more of your revenue….
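
    A minimal sketch of the “too repetitive anchor text” check proposed above: flag a page whose backlink profile is dominated by a single exact-match anchor. The backlink data and the 50% threshold are invented for illustration; this is not a description of any real ranking signal.

```python
from collections import Counter
from typing import List, Tuple

def anchor_concentration(anchors: List[str]) -> Tuple[str, float]:
    """Return the most common anchor text and the share of backlinks using it."""
    counts = Counter(a.strip().lower() for a in anchors)
    anchor, freq = counts.most_common(1)[0]
    return anchor, freq / len(anchors)

# Hypothetical backlink anchors pointing at one page.
anchors = ["cheap blue widgets"] * 80 + ["click here"] * 10 + ["example.com"] * 10

top_anchor, share = anchor_concentration(anchors)
if share > 0.5:  # arbitrary illustrative threshold
    print(f"Suspicious: {share:.0%} of backlinks use the exact anchor '{top_anchor}'")
```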

  349. First, I would like to commend Matt and Google on spam reporting. I have reported several spammy websites through Webmaster Tools and the sites are now demoted for the applicable searches.

    Occasionally I will type in a search for a particular product or service and the first-page results include directories or directory-type websites that have little content, or content that does not match the title tag, headers, etc. They are clever because the pages will include the keyword phrases (sometimes stuffed or “quasi-stuffed”) in the body of the page, but the companies listed are hundreds of miles away or not relevant at all. For example, I recently typed in a search for a particular service in Charlevoix, MI. A well-known directory came up first and the only business listed was in Milwaukee, WI. There is a HUGE lake between Wisconsin and Michigan, yet the title tag had Charlevoix, MI listed 3x. And this is the type of directory that deals with businesses that would only be relevant if local. I think it is time for Google to get tough on these types of directories. You have been patient long enough. As we say in the country, “You need to learn them a lesson.”

  350. Start penalizing more for new domain names that spam blogs with their URL. These new domain names are easy to detect as the spam comments usually get removed, e.g. see backlinks for http://www.voupons.com.au. They usually rank 2nd for coupons in Australia and at times they disappear completely from rankings just to re-appear again.

  351. On your trip calculator where you show the time of the trip, say from north Dallas to Ft. Worth, if you could include historical traffic data and calculate the trip time during heavy traffic and present it as estimated time when leaving at 2:00 PM, or 3:00 PM, etc, with the knowledge built in to the estimate based upon traffic jams – and also report the current time based upon current traffic – that would be nice.
    You currently just say 1 hr or 1 hour 40 minutes with traffic, for instance.

    Scott Foster

  352. It would also be nice if you could create an application that translates a movie’s language in real time. I could be watching a news clip or movie in Spanish, for instance, click the button for “translate on the fly”, and the movie would play while I hear the translated English as I watched.

    Scott Foster

  354. Another year goes by… and still we see sites buying links from dodgy sites, and Google still ranks them, e.g.:
    http://www.ergoresources.org/
    http://mymediacomhome.com/
    http://www.redistrictusa.com/
    http://www.arrugadas.com/
    IMHO and I share this freely, if Google can’t/won’t deal with this it’s time to sell Google shares…

  355. Love to see spammy techniques like this one out of the index:
    http://www.depilacao-laser.net/depilacao-a-laser-com-psoriase/?cid=27
    These guys are ranking #1 using these techniques. I agree that some sites get removed from the index after a Webmaster Tools report, while others stay there for ages. Even AdSense is misused on these pages.

  355. Thanks Steven, I second your opinion:
    “But why stop there? Instead of dropping the ranking of sites Google doesn’t want showing up in its SERPs for whatever reasons, wouldn’t it be better to directly tell webmasters what you DO want and suggest changes to help them rank better? Webmaster Tools already informs webmasters of missing title tags or short meta descriptions, but what about showing pages Google deems over-optimized, or pages Google feels are lacking content or can be improved in some way to help them rank better?

    I think Google employees should try to put themselves in the shoes of a webmaster trying to build a well ranking site with very little advice from Google but 100 trillion pieces of bad information provided by SEO blogs and forums which almost all encourage black hat activity. It’s hard to build a quality website when you have no idea what Google actually considers “quality”.”

    Google should work on ways to reduce collateral damage to innocent sites. It should initiate a dialog, maybe for a fee, when webmasters genuinely cannot find what’s wrong with their sites, and at least provide some pointers. It isn’t right to be judge, jury, and executioner at the same time with no way to appeal or even know what you are being executed for. From a webmaster’s point of view it’s horrible: plain and simple autocracy at its worst.

  356. Matt,

    I’m extremely worried and concerned about the way Google is handling these link spam reports and link spam in general.

    Especially how you can sabotage your competition by link spamming and then reporting them.

    A few weeks ago, we noticed an unusual trend of hundreds of incoming links coming to our website via Webmaster Tools. We noticed they were not very good links. We thought to ourselves “It must be a competitor trying to sabotage us, but there is no way Google would allow an Inc. 5000 corporation to get delisted by such shady tactics”. We were wrong.

    Fast forward one week, the home page of our Inc. 1000 company is completely gone from the Google index even for our name (feel free to check, the website is in the e-mail I used to submit this comment). I would understand if the Google algorithm had a flaw and this worked against Mom and Pops shops, but this is no Mom and Pops.

    Later, another friend of mine in the SEO industry told me that this is actually very easy to do. That his company has been delisting client’s competitors like clockwork in the past 6 months by doing similar things.

    The only way to solve this is if the algorithm discounts bad links, instead of sandboxing altogether because of them…

    Matt, I’m very worried about this. Not only because of our website (which we will have to go public and issue press releases if this isn’t solved soon), but also because if this becomes public knowledge, results will become completely irrelevant in the best search engine.

    Thank you for your attention,

    Daniel

  357. Here’s a project:
    Why don’t you investigate every single link that has been added to this page?

    You’ve got a trusted domain and people are taking advantage of the fact that you follow the links in the messages here.
    You are directly contributing to the problem that you’re asking about and you’re allowing people to manipulate Google results.

    The Personal Blocklist Chrome extension is really a great one.

    Thank you Mattcutts 🙂

  359. I would like a stronger filter on blog comments. There are so many computer-generated comments that you have to go through and delete; it’s extremely annoying.

  360. On the back of Daniel’s concerns. His attitude to this idea sucks “I would understand if the Google algorithm had a flaw and this worked against Mom and Pops shops, but this is no Mom and Pops”
    What’s wrong with Mom and Pop shops, Daniel? They’ve been the backbone of the web since Alta Vista days and Vax Notes. OK they’re an outdated type of site, but what puts your own Inc 1000, 2000, 3000 whatever in charge? Do you have a moratorium? Stop being so crazy. If you have the wonga you don’t need Googs and ranking. You should have built your own traffic system by now. Oh, did you forget to do that?
    JG

  361. I had to delete 700 spam posts in one week on one of my sites; I eventually got a plugin to disable comments. Between writing articles, research, and manual backlinking, one just does not feel like going through all those spam comments. I have reported quite a few sites for spam, but two of them still rank higher than my site on Google, and I have come across 15 of their spam comments on my blogs. Feels like you are banging your head against a wall, doesn’t it?

    Honest people get a kick in the behind.

  362. I wish there were a way Google could filter out of the results those articles and pages that are stolen and badly spun content. I get so sick of seeing my own articles, which I worked hard on, rehashed, sounding like junk, and ending up ahead of the original in the results.
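
    Detecting rehashed copies of an article is commonly approached with shingling and Jaccard similarity over word n-grams. Below is a minimal sketch of that general idea; the texts are invented, and this is not Google’s actual duplicate-detection method.

```python
from typing import Set

def shingles(text: str, n: int = 5) -> Set[str]:
    """Overlapping n-word shingles of the text, lowercased."""
    words = text.lower().split()
    if len(words) < n:
        return {" ".join(words)}
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two shingle sets (0 = disjoint, 1 = identical)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

original = ("Prune apple trees in late winter, before the buds break, "
            "removing dead wood and crossing branches for a healthier harvest.")
rehash = ("Prune apple trees in the late winter, before the buds break, "
          "removing dead wood and the crossing branches for a healthier harvest.")

# High similarity suggests a copied or lightly spun version of the original.
print(f"similarity: {jaccard(shingles(original), shingles(rehash)):.2f}")
```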

  363. I’m not sure what’s changed but I increasingly get old blog posts/content returned when searching for information that I know has more current results. I’m sorry I can’t explain it in more detail. It just seems like I’m having to go to page 2 more often because page one results (unless searching for a local company) are not answering the initial search query correctly.

  364. I agree; some guidance on what Google considers a quality site would help a lot.

  365. Hey Matt, I know I am bumping a really old thread. Another thing: I realize you get a lot of bull for your work on ending web spam, but I just want to say thank you, and I support you in that.

    I often jokingly make the analogy that web spam is a lot like throwing those plastic 6-pack rings into the ocean. Rude joke or not, they both ought to be ended and I hope this gives you just that much more motivation and determination to keep pushing forward with your ideas.

    Good luck!
