Webspam in 2009?

It’s the beginning of the year, so I just wanted to get some outside opinions: what would you like to see Google’s webspam team tackle in 2009? Here’s how I asked for suggestions in 2006:

Based on your experiences, close your eyes and think about what area(s) you wish Google would work on. You probably want to think about it for a while without viewing other people’s comments, and I’m not going to mention any specific area that would bias you; I want people to independently consider what they think Google should work on to decrease webspam in the next six months to a year.

Once you’ve come up with the idea(s) that you think are most pressing, please add a constructive comment. I don’t want individual sites called out or much discussion; just chime in once with what you’d like to see Google work on in webspam.

Add your suggestion below.

327 Responses to Webspam in 2009?

  1. Cloaking used by justanswer.com, which presents one thing to Googlebot (all their content) and an “answers are locked — pay us to view them” message to everyone else.

    The reason this annoys me so much is that their users get paid to answer questions, and we’ve found many instances where they are copying our own data, and reposting it without permission as answers.

    The fact that they get high billing in Google due to cloaking just rubs salt in the wound for me.
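
     (For illustration of the cloaking described in this comment: a minimal sketch, in Python, that fetches the same page with a browser User-Agent and again with Googlebot’s and compares what comes back. The URL is a placeholder, and the length check is only a crude stand-in for real cloaking detection.)

     import requests

     URL = "http://example.com/some-question-page"  # placeholder URL

     # Fetch the page as a regular browser and again as Googlebot.
     as_browser = requests.get(URL, headers={"User-Agent": "Mozilla/5.0"}).text
     as_googlebot = requests.get(
         URL, headers={"User-Agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"}
     ).text

     # Crude signal: if the crawler sees far more text than a visitor does,
     # the site may be showing crawlers content that users must pay to view.
     if len(as_googlebot) > 2 * len(as_browser):
         print("Possible cloaking: the crawler copy is much larger than the user copy")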

  2. I think a paid link reporting Firefox plugin would be helpful. Right-click on the suspect link, select report, and report it – the plugin could then pass you extra detail like the location on the page, etc., and might save your team time and encourage more reports.

    I think we’ll see more cracking in 2009 to hide links in sites. It would be reassuring if the webmaster console was quick to email/update webmasters with concerns and then for the spam process to be just as quick to re-list the site once it was clean. What about a “report hacked site” addition to the spam report?

  3. Google should greatly punish the webspam through decreasing their PR.

  4. LOCAL LOCAL LOCAL !!!!!!!

    This is by far the worst spam center in Google.

    Also, edu spam in the pills / porn / casino area.

  5. Perhaps this is asking too much, or maybe it’s just over-simplifying, but: more transparency for webmasters. I’m not looking for a road map to the algorithm (although if you have one handy, I’ll take it), but SO much is left to guessing and wondering. Webmaster Tools helps with that some and seems to be growing in this area – page title, meta desc feedback, sitemap errors, indexing data, etc… but what about penalties – has my site been penalized? If so, why, and how can I fix it? Are there areas on my site that are potential red flags? I understand it’s a balance between too much info that leads to manipulation, but if ultimately Google wants the best results – the best sites – then we as webmasters need as much help as possible giving Google what they want. Now I can already hear the response – Google wants websites with great content. Understood. But there is still a balance. I still have code that can be compliant or not, I still have site architecture, I still have page titles, meta desc, etc… and if I can be doing a better job with them in the eyes of Google I’d like to know.

  6. Chrome for Mac. 🙂

    As far as results go, better local business results. If I search for [pizza small town, usa], the results are usually 90% full of yellow page, directory and review type sites. I think it would be more useful to show the “official” web sites of the businesses first.

    I realize a lot of these sites aren’t that search engine friendly, but most are indexed, just the directory type sites have concentrated on SEO.

  7. My comment is mostly on being a new business with a ton of great content that gets stuck behind older businesses with less and perhaps more out of date content. It appears that PageRank trumps content in many instances (though not all).

    We are a product reviews website and only publish a page about a product if we have at least one review. These reviews are not snippets spidered off of other sites. We have full rights to the reviews, show the full review and actually work with retailers directly to help collect this original content from their users.

     Many of our competitors have more PageRank and just put the word “reviews” on their page. Thus, when users search for reviews for a given product, often by typing [sd880is reviews], many of the results don’t have reviews at all.

    We have tried to use microformats to indicate the existence of reviews. Google should be able to eliminate the pretenders in this fashion.

    We feel that we are an extremely useful site for product research queries and that Google could do a better job and present more relevant results in the [reviews] genre.

  8. Reviews.

     If I search for [product_name review] I don’t want to see the first page entirely taken up with spam sites that do cost-comparison without a single review on any pages. I want to see blog posts, newspaper articles and maybe cost-comparison sites WITH reviews of that actual product.

     Otherwise it’s pointless and spam.

  9. Hello Matt;

     Here are several points to focus on in webspam:

     First suggestion: duplicate web sites for the same activities.
     Ex: in Google France, for a lot of highly searched queries like “rachat credit”, there are a lot of web sites which present the same activity and which are from the same owner.
     And also Google is a registrar, so it should be quite easy.

     Second suggestion: when there are penalties, the webmaster should know it from Google.

  10. Fewer sites requiring logins up at the top of the SERPs (à la Experts Exchange) and fewer blog-scraper sites that basically aggregate others’ RSS feeds.

    You guys are doing a bang up job, though — keep up the good work.

  11. I’d suggest that the biggest problem isn’t within the engine itself, but within Adwords.

     I see a number of listings within Adwords (e.g. anything in the “guaranteed top 10 rankings” bracket) that would have no place within the main index since, quite frankly, it’s crap. If it’s penalized within search (or banned), ban it within Adwords and leave room for quality advertisers.

    Other than that, I’d say the following things:

    1) Scraped newsgroup listings for Adsense/”content” purposes. This is especially bad in the tech area.

    http://www.google.com/search?num=100&hl=en&safe=off&rlz=1T4GGLL_en&q=%22The+TCP%2FIP+connection+was+unexpectedly+terminated+by+the+server.%22 <– there are a couple of examples here (vistaheads.com and tech-archive.net).

    2) We’re at the point in time where penalties need to be outright removed. A site is either good enough or it isn’t…period. With longtail searches increasing both in quantity and in variance, sites such as the ones listed above still get the benefit of traffic and (presumably) revenue from said traffic without doing anything other than copying public domain and/or private content.

    3) What you do for one, do for all. If someone’s conclusively spamming Adwords or search or GMail, block access to all Google services. Flag all emails originating from the domain as spam. Don’t let the domain into the search engine. Don’t let the domain into Adwords. Don’t let anyone using the domain use Analytics or any other Google service. In other words, make the penalty for spamming severe enough from a revenue generation point of view that it simply isn’t worth it to try it. Don’t hang them high unless you can prove their guilt (innocent until proven guilty and all that), but once you know they’re up to no good, nail them.

    That’s basically it from here.

  12. I often use Google for searching error messages related to programming.

    I find that there are a lot of sites that simply index forum threads from other sites, thus the same thread would appear 4-5 times in the SERP.

    It would be very valuable if it was possible to just index the original site for the thread, where it would also be possible to reply to the forum.

    Another thing is searching for a specific product like [brand model variant]. This often returns a spam page with a long list of links with all model names, but no actual content. Just links to other similar pages with no content.

  13. Voting for web sites by users with rank/reputation. e.g. users with Google accounts can gain rank using a number of metrics based on their usage of the web. This rank allows you to filter and weigh votes cast for pages/sites.

     i.e. Just like a link from a high-PR page counts more for the linked-to site, so does a vote from a user with high rank. Also, like the PR algorithm, you’d have to keep UR (user rank) secret to make it difficult for users to game the system.

    What are those two “vote for this page” and “vote against this page” used for?
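
     (A minimal sketch of the weighted-vote idea in this comment, using made-up user ranks; the scoring rule is only an illustration, not a claim about how Google would actually combine votes.)

     # Hypothetical user ranks (0.0-1.0) and their votes on one page (+1 or -1).
     votes = [
         ("alice", 0.9, +1),  # high-rank user votes for the page
         ("bob",   0.2, +1),  # low-rank user votes for the page
         ("carol", 0.8, -1),  # high-rank user votes against the page
     ]

     # Each vote counts in proportion to the voter's rank, just as a link from a
     # high-PageRank page counts for more than one from a low-PageRank page.
     page_score = sum(rank * vote for _, rank, vote in votes)
     print(round(page_score, 2))  # 0.9 + 0.2 - 0.8 = 0.3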

  14. I’ll try to keep this short and simple

     – Search Queries – Websites that take you to a search results page or something similar. They are just taking the keyword that you type into Google and placing it into their own search engine. This isn’t helpful to me in any way. It’s like a never-ending process. You click the Google result, which takes you to XYZ.com – which displays more search results… and let’s say you click one of those links, you just get more and more and more links. This isn’t helpful to anyone.

  15. From a Webmaster perspective, I think it is frustrating to see that paid text links are still very successful after Google has made such a point against them. It is frustrating to try to “be good” and see sites that flout the “rules” ranked highly. I know not all sites can be caught, but I see many blatant examples that make me think Google should be able to red-flag sites that suddenly shoot up the rankings based on unlikely linking patterns. I know we can submit offending sites, but most Webmasters don’t really like the idea of doing that & it’s very time-consuming.

  16. I think the general quality/relevance of the search results improved in 2008, actually. I especially like to see trustworthy review sites like Yelp show up highly for queries like business names and restaurants/bars in a specific city. On the other hand, I do hate to see spammy/shady sites like Rip Off Report and Scam.com rank well.

    On a semi-related note, I can’t wait to see what Google ends up doing with things like Search Wiki and other search projects.

  17. hi

     I know that in the sports vertical, topical, time-sensitive long-tail keywords are littered with MFA (made-for-AdSense) sites.

     There should be a SERP check for MFA sites which have next to zero content.

     e.g. fewer than 30 words plus 4 ad units, typed in by outsourced staff, should trigger a flag.

    There has been a dramatic rise in these sites in the SERPs in the last few months and the usefulness of results has dropped dramatically.

  18. I created a website in order to experiment with some SEO concepts and, of course, provide good content and be attractive to my audience. I wanted to see whether, if a normal person really wants to create a good content website, Google will index it accordingly. During my tests I noticed some facts:
     1. Webmaster Tools is becoming awesome! It was fast to include my site there, and just 30 minutes after my Sitemap’s creation my page was already indexed and in a fair position for certain keywords. This was a great experience!
     2. Unfortunately, even if the inclusion was fast, the update for the “Statistics/Top search queries” is the opposite. It takes about a week to be informed about the keywords users are finding my page with. This could be improved.
     3. I think Google should pay more attention to DMOZ and Google Directory. There is a great opportunity for getting good content there, like there is in Wikipedia. The issue is selecting good volunteer editors focused on categorizing quality content. It is a good topic for discussion.

  19. I agree with Andrew, the product review spam is out of control. I would like to find a site that actually reviews the product at hand, not content scraped from Amazon.

    I also think the fact that this was addressed in the second comment is testament to how annoying it is 🙂

  20. You could provide a Spam Detection API, which any service (like a blog comment system) can access by sending a string of words, some context and what-not, and receiving a REST response of SpamRank 0-10. Google could indirectly benefit because when there’s less spam on the web as more webmasters access that API, then there’s also less spammy links etc. which might skew results (of course, there’s also nofollow). I believe I requested this before 🙂
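
     (To make the suggestion concrete: a sketch of what calling such a service might look like from a comment system. The endpoint, parameters, and response format here are entirely hypothetical; no such Google API exists.)

     import requests

     # Hypothetical endpoint and payload for the imagined Spam Detection API.
     payload = {
         "text": "Buy cheap meds now!!! http://spam.example",
         "context": "blog-comment",
     }
     response = requests.post("https://api.example.com/spamrank", json=payload)

     # Imagined response body: {"spamrank": 9} on the suggested 0-10 scale.
     if response.json().get("spamrank", 0) >= 7:
         print("Hold this comment for moderation")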

  21. I agree with Chris Bartow; there seems to be a big problem with local search. Has Google thought about enabling a check box or something that says “search locally”, so that based on your Google account location it will attempt to only show local websites?

  22. 1. I would like to see Google devalue ALL links from sites like MySpace, Facebook, etc. Those sites are being gamed in a big way to create backlinks.

     2. Go back to your roots: when sites that spam using old-fashioned techniques, like hidden text, get reported, remove them from the index for a set period of time instead of letting them back in after they fix their spam and work on some other form of spam to game Google with. Otherwise it’s not much of a deterrent to spammers.

    3. Actually have someone from the spam team look at spam reports from the Google webmaster console and evaluate the site in the report taking MANUAL action instead of just writing stuff down to try and have the algo figure it out. As smart as the algo is, it will never be as smart as a good Googler on the spam team.

    4. Post more pictures of Emma! Sassy cat likes to see other pictures of her species when she is sitting on my keyboard keeping me from working.

  23. I really hate to see top positions held by sites with A LOT of keyword stuffing… they put “keyword” 120 times in their footer and Google doesn’t notice that… it shouldn’t be so hard to detect keyword stuffing. I already sent spam reports but no success there…

     Agree with Adam… you can see a lot of sites that are spammy in Adwords.

     It’s frustrating to be a white hat SEO webmaster.

    Happy new year Matt! 🙂

  24. Assuming we can’t reduce spam, I’d like to at least minimize its impact by fixing two related problems:

    1) Scraped content getting indexed on the scraper’s site before the author’s site.

    2) Scraped content appearing higher in the search listings than the original content.

    I’m not talking about a single stolen article. I’m talking about sites made entirely of scraped content. How do they slip past the current algorithms and get ranked so highly?
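
     (One common way to spot wholesale scraping of the kind described above is shingling: break each document into overlapping word n-grams and measure the overlap between the two sets. A minimal sketch follows; it is not a description of what Google actually runs.)

     def shingles(text, n=5):
         # Break the text into overlapping n-word "shingles".
         words = text.lower().split()
         return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

     def similarity(doc_a, doc_b):
         # Jaccard overlap of the two shingle sets: 1.0 means identical wording.
         a, b = shingles(doc_a), shingles(doc_b)
         if not a or not b:
             return 0.0
         return len(a & b) / len(a | b)

     # A scraper page that copies most of an article scores close to 1.0; combined
     # with crawl dates, that points back to which copy is the original.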

  25. Spamhound said it like he took the words right out of my mouth:

    “Actually have someone from the spam team LOOK at spam reports from the Google webmaster console and evaluate the site in the report taking MANUAL action…”

  26. Another vote for the reviews issue about which Andrew commented. These pages are difficult to spot on the SERP because the piece of text that gives you the clue (something like “no one has written any review yet”) usually does not appear in the result snippet.

  27. Team up with Akismet and use their data for who’s spamming who.

  28. Instead of looking for new areas, Google should first look at the reports submitted through Webmaster Tools.

     I am looking now at a detailed report I submitted on October 28, 2008, in which I reported someone using hidden links on thousands of sites, and a way to find hundreds more sites using the same technique. Nothing was ever done, as I see that the main site using this technique (specifically mentioned in the report) is still ranking (with a nice pagerank) in its full glory.

    Why should I continue to send reports in, if nothing happens from them?

  29. Gmail’s spam filter is redonkulously accurate. I think the best contribution Google’s antispam hackers could give to the Internet at large would be an API to that ridiculously incredible resource. Bonus points for an Akismet-compatible API.

  30. Duplicate content is a big problem on the non-English web. Webmasters who copy and paste are doing it for traffic, not for people. It’s a sort of spam.

  31. More focus on REAL spam like hacked backlinks, comment spammers, forum spammers etc (as someone that runs a few popular sites I’d love to see the option to report some posts/comments) and less focus on penalties for duplicate content.

     Wishful thinking: Google takes a less ridiculous position on paid links and goes with the same idea MSN/Yahoo have here, i.e. relevance and quality first.

  32. I would like to see listing sites lose some of their clout. Any site that just aggregates listings is not worth a top rank. They offer no new content and force link-backs from listed sites, so they crowd out the quality sites.

  33. Same as Andrew and Chris.
     I use German Google more often, but it’s pretty much the same. You google for a product and the first ten pages are spammy directories, review or price comparison sites that create their content automatically. Often it even says on the page that “no results were found for [keyword]”.
    Little web shops that actually offer the product or reviews that actually really do talk about the products aren’t listed first.

  34. I’d like to see far less apparent value allocated to keyword rich domain names. There’s been a build up of networks of interlinked websites, each using key, relevant search queries as their domain names. This type of spammy activity does nothing to enhance the user experience and should be penalised. Google must be aware of these networks so it’s baffling how the sites involved often appear to be enjoying excellent search engine positions.

  35. About local business search: wouldn’t it be nice to create a meta tag, which tells google and other search engines where the business or organization is located? Maybe something like a ‘gps tag’ for websites?

  36. I would like an easy way to tell Google this is NOT WHAT I LIKE! A scrub-it check box or something. I know what is wasting my time and what I never ever want to see. Google knows what I click on, but not what rings alarm bells, annoys me, what I choose to avoid, what from experience I think is poor quality, what I think might be a scam, what is work, what is shopping, what is research, and what is fun, or why I think it’s silly for me to see ads for weight loss when I’m not, etc. If I could only tell Google quickly as I searched what category of search I am doing and what I just didn’t want in those top 10 crucial search result positions! I’m sure some sort of clever logic could then be built to make me enjoy being a Google mistress even more than I am already. I know from my Google Reader and from organising my mail that there are set flavours/qualities of sites I like to visit. Equally there is an opposite which I would think fantastic if eliminated. I understand that my Facebook-user friend or work terminal might love something completely different. I know it’s asking for a more personalised service, but I would love the attention, and the ads I hope might become more relevant too!

  37. Reviews. Here’s an example of what I’d love to see;

     Think for a moment how, when you type in certain products, say for example ‘iPod’, universal search shows x3 shopping results at the top. I LOVE how Google came up with this, but why not roll it out across other areas where you already have plenty of data? Example…

     I type into Google’s Universal search ‘iPod Reviews’ and voila – I get x3 review results at the top, which I’d have thought you could take from such pages as: http://www.google.co.uk/products/catalog?hl=en&safe=off&q=ipod&cid=7546580999858004957#ps-sellers

     Reason I request this is that when I search for product ‘x’ reviews, some of the time I’m coming across sites that reallllly were not even worth me clicking on, but of course, being at or near the top of the results, I am usually quite happy with what the Big G feeds me. This way, maybe we can get away from these spam websites… and possibly, if implemented correctly, it would maybe even stop people having to click on a link for ‘reviews’ at all if a star system were also shown, thus people get to where they want quicker – the purchasing process.

    … then again, maybe Google wouldn’t want to do that because it would miss out on AdWords revenue, but who knows? Certainly not me! Either way, I think it would just be one extra useful way of keeping searchers happy. ^_^ Oh… and of course, goes without saying really, but more and more people are being selective on what they buy, so ‘ratings & reviews’ must be on the up as a search term(s).

    Sorry Matt, quite a long winded comment in the end. Coffee & lots of sugar keeping me awake to crack on with my work late, did not help this matter. o.0

  38. Shutting down all the splogs on Blogger!

  39. I’d like to see a comment against a SERPS listing that indicates that the site may be harmful in its content to a user’s computer. Perhaps this could be identified by the Googlebot spider as it crawls the site and comes across something objectionable, e.g. a hacked page or one that will launch a trojan/virus, or perhaps from a site that is hosted on known problematic IP addresses.

  40. Find a way to fight duplicated content; those “forum copies” and indexed copies of DMOZ in the top 10 make too much noise from nothing.

     Keep up your work fighting spammers and, more importantly, phishing attacks – it is the right idea to keep the web safe. Google could at least easily let people know how to “drive safe” through the Internet.

    And, keep going the Google way 🙂

  41. Amen to that Greg.

  42. I would like to have slightly better customer service through Webmaster Tools when trying to get a site re-indexed. It would be nice to have more visibility as to why a site is penalized.

    I had two sites that were removed from Google’s index. They used to be fairly crappy sites that did a lot of RSS aggregating so I suspect that they got hit with a penalty because of that. I’m okay with that. I have since (over a year ago) changed the sites and have been adding a ton of original content. I resubmitted them both for re-inclusion and for some reason one was quickly re-indexed and got a PR3 within a few months, while the other one remains out of the index. I’ve resubmitted it and nothing ever seems to happen.

    I don’t make a lot of money off either of these sites so I’d be happy to change the sites in just about any way if there’s still some violation. However, I suspect at this point my site no longer violates any terms but Google’s system doesn’t really seem to give me a way to get reincluded.

     I don’t see how more information in Webmaster Tools about your site’s penalties would really hurt Google. Sure, some spammers would more easily slide their sites back in, but as long as Google’s algorithm catches spammy sites they’d just get quickly thrown out again. And there are probably lots of folks like me who just run a few sites in our spare time for fun and would be happy to make any changes if we just knew what they were and had a real way to request reinclusion.

  43. “Chrome for Mac.”

    I’m not sure the webspam team has any pull with the Chrome team, and I suspect that Mac is already right near the top of the list, but I can pass the suggestion on. 🙂

    Thanks for the other suggestions, everybody, and keep them coming.

  44. False positives on webspam can be a significant problem for webmasters and end users alike. (not to mention that removing valid relevant resources from listings naturally increases the room for spam and non-desirables to fill the gap)

     Perhaps it would be advantageous to include an indicator in the Webmaster Tools console for notice of de-indexing a home page, removal of a site, or some other penalty related to uncommon/new issues. Some of those issues could be complex enough that false positives take much longer to correct than index quality control would like.

    Could 2009 be the year of reducing false positives?

  45. Local Mapspam is horrrrrible right now – I get complaints from my clients all the time on this. The Google Local team does not have the capacity to deal with this problem on their own and desperately needs your help!

  46. I would love to know how you guys deal with link poisoning.

    Directory spam and virus notification within search results.

  47. Matt, I would love for you to reevaluate Google’s high ranking of such sites as RipOffReport.com, which refuses to remove libelous and slanderous information from its site. While yes, Google is site-agnostic with its results, Rip Off Report has spawned a whole cottage industry of needless SEO to get legitimate search listing pages ahead of RipOffReport’s pages, which in many cases are false and hurt the reputation of many good businesses.

    SEOmoz is absolutely right in their post:
    http://www.seomoz.org/blog/chris-bennet-on-rip-off-report

     I am sure you are aware of the issue – their results in the SERPs actually hurt Google’s authority in the eyes of many businesses which are profitable advertisers and legitimate businesses for you.

    Thanks for your consideration.

  48. Hi!

     I’d like to be able to talk to Google without fear of having my site(s) delisted in the event of a hack — my hosting account got compromised a couple of months ago (a hacker shell was installed via a badly-handled php include in a long-forgotten site hosted on the account) and software was installed on other domains hosted there that generated thousands of pages of links to all sorts of nastiness. Google indexed all of this almost immediately (very impressive!) and, of course, as soon as I spotted what had happened I deleted all the rogue pages and tightened up my security. I really wanted to tell Google about it so that it could be taken out of the index (I was getting hundreds of hits an hour from Google searches, and each of those was somebody who wasn’t getting what they were looking for) and attention could be given to the identical hack that had obviously taken place all over the web, but at the same time I didn’t want to risk putting my head up above the parapet and having my sites taken out of the index altogether. What’s a webmaster to do?

  49. I’m not sure it already is, but search volume in relation to click through rate should be the most dominating factor. Kind of like quality score in adwords. Maybe even incorporate traffic popularity data from quantcast or alexa, et al. There is way too much easy gaming of sites, blogs, text, anchor links, subdomains, etc.

  50. >>>Lead me to professionally-edited, well-written articles<<<

    I’ve heard that Google’s scoring includes some element of checking for proper grammar, but recalling Mark Twain’s quote, “Sorry for the long letter. I didn’t have time to write a short one,” just imagine how much time Google could save THE WORLD if they better directed their users to the “short letter” that took more time to write — websites with well-written, grammatically-correct, professionally-edited content.

  51. I agree with Todd Mintz, the local spam is the most annoying.

  52. Combatting image theft somehow would be much appreciated (how this can be done is beyond me, though!) Possibly a duplicate content style treatment for appearing in Google image searches?

  53. It was mentioned already but, I hate the fact that when you are searching for things like apartments, you get loads of apartment finders, and no actual apartments.

     I would understand if I searched for “apartment finder chicago”, that they would come up… but I don’t want to see a bunch of apartment finder search results when I search for that; I’d like to see websites of condo owners that actually took the time to make a website to attract business.

  54. A few other people have said it, but “Product reviews”

    If I search for:

    modelnumber review

    Then I want to see reviews of the actual modelname product, not 200 sites that have created a page for a product without any real content.

  55. What is spam? That’s the issue. What’s spam to one person might not be to another. The more Google attempts to decide this for its users, the less choice everyone has. I find myself going to other search engines to get more choices because Google has tightened the options so much.

    So the more Google takes it upon itself to decide the less choice its users have. Nothing worse than one giant deciding what everyone sees.

  56. I have some thoughts but first I’d like to agree with what BuzzillionsJim has said.

    There was a time when you could do a search for “productname review” and you’d find a selection of useful reviews for that product. Now you do that and you find a bunch of links to etailers that either have the same reviews they got from amazon or they just have the ability to post reviews but none actually there. Or if they are, it’s obviously not a real review.

    Also related, I’m tired of all these sites that create garbage content showing up on the first page of search results. They just grab keywords and stuff pages or links to pages. I’m not sure how to classify them so I want to give a few examples.

     Directories. Directory pages shouldn’t be returned in search engine results based on the content of their pages, period! If people want to use directories that’s great. If people want to Google for a directory by doing a search for something like “directory of thistypeofwebsite”, fine. But just doing a search for a particular company shouldn’t return a bunch of directory pages in the first couple of pages of SERPS. Only when they specifically are searching for a directory.

     Websites that provide information about other companies or people. Stuff like linkedin, spoke, etc. Those should not be ranked so highly, especially since most of their pages seem to be fake. They just found a name somewhere and created a blank page for the person to eventually come and enter that information. If someone hasn’t really filled out the page, then it shouldn’t come up. It’s so annoying trying to search for someone or some company and you get a link to a page that only has the person’s name in it. Same with companies. Manta seems to provide some good information but there are so many copies that come up in serps too. Most just have the name and address. Sites like yellowpages.com should not come up in serps when looking for a company. Definitely not 30 different yellowpages.com type sites. Only if someone does a search for a yellow page directory.

     AboutUs.com is a site I’m tired of seeing as the 2nd or 3rd result when I’m trying to find information about a company. THIS IS THE MOST USELESS SITE TO RETURN IN SERPS!!!

     Think about it. All they do is copy the About Us page that they spidered from other websites. The original company website is the first result when I’m searching for the company. If I wanted to see their about page, I would go to their site and click on the About Us page or directly click on the About Us sitelink Google is now displaying.

    AboutUs.com provides NO VALUE. At least not enough value to rank on the first or second page on Google searches. If someone is doing a search like “collection of about us pages” fine, let them be the first link.

    I’m just so tired of having to go through 3 or 4 pages to find REAL USEFUL INFORMATION AND REVIEWS about companies.

     Same with all these companies that create “statistics” and “valuations” on websites. If I do a search for example.com I want to find sites that are talking about example.com, not sites that happened to spider example.com and create a website that shows bogus, inaccurate information on the site. If I want to find “what is example.com worth” or “traffic for example.com” then they should be on the first page; otherwise, they shouldn’t.

    I’m also tired of websites that don’t put their content above the fold. If someone wants me to scroll down 2 pages of ads to get to their information, they obviously don’t put much importance on their content and neither should Google!

    You need to do a better job of finding REAL content and not what are just essentially placeholders and duplicate content.

    I’ve been getting so dismayed with Google’s search results that I’ve been using other search engines more.

  57. I agree with what Greg Bulmash said. Google hosts a lot of webspam itself. So ask people inside Google who will be accepting user-generated content to come talk with the webspam team first.

  58. I agree with DraxOfAvalon – It is so frustrating “following the rules” and running into sites that have loaded up the meta tags and content with keywords. I was with a client explaining meta tags and preaching about good content, only to bring up a SERP with a spammy page at number 1 and 3. My wish for 2009 is to level the playing field a bit. I have my white hat on, that should count for something.

  59. I had to come back. Thinking about this got me riled up and thought I’d throw in some more stuff that has been swimming around the back of my head lately.

    “Answer” sites. If I do a search for “how do I do A with B” I get a whole bunch of these answer sites. What’s worse, I get these sites even if the question hasn’t been answered. I know you can’t just filter by checking if it was answered, because then they’ll automatically include an answer, at least the less reputable sites.

     Article sites. Just garbage lately. People post redundant information, many times just plain useless or inaccurate, to create links to their sites from high-ranking article aggregators.

     The problem with these Answer and Article sites is that through volume they have gained high rankings and PR in Google. Though a site may be good for posting questions, answers, articles, etc., that same ranking shouldn’t apply to the individual content on the site.

    Especially for sites that allow user contributions with little or no editorial review.

    Here’s an example. Someone wants to replace a toilet so does a search for “how to replace a toilet”

    First result is on doityourself.com. That’s not so bad but the article itself isn’t from an authority, though the site may be in general.

     Next is ronhazelton.com, some guy that has a home improvement show at 3am or something.

     Then bobvilla.com, a home repair has-been.

     Then lowes.com, a retail site whose info is probably good but, again, not an authority.

     Then lancelhoff.com, some blogger that created a site for replacing a toilet flapper, and 90% of the screen when you go to his site is covered in ads.

     Then ehow and wikihow, neither of which appears to be an authority on plumbing.

     Some more crappy links, and then further down on the second page you see thisoldhouse.com with how to install a toilet. The site has the best pictures and instructions. It’s the best website, from a home improvement show that uses real working tradespeople and not actors with tool belts, but they show up far down in the SERPS because they titled their post with Install instead of Replace.

     A targeted site like TOH or AskTheBuilder (sticking with the home improvement example) should be able to pass sitewide ranking to individual pages. Sites where any monkey can post content shouldn’t have the same benefit.

     I know people use article sites to help promote their blogs, but that doesn’t make sense. If there’s a blogger that is an authority on the subject, they shouldn’t have to piggyback on article sites. Apparently these article sites and sites like Digg become cliquey and people help promote each other, not because of genuine interest.

    Some of the stuff may not be easy to create algorithms for, but if this was easy, you wouldn’t deserve to make billions of dollars a year 🙂

  60. Transparency to webmasters. Communicate better via email, webmaster tools, etc.

  61. Hey Matt. These wordpress scraper sites are getting pretty annoying… some are even ripping my links out of my own RSS feed and pointing them to their own URLs… even showing up on Google News when we don’t.

     I would like to see you guys make a definite stance on the DP network – do you count the links or not? lol – Happy New Year

    PS – is the algorithm recognising and rewarding Trade Mark citations yet? That would be great! 😉

  62. Spam Blogs – Blog search results could be a LOT better. I get a lot of scraped RSS feeds and other sorts of auto generated or re-posted garbage when I do blog searches.

    Wikipedia Spam: I’m sick of Wikipedia showing up at the top for everything. It should be on the page somewhere but at the top for almost everything??? There is no reason why you folks can’t penalize Wikipedia so they don’t show up above the #4 spot. Save the top 3 spots for others and help to create some variety.

    Thanks!

  63. I think Google does a very good job already. I don’t see many spam sites now like I used to. What is happening, though, is that vast sums are being spent by the ‘big players’ on buying links. The number one player in my niche, travel, has number one rankings for thousands of keywords. I set up a Google Alert several months ago for their domain name and nearly every day I get an email telling me where they have obtained links. It’s obvious these are not natural. Their site is similar to mine and they are the same age, but because I don’t buy links I don’t get the rankings.

  64. Howdy Matt, long time reader, first time commenter here.

     I’d like to see Google drop spammy pages from its listings. If I get a call from an unknown number, I Google the number. I cannot tell you how many times I get pages with lists of 100’s of phone numbers on them with no information other than where the area code and maybe exchange is. It normally takes me 5-6 clicks to find a site with actual information. Those pages with 100’s of phone numbers are clearly SE spam.

     One thing I’d like to see is a toolbar tool that alerts users to spammed links. If, for instance, I click on a blog and one of the comments is a bunch of viagra links, it’d be pretty cool if the toolbar could just strike through the links, so it takes more of the incentive out of spamming for spammers.

  65. Google has a tool to check if a site map has no errors
    Google has a tool to check if a web site has crawler issues
    Google has a tool to check if there are no duplicate issues (like titles and descriptions)
     W3C has a tool to check if a website is HTML (browsers) compliant
    W3C has a tool to check if a CSS file is compliant
    There are tools to achieve a better load performance
    etc…
    All tools in favor of the ‘searcher’
     Why not give webmasters a tool to check if a website (or page) is compliant with Google’s guidelines? (which, according to Google, is in favor of the ‘searcher’)

    Ciao!

  66. I agree with Hawaii SEO. I’m tired of seeing Wikipedia promoted as if it’s an authoritative site. You once said Google felt it was okay to promote questionable content like Wikipedia articles because average users don’t know any better.

    But who is Google fooling when stub articles are promoted to the top of search results? Why not at least filter the Wikipedia stubs?

  67. Agree with the first comment: the cloaked answers pollute the value of the Google index.

     What I would like to see consideration of is the primacy of some link sites, particularly social voting sites, in gaining lead positions in results. For example, hitting a result from Digg that has 1 vote (from the submitter) above the original content itself; it creates a bad experience and pollutes the index with non-results. How you’d do this I don’t know, but content originators shouldn’t be losing out to link farms, even if they have a social voting function like Digg.

  68. Hi Matt,

    I kind of look at spam from two angles … as a user, and as a marketer.

    As a user, I agree with many of the other commenters. There’s nothing more frustrating than searching for something like a product user’s guide, and having to wade through a page or two of sites trying to sell me a product I already own.

     On the flip side, as a marketer, if I build a site that’s monetized with, for example, Amazon or eBay products, as long as my descriptions and text are clearly targeting “sales”, why should my site ever be deindexed for “lack of original content”?

    I’m not trying to attract search users looking for information. I’m looking for people who search for something like “best price on ___”.

     Sites like those mentioned in my first paragraph, that try to trick search users, deserve to be penalized. But if a site is straightforward about its purpose, even if it’s a “thin affiliate” site, total deindexing of the domain seems pretty harsh.

    Thanks,
    Todd

  69. Dave (originial)

     1) Google to take a top-down approach on spam, i.e. ban the site of the “professional SEO” when they use Web spam to boost their clients’ site(s). That SHOULD include the “professional SEO” who informs their client of the risks.

    2) Stop pussy-footing around with Clayton penalties and actually BAN sites for a min of 6 Months, longer for repeat offenders.

    3) For Google to STOP sponsoring Conferences where ANY sort of Web spam is taught. It sends a mixed message to site owners.

    4) Stop showing TBPR on the Google toolbar.

    5) Maintain a list of “SEO professionals” caught spamming via email and Web spam.

    6) Have an email where Webmasters can forward spam emails they never solicited from “SEO professionals”.

     7) Don’t accept any old site for AdSense. Be selective.

     8) REALLY crack down on those selling PR in the form of links.

    9) Google KNOWS the SEO forums that promote and condone Web spam already.

    “Evil prevails when good people do nothing” can be changed to Blackhat prevails when Google does little to discourage it.

  70. “[…] what would you like to see Google’s webspam team tackle in 2009?”

     – Better detection of link networks and link exchanges. Perhaps a form in the WMT.
     – Better detection of hidden CSS spam (see the sketch after this list). Just ignore hidden text in your ranking algo, so everyone can use it for stylish pseudo-Web 2.0 sites to activate and deactivate tabs the user will never click on. If the content is good, the site owner will not hide it, to be sure the user can see it.
     – Better detection of paid links. There is still a big market out there.
     – No SERPs of other search engines, bookmark sites and other trash in your SERPs. If a user uses Google to search for something, he does not want results from other search engines where he has to dig through the results again. That makes no sense.
     – Use Chrome (sending variable user agents) to render websites and see them exactly as a user does. Why? Many sites are designed for standard resolutions and viewports with a nice layout which will direct the user with graphics or short descriptions to the target. But if you scroll down you’ll often find text made only for Google, with links that have no mouseover and the same text colour as the background, so that the user doesn’t see them – in case he discovers the text at all. Conclusion: the site owner does not want the user to discover this miserable, poor text. In most cases the user goes where they want them to go, if the nice layout at the top is good enough, and the scrap at the bottom was made just for Google – it’s just spam. If you need examples, contact me.
     – Use social media sites to detect spam networks. Further reading: http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/2317/2063
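
     (For the hidden-CSS point above, a minimal sketch of the sort of check involved: flag text whose computed style hides it or paints it in the background colour. The style dictionaries here stand in for a real CSS engine.)

     def looks_hidden(style):
         # 'style' is a dict of computed CSS properties for a block of text.
         if style.get("display") == "none" or style.get("visibility") == "hidden":
             return True
         # Text painted in the background colour is invisible to readers
         # but still readable by a crawler.
         color, background = style.get("color"), style.get("background-color")
         return color is not None and color == background

     print(looks_hidden({"color": "#fff", "background-color": "#fff"}))  # True
     print(looks_hidden({"display": "none"}))                            # True
     print(looks_hidden({"color": "#000", "background-color": "#fff"}))  # False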

    greetz
    Robert

  71. My suggestions as a blogger, on what I would expect from Google:

     1) I get a lot of link-buying offers, though I have placed a warning on my contact page that I will flag link-buying offers to Google’s webspam team. If I could really flag those messages to Google, I think it would be beneficial for Google in spotting those secret link buyers.

     2) Be transparent in penalties. If you could communicate the reason for any penalties imposed on a site through the webmaster console, webmasters could rectify the mistakes if they occurred due to ignorance.

    3) Sites found with viruses: Same as suggestion #2.

    4) Better customer service: At least a one-sentence reply that “Google will look into the matter and get back to me” if I report anything to Google especially reconsideration requests if any.

    My suggestions as a web surfer:
    1) Spam local maps.

     2) Still better search results. For example, just now I searched for bakers cyst synonyms and the first result that I got was from a respected site which has these terms in the title tag but whose content says “No entries found.”

    3) In this era of Google, the relevance of directories does not exist for searching anything. So no results from directories of any sort, even your pet DMOZ.

    4) I’m still getting doorway pages or link pages whatever you call it for searches I make.

    5) I tried to compile a list of some companies that are into one particular business and while searching for some company names, I got fed up with “aboutus.com” coming at top of the results page instead of a particular company’s web site.

    6) Personalizing my search: I will be much pleased if I have an option to exclude some sites appearing in my search results for some information that I’m searching for. Eg., Wikipedia, answers.com.

     Still, I may come back again to put in here all those things that I may have forgotten to write in one go.

  72. Dave (originial)

     6) Personalizing my search: I will be much pleased if I have an option to exclude some sites appearing in my search results for some information that I’m searching for. Eg., Wikipedia, answers.com.

     Google has allowed this for a few months now. Just log on to your Google account.

  73. Dave (originial)

    6) Personalizing my search: I will be much pleased if I have an option to exclude some sites appearing in my search results for some information that I’m searching for. Eg., Wikipedia, answers.com.

     Google has allowed this for a few months now. Just log on to your Google account.

  74. Duplicate websites owned by one company monopolizing the search results in certain business areas. Unfortunately, it is often just the business competitors who realize what the one company is doing, but this practice does damage not just to competitors, but also to customers.

    And, definitely what Andrew said about spammy product review sites.

  75. I would like Google Spam and Google in general to coordinate to make Google always be an example of using nofollow properly. For instance, I’ve read that http://www.google.com/checkout/m.html may or may not pass pagerank, and that some google corporate blogs may or may not pass pagerank to customers of google. While of course google internally could be reporting those links, it would be helpful if they were explicitly nofollowed in the html so that webmasters can use google itself as an example of how to use nofollow properly.

    Thanks for your blog, and thanks for the chance to make suggestions.
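
     (A small sketch of how a webmaster might check which links in a chunk of HTML carry rel="nofollow", using only the Python standard library; it is an illustration, not a statement about how any particular Google page is marked up.)

     from html.parser import HTMLParser

     class NofollowChecker(HTMLParser):
         # Collect every <a href> and whether its rel attribute contains "nofollow".
         def __init__(self):
             super().__init__()
             self.links = []

         def handle_starttag(self, tag, attrs):
             if tag == "a":
                 attrs = dict(attrs)
                 if attrs.get("href"):
                     rel = (attrs.get("rel") or "").lower()
                     self.links.append((attrs["href"], "nofollow" in rel))

     checker = NofollowChecker()
     checker.feed('<a href="/a" rel="nofollow">x</a> <a href="/b">y</a>')
     print(checker.links)  # [('/a', True), ('/b', False)]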

  76. 1. remove pay-for-the-answer links from search results (e.g. experts exchange)

    2. aboutus.com useless.

    3. notification when I log into any of my google services if there’s a problem with one of the websites in my webmasters tools.

    4. The ability to declare which sections of a web page should be indexed.

    5. Easier to report problems, such as obvious spam sites.

  77. Matt,

     Social media spam is increasing day by day and has been showing up all around the SERPs for almost all keyword phrases. This does not help in any way in improving the user experience. I would suggest removing all of these from Google SERPs.

  78. I agree with others here that the edu spammers seem to be increasing on Google, as do the wordpress scrapers. So, hopefully something can be done about those before it gets out of hand.

     The main thing I would like to see tackled, though, is the saturation of sites paying for links listing high up on Google – the ones that obviously pay for links on tens or hundreds of sites, all with the same anchor text, then surprise surprise, they list number one for that anchor text phrase. When you look at the sites with the paid-for links, it never looks like a natural link, rarely part of a sentence or paragraph, but simply within a list of generic links, all paid for.

  79. Hi Matt,

     Just like Todd Mintz said above, local search seems to be a cesspool of spam results. The results aren’t all necessarily spam results, but they are mostly dominated by useless directory listings and VRE sites, not business websites.

     I know keywords in the URL don’t hold as much weight in Google as they used to, but I think putting more weight towards geo-targeted domains (domains that contain the business location) would clean up local search results a lot. E.g. “lasvegascarpetcleaning.com”.

    Could maybe also put more weight towards domains with 90 – 100% match to the search phrase. E.g. “lasvegascarpetcleaning.com” and “carpetcleaninginlasvegas.com” should rank high for the search phrase “Las Vegas Carpet Cleaners”, rather than directory listings from higher authority directories.

     Directory listings suck because it feels like you’re searching and then clicking on more search results from another SE. Wasted, time-consuming navigation.

     I know meta tags no longer hold any weight when it comes to ranking websites due to meta tag spam abuse in the early days. I do think geo meta tags are being overlooked when it comes to ranking sites for local search. I think they would be quite valuable to Google when it comes to laser-targeted local search.

     This is an example of a geo meta tag (for people that don’t know what I’m talking about):

     [The example tag markup (one tag for Las Vegas, one for North Las Vegas) was stripped by the comment form; see the tags reposted in the next comment.]

     As you can see, the geo position, which is the latitude and longitude, is different, so a webmaster can laser-target a certain area. Obviously the more decimal places, the more targeted to a specific location.

     Obviously you would want a website that has a geo position targeting the area of North Las Vegas to rank higher than one targeting Las Vegas if you are searching “Accommodation in North Las Vegas”.

     Just a few ideas and suggestions; let me know what you think, Matt.

    Cheers

    Aaron Hayward

  80. Last try, then I’ll give up

     <meta name="geo.region" content="US-NV" />
     <meta name="geo.placename" content="Las Vegas" />
     <meta name="geo.position" content="36.227104;-115.246582" />
     <meta name="ICBM" content="36.227104, -115.246582" />

  81. Dave (originial)

    1. remove pay-for-the-answer links from search results (e.g. experts exchange)

     I second that, and would add that sites where you MUST sign up/in to see what Googlebot saw should be removed too.

  82. http://www.1degree.com/process.asp

     Take a look at this system. I have reported it to the webspam team and yet… nothing. Cloaked content, cloaked links, etc.

  83. 1. Stop those websites/blogs which have not been updated in a long time from appearing in the SERPs.
     Eg: Say a blog with its last entry in 2006 appears on top in the SERPs for a particular keyword. This limits the chances of those who have new blogs (after completion of the sandbox) with regular updates. Maybe Google gives a lot of priority to the domain age or existence period, but what’s the use of showing a blog in the SERPs when it has not been updated in the last 2 years? I feel that in such a case, a reader is deprived of updated and fresh information.

     2. Please reconsider Wikipedia listings. I am tired of seeing Wikipedia appear for almost every keyword that I enter. Even if it is an authority site mainly giving information, as a user it is not necessarily the case that I am always looking for information. I might be considering purchasing a product and want only those websites that are selling products.

  84. Glad I’m not the only one that thinks aboutus.org is useless. 🙂

    experts exchange ticks me off too. If you go to the page from google, you have to scroll dozens of times to get to the content.

  85. @Dave (Original)

     Personalizing my search: You may be talking about removing/voting/hiding the sites after the search has been done, but I want an option even before that, a query parameter to exclude some sites that I do not want to see in my search results right on my first query itself. With the currently available personalization, you can vote/hide a site for one particular search, but for another search that site may pop up again in my results.

     Neither Google search preferences nor Advanced Search has options for this! Maybe I’m ignorant! Can you help, Dave (Original)?

    Original?????!!!!!

  86. Deal with the out of control attorney spam in Google Local. I see attorneys creating fake businesses, fake addresses and phone numbers and multiple domain farms to dominate the 1-10 packs.

    The Maps Group never responds to my spam reports. If I could only have 10 minutes on the phone with Maps Jen, you could instantly know who these people are and how they are doing it!

     Google Maps destroys the natural language based businesses by letting spammy, AdWords-looking sites take over with its easily manipulated algo.

  87. Regarding my mention on bakers cyst synonyms:

    I forgot to mention that earlier I tried the option with the tilde sign, ~bakers cyst, but still didn’t have satisfactory results.

    It has got many synonyms like: Posterior herniation of the knee joint, popliteal cyst, synovial cyst, peritendinitis serosa, myxoid cyst, ganglion, gastrocnemius-semimembranosus bursa, semimembranosus bursa etc.

  88. Duplicate content checks.

    It’s not just spam sites that can copy good content. I’ve had two prominent travel companies copy and paste my text directly into their sites and then rank higher!

    It would be quite a challenge to figure out this kind of duplicate content and it borders on copyright issues but it directly affects google rankings and that’s wrong.

  89. Devaluing links created through spammy link-building campaigns and social media manipulation

  90. Dave (originial)

     Raj, I was thinking that after I submitted it. As far as I know, there is no option to remove a whole domain from search results and never be bugged by them again. It would be a good addition and I would use it to remove:

    ExpertsExchange.com and Answers.com to name 2 from many.

  91. Arbitrage…

    I hate this so much, I hate to sound negative about Google but I think they allow it only because it brings in so much money.

     Doing a search for something such as BRANDNAME bathroom, you come up with a PPC link to a page of search results, which is Overture listings for “Bathroom” – as far as the user journey goes this is disgusting, taking me back 3 steps in my searching…

     I know that the click in Google cost them 10p and then the next click I could have made in the Overture listings would have earnt them 50p, so I do understand the business model for them; the problem is that it effectively devalues ALL listings. I will happily provide a list of companies who actively engage in this!

  92. Hey Matt,

    Posted my wish here 🙂

  93. I tend to review some obscure video game titles as well as the major ones on my site, and annoyingly, despite my site often being one of the few places on the whole web with a review of a particular game, it’s the sites with “game pages” that wind up at the top of the SERPs. These are holding pages for particular titles with keywords like “reviews” littered about them, despite the fact that they haven’t reviewed that game yet and often never do. We all know which sites do this so I won’t mention any names.

    And yes, kudos to them for creating such an SEO friendly website, very clever, but not helpful for the end user looking for an actual review of Game X.

  94. Build a better product search. If someone wants to buy a product, he or she will go to Amazon to do the search instead of Google. I think there is a lot of improvement Google can make to its product-oriented search so that it becomes as credible as its regular search and people will do their holiday shopping through it.

     It could be another revenue stream for Google too, which I think would dwarf what Google is making now.

  95. frederic Java said:

     “Second suggestion: when there are penalties, the webmaster should know it from Google.”

    I would like to see this. More often than not, when I have experienced a penalty, it’s been through a duplicate-content issue due to my inferior programming when I was coding my own CMS’.

  96. Link exchange in the footer… it’s very easy 😉

  97. One webspam issue I’ve seen from time to time is multiple domains registered by the same company, but hosting largely the same content. I’ve done searches where 7 or 8 of the top 10 results have been from the same company, e.g. bluewidgets.com, bluewidgets.net, greatbluewidgets.com, bluewidgetsforyou.com etc. The content on the pages often has only minor variations, and is pretty much the same across domain variants.

    Don’t know if these could be filtered based entirely on content similarity, or whether other info would be required, but in cases where the multiple sites show largely the same content, I’d like to see just one or two examples, and leave space for other sites/domains relevant to the search.

  98. Hi Matt,

    I already asked you this on Google Moderator; I got no answer though, since most people didn't like the question, but seemingly it was solved. The question was:

    “Take this query: h t t p : / / http://www.go o gle.com/search?hl=en=investigation@fdic.gov The query’s first 6 results are forged 404s, not good for the user. Can something be done about this? While others are posting useful content, those are like spam: useless”
    [i killed the url (at least tried) cos it seems i was too lazy to check it before publishing it on Moderator, made a huge mistake and the link is kinda dead]

    In that particular query there are no forged 404 pages now, but I still find them sometimes amongst the results of IT related queries. I really find them annoying, I think they are more annoying than the ton of spam I get every day.
    The explanation is that if I search for something, let's say "Adam Lasnik blog", I don't want to get 176 results where every result tells me that they know absolutely nothing about Adam's blog; I want to get more info about how I can learn more about Adam and his blog.
    Oddly enough, even though Google states *somewhere* in a guideline clearly that a 404 page should return the HTTP 404 code, these all return a clean 200 OK, and their meta descriptions are also misleading.
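
    Here is a quick sketch of the kind of check I mean, using Python's standard library; the URL is just a placeholder for a page that should not exist:

    # Quick sketch of a "soft 404" check: fetch a URL that should not exist
    # and see whether the server correctly returns a 404, or lies with 200 OK.
    # example.com is only a placeholder domain for illustration.
    import urllib.request
    import urllib.error

    def status_for(url):
        try:
            with urllib.request.urlopen(url) as resp:
                return resp.status  # 2xx/3xx responses land here
        except urllib.error.HTTPError as e:
            return e.code           # 4xx/5xx responses raise HTTPError

    if __name__ == "__main__":
        code = status_for("http://example.com/this-page-should-not-exist")
        if code == 200:
            print("Soft 404: the page claims success for content that does not exist")
        else:
            print(f"Server returned {code}, as it should")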

    Anyway, I don't want to suck up, that's really not me, but Google is by far the best SE on the net. Probably this is why so many conspiracies appear claiming that it's evil 😉

    That’s all,
    Have a great 2009

  99. I think it would be great if the webspam team gave us a page where we could check the status of a website. One of our clients has a website that ranks on all Google indexes but not on the Google USA one. Nowhere to be found. As a market leader in their field it's weird, and no one we've spoken to has been able to figure it out. If we had a page where we could enter a URL and find out whether Google suspects or has sandboxed a site, and the reasons behind it, it would be really helpful. This would also be a great tool to find out if a site is legit or not.

  100. Well, this question is definitely going to stir up strong emotions.
    As Google increasingly becomes the dominant force in search, obviously each flaw shows up.
    I have a list, but more than anything I would like more transparency.
    1. When PageRank has been updated, can you hint at how this will affect us? You don't necessarily need to give us the update made, but to know how it will benefit us would be good.
    2. When I search I am normally looking for current information; can the results be ordered by date and relevance?
    3. I would love Google to have less dependency on the number of links and more consideration for the site itself: the content, internal linking, and whether it is well optimised and accessible for everyone.
    4. I HATE the excessive use of the nofollow tag. I completely understand why it was created, but I do not think that it has done the job it was created for.
    I would like to see webmasters having more control over who they link to and not just nofollow everything. I especially think this is ridiculous when you see certain sites where every internal link is a nofollow link.
    I am all for sculpting your PageRank, but if this continues then the whole internet will be nofollow; where will that leave us?
    I am not just complaining, I would like to see a number of different tags created, each with a different weight. Webmasters are central to this in controlling who they link to and what comments they publish.
    a. comments tag – you know this is a comment but they are participating on the blog, it has been published so must be relevant to the topic.
    b. a recommended link tag (research or source article), again weighted for relevancy.
    c. advertiser tag, a paid advert on the site is meant for the users of the site and not really the link juice, tagging it as such omits any doubts over its presence.
    d. internal link tag, we want google to see this information as part of the site.

    I will be interested to see if Google does consider content to be the most important factor in 2009.

    Thanks and good luck

    Emma

  101. 1. Misleading titles
    2. Link exchange in any form

  102. 85 comments are far too many, so this point may have already been mentioned: SUPPORT!

    The guidelines are OK. But Google must be more open and more supportive with webmasters (GWT feedback). I admire Google for their work, but as I see it, along with many of my webmaster friends, this is their weak side.

  103. What I dislike most about Google's organic results at the moment is the overabundance of automatic 'compare and buy' sites. It's running rampant, at least here in the Netherlands.

    Googling for ‘productX review’ fills my top 10 with 6 compare-sites which all have 0 (zero, none, nil) reviews of the product. They only have an automatically generated page about the product, telling me absolutely nothing.

    Since I'm looking for a review, the first 5/6, sometimes 7, are useless to me (and to anyone else using that search query).

    And the best part: there seems to be a new site like that every month. They're filled to the brim with advertising, affiliate links and empty product pages stuffed with specifications from feeds. Not a single review.

    Added value? As it is now, none. Get the real reviews up (heck, I write them myself; I'd like my review, which took multiple hours, to rank above an automatically generated empty page).

  104. I’m in a fairly niche area, but there are still sites ranking well that only have links as part of a link exchange system.

    One particularly successful example has a “resources” link at the bottom of each page which leads to a link directory. This contains hundreds of links to various companies in the broad sector that the website is about (construction) but very few about the specific sector (basement remodelling). I assume that these pages either link back directly or indirectly via other sites involved in the scheme. Whatever the case, it looks fishy, and appears to be the only reason the site is ranking well (content is OK, but far from the most informative site in the subject area)

    The site actively encourages more reciprocal links: "To include your website here please email us with your Site URL, Title and Description. Be sure to add our link first as per the following details:"

    The reason I point this out is that I think that, although Google results have got better in 2008, Google still needs to keep working on the basics – i.e. blatant link exchange schemes. Despite this I think you’re doing an excellent and difficult job in reducing spam.

  105. Ohh, and one more thing about eliminating webspam: randomize the PageRank updates. There are far too many sites that optimize their links for the public PageRank toolbar updates. This could be eliminated by doing more frequent and more randomly timed PR updates.

  106. @ Devendra
    @ Matt

    Made for AdSense (MFA) sites are of two types. One type just copies content from article directories and pastes it onto a site, and sometimes those sites still rank, which is a bug. The other type of MFA site provides valuable content even though it is made for AdSense; it's not right to flag those, since everyone here is in it for profit, so what's the problem with using long-tail keywords for AdSense?

    Devendra, you cannot claim that for long-tail keywords you will find only MFA sites. It is true AdSense sites are more common there, but organic SEO also uses long-tail keywords to get hits from search engines.

  107. Not your department but it would be nice if Google Custom Search Engine could use more than 2000 annotations per file, say 10000, so one could keep blocking the crapsites themselves.

  108. When searching using google.co.uk it seems that the national daily newspaper sites end up dominating the results. They just seem to have way too much dominance. An example search is a celeb on a UK reality show right now.

    top: news results about reality tv show, fair enough
    #1 wiki
    #2 imdb
    #3 dailymail article about her body Sep 2008
    #4 dailymail article about her getting married, from March 2008
    image results
    video results
    #5 telegraph Jun 2008 story about her baby
    #6 fan site
    #7 fan site
    #8 guardian Apr 2008 interview

    I know this set of results was different earlier in the week and didn't include all those newspaper articles.

    I've heard you chat and explain universal search and such, and I understand that pages can come and go depending on how popular a topic is – what I don't understand is how Google feels these articles are more relevant than what was there before.

  109. Hi Matt

    For me, I would like to know what I am doing wrong so that I can fix it.

    My main area is not SEO and I have done a bit of research, to no avail.

    Happy new year; keep blogging, it helps.

  110. Hi Matt!!!

    Chrome for Mac would be a start 🙂

  111. First of all, I think Google SERP customizable results are a good way to avoid spam: if I think a site doesn't answer my query I can remove it. Cool option, but I hope this doesn't influence the Google algo in any way, because it could become an anti-spam measure that works as a great weapon for spammers too. I also think that Google Local hasn't got a great level of control; I see many results for the same business. Google will have to try to remove dupe results even if the domain/address changes; the same business cannot have multiple listings and dominate all of the first five results.

  112. Agree with the product spam. I've recently bought a new digital camera and wanted to find a number of reviews to compare.
    More than 50% of the search results had useless content: links to other empty pages, "no review available" messages, or even automatically built fake reviews using generic phrases glued together.

  113. Peter van der Graaf

    Review the (domain) authority part of your algorithm!

    Age and masses of links have nothing to do with authority or trustworthiness.

  114. I'm going to agree with a lot of people on here.

    Answers pages you need to sign in for.
    and
    Websites that leech comments off others.

    Happy new year all!

  115. I totally agree with the penalty notification in the webmaster tools. Webmasters need to know if a penalty has been imposed upon his/her site. Currently this is just a guessing game.

    I play by all of the google rules and have a problem with my home page. The page is indexed. Unique quoted phrases from the page do not show up in serps for some reason.

    I’ve been trying to figure out if I’m penalized or if there is a google error and there is just no way to find out..

    More information on directory submissions would also be great. Half the world says they are wonderful and the other half says they are a waste of time and nothing but spam.

  116. I'd like the web spam team to address the issue of internet theft: identify websites whose entire content is copied from other sites and first send them notice, then [if the issue is not addressed] drop their rank and cancel their AdSense account (if any).

  117. Detect hCards and use them when available. (I mean, really do it, not just pretend to do it.)
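
    For illustration, a tiny Python/BeautifulSoup sketch of what detecting an hCard could look like; the HTML snippet is invented, and only the class names (vcard, fn, adr) come from the hCard microformat convention:

    # Toy example of spotting an hCard on a page with BeautifulSoup (third-party bs4).
    from bs4 import BeautifulSoup

    html = """
    <div class="vcard">
      <span class="fn">Example Pizza Co.</span>
      <span class="adr">123 Main St, Smalltown, USA</span>
    </div>
    """

    soup = BeautifulSoup(html, "html.parser")
    for card in soup.find_all(class_="vcard"):
        name = card.find(class_="fn")
        addr = card.find(class_="adr")
        print(name.get_text(strip=True) if name else "no name",
              "|",
              addr.get_text(strip=True) if addr else "no address")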

  118. What about a rule book? I know there are the general guidelines, but wouldn't it be a good idea to have a (non-exhaustive) list of things which are definite no's? You could also use this to improve the spam reporting in webmaster tools (if that is working well and not producing lots of rubbish when people misuse it), as people could identify a single practice. From what I can gather, search-engine-type results on pages are banned, yet I know of one large site which has come from nowhere in one year to large traffic numbers and is doing just that. Maybe it's OK because Google makes money from AdSense? Or is the problem that you don't want to get specific, because people will use it to justify any practice you don't put on the list?

    Some guidance would be great as I guess a lot of people (from conversations with others and looking at the comments in this thread) feel a little bit like, ‘I play by the rules, but see loads of people cheating and getting away with it…’

  119. One big thing stands out in my mind:

    If Google bans a site from their search index for extreme spamming of the general public and/or attempted manipulation of search results, why are they always allowed to keep making money via the AdSense program?

    Every spam site that I run across that seems to be penalized by Google (ie – I can’t find them in the search results when searching for their domain name), 95% of the time they are still running Adsense ads. This to me motivates them to figure out a new way to scam their way back into SERPs. I would like to see them dropped from Adsense as well.

    It is my understanding that the AdSense terms and conditions prohibit spam sites from running ads, so why are your spam team's penalties not enforced by the AdSense guys a few cubes over?

    I think this would help reduce the amount of “return” spam that comes back from the dead.

  120. [quote]
    1. remove pay-for-the-answer links from search results (e.g. experts exchange)

    I second that and would add that site where you MUST sign-up/in to see what googlebot saw should be removed too.
    [/quote]

    I THIRD this one. Nothing is more frustrating than stumbling across a page that I think has what I am looking for only to find out I have to pay for the answer… when I am not even sure they have the correct answer!

    Kind of a scam if you ask me. But I do think this is not an easy one to pull the trigger on… very grey area.

  121. I'm going to harp on about the login/signup pages. If I can't view it with a single click from the SERPs then don't bother showing it to me. Search is supposed to be a quick answer to a question, not a maze of questions/requirements.

    Drop the Toolbar PR altogether. The whole link-buying market is fueled by it. I imagine that as you drop it, the need to impress with a PR(insert number here) will also decrease.

    This is a tough one: increase the size of the description in the SERPs. I know this was tried a while back, but also allow a webmaster to specify that description, kind of like adding an ngsnip (no Google Snippet). This would encourage webmasters to write better descriptions. The tough bit here, though, would be ensuring that Googlebot can confirm that the description is indeed relevant to the page in question. This would offer searchers more information in the SERPs and hopefully increase relevant click-through rates. Hopefully the searcher would also be able to sort the garble from real content before clicking on the link.

  122. My biggest wish from Google would be Chrome for Mac (hacking Firefox to act like Chrome is frustrating), but since you are asking about web spam, I think “x product reviews” is probably the only time I feel I cannot find a valid result from Google. Any other search I do (even when searching programming errors) usually leads me somewhere helpful. Thanks to Google, even though I did not go to college I can keep learning. I have used Google (and good reference books that I used Google to find) to teach myself PHP and Ruby, and most recently C and Objective-C. I never could have done this without using Google. Thanks!

  123. Richard Vaughan

    I wish Google would stop its love affair with Hot Frog. Half of their listings that I see pop up in the SERPs are pure AdSense pages.

  124. Either make Toolbar PageRank accurate or get rid of it altogether. Our customers obsess over it, yet it seems to bear no relation to the amount of traffic they get. PageRank goes down while traffic goes up. Makes no sense.

  125. Electronics is one area where webspam is out of control.

    I’ve been learning, with my son, how to salvage electronics parts and build things, and I’ve noticed that typing in the designation number of most electronic parts (Transistors, telephone ringers, etc.) will turn up an unending sea of spam pages. It would be nice to just be able to get the data sheet, rather than a dishonest-looking site that claims I can see the data sheet if I register (yeah right) or somebody in China who might make me an offer to sell me 100,000 of them at an unspecified price if I send him my email address.

  126. I think Google should seriously consider “Content VS Adspace” or “Pagination for the sake of advertising” – as these are serious User Experience problems.

    Thanks,.rb

  127. Ideas for 2009 ->
    1) DMOZ
    DMOZ should be given less or no value. I have some high quality educational sites like http://www.polishgrammar.com that took like a year to build with no ads and clean as a whistle and will not get into DMOZ, but others submit junk, things copied from lists and they are in.
    2) Raise the status of quality content from King to Super hero status.
    Make quality content even more important. I do not know how; maybe focus on natural language structure. I have seen many sites that seem to manufacture content, or very short articles that, if they are not recycled content, are recycled ideas (i.e. how many ways can you write about breaking up with your boyfriend or about weight loss?).
    3) Help webmasters be even cleaner by letting us know what we are doing wrong.
    Webmaster tools are great. Give us more tools. See most people want to be squeaky clean boy scouts and will do the work to create content that is organized and of value for others, but there will be more incentive if we know we are working in the right direction or something we do is not good. For example, I use wordpress and innocently created a lot of duplicate content, with a few plugins I was testing. Webmaster tools gave me a heads up about this so when my blog fell I could repair it.

  128. Aleluya!!!

    You have to reconsider Wikipedia's listings.
    Stop showing Google Maps for searches like "NY apartments".
    Reduce the number of AdWords ads on the SERPs; spammers can pay to appear in your results above the good results.
    Remove videos from results (if I want to see videos I can go to YouTube).
    Stop showing Google News in results. If I want to see news then I can go to Google News, CNN or any other news-related site.
    Stop putting Google stuff above the real results; I want a search engine for Web results, not for news, Froogle, blogs, videos and so on.
    Create an easy way to report stolen content.
    Stop edu / gov sites passing link-weight to other sites.
    Some new businesses have tons of great content but are stuck compared with old businesses.
    Create an easy way to report paid links.
    The Spanish market needs its own Matt Cutts.
    Create a spam detection API for WordPress or make a deal with Akismet; you need to do something about spam comments.
    Be more accurate about splogs on wordpress.org and blogspot.com.
    Classifieds sites: in Spain a lot of those sites (MFAs) have scripts for content generation that add no value to the user, but you (GOOGLE) consider those sites really important because they have millions of pages, and I'm talking about the top 10, so please take care of automated content generation.
    Remove directories that just copy information from YellowPages to create niche directories.
    Scraper sites that rank faster and higher.
    Do a better job detecting link networks.
    Check the spam on Google Local.
    Arbitrage… you don't know what it is? Just look at Ask.com in Spain.

    Matt, I know that “Google is an American public corporation, earning revenue from advertising related to its Internet search, e-mail, online mapping, office productivity, social networking, and video sharing services as well as selling advertising-free versions of the same technologies”.
    But remember that what we really love about Google is the search engine; Google is what it is today because of the way it showed results to users. Just remember that.

  129. Thanks everyone for weighing in on this. It’s definitely a lot to read through and think about.

  130. Several things for the Google Spam team:

    – distinguish between an affiliate and a publisher site. For example, I have a couple of sites and have affiliates for them. Some of my affiliates completely copy/paste the source code of my landing pages, creating duplicate content. I noticed that Google penalized me for it, but ranked the affiliate (that uses MY content) better than me. It's pretty easy, I guess, for Google to see what's original and what's copied…

    – crack down on those sites that have random strings of text with one specific keyword in them. For example, yesterday I googled "Yellow sn0w" (to unlock the iPhone) and the first 2 results were just complete spam. Look at these pages: http://freewebs.com/newyearsrose/yellow-sn0w.html and http://onice.awardspace.com/yellow-sn0w.html ). Obviously the pages are made to generate money from ads. Scroll down and look at the text. Total spam, and ranking for these keywords…

    Thank you

    Sebastien

  131. Just to bump the numbers as previously stated
    – Answer sites that make you register/sign in for the answer. I HATE them.
    – Links to pages that are search results for what you just searched for.

  132. Spammers sending out bots to find redirecting query strings and then scaling the successes out to send traffic/302s/links to their own spam site from a legitimate domain.

    eg if you have http://www.trustworthydomain.com/?linkurl=http://www.affiliatelink.com

    Then someone comes along and adds hundreds of links like…

    http://www.trustworthydomain.com/?linkurl=http://www.PillsPornCasinoLink.com

    Then they go to a 3rd party blogging system (blogger.com, wordpress, etc.) open a quick account and link to all of those spammy redirecting URLs from it. Or maybe they use those redirecting links in emails. Or whatever else…

    What can Google do about it? Notify us in Webmaster tools if you see an otherwise trustworthy site suddenly linking out to dozens or hundreds of spam sites. Anything else?

    What can the SEO do about it? Don’t use redirecting query strings like that first of all, but you can also disallow /?linkurl=* in your robots.txt file. That won’t solve the problem of your domain being used like this, but at least it will keep Google from thinking you’re trying to redirect to spam sites.
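
    A minimal sketch of what the "better script" approach could look like, in Python; the allow-listed hostnames are placeholders, not a recommendation of specific sites:

    # Only redirect to destinations you have explicitly allowed, so spammers
    # cannot point ?linkurl= at arbitrary sites. Hostnames here are placeholders.
    from urllib.parse import urlparse

    ALLOWED_HOSTS = {"www.affiliatelink.com", "partner.example.com"}

    def safe_redirect_target(linkurl, fallback="/"):
        """Return linkurl only if its host is on the allow-list, else a safe fallback."""
        host = urlparse(linkurl).netloc.lower()
        return linkurl if host in ALLOWED_HOSTS else fallback

    if __name__ == "__main__":
        print(safe_redirect_target("http://www.affiliatelink.com/deal"))        # allowed through
        print(safe_redirect_target("http://www.PillsPornCasinoLink.com/spam"))  # falls back to "/"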

  133. Remove social bookmarking services from the search results; they are not delivering content, only spamming the search results, and often rank higher than the sites they link to.
    Reduce the impact that links from countries like Russia have on European or American PageRank. Poor people are forced to create them for minimal payment.
    You could consider giving the first few results a line or even two more of text to display their content. Let your users decide what's spam or not; with more text shown it would be easier…

  134. Like so many, I've made mistakes along the way and have never knowingly broken the rules… but it does bother me when I see people spamming local search. I think this is wrong and should be penalised, possibly through lower PageRank.

    regards

    jemile

  135. Transparency and fair play for sites that are penalised.

    – why not give more (some!) information as to what is going on?
    – why not give warnings?
    – why not provide some mechanism for interaction?
    – why not let the community debate your guidelines?

    The collateral damage to penalised sites, whether that penalty is "wrong" or "right", can be huge and disproportionate to the alleged infringement(s).

    With the Significant Market Power that Google enjoys it should be more accountable, transparent and involve the community more.

    Matt, can I send you ideas I have around this? Is there a willingness within Google to improve on this difficult area? I have seen little from you on this since your 2006 posting.

    All the best

    Grahame

  136. Somewhat related:

    Make it clearer just what Google does with regard to URLs in JavaScript on a page. At one time I thought it was clear that these URLs were ignored. Then I read that if there is a full URL, Google may crawl it for discovery and indexing but it may not bear on PageRank. Now, even partial URLs are crawled by Googlebot. This can lead to a lot of 404s, requests for tracking gifs, etc. Do I now have to figure out a way to nofollow these URLs so that I don't get penalized if they are paid? Or make them all go through a directory that does a redirect and is protected by robots.txt – which only makes this stuff more complicated, less reliable and slower.

    Conversely, if another site includes a widget that points back to me, but that code is inserted into their site in an iframe and the content returns via ajax/etc., it would be invisible to Google and I get no link love as a result. From what I've read and observed, noscript tags are useless for PageRank. So what is the right answer? Anyway, some clarity here would be helpful for all of us. Sounds like a good blog post to me.

    Happy to help you with blogpost ideas at no extra charge 🙂

  137. Wow, the first comment points out a site, just like you asked us NOT to do. Other commenters should re-read the post too, because you're talking about webspam, not Chrome for Mac and so on.

    One of my biggest and ongoing gripes is that Google still values aged links from dead or bad practices. In 2005 Realtor-to-Realtor linking was shot down by Yahoo and later by Google, wherein we believe penalties existed for real estate agent websites when discovered by Google.

    To this day I still have agents competing with my own customers who are either actively participating in agent-to-agent linking or still riding out the value of all those hundreds or thousands of prohibited links, which I discourage my clients time and time again from using. They themselves often question whether they should listen to the advice about those links being devalued or worthless because hey… the competitors are doing it or are still gaining from it!

    After all, why does anyone care to find a Tampa Realtor when on a San Diego real estate website after searching "San Diego real estate"? This linking strategy is long dead for us, but many are still benefiting from it.

  138. There are just too many websites which provide none of their own content; they are just simple RSS aggregators, designed (better than the original sites, search-engine-wise) to draw crowds and serve a lot of ads.

    They add no value to the net and should not be rewarded with their position in Google.

  139. Eric,

    You can’t expect Google to solve all your problems. If you’re using an easily exploitable script when there are more secure options you can’t expect them to fix it for you. That’s just a waste of resources.

    Either use a better script to do your redirects that can’t be exploited, or use Google Analytics to track outgoing clicks.

  140. Ok, one more category of sites that shouldn’t be listed….

    Sites that distribute Press Releases. Companies use these sites to gain link juice and push other sites down the SERPs.

    These are nothing but paid link/paid content sites and they routinely dominate search engines.

    These sites are pretty obscure. Not many people go to them other than through search engines, mainly Google. So they are basically charging people to get listed on Google.

  141. As someone that seems to spend half my time fighting comment spam — I'd love to see Google provide some mechanism to really punish these guys.

    ie: The ability to report spam comments and the destination URLs.

  142. I think it would be great to re-validate the ranking of most of the news/social syndication websites. Most of them just live off stealing content and bother me a lot. Try to rank original sources better than aggregators. A matter of fairness… and a good algorithm.

    Happy new year.

  143. How about jumping on false advertising claims?

    Often I will look for a niche suppliers directory or something, looking for exposure, and some of the figures that so-called reputable websites put out just seem to be plucked from thin air. So if a website says "advertise here, we get 20,000 visitors per month" when in reality they probably receive only 1,000 or so, and they are using Google Analytics or something so you actually know the figures are inflated, then these pages/sites should be punished too.

    I realise that this is as much a moral area as anything, and I don't really know how viable it is to crack down on, but nevertheless I think it's a disgrace.

    Regards and wishing everyone a fruitful 2009

  144. My wish:
    On a search result page, a domain (incl. subdomains) should appear only once.
    It should not be possible for one domain to occupy the first 5 search results.
    http://www.google.ch/search?hl=de&q=search
    (ok, google has already improved, it was earlier more than 5 results for this search 😉 but there are also other examples like this)

  145. Hi Matt

    Would really appreciate it if Google could offer a way of eliminating directory listings from SERPS – perhaps “search term”-directory? Eg I was looking for catteries local to me, the first two pages were directories, but I didn’t want the yellow pages, I wanted a cattery.

    I’d also echo the comments about review sites that don’t actually contain a review of the product being searched.

    Regards

  146. please please stop showing http://www.experts-exchange.com site in results

    they are useless

  147. For all of you going on about experts-exchange. Google is indexing them because the answers do show on the page. Scroll down, past the first set of ‘answers’. Past what looks like a massive footer. Keep going and hey presto, there are all the answers, in plain text.

    Have to admit that I didn’t notice that the first time either and was equally annoyed 🙂

  148. Hey Matt,
    Is this within your team's policy against SE spamming? My point is that one of your blog entries ranks on every page of the Google SERPs from 10 to 18. I did capture some…
    http://cuocthiseo.vietseoguy.com/seo-contest/cuocthiseo-seo-contest-googles-bias-towards-matt-cutts/
    Really interesting huh? 🙂

  149. Sites like http://www.experts-exchange.com have excellent SEO, and when I search for some problem this site appears but doesn't tell me anything…

    Regards

  150. The latest trend seems to be these statistics sites that I mentioned previously.

    They create urls and pages even for domains which they have little or no data for. What little data they have is usually wrong. They also don’t accept requests to remove domains.

    If a site doesn't provide real information and doesn't accept requests for removal, it should be penalized.

    All they are doing is creating billions of automated pages to increase their sites’ rankings using domain names they don’t own and that are copyrighted in many cases.

    These are the same techniques thin affiliates and spammers use, and it's a shame Google allows these sites to continually appear on the first page of SERPs.

    The link: directive doesn't always work (maybe due to nofollow?), so I frequently do a search for just plain 'domainname.com' to find out who is linking to that site and what they have to say. I usually have to go to the second page of results to see anything useful.

  151. Sam,

    That's pretty close to, if not exactly, link cloaking. I know that after they started doing that I stopped checking their site, and then one day I did a search for the term and found it all the way at the bottom.

    It’s deceptive and violates the spirit of what Google is trying to do.

  152. The webspam I’d like Google to tackle are three-way link schemes. For many mom-and-pop ecommerce sites, exchanging local and relevant links are their best shot at getting off the ground. It’s never going to make them tremendously powerful and even in the worst light, they’re at least open trades. But three-way schemes are deceptive by design. I’m sure it’s on your list, but it’s getting out of control.

    By the same token, I wish you’d dial back the gray-barring of internal pages. It looks like you’re trying to devalue SPAMmy pages and links pages, but you’re catching a lot of innocents, too. I’ve had important navigation menu pages fall victim to this, which really sinks whole sections of sites into oblivion.

  153. It seems in this day and age content is king – I think some people are using too many words on the page, making it less relevant. They're not technically keyword spamming, but after 20 bolded 'toys for sale' we get the point. I would like to see Google "award" pages that are updated regularly and pages that have good content that isn't over-optimized. Maybe some visually aesthetic points. I know I'm not talking SEO jargon here or really directing this to the correct spam forum. This comment is more for algorithm purposes. I just wish everyone a good 2009.

  154. DIRECTORY SPAM and NON-SPAM

    The Web has become bloated with huge directory websites that re-package links (spam and non-spam links).

    I would humbly suggest that Google move ALL directory websites under
    a separate Google search category section, much like is currently done for Images – Video and so on…
    The end result would look like this:
    Web – Images – Video – News – Maps – Mail – Directories

    This way Google can separate out the (directory clutter) from its
    general results.

    Cheers,
    Rick Vidallon

  155. Cleaning up the spam in the insurance vertical.

    Every other month in Google the spam comes back and then gets penalized, and then it comes back, and then it is penalized again. I understand freshness, but when you see results focused on linking one week, then on thin content the next week, then on something else after that – it makes it difficult for webmasters and users alike.
    Also, the trends seem to follow an annual calendar: spam is there from January to April, then penalized through June, then content prevails through September, etc…

    David

  156. Dave (originial)

    For all of you going on about experts-exchange. Google is indexing them because the answers do show on the page. Scroll down, past the first set of ‘answers’. Past what looks like a massive footer. Keep going and hey presto, there are all the answers, in plain text.

    That is NOT true. You MUST sign up to their seven day trial scam to see answers/replies.

  157. Dave (originial)

    Ok, I see what you mean. However, this is VERY deceitful at best. There is already more than enough deceit on the Web; one less site won't be missed. Google should drop their results until they stop trying to trick email addresses out of Google users.

  158. Lots of people work really hard to make good content… then it gets stolen and those dogs cash in on it.

    I would like to see “Content Registry” where I can submit my content first and mark it as mine. Maybe something in webmaster tools?

  159. I still find that a number of sites that do not have very good content, nor much original content, but do have high PageRank come up above sites that have excellent, relevant content. I understand that content is not the only factor (external links, trust, etc.), but it seems to me that the sheer number of links to a particular site too often trumps excellent content.

  160. I agree with EGOL, I love the concept of a Content Registry. Perhaps Google should buy Copyscape or develop a system of their own. All duplicate content, including the kind where they scrape the first few lines of your articles, should be punished!

  161. Google needs to influence the web for good instead of evil…

    For example, by rewarding positive behaviour such as:
    – 100% correct HTML
    – meta tags for specifying locality
    – meta tags for classifying pages – blogs/forum/product info/privacy policy etc, taxonomy
    – using google analytics

    Positive behaviours should be verifiable via webmaster central, with a small but tangible boost in search position as a reward.

    Negative behaviours that can be automatically detected such as too many links, keyword stuffing, duplicate content etc should have explicit penalties viewable in webmaster central – to provide explicit “reward” for avoiding such criteria. I’m not talking serious black-hat activity – just common tactics that are unhelpful to google.

    The rewards and penalties do not have to be large – just not hidden – for them to have an effect on the behaviour and actions of web masters. And not all penalties need to be published – just minor but common occurrences that hinder Google’s ability to properly rank search results.

    For too long, Google has been (indirectly) sponsoring web spam by creating the environment where web spam can flourish. Or to put it another way, spam targets Google because Google is the biggest.

    Turn the focus of webmasters away from gaining page rank or positions, to creating better web pages – by rewarding good behaviour, and punishing minor bad behaviour, with explicitly measurable rewards and penalties. Webmaster central has started to do this with things like duplicate title tag detection, but this needs to be expanded, with the consequences made more explicit.

  162. For all of you going on about experts-exchange. Google is indexing them because the answers do show on the page. Scroll down, past the first set of ‘answers’. Past what looks like a massive footer. Keep going and hey presto, there are all the answers, in plain text.

    Sam, you must be new here. Otherwise, you would have realized that your answer was far too intelligent to be understood by the masses.

    People can also use the Google cache to see any answer (if they do what you said, of course). Or if they’d rather, use the Yahoo! cache. This works every time experts-exchange is part of a SERP. So there are not one, but three answers to the so-called experts-exchange problem. Yeah, they’re annoying and not immediately obvious, but it doesn’t take all that much to figure it out.

    Now, is EE an authoritative resource? Should it be considered one of the premier sites on the Internet? Absolutely not. But it can be occasionally useful (and by occasionally, I personally find it useful about once every 3-4 weeks), and the data can be obtained without requiring a membership.

    Note: I’m not a member of EE, nor do I work there. I just don’t want to see something I find useful every so often removed from SERPs because a few members of the emotionalogical crowd would rather complain about a problem that for all practical intents and purposes doesn’t exist than take five minutes to find a solution.

  163. Lots of people work really hard to make good content… then it gets stolen and those dogs cash in on it.

    I would like to see “Content Registry” where I can submit my content first and mark it as mine. Maybe something in webmaster tools?

    This is a really good idea, and would be stunningly simple to implement. Allow webmasters to submit an XML POST request whenever new content is created, we submit our stuff as it’s generated, you mark down that we got there first (assuming we did) and only show that result in SERPs. Easy, breezy, beautiful Cover Girl. You could even go one step further and require some form of random key generation or verification for this task specifically.

    Think of all the article sites you’d take down in one shot with that move!

    Big up for that suggestion, EGOL.
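
    Purely as a sketch of the webmaster side of the idea, a claim could be as simple as hashing the new article and POSTing the fingerprint. The endpoint below is hypothetical (no such Google service exists), and JSON is used here instead of XML just for brevity:

    # Hypothetical content-registry claim: hash the article text and POST the
    # fingerprint plus a timestamp. REGISTRY_URL and the payload are invented.
    import hashlib
    import json
    import time
    import urllib.request

    REGISTRY_URL = "https://registry.example.com/claims"  # made-up endpoint

    def claim_content(page_url, text):
        payload = {
            "url": page_url,
            "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
            "claimed_at": int(time.time()),
        }
        req = urllib.request.Request(
            REGISTRY_URL,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            return resp.status

    # claim_content("https://example.com/my-article", "full article text here")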

  164. Probably outside of the scope of the Web Spam team, but it would be great to see Duplicate Content Search and Agent Rank developed and launched.

  165. Thanks for asking us!

    Focus on GOOGLE Local please. I see so many instances of multiple listings for the same company, location, phone, etc. It’s amazing that you haven’t dealt with this already.

    Knock domain name value down a peg or two PLEASE! The value and importance you place on exact keyword phrase .com’s and .net’s is way too much and there are so many poor websites and spam sites that get absolute top listings just for buying a good domain name. This also makes it more difficult for legitimate companies to buy domains from domain squatters asking tons of money for these domains.

  166. Google Guidelines are indicative but can be a bit vague too. I had a travel website which was stripped of its PageRank years ago and removed from the index. It was ranking well for a lot of keywords before that. Through ignorance I must have done something wrong.

    I read the guidelines a million times, removed anything that could have caused the problem and submitted a re-inclusion request a few months after this happened. No one from Google ever assisted me, nor could any professional help solve this issue!

    After an incredible 4 YEARS (or a bit more, since I had lost hope to be honest), my website is back on Google and ranking fairly well, and this happened after…

    MATT CUTTS wrote on this forum that ‘some old penalties are expiring’.

    Whether it was an automated ban or not I will never know. What I know is that Google is not transparent and many honest webmasters are wrongfully tagged and penalized.

    Recently Google also suggested that our competitors can somehow get us in trouble. I’m in favour of cutting down on spam (like everyone I guess), but one has to be very careful.

    I ask MR. CUTTS, would you be happy to spend 4 years in PRISON and not know why? Guess that sums it all up.

  167. Separate blogs, forums, video sites and "ordinary" websites from each other and make separate searches for each. Exclude these from the other searches. By creating these types of verticals, searches could be made a lot easier.

  168. 1) Agree that more transparency (especially when it comes to looming or actual penalization) is always welcome, and Google Webmaster Tools is a great vehicle for offering it. Webmaster Tools improved a lot in 2008 by the way – thank you for that.

    2) Agree that pages that are searches for something should not appear in searches for that term – especially grave when a page that is a zero-result search for something appears in the SERP. Kelkoo and Kijiji sites are the most high-profile "offenders" I have noticed.

  169. I absolutely agree with Adam. Some kind of “Content Registry” in the webmaster tools would be a great help to fight copycats…

  170. Local business listings & map spam need serious attention.

    The whole process of adding and verifying a listing is shocking in my opinion. I’ve tried and tried to get our listings resolved but through black magic, they continue to show the wrong information which is costing us a whole swag of money.

    Spammers being able to submit duplicate items into the local business listings, with no way to rapidly have action taken against them, also isn't good enough.

    If I go to the effort of submitting valid, high quality information into the local business centre and have it verified – I shouldn’t have to suffer when there is map spam bleeding my business of its business.

  171. I'd like to see the shopping results sorted out. I find it infuriating that they are mostly expired eBay auctions, out-of-stock items or just non-existent. I don't know if those results are gamed or just sub-Google-standard, but now that they appear above genuine products and sites on most SERPs I find it misleading and a waste of time, both as a site owner and as a consumer.

    The other thing that bugs me is that so many people (in my SERP world anyway) have exact copies of their sites linking back to their primary domain name. Google filters some of the pages but not all (cache shows as primary domain and not secondary for most but not all pages). This seems to give a considerable boost to these sites ranking.

    Happy new year Matt!

  172. Hey Matt.

    In my opinion I would be so glad if Google focused A LOT more on bounce rate (or some other kind of signal for punishing sites that people visit and leave within 4 seconds), so we can get the cloaking/manipulating bastards punished.

    Ty.

  173. Roony McDononny

    There are a gazillion Made for AdSense sites that end up ripping off all legitimate users. Instead of playing police and looking to penalize sites competing with you, go after these fraudsters.

  174. I think one of the biggest issues is that there is not enough reporting from Google if something dodgy is found on your site. Not everybody can be an expert at SEO, so if you employ an SEO person to do the work for you, it's near impossible to tell if they have done something which can cause a penalty or lower rankings.

    It seems Google likes to be secretive on this, which is odd. I know very little about SEO; we had a website that ranked very well, but we employed an SEO consultant to do some work, and a few months later we lost all our page 1 rankings.

    Because I know very little, I then had to employ another SEO company to find out what was wrong, all of this took 6 months, and in the end we found out that the previous company had subscribed us to some sort of automatic link farm thing. It would have been much easier if there had been an alert to this in my webmaster tools, instead of just knocking my site off the rankings without any notification. You cannot expect everybody who owns a website to be experts in SEO.

    1: If you hit the back button and the site does not let you out, it should have a penalty of -1,000,000,000 at least. Lots of sites still do that.
    2: Old sites with 20,000+ links are still dominating niche SERPs; new white hats have no chance regardless of content, because age and links still weigh more than actual content. You started cracking down on paid links, so please finish it. And mean it.

  176. Nothing new to say but would like to weigh in with:

    1) Local results – they appear to be so easy to spam. You can see the same company five times, with slightly different listings, in a set of 10 local results.

    2) A particular paid-links annoyance: sidebar and footer links on website home pages with some PageRank scream "paid" when you look at them, but can still work wonders in Google's search results. Same thing with some networks of sites that don't look spammy but actually just link to each other.

  177. By Multi-Worded Adam: “you mark down that we got there first (assuming we did) “

    Actually the owner might not need to get there first. If he claims it, the same filtering actions could be applied to existing copies already indexed. It would be preemptive DMCA – but the person filing for ownership would be required to swear a strong oath that the content is really his.

  178. Oh Yeah! Matt one more simple wish. Please keep news results in Google news, or at least down to one or max 2 items.
    I do read Google news daily, I don’t need news results in my regular searches.
    thanks!

  179. A method to eliminate or flag backlinks to your site that you do not want Google to count. I thought this feature was being added to Webmaster Central, but I sure can’t find it.

    And I 2nd Grahame Davies comment.

  180. Happy New Year !

    Of course I realize that what I am about to say may bring the walls of the social media version of Jericho crashing down on my head! 🙂
    BUT onward I go. I think there needs to be a great divide recognized when it comes to business inquiries in search for a product or service. I would respectfully submit that if a potential customer is looking, they do not want to wade through the YouTube vids, the self-proclaimed blogging experts and the overall raft of canned social media entries on the subject. They are looking to source a physical provider of the service/product. If Mr. Googlebot could somehow give a choice (click here for social media listings on your query) but just give the legitimate providers the search result listings, that would be awesome.

    You folks have contributed to the ability of small and medium businesses to compete and survive globally and regionally-keep up the great work.

    Thank you for reading.
    Regards
    Muskoka

  181. Google actually doing something about sites that have been reported as being spammers.

    Over the years I’ve reported a lot of sites and today, those spammy doorway pages are still live.

    Maybe a trusted group of people who can report and investigate spam?

    My two cents…

  182. The biggest problems with webspam could be eliminated by attacking their various linking schemes that usually empower their webpages. Most of the spam I encounter has little if any useful content of its own and has to be powered up externally with all kinds of different link building strategies. If you ridded yourselves of rewarding link schemes that artificially empower mediocre or non-existent content I imagine most webspam would be pretty powerless.

  183. Wow, the knives are out in this thread. Pray no one names your website here. Someone might arbitrarily decide to nuke you from the Google index because a few haters call it out.

    I for one have found helpful answers on experts-exchange. Too dumb to scroll? Man up.

  184. I’d like to see:

    –blog networks dealt with (links links links)
    –blog giveaways (where links count as an entry)
    –blog badge links (I was featured on…)
    –sites requiring login to view content (should they really be at the top of the serps?)

  185. I would really like an “investigation log” to be able to access case updates when someone reports paid links or spammy tactics. I don’t know what’s happening with my submissions and it drives me crazy. For example, I’ve submitted thegeneral.com multiple times and have no idea what’s going on with them except that they’re still buying links and ranking page 1 for auto insurance.

  186. I echo the sentiments about experts-exchange.com, users are not served well by clicking on a number one experts-exchange result only to find out that the content is not accessible.

    Also, sites like ripoffreport.com rank well, yet the owner is in hiding to avoid being served process and the servers are offshore to prevent prosecution under US law. Bottom line: if a website does not provide a reliable agent of service and takes such drastic steps to avoid being bound to US law, it should not be rewarded with Google authority in the SERPs.

  187. Yes, multiple sites by the same business has become common practice in a number of industries. They just think it is how you play the game — if two sites come up, you have twice as much chance to get clicked. They see their competitors doing it and think they must. They don’t even know they are spamming. I’ve had a couple clients talk about needing a bunch of similar websites. To be fair they need outreach…. Or I suppose some major penalties will get the word out fast. I’ve seen this in attorney websites and the printing industry. I’m sure it happens elsewhere too.

  188. Keep on doing the great job you are doing already…

    I think Google should look into link exchange tricks where webmasters develop 2-3 websites on the same subject, or the situation where they end up with one-way links although those links have actually been exchanged.

    Good Luck!

  189. I have two real pet peeves about Google search results. For the most part, Google does a good job, but there are two areas that I’d like to see major improvement on:

    1) The age factor needs tweaking – It’s frustrating to be looking for information and to find a load of web pages that are older and outdated appearing higher in the SERPs than newer pages with more recent and relevant information. I understand the purpose for the age factor and don’t disagree with it, but there must be some way to filter out older non-relevant and outdated results.

    2) TLD discrimination – This one I just don’t understand. In a global economy, any website in any country should be able to rank highly for keywords. I have purchased domain names with country-specific TLDs just to test them against .coms and .nets and they always underperform, even when I use the same tactics. Even when searching Google.TLD, those websites don’t rank. The dot coms, dot nets, and dot orgs always outdo them. I think there should be a level playing field across all TLDs, including country-specific TLDs.

  190. My idea is to consider the time users spend on a particular website as one of the parameters for ranking.

    Used in the right way, this could help promote websites with content that users find interesting and useful enough to spend time on.

  191. I think Google Maps/Local spam needs to be cleaned up. Many companies with fake mailing addresses and/or non-legitimate businesses create very spammy profiles. This hurts the user experience for sure…that is my 1 cent :o)

  192. Info for Matt Cutts

    Can something be done about all the "web hosting review" websites that come up in the SERPs? It would be cool if Google could limit how many actually show up on a page (no more than 2). Why do we need more than 1 or 2 of them on the 1st page in Google? Most of the time I'm searching for a web host and not some website with 10-15 affiliate links on it.

    Give me the real companies! *please*

  193. Google should individually check sites/blogs for content. They should read it and decide whether that specific website is just used for spamming or not. Other sites are just posting irrelevant and repeated content for the purpose of increasing their AdSense earnings. Google should be keener to check whether the links that point to a specific website are natural or not, have quality or no quality, etc. etc. 🙂

  194. Matt,

    I wonder if the fact that these results are almost identical to last year’s is sending any kind of message?

    Definition of spam: stuff people don’t want.

    1. Local search spam. You want to find a local restaurant with a website you can check out – but all you get on page 1 is trash directory listings. It’s spam – or low-quality search results. Where is the line drawn?

    Same as last year.

    2. Price comparison site spam. More or less the same as #1, but for anything that can possibly be bought.

    Same as last year.

    3. Cloaked paid content spam – your search takes you to a login page / paid content / subscription site. It’s spam. Not on page 1 please.

    Same as last year.

    4. Oh, and also, of course, please do something about the terrible, awful, appalling webmaster site reinclusion process.

    Same as last year actually.

    Chris Price, Kent, UK

  195. Todd Mintz said it perfectly – LOCAL!

    And please add a way, either through the LBC or via Webmaster Tools, to report mapspam. Right now, the only way is through a publicly viewable report in the Google Maps Group or a publicly viewable edit to the listing – both of which display the User ID of the reporter and open him/her up to swift payback from each spammer 'outed'.

  196. Dave (originial)

    This is a really good idea, and would be stunningly simple to implement.

    Also “stunningly simple” to abuse.

  197. Dave (originial)

    I bet it would also be a legal minefield that Google will prudently NOT volunteer to walk through.

  198. wow 188 comments.. Matt: Do you really read them all ??

  199. Google Algorithm Addition

    If Page contains “****” “****” “****”
    = Googlebot no index + delete page from Google index

    99.999% accurate
    0.0001% false positives

    eg.
    3,460,000 for “poker” “insurance” “cialis”
    2,360,000 for “poker” “insurance” “levitra”
    2,520,000 for “poker” “insurance” “phentermine”
    2,080,000 for “mortgage” “poker” “phentermine”
    1,150,000 for “slots” “insurance” “phentermine”
    1,750,000 for “casino” “porn” “levitra”
    1,800,000 for “mortgage” “poker” “levitra”
    105,000 for “casino” “insurance” “phentermine”
    106,000 for “casino” “sex” “levitra”
    110,000 for “hold em” “f##k” “cialis”
    12,200 for “roulette” “sex zoo” “cialis”
    134,000 for “roulette” “c##t” “cialis”
    176,000 for “mortgage” “insurance” “levitra”
    337,000 for “roulette” “pu##y” “cialis”
    350,000 for “holdem” “f##k” “levitra”
    608,000 for “roulette” “porn” “cialis”
    71,700 for “mortgage” “casino” “phentermine”
    84,300 for “poker” “Animals s#x” “cialis”
    840,000 for “slots” “insurance” “levitra”
    957,000 for “casino” “f##k” “levitra”

    If Page contains site:.edu “****” “****” “****”
    = Googlebot no index + delete page from Google index
    99.999% accurate
    0.0001% false positives
    eg
    1,020 for “mortgage” ” insurance” “levitra” site:.edu
    2,420 for “casino” “porn” “levitra” site:.edu
    555 for “poker” “porn” “phentermine” site:.edu
    833 for “poker” “incest” “phentermine” site:.edu
    877 for “casino” “incest” “phentermine” site:.edu

    Make sure these types of pages DON’T get spidered
    & are NOT in the index
    = No incentive for spammers to produce them!!!

    10 yrs & the Google spam team can’t write this simple code into its algo?????
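
    A rough sketch, in Python, of the keyword-combination check this comment describes – the term groups and the “all three verticals present” rule are illustrative assumptions, not anything Google has said it uses:

    GAMBLING = {"poker", "casino", "roulette", "slots", "holdem"}
    FINANCE = {"mortgage", "insurance"}
    PHARMA = {"cialis", "levitra", "phentermine"}

    def looks_like_keyword_spam(page_text):
        # Flag a page that mixes gambling, finance and pharma vocabulary.
        words = set(page_text.lower().split())
        groups_hit = sum(bool(words & group) for group in (GAMBLING, FINANCE, PHARMA))
        return groups_hit >= 3  # unrelated verticals stuffed onto one page

    # A page like this would be flagged; a normal insurance page would not.
    print(looks_like_keyword_spam("cheap mortgage rates poker bonus buy cialis online"))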

  200. Actually the owner might not need to get there first. If he claims it, the same filtering actions could be applied to existing copies already indexed. It would be preemptive DMCA – but the person filing for ownership would be required to swear a strong oath that the content is really his.

    You put this way better than I did, but that’s pretty much the thought I had. The one thing I would add to it is that there needs to be an avenue for counterclaim of copyright to be made (in the event that some poor schlub gets his/her content stolen).

    The only issue I’ve had with DMCA as a whole is that it’s relatively unenforceable on a global scale (Vietnam comes immediately to mind here). But on the other hand, Google acts as its own police so I don’t personally see this as an issue here. (And no, that’s not a criticism of Google…they act as their own police because they have to.)

    Joel: Matt doesn’t need to read them all. He just has to skip over those that have hidden and/or obvious personal-only agendas and focus on those that actually have some merit (e.g. EGOL’s idea…thanks to the guy who gave me credit for it, but it’s not mine. I just expanded on it.)

    Steve: smart call, dude. If one can’t operate a scroll bar to the point where one can look to the bottom of a cached page, then how can one request that EE be globally censored? (Funny how many people fall into that boat too, despite the Google cache loophole being in place since about 2001).

  201. It seems to me that some sites that are very large (have hundreds of pages) get better rankings on certain keywords, even though those keywords are only 1% of the site’s content. It just seems like the sites get way too much SEO juice from being large.

  202. I’m sick of gibberish sites that appear highly in Google’s results for text and names taken from my articles. Google Alerts almost every single day include these “sites”. Is Google really incapable of identifying them?

  203. Dave (originial)

    wow 188 comments..

    Just goes to show that those with Websites still see Webspam as an issue with Google. I.e. We ALL have agendas, some are just willing to admit it, while some others are sneaky and conniving about it. The prudent people know that “personal agendas” don’t always mean a bad idea. I.e. NO business should wear their Heart on their sleeve, like some people do.

    ExpertSexChange are deceitful; the ONLY reason they have the visible replies right down at the bottom IS for Googlebot, not for those who find their pages via Google. Hence the *big visible banner above the fold* (and the replies that are closed for viewing) about starting a 7 day trial to see the answers, which is a lie. I would never trust EE with my email address due to their deceit.

    IF your own pages are being outranked by stolen-content pages, you have bigger problems than putting the onus on Google. Google already do a fantastic job of ranking the original above the stolen content. I doubt Google would ever enter this legal minefield, despite the ignorance of some thinking it’s “stunningly simple”. That, and the fact it’s just too easy to abuse.

  204. I would like to see scrapers that rip off original content and social nets that display a blurb not rank above (or replace) the author’s post. In one case this year, I had the scraper credited, and later the Digg blurb replaced it, while the original content was ignored.

  205. Hello Matt. First time to comment here, although I enjoy reading your blog. With all these comments here I am not sure if this has already been mentioned, but it is what popped up in my mind while reading your post at the beginning.

    – I would like to see the quality of the ads raised a notch. AdSense is nothing new, and if many ads just point to useless pages without interesting content, I am afraid that users/visitors will one day simply ignore them.

    – I have Google Alerts turned on and it is one of the things I read every day. While there are many useful “alerts” that I receive, there are some that require further filtering – maybe filtered the same way Google filters results on the SERPs.

    Thanks. Will take this opp to wish you and your team a happy new year.

  206. I know that technically it’s a hard one, because even the domain name authorities are struggling to police it. But I’d like to see some way to block sites that have false domain name ownership details…. I know of one guy who has a load of websites and just about every single one of them is under a false name and address.

    Perhaps an alternative is to have some way of boosting websites with authenticated domain name ownership details….?

    This should help cut down on the fraudsters out there!

  207. Wow, too many to read to see if anyone has mentioned this. I want Google to certify “trusted content providers.” Someone, some team, at Google needs to look at popular web sites and blogs that are copied frequently, verify the original source, and certify them. Then, as other sites and pages are spidered, if the Googlebot finds trusted content being duplicated, the offending site’s page(s) should be removed entirely from the index.

    Likewise, the newsgroups need to be indexed and ANY, and I mean ANY, site that duplicates newsgroup posts needs to be eliminated from the index. It’s extremely frustrating to find hundreds of hits on an error message I’m searching for and discover that every single one of them is from a web site duplicating a newsgroup post about the error and that none of them have an answer.

  208. Remove warez-scraper-spam sites that hawk viruses.

  209. Penalize websites that promote SEO contests which build massive spam backlinks that increase SERPs for competitive keywords. Almost all ask that participants create a page with a topic that relates to their homepage and then have them link back with very popular targeted keywords. This is a problem that Google should address before it gets out of control.

  210. Make spam reporting much easier, with a simple feedback form on a Web page that’s available to anybody, so that when spam e-mail arrives from the likes of fedexdispatchcenter@gmail.com or free.porn.super@gmail.com or toptek4@gmail.com, and when spam mail uses links like http://docs.google.com/View?docid=dhdcnntg_0hpbk5hfc … they can be dealt with quickly. And start accepting messages from spam reporting services like SpamCop, instead of forcing the service to return an ‘ISP doesn’t want to hear about this’ type of message, which is what happens at present.

  211. Google should consider a testing area to enable webmasters to scan for possible infringements – not a scan that would give a score, but guidance. It could report a missing sitemap, meta tag problems and HTML errors. It could also warn the person if the site has possible infringements such as cloaked or stuffed words, informing them of the consequences if they continue. Maybe the scan could also help towards the spider’s work?

    I know hundreds of small business owners who do not have a clue about proper site design; having this in Webmaster Tools would be a good start to improve the content of the Google experience.

    For companies like us we want to see better quality websites out there, education is the key and warnings would aid this.

  212. Hi Matt,

    I’d like to see better enforcement of Google’s Webmaster Guidelines. While my company and I follow and respect the guidelines, I still see the same sites we compete against rewarded year after year for blatantly violating them (one personal injury firm that has been #1 in Boston for years has thousands of inbound links from a college in New York?). My clients keep asking why we can’t do that too if Google isn’t doing anything about it.

    I realize it’s tough to police so many sites but I’m hoping you can figure out a way to police these “Black Hat” artists in 2009.

    Thanks for this forum, I enjoy it!

    Steve

  213. Also, I forgot to say that if a site has violated any rules, a cease-and-desist warning should be emailed to the webmaster giving them 7 days to alter the site.

  214. 1) One of the best spam and noise-on-page removers is the Adblock Plus plugin for the Firefox browser. When you visit a page on a website that contains ads (image, Flash or JavaScript based), it’s very easy for me to remove the ad or the ad server for good.

    I suggest you make a similar plugin – Google SpamBlock – for your Google Toolbar, and for your new Chrome browser, where users can easily mark and block pages or entire websites they do not want to appear in their search results.

    You do not have to do the extra filtering server-side; you can let the plugin do it in the browser. E.g. AVG Internet Security has a browser plugin that checks all links that appear in Google search results. If any of the links have been reported or are suspected to be harmful, I am warned by AVG. I appreciate this plugin and I do not experience any delays when I search using Google.

    Collect the block-data and tell webmasters in Google Webmaster Tools how many users dislike their website.

    Allow your users to subscribe to other users’ lists of websites they block.

    However, it is important that users decide for themselves and actively choose whether they want to use this feature, and whether they want to use other users’ lists of blocked websites.

    2) I strongly support EGOL’s idea of a “Content Registry” service in Google Webmaster Tools; this would be a good way to fight duplicate content and “borrowed” content.

    3) I also support the suggestions about detailed warnings for webmasters in Google Webmaster Tools. Not all webmasters know if they are doing something that is against Google’s guidelines – why not warn them and give them a chance to fix it?

    4) Fighting spam is not just about trying to filter out bad content; it is also about allowing webmasters to tell Google if there are any areas of a page they do not want you to index. Today a webmaster can prevent an entire page from being indexed using a rule in robots.txt or the noindex META tag. Why not allow them to prevent an area of a page from being indexed? Thus I suggest Google support such a section-level tag (see the sketch below).

    /Grosen Friis
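
    A minimal sketch of the section-level “do not index” idea in point 4, assuming a hypothetical pair of comment markers (no such standard exists; the marker names are made up here). An indexer honoring it could simply strip the marked regions before indexing:

    import re

    # The <!--noindex--> ... <!--/noindex--> markers are hypothetical.
    MARKED_SECTION = re.compile(r"<!--noindex-->.*?<!--/noindex-->", re.DOTALL)

    def indexable_text(html):
        # Drop any regions the site owner marked as not-to-be-indexed.
        return MARKED_SECTION.sub("", html)

    page = "<p>Index me.</p><!--noindex--><p>Boilerplate, skip.</p><!--/noindex-->"
    print(indexable_text(page))  # -> "<p>Index me.</p>"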

  215. Rank sites that use keyword stuffing lower. Everyone says that it doesn’t matter any more but I am continually outranked by sites that use this method when I use relevant and accurate keywords.

  216. Stop page scraping/hijacking! The websites I work so hard on get scraped by spammers all the time. These scrapers make it look like I have duplicate pages everywhere. I’ve lost page rank a few times because of them. HELP!

  217. DMOZ is a joke. Please please eliminate it from the algorithm.

    Getting a new site onto DMOZ is so arbitrary and nearly impossible to investigate. If a category does not have an editor, you may be out of luck. If the editor is not actively doing their job, you’re out of luck. And if that editor is not legit (they signed up simply to promote their own site or sites), then you’re out of luck.

  218. I agree with Spamhound and Rosenstand. I have sent several spam reports via GWT, but it doesn’t seem to have made any difference in the SERPs.

  219. I want an option on the search results so that I can click something that says this was relevant to my search or this was not relevant to my search and then the webspam team could go in and look at the results and act on them.

  220. Quite simply – better search results!

    That is to say, more relevant, better-targeted search results, as Google set out to deliver by becoming the best SE in the first place.

    I still find search results are often a load of old crud that is not really the sort of quality I would like to see, and they are too easily manipulated by clever little tricks and by those who do not care about the end user.

    I would like to see the very best search engine results that are of the utmost relevance and best quality possible.

    I was thinking about this recently and think I may have come up with a simple solution! Now, I say simple – the idea itself is pretty simple, but as someone who is only really into SEO and trying very hard to help provide the best search results, the coding and programming that would likely be involved is not so simple for me. Still, the idea would, I think, almost certainly help to improve the end-user experience as well as create more stickiness for a search engine, so people would tend to use it above others.

    So – an improvement in the results of searches is what I would like to see, much in line with Google’s original philosophy.

    On another note – any chance of an interview re SEO on my forum, please, Matt?

    If you Google “Best SEO Forum” you will find the forum in my link above. At present there is a block on various names being signed up (your own, Matt, and Sergey Brin’s, as well as others, and pretty much every top music band I could think of). The reason is that Pete Townshend on MySpace isn’t Pete Townshend of “The Who”, so I wanted to ensure the same thing does not happen on TVWorlds!

    Another thing I would like to see is websites like MySpace etc. giving people, bands and groups control over their own names even if someone else tries to impersonate them – but of course I realise that is probably not something Google could accomplish without buying out those sites and enforcing such a policy.

    So please ignore that last pipedream it is just a personal thing I dislike on the web.

    Here is an example that might make sense to you all at Google.

    You employed the Chef who worked for the Grateful Dead so let’s say some social network site had a member called TheGratefulDead but that member was not the band – I would hand that member name to the official The Grateful Dead and not allow impersonators.

    So maybe Google could, if not able to force other sites to do such a thing, perhaps start a site that identifies those that are real, so users of such sites would know when things are official and when things are either scams or impersonators. To my mind it’s a bit like phishing another site, and social network sites should tighten up on these things – or could Google perhaps find a way to improve that particular situation?

  221. Hi Matt,

    I see poor quality sites ranking higher than quality sites on major target keywords. I think Google needs to take into account the track record of a site, look at true authority in a lot more detail, and pay less attention to things that can be easily influenced, such as using one word in a title to gain weight, or anchor text boosted by a site conveniently having the main keyword in its name. Things like this mean crappy sites can rank higher than an established site of over 8 years! You need to try and address things like this.

  222. Dave (originial)

    IMO, domains are FAR too cheap and thus spammers view them as disposable. Blogs etc. are even worse, as they are free. Thus, they ARE disposable.

    When you pay little or nothing for something you ARE willing to take risks that you wouldn’t dream of taking if the same cost you a lot of hard earned money.

    The problems for SEs, in regards to webspam, will only get worse, and likely at a rate disproportionate to normal Web growth.

    It seems to me that the left hand of the SEs is fighting the right hand.

  223. Tackle automated forum-spam. Should be easy to do…

    The autobot-spam robots–even the human-mediated ones–use virtually identical text with identical links. As soon as you spot that the text is a block of repeated text-with-links on a forum or blog, discount every link down (preferably) to nothing.

    As soon as the spammers realise that they are paying for no result, they will stop, and guys–like me–with forums or blogs will breathe a sigh of relief, since we will be able to spend our time on new content, rather than wiping the filth off our forums all the time.
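
    A rough sketch of the repeated-block signal described here: fingerprint each normalized comment and discount links from any comment whose exact text shows up on many different sites. The threshold and data shapes are illustrative assumptions, not a known implementation:

    import hashlib
    from collections import defaultdict

    def fingerprint(comment_text):
        # Collapse whitespace and case so trivially tweaked copies still match.
        normalized = " ".join(comment_text.lower().split())
        return hashlib.sha1(normalized.encode("utf-8")).hexdigest()

    def spammy_fingerprints(comments_by_site, min_sites=5):
        # comments_by_site maps site -> list of comment texts seen there.
        sites_per_fp = defaultdict(set)
        for site, comments in comments_by_site.items():
            for text in comments:
                sites_per_fp[fingerprint(text)].add(site)
        return {fp for fp, sites in sites_per_fp.items() if len(sites) >= min_sites}

    # Links inside comments whose fingerprint lands in this set would then
    # be discounted down to nothing.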

  224. It really depends on what you consider spam.

    Some of the most annoying spam right now is major authority sites building a single page on everything they can think of and then pushing internal links at that page.

    This results in a very bad user experience. Searchers get presented with a page from one of the major newspapers or something like that, which is useless, while way below it on the front page are great results written by people who care. And yes, I think a lot of Wikipedia falls into this also.

    Let’s be serious, Matt: Google is in bed with all the large publishers – which means at some point in the future another company will start to produce better search results if it keeps going the way it is now.

  225. Dave (originial)

    It really depends on what you consider spam.

    In this case, anything outside the Google Webmaster guidelines.

    Lets be serious Matt Google is in bed with all the large publishers

    And your proof of that statement??? I thought not 🙂

  226. How About — CONSIDER “VOTES OF AUTHORITY” *BESIDES* JUST LINKS!!

    Disclosure: The website I work for has been listed dozens of times by major national newspapers as a source of information. Newspapers have referenced the site with quotes like:

    Source:
    but they don’t actually LINK to the site.

    If anything, this is one of the strongest votes of authority — an author saying a website was the REFERENCED SOURCE of their information. If an article simply links to a site, it might be a positive statement or a negative one; the article might have said “Never visit .com; their information lacks credibility.” Yet Google might make that site more relevant.

    I bet if Google started looking at the phrase:
    Source:
    And using that and similar contextual ways as a way of conveying authority (and didn’t announce that), it would help move the world away from all the webspamming/cheating/exploitation that’s out there, and do a better job of properly weighting authority based on authorial intent. (Plus, remember, a LOT of newspapers, website builders, home page owners and online grassroots communities out there still don’t know how to link, but they DO talk about and cite authoritative sources of information and websites they love).

  227. [quote]
    IMO, Domains are FAR to cheap and thus spammers view them as disposable. Blogs etc are even worse as they are free. Thus, they ARE disposable.[/quote]

    Dave (originial) is onto something: spammed results are more likely to use low and no-cost link sources. Perhaps it would be useful to look at the percentage of inbound links coming from free sources like blogs and forums, versus those links that come from high-cost domains (which may be more likely to be editorial). It would be worth researching, anyway.
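
    A minimal sketch of the ratio being proposed for research here – the list of free platforms, and treating a high ratio as a spam hint, are assumptions made only to illustrate the metric:

    from urllib.parse import urlparse

    FREE_PLATFORMS = ("blogspot.com", "wordpress.com", "livejournal.com")

    def free_link_ratio(inbound_link_urls):
        # Fraction of a site's inbound links hosted on free/no-cost platforms.
        if not inbound_link_urls:
            return 0.0
        free = sum(1 for url in inbound_link_urls
                   if urlparse(url).netloc.endswith(FREE_PLATFORMS))
        return free / len(inbound_link_urls)

    links = ["http://myblog.blogspot.com/post", "http://www.example-news.com/story"]
    print(free_link_ratio(links))  # -> 0.5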

  228. I’d like the value of Google Alerts to be improved to remove the spammy blogs at blogger. (these are the ones that usually have a serial number in the URL).

  229. Hey Matt,

    This suggestion might be for the web spam team or the Product Search team (assuming such a team exists).

    One thing I run into all the time is Product Search results that list an item at a very low price, but after you go to the product page, you see that the product is:

    * Out of stock
    * No longer available
    * Open-box/used/damaged/etc. (These results are only a problem when they are grouped together with the new products and aren’t clearly marked as open-box, used, damaged, etc.)

    I imagine most of these occurrences are unintentional, but I do see the potential for abuse. A site can “rank” at the top of the Product Search results (assuming a Google user has sorted the results by “Base price,” instead of the default, “Relevance”) by listing a specific product for an impossibly-low price, despite the fact that they may not even sell that item at all! Then, of course, the site completes the bait-and-switch by offering you some alternative product.

    (The rest of this is just background info, FYI.)

    The reason this issue is fresh on my mind is because just recently I used Product Search to find a CPU cooler for my new Intel i7 rig. I was shopping for the cooler literally the week that Intel released their new chips, and since these chips have a new socket configuration, the 775-based coolers don’t fit. So suddenly there was a huge demand for 1366-based coolers, and only a couple of manufacturers had a product on the market at that time. So in this case, I wasn’t worried about price–my only concern was finding the item in-stock.

    I selected a highly-rated (according to Product Search) retail site, which actually claimed to have 3 of the coolers in stock. I placed the order through Google Checkout, but then afterward… I was informed that the item was out of stock. So when all my new PC components showed up, I went ahead and built it, using Intel’s factory CPU cooler as a temporary solution until my “enthusiast” cooler arrived. I placed my order on November 21, 2008. The item shipped on January 9, 2009 and is estimated to arrive on January 15th! I realize that there’s no way for Google to actually verify whether or not a site has an item in stock, so in my specific case, there’s not much you can do (since the site actually submitted false information about their inventory). But I still think there could be some kind of system in place that removes the “product listings with fine print” from the legitimate listings. One possible idea is to allow sites to feed their real-time inventory data to Google Checkout, so that Product Search can offer a sorting option like “Only show me items that are in-stock.”

    If this issue doesn’t really fall under “web spam” and you want to forward it to the Product Search team, then that’s fine by me, but while you’re at it… can you please tell them to put back the “Sort by price (descending)” feature? The ascending/descending sort options disappeared a few weeks ago and were replaced by the “Base price” option, which always sorts by price, lowest-to-highest. This inexplicable change has delivered a massive blow to the usability of Product Search. I know the economy is hurting, but let’s not assume that I always want to see the cheapest products listed first. 😉

    Thanks, Matt!

    -Darren Slatten

  230. How about wiping out the 100s of spam directories that iEntry use to entice people to cough up an email address for submission so they can do some real spamming?

    http://incredibill.blogspot.com/2008/03/jayde-nichebot-crawls-for-ientrys-web.html

  231. Dave (originial)

    ientry.com = 404 page not found

    http://www.siteadvisor.com/sites/ientry.com = “We tested this site and didn’t find any significant problems.”

  232. Maybe you should learn to use a browser:

    http://www.ientry.com/ works just fine, it loaded.

    It does appear McAfee lifted the alert but it’s still tons of directory spam.

    However, it does look like their network of directories is offline tonight.

  233. Dave (originial)

    Maybe you should learn to be civil, rather than trying to score cheap points. The URL DID return a 404.

    It does appear McAfee lifted the alert but it’s still tons of directory spam

    However, it does look like their network of directories is offline tonight.

    Maybe you should check your facts before running your mouth off and posting FALSE information for the World to see.

    I give as good as I get (or better) BillyBigMouth (aka IncrediblyVAIN). 🙂

    BTW, nice-attempt-at-a-spammy-link-drop.htm

  234. I’d like to see a reduction of sites that require me to pay to see the answer I searched for. Often I search for a technical problem, like a PHP error, and some of the top results look like they have my answer, but the actual page has no info, just a request to sign up so I can reveal the answer.

  235. I would suggest that Google include an icon to flag harmful sites that can get users hacked. Google’s bots would need to gather more information, and the algorithm should be able to track such links on a site wisely.

    This could have webmasters of such sites coming back to Google over such an action, so it would all need to be well justified.

    This would be a great achievement in driving more traffic to Google search.

  236. I daresay someone has already requested this, so I apologise if I am duplicating –
    Most webmasters are only too happy to fix any issues that may be adversely affecting their site’s ranking with Google. More explicit information from Google Webmaster Tools would help enormously. The ‘content analysis’ tool has been very useful. Perhaps this could be extended to highlight aspects of a site that may be ‘improved’, without giving too much away regarding the Google algo.

  237. I’d like to see a paid express reconsideration request tool in webmaster tools.

    It should be priced high enough to make sure that the system is not overwhelmed with “I dropped one place in the serps do I have a penalty” queries.

    People using this paid service should receive a reply telling them what they need to do to remove the penalty.

    The reason I would like to see this is that a site I run received a penalty last April, I’ve made a load of changes, submitted 3 reinclusion requests and still have no clue what is causing the penalty.

    As I’ve lost a lot of money from the penalty, I would not mind paying a high price for a service like this and I’m sure there would be a high demand for this service.

  238. Hey folks, just a reminder to please be civil.

  239. Local search results spam – this is getting ridiculous. It also must be stuffing Google’s DB with loads of fake addresses etc., which will eventually make the data useless.

  240. Blogspot is a huge spam fest – making it harder to spam would be wonderful. Perhaps give users the option to switch to different CAPTCHA types or use things similar to Akismet or Bad Behavior.

    Also, on the other end of the scale, continuing to improve notification for when people are hacked etc and end up spamming unintentionally. Random sudden thought related to this – perhaps some form of authenticated RSS subscription that can easily be set up for all authenticated sites in a user’s webmaster console (perhaps to convey non-spam-related messages too).

    Also – continuing to reduce ability to place Google Ads on spam sites – still quite a bit of MFA cruft out there.

  241. By the way, the spam level in Google Groups is absolutely insane – perhaps better spam detection could be applied there too (though I understand it’s not too widely used any more).

    Map spam and spam in other relatively new product areas (such as video) is something to try and get on top of early on. I await with bated breath the first report of a book which has been written specifically with a mind to rank better in Google Book Search 🙂

  242. I’d like to see Google crack down on:
    1. MFA sites.
    2. Sites that block ia_archiver to avoid copyright infringement.
    3. Act on reports of sites that use multiple domains pointing to a main domain for the purpose of hogging the serps.
    4. Find a way to de-rank scraper sites.

    Go Google! After a decade, still my favorite search engine.

  243. Whatever you do, please include measures that would prevent a COMPETITOR from hurting another company’s standing. It is way too easy to do so now, despite what Google says.

  244. Thom, please keep in mind that there is no origin source for newsgroup postings. I guess that Google is not allowed to take Google Groups as the only source for newsgroup postings in SERPs because of their monopoly position.

    It’s the same thing with lyrics; however, Google has to list all of them, because there’s no rule about who has the best right to publish the lyrics.

  245. Transparency, transparency, transparency on penalties in WMT and action on collateral damage to give some comeback for sites. I would pay for such a service.

    Obviously there are ethics there – people would say Google is deliberately profiting from banning sites – but faced with losing $20,000 a month or paying $500 etc. for a Google review, I would stump up the cash.

    A guy who was hosting on our servers (www.europe-cities.com) got blasted and had to let 10 people go before Xmas because of it. Maybe common DNS or IP, I am not sure, but Google got us down as a bad neighborhood.

    I feel really bad for him because he was punished for our stupid linking. He spent tens of thousands of euros on content and is reduced to getting a few uniques from Yahoo and the other weedy engines. We also had other good sites taken out as collateral damage. I have done a reconsideration request for them, but that part of Google needs some work too. How about charging to cover costs?

    The worst part of the ban on our sites is that the whole thing is in limbo. What do you do? Make a new site because the old one is dead? Wait for the old one to come back? I suggest a system something like the following in WMT.

    Penalty 1= 30 days. You have a category one penalty. Possibly for doing X. ( some legal crap so google cannot be sued)

    Penalty 2= 60 days. You have been flagged with a possble Y violation etc

    Grey Bar= Forever. Give up mofo 🙂 you have been zapped for being a mong spammer dog. No explanation required.

    Please do something so people can plan more because there are innocents as well as the guilty.

  246. I’d like to see Google hit Twitter as long as they let their users send the exact same tweet 100s of times to followers.
    Lots of bloggers seem to be encouraging their twitter followers to spam their followers with the exact same tweet in order to enter a competition and have their link promoted. So, I’d say hit twitter until they hit those using these practices.

  247. “Sites that block ia_archiver to avoid copyright infringement”

    That’ll never fly because many of us block ia_archiver to avoid scrapers that use ia_archiver

  248. 1. Try to call out and discredit sites that are buying blog posts with nofollow. There are several sites in my niche that purchase ‘pay per post’ type blog posts by the hundreds. When I analyze their websites and where their inbound links come from, 75-90% are these ‘pay per post’ blog links. These blogs usually show an ad or offer to buy a post on their site, and the posts read very unnaturally. Should be an easy pick-off! BAMMO!

    2. Scraper sites that do not link back to the original content should be easy to pick off and eliminate. BAMMO! (I don’t have so much of a problem if people are linking to the original post.)

    Thanks for the great work at Google and all the opportunities it provides our family!

  249. I would like to see the weighting of domain age reduced or at least reevaluated. Just because a domain is older does not necessarily make its content more relevant to the search query. There seems to be a leniency toward older domains that violate basic Google guidelines, i.e. the older your domain, the more you can get away with keyword stuffing.

  250. Thanks for the continuing suggestions, everyone. After the first 150 or so comments I spent a few hours putting the suggestions into very rough categories. In the interest of transparency, here are my rough tallies from that point.

    Webspam in 2009

    Empty reviews or cost-comparison sites: 20
    [sd880is reviews]
    (or scraping Amazon reviews)
    (mentioned empty social sites created with random names)
    http://www.merriam-webster.com/thesaurus/Baker's+cyst
    video game sites that don’t really have reviews for something
    electronics/data-sheet searches

    More transparency, esp. about showing penalties or reconsideration process: 14
    (“Could 2009 be the year of reducing false positives?”)
    More communication, including by email
    Show if a website is compliant with Google’s guidelines
    Show warning when logging into Google services if website in WMT has a problem
    Site ranks well outside the U.S. but not in the U.S. (Olivier)
    More/better webmaster tools
    Tool to alert site owner to open url redirects?
    Provide mechanisms for interaction (Graham Davies wants to send suggestions)

    expert-exchange type sites: 13 (justanswer.com cloaking?)
    also including empty answers sites, or [how to replace a toilet]

    RSS/scraper sites or duplicate content: 11
    (or that show up higher than original author)
    (or that copy/paste manually)
    (or image theft)
    Get rid of spam blogs in blog search.
    easy way to report stolen content
    differentiate affiliates from the affiliate programs

    Local spam: 10
    [pizza small town, usa]
    (‘gps tag’ to say where a business is located) +2 for geo tags
    (too many directory listings and VRE sites)
    attorney spam

    Low-quality directory/listing sites: 7
    especially that require a linkback

    Link exchanges: 7
    esp. at the bottom/footer of a page
    want more action e.g. in real estate

    Paid links: 6

    Blogger spam: 4
    (or other Google properties with user-generated content)

    Spam detection API, a la Akismet: 4
    Ability to report spam comments and the destination urls
    Partner with Akismet to do better spam detection: 1

    Firefox plugin or other easy way to report spam or paid links: 3

    Sites with search results: 3

    Every spam report is looked at, or have an service-level-agreement: 3

    Multiple sites by the same owner: 3
    [acura water pump]
    [rachat credit]
    greatbluewidgets.com

    MFA-ish sites, esp. using newsgroup listings or forum or DMOZ copies: 3
    http://www.google.com/search?num=100&hl=en&safe=off&rlz=1T4GGLL_en&q=%22The+TCP%2FIP+connection+was+unexpectedly+terminated+by+the+server.%22 <– there are a couple of examples here (vistaheads.com and tech-archive.net).

    Keyword stuffing: 2

    Crack down harder on hidden links: 2

    Article bank sites: 1
    Hacked backlinks: 1
    Comment spammers: 1
    Forum spammers: 1
    Phone number spammers (all phone numbers listed): 1
    Doorway pages: 1
    Misleading titles: 1
    Press release spam: 1

    Don’t trust MySpace/Facebook/social bookmarking links: 6
    Social site like Digg with one vote on a story shouldn’t outrank original story
    Don’t show social media spam

    Requests for demotions
    Show Wikipedia less often or lower in the search results: 4
    Filter Wikipedia stubs at least
    Demote aboutus.com/aboutus.org: 4
    Demote ripoffreport.com or scam.com: 2
    Demote Hot Frog: 1

    Stop showing PageRank in toolbar: 3

    Allow larger descriptions for snippets: 2
    esp. for the first result

    Reward sites with grammatically-correct, well-written, professionally-edited content: 2

    MFA (no mention of scraping/non-original content) sites: 1
    (esp. in sports vertical topical time sensitive long tail kws)

    Police AdWords for "guaranteed top 10 rankings": 2
    Police AdWords for arbitrage: 1
    (Gerry willing/wants to give a list)

    Make AdSense harder to get into: 1
    Kill a site for spam, automatic yanking from AdSense: 1

    Faster refresh for webmaster console: statistics/top search queries: 1

    Eliminate penalties. Put everything on a spectrum: 1
    Longer/harsher penalties: 1
    (ban SEOs, don't sponsor conferences)
    Comprehensive bans (banned from search? Also ban from AdWords/Gmail): 1
    SEO "hall of shame" with public list of banned SEOs: 1

    Google Account holders gain reputation, can vote sites up/down: 1
    Personalized/SearchWiki; ability to block some sites better: 1

    Email alias to forward cold-calling spam emails: 2
    (or to flag attempted link buying/selling emails)

    Use DMOZ more: 1
    Use DMOZ less: 1

    Less weight on keyword-rich domain names: 1
    More weight on keyword-rich domain names: 1

    Demote stale sites that haven't been updated: 1

    Render sites with Chrome and penalize sites with search-engine-only text: 1

    Signal suggestion: lots of impressions but few clicks implies not a great site: 1
    Signal suggestion: content vs. ads space, or pagination to show ads: 1
    Signal suggestion: reduce the impact of links from Russia on "American" PageRank

    Apartment searches shouldn't bring up all apartment finders: 1

    Chrome for Mac: 3

    Use Trade Mark citations: 1

    Better product search: 1
    Sort by date: 1

    Revamp any algorithms for "domain authority": 1

    Request for CSE (from Tar): allow 10K annotations, so can block crapsites: 1

    More/better/faster malware warnings: 3
    (also tell Google the problem is corrected without penalties)

    "Show me completely different types of search results": 1

    Investigate e.g. Twitter/social usage patterns to find spam networks: 1
    http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/2317/2063

    Double-check nofollow/PageRank flow on Google: 1
    http://www.google.com/checkout/m.html

    Detect/use hcards: 1

    Webmaster guidelines as a PDF or ebook: 2
    Write a rule book to go into more detail, e.g. spam to submit.

    More communication about directories (good or bad?): 1

    Communication: take a public stance on DigitalPoint Co-Op Network: 1

    Communication: how JavaScript links are handled

    Tackle the problem of open url redirects around the web

    A Spanish version of Matt Cutts

    More Emmy pictures: 1

    People that remarked that Google is doing better or doing well on spam: 6

  251. Good List! Looks like a lot of good anti-spam things will be happening in 09!

    I love the “SEO ‘hall of shame’ with public list of banned SEOs” idea… Let us know if you set up a submission process. I have a few companies I would like to send your way!

  252. Put more emphasis on traffic patterns and less on link building… Search bots can’t recognise a well-designed website which offers value to its visitors, but by looking at traffic patterns Google can measure popularity, the frequency of new vs. old visitors, bounce rates, etc…

    Don’t get me wrong, links are still important, but traffic patterns have become more important.

  253. Thank you for the summary, Matt. This is apparently a hot issue, just as it should be. I think you and your team are doing a great job.

    I am a reputable SEO, or at least fancy myself that way. A couple things I have noticed about my maturing in my job over the past ten years is that, although PageRank is important (obviously hugely important), I do not look at the number as a measure. It seems that good content still trumps the horizontal green line, which tells me that what I have preached to my clients and peers about content is still a big part of Google’s algorithm. I do not normally see instances where scraping will out-rank the original content, but if it is being seen widely, I would have to agree with the concern as follows:

    “RSS/scraper sites or duplicate content”

    I recently described my services to a non-technical friend as a “glorified technical writer”, which is fairly accurate. If I write and properly prepare content that people want, the important part of my job is done. Based on this, I would have to agree with the part as follows:

    “Reward sites with grammatically-correct, well-written, professionally-edited content”.

    If it could be implemented properly, I would perhaps agree with the part on “Google Account holders gain reputation, can vote sites up/down”, but this would clearly be hard to trust, as it is open to a lot of interpretation and potential abuse. However, if the reputation was earned and doled out correctly, it could make an interesting study.

  254. I have a feeling this might be one of the top posts of 2009, and so early into the year :0). I am glad Google is looking into this, as it’s something that affects us all. The hard thing is finding a consensus, because depending on your viewpoint some things are larger priorities than others. As a blogger, nothing kills me more than finding my content stolen, or rewritten in a random way. For others it’s searching for a product review and getting an AdSense-filled page with no content and stolen pictures.

    Another of my concerns is harmful sites which download things as soon as I visit. I do report them when I can.

    I thought about this a while and I think Google should add a button to the Google Toolbar where sites can be reported with a click (like StumbleUpon). Then if a site gets clicks from many regions and different IPs, it can be put into a sandbox until reviewed.

    Actually I have a homemade custom version for myself, but something more robust I think would be a good idea…

    Anyway, I enjoy the blog, keep up the good work
    Adrian

  255. I agree that many article websites are crap lately, but Google shouldn’t bring them down just because they are article sites. I disagree that all article websites are bad and contain content that is not professionally written. There are some article directories (including mine) that have invested a lot in providing quality content to visitors by hiring in-house creative content writers. For me, instead of writing off article directories altogether, pages should be ranked by Google on a per-page basis (as it does very nicely right now). If a page doesn’t deserve to rank (for example, a page which is duplicated inside the website itself or elsewhere, has grammatical mistakes, bad presentation, etc.), Google can put some sort of algorithm in place.

  256. Hi Matt,
    Since you singled my name out with the ranking all over the world but not in the USA, I wanted to know if you had any suggestions about where we could find answers or submit the URL for some sort of feedback. By the way, the weirdest part of all this is that WMT tells us we’re ranking in the US (when we’re obviously not). Maybe we can talk it over at SMX West – I’m on a panel for Google Maps and Local.
    Thanks
    Olivier

  257. Reading the summary I was relieved to see that my latest web-presence policy seems to satisfy most of what people have suggested and/or moaned about.

    What most gets my back up about this whole thing is that we have always tried to give the minimum required info on any web page (i.e. what a user is actually interested in, instead of all the piffle), but we have seen our rankings slip from number 1 to around 7 because the competing sites seem to have more ‘meaningful content’ on their pages.

    If you look closely at this meaningful content, then the keyword stuffing really comes into effect.

    I agree 100% with Amol’s point about article sites being valuable. I have learnt a lot from some of these in the past, but (as already suggested) the Experts Exchange style sites which only let you view after paying, yet let Google crawl the site for free, really pee me off. When you get a search result from one of these sites only to find the post blocked, it’s a RAAAGE moment.

    We ourselves are building a free ‘How To’ section with videos, tutorials, hints & tips etc., all for free and easy to index. This will be based on the questions we get asked by our customers. So in this context I think they are valuable.

  258. Hi Matt

    Greater transparency would be a great bonus. There are many of us out there who think Google is trying and doing its best, but we feel that the communication between Google and those webmasters who are doing an honest, “legal” job could be improved, be that through Webmaster Tools or another device.

    I also agree with the many comments that review sites that offer no review can be really frustrating, though I would put “pure” affiliate sites in the same category. I find it most frustrating when the top three “results” on a page are merely really well SEO’d affiliate sites that do not offer the product you are looking for (or any products at all), and send you off who knows where afterwards.

    Thanks

  259. Going over the entries in that last list from Matt Cutts:

    Empty reviews etc – issue of “stubs” is already covered by Google Guidelines http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=66359 – one note about that page, I’d suggest changing that to state REP protocol in general as robots.txt may not be easy for a site to automatically do for millions of products/locations/etc, and meta robots may be much easier (and is a perfectly valid way of doing things)

    Wikipedia – I’d like to add to handle wikipedia redirects as 301’s (or better yet, get various wiki software e.g. MediaWiki changed so this works on all sites using wiki software)

    For wikipedia stubs – as above perhaps this could be achieved by getting a change made to the wiki software to include robots “noindex, follow” automatically when the “stub” note is included on the page

    experts-exchange etc – I almost filed a spam report for them till I realised they were using Google First Click Free – We need some way to determine if someone is using Google FCF so we don’t submit stupid spam reports. Quick ideas from top of my head include a meta tag or similar included on all pages using this or a list somewhere.

    Social media – this issue is a difficult one. Most people said the same about blogs when they came out (why can people write about their cats and get ranked highly?). Blogs eventually matured into a key part of the web; like blogs, social media sites are just websites, reflect a change in the nature of the web, and are maturing also. They too will likely become invaluable.
    –That said, spam is spam. I’m just pointing out that you can’t just ban all social media sites and think that will solve the problem – social media on the scale of today’s sites is a new phenomenon and may need new thinking about how to solve it.

    http://www.google.com/checkout/m.html – Thinking about it, this seems a legit use of dofollow – if Google are happy for these people to use Checkout, that seems like fair enough “endorsement” of the link – this is different from W3C sponsors issue (pay money, get a link). There is ‘vetting’ going on here (they’re not going to let a spam site use Google Checkout)

    Some final, lighter notes…

    Love the “more x 1 vote, less x 1 vote” entries 🙂 – I’ll just add “less emmy pictures” to be difficult 🙂

    I see people asked for Chrome for Mac but no-one has asked for Chrome for Linux (shame on you all!). I formally add an off-topic request for Chrome on Linux 🙂

  260. I would like to see consolidation of plural results, natural synonym acknowledgement within search, clarification of the “suggested” searches that appear using type-ahead, and some form of archival of web pages.

    The results are vastly different when you enter something in singular form versus plural. Does the “s” on the end need to matter so much? More search based off real English root words, I guess. If I look up cabinet, show me cabinet, cabinets, cabinetry, etc. People are just too lazy and search-illiterate to know to check all forms manually (see the sketch after this comment).

    Which brings me to the type-ahead suggested results – it is very frustrating, again with the singular and plural differences, to be optimized for, say, the singular (because AdWords and keyword tools show this usually has a far greater search volume) yet have the type-ahead tend to display the plural version! I don’t understand this. Why not auto-suggest the result with the greatest search volume? Most search engines have this same problem. It baffles me.

    Now as for synonyms, it would be great if there was a synonym index someplace to help people with their searches, or if it was part of the auto-suggest feature, or just available at the search box level. Say I type in the word transmitter – I would like Google to suggest to me that synonyms for this word are also sensor and transducer… or else return the results that have all three. I know that is a lot to ask, because you get into trouble with the English language – say cream is a synonym for lotion, but cream is also a heavy form of milk used in cooking. When I am researching something I find I manually have to check synonym forms to make sure I am getting everything.

    Lastly, some form of web page archival would be beneficial. I have used the beta Labs timeline feature to filter for timely results – it has its problems, but I still run into so many sites with completely out-dated or irrelevant data. Who needs the pre-poll results from elections 8 years ago? I would like the ability to mark something as outdated. Then maybe if Google sees enough people marking pages as outdated, the pages could go into an archive or something.

    And one more thing which I hate and always have: the stupid searchiomidmarket thing, DoubleClick popup pages, and, I’m sorry to say, Wikipedia. I am so sick of seeing the wiki results at the top, and other crap results, that I would like to remove them and never see them again. I have used the personalized search in Labs and appreciate the ability to rank stuff up and delete stuff out (love the little poof, by the way), but the settings don’t seem to save except during one browser session. I’d like to say never show me wiki ever ever again. Never show me this stupid Netflix popup ad ever ever again. But alas, Google is so miles and miles ahead of the other search engines. I love Google. I can not imagine my life without Google.
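
    A tiny sketch of the singular/plural consolidation being asked for above: fold query and document terms to a crude root form so that “cabinet”, “cabinets” and “cabinetry” all match. The suffix rules are a toy stand-in for a real stemmer:

    def crude_stem(word):
        # Strip a few plural-ish suffixes; keep at least a 3-letter root.
        word = word.lower()
        for suffix in ("ries", "ry", "ies", "es", "s"):
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                return word[:-len(suffix)]
        return word

    print({crude_stem(w) for w in ["cabinet", "cabinets", "cabinetry"]})
    # -> {'cabinet'}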

  261. The pages that Google should really cull are the “parked” sites, which are really just a list of affiliate sales pages. They contain absolutely no content and are, in most cases, just a way for people to display website URLs that they are trying to sell.

    Whilst it’s a free market, it’s wrong for anyone to grab all the popular .com sites without actually putting them to use.

  262. Hey Matt,

    If you want to make the count on the last one 7, I’d be on board. I don’t think anyone could deny that there are problems with spam (if there weren’t, this thread wouldn’t exist!) But for the problems that do exist, you guys have continued to evolve and are still light years ahead of everyone else in that regard. And at least someone out there is listening (multiple someones when guys like Lasnik and Brian White are brought into play).

  263. Overall I think you do a good job of eliminating spam. But I would like to call your attention to one problem with the search results. This is the huge number of high-ranking YouTube videos which contain no relevant information about the search topic. These often occur in pairs on the first page (top 10) of the search results. Some of them appear to be highly ranked merely on the basis of their title, because their content is totally different from what the title implies. It is obvious that no one at Google checks these videos to see what is really on them. It makes no sense to rank something on the first page when you don’t know what’s on it.

    You may not consider this to be spam. But whatever you call it, ranking these videos so high is a serious flaw in your search results.
    Thank you

  264. Guess I have to check your blog more often 🙂 Anyway, I bet a lot of webmasters would want a “report” tool that really works. There are some sites I reported a long time ago and they still show up ranking high with no content at all. So I’d say a working “report” tool is a must for this year.

  265. Dave (originial)

    Empty review pages that Google ranks above non-empty review pages are not Webspam. They highlight a flaw in Google’s algo.

  266. duplicated content, indexed site search, RSS aggregators…

  267. Having had some time to think about this
    1) The biggest and worst web spam there is comes from Blogger splogs. Blogger is a black hole of splogs & comment spam. Google, hold the torch to your own feet.
    2) Paid links – still a scourge after all these years.
    3) Scrapers: I thank scrapers. A lot of sites suck, and if a scraper didn’t clean the info up I wouldn’t find it among the mess. Scrapers can be a search engine’s best friend. Sucks if you’re the original content guy, but you know what? Pick up your SEO game and clean up your crap if you care so much. I doubt you care about that more than your $.10 of AdSense/mo.
    4) Almost all the other complaints above aren’t problems with spam, but with the algorithm. Empty review sites just haven’t gotten a review for a certain product yet. Not spam, but probably placed inappropriately in the index, since enough people are mad about them here. Same with aboutus.com-like sites, affiliate landing sites & the bunch.
    5) Transparency & feedback with Goog: it stinks.
    Frankly I suspect many people writing here are simply doing more and more advanced, niche-like searches and the algo fails them at that level of search engine query.

  268. I find it annoying when I am searching for error messages, or specific version numbers of software, and really struggle to find information about the specific version I’m looking for due to the way that Google deals with period-separated numbers. Sometimes I end up getting outline-view numbering from reports, which is totally irrelevant.

  269. Matt,

    I would like to know why sites like Rip Off Report and Complaints Board are in the Google index. They claim on their site that no post (true or false) will be taken down, no matter the case.

    It’s sad to see small businesses reporting each other on these sites and people posting all kinds of things to hurt others’ businesses. ROR and CB have a good concept, but the reports should be validated so they have some kind of credibility.

    I am personally sick of getting calls from small businesses asking me to get rid of ROR listings because they rank for their company name.

    Do us all a favor and take care of this problem in 2009.

    Thanks,
    Marko

  270. Maybe you could send at least one Googler to be the Anti-Spam Team for AdWords.
    I still see lots and lots of scraper/RSS/random-content sites doing AdWords.
    I understand you make your money with AdWords, but basically you are screwing not only other advertisers, but also your search users.

  271. I’d like you to ban sites that offer license keys and cracked versions of software, and also sites that (wrongly) claim to sell “cheap OEM versions” of software.

  272. Save Google Notebook, Sign the petition and spread the word
    http://www.petitiononline.com/gnoteb/petition.html

    Let them know how many of us are there that do care about the Google Notebook.

  273. I hope your malware reporting/info/notification will improve…

    You (Google that is) flagged my website and I must say it looks like a false positive (you did also remove it immediately when I reported it to you… And I am quite happy with the speed of that, thanks!)

    …You reported my website had trojans but:

    * AVG siteadvisor has+had me clean.
    * McAfee SiteAdvisor has+had me clean.
    * I downloaded some executables from another computer myself and scanned fully uptodate AVG. Clean.
    * I scanned my entire computer with newest AVG. Clean.
    * I inspected HTML output myself of my website. Clean.
    * Webhost told me the FTP had not been accessed by other IP Addresses.
    * My downloads (software I develop and sell) are scanned by many downloads sites as well.

    * And Google’s own diagnostic says:
    1)
    “Of the 38 pages we tested on the site over the past 90 days, 0 page(s)
    resulted in malicious software being downloaded and installed without user
    consent.”
    2)
    “Successful infection resulted in an average of 0 new processes on the
    target machine.”

    Does that REALLY seem like a trojan-infected website? Or does it seem like a false positive flagging all my downloads? Well, who knows, because there is no one I can contact to get more in-depth information (since you flagged it, you must have it?).

    I can say that I believe the *only* day you as-far-as-I-know flagged me coincided with me having a temporary problem (10 minutes) with infinite redirects to another domain of mine since I was preparing a permanent redirect / domain move…

    Oh, and even though my website was in Google Webmaster Tools, promoted in Google Adwords, showing Adsense ads… Did I ever receive an email? 🙁 Not that I know of at least. Perhaps it’s a setting somewhere, I don’t know… But I was left to discover your warning myself (which of course propagated to StopBadware etc.)…

    I won’t burden this blog with more details… But at the very minimum I think you should default to providing more detailed information to website business owners whose domains have existed on the net for more than 10 years and who have used AdWords regularly over the past years as well…

  274. How about creating some kind of blacklist or whitelist and then letting Google users vote for the pages they really think are spam and the ones that may be useful? That way you would have free human review of every spam page. The only problem I see is that spammers would get some exposure at first, but if voting were real-time, let’s imagine that with 10 votes the page disappears from the search engine and is permanently blacklisted (or something of the sort). I know my idea is not that original, but I think the original part is integrating real users (humans instead of bots) in this issue, maybe through a new type of social webpage. There would also need to be some type of categories (and filters for adult content) and some type of verification for users (we don’t want a bunch of pranksters blacklisting a real website). (A sketch of this voting idea follows below.)
    That, or coming up with a magic algorithm that destroys any sign of spam, maybe based on user preferences (what the user likes and dislikes, or whether the user is a he or a she; I don’t know why a man would be interested in a pill to make his breasts grow, or a woman in cheap Viagra).
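
    A minimal sketch of the voting mechanism described above, assuming a simple vote-count threshold and verified accounts. The class name, the 10-vote threshold, and the verification flag are illustrative only, not an existing Google feature; using a set of voter IDs means repeat votes from one account don’t stack, which speaks to the prankster concern.

```python
# Hypothetical sketch of the community-voting idea above (not a Google feature):
# verified users vote a URL as spam, and once enough distinct users agree,
# the URL is blacklisted in real time.
from collections import defaultdict

SPAM_VOTE_THRESHOLD = 10          # the "10 votes" figure from the comment

class CommunitySpamList:
    def __init__(self, threshold=SPAM_VOTE_THRESHOLD):
        self.threshold = threshold
        self.votes = defaultdict(set)     # url -> set of verified voter ids
        self.blacklist = set()

    def report(self, url, voter_id, voter_is_verified):
        """Record one spam vote; only verified accounts count, to deter pranksters."""
        if not voter_is_verified or url in self.blacklist:
            return url in self.blacklist
        self.votes[url].add(voter_id)     # a set, so repeat votes from one account don't stack
        if len(self.votes[url]) >= self.threshold:
            self.blacklist.add(url)
        return url in self.blacklist

spam_list = CommunitySpamList()
for user in range(10):
    spam_list.report("http://spammy.example/", f"user{user}", voter_is_verified=True)
print("http://spammy.example/" in spam_list.blacklist)   # True after 10 distinct verified votes
```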

  275. There are two spam practices that really bother me which I would like to see handled more forcefully:

    1. The practice of creating a bunch of small websites in an attempt to get more positions on one keyword or to use the links the small websites bring to boost another website. This includes creating blogs on separate addresses to capture two positions on the same keyword.

    2. Writing a blog post and getting it put up on several different blogs, usually with similar words, not to get the site indexed, but to get the link back counted.

  276. I found another example of a poor guy stuck in limbo. Look at this:

    http://www.webmasterworld.com/google/3825761.htm

    A 5-year-old site, and he doesn’t know whether to get a new domain or pray for the old one to come back. This is exactly the sort of zombie state that Google should address, so at least he can know definitively one way or the other.

    I do not know what he did, and maybe he deserved it, but just not knowing whether it will come back is cruel after 5 years of working on something. Given G’s market share of search, there should be some indication in WMT of what will happen.

    I would appreciate it, Mr. MC, if you could look at the site I posted 15 posts back as an example of collateral damage. He hosted on our servers and got zapped because we were buying dumb Polish spam-network links. All we had in common was our DNS and IP.

  277. We have several issues on Google Search
    1. Google Local results (1, 3, or 10 results) on search
    a. Dominating Google organic results
    b. Misbehaving competitors/spammers – the same landing page but different (repeated) listings in Local results
    c. Search result space occupation – better to place these at the bottom and save organic SEOs and their continued efforts

    2. Paid Links & Google Webmaster Spam Complaint
    a. Spam complaints are not rectified promptly
    b. Old snippet issue; no consideration even though we raised the complaint through our Webmaster account
    c. A genuine and true complaint about selling paid links, but no effect

    3. Keyword stuffing and repetition
    a. Many competitor sites add keywords to the page header in a gray color and a tiny font so they are barely visible, using CSS to control the font size and everything else
    b. CSS tricks to hide text

    The Google team has to do something about the above issues (one way the hidden text in point 3 might be spotted is sketched below).
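
    One way hidden text of the kind described in point 3 might be flagged is a crawler-side heuristic over inline styles: tiny fonts, text colored to match the background, or elements hidden outright. The sketch below is illustrative only; the thresholds, the class name, and the restriction to inline `style` attributes are assumptions, not how Google actually detects hidden text.

```python
# Hypothetical heuristic sketch: flag inline-CSS patterns that make keyword text
# effectively invisible (tiny fonts, text colored like the background, display:none).
from html.parser import HTMLParser
import re

class HiddenTextFinder(HTMLParser):
    def __init__(self, background="#ffffff"):
        super().__init__()
        self.background = background.lower()
        self.stack = []      # one entry per open tag: a reason string if hidden, else None
        self.flagged = []    # (reason, text) pairs found inside hidden elements

    def _hidden_reason(self, style):
        style = style.lower().replace(" ", "")
        if "display:none" in style or "visibility:hidden" in style:
            return "hidden via CSS"
        size = re.search(r"font-size:(\d+)px", style)
        if size and int(size.group(1)) <= 2:
            return "tiny font"
        color = re.search(r"(?<!-)color:(#[0-9a-f]{3,6}|white)", style)
        if color and color.group(1) in (self.background, "white", "#fff"):
            return "text color matches background"
        return None

    def handle_starttag(self, tag, attrs):
        self.stack.append(self._hidden_reason(dict(attrs).get("style") or ""))

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        reason = next((r for r in self.stack if r), None)
        if reason and data.strip():
            self.flagged.append((reason, data.strip()))

finder = HiddenTextFinder()
finder.feed('<div style="color:#FFFFFF; font-size:1px">casino pills loans</div>')
print(finder.flagged)   # [('tiny font', 'casino pills loans')]
```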

  278. My wish is to address a routine complaint I get from my AdWords customers. Please add 1) an option to send an email when your credit card is billed, and 2) an option to email a weekly or monthly billing report (in PDF, Excel, etc.).

  279. Dave (original)

    Max, read Quadrille’s reply.

  280. On the NOS forum I wrote under the nickname Kieb; when I typed “Kieb” into Google, my complete profile appeared. This is something I don’t like because it has to do with privacy. How can this be removed from Google?

  281. Paid Text Links.

    We have always done white-hat SEO. I noticed we have lost a lot of ranking relative to our competitors. I asked an SEO consultant friend to see if he could figure out anything I could not.

    He came back and said we are getting killed by the paid text links. He has never recommended to anyone to go that route but had never seen anything like what he did in our area.

    For those of us trying to work within the rules, either there has to be some improvement in punishing those with paid links or some other option to level the playing field.

  282. Domain squatter spam.

    I am a bit late to respond, but anyway: I believe Google should completely remove this spam from the results, even if the user searched for “xxxxx.com”. You already do this for some sites like “mouse.com”, but not for all. For example, “tee.com”, “cold.com”, and “paint.com” are still #1 in Google.

    It seems to me that it is relatively easy to detect and the benefits/effort of this change should be significant.

  283. The biggest problem, in my opinion, is the webmasters with the deepest pockets who keep buying, buying, buying links. EVERYWHERE they can get them.

    They are so easy to detect. They simply target one keyword at a time and these links are always in the footer of the bought pages.

    I know two perfect examples that have bought to date 1,000,000 links and 700,000 respectively. I’d be happy to provide these examples privately.

    It should be easy for Google to start scanning en masse for links that simply contain one keyword. Someone buying “real estate” or “porn” or “lawyer” should be suspect, especially when the recurrence of that keyword is prevalent. Obviously sites do not link to each other naturally using a method like that. It’s ALWAYS bought or traded. Always! (A rough sketch of this idea follows below.)

    thanks, Matt

    marc
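
    The mass-bought, single-keyword anchor pattern marc describes could be screened for roughly as sketched below: tally the anchor text of a site’s inbound links and flag profiles where one exact-match money keyword dominates across many domains. The thresholds and the `looks_like_bought_links` helper are purely illustrative, the footer-placement signal is omitted, and this is not presented as Google’s actual method.

```python
# Hypothetical sketch of the idea in the comment above, not Google's method:
# when one commercial anchor text accounts for almost all of a site's inbound
# links across hundreds of unrelated domains, the profile looks bought.
from collections import Counter

def looks_like_bought_links(inbound_links, dominance_threshold=0.8, min_links=200):
    """inbound_links: list of (source_domain, anchor_text) pairs for one target site."""
    if len(inbound_links) < min_links:
        return False
    anchors = Counter(text.lower().strip() for _, text in inbound_links)
    top_anchor, top_count = anchors.most_common(1)[0]
    domains = {domain for domain, text in inbound_links
               if text.lower().strip() == top_anchor}
    # Natural links vary their wording; one exact-match anchor dominating
    # across many domains is the pattern the commenter points at.
    return top_count / len(inbound_links) >= dominance_threshold and len(domains) > 100

# Toy usage with fabricated data:
links = [(f"site{i}.example", "real estate") for i in range(450)] + \
        [(f"blog{i}.example", "great article on housing") for i in range(50)]
print(looks_like_bought_links(links))   # True
```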

  284. A more stringent process is needed for local. A search for companies like mine in my city lists a few businesses that have no physical address here and are nothing to do with the local market. The addresses are often residential ones, so perhaps they are sending the postcards to employee residences or the houses of friends, I don’t know. One SEO company has done this for cities throughout the UK, despite only having an office in one city.

  285. What about working better towards detecting the original source?

    I don’t mean the spam sites only. There are so many aggregators, planets, notebooks, et cetera (even legitimate ones)… I am looking for some term, and the first 10–20 results are all the same quotes from the same feed (and the original is somewhere deep on the 2nd or 3rd page).

    The general case is difficult, but detecting exact quotes from the feed seems possible…
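
    Detecting exact quotes from a feed could start from something as simple as word-shingle containment: hash overlapping n-grams of the original feed item and measure how much of a candidate page is copied verbatim, then prefer the earliest-crawled source. This is a rough sketch under those assumptions; the shingle size and function names are chosen purely for illustration.

```python
# Minimal sketch of the "detect exact quotes from the feed" idea (not Google's
# ranking logic): hash overlapping word shingles of the original feed item and
# measure what fraction of a candidate page is copied verbatim.
def shingles(text, n=8):
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def copied_fraction(original_text, candidate_text, n=8):
    orig = shingles(original_text, n)
    cand = shingles(candidate_text, n)
    if not cand:
        return 0.0
    return len(orig & cand) / len(cand)

feed_item = "the quick brown fox jumps over the lazy dog near the riverbank every morning"
aggregator_page = "reposted: the quick brown fox jumps over the lazy dog near the riverbank every morning"
print(round(copied_fraction(feed_item, aggregator_page), 2))  # 0.88 -> mostly a verbatim copy
```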

  286. And a smaller (maybe local) irritation: last year quite a lot of fake forums appeared (at least here in Poland). Spammers grab old Usenet content and publish it as a forum. And that forum suddenly ranks fairly well while the original Usenet archives are somewhere deeeeeep.

  287. Martijn van Alphen

    A quality brand or seal of approval. A lot on the web is of poor quality. It would be nice to see some sort of quality-approval system, like most physical products have (like Michelin stars for restaurants).

  288. In my opinion, the biggest problem with webspam is currently the way Google (you?) deals with it. Recently we received an e-mail informing us that our site was not complying with one of Google’s guidelines, etc., so my considerations are based on direct experience.

    What we were accused of was something unintentional, due to a bug in the code introduced during some major changes to the site. But this is not the point; I don’t intend to prove our bona fides here. The problem is that it was impossible to make sense of that mail message.

    What does a sentence like this mean:

    “… pages from are scheduled to be removed temporarily from our search results for at least 30 days. …” ?

    Scheduled? Who, on earth, can tell when those 30 days are starting and ending, after reading a phrase like that? Pages? Which pages? All of them? Some of them? The old ones?

    If one says “removed”, I suppose that the pages affected are those already indexed. You can remove from a certain place only something that is already there. Well, not at all. Pages indexed in the past are still there, but no new pages are visible.

    The whole domain is affected. Then, even though only one site doesn’t comply with Google guidelines, all sites within that domain are banned. What’s the reason for that? A site can be clearly identified by its hostname, so you don’t need to ban all the others. In democratic societies, responsibility is individual.

    Then an invitation follows to submit the site for reconsideration. We did, but we received no reply. What should we expect? Has the site already been reconsidered? Is it now complying with Google’s guidelines? Complete darkness.

    Removing sites from Google’s indexes has a deep impact on real life. From one day to the other your income is drastically reduced. If it’s a company site, some people will be certainly fired.

    Google determines the behavior of Internet users. You can build the best and most interesting site in the world, but what makes the difference is being on Google’s pages. This is a fact. But occupying a dominant position means not only benefits and advantages, it means also to have responsibilities towards the society as a whole.

    Google should act according to clear rules and should inform site owners of what is going on. As of now, one is condemned without a trial to a sentence of indefinite duration, with no possibility of defense. In real life, this wouldn’t be tolerated in civilized countries. But, today, there’s not much difference between real and virtual life, since what happens in the latter deeply affects real people, not avatars.

    To start with, it would be very simple to interact with site owners, to answer such simple questions as the ones above, keep them informed for how long and when their sites are penalized, whether their sites are now complying with Google’s guidelines or not yet, to reply their mails… I know that this requires a lot of resources, but, as I said, to be the only actor in such a huge market doesn’t mean only advantages.

  289. Jacques Meyrues

    Hello Matt, first of all I apologize beforehand for my English mistakes (I’m French 🙂

    What I would like from google (not directly SERP related) :
    – Information: Thanks to your guidelines we (webmasters) already have some useful information about what not to do; but every time we make
    a modification to one of our sites we fear losing rank because we might have done something that Google does not like.
    I believe that most webmasters are honest and fair people who work hard to improve their sites’ content, structure, and rank, but crossing the line
    between optimizing a site and “cheating” is all the more possible as this line is not clear.
    So please keep giving us updated information about what not to do.

    What I would like to see removed from SERP :
    – cost-comparison sites: They often grab the first places and bring no real content.

    – video sites like YouTube or Dailymotion…: There is already an “Images” category on the Google page, so why don’t you create a “Video” category?

    – domain squatter spam : they are of no use, except to their owners.

    – duplicate websites owned by one company: I know of a company that has more than 35 sites like this.
    For example, imagine a company selling cars and car parts (it’s just an example; the site I am talking about is not in the car business at all). This company has a main site, toto.com, with nearly all its products. It also has a wheels-cars-3000.com site with only the wheels (already on toto.com), a seat-car-3000.com site with only the seats (already on toto.com), etc. Having range-specific smaller sites is not a bad thing by itself, but all these sites have the same structure (they all use the same software) and they share the same images and text. The smaller sites of this company do not bring additional information to the customers; they are just like its main site, but with fewer products, and are only intended to make it easier for the site owner to grab places on the SERP. The owner of toto.com has also made other dedicated sites for specific customers, such as cars-for-doctors.com, cars-for-lawyers.com, cars-for-nurses.com… with, again, the same pictures and text.
    Several weeks ago I discovered by chance that the same company had just created a lot of manufacturer-dedicated sites such as ford-bytoto.com, nissan-bytoto.com, generalmotors-bytoto.com, etc., without the agreement of the manufacturers. Like all the other sites of this company, these new sites have a similar look and share the same images and data.
    I did a search on Google for some products sold by this company, and on the first 4 SERPs this company appears at least 11 times. I therefore made a spam report and gave all the needed information. I thought that with such a stupid cheating method Google would act quickly, but so far there has been no change.

    This brings me to my next wish :
    – Make sure spam reports are looked at seriously: Through the spam-report page of Webmaster Tools you provided us with a way to inform you of spamdexing. However, while browsing webmaster forums I have never read a post where someone said “I made a spam report and Google banned the cheater”.

    I’m sure you are taking the spamdexing problem very seriously, but many webmasters feel like you don’t care at all. I suppose you receive a lot
    of spam reports. Most of these reports are probably made by unidentified people trying to harm a competitor, but you must also receive spam reports made from the Webmaster Tools pages of Google accounts. In those cases the senders of the reports are identified, and they know that you know. If they did not really believe their reports were justified, they would not have made them this way. I also suppose that some of these reports are not technically justified, but some of them are, so when there is no visible action from Google the people who sent the reports don’t understand.

    For example, I have a competitor (not toto.com, another one 😉 with an online shop. Several months ago this competitor bought some domain names and created basic doorway pages. I made a spam report, without result.
    This month I discovered that this competitor also had a “multilayered site” 🙂 : there is his real shop that can be browsed by customers, but in a subdirectory of his server there are also about 900 doorway pages (with keyword stuffing, hidden text…). A lot of these doorway pages have existed for more than a year and appear well placed in the SERPs. I had not noticed them before because most of them look like genuine search pages (but in fact they can’t be found when browsing his shop; they are optimised for specific keywords and are all similar in structure, with no real content). I made a spam report with all the details (I thought that with such a number of doorway pages Google would react) and guess what??? A few days ago my competitor’s site disappeared from the Google index. I thought, “Wow!! Google is really taking our reports into account.” Unfortunately, three days later his site is back in the Google index. The doorway pages are still there and this competitor still holds a lot of first places on important keywords??????

    So my main advice is: take the identified spam reports seriously. It helps you (as the detection work is done by webmasters) and it’s good for all
    of us (people who try to follow the rules will be encouraged to keep going, and people who try to cheat will be punished).

  290. As already mentioned, searching for error codes sometimes brings up a website that just IS that error.
    I’d like a bit more depth, so an error-code search actually brings up a solution or discussion result, and not so much a page full of errors.

  291. Matt, one of the biggest problems right now with real estate websites is the MLS listings from IDX providers. Agents who pay certain providers to make the MLS pages static end up with a site that is 4000 plus pages long, and often will top established, content rich agency sites just because of that. And the MLS listings are duplicates of the local board of realtor sites, and any other realtor in the area that brings them in. Likewise, Zillow and Yahoo Real Estate and even RealtyTrac shouldn’t get extra power from displaying listings (often expired blanks) either.

    In the old days, the local boards had a link that the agents framed in, and nobody really got credit for them. To the customer, it all looks the same, but with this new IDX system, a website with 30 pages of content about the area, the neighborhoods, photos, and things that interest the buyer will always be trumped by a cookie-cutter site with a database feed of MLS listings coming into it. And on top of that, some of the unethical providers are hiding links across the entire database… giving their so-called “SEO” customers hundreds or thousands of links that are either cloaked or just jumbled up enough that the site visitor doesn’t see them, but Googlebot picks them up.

  292. Jan,

    I am not so sure I agree with you… you have some valid points, and more pages are certainly better, but I have a site with a framed IDX that is ranking well above every other competitor, even ones with literally thousands of listings… there is more to SEO than quantity… the quality of the actual content, the manner in which the content is displayed, and the quality of the anchor links to the site all play critical roles, particularly in the real estate world where geographic competition is very tough.

  293. Matt, here’s an issue I keep finding. Web designers are hiding links that point back to their own website in 1 pixel x 1 pixel white .GIF spacer images on their clients’ websites. It’s as if their client won’t allow them to post a link on the site, so they hide one instead. I assume image links have less weight than text links, but I think you should also filter out image links when the image is below a certain pixel size and/or file size.
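
    A filter along the lines the commenter suggests might scan anchors whose only content is a tiny image. The sketch below is a simplified illustration: the 2-pixel cutoff and the class name are assumptions, and it only reads explicit width/height attributes, ignoring CSS sizing and actual image file dimensions.

```python
# Hypothetical sketch (my own illustration, not Google's filter): flag anchor
# tags whose content is a 1x1 (or similarly tiny) image, the "spacer GIF"
# hidden-link trick described in the comment above.
from html.parser import HTMLParser

class TinyImageLinkFinder(HTMLParser):
    MAX_PIXELS = 2   # treat anything this small per side as effectively invisible

    def __init__(self):
        super().__init__()
        self.current_href = None
        self.flagged = []          # hrefs reached only through tiny images

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self.current_href = attrs.get("href")
        elif tag == "img" and self.current_href:
            try:
                w = int(attrs.get("width", "0"))
                h = int(attrs.get("height", "0"))
            except ValueError:
                return
            if 0 < w <= self.MAX_PIXELS and 0 < h <= self.MAX_PIXELS:
                self.flagged.append(self.current_href)

    def handle_endtag(self, tag):
        if tag == "a":
            self.current_href = None

finder = TinyImageLinkFinder()
finder.feed('<a href="http://example-designer.example/">'
            '<img src="spacer.gif" width="1" height="1"></a>')
print(finder.flagged)   # ['http://example-designer.example/']
```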

  294. Hi Matt,

    Very simply, I would like to see the local results taken care of. In a down economy, the spammers are winning the battle against the real local businesses.

    The authoritative onebox and the Google 10pack maps results should be policed better. These results outweigh the organic results in many cases. I have asked many people about one case, and they say, “yea, I would click on that for sure”. …spammed!

    Thanks for your time.

  295. I just thought of something else I had forgotten…

    What about adding a service for “certified local” businesses?

    Maybe have a logo on the local pages that state that, and give preference to those businesses who have gone through the motions to verify their businesses and have been proven to play by the rules. It would be similar to the dmoz editing process.

    Thanks again.

  296. Lower the PR of sites with duplicate content.

    In my area of business (the patent business) there are websites that just duplicate the content of published patents and patent applications.
    Of course, this gives them a lot of indexed pages and, in turn, a lot of PR.
    All this without creating any original content.
    If, as you say, Google is all about original content, I would like to see those websites without originally created content penalized.

  297. I just have something simple on my wish list.

    CATEGORIES

    I think it is about time Google came out with categorized search results. For example, have a different engine for shopping, for information, for blogs (which already exists), social networks, etc.

  298. Please, please remove search results that are just other SERPs on external sites; how pointless and time-wasting is that? To perform a search on Google for “CD cases” and be pointed to a less reliable, auto-created search-result page for “CD case cleaner” from “some-website.com”, dominated by text-link ads MILDLY related to my search term.

    Also remove sites that just redirect to others via URLs, such as shopping results. Again, a pointless, time-wasting site that frustrates and points to expired offers doesn’t help anyone. If Google crawls more often, which you do, then your results are more reliable than some glorified shopping directory OR AFFILIATE LINK SPAM. How can they get away with it just because they are a big name?

    And sites with numerous pop-ups, and pop-unders, this needs to be addressed and punished.

    Rant over 🙂

  299. Here is another thought I had… Is there a way for Google to identify authoritative sites whose links represent “peer review”, even though those sites do not have a high page rank? Let me give you an example.

    I am a music professor and I have an educational site. Lately, I have been receiving links from public school orchestra pages. In my opinion, these links are the best compliments I could receive because they represent professional educators reviewing my site and adding my link for their students. Here’s the catch; some of these sites do not belong to their respective public school websites and therefore do not have even a “1” for a page rank. However, in my opinion, they should be counted well above links I have from other sites that have higher page ranks.

    These orchestra pages do not have a high page rank because they are not much concerned with how well they come up in the searches. They are written by orchestra teachers primarily to make information available to their students. In other words, they are written for an isolated audience. Maybe this is a special case but perhaps there are other situations like these sites.

    Is there a way for google to identify authoritative sites like the ones I mentioned that may not have a high page rank but should be given high priority when counting links? Just a thought. Maybe it could help in the fight against spam.

    Thanks, Matt for opening up this discussion!

  300. I think most companies don’t want to do anything wrong for Google SEO, but they still have problems with their rankings. Maybe you could show more options in Webmaster Tools, like suggestions, trends of what to do and not do, showing bad links, or add a (paid) helpdesk and give websites more details. It is impossible to communicate!

    Further, it would be worth starting a better DMOZ directory with Google employees and not only the volunteers (a lot of categories are empty or filled with old links). Ask some money for a listing and check the links every month. This directory could be more important for ranking if you gave it more value.

    Stop valuing links so much; it is more interesting to watch the real unique content and how much it changes. Maybe every webmaster could have the option to open their analytics to Google for ranking: share things like time on site and bounce rates with Google. The better websites will do it for sure. In other words, give the best websites more opportunities to show Google that they are SEO-friendly, etcetera.

    PS: Google spams a bit itself as well: AdWords links on the left side above the organic results; do you think my dad still sees the difference? Be an example; don’t do that. It will catch up with you one day.

  301. There needs to be a better way to handle penalties. Not just an email telling people to read the webmaster guidelines, but perhaps automatic warning messages in Webmaster Tools for the specific offenses, plus email alerts, and give webmasters a bit of time to fix their problems if possible.

    We webmasters are more than happy to follow the rules; a sudden de-ranking from Google is like being sent to execution without knowing what crimes we have committed. If you want people to follow your rules, you tell them what they did wrong, punish them harshly, and make examples of them.

    We Chinese have a saying, “kill the chicken to warn the monkeys”: the story of an owner who kept a bunch of chickens and monkeys that made tons of noise. He went ahead and killed a few chickens; the monkeys, being smarter, stopped misbehaving right away. Make a public demonstration of punishing wrongdoers in order to deter others who might mimic their example.

    The same goes for misbehaving webmasters: write down your rules (the webmaster guidelines are far too vague) and make a few public examples with the exact offenses in each case; smarter webmasters will get it right away and fix their errors, and hopefully the message will spread to the less informed webmasters (the chickens). Right now Google is like someone randomly firing shots into that bunch of monkeys and chickens, and they are still making tons of noise and jumping about without knowing what they are doing wrong. I’d be very happy to know what I am currently doing wrong with my site to get the Google penalty, so I can fix it and comply with the rules.

  302. I suggest to add more human power in order to enrich the algorithmic PageRank system with corrective intelligence.

    Human expertise and manpower will decrease the volume of webspam ex post. I think webspammers will always try to overcome the algorithm; but experts who diminish spam after being indexed will help to keep the spamrate low.

  303. Things that really bug me as an SEO (and are bad for Google) –

    1) Proxied numbers listed alongside regular numbers in Google Local Business results. Background: I have seen some companies set up several local business listings with ‘proxied’ phone numbers, e.g. 1300 FOO BAA, which redirect to their actual phone line. In one case I saw one company listed 5 times in the SERP mini-map using proxied numbers which all went back to the same phone number.

    2) Ecommerce sites posing as informational sites. What I mean by this is ecommerce pages which pretend to be impartial ‘buying guides’ but are actually just sales landing pages with product listings. It is fine when people do this on separate pages, but when they do it on the same page it gets really annoying. I have been doing some SEO work in the diamond industry lately – this would be a good place to start.

  304. Also, there should be cash rewards for successfully reporting spam!

    AdWords credit would also be acceptable 😉 Just a thought.

  305. 1. Kill duplicate sites. If someone uploads the same site under multiple domains, they should be killed from the index. Happens in Google Australia a fair bit. If you don’t penalise this behaviour, more people will do it thinking they can get away with it. There is one large Australian web developer who is running 100s of sites in duplicate in Google. You can email me if you’re interested in more info about it.

    2. I also agree with some of the comments regarding product review sites. I’ve lost count of the times I’ve typed in a search query, got a search match, opened the page and there is a totally different product on display.

    3. Also, I don’t like affiliate sites. They add no value when you are looking for something. They should be relegated to the bottom of the rankings, appearing on page 57 of the results for instance. (Just a number I’ve plucked from the air, LOL, make it page 157 what the heck!)

    4. Off topic, but fix the launch of PDFs in Chrome. Every time I try to open a PDF page in Chrome it shuts the program down. I have already submitted this problem to Google (about a week after Chrome’s launch), but still no fix yet. It means I don’t want to use Chrome even though I love it – it causes too much PDF grief.

    I love Google in every other way. Cheers Matt and have a great day.

  306. Kill the duplicate sites.

    Currently I have found that a few companies are taking the whole content (and in a few cases the design as well) of competitors’ sites that are in the top 10 of the SERPs for the most competitive keywords.
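
    For the blunt “same site under several domains” case raised in the last two comments, even a simple content fingerprint would group the offenders: normalize each crawled page, hash it, and flag any hash served by more than one domain. This is an illustrative sketch, not Google’s duplicate detection; near-duplicates with reshuffled text would need fuzzier matching such as shingling.

```python
# Hypothetical sketch (my own illustration): group domains that serve
# byte-for-byte identical pages after trivial whitespace/case normalization.
import hashlib
from collections import defaultdict

def page_fingerprint(html_text):
    # Collapse whitespace and lowercase so trivial formatting differences don't matter.
    normalized = " ".join(html_text.lower().split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()

def group_duplicate_domains(crawled_pages):
    """crawled_pages: list of (domain, html_text) pairs."""
    by_fingerprint = defaultdict(set)
    for domain, html_text in crawled_pages:
        by_fingerprint[page_fingerprint(html_text)].add(domain)
    # Any fingerprint served by more than one domain is a candidate duplicate site.
    return [domains for domains in by_fingerprint.values() if len(domains) > 1]

pages = [
    ("widgets-sydney.example", "<h1>Cheap Widgets</h1> <p>Call us today.</p>"),
    ("widgets-melbourne.example", "<h1>Cheap  Widgets</h1>\n<p>Call us today.</p>"),
    ("unrelated.example", "<h1>Gardening tips</h1>"),
]
print(group_duplicate_domains(pages))   # one group: the two widget domains serving identical pages
```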

  307. Matt,

    Add another to your tally of votes to demote ripoffreport.com. It’s a shame that they are able to take advantage of SEO tactics that get around the spirit of Google’s algorithms. I find it odd that their so-called consumer reports all seem to be written by an SEO copywriter who knows how to both complain about a company and do so in a way that improves SEO. They’re a lucky bunch to find so many legitimate complainers who also happen to be SEO copywriters, don’t you think?

    Anyone else think that Matt’s tally of complaints about RipoffReport.com should be a little higher than two? If so, please speak up!

  308. In my opinion:

    1. Paid links

    &

    2. The possibility of buying a domain that already has good PageRank because it existed before. I know this could be hard to fight because of the preexisting backlinks, but in my opinion it should not be possible! Of course you should be able to move the domain to another owner (e.g. if you sell your business), but if the domain ceases to exist, the good PageRank etc. should be neutralized and the preexisting backlinks should not be counted going forward.

  309. Asking Google to stop web spammers is futile. Google’s business model requires web spam.

    Basic truth: when Google organic search takes you directly to the site that sells what you’re looking for, Google makes no money. But if the organic search results send you to a page crammed with AdWords ads, the cash register rings in Mountain View. Google’s results can’t be “too good”, or users won’t click on ads.

    This is the paradox of Google’s business model. Almost all of Google’s revenue comes from ads. Google, as a business, is an ad agency. Think of Adwords-heavy pages with high rankings as Google’s form of “interstitial advertising”, like those pages you have to go through to get into many sites.

    That’s why this problem doesn’t get solved at Google.

  310. Hi Matt,

    Didn’t see any info on you taking action on the input you got?

  311. OK, so it’s a bit late, as there are only a few days of 2009 left. My 2009 webspam concern was the same as many others’: “local”. But I have to say it has vastly improved over the last few months. The city I live in was getting spammed by national call centres that were just picking random addresses in the city centre. At the moment it seems to be working well, with only genuine local companies showing in the search for the business I’m in.

  312. Autoblogging would be nice to focus on, as it has been annoyingly flooding the SERPs.

  313. David Saunders
    Plymouth, England

    Local – Places

    Way too many fakes and cheats. Besides running a web design service, I also use Google Local for everything with my G1 and get some frustrating local listings at times.

    I think some feet on the street in some manner would really help tidy this up.

    Brilliant Google though

    David

  314. Ralf Schwoebel

    I would pretty much like an improvement on the communication processes with webmasters. The background is: Some of us are pretty good system admins or employ coders who might be even more experienced than some of the top Google guys.

    Yet that background is not valued in communications, and whenever I point out a bug (currently there is one bad one in Google Base and WMT) I get bounced back to automated answers.

    I understand that the inquiries to Google are massive and diverse, and I am also not angry about the lack of qualified feedback, just helpless and desperate.

    In my opinion, Google has to introduce a ComRank value to Google accounts, including trustrank, age of account, amount and quality of requests sent, etc.

    Use the algo you apply to pages on the accounts of registered webmasters, and react accordingly to requests from ComRank 9 webmasters who need qualified feedback and always end up in a loop of automated standard answers. I fear I will not get that “base bug” fixed within the next 6 weeks from your side… and that is IMHO a very bad sign. It is also exhausting to report a detailed problem to an automated answer system or to people who do not understand what we write to them.

    I know you folks want to automate the crap out of every process, and that is good – but sometimes you automatically mess things up, and I vote for: admit it, adjust your strategy, and add more community evangelists like yourself who can improve things around the standard processes!

  315. Matt,

    No. 1 on my list would be to reduce the number of AdSense blocks allowed on a page. At the moment, some sites seem to be full of them, and it’s very distracting and looks spammy – and, of course, it tempts people to slap up a ton of crappy-looking sites that offer no real value.

    I really can’t see why you would need more than a couple of ad blocks, and maybe one of the smaller link blocks too.

    No. 2 would be those sites that appear in the SERPs and just take you to another page of scraped search results. I know you see fewer of these nowadays than previously, but they shouldn’t be there at all, as they add no value to the end user.

    No. 3, which is not exactly a spam issue per se, but it would be useful to know what Google deems to be good design/content, because it seems very possible to try to build a good site and inadvertently fall foul of something you don’t know about. The information may be tucked away somewhere, but an up-to-date summary in an easily accessible location would help.

    While you could argue that this would also help the web spammers, good sites take time and effort to build, and that’s one thing the spammers don’t seem to want to be bothered with.

  316. Hi Matt

    I think Ripoffreport and the rest like it are a problem. We could all build a site like Ripoffreport and start posting bad things. I found the site when I was looking up an SEO company I know, and I found some very bad reports. I did my own digging to find out whether the story was true (it was not), so I find sites like that unfair.
    Good luck with your work; you’re doing a great job.

  317. salameh allan

    Hi Matt,

    I want to suggest a few methods to improve search to return relevant results and to improve the search experience.

    1. Look at how the trading market works: it depends on news, who made the news, and where the news comes from. If we can say Matt Cutts says website A is a good website, and Matt Cutts is a known figure on the internet, then maybe we can put a weight on the site to say, yeah, this site got approval from a well-trusted person on the net. It’s the same way the market operates on news and trusted sources.

    2. Certificates: a lot of sites are verified by VeriSign, a trusted organisation, so promoting such activity could lead to better and more trusted sites.

    3. Look at whether the site gets put in the news for good reasons; maybe downgrade it for bad reasons.

    4. Look at Alexa traffic, for example, for more information about the site.

    5. Maybe use the company registration to verify the company

    6. Use the yellow pages, ie trusted directories

    7. An authors tag on the site; build a database of authors who are trusted.

  318. Hi Matt,
    Is there a form for submitting general issues with search results?
    For example, a search for ‘merchant services’ in Google.co.uk brings up a local search map which is irrelevant. I would like to see some way of reporting searches like these.
    Tom

  319. Google says, “The life span of a Google query normally lasts less than half a second, yet involves a number of different steps that must be completed before results can be delivered to a person seeking information.” Is this true, or do you cache popular results for a predetermined amount of time? I’d really like to know.

  320. Hi Matt,

    I think the term is ‘referrer bombing’?

    A lot of my sites have shown that I have had traffic from lots of sites with a .golbnet.com extension, but when I log on to these sites I cannot find a link to mine. It still baffles me, but from a bit of research, a lot of other webmasters are having the same problem. Would this type of spam hinder our rankings?

  321. Create some certification for webspam detectives. I think this should be taken care of by a group of SEO people taking the initiative. Every time some new spamming trick is invented, at least one of these detectives can sound the alarm and the Google spam team can start taking action.

  322. I’d like to see Google de-rank those ultra-frequently updated blog networks that make their living republishing excerpts of others’ feeds and filling every page with ads (even AdSense).

  323. I am a big fan of Matt Cutts. Aaron Newton made a good point. Is there a follow-up post to see if any of these suggestions happened?

  325. I realize I’m late to the suggestion party, but perhaps Google could start a registry of known spammers (consultants who create non-relevant links en masse for customers) or of sites that continually create non-relevant comments (i.e. “Best Viagra Deals”) on blog posts in an effort to create links. One of my blogs must get 50–75 per day.

  326. Best change, and low spam, now in 2013 🙂
    Penguin and Panda work perfectly.
