Notifying webmasters of penalties

If you don’t want to read the full post, the executive summary is that Google’s Webspam team is working with our Sitemaps team to alert some (but not all) site owners of penalties for their site. In my world (webmasters), this is both a Big Deal and a Good Thing, even though it’s still an experiment. Sign up for Sitemaps to try it out. Oh, and the Sitemaps team introduced a bunch more new and helpful features too. Check out the Sitemaps Blog for more info. 🙂

The consequence of picking “Don’t be evil” as an informal motto is that everybody compares Google against perfection, not against our competitors. That’s mostly a good thing, because it keeps us working hard and thinking about how we would tackle each issue in the best possible way. Lately, I’ve been thinking a lot about how the ideal search engine would communicate with webmasters.

There’s a Laurie Anderson song called “The Dream Before” based on a quote by Walter Benjamin. Part of it goes:

History is an angel being blown backwards into the future.
History is a pile of debris,
and the angel wants to go back and fix things, to repair things that have been broken.

But there is a storm blowing from Paradise, and this storm keeps blowing the angel backwards into the future.
And this storm is called Progress.

In the early days, when Google had 200-300 people, there was no way we could do everything we wanted to do. But as Google grows, we get more of a chance to “go back and fix things,” to build the ideal search engine. And part of doing that is having more and better communication with webmasters.

I believe the ideal search engine would help site owners debug and diagnose crawl problems, and the Sitemaps team has made great strides with that in Google’s webmaster console. But I think the ideal search engine would also tell legitimate site owners when they risk not doing well in Google.

For example, I recently saw a small pub in England that had hidden text on its page. That could result in the site being removed from Google, because our users get angry when they click on a search result and discover hidden text–even if the hidden text wasn’t what caused the site to be returned in Google’s results. In this case it was a particular shame, because the hidden text was the menu that the pub offered. That’s exactly the sort of text that a user would like to see on the web site; making the text visible would have made the site more useful.

That’s an example of a legitimate site. On the other hand, if the webspam team detects a spammer that is creating dozens or hundreds of sites with doorway pages followed by a sneaky redirect, there’s no reason that we’d want the spammer to realize that we’d caught those pages. So Google clearly shouldn’t contact every site that is penalized–it would tip off spammers that they’d been caught, and then the spammers would start over and try to be sneakier next time.

The way that we’ve been tackling better communication over the last few months is by testing a program where we try to email some penalized sites that we believe are legitimate. The issue is that it can be hard to contact a site by email: some sites don’t give any way to contact them, and some sites don’t receive/read/respond to the emails that we send. Overall, the experiment has been very successful, but email has definite limitations.

The Webspam team and the Sitemaps team have been working together for several months on a new approach: we are now alerting some sites that they have penalties via the webmaster console in Sitemaps. For example, if you verify your site in Sitemaps and then are penalized by the webspam team for hidden text on your pages, we may explicitly confirm a penalty and offer you a reinclusion request specifically for that site.

I’m really happy about this new way to communicate with webmasters, even though it is a test for now. If the initial results are positive, I wouldn’t be surprised to see us gradually broaden this program.

Here are some questions from a webmaster perspective:

Q: Are you going to show every penalty for a site in the webmaster console?
A: No. Our program to alert webmasters by email has been successful, and this new program is a natural extension of that, but we’re still testing it. We are not confirming every site that is penalized for now, and I don’t expect us to in the future.

Q: I don’t understand why you wouldn’t show every single penalty to every single site owner that asks?
A: Let me give you a couple of examples to illustrate why. First, let’s take an example of a site that we would like to confirm a penalty for. Check out this site:

[Image: A real, legitimate hotel in the UK]

This is a small hotel. They offer 18 bedrooms in Bath, England, for you to rest and relax. It’s a real site for a legitimate business. But notice the hidden text at the bottom of the page where I’ve highlighted in red. This is a perfect example of a site that should be able to find out that their page conflicts with our quality guidelines. Google wants this hotel to know about potential violations of Google’s webmaster quality guidelines on its site.

Now let’s look at an example site that we wouldn’t want to notify if they were penalized:

[Image: A very bad, very spammy site]

From this picture alone, you can see that the site is
– keyword-stuffing
– deliberately including misspellings
– using nonsense or gibberish text, probably auto-generated by a program
– You might also be able to guess from the left-hand side and all the variants of “tax deferred” that there are many other pages like this. You’d be right: the site has thousands of doorway pages.

What you can’t tell from the snapshot is that
– the site owner attempted to gather links by programmatically spamming other sites. Specifically, the site owner found a vulnerable software package on the web that doesn’t yet support the nofollow attribute for untrusted links, and then spammed several good sites trying to get links.
– this site is also cloaking. Search engines get the static page loaded with keywords that you see. Users get a completely different page.
– the pages returned to users employ sneaky redirects. Users get a small page with a JavaScript redirect and also a meta refresh; each page just does a redirect to the root page of this domain.
– Given all this, would it surprise you to find out that when a user finally arrives at the root page, every single link that they are offered is a link that the spammer makes money from?
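To make the “sneaky redirect” item concrete, here is a minimal sketch of how one might spot a meta refresh or a simple JavaScript redirect in a page’s HTML. It is illustrative only: the class name `RedirectSniffer`, the regular expression, and the sample page are assumptions made up for this example, and none of it describes how the webspam team actually detects such pages.

```python
import re
from html.parser import HTMLParser

# Hypothetical pattern: matches simple assignments such as
# window.location = "..." or location.href = '...'.
JS_REDIRECT = re.compile(r"""(?:window\.)?location(?:\.href)?\s*=\s*["']""")


class RedirectSniffer(HTMLParser):
    """Collects signs of client-side redirects in an HTML document."""

    def __init__(self):
        super().__init__()
        self.meta_refresh = []    # content values of <meta http-equiv="refresh">
        self.js_redirect = False  # True once a script assigns to location
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            self.meta_refresh.append(attrs.get("content") or "")
        elif tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script and JS_REDIRECT.search(data):
            self.js_redirect = True


# Made-up page resembling the small redirect page described above.
page = """<html><head>
<meta http-equiv="refresh" content="0; url=/">
<script>window.location = "/";</script>
</head><body></body></html>"""

sniffer = RedirectSniffer()
sniffer.feed(page)
print(sniffer.meta_refresh)  # ['0; url=/']
print(sniffer.js_redirect)   # True
```

A redirect on its own is not spam; plenty of legitimate pages use one. What makes the case above sneaky is the combination with cloaking and doorway pages.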

Needless to say, I’d rather not tip off spammers like this when we find their pages.

I hope these two examples give you some idea of the sites that we’d like to alert (and not alert) to issues with their site. Just to repeat: not every site with a penalty will receive confirmation and the offer of a reinclusion request. But if this program works well, we’ll certainly look for ways to keep improving communication with legitimate site owners while not tipping off spammer sites.

Q: Okay, okay, I understand that not everyone will be notified of penalties, and that it’s a test. What will it look like if I do have a spam penalty?
A: In the webmaster console, once you verify a site, click on the tab labeled “Diagnostic” and one of the page sections is called “Indexing summary.” The specific text will say

No pages from your site are currently included in Google’s index due to violations of the webmaster guidelines. Please review our webmaster guidelines and modify your site so that it meets those guidelines. Once your site meets our guidelines, you can request reinclusion and we’ll evaluate your site. [?]
Submit a reinclusion request

If you find the issue and clean it up, then just click on the “Submit a reinclusion request” link and fill out the form.

(Someone asked me this at a recent conference, so I’m throwing it in.)
Q: I’m the SEO for a client’s site; can I enroll my client’s site in Sitemaps on their behalf?
A: If you have the ability to upload files to the root directory for the client, then yes. Just log into Sitemaps, add the site, and you’ll get a file to upload to the root level of the domain. Multiple people can verify the same site in Sitemaps, so both client@gmail.com and seo@gmail.com could sign up and get Sitemaps stats for a domain, for example.

212 Responses to Notifying webmasters of penalties

  1. Great news. I was just talking last night about how google might have to disclose a little more about how it ranks sites due to the recent lawsuit from the link farmer. (I’ll trust your explanation of why you’re doing this, though, rather than my CYA theory.) Now, if you’d just tell me and others how to get out of the non-existent sandbox that’s been holding my sites hostage for 10 months, I’d be most appreciative!

  2. Great post Matt, very helpful – never had any experience getting my client sites penalized. Will pass on this information to my friends though.

    Your post mostly deals with on-page spam. How do you deal with off-page spam like comment spam, crosslinking lots of domains, etc.? Does the Sitemaps team recognize and notify about this?

  3. Great stuff…the sitemaps team are really working wonders. I’m looking forward to seeing what they come up with next.

  4. Very informative post Matt. Thank You! Sitemaps have been a very useful tool for us and I use it every day. It’s definitely a breath of fresh air to know that our site has no errors!

    Keep up the good work,

    Scott

  5. Hi Matt.
    Thanks for the info.
    I recently bought a domain and it’s unusual that after 3 months none of the pages is indexed, not even if I search for the domain name.
    So I mailed Google with THE QUESTION 🙂
    Is my site banned?
    From what I understand, if in Sitemaps there is no info about that, it means it’s not banned?
    Thanks.
    Pex

  6. C’mon Matt, it’s not hidden text, it’s tinytext!

    🙂

  7. It’s great to see Google doing so much to combat this and also that it is being discriminating and enforcing rules based on the level of intent.

    But I have to say the new SiteMaps design is UGLY. It’d be nice if they made the sludge colour more like GMail’s blue.

  8. Good work. However, since this has been introduced, I’m now seeing this error for some of my sites under Potential indexing problems:

    “We can’t currently access your home page because of a robots.txt restriction.”

    When using the “robots.txt analysis” tool, I click the “Check” button and the result says:

    “Allowed by line 2: Disallow: Detected as a directory; specific files may have different restrictions”

    (I’m pretty sure that it only used to say which line disallowed URLs to be checked…)

    And here’s my robots.txt file which hasn’t been a problem in the past:

    # BOF #

    User-agent: *
    Disallow:

    # EOF #

    Is this a known issue? I assume it’s just the tool that’s broken rather than Google being unable to read robots.txt files correctly…

    (BTW, my other sites’ robots.txt files that actually contain Disallow: entries don’t seem to have a problem.)
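For reference, an empty `Disallow:` line under `User-agent: *` conventionally means “nothing is disallowed.” The sketch below checks the robots.txt file quoted above with Python’s standard `urllib.robotparser`. It only shows how one widely used parser reads the file and says nothing about how Googlebot or the Sitemaps tool parses it; the helper name `allowed` and the example.com URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The file from the comment above: an empty Disallow allows everything.
allow_all = """# BOF #

User-agent: *
Disallow:

# EOF #
"""

# For contrast, a file that blocks the whole site.
block_all = "User-agent: *\nDisallow: /\n"

print(allowed(allow_all, "http://www.example.com/"))  # True
print(allowed(block_all, "http://www.example.com/"))  # False
```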

  9. Wow, that’s great to know Matt. I use sitemaps and thankfully (cross my fingers) have never had a site banned. Now at least I know where to go and what to do if it happens.

  10. Hey Matt,

    Questions regarding hidden text and keyword stuffing.

    I came across a fantasy football site when I accidentally misspelled a term in a search engine. The landing page was this:

    http://www.ffspiral.com/typos.php

    Obviously, there are plenty of keywords there that lure in visitors, but nothing is hidden and the site admits that these terms are all typos. Is something like this okay? If so, I’m adding it to my site! 🙂

    I’d be interested in hearing your thoughts – thanks.

    -Mr. Football

  11. So often it seems (or feels) like optimizing for Google is a “webmaster vs. Google” battle. I’m excited to see Google reaching out to webmasters with tools that help us achieve a common goal. You guys are doing a great job. I’ve really gotten a lot of useful information out of certain Google tools (Analytics, for example) and I’m looking forward to seeing how Sitemaps will help.

  12. Chris Harris, most of the stuff that we’ll be notifying people about is on-page stuff for now–things like hidden text and hidden links. Those are two of the most common mistakes that legit sites make.

    Pex Cornel, if you don’t see a message in Sitemaps, it doesn’t mean that there’s not a penalty. Right now, we’re not telling every site that has a penalty about it. If you bought the site 3 months ago and still aren’t seeing any pages in Google, it’s possible that it does have a penalty. I’d read up more here about how to do a normal reinclusion request.

    Ken, it’s some pretty tiny text. 🙂 Splasho, I’ll pass on the feedback about the colors.

    Tony Ruscoe, I’ll ask someone about that. I know the Sitemaps folks read over here, too.

  13. I have the same problem with the robots.txt file. It allows all bots but I get an error.

  14. >> the ideal search engine would also tell legitimate site owners when they risk not doing well in Google.

    YES it would. Another very helpful post Matt, and I think the sitemaps process is a good idea, though I fear most people who have problems (I’d guess 95% of your ranking complaints) fall into the “subtle algorithm issues” category that is still addressed vaguely or not at all.

    Some transparency is a virtue, even with the algorithmics. I’d guess that there is MORE spam from secretiveness because it creates an elite group of successful algo chasers rather than putting everybody on the same page, competing only on content quality issues.

  15. >> What you can’t tell from the snapshot is that
    >> – Specifically, the site owner found a vulnerable software package on the web that doesn’t yet support the nofollow attribute for untrusted links,

    Hm, does that mean the directory software I installed a long time ago (long before nofollow existed) is causing my website to raise a flag and therefore is not showing up in any serps?

    My site in question offers tons of information + a local business directory (yes, the software is a little old, but it cannot easily be upgraded). If such a directory met the criteria you listed above, would that explain why the whole domain, with all its other content, is punished? If that is the case, let me know and I’ll remove the directory immediately.

    Can you elaborate? Thanks

  16. Hi Matt,

    Good to hear you guys are giving webmasters more and more information!

    I got a question though; I hope it’s not off-topic. Since March 14th (BigDaddy ?) one of my sites has no descriptions left in the SERPs, you only see the title and uri. But it’s still ranking the same. We never use Black-hat stuff, so it can’t be a penalty.

    What is happening here and what should I do?

  17. Matt,

    Is there some quirk that causes a site to completely disappear from a Google search and then return to its place a few moments/hours later? I just freaked when searching for my site and finding it gone from Google. I panicked, sent a reinclusion request and 3 minutes later, my site is back in all the right places.

    PS, thanks for this informative blog. I’m not an SEO expert, just a Realtor with a web dominance dream.

  18. nsusa, what Matt was referring to there is forum or blog software that posts user comments without the nofollow attribute.

    Thus, they created a bot to automatically comment on posts in the forum or blog, and put links to their site in all the comments.

    Because of such bots, I’ve actually shut down some of my older sites. I had one bot alone fill up 10meg of my mysql database / day. It would post 1 comment on every post… every day.

    Your outdated software isn’t what’s throwing up the flag.. it’s people who take advantage of old software and use it to post a ton of links to their site.

  19. Thanks for the advice and for the graphic explanations that come with it. I will be doing a presentation on responsible SEO soon and your examples will speak loudly.
    Regards,

  20. Matt, can you please clarify your statement about Doorway Pages?

    You said: dozens or hundreds of sites with doorway pages followed by a sneaky redirect

    Google Guidelines say: Avoid “doorway” pages created just for search engines, or other “cookie cutter” approaches such as affiliate programs with little or no original content.

    I am wondering if Google recognizes any legitimate use for doorway pages?

    My own example is that we sell widgets across the United States. We know that state and major city names followed by the word widgets and a couple widget related terms are effective key phrases and that the people who search using those key phrases will be interested in our products. We would like to create simple landing pages on our website to help these potential customers more easily find us (no cloaking or redirects or stealth), but we have not done so because we are afraid of incurring a penalty.

  21. – Client’s site (in url) not even showing up for business name.
    – Site several years old.
    – No penalty showing on sitemap.
    – Ranking well on other search engines.
    – Properly cached pages.
    – 1 1/2 years ago it was hit by a 302 link, and none of its pages have shown on searches at Google since.
    – at the time of the 302 link, index page disappeared, and other pages progressively disappeared/decached. Got 302 page stopped, and after a year, pages started to be recached
    – many site reinclusion requests and nothing changed

    There must be a number of sites like this. It’s very frustrating, especially when the SEO is good enough to have pages ranking well on the other search engines.

    What can be done about such situations???

  22. Wow,

    this is the coolest – doesn’t seem to be working yet (for me at least)….

    thanks matt, thanks google

  23. Speaking of disclosure, we should all go to the New Yorker caption contest at: http://www.cartoonbank.com/captioncontest/. Vote on #46.

    This shows Matt that we want more disclosure from Google, but not TOO much. It’s good for a laugh at least.

  24. BTW, Barry Schwartz was live-blogging the Meet the Crawlers session at SES Toronto where Shiva first introduced this to an audience:

    Google now shows you that “no pages from your site is in the Google index” for constitution.org. He goes to the site and shows at the bottom of the page, hidden text – and that is the reason Google shows the message “no pages from your site is in the Google index.” This is pretty big stuff.

    You can see the hidden text at the bottom of the constitution.org home page, including the words “constitutional compliance.” Who would SEO for a phrase like that? 🙂

  25. Michel Leblanc, happy I could give examples. A lot of the time, people don’t realize just how spammy some sites can get.

    T2DMan, was your site in anything like a digital automated link exchange network? Maybe in 2004?

    nsusa, what Ryan said. BTW, Ryan, how did the interview go?

    Greg, I’m guessing you were just hitting different data centers. Different centers can rank things differently.

  26. Cooperation between different parts of Google is really a good thing, and things like this are the result. I’m hoping you’ll work together with the Analytics team next; you overlap a lot.

    A question for the sitemaps team: I have been using Sitemaps for a long time (over 2 months I think) and I’m fully indexed and get crawled quite often. But the “Crawl stats” and “Page analysis” are almost empty of information. Is this some known bug or should I just continue waiting?

  27. I agree with Splasho, the new look is ugly, but progress isn’t always pretty.

    RE: The Tax deff3rd,
    “..every single link that they are offered is a link that the spammer makes money from..”

    Seems more and more of these crappy scraped sites are the results that turn up, glad to see it’s being combatted!

  28. Matt,

    Does this also include the CSS display:none tag as well?

    Stuey

  29. Not really an issue but a little annoying.
    Searching in the help pages I couldn’t find really good information about this statement:

    some pages of the site are partially indexed

    I wonder is this because of exclusion of a directory by robots.txt, while the excluded directory shows up as URI only on a site:domain search? All other pages have description and titles.

  30. Ralf – One of my sites shows the same, but I was unable to find what pages it refers to when I did a site:domain search (all pages show full descriptions and cached copies).

    I’m also getting an error saying that my sitemap is unable to be downloaded because it is “restricted by robots.txt”, but I can pull it up with no problem, and the Sitemaps tool shows no errors.

    Matt, perhaps this is a bug?

  31. Just wanted to quickly post here to mention that we’ve fixed the robots.txt issue and are in the process of refreshing the display of information. So if Sitemaps is incorrectly reporting that robots.txt is blocking your site, you should see an updated status shortly. The Sitemaps blog has more information:
    http://sitemaps.blogspot.com/2006/04/updated-robotstxt-status.html

    Thanks for your patience as we update the display!

  32. Matt, I don’t think the phone call went too well..

    he told me that my work experience was too light for what he was looking for (and i have to agree.. i graduated in Dec 2004 with my BS), but he sent me a worksheet anyway. He was concerned with my lack of python (1 semester in school), and my lax skills at user interface design. (I can critique the heck out of a UI and tell you what stinks and why it should be better… but graphically I can’t create…)

    He recommended I apply for a different opening. Sadly, I wasn’t even sure what position it was before the phone call, as I had sent in a resume and it got forwarded to him.

    I filled it out and emailed it back, and haven’t heard anything from him since.
    I know I screwed up the last question on it though…. it only worked in IE not firefox, and I was running out of time before it was due…

    I guess it’s time to start monitoring the Google jobs website and keep sending in those resumes.

    I have my BS in CS, but i’m going back next fall for my MBA (and possibly a MS in Industrial systems engineering too.. cuz it’s only 6 more classes)

    So maybe that will help. (but if your team has any openings, I can fight spam with the best of em!)

    Thanks..

  33. hi kevin
    not a bug, but a matter of time. I’m sure this will resolve itself, but it shows what may happen if too much time is involved in executing a complex program.

  34. Ralf, I agree… I suppose we just have to be patient (arrrgh!).

    Vanessa – thank you for the update!

  35. Yeah Ralf, I’ve been wondering about that myself.. Trying to think what could cause that..

    I’ve noticed a trend with my partially indexed pages on noslang. doing site:www.noslang.com (and including the excluded results) shows no description for the “partially included” results.

    I’ve also noticed that most of these pages don’t have many (if any) incoming links from other sites. That’s really the only constant I’ve seen on my site.

    It makes sense (sorta), as it’s not entirely natural for people to only link to your front page. However, I can’t see Google rationally penalizing them for that, as in many cases the site owners discourage deep linking to these pages. (it bypasses ads)

    Even more confusing.. these pages show up with no cache or description using the site: operator, but searching for a string of text unique to a page, google shows the page with cache and description.

    I’m guessing that’s what partially indexed means? I’m also guessing that it’s keeping these pages out of results for broad queries..

    But again.. this is only speculation.

  36. Hi Matt,

    I think a better way is to not require people to enter sitemaps initially. There should be a standard query in google or maybe even *part* of the existing “site:” command that gives the user notice as to whether they should take a next step because there are some penalties. The next step could then be to join sitemaps. Not all of us want to do sitemaps for all sites and lots of our clients who we develop sites for don’t want to a) pay us or b) care about sitemaps THEN a 2nd gen owner/company comes in to clean up the mess and is now screwed and probably has no clue.

    A simple query at google for a given site should provide enough information as to appropriate next actions. Such as contacting google or joining sitemaps for that particular site.

    Regardless, I think this is a good step but lacks the simplicity that is required for the millions of webmasters that never see this forum or know any better.

    Cheers!

  37. Ryan, confirmed

    I’ve also noticed that most of these pages don’t have many (if any) incoming links from other sites. That’s really the only constant I’ve seen on my site.

    At first, I thought partially indexed might mean 50% or whatever.
    Sometimes Google’s explanations are very hard to understand; maybe folks at Google have too much IQ and cannot imagine people like us with less brain.

  38. I’m confused by how it can say that my sites are partially indexed. How does it know?

    It must know that pages exist on the site that haven’t been indexed, so why doesn’t it just crawl them?

  39. Matt, kudos to Google for the notification work!

    I’d echo Thomas’ comment regarding near-duplicate pages. In my industry, I need a way to deal with synonyms without being an evil naughty spammer.

    Specifically, I have pages about “wedding registries”, which to a human are exactly = “bridal registries”. To make it tougher, if I have a page that talks all about our WEDDING REGISTRY (singular would be the natural form to use on such a page), I have a hard time getting it to rank well for WEDDING REGISTRIES (the plural is the more common search!).

    Now, I can gen 4 pages for the crawlers, subbing in a variable throughout the text, title, and meta tags:
    {wedding, bridal} X {registry, registries}

    But that’s gonna make Matt very angry [if we get caught!], and we don’t like making Matt angry 🙂

    I realize that solving the singular/plural problem is tough, and the synonym problem is an order of magnitude tougher for you folks, and will necessarily take some time to solve. But realistically, do you have advice for us in the meantime?

  40. Hey Matt, I was just fiddling around in the sitemaps section of Google and noticed that one of the “tools” was a ‘report spam in our index’ link, which goes to a different form and posts to a different location than the standard ‘report a spam result’ page. Is this because you think people who are comfortable enough with showing Google every nook and cranny of their website aren’t likely to be spamming, and are therefore a more reliable source of anti-spam information? Are spam reports received through the sitemaps given more weight/credit/urgency?

  42. Partially indexed means they have seen a link to the page from other sites, and include the URL in the SERPs but have not yet indexed the content of the page(s). There are times when they will decline to index that content, because of something that has happened in the past, or other “bad” indicators about the site.

    **A simple query at google**

    Have you tried:

    site:domain.com
    site:domain.com -inurl:www
    site:www.domain.com

    yet?

    Those three searches can tell you a massive amount about how a site is indexed, show you problems with duplicate content, and show several other things.
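The three searches above follow a fixed pattern, so they are easy to generate for any domain. A minimal sketch, assuming a hypothetical helper named `site_queries` (not part of any Google tool) that simply builds the query strings to paste into the search box:

```python
def site_queries(domain):
    """Build the three diagnostic site: queries for a domain."""
    # Normalize so "www.domain.com" and "domain.com" give the same result.
    bare = domain[4:] if domain.startswith("www.") else domain
    return [
        f"site:{bare}",              # everything indexed for the domain
        f"site:{bare} -inurl:www",   # indexed URLs without "www"
        f"site:www.{bare}",          # indexed URLs on the www host
    ]


for query in site_queries("domain.com"):
    print(query)
```

Comparing the results of the second and third queries is one way to notice the same content indexed under both the www and non-www hostnames.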

  43. Hey Ian, would that be the famous SEO/SEM Ian from Portent?

  44. Michael, if you want to rank for “wedding registry” or “bridal registry” or whatever… I might suggest a complete redesign of your site.

    I know this isn’t what you want to hear, and i’m not trying to be negative, but to a human visitor it appears as if your site is more about wedding travel than wedding registries. In fact, I couldn’t even find any combination of those words right next to each other anywhere on the site, and I didn’t find them at all without the toolbar’s highlight function.

    If you approach it with “i want to build the most useful site for term X” instead of “i want to rank for term X” you’ll do a lot better.

    Side Note: Some of us regulars should start an seo critique site: whydoesntmysiterank.com or something… and offer a free critique every week or so.. I bet it would be pretty successful… and the extra PR (not pagerank), would be pretty good.

    Adam? Harith? Aaron? What say you?

  45. Matt, you accidentally left the h t t p : / / off of the Google Sitemaps link at the top of your blog — you might want to fix that.

    Great news — I’m really impressed by the features you guys are adding onto the Sitemaps service! I hope this trial works out so that you’ll continue to increase transparency (where possible) for webmasters and SEOs.

    Muchas gracias!

  46. Hi Matt,

    Has Google ever thought of publishing a list of domains that have been banned in their index? I made the mistake of picking a domain that was previously owned. After having my code there not get picked up into the index I sent a note to Google as to why it was not indexed – and then I got the form letter back saying it was banned. I had to move to a different domain to get indexed. I thought of doing a new site which would be a repository of domains that were banned which might make it very helpful for people not to make mistakes of picking expired domains which were bad – but it wouldn’t be as current as Google doing this themselves.

    Or maybe Google could re-evaluate old domains to see if the issues on them have been cleared up by new owners? Just a thought.

    Chuck

  47. That’s awesome Matt. I’ve been waiting for something like this for quite some time now. I have a site specifically that had over 15,000 pages indexed and then suddenly it dropped down to 850 indexed, no idea why. I hope this penalty notification will be out soon. I think it’s a great idea 🙂

  48. The site that Mr Cutts took as an example here is http://www.villamagdala.co.uk/

  49. Stuey, if display:none is used to hide text, that can cause issues.

    Ralf, I see that message too. It just means that there can be some uncrawled urls for a site, which in general is not a big deal.

    Vanessa, thanks for the update! Now that’s working together. 🙂

    Ryan, are you still in Michigan? Or are you expecting to be in the Bay Area?

    Glenn Ford, fair feedback. We wanted a way to communicate penalties to a webmaster without showing it to the world, but I see where you’re coming from.

    Michael, we actually do a pretty good job on plurals/singular and synonyms. My advice would be not to make a page for the cross product of all those words, but to make one essay which (naturally) incorporated all of those terms. People fixate on a single page for a single keyword phrase, when one nice essay page could do a good job on bridal+wedding+registry+registries. The title might be a bit awkward: “Bridal Registries: Is each wedding registry created equal?” but doable, and then there’s plenty of room on the page to include each of those words in natural text.

    Matt_Not_Cutts, remember that you sign in to Sitemaps, so we have a little more information than just a web form. I wouldn’t be surprised if spam reports via Sitemaps could be given more weight. Give us a few weeks to hook those reports into our system though.

    Chris Smith, that’s the idea. I’m pretty psyched too. I’ve been wanting to get this out for a while now, and thanks to the Sitemaps folks (with a little bit of assistance from webspam), I love that it’s happening. Thanks for mentioning the h t t p, by the way.

    Rick, I’m going to go ahead and prune that comment. I’d save it for a grab bag thread. No sigs, either please. 🙂

  50. Indexing summary:

    No pages from your site are currently included in Google’s index. Indexing can take time. You may find it helpful to review our information for webmasters and webmaster guidelines. [?]

    Googlebot last successfully accessed your home page on Dec 1 .

    am…… so ……… very………….. tired

  51. Matt, I’m still in Michigan, but have enough vacation time, and would love to see the Bay Area.. I think I could get free airfare too.

    I’ll be in Vegas in July.. that’s closer (lol)

    If there’s a good reason for me to be somewhere, I can get there!

    Why? What’s up?

  52. Adam? Harith? Aaron? What say you?

    On the one hand, it would be a cool feature for Matt’s site.

    On the other, it’s already being done as part of a more comprehensive review over at HEDir to a certain extent.

    So I’m not really sure how I feel about it. I’ll defer to the opinion of the others and Matt. If they’re down, I’m down.

  53. Matt,

    I’m interested to know why the following happens.

    If I type http://www.google.com/webmasters into my browser’s address bar, I get the “Google Information for Webmasters” page. However, if I search google for “sitemaps” and click on the sponsored link that is returned (which has a url claiming to be http://www.google.com/webmasters), my cookie is used to take me to my Google Sitemaps homepage.

    Isn’t this in itself contrary to google’s guidelines?

    Dan.

  54. RE: “Needless to say, I’d rather not tip off spammers like this when we find their pages.”

    I agree Matt. However, the site might also be victim of the spam (just like Google are) due to a shonky SEO “pro”. In these cases, I would love to see Google somehow get the name of the ‘so called’ SEO from them and address the problem at the root? This way, you could kill the spam at the source rather than simply trimming the branches!

    I know that Webmasters are held responsible in these cases but I really believe that is the wrong approach. Besides, a shonky car mechanic WOULD be accountable by law if they did unprofessional work which caused a car accident.

  55. Oooo, I can’t wait for the day an engine tells a webmaster where they are going badly wrong… with so much BS online about this and that, you kinda end up hating search engines and forget about using them altogether. I’m surprised there isn’t a better DANGER list for the Google engine. At least it would help folks set up their sites better. Maybe a Google bot helper tells you off for getting certain things wrong…. might be worthwhile hehe

    Dynamic URLs
    Keyword stuffing
    Over anchor text pages
    Excessive keyword links and predominance
    Threshold limits per content format
    Nav placement
    Table structures
    Priority of tags
    Anchor variables
    Landing page inbound and foundation keyword matching
    The list goes on
    etc etc
    damn maths

    blah blah If only there was a more detailed rule that all webmasters should know about instead of Google’s plain Jane approach.. it would mean many webmasters would get their crap together. One of my sites was dropped yet is still doing well in Yahoo and MSN…. funny thing is the pages are generated yet very accurate to the keywords and still provide info and the right products to the end user ….yet it was dropped :0(

    help me Obi-Wan, you’re my only hope

    Win a prize and realise the mistake I’ve made with this site (ok maybe not a prize) but at least some respect for noticing

    http://www.products-directory.co.uk/

  56. Matt/Vanessa,

    1. Does it ever make sense to do a proactive re-inclusion request to signal to Google that we play by the rules and we are a real person?

    2. I have a site that Sitemaps says “Googlebot last successfully accessed your home page on Apr 18” but according to the Apache log data, he has hit that page dozens of times since. And yes, I did a reverse-IP lookup to grep out the legit GoogleBot versus the User-Agent spoofers.

    3. Minor nit – bottom of Sitemaps Blog has copyright 2005.

    4. Consider allowing comments on Sitemaps blog rather than pushing comments into Google Groups. The latter is good for general discussion, but it would be nice to have relevant comments attached to the posts.

    Took a little getting used to the new Sitemaps interface, but I like it. Been fun watching Sitemaps develop – nice work Vanessa (saw you chime in above) and other Googlers.

    alek
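The reverse-lookup check alek mentions in point 2 can be sketched roughly as follows: resolve the IP to a hostname, check the hostname's domain, then resolve that hostname forward and confirm it maps back to the same IP. This is only a minimal sketch and not anyone's actual log-filtering script; the function name `is_verified_googlebot`, the injectable lookup parameters, and the accepted domain suffixes are assumptions made for illustration.

```python
import socket

# Assumed set of domains a genuine Googlebot hostname ends with.
GOOGLEBOT_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` passes a reverse + forward DNS check.

    The lookup functions are injectable (hypothetical design choice) so the
    logic can be exercised without touching the network.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        host = reverse_lookup(ip)          # step 1: IP -> hostname
    except OSError:
        return False

    # Step 2: a spoofed User-Agent will not have a matching hostname.
    if not host.lower().rstrip(".").endswith(GOOGLEBOT_SUFFIXES):
        return False

    try:
        return ip in forward_lookup(host)  # step 3: hostname -> same IP?
    except OSError:
        return False
```

In practice this would be run over the distinct client IPs in the access log whose User-Agent claims to be Googlebot, caching results so each IP is resolved only once.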

  57. Google requests are a waste of space…..

    Tried so many times and never got a reply, not that I blame them…. I mean there must be what, 25 million webmasters…. say they get 25000 emails a day…. geez I hate to think of how much of a headache that must be lol

    Anyways that’s my 2 cents and email requests can hurt

  58. I normally would not do this here and Matt, please feel free to delete this post if I have crossed the line.

    Since Ryan said: Side Note: Some of us regulars should start an seo critique site: whydoesntmysiterank.com or something… and offer a free critique every week or so.. I bet it would be pretty successful… and the extra PR (not pagerank), would be pretty good.

    I invite everyone to visit
    http://www.SEOcritique.com/forums

    It’s not fancy or anything cause I just bought the domain and set it up.

  59. Matt, I have a question about “display:none”
    I have a bug in my site; I use some background images to make it pretty.
    The thing is that those images are downloaded last, so my page looks awful until the browser puts those images in place.
    So I have to preload those images first, and I do it with this line:

    So it’s no text, just images with display:none. Images that would show up anyway, not really hidden.

    So by my logic, no ground for a penalty here.
    Am I right?

    Thanks.

  60. Somehow my line of code was filtered, sorry, here is the line again:
    <img src="header_mainpage_1.jpg" style="display:none;" alt=""/>
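The distinction drawn here (a hidden image versus hidden text) can be made concrete with a small self-audit script. The sketch below is a rough illustration only: it says nothing about how Google actually evaluates pages, it only looks at inline `style` attributes (not external CSS or scripts), and the names `HiddenTextFinder` and `find_hidden_text` are made up for this example.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they cannot contain text.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class HiddenTextFinder(HTMLParser):
    """Collect text that sits inside an element with inline display:none."""

    def __init__(self):
        super().__init__()
        self.stack = []        # one bool per open element: hidden or not
        self.hidden_text = []  # text fragments found inside hidden elements

    @staticmethod
    def _is_hidden(attrs):
        style = dict(attrs).get("style") or ""
        return "display:none" in style.replace(" ", "").lower()

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        inherited = bool(self.stack and self.stack[-1])
        self.stack.append(inherited or self._is_hidden(attrs))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and self.stack[-1] and data.strip():
            self.hidden_text.append(data.strip())


def find_hidden_text(html):
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text
```

Run against the preload line from this comment, the function reports nothing, because an `img` element carries no text; run against a `div` with `display:none` wrapping a paragraph, it returns that paragraph's text for a human to review.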

  61. Hi Matt,

    You are simply admitting that Google’s search algorithms are broken! Google’s situation has deteriorated to the one AltaVista was in when Google entered the market, it is as simple as that. Google stock is in the honeymoon phase, else the GOOG price would have fallen by a few points overnight!

    At AltaVista’s peak in 1998, there were only spammy commercial sites that were trying to gain an edge over competitors using keyword stuffing in the meta tags; keyword stuffing in the content was pretty low at that time. Non-commercial content sites had a pretty good standard, people were publishing content to express themselves. Successful banner advertisement sites were popular sites generating traffic from “bookmark” repeat visitors. This has not changed much even today; a regular net user visits about 5-6 sites daily, that’s it.

    The introduction of Adsense suddenly changed the scenario. Today most nonsense crap information is found on sites running Adsense; the only purpose of those crap sites is to run Adsense and nothing else, and Google is squarely responsible for the debacle.

    If Google wants to improve the quality of content on the net, Google should review sites running Adsense and ban crap sites. If scraper / crap content sites are banned by the Adsense program, most of those crap sites will disappear, improving the quality of content available on the internet!

    Proves my point once again “no evil = no profits” “big evil = big profits”!

  62. Hi Matt,

    I’m a human bot and I’m notifying you of a potential problem.

    Potential indexing problems: One of the pages I tried to crawl returned a HTTP error. In particular, while attempting to follow your link to “Sign up for Sitemaps”, a http error code 404 was returned (it’s just missing the http://).

    You may want to take a closer look at: the hyperlink.

    Feel free to file a reinclusion request with this human bot with the subject line “convoluted requirements april 2006 update secret password abc exactly as is or the mail rule will send you to a blackhole”.

    Charles

  63. Thank you for the article explaining Sitemaps further. At first, I thought it was not necessary to submit because I thought my site was built in a spider friendly manner. But after I submitted to Sitemaps, I found out a lot more about my site and was able to track down some bottlenecks and broken links through the HTTP errors page. The Diagnostic area has some great tools.

    IMO, Google would generate A LOT of goodwill by notifying webmasters of potential penalties.

    Thank you.

  64. The IP address for Blogspot can’t be visited from China.
    So I can’t see the complete post about these penalties. 🙁

  65. I definitely will try this.
    I am working now with a website that has 2 versions of its homepage in Google (with and without www.). The problem is that the non-www version was indexed on 5 July 2005. I redirected non-www to www, but I’m not sure if Googlebot will ever see the redirect, as Google hasn’t reindexed this page in almost a year.

    Will the system notify us about this kind of problem?
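For readers unsure what a non-www to www redirect looks like in practice, here is a minimal sketch as a Python WSGI wrapper. It assumes the site is served by a WSGI application and that `www.example.com` is the preferred host; both are placeholders, and `canonical_host_redirect` is a name invented for this example. Most sites would do the same thing in their web server configuration instead.

```python
def canonical_host_redirect(app, canonical_host="www.example.com"):
    """Wrap a WSGI app so requests for any other host get a 301 to canonical_host."""

    def middleware(environ, start_response):
        host = environ.get("HTTP_HOST", "")
        if host and host != canonical_host:
            # Preserve the path and query string so deep links keep working.
            location = "http://" + canonical_host + environ.get("PATH_INFO", "/")
            query = environ.get("QUERY_STRING")
            if query:
                location += "?" + query
            # 301 marks the move as permanent, unlike a temporary 302.
            start_response(
                "301 Moved Permanently",
                [("Location", location), ("Content-Length", "0")],
            )
            return [b""]
        return app(environ, start_response)

    return middleware
```

The point of using a permanent (301) status rather than a temporary one is to signal that the www version is the one that should be kept.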

  66. >2. I have a site that Sitemaps says “Googlebot last successfully accessed your
    >home page on Apr 18″ but according to the Apache log data, he has hit that
    >page dozens of times since.

    Same here. The bot visited us many times since Apr 18. We also do a reverse lookup on every GoogleBot visit so we know it’s the real one. I believe the timestamp is the date when your title was last updated in the index and not when a particular site was last visited. We recently experimented with the title-tag on our main page (http://www.mysite.tld/default.asp) and it was around or at Apr 18 when the new site main title was taken into the index. Not the site description though.

    Oh and a side note: as there are more and more problems with updating DMOZ site-descriptions wouldn’t it be more reasonable to drop the syncing from their site-descriptions? DMOZ has become a bad joke in recent years… we just can’t get rid of that very ugly spelling mistake in our site description. Nobody cares to update it 🙁

  67. Thanks for the Update on Sitemaps Matt.

    I gave a presentation on some Google products to my bosses yesterday, one of which was Sitemaps before it was updated. It’s great to see Sitemaps and other G products being updated after release.

  68. It is a nice thing if Google has plans to do this. I am saying this ‘in general’ as well as, of course, for my own gains as well. I see sites being penalized and then taken back in the index fairly soon. My site has been out of Google’s index for more than one year now. It is a lifestyle-related content site and I cannot figure out what went wrong. It could have been the wrong type of redirection but that has been fixed a few months ago and there have been two updates since that.

    I have written to Google several times but what seems weird is that I did not even get a single answer trying to help me what the problem actually is. And this for over one year! Let’s hope this new drive on Google’s part brings something nice for me.

  69. > Stuey, if display:none is used to hide text, that can cause issues.

    Sometimes it makes sense not to show the complete text for some parts of a page, but to allow users to toggle/expand the parts they’re interested in. Obviously the extended version of the text has to be hidden at first, usually via display:none. How does Google handle this?

  70. Question about duplicate content

    1- Let’s say I write a lot of excellent stuff on a blog. My stuff is so interesting that a prominent news portal (with a high PR) reprints, with my permission, the content of a post in its entirety, one day later. They do not link to the post itself but they give me credit for the content in the form of my name and business information without links. This is very good for my credibility, but does it hurt my blog and get considered duplicate content, even though I am the rightful owner of the rights?
    2- Let’s say I wrote several articles at my previous job, which posted my articles in the form of Web pages. I do not work there anymore but I keep all my rights to my content. Now I build a new web site and reprint in it my rightful content that already appeared on the previous web site that I have nothing to do with anymore. In a way we both share some rights to that content but I am the rightful author. Will that hurt the new Web site? How to legitimately keep the rights and post that content without being penalized?

  71. Great news, I am actually quite happy about this.

    I have multiple shareware products, a site mapper among
    them, which share parts of the same help file
    (e.g. localization — i.e. how users can translate the program).

    As I make my help files available on the website,
    this has caused me some concern about duplicate content.

    If one is “minor” penalized, it is only the page itself
    (of which there is a “duplicate”) that gets penalized, right?

  72. Excellent concept from Google, I’m sure they’ll continue to improve these features as we move along in time. While some people feel Search is an old industry, it’s still quite new, and it’s nice to see innovation is still coming through.

    I think this will be a great resource for legitimate sites that may not be aware of their actions, or, aren’t aware their actions are in poor taste.

    As a web guy, I think my clients will love to see this, although, from what I’m aware, none of my clients are using techniques that should get them banned. I think I’m better than that. I hope.

    Another great bit of info Matt. Thanks.

  73. “Ralf, I see that message too. It just means that there can be some uncrawled urls for a site, which in general is not a big deal.”

    Hey Matt,

    when you say it´s not a big deal, does it mean that 192 out of 5800 pages are indexed and the others are uncrawled? I would say it´s a very big deal. Or can we look forward to having these pages crawled and included back into the main index?

    Maybe you can pass this to Vanessa ( sitemaps team ). There is an error on that page
    http://www.google.com/support/webmasters/bin/answer.py?answer=34480&hl=de
    The link does not show up. (looks like hidden link 😉 )

    Thanx Martin

  74. Hey Matt – that is really cool, and thanks for the information! KUDOS to the Sitemap team for helping us out 🙂

  75. I just couldn’t let this line go un-responded-to:

    “even though it is a test for now”

    um … as opposed to other items at Google that appear to be in permanent beta?

    Sorry, couldn’t resist. I’m excited about the communication, though. I hope G keeps this program, and never stops making helpful use of it.

  76. hey, permanent beta is great for a developer.. that means anytime somebody finds a bug you can say “Duh, it’s beta!” or anytime they recommend a cool feature you didn’t think of you can say “yeah that’s coming in the next version”

    also.. you can quickly take it away too.. and say “it was beta…”

    although, that would be lazy and irresponsible.

  77. Ryan, thanks for taking the time to look at my site! Actually, we rank pretty nicely for the registry-related phrases (we’re #1 for [honeymoon registries])…but it made for an easily understood example. Travel terms are what we’re really focusing on at present, which is why you see those all over.

    Matt, we do have pages within the site specifically mixing those words, as you’ve suggested. In fact, we used to (gasp) cloak our home page and add a huge section of text and headings with all those combinations in it (but we’re better behaved now).

    But those pages on our site which are “registry”-dense don’t end up being shown in the search results…interesting! My guess is that anchor text in IBLs is vastly outweighing the page content itself (we do have a lot of IBLs from partner companies with “registry” in the anchor text). Especially since our home page is what shows up in these SERPs and our home page is what our partners are generally linking to.

    MC

  78. hmm. I saw the member’s post about the “site critique” new forum. Is this something new to search engine forums? My understanding is that there are already many se forums out there who already offer “free site critiques”.

    Am I mistaken about this? 🙂

    This is real good stuff Matt. Just toe the line about “which” sites you actually notify. Some of them might be being helped by a SEO anyway.

    A good suggestion by someone above; Website owners have to be responsible for who they hire to help them, but identifying the actual firm/SEO who may be spamming on their behalf would go a very long way to cleaning up our industry.

  79. Nice one!
    Google listened to my idea and turned Sitemaps into “the” communication channel with webmasters.
    I posted that idea here in a comment. If only I could find it….

    Maybe I should try to get a job @ Google
    🙂

  80. Matt,

    Thanks for updating us on the sitemaps/spam collaboration, I think it’s great! Quick question though… I have a site that’s #1 in allintext, allinanchor, allintitle for the business name, but doesn’t show up at all when searching for it. The site is about 15 months old, and sites that scrape content from other search engines for queries that my site does show up on, and therefore have the title of my homepage in the text of their site, show up frequently. When I wrote Google help asking about it, they replied that since the site showed up when they did a search on the domain, that I could “be assured that your site isn’t penalized or banned from our search results”.

    Both that reply and what you said earlier seemed to imply that penalized = banned.

    1) Is banning the only penalty?
    2) If not, would penalties be something that you would notify a webmaster about?

    Thanks! 🙂

    -Michael

  81. hara_kiri_diy_seo

    well done google

    i remember the panic stricken words of one webmaster who wants, like me, to be pro-google anti spamdex

    “I’d happily fix the problem if only i know what it was”

    on a moral level, leaving people in the dark breeds fear of unjust punishment ….& risks engendering a user migration from google.

    on a practical level, google has a very difficult job balancing it all up

    on real terms, there’s also the reality that rank drops occur because of bad diy seo & not anything to do with penalties….. getting to read sitemaps well + google analytics can help here

    so pleased about this

    any more thoughts about my previous proposal of an up-to-date keyword rank checker on sitemaps?

    best

  82. best post this year Matt

    The partially indexed status is strange. I have an SEO client (an entertainment booking agency) that, post Big Daddy, has lost some of its key pages in the index. Google is indexing the contact us form, but keyword.php (where keyword is in the domain) has just disappeared, and the client is losing revenue and has had to lay staff off.

    I’ve tried renaming it with a more specific keyword and redirecting the old page.

    Given x pages, how do I hint to Google that one page is more important than another? I’ll try setting all pages except the key ones to 0.1 in the sitemap and see if that works, but some guidance would be great.
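The experiment described here (key pages high, everything else at 0.1) comes down to the optional `priority` element in a sitemap file. The following is a minimal sketch of generating such a file; the URLs are placeholders, `build_sitemap` is a hypothetical helper, and the sitemaps.org 0.9 namespace is assumed (an older Google-specific schema may have applied when this comment was written). Under the sitemap protocol, `priority` ranges from 0.0 to 1.0 and is only a hint about pages relative to each other on the same site, so there is no guarantee it changes what gets indexed.

```python
import xml.etree.ElementTree as ET

# Assumed namespace: the sitemaps.org 0.9 protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, priority) pairs; priority must be 0.0-1.0."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, priority in pages:
        if not 0.0 <= priority <= 1.0:
            raise ValueError(f"priority out of range for {loc}: {priority}")
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs: the key landing page high, supporting pages low.
pages = [
    ("http://www.example.com/keyword.php", 1.0),
    ("http://www.example.com/contact.php", 0.1),
]
print(build_sitemap(pages))
```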

  83. See what happens Matt, with all these changes…
    I think it’s really sad.
    Businesses rely on Google. We know “it’s free”. But if there were no Google, there would be another search engine.
    All these changes put everyone in “changing mode”.
    Without sufficient information, most of the changes made by SEOs are often wrong.
    I don’t want to criticise, Google does wonderful things, but my opinion is that lately Google has taken the position of a bully.
    Let’s wait and see…

  84. Matt this is an excellent idea that will hopefully assist some SEOs in determining when they are barking up the wrong tree.
    I want to repeat the previous question: is there a difference between being banned and being penalized?

    Then I was wondering. Google must be swamped with reinclusion requests for websites that have no penalties, but simply aren’t good enough.

  85. Good move Matt. This will save hundreds of man-hours and thousands of dollars of lost revenue for legitimate web sites.

    The first compelling reason to use google sitemaps.

    Paul

  86. Hi Matt and all,

    We hear about hidden text often referred to as CSS hidden text, but what about the use of the noscript tag? We use this on the home page and sitemap page in case anyone wants to use the site, or contact us for an order that cannot be placed using our JavaScript shopping cart. It contains our postal address and phone number and a description of the general theme of the shop. Is this noscript tag something that causes penalties? Is this valid or not…anyone?

    cheers
    Jonny C
    UK

  87. Great new features to see, Matt — love what’s coming out of the sitemaps team.

  88. “Website owners have to be responsible for who they hire to help them, but identifying the actual firm/SEO who may be spamming on their behalf would go a very long way to cleaning up our industry. ”

    My site was banned. Doug Heil did the SEO.

    Still think this is a good idea Doug?

  89. T2DMAN

    *Exact* same scenario here, for a company I used to work for. Their site was riding very high – then one day it went bang.

    Never seen since regardless of reinclusion requests – all the same symptoms as you say.

    I no longer work for the company, but it’s been bugging me forever.

  90. The link to “Sitemaps” in the first paragraph of this article needs a http:// in the href.

  91. RE: “My site was banned. Doug Heil did the SEO”

    That’s a pretty wild allegation. Care to back it up with some proof or at least a shred of evidence?

  92. LOL Good one maroon. Care to share who you are, and care to share some proof?

    I thought so.

    It is good to know I’m so famous. Thanks for the chuckle.

  93. Hi Matt,

    I’m wondering how Google contacts webmasters about penalties…because my sitemap area doesn’t show anything wrong and I’m still listed in the natural searches. However, I have been penalized somehow, because for some of the keywords where I was naturally #1, I have been removed completely. I have emailed Google to ask for their help and been sent what seems to be a canned answer, not telling me I’ve been penalized but telling me to read the quality guidelines, comply and notify them. I’m confused as I don’t see anything that is in non-compliance. The email was vague so I’ve assumed I’ve been penalized although I was not told so directly. I discovered that a few domains I had parked on my domain had been spidered as a separate domain. Sysadmin fixed with 301 redirects to make sure we didn’t appear to have “duplicate content on different domains”. Then one webmaster thought the digitalpoint’s coop links on my site could be the problem. I’ve removed that as well. I still don’t know if I figured out what it is I’ve done wrong….or if this is a glitch of Big Daddy so that I lost over 200,000 indexed pages. My frustration is with Google not telling me that I was penalized (if I am!) …and what it is I did wrong……when I really want to do this right.

    Was I penalized for digitalpoint’s coop? Or was I penalized for the spider indexing parked domains?

  94. Dave and Doug, your responses had me roflmao. Too bad that neither of you two got the point (and the joke), but that happens when you take yourselves so seriously fighting for truth, justice and what you deem to be spam. At least you both helped make my point.

    The problem with outing an SEO (doug’s idea that I quoted) is that anyone can make an allegation without a shred of proof, or at the very least make it a “he said, she said” scenario. Framing an SEO would be quite easy.

    for the record, no animals were hurt in the typing of this post, no sites of mine are banned, nor is Doug responsible for the SEO of anything that I am associated with.

  95. RE: “The problem with outing an SEO (doug’s idea that I quoted) is that anyone can make an allegation without a shred of proof, or at the very least make it a “he said, she said” scenario. Framing an SEO would be quite easy.”

    While anyone can make allegations, it would require proof (payment confirmation etc) that the spam was indeed done by them. I would have thought those sort of details were obvious………well, to some at least 🙂

  96. What like this:

    Hide a link in your tracking code in where few know that the is rubbish and you create a PR10 in no time – while the offense is on everyone elses website.

    And they blatantly advertise the “HIDDEN PART”.

  97. Dave, you actually think Google wants to play “People’s Court”?

    As it is they drop pages for dupe content even though they can’t determine which page is the original.

  98. [quote]
    T2DMan, was your site in anything like an digital automated link exchange network? Maybe in 2004?
    [/quote]

    Matt, are you saying we should sign our competitors up for automated links networks? No, of course you’re not. But what I mean is this is starting to scare me a little. If google penalises people for certain types of links, or too many links too quickly then any webmaster can damage another site.

  99. Hi Matt,

    That’s going to be VERY helpful for those who don’t know why their sites get penalized / delisted from the index.

    But, instead of email notifications, you might as well create an API, so we can programmatically find out whether a particular site is penalized / banned or not. Just an idea!

    Anyway, this change is awesome as it is, too. 🙂

    SEOJunkie aka Sufyaaan

  100. Hello,
    Like A Cutlett noticed – if there is some website I don’t like – I should sign this site up for an automated links network. And I should buy some domains and create some content. When this content is in the Google index (about 20k sites similar to the site I don’t like), I just create a redirect (JavaScript or headers) to the domain I don’t like.

    Then I only have to send a spam report and…, this site will be banned?

    Hmm, that’s a great idea, but some of my clients’ sites were kicked out of the Google index that way ;/

    Somebody did something like the above, and it’s a clear way to get somebody’s site banned ;/

  101. TOP DOWN NOT BOTTOM UP.

    RE: “Dave, you actually think Google wants to play “People’s Court”? ”

    Yes, if it means a big drop in SE spam. Let’s say 5 site owners all quote SEO “A” as being the one who built all their doorway pages and cloaking. In my mind that WOULD warrant time from a Google employee to do some detective work. If all evidence states they are spamming, ban them as they did with TP.

    IMO, until Google tackle this problem at the root the weeds (SE spam) will keep coming back.

  102. Google wont be playing Judge Judy just because you want them to. It is too easy to set someone (even you) up for a fall. Google gets little ROI out of that and it can only backfire.

  103. Great entry Matt.

    I think the problem is that webmasters when they first get involved in promoting their website online don’t understand how to optimise their website, so they assume by putting hidden text into the site will get them ranking higher in the search engines.

    It’s only when you start to take the time to read up on how to optimise your site correctly that you realise hidden text is a definite no-no. I have a friend who runs a UK chat room, and I said to him, include the keywords chat room more in your content on the page.

    Two days later I found that he had done that, but had hidden the text rather than write useful content on the page that included the keywords. Suffice to say he has now removed it and rewritten the text, but it was interesting that his mind told him to hide the text rather than display it!

    More needs to be done to educate business owners when they start to delve into the internet – and hosting companies can do something about this by writing articles or information on its site – very few hosting companies provide SEO information, which is the first point of contact for many webmasters.

  104. Psst! Matt, the link to Sitemaps in the original post appears to be mislinking to somewhere nonexistent deep within your blog. Seems you forgot to use http: in front of the URL to Google Sitemaps page, and WordPress treated it as relative, instead of absolute.

    Did no one else notice that??

    Also signing up for Sitemaps. 🙂
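The behaviour described in this comment is standard relative-URL resolution: without a scheme, an `href` is resolved against the current page's address. It can be reproduced with Python's `urllib.parse.urljoin`. The page address below is a made-up placeholder (the blog's real permalink is not given in this thread), and the target is only illustrative.

```python
from urllib.parse import urljoin

# Hypothetical address of the page that contains the link.
PAGE_URL = "http://blog.example.com/posts/penalties/"


def resolve_href(href, page_url=PAGE_URL):
    """Resolve an href the way a browser would, relative to the page it is on."""
    return urljoin(page_url, href)


# Missing scheme: treated as a relative path under the current page.
print(resolve_href("www.google.com/webmasters"))
# -> http://blog.example.com/posts/penalties/www.google.com/webmasters

# With the scheme: treated as an absolute URL and left alone.
print(resolve_href("http://www.google.com/webmasters"))
# -> http://www.google.com/webmasters
```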

  105. RE: No pages from your site are currently included in Google’s index due to violations of the webmaster guidelines. Please review our webmaster guidelines and modify your site so that it meets those guidelines. Once your site meets our guidelines, you can request reinclusion and we’ll evaluate your site. [?]
    Submit a reinclusion request

    If you find the issue and clean it up, then just click on the “Submit a reinclusion request” and fill out the form.

    Example: A client recently loaded Javascript tracking code to the website. On review, the no-script area which contains a transparent gif to action a browser load if Javascript is disabled was wrapped with a link element to the tracking script owner’s website.

    By general definition this would be a hidden link on every web page.

    With GSiteMaps loaded that could be considered spam, potential delisting, and the fact that the general public would intentionally leave a hidden link on every page [because it is suggested a functional part of the tracking script] is a little problematic…

    Google’s webmaster quality guidelines only suggest “Avoid hidden text or hidden links” but this particular ‘hidden link” is “tracking” they wouldn’t normally consider – “oh I need to remove tracking because that is the violation”.

    Is this an issue?

    Would it be worth adding a reference to Google’s webmaster quality guidelines – if it is?

  106. Well, as I am kinda curious,
    I will rephrase my original question 🙂

    If I have multiple similar pages
    (see my original comment as to why
    — in my case it has nothing to do with singular/plural)
    — and Google decides they are similar, what happens:

    a) all “similar” pages penalized
    b) all “non-first” “similar” pages penalized
    c) depends on how “similar” and “site trust”
    d) x = random(…), case x of …
    ?

    Personally I am guessing c or d 🙂

  107. I don’t see any penalties in my sitemap overview for any of my sites, but it seems that my site keeps getting delisted everyday, moving from 950, to 850, now to 722 results for my domain, when I have clearly over thousands of pages on my domain. Any idea what’s next to why this is happening?

  108. Matt – I am very interested in the Google Sitemaps Spam feature. Last week we dropped off the map for what seems like every keyword that was of importance to us. We no longer even show up for our company name which we previously had #1 & #2 positions for. Our site has been around for years and we don’t get involved with unethical practices although we recently discovered one of our competitors was hitting our Adwords account pretty hard with click fraud. I also noticed that he took our meta description and keywords from our homepage and is now using it on his homepage. I am curious to know if it is possible that our competitor somehow got us penalized (submitting to a link farm etc). We used to have over 700 pages in Google’s index and in a week’s time that has dropped to under 200.

    Any help or advice would be greatly appreciated!!

  109. RE “Google gets little ROI out of that and it can only backfire”

    “little ROI” on outing SE spam sources??? I would say the ROI is HUGE. No problem is ever resolved by treating the symptoms (spammy SEO customers) rather than the disease (spammy SEO companies). I would think Google already has such a tactic in the pipeline.

    No need to reply What a Maroon I know your retort already 🙂

  110. To those asking question about violating Google guidelines with no ill intent, I don’t think they can measure “intent”. Something to keep in mind. IMO, it’s never worth the risk.

  111. hi matt nice listening but sounds gibberish 🙂 one silly question :P, if all this is about the on-page, then what does Googlebot read: the code of the page or the rendered text in the browser? If code, then why can’t your algo check the repetition and stuffing of words? If the algo can filter this then we can get more accurate results.

    Cheers

  112. Dear Matt,

    just wanna say a big “thank you”.

    I have been a googler since the very beginning and am very much satisfied with all the developments Google has made.

    I am satisfied as well with the communication between Google and users / webmasters / site owners; it definitely makes Google the ultimate search engine.

  113. Hi Matt

    Sorry – I’ve been out of town in Southern New Zealand contemplating the Antarctic – frozen away from the internet briefly, and missed out on this wonderful development while away.

    It looks like a great quality control step to communicate with webmasters committed to compliance with Google, but as I can see from the above posts there are likely to be many legit questions or interpretations based on G’s guidelines left unanswered [ I’m not criticising – just observing ]. I guess this is why it’s a trial.

    It kinda worries me [ but please keep it going 🙂 ], because some of the guidelines provide us with borderline interpretations. An emphatic yes/no on certain requests would be appreciated, particularly where other sites are adopting similar principles and are not apparently penalized, which we can refer to.

    The other aspect of this quality control step is that legit webmasters can focus on reporting facts rather than BS. That’s gotta be good for all.

    On WMW and somewhere on your blogs several webmasters, including myself, indicated a willingness to pay for quality control feedback. I figured this breathed some further sincerity into the process for more in depth communications with Google. Any further thoughts on this one?

  114. Matt, I could have written Noah’s entry above. Ditto for my site and no response from Google with the exception of canned emails treating me as though I broke quality guidelines.

    My site: command has left me from over 200,000 indexed pages to 700-800 (fluctuates by the moment) with another odd behavior: The title and description of my site is being pulled from DMOZ instead of the meta tags and title area from my pages. What’s up with that?

    Am I being penalized for participating in digitalpoint’s link co-op? If not, what’s going on and why can’t Google tell me (us) what is going on so we can fix this?

    Dropping in indexed pages is hard enough but being completely left out of search results…affected our traffic 8%. (Thankfully 78% of our traffic does not come from SE, and only 15% came from Google. Now its down to 7% and dropping…)

  115. Hey Noah,

    You may be having a technical issue somewhere. It took me three tries to get your site to load from here, and the third try took approx. 45 seconds before i saw anything. Keep in mind that I’m on a cable modem, so it’s not a “I’m sitting and waiting for dialup to stop being dialup” issue.

    It could be the design of your site, or it could be your host. I’m not really sure which it is. I’m sensing a hosting issue.

  116. To all of you complaining about thousands (or hundreds of thousands) of pages being dropped from the Google index, here’s my take:

    A number of bugs were introduced with the roll-out of BD. The “missing pages” problem has been there since the beginning, but got lost in the melee that was the “supplemental issue” and the inadequate crawl rate issue. The first two problems have been largely addressed, but the most important problem/bug is still very much there.

    From my analysis it is clear that Google have introduced some kind of index “pruning” mechanism. The intention of this pruning process is to remove dead links – (URLs) with no links pointing to them – from the index. Such a feature is, of course, long overdue. However, there is clearly a serious bug in this new pruning process that is making it behave far too aggressively. In many cases the pruner is removing 95% of a Website’s pages, when it shouldn’t be removing any.

    In my case, for example, any page that I link from the Home page (PR5) goes straight into the index within 24 hours or so. If I then remove the link to that page from the Home page, that page is deleted from the index within a few days.

    Any pages linked from deeper pages (PR4, for example) are crawled relentlessly but never appear in the index – because the pruner deletes or blocks them.

    You would have thought that someone at Google would have put 2 and 2 together by now and taken a close look at this new pruning mechanism. Seems like an obvious candidate for the cause of the millions of missing pages.

  117. Matt:

    Great to hear that Google is becoming more proactive in notifying legitimate sites about problems. However, this still remains a one-way street . I would like to suggest a way to complete the trip, making it two-way communication.

    How about allowing legitimate site owners to petition Google why their site has plummeted in the rankings for an extended period of time. I am not talking about the whiners who complain about everything, nor am I talking about short term deterioration in the SERPS.

    I am speaking about site owners who formerly were on the first page of the SERPS for at least six months who have fallen off that pedestal for at least three months. Perhaps something was done outside of their knowledge that resulted in a penalty.

    That way the legitimate site owners have a solid forum to get the problem fixed. Anyone that has gone through this agony for more than 3 months deserves feedback from Google. As you can guess, my site is still going through this agony.

    Thanks again for enlightening us. Keep the window shade open.

  118. TheInsider, interesting take! Add to that the fact that Google is still caching and indexing old pages which have been 404’ing for a year or longer, and you see Google remove pages which people want in the index, and keeping pages which they want removed. LOL. (or Ooops?)

    Somebody should take the caffeine supply away from the sitemaps team, they’re clearly working much too fast and doing too many great new things! I can’t imagine the bribes they must use to get past the old politics of “never show the webmaster your cards” :D. Keep it up!! (How about a feature where you can specify which rank you want to have for which of your keywords? ha ha, just kidding / dreaming 🙂 )

  119. How about allowing legitimate site owners to petition Google why their site has plummeted in the rankings for an extended period of time. I am not talking about the whiners who complain about everything, nor am I talking about short term deterioration in the SERPS.

    While the intention is quite good here, there are two problems with that theory

    1) For every person who may have a legitimate beef with Google due to improper indexing, penalization for something that may have been okay before, etc. and so on (and I’m sure there are legit cases), there are going to be 1000s of people who complain because they lost SERPs for some putrid pile of monkey crap that never deserved to be there in the first place, or people who can’t get ranked and think they should. If Google devotes resources and time to dealing with those idiots, that’s time taken away from improving the engine.

    2) Most people tend to be biased when they look at SERPs, and the complaints in general reflect that. An informal glance at the complaints in this very blog, for example, is a good indication of that.

    “My site was ranked #1 and now it’s not.”
    “Why is so-and-so spammer site showing up for a SERP that I just happen to be going for?”
    “Matt, you’re too busy letting the spammers in.”
    “Search Engines Web has a problem with…” Hmmm…no…I probably shouldn’t go there. Should I go there, everyone? Nah, probably not. 🙂

    I think you see what I’m driving at. It isn’t your idea that’s bad, it’s how people would use it. And that’s something Google would have a horrible time trying to address.

  120. RE: “Am I being penalized for participating in digitalpoint’s link co-op?”

    Hmmm, a link scheme designed solely to try and trick Google into passing PR, link pop and increase Google rankings. As the whole link co-op scheme is outside Google’s guidelines I would say you have been ‘sprung’.

    Kathy, the “digitalpoint’s link co-op” is one of the biggest scams to hit the WWW in a long time. Dump it and any other trickery you have now before you lose ALL pages from Google.

    I bet if you read the Google guidelines you will identify more things that could cause problems with Google.

  121. Not all “misspellings” are spam. A variety of transliterations may be necessary to cover possible variations.

    Is Google able to determine the difference between transliteration misspelling requirements and same-language misspellings?

  122. Dave, IMO Google is not interested in transliterations as they are the same word in another language. This is why one CAN use “content delivery” to direct a user (not SE spider) to a relevant language page. In other words, no one page should/would have the same words in x different languages.

    The spammers love to blur the lines between “content delivery” (ok with Google) and cloaking (not ok with Google). Matt recently spelt out the difference right here on his blog.

  123. Dave, you said this: Kathy, the “digitalpoint’s link co-op” is one of the biggest scams to hit the WWW in a long time. Dump it and any other trickery you have now before you lose ALL pages from Google. I bet if you read the Google guidelines you will identify more things that could cause problems with Google.

    There is no trickery on my webpages. I’ve read and re-read the guidelines and see NOTHING that I’ve done wrong….except the DP link co-op, which I was told was approved by Google. I’m a 49 year old (gulp!) woman who has worked daily online since 1998 on this website. It’s a large website with LOADS of great content that helps women with medical problems. I have no need for trickery. It’s the place the medical community sends their patients. I appreciate the links you’ve provided but don’t appreciate the tone assuming I’m a criminal who should have known better. I didn’t know better. I trusted some very big guys in the industry when I signed up for DP. Plus…I was on the first page of search results, often in #1 position prior to the co-op link network.

  124. DigitalPoint has always caused me loads of problems..

    I found one thread on there where a guy was selling a copy of one of my sites.. same layout, content and pictures.. the “product” he was selling even had my email address left in the mailto: on the faq page

    I wouldn’t trust them for anything…

  125. Hey Dave, Dave, Dave, Dave, Dave, Dave, Dave and Dave,

    Can you guys like use initials in your posts or something? That section where at least two of you are talking is really confusing.

  126. Hear, hear Daves. I second that.

    Regards,

    Dave (a different Dave from those above).

    PS: Matt, are you going to say anything at all about Google’s Uber Bug that is steadily deleting most of the Web from its Index? This must be the worst bug in Google’s history and yet you just pretend you don’t even notice the comments. Does doing “no evil” not include putting 1000s of small businesses out of business simply because Google doesn’t want to acknowledge a problem?

  127. Hi Matt,

    I have a quick question: Suppose I have a website with advertising on the homepage, but not on the sub-pages. I’d like to detect deep-linked referrals (from Google, or anywhere else) to the sub-pages and show them the ads they missed by skipping the homepage. It wouldn’t be bait-and-switch because the sub-pages would contain the same core data regardless of how you got there; there’d just be an extra column with some product and suggested reading (which would be associates links to Amazon), and perhaps some extra info for signing up on the site, and whatnot.

    Is this kind of thing permitted by Google? Could implementing it get my site removed from Google?

  128. Hanford:

    What if you circumvented the whole issue by embedding the column info inside a Javascript on the subpages? Then Google won’t care what’s there at all.

    You’ll lose some potential customers among those with JS disabled, but at least it’s a safe bet and the amount you’ll lose won’t be that significant in all likelihood.

    Alternatively, you could put the content inside of an inline frame and display it that way.

  129. Dave (Original)

    RE: “There is no trickery on my webpages”

    Random links in the footer of pages pointing to unrelated sites IS trickery in an effort to boost Google ranking. That’s why you joined right?

    RE: “I’ve read and re-read the guidelines and see NOTHING that I’ve done wrong….except those dp link co-op which I was told was approved by Google”

    What about “don’t participate in links schemes to boost PR or ranking” or “Would you do it if SE didn’t exist”?

    You have been LIED to, Google has NOT approved the DP link scheme. I would love to see this ‘so called approval’. Care to share it? In fact, GoogleGuy has mentioned that if you DO choose to participate you run a real risk of linking to “Bad Neighborhoods”.

    You see, if they were REALLY serious about it being “advertising” they would use the Google APPROVED nofollow attribute. This has been mentioned by Matt right here many times.

    Unfortunately the SEO industry is NOT regulated and there are 100000s of liars, cheats and unethical people that take FULL advantage of it.

  130. Matt,

    What is your take on Duplicate Content in regard to Web Services data? The data provider is licensing the developer to utilize their content; however, might Google see this as duplicate content? Are there measures to avoid this?

    A good example is an Amazon Associate using product, pricing and review information from Amazon E-Commerce Services on their own site.

    Thanks,
    Marc

  131. I reflected on my above comments and thought better of it in terms of how to improve effective communication between Google and supportive webmasters and siteowners.

    We’re on track with webmaster penalty notifications via Sitemaps and Email if it works. Here’s my 10/10 on the intention.

    But we’re finger biting still out here with current observations such as those seen at WMW “Major Change in Supplemental Result Handling” @ http://www.webmasterworld.com/forum30/34119.htm

    What I’m saying is that a very good bunch of webmasters are having a hard time with uncertainty, not knowing what’s happening. Some sort of reporting communication with the community on developments would be helpful for all of us i think. Something along the lines of “major issues” – subject A – issues with XYZ – ETA of fix DD/MM/YY – comments put into Sitemaps.

    It’s yet another vehicle towards better QC and speeding up siteowner adoption in my opinion which puts you well ahead of MSN and Yahoo [ just thought I’d say that 🙂 ]

    It might also improve monitoring all around and make a lot of people happier. Just my thoughts – have a great day.

  132. Ok guys and Matt if you are kind enough.
    Please answer this humble question.
    I did some 301s that had some strange effects :)) like:
    I 301 my http://www.mydomain.bla/ to http://www.mydomain.bla/index.php

    What happened is that if I search in Google my site there are no results for http://www.mydomain.bla only for http://www.mydomain.bla/index.php

    How is the correct redirect? I think I am a bit confused to say the least 🙂

    Would a 302 from http://www.mydomain.bla/ to http://www.mydomain.bla/index.php be correct (the way it was before)

    Thank you very much guys, feel free to respond to this lil Q.
    Pex
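
A minimal sketch of the redirect direction in question, assuming the goal is a single URL for the home page. A 301 is a permanent redirect and a 302 is a temporary one, so a 301 from / to /index.php tells crawlers that /index.php is the address to keep, which matches what Pex saw. One common convention (an assumption here, not something confirmed in this thread) is the reverse: treat the bare directory URL as canonical and permanently redirect the index file to it. The function name and the list of index file names are hypothetical.

```python
from http import HTTPStatus

# Assumed default-document names; adjust to whatever the server really serves.
INDEX_NAMES = ("index.php", "index.html", "index.htm")


def home_redirect(path):
    """Return (status, location) if `path` names an index file, else None.

    The index file is redirected to its directory URL with a permanent
    redirect, so only the directory URL remains as the canonical address.
    """
    head, _, tail = path.rpartition("/")
    if tail in INDEX_NAMES:
        return (HTTPStatus.MOVED_PERMANENTLY.value, head + "/")
    return None  # already canonical, or not an index file


print(home_redirect("/index.php"))       # redirect to the root URL
print(home_redirect("/blog/index.php"))  # redirect to the directory URL
print(home_redirect("/"))                # no redirect needed
```

How the rule is actually installed depends on the web server, and a server that internally maps / to /index.php needs care so the two rules do not loop.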

  133. Dave (Original),

    Would I be here ASKING if I was using trickery on my site? You stated: “Kathy, the “digitalpoint’s link co-op” is one of the biggest scams to hit the WWW in a long time. Dump it and any other trickery you have now before you lose ALL pages from Google. I bet if you read the Google guidelines you will identify more things that could cause problems with Google.”

    My comment is that Digital Point’s co-op (which I was told was approved by Google) was the only such thing I had on my site. I do not participate in trickery. Check my site, Dave (Original). You will see a person who works to maintain integrity amidst rude people who assume the worst instead of believing in honest people who didn’t know better.

    I’ll leave it at that. And as for your “I help you” attitude, no thanks. Rude doesn’t encourage me to ask for your help.

  134. Al, I agree. It would be nice to know if I was penalized (and if so, why only me and why not the other thousands in dp co-op?) or if I was victim to the search index page loss fiasco of Google.

    I haven’t been sent anything to tell me I’ve been penalized. Nor does my sitemap area yield anything other than good stats. No explanation….It would be just good to know.

  135. Dave (Original)

    Kathy, you are not reading what I’m writing, or answering my questions. Only trying to help and tell you a very likely reason you are not doing well in Google.

    Oh well, you are not the first to be duped by DP and won’t be the last. No skin off my nose.

  136. I recently had over 400 pages of my site indexed and today all of a sudden I have 47, and that’s after Big Daddy. Is there another big change taking place?

    I’m concerned that I may have been penalized, but Google is not showing me that this might be the case. I do not do anything even remotely “tricky” on my site, it’s just an honest down-to-earth cooking site. What can cause such a sudden and drastic change when little has changed on my site except added gourmet food products and recipes?

    Thank you,
    Scott

  137. I appreciate and understand you are saying I’ve been duped. I get that. What I don’t get is the continued attack on my integrity as though I’m a sleazy webmaster and participate in *other* trickery. I do not. Except for DP (where I WAS duped) I have not done anything to deserve your accusations. I was here asking questions, trying to understand what it was that was a penalty….and instead of getting my answers, I was condemned and tossed into the Google Jail by Judge Dave Original himself. As I said, take a look at my site and tell me…what other trickery do you see? I’m here to learn…otherwise, I would be hiding myself under a rock with the other tricky webmasters.

    And BTW, prior to the DP coop I was ranked on the first page of the search results….for years. Usually in #1. It wasn’t for trickery that got me there. It was daily maintaining my site and adding content appropriately…since 1998.

  138. Dave (Original)

    RE: “I was condemned and tossed into the Google Jail by Judge Dave Original himself.”

    Care to point out where I did that???

    RE: “As I said, take a look at my site and tell me…what other trickery do you see? I’m here to learn”

    It also appears you have a mini link network going with all the domains you own. You should have them ALL as pages under the one domain. If they are not related enough then don’t link them together.

    RE: “And BTW, prior to the DP coop I was ranked on the first page of the search results….for years. Usually in #1. It wasn’t for trickery that got me there. It was daily maintaining my site and adding content appropriately…since 1998. ”

    Did someone say it was??

    BTW, did you read the forum link I posted. If not you should.

  139. Well, I submitted the stupid form.

    This is like confessing to a crime before you’ll be told what the crime is, LOL.

    Who gets these forms? Are there any live people over there?
    ——————-
    Before I can submit this form, I have to admit to some wrong-doing, promise that it has been corrected and agree to never do it again?

    I don’t even know what I’ve done wrong. I have no hidden text, no cloaking, no doorway pages and I only link to my own sites, all my pages are totally unique and I don’t use any software whatsoever.

    It’s not fair that I have to admit to some violation before I can even know what it is just to be able to submit this form. This is my living at stake, the Web is how I make my money and feed my family, and it gets taken away and I don’t even have the opportunity to know why?.

    I don’t believe my site has violated Google’s quality guidelines.
    This site has not been modified, I don’t know what you want me to fix.
    Yes, I have read and agree to abide by Google’s quality guidelines.

    If I did something wrong, all you have to do is tell me and I’ll fix it.

    E-Mail me, I’m listening.

    I’ve done none of this that I’m aware of, but my site went from good rankings and good listings to nothing overnight and I have no idea why.

    PME

  140. Kathy, not to speak for Dave – but I don’t believe he is calling you ‘shady’; he is pointing more towards ‘questionable’ techniques being used. Many webmasters say they “do nothing sneaky” but in reality don’t know what would be considered “sneaky”, and/or rationalize or use Google’s lack of detail in the quality guidelines to bolster something new [e.g. link scheme vs. link co-op? sounds vaguely similar to me]… thus I assume the rationale for Google Sitemaps [well, to get a second or third opinion anyway].

    In my experience manipulation is manipulation – a co-op sounds a lot like a rehash of a link farm or an aggregation of webmasters sharing a predetermined link structure… so it is a “link scheme”… and then a violation to Google. IMHO.

  141. Mini link network? I can’t advertise products from my own store? I think you have gone over the edge of accusations. Products from our online store and paid advertisers….are in the “Sponsors” area along with one fun family website my members use with their children. Thanks for your input, Dave. Of course I understand (you can stop the hammers) that you think DP is a link farm. It’s been off my page since before I posted the first reply here. I was just wanting clarification from MATT or Google because I still have nothing from Google that says I was penalized…and I’m just trying to learn. It’s easy to point fingers at what is judged “questionable” without really looking. 😉

  142. Hi Matt,
    this is my first post here, I hope not to be off-topic, but I would like to know if this Sitemaps feature may help a webmaster understand a strange effect of the new algos. For some months it has seemed to me that some sites (or pages) may have a duplicate content problem due to affiliation programs such as tradedoubler….for instance, a site I work for has more than 4000 indexed pages called home.asp, due to the tradedoubler tracking code appended, and now has PR 0. Can I suppose this is something that this Sitemaps feature can help to solve?
    Thanks

  143. Kathy,
    One of the problems that Google’s Big Daddy is supposed to be addressing is the proper handling of canonical issues, in particular website pages that are referenced using both www and non www prefixes. Seems your HOME page is maintained in the Google index twice. Using the site commands:

    site:www.domain.com

    and

    site:domain.com

    produces two different index entries with different cache dates and even content (titles are different). Probably because you made changes in the 5 days between caching by Google.

    You might want to correct this with a 301 redirect (if you have not done so already). It appears, however, that when you simply enter domain.com (no www prefix) in the address, the www page does appear.

    Matt has discussed this issue here before:

    http://www.mattcutts.com/blog/seo-advice-url-canonicalization/

    Bottom line, if Google keeps two copies of your HOME page in its index, you will definitely be sharing PR (according to Google) and possibly be getting some form of duplicate content penalty. This MAY explain your drop in rankings.

    All, of course, IMHO 🙂

  144. RE: and possibly be getting some form of duplicate content penalty

    That’s unlikely IMHO. While Google can crawl to both versions ‘independently’, it cannot crawl directly from http://www.domain.com/index.html to domain.com/index.html unless a website uses absolute URLs with and without www on the same page.

    At worst a website may have “holes” in its architecture (in one version or the other, or both), but unless you have extremely bad coding practices, www and non-www have no more direct correlation than any other subdomain, which according to Google is an external website like any other website that has dup content. If they don’t link to each other: “no penalty”.

  145. hi pgaz (or anyone??)
    re: canonical issues
    what would be a syntax for an .htaccess redirect from site.com to http://www.site.com assuming that base url is index.htm? i currently have two sites listed site.com and http://www.site.com and lost most of my pages but 3 (including http://www.site.com) +1 (site.com). many thanks.
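
The exact .htaccess syntax depends on the server setup, so what follows is only the logic of a non-www to www redirect, sketched in Python. It assumes plain http and a hostname with no port; the function name is hypothetical. The point is that every request to the bare host gets a permanent (301) redirect to the same path on the www host, so only one form of each URL is left for a crawler to index.

```python
def www_redirect(host, path="/", query=""):
    """Return (301, location) for a bare-domain request, else None.

    `host` is the Host header value (assumed to carry no port),
    `path` the request path, `query` the raw query string without '?'.
    """
    host = host.lower()
    if host.startswith("www."):
        return None  # already on the canonical host
    location = "http://www." + host + path
    if query:
        location += "?" + query
    return (301, location)


print(www_redirect("site.com", "/index.htm"))
print(www_redirect("www.site.com", "/index.htm"))
```

The path and query string are carried over unchanged, so deep links keep working after the redirect.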

  146. Fathom,
    Google clearly does not treat www and non www pages like two external websites. Site command in Kathy’s site demonstrates that:

    site:domain.com

    returns both www and non pages.

    From a practical test, I’ve searched on snippets unique to one page on my site (that has canonical issues) and Google returns BOTH pages (www and not) correlated (indented) with the same domain. In some cases where the non www page is Supplemental, Google returns it first with the “real” (www) page only appearing in “omitted results”.

    Site had 301 redirects for non www to www in place since Oct 2005. All Supplemental pages are 1 to 2 years old.

    That sure appears to be a penalty to me. 🙂

  147. Kathy and Fathom,

    It appears to be even messier if you repeat the test that I mentioned above on Kathy’s site. Searched on a snippet from the HOME page and found Google keeps the following in its index:

    http://www.domain.com
    domain.com
    https://www.domain.com
    http://www.domain.net
    http://www.domain.com/index.php
    http://www.domain.com/?session id (multiples)

    Most of these appear in “omitted results”. All direct back to http://www.domain.com, but the fact that Google is keeping this many duplicate entries of ONE page in its index can’t be helpful to her site’s rankings. Again, IMHO 🙂
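
To make the list above concrete, here is a rough Python sketch of how such variants could be collapsed to one canonical form. It is an illustration only and not a description of how Google handles them. The choice of http plus www as the canonical form, the index file names, and the session parameter names are all assumptions.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical names; real sites use whatever their software emits.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}
INDEX_NAMES = ("index.php", "index.html", "index.htm")


def canonicalize(url):
    """Map common variants of one page to a single http://www. URL."""
    if "://" not in url:
        url = "http://" + url            # bare "domain.com" style input
    parts = urlsplit(url)
    host = parts.hostname or ""          # hostname is already lowercased
    if not host.startswith("www."):
        host = "www." + host
    path = parts.path or "/"
    head, _, tail = path.rpartition("/")
    if tail in INDEX_NAMES:
        path = head + "/"                # /index.php becomes /
    # Keep every query parameter except the assumed session identifiers.
    kept = [p for p in parts.query.split("&")
            if p and p.split("=")[0].lower() not in SESSION_PARAMS]
    return urlunsplit(("http", host, path, "&".join(kept), ""))


variants = [
    "http://www.domain.com",
    "domain.com",
    "https://www.domain.com",
    "http://www.domain.com/index.php",
    "http://www.domain.com/?PHPSESSID=abc123",
]
print({canonicalize(u) for u in variants})
```

A different domain such as domain.net is deliberately left alone here; collapsing that needs a redirect on the server, not string normalization.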

  148. Dave (Original)

    Ok Kathy. You are only interested in shooting the messenger. I regret trying to help.

    You have quite a few issues with your site, other than the ones mentioned above, but I leave you to find out. Good luck.

  149. pgaz – all sub-domains show for that command if you use the root domain… please show one that is factually penalized?

    While it is advisable to 301 – so you get the most out of external influences – it isn’t because you’ll attract an automated dup content penalty. [and surely Google would never manually do it for not using a 301].

    In fact – because some of your website could be indexed as domain.com and other parts are http://www.domain – it can appear like a penalty since you have 2 websites neither being fully credited with structured link paths… but this is in no way ‘dup content’.

    Well… Google may indeed prove me wrong – but if that were so, 100s of millions of websites without a 301 at root means Google shouldn’t have the archive it does.

  150. Dave (Original)

    Agree with Fathom. IMO Google could not afford to have ‘problems’ with www vs non-www. If it did, about 80% of the Worlds sites would be penalized.

    As stated, it is ADVISABLE to use a 301 to direct non-www to www but it is by no means a necessity.

  151. Fathom,

    Good discussion – I agree as well that Google MAY not be explicitly penalizing sites with canonical issues, but a drop in SERPs that MAY result because of Google’s confusion over so many duplicate index entries pointing to the same page could produce the same effect, again IMO.

    If there is such a thing as a duplicate content penalty (i.e. two pages that have identical or nearly identical content in same domain), then how does Google differentiate between two distinct pages that really are duplicates of each other versus one page represented as the following:

    http://www.domain.com/?session id
    http://www.domain.com

    or

    http://www.domain.com/abc.htm?session
    http://www.domain.com/abc.htm

    when Google maintains two separate index entries?
    when Google returns both forms in search results?

    Only Google engineering knows.

    I believe the emphasis that Google has placed on these issues and the communications that Google has been providing webmasters since the end of last year suggest there is a problem. Again, Matt’s comments here:

    http://www.mattcutts.com/blog/seo-advice-url-canonicalization/

    discussed the issue. What wasn’t discussed is what negative effects (duplicate content, etc.) this MAY have on websites.

    As an interesting aside, I took the test I ran on Kathy’s site and ran it on MSN and Yahoo!.

    Google returns over a dozen variations of the HOME page in its index
    MSN returned only one result (www.domain.com) – correctly
    Yahoo! returned 2 identical results (www.domain.com twice) – obviously a bug

    (As a second aside, I noticed in the last few days that the duplicate www and non www indexes have all but disappeared under Big Daddy on our 3 websites. Now only the correct www form shows up under site command. Other canonical forms still exist (malformed URLs). A positive development!)

  152. I want to thank those that contacted me to help. Thanks! I appreciate your comments and help very much. As for my search engine results, my pages are back as they were prior to last week when I went in search of the reason *why* and came here. Thank you, Matt (if that was you) or to Google gods for fixing things. Now I can get back to the needs of our members.

  153. >> If there is such a thing as a duplicate content penalty (i.e. two pages that have identical or nearly identical content in same domain), then how does Google differentiate between two distinct pages that really are duplicates of each other versus one page represented as the following:

  154. NOT SURE WHAT HAPPENED ABOVE BUT MY RESPONSE:

    Almost seems we have hijacked Matt’s Blog! 🙂

    In my experience “dup content” isn’t really the problem (or rather how Google interprets and then penalizes).

    Linking is needed to an original of the copy (on site mostly but also between different domains ‘crosslinking’).

    I’ll cite affiliate programs (referral pages) as an example of duplicate content having a high survival rate…

    The main website develops links to its own pages, and an affiliate with their own pages can be linked to independently and never get penalized, simply because there are no direct link paths between them; both page sets can appear in results based on their own merits and be highly successful.

    Similarly without a 301 in place a root domain and the www sub-domain each have their own merits [notably links to them] and Google credits accordingly.

    Academically, the only plausible way an un-301’d domain could possibly be penalized is to have both versions of absolute URLs on the same page and pointing to the same page [root and www sub]; then maybe it is plausible (I highly doubt it though – 75 PhDs can’t be that dumb to develop a macro that can crawl the world but never know where it is, has been, or is going from one referral to the next! 🙂 )
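To make the 301 point in the comment above concrete, here is a minimal sketch of how a site might fold its root domain into the www form so that links to either version are credited to one URL. The function name `canonical_redirect` and the host `www.domain.com` are illustrative assumptions, not anything a commenter or Google specified, and a real site would normally do this in its web server configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_redirect(url, canonical_host="www.domain.com"):
    """Return (status, location) for a requested URL.

    Hypothetical helper: answers 301 with the canonical URL when the
    request used the bare (non-www) host, and (200, None) otherwise.
    Any explicit port in the URL is dropped in this sketch.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # The bare form is the canonical host without its leading "www."
    bare = canonical_host[4:] if canonical_host.startswith("www.") else canonical_host
    if host == bare and host != canonical_host:
        target = urlunsplit(
            (parts.scheme, canonical_host, parts.path or "/", parts.query, parts.fragment)
        )
        return 301, target
    return 200, None
```

With a permanent redirect like this in place, a request for the root-domain URL is sent to the www URL, so the two forms no longer accumulate links separately.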

  155. I have a site that suddenly disappeared months back from Google’s index without a warning, and I have sent reinclusion requests over and over but have only gotten canned emails back, which is no help at all to anybody. I have read the guidelines over and over and have fixed whatever had a slight chance of being wrong, but since I do not, intentionally, do anything wrong it is very, very hard to figure out why a site disappears!
    I do not think that this new revolutionary thing will change anything for us grass roots (that Google actually lives off of!). Maybe bigger and important sites will be warned.
    One thing that still amazes me is that all the sites with scraped content from my site are still out there even after I have reported them, but my own site is gone, which is a shame since I get thousands of visitors from MSN and Yahoo every day, so there should be a need for it in Google also, you would think!

  156. To Perl montain and others:
    This is exactly how I feel also after seeing the form for reinclusion request!! Not sure who came up with that stupid form and the questions. It’s like someone took your driver’s license away and you had to fill in a form to get it back admitting that you were drunk and that you will never drive drunk again, even if you didn’t know you were drunk or were driving drunk!!
    This is RIDICULOUS and don’t expect that you will ever get any emails!! Ever!!

  157. Hi Matt,
    I have a website that promotes an Acura dealership. I also have another site that offers valuable money-saving coupons for the same dealership. The site that offers the money-saving coupons is linked to the main site. Is that a deceptive doorway page? How would I know if I was being penalized?

    Also, I have purchased some URLs that customers are likely to type in when trying to find my dealership. They are redirected to my site. Is that a bad practice? Would I be better off not redirecting them to my site?

    One last question, does Google see “b” and “strong” the same way? Which is better?

    Thanks
    Todd

  158. does google see “b” and “strong” the same way? Which is better?

  159. “Dave, IMO Google is not interested in transliterations as they are the same word in another language.”

    Transliterations are *attempts* to *represent* words taken from a language that does not use the romanized alphabet.

    In order to cover the possible variations, travel writers often have to “intentionally misspell” .. ie, Chitlom (a street name in Bangkok) and Jitlom or Djitlom.

    Koh Samui and Koh Samui. “intentional misspellings” are not necessarily misleading or SPAM.

  160. HI Matt

    I duunno how I dunno when but I`m over the moon that I`m back in google… now just an update and alice is not lost in wonderland anymore…

    If by any chance, Matt, you`re responsible, let me know so i`m not barking up the wrong tree…

    If you did though many thanks

    Fly in zee wall

  161. “Koh Samui and Koh Samui”

    Oops! Should have been, “Ko Samui and Koh Samui.”

  162. Great post. Hidden text is the pollution of search results. It’s time for Google to do something about this filth and it’s excellent to see that you are. This is good news for those of us who refuse to turn to the “dark” side of SEO.

    If you ever wanted to see a classic example of hidden text using the tag, take a look at the site http://www.perfectpage.co.uk. Their pages are pumped full of hidden text using the tag. To make things worse, they quite happily use the same techniques for virtually all their customer sites and pages too.

    It’s time to kick this sort of thing out of the search engine indexes, and let the honest webmasters and developers see their hard work gain the recognition it rightly deserves.

  163. Along the same lines: Is it ok to hide text on your website if you explain that you have hidden it there expressly for the search engines? It’s a nice gesture but still spam, I think?

    i.e. “This is for those search engines that don’t understand how to read Flash content… we’ve tried to incorporate the main points in our website.”

    Take a look at http://www.creativeadvertisingagency.com/best-seo-agency.html

    Normal looking flash site – then check out Google’s text only cache – what! How many lines of code have been hidden with a tag?

    And they rank at number 3 on google.co.uk (uk only results) for seo agency.

  164. Just a note, Matt: reading about this blog spamming etc., do you think this nofollow is really necessary…

    IF a turing graphic key + admin preview is activated why should it matter… this nofollow is just a pain for those that are legitimately involved in blogs…. In fact adding decent replies and content makes for more visitors, at least in my own experience and tests…. Thing is, in the eyes of a search engine, IF I TARGET right on key in everything I do and with decent content, why shouldn’t the webmaster get PR points for their efforts…. It`s one of the main reasons why a lot of people hang out on blogs, especially webmaster blogs….

    if admin is there to judge the content I think it would be in their interest to give that little extra bit of SEO points up for their effort… kind of like a reward.. It certainly encourages more and better content when admin HAS to verify all posts who needs to take away the reward

    Word press and their like should all have turing by now and admin review… so the nofollow not only removes spammers but also many webmasters who use blogs to gain more PR… well done :0) … as if it`s not hard enough to gain already

    Think about it: nofollow is just another way to mess up many websites, especially legitimate link partners and wot not 2… it`s open for attack in many ways and stops many link campaigns DEAD!! You really know how to monopolise the net ;0)

    nofollow=

    Lost link structures due to sneaky webmasters and their nofollow attribute will lead to only one outcome: poor ranks…. lack of business …. more annoying problems for the little guys… and pushing webmasters into desperate situations opens up other areas of spamming….

    The one way link myth is just crazy … and many gits on these sites are deliberately placing nofollow attributes ASSUMING the one ways will benefit them more …. geezz lol talk about a headache

    Think ahead before you really annoy so many webmasters google is just gonna end up the dictator of the internet… and dictators never last!

    i want google to be super accurate so I get the right clients to my pages but not like this… my business is starting to decay because of the nofollow madness!

  165. Hi Matt

    I dunno whats going on but thanks for deleting all the old pages google had kept… makes the site index far cleaner… didnt even know they existed lol

    Are google running out of disk space… I know how you feel :0)

    I have a campaign going to restart the campaigns.

    Anyway I hope you realise this NOfollow thing is lame… did a trace on it and lol its getting out of hand did a bit of monitoring on the interactive side for blogs and YUP they aint getting the same posts as before lol destroy the blog industry why dont you hehe

    Anyway I can start again soon and maximise on these tweaks :0)

  166. Interesting blog….
    I have just spent most of my afternoon reading this thread and would like to point out a few issues.

    Thanks to Kathy and Noah’s contributions.

    I am one of the webmasters that have been hit by what we call the:
    Very Odd – Big Drops in SERPS Today April 26, 06
    On the WMW forum.

    Here a number of us are discussing what we think may be a ‘glitch’ with the Google Algo.

    For me, without going into too much detail. I have searched for information on canonical issues, sitemaps etc etc.

    It started on the 26th of April, when my website was ranking very high for most of our keywords and, most importantly, our own company names; then DROP

    Pages still indexed in Google, PR still present, however a massive drop in search results, reinclusion request made, however no real response.

    For me personally, I run a B&B hotel site in London, similar to what this post started about; however, I have never taken part in any ‘black’ hat SEO, and really hope Matt/Google investigate this issue soon.

    Using sitemaps I was able to confirm that my site was not banned; however, there is no real reason as to why Google dropped our sites, when our rankings in Yahoo, MSN, and other SEs have been solid and consistent.

    For the most powerful SE in the world, I really hope that Google notice this…….[deep breath and wait for Matt to notice my post]

  167. I came here from one of the Digital Point forum posts, regarding the coop being mentioned as bad. COOP has been a strong competitor to many other advertising networks, simply because it’s free, but the drawback is dynamic and irrelevant links. I think if Shawn can read this, he had better categorize the link display so that links appear according to category.

    Make static links by removing the feature of dynamic links.

    With these changes I think coop will do better, and webmasters need not worry about any penalties.

  168. google are running out of disk space i got an old 3 gig drive here if you want it matt ;0)

    But anyway who knows my http://www.products-directory.co.uk site got completely dropped yet has a decent pr it has very few backlinks at the moment around 12 includes one from this blog lol

    But anyway looking at it all…. me thinks google is dropping lots of suspect pages in terms of its structure… looks like a similar pattern as the site was set up with very specific keywords and ended up on lots of spammey pages… grrrr

    and deletion took place whats left is the quality pages… awesome really as it gives you a clue on what google may deem as decent..

    Either that or one of the students google hired was still tanked after the previous night and accidentally pressed the delete key wiping out 2 terabytes of data … it does happen lol

    either that or another dodgy drive went down… I hope google are not using maxtor drives… sheesshh

    anyway guys look at my amazing level of links…. SEO = white hair/early grave :0)

    “SEO is like playing football drunk, upside down with moving goalposts, a hurricane, alien invasion and being blind folded all at the same time… that`s why rugby is far better”

  169. is it me or is this graphic protection always the same code

    lol

  170. I fully expect this system to be a complete failure, although it will most likely be a very quiet one.

    Why do I expect this? Because Google personnel themselves do not know when a site has been penalized.

    Here’s the completely idiotic response I received from Google after one of my sites was clearly and obviously penalized:

    Thank you for your replies. We apologize for our delayed response. Please be assured that your site is not currently banned or penalized by Google.

    Please note that we searched for your site and found that it is currently included in our search results. To see the results of our search, please visit the following link:

    http://www.google.com/search?hl=en&lr=&q=site%3Awww.example.com

    We understand your concern regarding your site’s inclusion in our search results; however, these changes are consistent with the normal fluctuations outlined in our previous email. As we add new pages and incorporate updates to existing pages, you may see changes in the ranking and inclusion of sites in our index. Because our index changes regularly, it’s possible your site will regain its ranking. In the meantime, we hope that you will review the helpful tips posted on our site.

    This response was an obvious cut and paste job that completely ignored, and contradicted, every shred of available evidence.

    Google personnel, by and large, seem to have lost touch with how their magic machine actually operates.

  171. Matt I got an idea

    Delete 80 percent of the database and it should be much better :0)

    Guess what folks I got a new site and in terms of indexing speeds

    MSN First PLace 9 hours
    Google Second – 32 hours (sort of well shady spider)
    Yahoo — still waiting …. they are getting better

    MSN is soooo much easier to work with ….. and i`m making cash with it

    So matt not to push you around or anything I know you got one hell of database… but like my own some things need to be dropped without mathematics :0)

    Matt whats going to happen once that DB of googles gets so big it will be eh kinda be in vain to use for websites… it`s already starting to display odd results.

    Quality of the results is not what it used to be even with this cleanup (go back to around September 2004 (you really had it sorted then), minus all the crap, and it should be even better)

    YAHOO and MSN are just easier to work with (kind of like the early most enjoyable days of google)

    when will the google madness end

    Here`s a prime example

    Matt I Own a company that has over 350 employees assigned for marketing with search engines and sites etc..

    1 I have this new site and if I expand the network at a speed any webmaster would drool over will I get penalised for such efficiency… just more myths floating around the net about link expansion and I hate getting messed about with google serps…rather not waste ones time with it

    Can you shake a stick at these myth makers pretty please.. or least give advice that is worthy of replying on this blog a yes or no will do.

    thanks

    PS.. your blog is not getting enough posts… I wonder why ;0)

    dam nofollow ….cursed it be!!!!

  172. Hi matt

    guidelines on automated queries..

    i did a test to see what the limits of it was and all that plus knock a few of the myths flying about down..

    I did a 30 minute automated query then searched for my domain name a few times in between…. tell me does this make your site disappear from the engines… as everything was fine and now its gone even though it was the pc and not the server doing the TEST!!! only (api is next .. man i wish you would supply a decent php script stop all this hunting around for bloatware software)

    I`m not sure if its this reshuffle or maybe that …crazy SERP`s … good tracking software otherwise

  173. Matt:
    I’ve had good experience with Google crawling my sites in the past, but have recently noticed a drop off in referrals and traffic.

    I am an affiliate marketer, with very legitimate sites – including product links and content I write myself. I hear affiliate marketing (getting paid a commission for linking to other sites) is a new “no no” with you guys…. please advise on that one.

    I have also implemented Google Sitemaps on my sites. I have not noticed any mention of penalty there.

    I am a bit confused about the hidden text issue (which means I may be guilty of it, not knowing it was cause for penalty). I always thought hidden text was something you designated in your software program. I never thought that if you have the keywords in the same color of the background, they may be deemed “hidden” as well. Is this the case?

    So – a two-fold question: is it the background-colored text or my affiliate marketing (my livelihood) that is a potential problem – or both?

    Also – is there a way to fix the issue of paid links, so that I am still deemed “legit” by Google?

    Thanks so much – LJ
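On the question above about background-colored text: a minimal sketch of the two common forms of hidden text that come up in this thread (text styled with `display:none` or `visibility:hidden`, and text whose color equals its background) might look like the following. This only illustrates the idea for inline CSS; it is not how Google detects hidden text, and the names `parse_style` and `looks_hidden` and the default white page background are assumptions made for the example.

```python
def parse_style(style):
    """Parse an inline CSS style string into a lowercase property dict."""
    props = {}
    for decl in style.split(";"):
        if ":" in decl:
            name, value = decl.split(":", 1)
            props[name.strip().lower()] = value.strip().lower()
    return props


def looks_hidden(style, page_background="#ffffff"):
    """Return True if an inline style would make the text invisible.

    Hypothetical check: flags display:none, visibility:hidden, or a text
    color identical to the (assumed) background color. Colors are compared
    as literal strings, so "#fff" and "#ffffff" are treated as different.
    """
    props = parse_style(style)
    if props.get("display") == "none" or props.get("visibility") == "hidden":
        return True
    background = props.get("background-color", page_background.lower())
    return props.get("color") == background
```

In other words, keywords set in the same color as the page background are invisible to a visitor just as text under `display:none` is, which is why both tend to be described as hidden text.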

  174. Hi Matt,

    I just saw one site using hidden text and link and notified him of his actions. He also has over 10,000 backlinks in less than 30 days and some of it looked spammy. I was thinking of notifying google about this but he removed his hidden text links, but the backlinks are still there?

    I made a post on my seo site about this one and some other black hat methods used by others as well. http://isulongseoph.offshoreoutsourcingphilippines.com/

    Also, how does google declare gibberish? How about if a site is not in english but still doesnt make any sense?

    Thanks Matt for your help.

    Alfredo

  175. Google sent me an email because my index page contains a javascript redirect to a main page of the same website. Why?

  176. I’m very sorry – but who do you think you are? The GOD of Internet?

    When I use Google all I get is massive SPAM and other Trashpages – so clean up your datacenters before you start abusing webmasters that just use your techniques to get a little more benefit for their pages.

    For a long time now Google earns billions of dollars – and does nothing to get rid of Trash and Spam – maybe YOU earn too much with it?

    There are pages online for long times which must be noticed by google – but did anything happen? Not at all !

    So now you start abusing single webmasters? What did they do that was harmful to Google?

  177. I’ve forgotten one thing !
    Quoting your article: “This is a perfect example of a site that should be able to find out that their page conflicts with our quality guidelines. Google wants this hotel to know about potential violations of Google’s webmaster quality guidelines on its site.”

    Have you ever thought about millions of webmasters that have the right (!) to build their website as THEY want?

    “Google’s webmaster quality guidelines” – What ???
    Do I have to sign them for being in your index? Have you ever asked a webmaster to allow Google to index his webpage?
    So what right on earth do you demand ????

  178. I have to tell you that I am very unhappy with Google’s response in our dilemma to get back on Google Adwords.

    We have been spending thousands and thousands of dollars a month just on google to advertise our products. We have delegated such tasks to one of our employees on a daily basis.

    It has become apparent to us that one of our employees was committing click-fraud on a competitor’s website. We did not authorize such activity as a company and now we are paying the price!

    We have sent several letters to google regarding our problem. The employee was reprimanded accordingly and we also have new management.

    Our company relies heavily on internet advertising. And Yahoo & MSN just don’t cut a good slice of cake.

    What can we do? We feel like kids in the time-out corner.

  179. Hi, This is really strange. My company website (We offer Backend services in data processing, multimedia and SEO) is currently at PR3.

    But some of the pages, including http://www.sysconconsultants.com/multimedia.html, were at PR4 just a few weeks back and then suddenly, about a week or so back, to my shock I saw them deindexed.. No PR at all 🙁

    Can anybody tell me whats up here? is there a glitch or an experiment or what?

    Thanks,
    Amod

  180. Our site has been around for 8 years and is doing okay. I was wondering if we should do a google site map? Most of our pages are pr5. I have heard that adding a site map may shake things up, and that we should leave well enough alone. Has anyone done worse after adding one?

  181. Meta Tag Juicing Is The Black Plague of the Internet

    I am not sure why Google seems intent on penalizing people when it has created the environment within which the current Meta Tag Juicing has risen. It was the very Meta Tag rules that Google established that have created the rampant abuse by SEO Magicians on behalf of their clients in order to elevate their web site’s relevancy ranking.

    The current Meta Tag process does not work because it is open to abuse by WebMasters and Advertisers alike, the latter of which pay Google lots of money. If you allow someone to input a Meta Tag to indicate the content of a document we all know very well that human nature and greed will be to mislead the Searcher into believing they have found a page about Apples when it is really about Oranges.

    What is the answer then?

    In our view, the answer is to let the document speak for itself. Provide a mechanism that reads the web page content and creates a Thematic Knowledge Signature of the page contents. With this available, the Search Engine spider / crawler / bots know that they have an independent content evaluator with which they can determine content value and relevance. All this can be done without human intervention, including advertising companies, which are Google’s main source of revenue. However this approach, although it represents an accurate reflection of a web page or document’s real thematic content, may be contrary to certain business interests.

    Let The Document Speak!

  182. Knowledge Signature and Map of Matt Cutts Blog

    To follow up on my comment, I wanted everyone to have a chance to see exactly what I mean by an independently generated Thematic Meta Tag.

    If anyone is interested, they can see what a document’s Knowledge Signature looks like at the following link. There is no advertising in any of them. In fact, I used this very page to create the Knowledge Signature from. So you are looking at a K-Sig of Matt’s “Notifying Webmasters of Penalties” post and comment page.

    http://www.cirilab.com/MattCutts/index.htm

    Second, I created a Thematic Site Map of Matt’s blog site for everyone to see, so that you could get an idea of how you can navigate a blog or site based on the actual thematic content contained within it, as opposed to a restricted Site Map that limits you to the Webmaster or SEO view of the content and where you should find it.

    http://www.cirilab.com/MattCutts_BlogMap/index.htm

    There’s no Meta Tag Juicing here and if there was it would be reflected in both the Knowledge Signature and the Thematic (Site) Knowledge Map. That’s the beautiful thing about our methodology of self descriptive documents and document collections.

    Let the document(s) speak!

  183. Are you still carrying on this one as trial?

    Akash Kumar

  184. I have noticed in my industry, the number one sites have 1000’s of links from garbage domains. Most of the links are off topic. I thought google was looking at the topics of most of these websites. There is one commodity trading related website that gets 1000s of links from a garden supply store. I would say 70% of the links are off topic, but the site still ranks very well. I thought google was looking more at the themes of websites linking in? I’m not so sure?

  185. Matt, any idea if blogger will allow adsense rev share? based on author of story?

  186. Matt,

    After reading the ton of comments here it’s good to see the progress Google has made since you first posted this.

    It would be nice to have an alternative way for Google to contact webmasters that could potentially speed up the process of getting the site back in the index and the problems corrected.

    -Mical

  187. Thanks for the good news. This way a lot of webmasters will not feel left out.

  188. Thanks for the useful info, Matt!

  189. Sitemaps is truly useful. Not long ago, I started to use google webmaster tool and now google indexed more pages of my sites and the crawl rate increases…

  190. Looks like the Villa Magdala have sorted their webpage out now. They still have a PR of 4 so obviously heeded the first request.

    Tom

    http://tomjacksononline.blogspot.com

  191. Does Google crawl Movable Type blogs also?

  192. For some webmasters, they will not be able to find the violation, due to lack of knowledge. Does Sitemaps or the e-mail provide a Google penalty ‘code’ or specifics as to the exact cause of the issue? I note that you say that the site fails to meet Google webmaster Guidelines – but do you leave it up to them to find out why?

    Thanks…

  193. Cheers for the information Matt! I think I gotta change my sitemap details accordingly.
    I’m trying to get a page rank and this information is very useful to me. cheers again!

  194. Hi Matt,

    I read the whole blog and got ideas about google penalties. I found that in this blog you didn’t explain anything regarding duplication. I wanted to know how Google considers duplication for web pages from the same site. I run many websites and some of them have duplicate pages.
    http://www.theispguide.com
    Regards,
    Stella Pike.

  195. Hello Matt,

    do “noscript” and “noembed” tags indicate SPAMMING too?
    I have seen so many websites / weblogs using them that way to produce a lot of keywords and cheat Adsense, while their weblog is actually in the Malaysian language.

  196. Hi Stella,

    That’s a good question. Many websites may be banned due to duplication. There are basically two types of duplication that may cause problems. In the first case you have directly copied content from another website, and in the other case the content of two pages of a website is similar. Internal page duplication problems arise if a site has thousands of pages and the webmaster can’t differentiate each and every page. Google didn’t disclose any percentage about duplication, but I think if your web pages are 40% similar then your chance will be high. Google takes a long time to identify duplication. Once Google has found all the duplication on your website, then day by day your website ranking goes down, and if you are still not working on removing duplication then your website will probably be banned. I think this will help you lots.

    Regards,
    BHAVESH.

  197. Matt,

    Wouldn’t it be cooler of Google to warn webmasters and allow them to fix what is wrong with their site, rather than warn them and remove them? I’m just curious. I had hidden text in my site which I have removed and I got a warning e-mail and was told I would be removed for a month. I fixed the site immediately and removed the hidden text. Is there a way to stop the removal? Help.

  198. Matt,
    It seems to be a really common trend where people have text hidden by CSS, and they use javascript to rotate the text in and out of view. For an example, see navy.com and have a look at the fading news stories in the lower right.
    By using this strategy, am I risking my site to be penalized for hidden text, or is Google’s parser javascript aware enough to see that the text is rotated into view?

    Cheers,
    Erek Dyskant

  199. One of my sites was crawled back in May and hasn’t been touched since. It’s not in the index at all. I’ve tried various things including sitemaps, checking for inadvertent hidden text, removing unnecessary coding and cleaning up the html. It’s still not been crawled since. I looked for past history of the domain but there’s none – so I’m completely confused by this one.

    It’d be nice to get an email saying why – but I’ve not had one yet!

  200. Could we get an update on this article Matt, since it’s a couple of years old now. We believe we are one of the “good” guys, i.e. like your hotel example not the spammy tax one.

    We seem to have been subjected to google penalties 3 times now, and whilst the first 2 times we think there may have been an issue that we’ve then fixed (both accidental linking problems, not spam attempts) this 3rd time leaves us baffled – nothing has changed on the site since we were (apparently) re-included a couple of weeks ago.

    We wonder if we’re being penalised at all or if it’s just down to algorithm changes?

    I know you’re very busy, but I’d love to be able to discuss the problem with you if you can spare 5 minutes of your time, we’re all pulling our hair out here!

    – seb

  201. I am an SEO and I love writing web content and building quality real estate websites. My company is The Marketing Shop.com.

    I have a client site that is being penalized to the high heavens and I have no earthly idea why. Can you please help me? I work very hard to make sure the site has good, relevant content and try not to be spammy. It is KeithDobbs.com. I would love to know what’s up.

    Can you please give me some insight as to why the site has been removed entirely from the rankings for all relevant search phrases?

  202. Hey Matt,

    Well, that’s very good news. Glad to hear that Google is working with webmasters to solve the problems on their sites (legitimate webmasters anyway!). I for one am struggling to get a penalty removed from my site. My site has been penalized and I really don’t know what the reason is. Since the penalty, I spent a lot of time working on the site. Removed spammy listings (or listings that I thought were spammy, submitted by my users), changed the site’s layout, made it even more user-friendly, etc – but I failed to see any good results.

    So, the bottom line is that for me, a legitimate webmaster, who runs a site that used to have 7000 unique visitors a day and now it’s down to 300-400 visitors / day, this news is like fresh air.

    Thank you and will keep an eye out on Google sitemaps for some tips:)

    Warm wishes,
    Jonas

  203. Hi Matt

    Thanks for the blog, brilliant resource!

    I am sure you receive many thousands of individual emails with people pushing their own agenda so I have resisted emailing you, however after discussions with others I think your perspective could be useful for others. That said can I trouble you with a query?

    We are a UK based retailer (circa 200,000 unique visitors per month) and have for the past 5 years been top of rankings for ‘tyr e s’ and other related searches – we are the UK’s largest at what we do with the deepest information resource for online surfers – we have recently received a penalty from Google, we think for putting in place a 301 redirect from a partner ‘white label’ version of our online store, diverting to our ex-partner’s own URL, as our deal had ended. I say it is a penalty but we received no notification, we still appear for our .com searches at the top but for everything else we don’t climb any higher than 50th.

    We have a number of commercial links, all brought on in line with the Google guidelines, some of which we have removed just in case. We have documented all of this and resubmitted a few weeks ago.

    We are sure we have done nothing untoward and we are all very concerned that given we have followed all the rules, by the book, despite our resubmission no recovery looks like it is on the cards and no feedback or guidance looks like it will be forthcoming.

    Could you possibly help, please?!

    Many thanks in advance Matt

  204. Hi everyone,
    Matt you said previously (April 26, 2006 at 6:50 pm)
    “Stuey, if display:none is used to hide text, that can cause issues.”

    Please see http://www.liftingsafety.co.uk/category/load-arrestor-spring-balancers-1139.html
    this page is one of the pages from our website that have been penalised by google (1764 pages in our website, average 500 in site:www.liftingsafety.co.uk, which means 1264 are penalised for some reason). We use display:none; to hide and show more information ‘if’ the user would like to read more; could this be the reason for the page being penalised?
    Also we use display:none; in the product pages in two instances: 1. To enlarge images on hovers; 2. To split information into tabs for easy reading; could this also be doing us harm?
    I’m sure this is not ‘the’, but ‘one of’ the reasons why our website has lost 1200+ pages from site:www.liftingsafety.co.uk since June this year (when we revamped titles, colours, layouts, navigation, urls). We have changed lots of things since we realised pages were dropping, e.g. layouts, the use of headings, colours, styles, robots.txt, sitemaps, etc. I am now at a loss because I have no idea why our website is doing so poorly, after doing so well.

    Any help or advice would be of great help, as I say, I have no idea where to turn. Nothing seems to have any effect. I can’t help thinking it’s something small that’s having such a big effect, please let me know your ideas.

    Thank you, Philip

  205. Hi Matt,
    Thanks for the beautiful post. I have just tried a Google penalty checker tool, and it concludes that my site has been penalised by Google.

    Can I believe such a tool?

    Also, I have checked with Webmaster Tools and everything is working fine, but I still have doubts that my site is under a Google penalty…!

    Are there types of Google penalty, like serious, worst, bad…?

    Looking for your valuable response…

    Ram Gunjal.

  206. Hi Matt,

    I woke up this morning to find that my site(s) appear to have been hit by a penalty – I have not received anything through Webmaster Tools, but the sites have dropped at least 50 places for all search terms. Obviously this is upsetting, but I must have done something wrong and violated the Google terms.
    The problem is, I’m really not sure what it is I have done wrong, and it seems there is no way to find out.
    The only recent changes I made were to add quite a few links to some related sites, and a few to a non-related site, as well as add a new page containing some affiliate links. I’m guessing this is the problem, so I have now edited all of the links to add the ‘nofollow’ tag. I have submitted a reconsideration request and I hope this is the problem and therefore will be resolved quickly. I was hoping to get your opinion on this as well though – is the nofollow tag enough, or should I completely remove the links?
    And one more question: a subdomain of the affected site has also dropped by around 50 places in the search results – is this normal, or have both sites been independently penalised?
    Thanks for any advice you can provide.

    Regards
    Tom

  207. Hi Matt, my site trends-search.com was ranking first for many of its keywords on Google. It’s a site that lets users view the latest news, blogs, photos and videos for buzzing things on the web. The site is just 2 months old. I started getting more than 10k users per day. But about a week ago the searches dropped to less than 1k per day. Also, I noticed that some of my best keywords, which used to appear on page 1, are now appearing on page 4+. Does it mean Google has imposed a -30 penalty on my site? At the bottom of each piece of content I was showing the latest 100 buzzing keywords. I read on the web that keyword stuffing could be a reason for a penalty and hence removed them. How long would it take for Google to reconsider the site? I have also submitted a reconsideration request. Please advise.

  208. Hi, Matt

    I am a webmaster and this is the first time I am reading your blog. It is a very interesting and attractive blog. I think you are sharing very valuable information with us; thanks for this nice post. I am continuously watching your videos on YouTube.

  209. Great info and thanks for sharing it. I do appreciate the efforts to include us in your information stream and all the effort that is going into it.

  210. Hi Matt,
    I am here to ask a few things I am not sure about. What are the core conditions for a site to be treated as spam? In a reply above I read about black hat SEO methods like doorway pages. So what best practices can you suggest?
