Improved SEO documentation galore!

One of the wonderful things about a search conference like SMX Advanced is that it gives us a chance to finish a lot of things we’d been meaning to do. Google just added a bunch of nice documentation in various places. We even did it in official places — much better than doing it on my personal blog. 🙂 Here are a few of the things that I know we’ve done recently:

Robots.txt documentation

One of the things that I like about robots.txt and the Robots Exclusion Protocol (REP) is that it’s well-supported by all the major search engines and has been for years. But more documentation is a good thing, and several of the major search engines recently did blog posts about how they support robots.txt and REP. You can read Google’s robots.txt/REP post, the Microsoft post, or the post from Yahoo.
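
As a quick refresher, a robots.txt file is just a plain-text file at the root of your site. Here’s a minimal sketch (the directory names and the sitemap URL are made up for illustration; the posts above spell out exactly which extensions, such as wildcards and Sitemap references, each engine supports):

    # Rules for all well-behaved crawlers
    User-agent: *
    Disallow: /private/

    # An extra rule just for Googlebot
    User-agent: Googlebot
    Disallow: /no-googlebot/

    # Tell crawlers where your Sitemap lives
    Sitemap: http://www.example.com/sitemap.xml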

By the way: if you haven’t seen it, Google also produced a really nice booklet about robots.txt for publishers (PDF link). This PDF is perfect for regular folks that don’t live and breathe search 24 hours a day. 🙂

User feedback

We do appreciate getting suggestions and feedback from users, webmasters, and SEOs. I’m especially interested when people want to report spam, including paid text links. Google’s position on paid links that pass PageRank is well-known, because we’ve been pretty clear on the subject.

In a blog post earlier today, Reid Yokoyama put out a renewed call for spam reports. He gave a peek at how many reports Google receives and how we prioritize them (here’s a hint: our authenticated spam report form gets higher priority). Read his entire blog post if you’d like to hear more about webspam, paid links, and user feedback.

Just one additional note: we accept spam reports not just in English, but in many languages. For example, I’d love to get spam reports in Russian, spam reports in Turkish, spam reports in Romanian, or even spam reports in Arabic.

IP delivery/geolocation/cloaking

People ask me about cloaking software and technology all the time to find out how risky it is to use a cloaking script when Googlebot visits (the short answer: it’s very risky). We did a blog post (and a video!) to describe the difference between IP delivery, which is serving different content based on the visitor’s IP address, and geolocation, which serves different content based on the visitor’s location. IP-based geolocation is a specific type of IP delivery that is within Google’s quality guidelines. The post also covers cloaking, which is serving different content to Googlebot than to users. I highly recommend that you read the post and watch Maile’s video for more information. If you’re interested in herding search engine bots in a whitehat/low-risk way, that post will tell you what Google considers cloaking.
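
To make the distinction concrete, here’s a tiny sketch of whitehat IP-based geolocation. Everything in it (the lookup table, the paths) is hypothetical and just for illustration; the point is that the decision comes from the visitor’s IP alone, so Googlebot gets exactly the same page that a user coming from that IP would get:

    # A stand-in for a real geo-IP database
    GEO_TABLE = {
        "203.0.113.": "AU",
        "198.51.100.": "DE",
    }

    def country_for_ip(ip):
        # Very rough prefix lookup; a real site would use a proper geo-IP library.
        for prefix, country in GEO_TABLE.items():
            if ip.startswith(prefix):
                return country
        return "US"  # default locale

    def pick_page(ip, user_agent):
        # Whitehat IP delivery: decide by IP only. user_agent is deliberately
        # ignored, so Googlebot sees what any visitor from the same IP sees.
        country = country_for_ip(ip)
        return "/intl/%s/index.html" % country.lower()

    # Cloaking, by contrast, would branch on user_agent (e.g. "Googlebot")
    # and hand the crawler different content -- that is the high-risk case.
    print(pick_page("203.0.113.7", "Googlebot/2.1"))  # -> /intl/au/index.html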

Nofollow documentation

Earlier this year, Li Evans pinged us with a good observation. We’ve answered a ton of questions about nofollow in various places around the blogosphere. Li asked us to distill the important bits about nofollow into a single page and place it in Google’s HTML documentation. We just pushed that live, so you can read more about the nofollow attribute if you’re interested. Thanks for suggesting that, Li.
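
If you haven’t used it yet, a nofollowed link is just an ordinary anchor with an extra rel value (example.com is a placeholder):

    <!-- Users can click this normally; the rel value tells search engines that
         you don't necessarily vouch for (or want PageRank flowing to) the target -->
    <a href="http://www.example.com/some-page" rel="nofollow">some page</a>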

Better definition for doorway pages

Michael Martinez was a little less polite than Li. He essentially said that Google’s documentation had a pretty sucky definition for what a doorway page was. Fair point. So we revamped the definition of a doorway page to be clearer:

Doorway pages are typically large sets of poor-quality pages where each page is optimized for a specific keyword or phrase. In many cases, doorway pages are written to rank for a particular phrase and then funnel users to a single destination.

I think that definition is much better than our old definition of a doorway page. Thanks for the suggestion, Michael.

Primarily for users

One of our quality guidelines used to say

Make pages for users, not for search engines. Don’t deceive your users or present different content to search engines than you display to users, which is commonly referred to as “cloaking.”

We recently clarified that guideline to say “Make pages primarily for users, not for search engines” (emphasis mine). Why add “primarily”? As one of the main authors of those quality guidelines, I can tell you that the intent of that guideline was mainly to discourage cloaking (which is doing something different for search engines than for regular users). Some people have misinterpreted that guideline as “You can’t do a single thing for search engines that you wouldn’t do for your users,” and that was not my intent when I wrote that guideline. Instead, the spirit of that guideline is that users should be the primary consideration. But it is fine to do some things that don’t affect users but do help search engines.

I’ll run through four quick examples of things that are perfectly okay to do for search engines, but that you wouldn’t automatically do for users:

  • Adding a nofollow attribute to a link doesn’t affect users, but can serve as a useful indicator to search engines that you don’t necessarily want PageRank to flow through that link.
  • Adding a meta description. When a user visits a web page, their browser doesn’t display the meta description anywhere. But a well-written meta description is a way to suggest to search engines which snippet to show (there’s a quick sketch after this list).
  • You can tell Google your preference on www vs. non-www. Again, that’s probably not something that users see or that directly affects them, but it’s still a smart thing to do.
  • Submitting a Sitemap to Google or making it available to other search engines is not an action that you’d take for users, because users don’t see Sitemaps. But it can be a smart move because search engines can do better if you provide that information.
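
Here’s a rough illustration of two of those points, with made-up URLs: a head section carrying a meta description, and a bare-bones Sitemap file in the sitemaps.org format:

    <head>
      <title>Blue Widgets at Example Store</title>
      <!-- Never rendered on the page itself, but search engines can use it
           as the snippet shown in search results -->
      <meta name="description" content="Hand-made blue widgets, shipped worldwide.">
    </head>

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.example.com/</loc>
      </url>
      <url>
        <loc>http://www.example.com/blue-widgets</loc>
      </url>
    </urlset>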

Just to be clear: Users are vitally important. I still recommend that you keep your users in mind at all times as you design and create a site. We added the word “primarily” to indicate that people can do additional things that users don’t see but that help search engines do better at crawling/indexing/serving your site.

I can’t believe I just wrote 350+ words about a one-word change to our quality guidelines. 🙂 But I hope that gives some background and context.

Conclusion

No search engine is perfect, and everyone will have different opinions about what a search engine should focus on. But I appreciate the feedback that we get from users, webmasters, and SEOs. I know that the suggestions that we get help to make Google a better search engine. If you see me at SMX Advanced, please walk right up and say hello. I promise that I’m not frightening, and I’d like to hear where you think Google needs to improve. There will also be a bunch of other Googlers at the conference — don’t be shy about approaching them, either. 🙂

142 Responses to Improved SEO documentation galore!

  1. Seems like a lot of this may be contradicting what Evan had to offer in the Bot Herding panel earlier. I didn’t catch much of the discussion after the Q&A ended, but it seemed to me that there were some questionable points raised regarding the “primarily for users” argument and stance.

  2. As always Matt, very informative reading 🙂 See ya later this year at Pubcon.

  3. Excellent documentation and I appreciate all of the effort that went into it, by you and whoever else worked on it. It makes my life a lot easier to have “official” stuff to point to when trying to make a point.

  4. Eric, I sat in that session, and I think Evan and I agree completely. The short answer is that Google wants to judge the same page that a user is going to see. It’s easy to prevent all well-behaved bots from clicking on ads (e.g. with a robots.txt), and all smart analytics packages can break out the search engines separately from users.

    JLH, I like having it in official places too. 🙂

  5. Great read Matt. Can I ask: when a page fails to receive any PR after a considerable time (say over 6 months and it’s still greyed out), and it’s absolutely not dup content from anywhere else, has Google defined that page under the new doorway definition, hence the PR being shy to appear?

  6. Thank you for this one Matt. The new clear definitions are much better. I have been telling people for a long time about the ‘doorway pages’ thing. Either they think I’m nuts for saying you can use them or dumb for saying they are bad ideas.

    Now your team has made it make sense. As an entry point that is useful, landing/doorway pages are fine. Yet as I have been saying forever, a page that is just…

    KEYWORD
    key-word keysword
    link link

    is total garbage and pointless.

    Keep rocking it man, I’ll keep reading.

    peace

  7. Some nice improvements from Google in there.

    No matter how clear your writing, there will always be some that try to twist the words though.

    By the way, what’s going on over at 74.125.19.18 then?

  8. Is that what you were doing during the last session and lunch?? 🙂 Workaholic! 😉

  9. Beyond making sure you’re not doing anything to get yourself blacklisted, SEO reaches diminishing returns very quickly.

    Worry about writing good stuff, getting good incoming links, and getting repeat visitors first. Worry about SEO second.

  10. Glad you did this post. I had a few questions about IP delivery and “cloaking” after this morning’s sessions. Will do some more research as I have an old script that creates crazy long dynamic URLs with session IDs. Time to fix it.

  11. thanks matt. great to see google evolving.

  12. Are all paid links a violation of Webmaster guidelines – or are there differences and exceptions when it is perfectly fine to get a paid link WITHOUT the NoFollow?

  13. hey matt regarding choosing the www/non www, could google add the option to select a country specific domain as well. For example I’m in Australia and a lot of the businesses I work for prefer to register the .com and .com.au version of their domain and point one or the other to the main domain. This can cause dupe content issues. I know it’s possible to Rewrite etc etc and I do that, but I thought it would be much simpler for webmasters to control this in WM Tools?

  14. One of the most damaging things in a capitalist system is a monopoly. Google has a monopoly on internet search.. admittedly they have a monopoly because they do it so well. However, because of the monopoly almost all websites are frighteningly exposed to the whims of the ‘algo change’ or some of the mystery -30 -950 or minus whatever penalties. I think Google really has a responsibility to inform (proactively) what is going on in terms of these penalties. Not just de-indexing or hacked sites reports available in the webmaster console…. those are pretty easy to figure out.

    Matt, you listed a number of resources above. What about a place a webmaster can go to communicate with Google about said penalties or changes? Check out webmaster central. It’s littered with webmasters who had great organic traffic one day.. and none the next. This has got to be the most frustrating thing for webmasters trying to do the right thing and follow all Google’s guidelines. I think it also engenders a lot of animosity about Google’s monopoly. Google would do a lot for itself to cut off this criticism at its knees.

  15. Wow,

    I thought I was tough on Google with some of the articles I write. This guy Michael Martinez is the king. He is smart and funny, I had a big laugh reading his article.

    I think Google did a good job with all this new documentation and I am glad to see it.

  16. One thing I think Google is still missing the boat on somewhat are sites that generate almost all of their PageRank from what I call artificial backlinks. It appears to me Google sometimes does things to limit e.g. counter sites. However I still see way too many PR8 photo sites and software sites that force anyone that uses their free photo album software or other types of software to link back to them. So even if they are not selling links or only selling a few links that would make it difficult to detect, my concern is all the good rankings they receive with these artificially generated backlinks. I suspect if Google devalued these forced backlinks from these PR8 photo album sites they would really only be PR1 or PR2 sites. I know there is truth to what I am saying because I see these types of sites use these artificial backlinks to create powerful internal pages on hundreds of different topics that all rank fairly well in Google. My concern is not them writing about hundreds of different topics, I think that is fine.

  17. Excellent Post Matt! Some ‘things’ are much clearer now.

  18. Matt,

    This whole nofollow thing is clear. But there’s something I don’t understand. We sometimes set up a directory with just one purpose, to show something to a customer. Our experience is that it is absolutely necessary to block these directories from the search engines. Even though we don’t link to it, nor does anybody else, Googlebot figures out the existence of these directories and tries to index them. I’ve heard other people complain about this as well.

    How does Googlebot figure out the existence of these directories, and more importantly, why does it even care to index the content when there are no links to it?

  19. Very good stuff Matt.

    I’m with a couple comments above however, in that it does not really matter how clear Google tries to make things as SEO’s etc will spin things anyway in order to make their points.

  20. I am always happy to see the major search engines releasing more detailed documentation. I am also happy to see some old definitions redefined. Clarity, it seems, is a good center for this release.

    @ miss universe:
    I would think that in most cases paid links are simply put not a good idea. While I’m sure it’s okay in certain situations, you still run the risk of someone reporting paid links after which you will go through the motions to have the penalty reverted. Best to just avoid them and play safe.

  21. Paul, normally we view doorways as these specific things, so I don’t think that would account for it. It’s possible that the links to that specific page are just carrying no (or very little) weight. But remember that we only push PageRank to the toolbar every 3-4 months, so it could just be that it’s missed 1-2 pushes, too.

    g1smd, I haven’t been keeping a close eye on datacenters the last few days. Maybe someone at Google will magically see your question and email me to say “Oh, that’s project X.” That would be nice, because I’m feeling lazy tonight after the question and answer keynote at SMX..

    Kate Morris, you could argue that I have a few work-too-much tendencies. 🙂 But I wanted to get the post live in time to mention it during the SMX session.

    john andrews, there’s still 1-2 things on my to-do list. I recently re-read the “what is an SEO?” page, and I don’t think it’s aged as well as some of the other quality guidelines. I think that we could rework the tone a little bit. But that will have to wait for another day.

    josh, I’ll pass that on.

    Thanks, Tom Forrest. If you want to send in examples via the spam report form, that’s the perfect sort of thing that we enjoy checking out.

    Peter (IMC), Googlebot is good at finding new pages; there’s a lot of links out there on the web in strange places. Sounds like you’re doing the right thing to block any pages you don’t want indexed, though.

    Doug Heil, there may be a few people who deliberately misinterpret what we say, but that’s all the more reason to improve the documentation so that fewer things can be misconstrued. 🙂

  22. “Adding a nofollow attribute to a link doesn’t affect users, but can serve as a useful indicator to search engines that you don’t necessarily want PageRank to flow through that link”.

    This implies that nofollow does not always prevent PageRank flow.

    Is the statement still correct if “can serve” is replaced by “serves” and the word “necessarily” is removed?

    This would make it definitive rather than optional.

  23. Matt, many thanks to you, and all the other Googlers who have worked to improve the documentation for webmasters and webnotsomasters like yours truly. 🙂

    I am one of the few “webmasters” [maybe the only one] who attended last year’s SMX Advanced trying to improve only one site rather than thousands. Several attendees requested expansion and clarification of the guidelines. The Google Team responded immediately by publishing the requested information by the end of the day and even more in the days following.

    I had not revisited those pages for some time, until reading this post. Wow! The section “Creating a Google-friendly site” is an especially nice addition.

    The price tag is too steep to attend this year’s event so I’m following the bloggers covering it (thanks to Bruce Clay’s team). Frown at the blackhat SEO’s for me, and please continue the great job you are doing for those of us who are “primarily” building web sites for users – while learning and trying every thing possible to help those users find our content.

    Again, thanks.

  24. Dave (Original)

    Matt, please tell me you have NOT changed the Webmaster guidelines wording based on feedback from SMX? You do realise that those who attend mostly have their own agendas and do NOT represent the majority of Webmasters. About .0000001% at best.

    Doorway pages are typically large sets of poor-quality pages where each page is optimized for a specific keyword or phrase

    Sounds like a typical page after being optimized by a SEO.

  25. Dave (Original)

    This implies that nofollow does not always prevent PageRank flow.

    Perhaps Google DO flow on PR at their discretion. However, the use of nofollow ensures you won’t get the blame if you do link to a bad neighborhood.

  26. Hi Matt,

    Thanks for the post and enthusiasm.

    The post said this re: CLOAKING:

    “Serving different content to users than to Googlebot. This is a violation of our webmaster guidelines. If the file that Googlebot sees is not identical to the file that a typical user sees, then you’re in a high-risk category. A program such as md5sum or diff can compute a hash to verify that two different files are identical.”

    That’s not very much in fact. Not much more than what was up already. And that’s a problem because at the moment the Google definition of cloaking is vague:

    1. What about REST? URI’s represent different resources. Yet more than one representation of a resource could exist at the same URI. Performing an MD5 on a URI is not good enough.

    2. The wording used to describe cloaking is always “different content”. Google should be more proactively specific in this regard. Does Google mean to say that serving the same content represented in different formats/structures (xml/html/plain text/Javascript-enabled) is cloaking? Or does Google mean to say that serving different content (regardless of format or structure) is cloaking?

    Implications of the above:

    If cloaking truly means serving different content as opposed to the same content with different structure, then webmasters will be able to push client-side implementations (as Google themselves have done) using the fragment-identifier and Javascript, and then serve the same content, albeit in different (plain-text) structure to Google.

    The world is clearly going client-side. This needs to start happening!

  27. Since the “original” Dave’s comment appears to be inspired by my previous one, I guess I need to follow up:

    1) Matt, and the other Googlers, stated their reservations for being too transparent with the Guidelines because they are well aware that devious people could and would take advantage.

    2) They also realized that the feedback given from the less devious among us in fact DO represent a large percentage of those frustrated web content authors out in Googleland – if not the majority.

    3) I don’t really believe Google would allow any SEO’s to optimize the Guidelines. Matt’s job is hard enough.

    I am, however, grateful that Google is working towards improving them. Hopefully, they will continue to improve the documentation so that it doesn’t take an industry insider or software engineer to interpret them.

    I confess that I have my own agenda: I write content in the hope that readers will benefit. In order for that to happen the readers have to be able to find my web site. With all the misinformation in the SEO world, it really helps when I can get extensive and clearly written help directly from the source.

  28. “Yet more than one representation of a resource could exist at the same URI. Performing an MD5 on a URI is not good enough.”

    When I read up on REST, I see that people recommend to put different resources on different URIs. I did a quick search and found that recommendation at http://www.xml.com/lpt/a/1510.

    It just seems intuitive to me. What also seems intuitive is that things that rely upon heavy AJAX are not necessarily appropriate for crawlers, just as the middle of my Gmail session is not necessarily appropriate, and can easily be blocked from crawling. It’s typically the end result of the AJAX interactions/sessions that are saved and accessed on different URIs.

    Am I off base here?

  29. Hi Matt,

    Hope you are doing well; great post you’ve written here.

    I have a website with zero PageRank in Google, but whenever I check its cached version in Google through cache:sitename.com it shows me PageRank 3. And just the opposite happens for websites/pages that do have PageRank: when we check the cache of those pages it shows zero PageRank.

    I will appreciate your views and thoughts on this issue. Does Google assign PageRank to the cached version of sites and pages, or is it just showing up without any reason?

    I hope you will get a little time to clear my doubts on this query.

    Thanks,
    Mohit

  30. Hi Matt,

    what I still wonder is whether it’s okay with Google if a website uses language detection (based on the browser-language setting) and redirects to the English, German, and so on version of the site.
    For users it’s more convenient, though I still don’t know if Google rates this negatively. Many people would love to hear a clear statement on this.

    The other question is how to do the best approach for country-specific content. Is it better to have http://www.abc.com for the international english version and http://www.abc.de for german, http://www.abc.us for USA and so on or is it better to have http://www.abc.com and http://www.abc.com/de or http://www.abc.com/us ?

  31. Matt,

    “As one of the main authors of those quality guidelines”

    But you need to keep those quality guidelines “minty fresh”, right?

    There isn’t much about publishing fake content to game GOOG’s system. That caused some confusion recently. Time to call spam spam 🙂

  32. MichaelDuz, you’re right. I’m not sure whether I’ll change the post though; I was thinking about other search engines such as Yahoo and MSFT and didn’t want to exclude them.

    Harith, I covered this in the You and A earlier today. I didn’t see you in the audience? It’s not that far for you to come, right? Couldn’t you just zoom in over Canada?

  33. Disincentive

    Matt, when it comes to fighting spam Google does an excellent job. It is one of the reasons I use Google. Further your blog is great because it shows a human face to Google, even though you are not Google.

    However, I have seen many sites that are clear violations, have been reported, and nothing ever comes of it. A year later they are still near the top. I feel this is a huge disincentive for those who are spam fighters on the Internet. It gives the message: why bother to report anything. I could give specific examples here but I do not think you want that on your blog. However, I am at a loss to understand how many sites get away with it and others start adopting their practice as nothing is done.

  34. Matt,

    Thanks. I wasn’t in the audience, but look forward to meeting you one day.

    I’m in Copenhagen-Denmark and it takes around 10 hours to fly to Seattle 🙂

  35. I really appreciate the continued flow of information from Google.

    The only significant improvement I have ever really wanted from Google is simply that I would love to see it more and more transparent.

    I understand the reasoning that if all the secrets were out, it would be too easy to game.

    The more information that is put out like this the better for all of us.

    So thank you for more!

    dk

  36. Hi Matt,

    I like the new definition for Doorway Pages. I think it better describes what Google means by a doorway page. But it also raised a question. I like to do everything white hat as an SEO. I mostly do SEO for relatives and friends and I tend to use so-called “Landing Pages”. These are pages that contain information about a specific product or service and are optimized for a specific word or set of words. They serve unique and relevant content and do not automatically forward any visitor, so that the quality of the pages meets the expectations/needs of the visitor. The only difference is that these pages aren’t accessible via the standard navigation, but more often are linked to via separate links in or around the content on the page.

    Would you or the quality raters automatically identify those pages as spam or doorway pages?

  37. I would like to ask about WAP (mobile) sitemaps, and whether they resemble cloaking. To my understanding they don’t; it’s somewhat similar to IP detection, where the server returns results according to IP.

    So how will Google crawl a WAP site where the server returns content based on the device?

  38. It’s good to see that there is new documentation on Robots.Txt and Nofollow.

    Robots.Txt implementation still varies a lot cross-engine, especially Wildcards etc.

    Some interesting notes on Geo-Location,

    Can I ask, what is the impact of the “Geographic Location” setting in Google Webmaster Tools?

    The PubToilets.com Team.

  39. Cheers for this Matt. I’m now going to have to go and update all my documentation for clients with your new and improved definitions :p

  40. What about protecting sites from “evil” bots?

    Say I want to block automated queries to my server (for instance: more than 100 queries from the same IP within a given time) WITHOUT blocking Googlebot AND without being banned for cloaking?

  41. Hi Matt

    I appreciate you giving us further information and clarification of Google’s documentation guidelines; however, it would be most helpful if Google would communicate with webmasters via the message center on issues specific to their website. For example, I received a penalty and believed in good faith that any violations were corrected, and sent a number of reinclusion requests in the webmaster tools center. To date there has been no reply. I am aware that you are concerned about spam and this may be the reason for non-communication to webmasters, but it would be most helpful if you clarified how the message center works.

  42. Hi Matt,

    Glad to see the guidelines had a mini update, especially liking the new definition of Doorway Pages and the addition of ‘primarily’. Just one word but it makes a great deal of difference!

  43. purposeinc: if all the secrets were out, it would be too easy to game.

    Would that be much worse than the current situation, whereby Google keeps its methods secret?

    So SEO companies spend lots of money working them out.

    So we pay SEO companies to tell us what they’ve worked out.

    So some of us then know some methods – but only if we’ve paid the right money to the right people …

    And still when I type a hotel or restaurant name in, I get annoying directory sites that don’t give the URL of the one I’m looking for. 🙁

  44. Sounds like a typical page after being optimized by a SEO.

    That comment shows how little you know of SEO.

  45. Hi Matt,

    One thing still missing is the case of countries where people speak several languages. Neither the IP detection nor the domain extension (.us, .fr) can help in that case.

    Why not use the HTTP_ACCEPT_LANGUAGE when defined and a default language when not defined, plus a link to change the language.
    This is necessary only for the base URL: http://www.domain.com.
    For directories, we can have specific directories for each language.

    Regards,

    Pierre

  46. Thanks for the detailed post, Matt.

    I have a question about recognizing when a site has tripped on one of the guidelines you highlight here– is there a tell-tale sign we can look for? I have a sneaking suspicion that we may have unknowingly fallen into that trap, but would love a way to figure that out for sure.

  47. Matt, is it considered cloaking if a site changes the page text and graphics based on which country the viewer is coming from? How does Google decide when a site is trying to do something “wrong” and when it’s just trying to help their own customer experience? Lots of sites nowadays are trying to cater to the viewer based on their language preferences.

    Also, I am assuming you will attend SES in San Jose in August?

  48. Hi Matt,

    Thank you for this juicy info! The Google booklet will certainly help me educate my clients and students in a few SEO basics (that ultimately will help them better understand and appreciate why I do what I do).

    I can attest to Google’s quick response to spam reports ~ the offending site was gone the next day.

    Keep up sharing the wealth 🙂
    Bonnie

  49. Matt,

    Thanks for this update. The additional clarity makes our work a little simpler.

    Thanks again,

    Mark

  50. thanks for clearing some very important issues. We have debated these issues for months on digitalpoint forums. While I lost some of my arguments with the clarifications it feels good to know that suggestions don’t fall on deaf ears.

  51. Glad to see the engines standardizing (attempting to anyway) on tag attributes. The lack of standards with rel=nofollow was one of my main points in the bot herding session.

    I also heard rumors of rel=external nofollow – what’s up with that?

    Good meeting you yesterday.

  52. Hi Matt,

    On this subject I was hoping you could help out with a problem we’re having with a site redesign for a large site (500k pages, 1.3 million visitors / month)?

    Our designers want to use tabbed navigation on a lot of pages to allow users to easily “flip” between different pieces of content. Imagine a product page, and within the page there’s different tabs for “summary”, “product spec”, “full information”, “reviews” – the idea is to use CSS to hide the inactive pieces of content until a user clicks on them.

    This would mean that Googlebot would index all the 4 pieces of content as a single page / URL – but I’m concerned that this could be misinterpreted as being hidden content. I wouldn’t be concerned about a manual review (it’s a great site and we’re not doing anything naughty), but is there a risk of us being potentially caught in an automated filter because of the hidden content?

    Other sites we see using this technique either block the hidden content to search engines (which we don’t want to do because it’s a lot of unique content) or use separate URLs for each tab (which we don’t want to do because it would create far too many new pages, some of which would be very low on content).

    What would you recommend in this situation? If we use CSS to hide the inactive content until a user clicks the tab, is there any risk at all of being caught in a spam filter? Or am I just being yet another paranoid SEO? 🙂

    Thanks in advance,
    Scott

  53. Are you always this friendly? Is it due to all that google stock or were you just born so friendly? 🙂

    It was a very informative post, btw.

  54. I’ll try to respond to the comments more in a bit, but I wanted to point out that we did change Google’s “What is an SEO?” page to remove the “you should insist on a money-back guarantee” part of that page. So the tone of that page isn’t perfect, but it’s still a little better than it used to be.

    Mark, try reporting it again? If the case is that clear, I’d like to think that we would take action on the site. If that doesn’t seem to do anything, I might show up in our webmaster help group to ask folks why Google didn’t take action.

    Bonnie Parrish-Kell, glad we took action on the site you reported, and that booklet is pretty helpful, right? I liked it a lot.

    Adam Audette, you must be listening to different rumor machines than I do. I haven’t heard of this “rel=external nofollow” of which you speak. It was nice to meet you yesterday too, though.

    Mark Knowles and oral seymour, glad if it helped. Brandon, I try to be pretty friendly. 🙂

  55. Matt, love the changes and love that you guys often talk about getting the spammers.

    Unfortunately, it often seems to be just that, talk. We regularly report (only blatant) spam, and follow up to see if action has been taken. We have reported, multiple times over the last year, a high ranking doorway page that is so blatant it has “Doorway” in the title tag…nothing. We have reported blatant sub-domain abuse+doorway pages (google “any city”+apartments) to no avail. We have reported a site that is hiding 800 words with a -500 div tag…nothing. If there’s no follow up, what’s the compelling reason to keep reporting? We’re trying to help you guys, but if valid reports don’t result in any action I think you’re going to see the reporting of spam from credible, authenticated sources drop way off.

  57. Sounds like the “external nofollow” may just be a misunderstanding and/or vicious rumor of randomness. All I found was http://www.mattcutts.com/blog/comments-on-thomas-claburns-piece/#comment-107762 in regards to WP using nofollow for external (aka outbound) links.

    I was told it was found as another attribute/class in the rel= tag on some sites and commented on by yourself. Never mind 🙂

  58. Adam,

    rel=”external” has been used by some to bypass the XHTML guidelines that prohibit the use of target=”_blank” by using javascript to do the same thing.

    I haven’t heard of it having any potential SEO effect or implementation.

    Patrick

  59. view the source of Matt’s blog; he uses rel=”external nofollow”, which is just different things separated by a space.

    It could be rel=”external monkeys nofollow chicken” for that matter, which would be the same as rel=”external” rel=”monkeys” rel=”nofollow” and rel=”chicken” all listed in the links. As far as I know all Google will care about is the rel=”nofollow” one, unless of course there is a secret rel=”monkeys” beta program going on. 🙂
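
    For illustration (with a made-up URL), such a link just looks like this; as far as anyone can tell, the only value in there that Google acts on is nofollow:

        <!-- "external" means nothing to search engines; "nofollow" is the part Google honors -->
        <a href="http://www.example.com/" rel="external nofollow">example link</a>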

  60. Douglas Gottlieb

    Lots and lots of sites use image replacement techniques with CSS to allow both SEO friendly headlines and pretty graphic images of text. Hell, practically every modern CSS book preaches this technique. But many people worry that hiding text off screen or under the graphic could be misconstrued as deceptive, causing Google to not index them.

    Many people have weighed in pointing out that “the spirit of the webmaster guidelines” seems to imply that Phark image replacement is okay, but I’ve never seen Google directly state that Phark is okay, so people around here remain a little uneasy about it.

    Can anyone point to an official “Phark is okay, so long as the html text and the image are a match” type of statement from Google?

    Thanks!

  61. Hi Matt a question

    We report the many websites that are trying to “cheat” Google.

    But how long does it take before Google takes action? One week, a month or longer.

    BTW..We see the black hat, like cyclist taking EPO – sooner or later you’ll probably be busted 🙂

  62. Just saw Mark’s comments… ok..

    Our SEO guy is at SMX; he would love to say hello to you. You’ve got a fan there :-)

  63. Matt,
    When is Google going to get back to me about the WebMaster Guidelines area? I’ve emailed devs, AdWords reps, used the spam report tool to try to talk to your team, etc. You seem to not want to address my issue. Can you at least confirm that this is the case?

    I know what they say about assuming..but… You probably have my email address through this blog and its spam/IP filtering ability…can you send me a message on this issue?

  64. Hi Matt, glad to see the updates to WebMaster Guidelines. As more of a developer than an SEO, I usually just listened to Rhea or whoever we had in house on Google’s stance, but reading them for myself I think I got a better feel for the tone you are trying to set. I’m definitely going to use them as a guideline for our younger developers/designers. Unfortunately I think a lot of them get too wrapped up in trying to do something different or use the latest DOM/AJAX widgets to stand out, and UX is completely lost. So it’s very nice to have that focus on a user-centric ideology from a search perspective as added reinforcement.

    I also enjoyed the session yesterday with Danny, and the comments and justification/explanation as for why Sphinn was a ‘legit’ use of a new domain. Forgive me if this isn’t the appropriate place, but with all the talk of spam in this thread, I was wondering if you could speak of any spam issues in a vertical I haven’t heard a lot about: Jobs?

  65. Brian, can you try the reports once more? Those two things sound like stuff that we’d be interested in.

    Ah. Yup. The values in the rel space are separated by spaces, like JLH points out.

    Scott, what’s the issue you’re trying to get addressed?

  66. Dave (Original)

    Matt, how about 1 non-ambiguous statement?

    “Any Site/Page Deemed By Google As Deceiving or Attempting to Deceive Will Be Banned or Penalized. Google’s Decision is Final”.

    That comment shows how little you know of SEO.

    Or, how much I do know about most SEO’s.

    SEO=Myth and if you design/write pages for users, you and Google’s algo are singing from the same song book. Optimize for SE algos and you are doomed to failure.

    I liken most SEO business to a Lawn Mowing business. Anyone can start one.

  67. Matt,

    This isn’t webmaster-related as such, but I did notice an odd Google Maps-related bug (since Scott brought the bug topic up), and I figured you’d know who to pass it on to.

    Try the following address on http://www.google.com/maps :

    444 Yonge Street, Toronto, ON

    Zoom in 1-2 levels, then look at the “Link to Page” link (specifically, the latitude/longitude coordinates).

    Then type in the following address:

    50 Eagle Street West, Newmarket, ON

    Look at the “Link to Page” link. The same latitude/longitude coordinates are listed as the ones from Yonge Street previously.

    An odd thing I noticed is that if you do generate a map for the same address twice (e.g. the Eagle Street address), the latitude/longitude do correct themselves…just not the first time.

    It’s no big deal to me personally, because I worked around the issue. But there are probably others who got trapped by this at some point. That’s all I wanted to say.

  68. I personally praise G every nite and include G in my prayers…can that be added to the algos with my site details and a do-follow? 😉

    That No-Follow page on webmaster tools is very nicely done, thanks for the clarity, so much controversy out there about the no-follows, this really clears things up!

  69. Matt, the guidelines say: “Don’t deceive your users or present different content to search engines than you display to users, which is commonly referred to as “cloaking.”

    What if someone presents content to users and not to the search engines? Would that be perceived by Google as a violation? Would you define that as cloaking?

  70. Matt, I think trustrank carries too much weight in Google.es.
    Many newspapers show up in the results, and they are always the same; this happens for many searches.

    There are results from newspapers about finance when you search for a car…
    These results are bad for users.

    Is it because they are in Google News?

  71. Reading the post above, it is obvious that google is very focused on reducing spam, and you are in charge of that. I am sure that the reason that you want to eliminate spam is so that users can more easily get the results that they are looking for. However, as a layman, focused on real estate, I have to say that some of the absolute worst sites imaginable are in the top 20 results using the keywords “Miami real estate”, and some sites that actually give the user what the user is looking for, are very far back. It almost seems as if google feels that in eliminating spam, users will get the best results; or, if a webmaster uses various techniques to get other sites to link to their site, it deserves a higher placement. Google needs to develop techniques, whereby a great site that gives users the information they seek, is exactly that, and is recognized as that. I have such a site, not in my own opinion, but in the opinion of users; yet it is hardly recognized by google, except for a few search terms, where it appears as #1. Inferior sites that users are never going to use, appear in the top 20 search results, and superior sites sometimes don’t appear at all.

  72. Hi,

    maybe this question has already been asked somewhere in this blog, if yes, sorry, I did not find it 🙂

    I am still very unsure what exactly the “report links” campaign means. When I do a search on Google Germany for “kredit” (means “loan” in English), I am getting spammed by lots of sites which do contain information about loans, but in the end, you will only encounter affiliate links which make those webmasters rich.

    A quick link research on Yahoo confirms that almost all links are bought or are set by a network of sites, which are created for the sole purpose of pushing websites for specific keywords.

    So what confuses me, as I think that surely a lot of people have already reported those sites, is why there is no punishment? I think it’s kind of clear that no experienced webmaster or blogger links to a pure info/affiliate site just because the content is so great or something like that.
    A real impact is necessary to show all people (including me) that Google takes buying links seriously.

  73. Matt, I am a regular reader from Europe and an enthusiastic Flash developer.

    I am wondering how best to optimize Flash websites. I have read a lot of blogs about this topic, but I would like to ask you: when I have a Flash website and I am providing the same Flash content as HTML content for search engine bots, is this cloaking?

    In Flash, there is a possibility to provide URLs for visitors so that direct access to a specific piece of content is possible. When this content is available as an HTML version (same URL) for bots, this isn’t cloaking, is it?

    Thank you

  74. I had some questions as well, but after talking to people, I got it rectified. I had problems with my site too that got me a penalty today. But I have worked hard on it and brought it in line with the Google guidelines, which I thoroughly read today. I have already submitted for reconsideration after being sure that it meets the guidelines. I knew it was a penalty because my site started ranking on the 5th page all of a sudden since this morning.

    But I have a question. For a while now I was facing a bot attack on my forums, and today it just stopped after the penalty. There were multiple instances of browsing from the same IP/route, which I was unable to ban completely, so it returned with a new IP again. I wanted to know whether that could be a spam attack and cause me a penalty, and how it just stopped today after I received a penalty. It might also be a cause? Since I don’t know much about it, I seriously hope that my penalty goes away because I promise not to commit the mistakes again in the future. And this attack or spam from that bot was not me at all. It brought my server down every day.

    Any help will be great Matt, looking forward to your reply.

  75. I like the post Matt, and appreciate the added visibility and clarification that Google is providing. I published recently about the additional visibility and responsiveness that Google is providing and I think it’s time for another post of similar nature ;D As an SEO’er I’ve still got plenty to learn and information like this helps a bunch.

    I’ll be directing some of my clients to the robots.txt publisher document as well. It clearly describes what I otherwise couldn’t sum up so clearly 🙂

    Keep it coming!

  76. Hi Matt,

    Thanks, as always, for the info. But the nofollow thing is more confusing to me now:

    “using nofollow causes us to drop the target links from our overall graph of the web”

    So if I nofollow a link

    1) at the home page, will it be considered as nofollowed for the whole site?
    2) supposing I have the same link twice on a given page, once normal and the other one nofollowed: will it be considered as if I had nofollowed both?

  77. Matt, thanks, I re-reported the links about 1:30pm. One of them removed the hidden text mysteriously last night, and it’s been there for months, methinks they read your blog 🙂 However, the hidden text is still showing in the Google cache of their site, please do something about these nitwits 🙂

    Also, if possible can you comment on the spammy sub-domain doorway pages abused by the apartment finder companies? If you search for a city+apartments, the SERPs are ruled by sub-domain spam. Is Google planning on doing something about these, or should we take that as a sign that you approve of that technique? Thanks!

  78. Matt,

    I am curious about Google’s position on links from sites that provide certifications. Some sites, like McAfee’s Scan Alert, provide certification that a site passes a hacker test. This certification would seem to add value for users. When a site such as Scan Alert forms a list of sites that pass this certification, site owners link to it because of the information it gives their users. To get certified you must pay a fee, but you are also getting a service which is valuable to web users. Does Google give any weight to any kind of certifications a site may have, and are the links from the certifier considered paid links?

  79. Matt-

    Are there still issues with Googlebot ignoring robots.txt files for images??? I know this has been an issue in the past but I’m still seeing it on numerous websites. The images are still in the Google index. 🙁

    What can be done??? (the robots.txt has been up for months)

    thanks,

    funnyman

  80. Matt, how about 1 non-ambiguous statement?

    “Any Site/Page Deemed By Google As Deceiving or Attempting to Deceive Will Be Banned or Penalized. Google’s Decision is Final”.

    This would be a suicide mission on Google’s part, even if it were the right thing to do.

    Google makes this statement.
    Blackhat politickers intentionally misinterpret this as “doing evil”.
    Mainstream media (the great unwashed idiots that they are) get a hold of it.
    And Google is evil. Correct, but evil.

    Leave it the way it is. If people can’t understand it or have trouble with interpretation, they shouldn’t be calling themselves SEOs or web marketers or social media optimizers or visitor experience optimizers or anything else.

  81. Matt, I have a question in regards to nofollow tags on a press release site. I’m the owner of PRunderground.com and we currently do not use the nofollow tag on links, because I’ve heard that Google views press release sites differently and PR is not passed anyway.

    I know sites like PRWeb.com don’t bother with nofollow tags either, but perhaps you could give me a solid answer on this?

    So basically my question is this: Do press release distributors need to bother with nofollow tags?

  82. Multi-Worded Adam, I will pass that feedback on.

  83. Dave (Original)

    You shouldn’t ass-u-me I mean *replace* when I never mentioned the word or anything remotely close.

    The “suicide mission” would be Google catering to blackhats and gutter journalism.

    IMO, there is simply too much documentation and the *average Webmaster* is likely overwhelmed by it all. IF Google simply put that (or similar) at the TOP of every page of the guidelines, I believe it would help immensely. Best of all, there is NO ambiguity.

  84. Matt

    Could you offer some insight into how Google goes about its manual clean up. For example, I am curious how a site that games the system is removed and if there is any way to suggest these site for removal.

    A recent example I found was a site that is less than 1 month old ranking first page for mortgage, home loans, etc…highly competitive mortgage terms with zero content and a number of links from .ru country codes, the site was lockerroomphx dot com and curious how the human element could help filter these type of sites out of the system faster…?

  85. Thanks, Matt. You da cow! No, YOU da cow! Disney really understands us!

    You shouldn’t ass-u-me I mean *replace* when I never mentioned the word or anything remotely close.

    Just to clarify…I never thought you meant “replace”, and I’m not sure how you made that leap. Whether they replace a paragraph, replace a section, add a paragraph, add a section, mention it in the Google webmaster blog, hire a skywriter, or anything else they decide to do, it’s immaterial.

    To make this abundantly clear:

    If big G were to make that statement in any form or fashion whatsoever, there would be negative repercussions because of the tone of the statement and the misinterpretation by blackhats, intentional or otherwise.

    In other words, it isn’t what is said that matters, but how it appears. It took Matt 350 words to explain the context of a one-word change because he clearly understands issues of interpretation and context. He knows that people will try to screw with the spirit of the guidelines, so he’s trying to clarify. I don’t think he’ll succeed 100% (not because he didn’t do a good job, but simply because I think it’s impossible for him to do so), but he’s trying.

  86. Some nice updates, and while at SMX. I thought I was a workaholic 🙂

    For the IP delivery, I would love some clarification on multi-variant testing. With PPC it’s easy enough for us to test using split-run redirects, but with the content part of the site we need to keep the URLs the same for both the users (so they can link to the URL) and the bots (so it’s not hard-core cloaking). So really the only way to do in-depth MV testing is with some form of IP delivery. Tools that use JS replacements for MV just don’t cut it for all tests, like say a complete page redesign.

    I know many of the big boys use IP delivery for their MV testing (example platform: http://www.sitespect.com/). Can a company feel safe going with a platform such as that? Can there be a standard way to let a search engineer know we are MVing and not cloaking for SE reasons? Maybe a standard MV disclaimer in the foot of the page not matching what the bot sees?

  87. Valuable updates…Thanks !

  88. Matt, in which languages can we send a spam report?

    Any language or you have a list of languages accepted ?

    Regards

    Manuel

  89. Can a penalized website gain high PR again if they don’t break any rule again?

    Or would it make it hard for the penalized website to rank better? Penalized as in decrease in PR.

    Thanks!

    regards,
    Zafar Ahmed

  90. Matt, Can you please clarify this for me? If there are multiple links on the same page to the same url does it make sense to no-follow all but one of these links? Or, since there was at least one no-follow for the linked url, google will therefore not follow any of the links – even the links that were not no-followed? (for example a search result’s page where there is a link from a photo, a content text-link and ‘read more’ text link that all link to the same url)

    Thanks

  91. matt, I tried to post a reply but I don’t see it in the comments section. If I try to re-post it, it says that the comment was already made.

  92. Your web master guidelines area lists only one company in the Quality guidelines area. Why? Are there not companies that sell cloaking software which annoy Google? Are there not companies that sell linking schemes which Google could list? Are there not competitors of WebPosition that are much harder on Google in how they perform queries?

  93. Matt,

    I just read your comment policy after my first post was not included. I thought my post about paid links and whether certificate links are considered paid links or not was on topic with respect to the section of your blog reading “We do appreciate getting suggestions and feedback from users, webmasters, and SEOs. I’m especially interested when people want to report spam, including paid text links. Google’s position on paid links that pass PageRank is well-known, because we’ve been pretty clear on the subject.” so I am not sure why it was not included. Most of these links don’t seem to use nofollow, so my question is: are they adhering to your guidelines or should they be reported? Thanks.

  94. Matt,
    Would it be possible to give the URL of an actual site that utilizes a doorway page?

  95. Your web master guidelines area lists only one company in the Quality guidelines area. Why? Are there not companies that sell cloaking software which annoy Google? Are there not companies that sell linking schemes which Google could list? Are there not competitors of WebPosition that are much harder on Google in how they perform queries?

    That’s actually a good point/idea. Perhaps a searchable database of software/linking schemes that are known by big G to violate ToS is in order? (With of course, the requisite disclaimer stating that just because something isn’t in there doesn’t mean that it doesn’t qualify, etc. and so on).

  96. I would like to use a specific example and maybe you could clarify if what I am doing is deemed wrong.

    First we had a technique of using the title in the URL because back in 2005 it was rumored that helped ranks.

    Example.
    http://www.prnewsnow.com/PR%20News%20Releases/Government/Public%20Services/PreSentencing%20of%20the%20Drug%20Involved%20Offender%20TRI%20Releases%20the%20Risk%20and%20Needs%20Triage

    Now for the past 5-6 months I have used a cleaned up url for the same page. It makes user navigation easier and when customers contact us they have an article number to tell us which makes finding the information much easier via the phone than the vague descriptions we got in the past.

    Example.
    http://www.prnewsnow.com/Public_Release/Public_Services/210115.html

    Now google tends to pick up the title url first then update it to the article number url in a few weeks. More and more they are picking up the article number one first but that process seems to be slow since tons of our old links on the web reference the old method.

    We didn’t do this to build a “gateway system” and we do only reference 1 page to 1 page with same/same content. It was just a fact of growth that we had to change the way we did business to be more user effective.

    I want to shut the old system off, but until the natural linking of articles kind of breaks the threshold to allow us to maintain our traffic, am I at risk for not doing so now?

  97. Multi-Worded Adam & Scott,

    I would suggest removing that company name, and accordingly that item of the Quality Guidelines to read:

    “Don’t use unauthorized computer programs to submit pages, check rankings, etc. Such programs consume computing resources and violate our Terms of Service. Google does not recommend the use of products that send automatic or programmatic queries to Google.”

  98. Matt; How does Google feel about the latest SMX conference and that many sessions discussed ways to deceive Google?… or things to do as long as Google does not find out? I know you personally spoke up debunking many things the speakers talked about, but what does Google actually think about how the SEO industry is being run by blackhats now?

  99. I’m not an SEO-versed person; rather, I’m a developer who’s been asked to fix page rankings. Even with these additional docs, I still don’t know how to do something I would think would be SOP in the world of search and Windows/IIS web servers. Can you please help?

    We have a client (http://www.diabeticcareservices.com) which has an SSL certificate, naturally, and is hosted by a third party. All of their pages are accessible through either http:// or https:// protocols. Owing to the third-party nature of the host, I can’t simply add an ISAPI_Rewrite dll to the site, and the robots.txt file is used to block various folders, such as /checkout, /account, etc.

    Our “search” practice insists that this is messing with search engine rankings, yet it seems to me (ah, blissful ignorance) that our circumstance ought to be pretty commonplace, and somehow handled already by the search engines themselves. When I’ve instituted SSL for a site (e.g., https://www.secureachsystems.com), I’ve never actually tried to limit SSL to specific pages. Moreover, when I go to, say, http://www.amazon.com/somepageURL, and I insert ‘s’ after the http and click Go, I still see the exact same page.

    How do I use the current robots.txt file in the root of the web app to disallow https:// protocol searches? Keep in mind that I cannot actually access the web server, but even if I could, I don’t want to create a ‘misdirect’ for https:// because that’s a valid path for users who have logged in to their site account. I’ve seen a post somewhere that said to use a different robots.txt file in the root of https://www.mydomain.com, but that is exactly the same site as http://www.mydomain.com, so I’m really at a loss. Any chance you can help me at least head in the right direction?

    Thanks,
    John
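For what it's worth, the usual way sites handle John's scenario is to make whatever answers /robots.txt look at the request scheme and return a more restrictive file over HTTPS. The site in question is ASP/IIS and hosted by a third party, so the sketch below only illustrates the idea, in Python; the file names and handler mapping are assumptions.

```python
# Illustration only: serve a restrictive robots.txt for HTTPS requests
# and the normal one for HTTP.  File names and framework are assumptions;
# on IIS the same idea would be implemented with whatever scripting the
# host allows, mapped to the /robots.txt path.
from flask import Flask, request, send_file

app = Flask(__name__)

@app.route("/robots.txt")
def robots():
    # request.is_secure is True when the request came in over HTTPS.
    # (Behind a proxy, check the X-Forwarded-Proto header instead.)
    if request.is_secure:
        return send_file("robots_https.txt")  # e.g. "User-agent: *" / "Disallow: /"
    return send_file("robots_http.txt")       # the site's normal rules
```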

  101. Dave (Original)

    Just to clarify…I never thought you meant “replace”, and I’m not sure how you made that leap

    No leap. You said “Leave it the way it is.” I’m not suggesting any change to any text.

    If big G were to make that statement in any form or fashion whatsoever, there would be negative repercussions because of the tone of the statement and the misinterpretation by blackhats, intentional or otherwise.

    More assumptions. The day any SE cares what blackhats think is the day they start losing ground.

    In other words, it isn’t what is said that matters, but how it appears.

    ROFL, OK 🙂

    I still say 1 non-ambiguous statement (along the lines of what I suggested) at the top of each guideline page is a step in the right direction.

    Oops….

    Matt: How does Google feel about the latest SMX conference and the fact that many sessions discussed ways to deceive Google, or things to do as long as Google does not find out?

    Considering Google is a major sponsor, I’d say they condone it. I posted a link a few weeks back to show Google sponsors conferences where SE spam is taught and Matt deleted the link. Go figure 🙁

    Matt told me, after stalking him, so long as risks are disclosed, it’s fine to educate on how to spam the SEs. Go figure 🙁

    Wonder how other SEs feel about Google sponsoring “how to spam the SEs”?

    At least it answers the question on why so many ethical Webmasters are totally confused as to what is right or wrong in SEO.

  102. No leap. You said “Leave it the way it is.” I’m not suggesting any change to any text.

    Okay, so you made an incorrect leap to a conclusion.

    No change means no change of any sort…be it addition, deletion, or modification. You took what I said completely out of context.

    If you add your paragraph to the text, above, below, beside, in, out, up, down, whatever…it still constitutes a change to the section. It may not replace text, but I also never used the word replace, nor did I ever at any point intend to…you just stuck it in for no logical reason. You can argue that all you want, but it is what it is.

    Now don’t keep arguing long after you’re proven wrong. I’d like to think you’re better than that.

    More assumptions. The day any SE cares what blackhats think is the day they start losing ground.

    How is that an assumption? That’s following the thought through to a predicted scenario, based on repeated past precedent.

    Google makes an algorithm update…blackhats piss and moan.
    Google cracks down on link exchange…blackhats piss and moan.
    Google cracks down on paid links for SEO purposes….blackhats still haven’t shut up about this one yet.

    And unfortunately, mainstream media (blithering idiots that they are) do about 5 seconds’ worth of fact-checking, interview the first SEO they find about the topic (which is anyone even remotely whitehat), get some bogus ad nauseam rhetoric from said SEO, publish it as fact, and that’s what people believe. And while blame in that situation ultimately lies on the individual to do their own due diligence on a story to find out how true it is or isn’t, the reality of the situation is that said research simply does not occur and that the blind (SEOs) lead the blinder (media) to the conclusion they want the blindest (the great unwashed) to believe.

    Perception, as such, becomes reality. It’s not right. It’s not fair. It’s a complete crock. You may not like it. I damn sure don’t like it. I’d almost bet money search engines don’t care for it. But that’s SEO Sphinn at work.

    Oh wait…did I say Sphinn? I mean spin…I don’t know how I could have made that Freudian slip.

  103. Dave (Original)

    MWA, move on buddy, you are flogging a dead horse. I personally don’t give a hoot what you assume. That is based on your past history of posts to many forums.

    Weekend/family time, so I bid you farewell 🙂

  104. This is all about Google losing money. I have a feeling that the lack of a CFO for the past 10 months, the click-fraud stories in the press, and this attack on all things SEO/links are signs of what is to come for Google. They are losing money and they don’t know what else to do. So they put forth this sacrificial employee, Matt Cutts, to be the bad guy in the SEO community, all the while blowing so much smoke up his ass that he thinks he is some sort of Google god. All this so they can maximize their dollars and show a profit each quarter, because if they don’t, there goes their stock price, spiraling down.

    So Mr. Cutts, go ahead and put out all your confusing SEO advice. Penalize sites all you want; cause people to lose money and lose their homes; break up families; do your worst! You are just going to piss people off, and they are going to come up with even more brilliant ways to fool you, some just to prove a point, not even for profit.

    Just remember, whatever you do comes back to you 7 times worse. So when the news media talks about how much some poor soul is paying each time a looksie loo clicks on his site and the public eventually learns not to click on sponsored ads any more, what will become of your position then?? Pride comes before the fall. Learn already. In thinking they are wise, they have become fools.

    Go ahead and delete this if you don’t like reading the truth, the point is you still read this and the message has been delivered. I pray for you.

  105. Having both the Yahoo and MSN search engines support Google’s wildcards in the robots.txt file makes this SEO tool a whole lot easier to deal with.

    Thanks to JLH for once again alerting me to this important change.

  106. Yep, Adam is dead right and yes, it sure is frustrating beating that horse.

    Dave, I would be surprised if that was Google’s official stance on the teaching of spam. It would be totally against what they are trying to accomplish for their very own users. For the life of me, I would not be able to understand that kind of stance from Google at all, except if it really is only an issue of money. If that is true, that is very, very sad. I do know that some sessions did not say a tactic was risky at all. Matter of fact, a few speakers touted a tactic as a great thing. Yes, some said you can do this as long as Google does not catch you doing it.

    I just think that teaching spam is something NOT needed. I don’t get why some think that knowing how to spam is a good thing even if you have no intentions of spamming. It doesn’t make sense to me at all.

    I think back to the many home builder association meetings I attended in a prior life. They all had speakers who were builders in the city they were in. Each city has its own set of building codes… not law, but simply codes to follow and abide by so the house being built does not come to a halt because the builder put 3 nails per shingle on the roof instead of 4. It’s not against the law to do that, but if caught by the inspector (Google), he would halt the building of the house until the code that was not followed was fixed. I relate that to Google and SEOs, with Google being the inspector and SEOs being the builders.

    You know what would happen at one of those home builder conferences if a speaker discussed “how to build a home with short-cuts” by not abiding by every code for that city? That speaker would be thrown off the podium by other builders and not permitted to go to that conference again. It isn’t a case of a builder having to know all the blackhat ways to build a house at all. He didn’t have to be taught about them. It just was not allowed. Period.

    It seems to be the very same thing in the SEO industry, only that this industry not only teaches spam, but the inspector says it’s OK to teach as long as the risks are disclosed? It’s hard for me to believe that is Google’s official stance on this. I’d have to read it coming from Google on their website.

  107. Doug Heil,

    “It seems to be the very same thing in the SEO industry, only that this industry not only teaches spam, but the inspector says it’s OK to teach as long as the risks are disclosed? It’s hard for me to believe that is Google’s official stance on this. I’d have to read it coming from Google on their website.”

    Haven’t you ever heard of this one:

    “keep your friends close but your enemies closer” 🙂

  108. Sure I have. To what end game, though? If a bunch of people new to SEO are attending these conferences, what would Google hope to achieve by allowing spam to be taught to these newbies? It’s one thing to be a “friend” to a spammer through email or private discussions, but quite another to actually sponsor a conference where spammers are asked to speak and new people are trying to actually learn something. They don’t need to hear “how to spam and get away with it” in order to build a good site that not only Google’s visitors but the site’s own visitors really like.

  109. Doug,

    I guess Google and Matt have to deal with the current reality of the SEO industry, not invent it. And the sad reality is we have some popular names going around under “SEO” titles, while in practice they are nothing more or less than filthy spammers preaching spam at conferences and on forums such as Sphinn, for example.

    Luckily there are ethical SEOs out there too, and they are good examples for the young generation of SEOs.

    I wish to see more ethical SEOs speaking their minds in conferences and forums 😉

  110. Dave (Original)

    “keep your friends close but your enemies closer”

    If that involves the teaching of SE spam at conferences sponsored by Google, then it’s a “cure” worse than the disease and the ethical Webmasters are paying a high price for Google’s friendships.

    I wish to see more ethical SEOs speaking their minds in conferences and forums

    Thanks to Danny Sullivan and other “leaders” who have a vested interest in SE spam, the ethical people are not cool and are seen as do-gooding-nerds to be drowned out at every opportunity.

    Yep, Adam is dead right…

    Yeah, just like you were both “dead right” about the new human edited SE, called……….what was that name….Mah…M something 🙂

  111. lol Dave; Well, I don’t know how Mahalo is doing or not doing, but I do know this fact:

    Jason Calacanis is very right when he speaks badly about SEOs. It appears the industry wants to be known as cheaters and fake content writers. That seems to be the in thing right now, so Jason can speak again if he wants, as I’m very sure he is speaking for the majority out there.

    It’s nutty stuff, though; it just really does not matter what Google or any other engine puts up there in their webmaster guidelines. It really doesn’t. I guess it may matter to new people who want to learn, but those same people then go out to read the many blackhat blogs and social sites and the goings-on of conferences, so does it matter in the end? Sorry, I’m just not feeling very good about this industry. The talking heads are all about the monies. As many have said, and as was stated at that SMX conference:

    “Leave your ethics and morals at the door; you are a marketer.”

  112. Matt, regarding cloaking: I have seen some sites recently that change the order of the text displayed on a page based on whether it’s a bot or a user. The overall content of the site is basically the same, but for the user they dump the heavy, keyword-stuffed text at the bottom of the page, and for the bots it’s presented first. Sometimes it’s as much as a few paragraphs of text and links. What is Google’s stance on this practice?

  113. Matt, since 301 redirection is not possible in Blogger, does the change in Webmaster Tools solve the problem, or do we have to do it by other methods?

    I heard that if you use HTML redirection the blog will be penalized by Google. Is that true?

  114. Jim, I’m not Matt, but what you describe is pure spam. It’s a naive site owner who probably has been reading places like this:

    http://www.searchenginepeople.com/blog/blackhat-seo-ethics.html

    Matt, why doesn’t Google respond to outrageous SEO blog posts out there? The comments following that little article are revealing as well. I just don’t get why Google is not more proactive in addressing the bogus claims and comments by spammers/blackhats.

    “Google Cloaks”

    This is what I mean by it really does not matter what Google states on their site. Read the latest comment by slightlyshady saying that Google cloaks so why can’t we all cloak.

    I really feel the industry is in big trouble of being totally run by spammers. If Google does not get much more strict about things, it will continue its downfall. Do you know how many people are praising that shady character’s post? Many of them. I really feel like I’m totally by myself with issues like these. Well, maybe just a scant few more, but for the most part by myself. I just don’t get it, Google.

  115. Doug, I agree – pure spam. The sad thing is that it’s not a small naive site owner. They are the largest company in my niche and it’s not a small niche market. They’re also the largest buyer of adwords in this niche. Kind of makes you wonder if whitehat is the way to go.

  116. Doug,

    Btw, this is something worth both Sphinning and reading, indeed 😉

    Reflecting on SMX Advanced 2008: Rise of the Black Hat SEO

  117. Who cares about spinning? It should be a link to the actual article Lisa wrote:

    http://www.bruceclay.com/blog/archives/2008/06/smx_advanced_goes_dark.html

    That’s why I posted what I did in prior posts.

  118. It seems that you have to wear many hats to rank well in the organics. In the niche I’ve been monitoring lately, several of the top sites participate in link-buying schemes. I knew about some of those sites before they bought links, so I was able to see just how dramatically the purchases helped. I don’t see any sites getting penalized or dropping from hundreds or thousands of links being devalued, so Google is either really, really slow, or simply doesn’t know about them, even though they have received reports on it.

    It does not appear that Google is doing enough. For example, Matt had to tell at least two people in this blog post alone to go submit another report. Once should really be enough; after all, you’ve taken the time to make a report, or several reports, to help Google do its job, and then nothing comes of it. You get to the point where you’re just not going to submit reports anymore.

  119. Dave (Original)

    lol Dave; Well, I don’t know how Mahalo is doing or not doing

    Well, you’re not alone; the rest of the planet has no idea about “Mahalo” either. Sorry, couldn’t help myself after you and Adam so passionately defended them at every turn 🙂

    Jason Calacanis is very right when he speaks badly about SEO’s.

    In general, yes, he is correct.

    This is what I mean by it really does not matter what Google states on their site.

    Of course it matters. They just need to be patently clear and non-ambiguous, and stop sponsoring “SEO” conferences where SE spam is taught. Hmm, I wonder if the world’s largest ISP sponsors conferences where e-mail spam is taught?

    It’s little wonder the masses believe blackhat SE spam is just part of “SEO”.

  120. Yeah, just like you were both “dead right” about the new human edited SE, called……….what was that name….Mah…M something

    Are they still around? Yeah. So we’re not wrong yet, either. Right now, there’s nothing one way or the other to establish a conclusive stance on success or failure. It’s too new.

    But do not count out Mark Cuban. People may hate Calacanis (and you’re an idiot if he irritates you), but Cuban knows what he’s doing and has a track record of success. He’s not going to buy into something for no reason. Let the thing play itself out fully…which it hasn’t done yet. It’s not even close.

    So if you’re going to resort to cheap shots for no apparent reason, you’re going to have to do a lot better than that.

  121. Dave (Original)

    Right now, there’s nothing one way or the other to establish a conclusive stance on success or failure. It’s too new.

    Didn’t stop you & Doug a year ago… I guess now that it’s becoming apparent they are just yet another tiny drop in an ocean of SEs, you are backing away. Pity you couldn’t apply that common sense a year ago 🙂

  122. Dave (Original)

    for no apparent reason

    My “reason” was made quite clear.

  123. My “reason” was made quite clear.

    Okay, so you had none. Good. Very clear. I got it. Thanks.

  124. Dave (Original)

    Spoon feeding time 🙂

    Yep, Adam is dead right…

    From Doug.

    You see, when your track record of predicting based on assumptions is known, it’s a safe bet to say you are wrong…………………….yet again. When Doug chips in to agree with you, it’s a sure bet you are both wrong 🙂

  125. Come on guys, relax. The issue isn’t Mahalo whatsoever; it’s blackhats who don’t give a hoot about the SE guidelines.

  126. Doug: I know that, but Dave needs to stretch to make himself right in his own head. I at least tried to give him enough respect to explain to him why what he’s saying makes no sense, but I’m done with it.

    You see, when your track record of predicting based on assumptions is known, it’s a safe bet to say you are wrong…………………….yet again. When Doug chips in to agree with you, it’s a sure bet you are both wrong

    * shakes his head *

    It’s okay, Doug. He’s obviously got a hate-on for reasons known only to him, and quite frankly, I’m no longer interested in anything he has to say.

  128. Dave (Original)

    See, you say this “but I’m done with it.” and then start flogging the dead horse all over again. And you wonder why I don’t believe anything you say? LOL!

  129. Matt,

    Long time no post 😉

  130. Glad I got a chance to catch up with you at SMX Advanced. You were a little less mobbed than you are at Pubcons. The highlight of the conference to me is still the 10 foot picture of you that said “Stop being afraid of this guy!” 🙂

  131. Matt —

    Having all the nofollow documentation is a good start, but it would be doubly good if the documentation were clearer.

    “Essentially, using nofollow causes us to drop the target links from our overall graph of the web. However, the target pages may still appear in our index if other sites link to them without using nofollow, or if the URLs are submitted to Google in a Sitemap.”

    Can “other sites” also mean links from a different page on my own site?

    If I have a site with 3 pages:
    Home Page: Links to Page A and Page B
    Page A: Links to Page B
    Page B: Links to Home Page

    If I choose to nofollow the link from Page A to Page B, would this cause Page B to be dropped from your index, assuming it had no external links, or would the Home Page link to Page B be sufficient to keep it in the index? The documentation clearly says other sites, so read strictly, I would believe Page B would be dropped. However, my previous understanding was that Page B would remain (due to the link from my home page). Also, if it does remain, would Page B suffer any additional PageRank penalty beyond the loss of the PageRank being passed from Page A?

    Thanks,
    Andy
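A toy way to picture Andy's question (only an illustration of the documentation's wording, not Google's actual implementation): model the three pages as a directed graph, drop only the nofollowed edge, and see what is still reachable.

```python
# Toy illustration of "dropped from the link graph": only the edge that
# carries rel="nofollow" is removed.  Page B stays reachable because the
# Home Page still links to it with a normal link.
links = {
    "Home": ["A", "B"],
    "A": [("B", "nofollow")],   # A -> B is the nofollowed link
    "B": ["Home"],
}

def followed_graph(links):
    graph = {}
    for source, targets in links.items():
        kept = []
        for target in targets:
            if isinstance(target, tuple) and target[1] == "nofollow":
                continue  # nofollowed edge: not counted in the graph
            kept.append(target[0] if isinstance(target, tuple) else target)
        graph[source] = kept
    return graph

print(followed_graph(links))
# {'Home': ['A', 'B'], 'A': [], 'B': ['Home']}  -> B is still linked to
```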

  132. Hi Matt, how are you?

    Mr. Cutts, with respect to ‘doorway pages’ could you provide some clarity?

    Is it not possible that a supposed ‘doorway page’ does indeed provide some content and end-user benefit (and I welcome others to challenge me on this), regardless of whether or not this supposed ‘doorway page’ is actually optimized for a particular keyword or keyword phrase?

    For example, would a direct response site be considered a doorway page? After all I would love all search engines to index my order page which is directly linked to my sales page…I mean my doorway page. Semantics perhaps?

    What about a very long, one page website that pre-sells an audience on an affiliate product one is promoting; one that targets an audience and delivers a great product review and links to other cool resources. Would this still be considered a ‘doorway’ page?

    Sincerely,

    Daniel Tetreault.

  133. Harith, I just did a post. 🙂

  134. Great summation… just finished reading one or two of the posts that you covered here…

    P.S. Doorway pages have been around since 1995… they still suck for a user. Now, if you could just provide a form to have Google remove domain placeholders, since they ABSOLUTELY provide NO benefit to the user… AND they screw up Google’s reading of bounce behavior.

    I disagree with the notion that different content based on the user agent is necessarily cloaking! For example, I make sure to use the exact same CONTENT for a user visiting most of my pages, but I often strip certain elements based on user agent; thus I can keep the absolute address of a document the same no matter what platform is getting it, while leaving out irrelevant objects for some browsers. (In the case that a visitor is viewing on a smartphone, a document will serve thumbnails instead of large images, some JavaScript/CSS is stripped, iframes are dropped, etc.; the same document is essentially the same if viewed on a desktop browser, just presented in a desktop-friendly fashion.)

    There are also times that a developer may opt to serve different JavaScripts or CSS files based completely on desktop browsers that still do not affect the actual content–a Firefox CSS file vs an IE CSS file, etc. Or loading various CSS and JavaScript based on OS as well!

    As long as the end-user is still getting the same information, then Google should not punish a site for changing the exact format of the presentation–that should not be labeled “cloaking”.

    That’s it for a rant/concern.
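For concreteness, the kind of variation described above might look roughly like the sketch below. The user-agent tokens, paths, and handler are hypothetical; whether any given variation is acceptable ultimately hinges on the textual content staying the same for everyone.

```python
# Hypothetical sketch: identical article text for every visitor; only the
# heavy presentation assets change for small-screen user agents.
from flask import Flask, request

app = Flask(__name__)

MOBILE_TOKENS = ("iphone", "android", "blackberry", "opera mini")

def is_mobile(user_agent):
    ua = (user_agent or "").lower()
    return any(token in ua for token in MOBILE_TOKENS)

def load_article(article_id):
    # Stub standing in for the real content store.
    return "<p>Full article text for #%d</p>" % article_id

@app.route("/article/<int:article_id>")
def article(article_id):
    body = load_article(article_id)  # the same text no matter who asks
    if is_mobile(request.headers.get("User-Agent", "")):
        # Lighter presentation: thumbnail, no gallery script, no iframes.
        media = '<img src="/thumbs/%d.jpg" alt="">' % article_id
    else:
        media = ('<img src="/images/%d.jpg" alt="">'
                 '<script src="/js/gallery.js"></script>') % article_id
    return "<html><body>%s%s</body></html>" % (body, media)
```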

    I do have one question regarding the use of services such as Blogger. I just posted an article on a Blogger account as a way to introduce myself to the Blogger community. After posting it, I wondered if Google would frown on it as duplicate content as I have posted the same article on my own site. If so, how can a content producer share his content across multiple avenues without being labeled as spamming? From my personal perspective, I see the posting of an article on a system like Blogger, etc the same as a syndicated columnist publishing his work in multiple publications. I know that there would be readers on Blogger who had never been to my site to begin with… and here is a new avenue to share content with them.

    The question is how do honest people actually use the internet as an honest tool without getting punished by Google?

  136. While we are on the subject of documentation, I am wondering if I can find an answer to a question that has been bugging me for a long time and for which I have yet to see a documented answer.

    I have a website that ranks between 4 on page 1 and 11 (top) on page 2 for the most popular search terms for this site. I notice that when I rank high on the first page on local results (google.com.au) I am usually low on the web results (google.com) for that search term (2nd page).

    Where are the stats on which search engine (google.com versus a localized version, e.g. google.com.au or google.co.uk) people use most for local searches?

    It seems to me that your average surfer would not be web-savvy enough to use the local Google search engine and would just use google.com instead of the search engine for their country.

    If this is the case then knowing which search engine to optimize for would be crucial for those wanting to make sure that their website can be found.

  137. Sean Davidson

    Matt, you say (and have said in many different ways, many hundreds of times): “The short answer is that Google wants to judge the same page that a user is going to see.” – which I get. But as a guy just trying to get relevant links in front of users at Google, it’s not that simple. It would be, if the web were little more than text and some illustrative images, but Web 2.0 and an “engaging,” “dynamic” user experience fly in the face of such assumptions.

    Take, for instance, my dismay that after having developed a cutting-edge image-searching technique/platform in Flash (actually, lean AS3), I find that presenting a greatly simplified version of search results to Googlebot, along with any non-Flash, non-JS-enabled browser/user, is considered to be cloaking. Is it Google’s position that Flash, Silverlight, and other such technologies are only for cute, minor “bling” on a site, and never for the main method of content delivery?

    Isn’t progressive enhancement just cloaking? You’ve said no, but I’m pretty sure that the only USER of my site that can’t use CSS, JS, or Flash is a search bot (or a BlackBerry user, and they are not likely a user of our site/service). And so, by serving the same search results, but in a radically dumbed-down form, to non-JS, non-Flash users (or Googlebot), I am only trying to get *relevant* links in front of Google’s users.

    If a Google user wants to buy a “happy family on beach” photo, and I’ve crafted the crawlable results so that what they land on is a link to our Flash-based search results for “photo of a happy family on beach,” have I not applied the cardinal rule, “keep the users first”? But to do so, I have no “sanctioned” methods to apply other than to “cloak.”

    Look, I think spam is the pestilence of the Internet, and I loathe link farms, and “cloaking” has been one way those offenses have been perpetrated. But I strongly disagree with you that “cloaking” is always a “black-hat” approach. It is becoming more and more necessary as the bot-to-browser capability gap has increased. And it is only going to get worse, as the next generation of browsers is being built with no assumption of a “primarily text-based static web page” in mind.

    -respectfully

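One common reading of "progressive enhancement isn't cloaking" is that every visitor, bots included, receives the same plain-HTML result links, and the rich Flash/JS experience is layered on top of that markup by the browser rather than swapped in for a different audience. A rough, hypothetical sketch (the routes, stub backend, and script name are invented):

```python
# Hypothetical sketch: one crawlable HTML result list for everyone; a
# script upgrades it to the rich viewer in capable browsers.  There is
# no user-agent detection anywhere in the request handling.
from flask import Flask, request

app = Flask(__name__)

def search_photos(query):
    # Stub standing in for the real image-search backend.
    return [
        {"id": 1, "title": "Happy family on beach"},
        {"id": 2, "title": "Family picnic at sunset"},
    ]

@app.route("/search")
def search():
    query = request.args.get("q", "")
    items = "".join(
        '<li><a href="/photo/%d">%s</a></li>' % (r["id"], r["title"])
        for r in search_photos(query)
    )
    # The <ul> is the baseline that crawlers and no-JS users see;
    # viewer.js replaces it with the Flash/AS3 viewer when it can.
    return (
        '<html><body><ul id="results">%s</ul>'
        '<script src="/static/viewer.js"></script></body></html>' % items
    )
```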

  138. Hello Matt,
    I’m running a very large site and have subdomains to support local sites for each of our markets/countries.
    I registered all the subdomains in Google Webmaster Tools and assigned each of them to its relevant country (Geographic Location).
    Most of the subdomains were recognized by Google and updated, BUT the German one (de.mydomain.com) has not been set for the last 12-14 weeks!!
    Can you provide some tips about what needs to be done in order to “help” Google update this info?

    thanks

  139. Where can I find the complete SEO documentation?

  140. Very good post, Matt. Interesting read.

  141. Pshhh! How is someone from Google so involved with SEO? It’s like the judge dining with the mobsters. The whole system is skewed; it should be based more on relevant content than all this hoopla about links.

  142. Hi Matt,

    I’d like to have a website available in multiple languages using language detection, but I am worried that it will be interpreted as cloaking. If you read this post literally, that is what it seems like, since it is going to deliver different content based on the user’s browser. How can one provide a good language-detection experience and still be SEO friendly? Here are the features I’d like to have:
    – Automatic detection, to avoid the often-seen content-free language selection page.
    – Crawlers should be able to index all available languages.
    – No dilution of PageRank from being treated as different sites or content.
    – In my specific case, I’m using PHP, so I’d like it to be the same pages (index.php) rather than different files (en/index.php vs. es/index.php, etc.), which would simplify maintenance; if a solution can have this property too, it would be even better.

    Thanks in advance,

    – Itai
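One pattern that covers most of that wish list is sketched below, under stated assumptions: a single entry point, the language taken from the URL when present and from Accept-Language only as a default, and every language reachable through ordinary crawlable links. The example is Python rather than PHP, purely for illustration; the strings and parameter name are invented.

```python
# Hypothetical sketch: one entry point serves every language.  Explicit
# ?lang= URLs are linked from every page, so crawlers can reach all
# versions; Accept-Language only picks the default for visitors whose
# URL does not name a language.
from flask import Flask, request

app = Flask(__name__)

TRANSLATIONS = {
    "en": "Welcome to our site.",
    "es": "Bienvenido a nuestro sitio.",
}

@app.route("/")
def home():
    lang = request.args.get("lang")
    if lang not in TRANSLATIONS:
        # Fall back to the browser's preference, then to English.
        lang = request.accept_languages.best_match(TRANSLATIONS) or "en"
    switcher = " ".join(
        '<a href="/?lang=%s">%s</a>' % (code, code.upper())
        for code in sorted(TRANSLATIONS)
    )
    return "<html><body><p>%s</p><p>%s</p></body></html>" % (
        TRANSLATIONS[lang], switcher
    )
```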
