Communication in other languages

Presumably you saw my post about talkorigins.org, a site that was recently hacked so that the front page had spammy porn text and links. Google temporarily removed talkorigins.org from our index, but we emailed talkorigins.org to alert them that they had been hacked. We also made it possible for talkorigins.org to confirm the penalty in our webmaster console. Once the spammy porn links/text were gone, Google reincluded the site in our index within days.

So Google tries to alert hacked sites of problems; that’s good. But we also email many sites for violations of our quality guidelines, such as hidden text. Take for example the case of trouw.nl, a leading Dutch newspaper. They wrote an article criticizing the fact that Google temporarily removed trouw.nl from our index for hidden text, and emphasized their belief that Google should have alerted them to the removal. In fact, Google did email trouw.nl.

What exactly was trouw.nl doing? Using Cascading Style Sheets (CSS), the site hid dozens of words on hundreds or thousands of its pages. Here’s the code that was on the front page of trouw.nl in October, for example:


<div class="indexKeywords">
dagblad trouw, podium, nieuws, achtergronden, kranten, verdieping, opvoeding, onderwijs, religie, filosofie, natuurtochten, gezondheid(s)zorg, cultuur, natuur, milieu, stijlboek, recensies, boeken, chat, polderpeil, maandaggids, dinsdaggids, woensdaggids, donderdaggids, vrijdaggids, weekendgids, letter, geest, letter&geest, boekrecensies, novum, laatstenieuws, rss, handheld, dossiers, trouwkabinet, illegaletrouw, ephimenco, schouten, spotprenten, spotprent, len, tom, modernemanieren, cryptogram, zusje, kritieken, nieuwskoppen, horizonreizen, relatie, parship, schrijfboek, webshop, trouwcompact, compact, animatie(s), Flash, video, radio, strip(s).</div>

See the indexKeywords div? If you examine http://www.trouw.nl/trouw.nl/styles/basic.css, you’ll see that the properties of that div are


.indexKeywords,
.indexLinks,
.copyright,
.nostylesheetText {
display: none;
visibility: hidden;
width: 776px;
}

The net effect of that CSS rule is to hide those 60+ keywords so that they are completely invisible to users. In case you’re wondering, trouw.nl also used the indexLinks div style to hide multiple links as well. It’s interesting that the definition of indexKeywords remains in the CSS of trouw.nl, even though they’ve removed the actual hidden text.

In general, I do not feel that Google is obligated to notify every site that we remove from Google’s index for violating our quality guidelines. Our webspam team does not have infinite resources, and our primary goal has to be to protect Google users by keeping our index clean. However, in this case Google did email trouw.nl (in Dutch) to alert them about their hidden text. Here is an excerpt of the email that we sent to multiple email addresses, including webmaster at trouw.nl and support at trouw.nl, translated from the Dutch:

Dear owner or webmaster of trouw.nl/,

While indexing your web pages, we found that some of your pages use techniques that violate our quality guidelines. You can find these guidelines at: http://www.google.nl/webmasters/guidelines.html
To safeguard the quality of our search engine, some of your pages will be temporarily removed from our search results. At this time, the pages of trouw.nl/ are scheduled to be removed for a period of at least 30 days.

In particular, the following techniques were found on your pages:

* The hidden text below on trouw.nl/:

dagblad trouw, podium, nieuws, achtergronden, kranten, verdieping, opvoeding, onderwijs, religie, filosofie, natuurtochten, gezondheid(s)zorg, cultuur, natuur, milieu, stijlboek, recensies, boeken, chat, polderpeil, maandaggids, dinsdaggids, woensdaggids, donderdaggids, vrijdaggids, weekendgids, letter, geest, letter&geest, boekrecensies, novum, laatstenieuws, rss, handheld, dossiers, trouwkabinet, illegaletrouw, ephimenco, schouten, spotprenten, spotprent, len, tom, modernemanieren, cryptogram, zusje, kritieken, nieuwskoppen, horizonreizen, relatie, parship, schrijfboek, webshop, trouwcompact, compact, animatie(s), Flash, video, radio, strip(s).

As you can see, we tried to alert trouw.nl that we were taking action on their hidden text and hidden links. We mentioned the page with the issue (in this case, the root page), and we included the actual hidden text. The rest of the email goes on to describe how to request that Google reconsider the site for reinclusion in our index. After trouw.nl removed the hidden text and hidden links, Google reincluded the site.

I understand that trouw.nl was frustrated to be removed from Google’s index, but our users have told Google repeatedly that they hate webspam and don’t like seeing pages with hidden text secretly buried on the page. Hidden text is also not fair to other sites that try to compete for similar queries without hiding words from users.

In this case, I believe that Google did more than any other search engine does:
– We provided our webmaster guidelines in Dutch at http://www.google.nl/webmasters/guidelines.html (“Avoid hidden text or hidden links” becomes “Vermijd verborgen teksten en verborgen links.”)
– We scheduled the site to be removed for 30+ days so that users wouldn’t get hidden-text, hidden-link pages back in response to searches.
– We made it possible for trouw.nl to confirm that they had a penalty via our Webmaster console.
– We emailed trouw.nl in Dutch with the exact page to check and the exact text to look for.
– Once the site removed the hidden text and hidden links, we reincluded trouw.nl.

In reviewing this situation, I believe that the webspam team handled this issue in a way that protected our users but also tried to alert the site to issues. We will continue to work to improve our communication so that legitimate sites receive even more information to help them with webspam-related issues.

Q: So you’re not just working on webspam in English?
A: No! We are continuing our anti-spam efforts in many different languages, as you can see from this situation. In fact, I expect Google to focus even more effort on other languages in coming months. I’m extremely proud of our webspam team members who are located in Europe (and other places around the world). I’m also sending one of the best people on our Mountain View-based webspam team, Brian White, to Europe for six months in 2007 to provide the webspam team in Europe with even more visibility and support.

Q: Matt, you still love Dutch sites and the Netherlands, right?
A: Yes. 🙂 One of my favorite authors is Dutch. Janwillem van de Wetering is the best existential mystery writer in the world, without question. Because of him, I can’t wait to drink jenever in Amsterdam someday. But we also have to protect Google’s users and the quality of our index. 🙂

Update: I’ve been in contact with someone at Trouw.nl, and as always, there are two sides to the story. The email addresses we tried to use to contact Trouw didn’t exist, so Trouw couldn’t have received our message. This situation shows that the idea of contacting site owners is solid, but we can still find ways to improve our communication and webmaster outreach. Trouw has also added an update at the end of their article saying much the same thing.

90 Responses to Communication in other languages

  1. If they’re publishing hidden text (especially if they’re doing it in a way that indicates the intention of hiding it rather than an error), they ought to be frustrated with whoever set that up, not with a search engine that caught it.

     I just wish you were better at catching it, or even at doing something about sites that are reported. When a client of mine complains to me about a site beating them, and I find hidden text on every page of that site and report it via the webmaster console, it would be nice if I could tell the client the site would get dropped from the index; but so far, I don’t believe that’s happened with any site I’ve reported. Admittedly, there have only been about five of them over the past four years, and some of those would have been before there was a webmaster console, but 0/5 is not a great record.

  2. Such shameful behavior and they dare to draw even more attention to it by writing an article about it themselves!? “Trouw” means “loyal” btw. Not sure towards whom… I think they should consider themselves lucky for being included again.

    “I understand that trouw.nl was frustrated to be removed from Google’s index”
     I can understand it’s frustrating for a burglar to get caught too, but that doesn’t mean I feel one bit sorry for him, or her. Their explanation for using the hidden keywords: “without them the pages would be hard to use for users who don’t have a certain application installed on their computer yet”. What application?? Sounds like pure nonsense to me. They assume their readers are as incompetent and uneducated as they are. I just knew they were going to make up some vague reason, before I even read the article. Say something vague and computer-related and most of their readers will buy it. At least it fits the rest of their desperate-sensational content.

     Glad to see it doesn’t change your opinion about the Dutch though. Bad newspapers are, after all, a global issue, just like spam. About a week or two ago some EU guy praised the Netherlands for fighting spam, actually going after spammers, etc., so I guess we must be doing something right, besides producing good jenever.

  3. Awesome post, Matt. The insight you provide is exceptional. I just want you to know that some of us “lurking” on your blog recognize that.

  4. Hey Matt,

     I just have to say that the quality of your posts is great. I love the detail and time you take to give us an insider’s look at what is happening with Google and the search algos. This makes for great reading for an SEO novice like myself, and it’s incredibly educational. Thanks!

  5. Ditto what several others have said – very high signal-to-noise ratio with your posts and quite informative … and entertaining – thanks!

  6. Thanks for the interesting explanations of what goes on behind the scenes in these cases. Sorry if this has been suggested before, but how about an RSS feed of our Webmaster consoles?

    Pros:

     1) If I’m techie enough to be watching the console and understand it, I’m techie enough to subscribe to its RSS feed

     2) In some/many cases it’s unlikely Google’s email would reach the appropriate technical person to rectify the situation – or at least not in a timely manner

    3) If some basic stats were included in the feed and updated weekly, you’ve given every webmaster a great reason to subscribe

     4) I would love for my community members to see what I see on the Webmaster console via our shared community Google Reader page. Having a feed to the console data ensures that if something goes awry, someone in our community will almost immediately notice it.

    Cons:

    1) Someone has to write the code to produce the RSS feed

    2) Privacy of the feed concerns could be an issue for some people

    3) Other stuff I have no idea about since I’m out here and you’re in there…

  7. Very interesting to read.
     Absolutely worth a compliment for Google’s way of handling this, in contrast to Trouw’s childish reaction.

    Besides, when Brian White will be in Europe, you sure can find an excuse to pay him a visit and make a little trip to Amsterdam. 😉

  8. It’s nice to know that you alert webmasters before taking action. Polite; but normally, everyone needs to know the guidelines (even if they are only in English, or just the major spoken languages). It’s a bit of a shame for Trouw, which is a big newspaper in Holland.

  9. I can understand it’s frustrating for a burglar to get caught too, but that doesn’t mean I feel one bit sorry for him, or her.

    This is the best perspective anyone has ever taken on the situation. Amen and hallelujah.

    The question I’ve always had on this situation is this: if people do this, and you know people do this, why not expand the webmaster guidelines to include some common examples and stop at least some of it upfront? Like if you know people are using hidden text like this, show a few examples of things you know about. Unless I’m missing something (like some form of a TMI scenario.)

  10. Johan, thanks for your take. I agree, but it can be challenging for site owners who are not power SEOs to understand why users really don’t like hidden text.

    Jeff, Brian B, alek: I really appreciate that. Comments like that make me want to blog on weekends and days off. 🙂

     Jim Kloss, the consensus from the hacked-sites comments was that perhaps the first step is working on better ways to contact sites. Things like the ability to email more specific email addresses, not just the catch-all kind, or the ability for the webmaster console to communicate more directly. But perhaps in time, RSS might make sense as well.

     Multi-Worded Adam, that’s a whole other post by itself. I’ve always considered our webmaster guidelines a bit like Peano’s axioms. That’s a small set of 4-5 rules which, when assembled the right way, lets you construct a great deal of modern mathematics. In the same way, the webmaster guidelines try to provide the principles from which people can build up an understanding of what to avoid. So we would tell people “avoid link schemes” rather than “avoid buying links, avoid triangular links, don’t bulk-email thousands of people asking for links, don’t buy software to spam thousands of guestbooks or blogs, etc.” Over time, though, I’m gradually coming around to the idea of providing more details and examples.

  11. What’s the take on cloaking, if hidden text brings a full removal? (E.g. cloaking content to a search engine but showing an ad-screen, aka signup form, to the user)

    Are there any tools out there to recognize hacks like the one that happened to talkorigins, on a fairly dynamic site? Could it be that Google can spot those kinds of things faster than anyone else, including the webmaster?

  12. I think you Googlers owe it to yourselves to find a better means of communicating with Webmasters when you’re taking on higher profile sites. I am guessing most of your attempts to reach Webmasters actually succeed.

     But maybe you should be more aggressive about encouraging Webmasters to sign up for Webmaster Central (although you would probably take some heat from people like me who would ponder the ethical fairness of such encouragement).

    Ultimately, people either know what is on their sites or at least have a responsibility to learn what is on their sites. It’s unfortunate when a spammy SEO slams a client site without explaining what they are doing, but people often take avoidable risks without doing adequate preliminary research.

    I know that if you tried to contact me by email, using my WhoIs information, it’s a sort of hit or miss thing getting to me. And if you attempt to send email to the postmaster or other “logical” or “traditional” accounts you’re SOL because they’ve all been disabled due to spam.

    I think that confirming penalties through Webmaster Central is a more reliable route, although I understand your concerns about not wanting to do that for people who control hundreds or thousands of spam domains.

  13. Well, I didn’t get banned yet. Can you tell me here who got banned? I think it’s a rumour!

  14. JohnMu, you can check for cracked sites with site: and look for unusual pages, especially with extensions that you don’t normally use, such as .dhtml. Some crackers do that so that the site shows up normally, but they can still make any pages that they want. You can also check the “top words on your site” report in the webmaster console. If it’s all porn/pills/casinos and your site is about teddy bears, that’s a definite sign.

  15. Thanks for the Dutch focus today 😉 But Matt, both webmaster@ and support@ addresses bounce (only after 15 min. but OK)

    Final-Recipient: rfc822;support@trouw.nl
    Action: failed
    Status: 4.4.6

     I’ve written on my Dutch blog that this is both Trouw’s and Google’s mistake. The addresses shouldn’t bounce, and Google should check if stuff bounces. If it bounces, it’s pretty useless to declare that Google has contacted the organisation that got banned, agreed?

     On an egocentric sidenote: I was the first to try to explain the Trouw.nl ban on my blog (someone else discovered it weeks earlier actually), and I guessed automatic CSS-hidden-keywords/links-in-250.000+pages detection, from which a semi-manual ban occurred. Is that correct?

  16. Aren’t these scammers Dutch? You know, the ones that say you have won the gazillion dollar internet lottery? Good to see Google is cracking down on these spammy pages or sites. We cleanies are working hard to earn a buck.

  17. Hi,

     Will you create a French webspam team? Lots of people don’t know how to build for the web…

  18. You’re trying to sneak away, Matt :-). What’s the take on subscription cloaking? It’s one of those things that I absolutely can’t stand when I run into in the serps…

    The Google-queries you mention are all good – but all slightly delayed. I just felt it was interesting that Google would probably notice that a site got hacked way before anyone else, you would probably even notice it automatically and perhaps even trigger something. That’s pretty neat :).

  19. Hi Matt,

     With all the negative things that have been put out lately about Google and its reporting, is there any way to highlight the positive and give us an idea of how many of these types of email Google has sent?

     Mostly I am just curious to know, but also it would seem to kinda debunk the claims of many if you said “We have sent over 50,000 (or whatever) emails out to webmasters reporting the reason they were removed from our index.”

    Love your blog.

  20. When you come to Amsterdam, let us know, so I can buy you that Jenever 🙂

  21. Hey Matt,

     when Brian comes to Europe, we’ll have quite a few more examples like these if he wants them 🙂 And if you’re ever coming to the Netherlands, be sure to mention it upfront, we might arrange a nice meeting with some SEOs 🙂

  22. Another good post Matt, but one that is sure to leave many people frustrated, because they report sites that include significant hidden text, but it’s very unusual for anything to be done about them. The post will be seen as another case of hitting a larger company’s site for the sake of making an example to scare people away from spamming – that’s the way the BMW case was seen.

     I know that Google prefers to do things algorithmically, and I realise that it’s probably necessary to have some bad instances in the index for trying out new ways of dealing with things, but the use of hidden text makes Google look like a child (agewise), and it still isn’t being dealt with algorithmically. So wouldn’t it be better if the sites that are reported for hidden text were actually dealt with, rather than frustrating webmasters who are being unfairly beaten by them in the serps? You wrote:

    Hidden text is also not fair to other sites that try to compete for similar queries without hiding words from users.

    It’s a very good way of seeing it, but it would be much better if sites that make spam use of hidden text are dealt with when reported.

  23. Hey searchenginesWeb; A burglar is caught stealing. He says that what he was stealing was “relevant” to what he needs. I guess you think that’s okay as well, right?

    From your viewpoint, it’s okay to steal from “other” websites who abide by the Google guidelines as long as you don’t get caught, and as long as the invisible text is relevant. Relevant to whom? Where do you draw the line if invisible text is okay by you? How about if every website in the world simply had flash and images and NO real text, but simply hid the text from a browser and showed the text to a robot? Would that be okay with you?

    You make no sense buddy.

  24. You’re trying to sneak away, Matt . What’s the take on subscription cloaking? It’s one of those things that I absolutely can’t stand when I run into in the serps…

     I agree that it’s very undesirable in the serps, but it isn’t cloaking. Cloaking is something quite specific, and that isn’t it 😉

  25. This is an excellent insight for those of us who are really into our SEO.

     I’ve wondered about the use of CSS for this for a couple of years now. It’s a very complex issue. What I’m surprised about is the fact that no one else here has mentioned “Fahrner Image Replacement”…

     For those who aren’t aware of Fahrner Image Replacement, it’s a CSS technique which allows you to replace normal text content with an image. It’s often used in heading tags when a designer wants to use a fancy font. It’s a technique I like to use a lot; for example, on http://virtualfunction.net/services/ I have used it for the ‘What we do’ heading text. (There’s a rough sketch of the technique at the end of this comment.)

     As you can see, this is a legitimate use of hiding text with CSS, because I have replaced the text with an image that reflects the text in the header tag. While I am aware there are other methods, as someone who also has a bit of a design background, I opted to use this method as it means my site makes sense for people who rely on screen readers, or who are using something like a mobile device.

     Anyway, getting back to the point, what does Google do about this? It would be wrong to start sending emails and alerting webmasters left, right and centre because of this.

     My guess is that people who are using hidden text do it with fairly large quantities of text, which means there is presumably some threshold at which it gets treated as keyword stuffing, mainly because most people using image replacement are only replacing a small quantity of words. This, however, is just what I am assuming. Would it be possible to confirm that this is roughly how Google works, just so we CSS web designers can get some sleep at night?
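
     For anyone who hasn’t seen it, the classic Fahrner form looks roughly like this (the element id, sizes and image path here are made up for illustration, not copied from my site):

     /* markup (hypothetical): <h2 id="what-we-do"><span>What we do</span></h2> */
     #what-we-do {
         width: 250px;                              /* sized to match the replacement image */
         height: 40px;
         background: url(what-we-do.png) no-repeat; /* image shows the same words, just in a fancy font */
     }
     #what-we-do span {
         display: none; /* the real text stays in the markup but isn’t rendered; note that many screen readers also skip display:none, which is one criticism of classic FIR */
     }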

  26. SearcH EngineS WeB:

    Neither Google, nor any other search engine, has any obligation to contact any site owner when they remove a site from the index.

     It doesn’t matter whether the hidden text is “relevant” to the page or not. What matters is what Google thinks about it. They have chosen to call it spam if the hidden text is just for search engine benefits, and we all know that. I don’t altogether agree with Matt when he says, “… but our users have told Google repeatedly that they hate webspam and don’t like seeing pages with hidden text secretly buried on the page”. I think it’s more a case of website owners telling Google that they don’t like to be beaten in the results by pages that employ techniques that they are afraid to use because they could be penalised. They feel it is unfair, and they are right. I don’t believe that users are concerned one way or the other, because they don’t see the hidden text.

    If Google dealt with sites that are reported, instead of just a few of them, then the playing field would be much more even, and hidden text would become too dangerous, and would largely fade away. It’s not dealing with hidden text that keeps it going.

  27. Hello Matt,

     Although I hate spam as much as most people here, the following makes me wonder…
     Search engines like Google demand that websites deliver the same code to their spiders as to the browsers of their visitors. If they do not, they get banned. I think that this is quite logical, because you cannot do a good search if the pages that are linked in the results do not match the ones that are indexed. But if they do, they can still be banned!
    Trouw.nl for instance, delivered the same code to your spider as it did to normal browsers. That makes, in my opinion, the fact that these hidden words/links are indexed a problem of the spider and not one of the website/webmaster.

    And what about hidden content that’s put in for accessibility reasons, like ‘direct to …’-menus or menu headers? Are sites that conform to the current best practices concerning accessibility in danger of being banned?

    The ideal situation would be a spider that acts like a normal browser: it indexes hidden content but does not show it in the results, unless people ask for it (advanced search/preferences).

  28. Just thinking about it… (Typical that this had to cross my mind as soon as I clicked post.) The way I do my image replacement for typography and the way spammers hide text for keywords are slightly different. I do not make use of visibility: hidden or display: none, which are obvious tell-tale signs of keyword stuffing.

     This in fact leads on to a whole new level of complexity with this issue, and questions from me, which I guess are somewhat representative of the CSS design community…

     Being able to tell how things like display: none are being used is a very complex affair. For instance, another example of where this might be used is in print style sheets. I like to hide certain text when printing because it’s wasting ink. Typically this will be navigation related. Just to pull another example out of the bag: if you look at a print preview for http://virtualfunction.net/services/development/, the left and top navigation is hidden using display: none. This is because anyone who wants to print the page probably doesn’t want to see the navigation or any of the design decor. Likewise, there’s nothing worse than when you print a page and find that the left nav bar has decided to make the last two words of the body column get cut off. I will assume in this case that GoogleBot ignores stylesheets with a media type other than screen or projection.

     Going back to image replacement again: the way I do mine is that I make use of overflow: hidden, and then set a padding-top (or height in the case of IE 5.x, via a CSS hack) to be the height of the replacement image. This doesn’t hide the text like visibility: hidden or display: none; instead it forces the tag text to be rendered outside the box area allocated for the background image (there’s a rough sketch at the end of this comment). Basically this leads on to the question: what if the wiser spammers manage to evolve and exploit techniques like this for their own uses, instead of legitimate typography rendering? Google would need to have a very advanced understanding of CSS to be able to work out how CSS is being exploited using these techniques. It’s probably not too hard to detect when it’s being used legitimately (based on my previous post: image replacement is used sparingly on most sites, and is normally restricted to h1-h3 tags), but it’s hard for Google to understand when text is being hidden on a large scale using complex CSS methods where the visibility of the tag is not explicitly set by some display or visibility attribute on the tag in question or one of its parent tags, but rather implied by a series/combination of attributes (in this case overflow, padding-top, and height). If GoogleBot is clever enough to understand CSS to this level, I have some serious respect for the programmers behind all of this!
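
     Roughly, the variant I mean looks like this (the selector, sizes and image path are made up for illustration); note that no display or visibility property is involved:

     h2.services-heading {
         background: url(services-heading.png) no-repeat;
         padding-top: 40px; /* exactly the height of the replacement image */
         height: 0;         /* leaves no room for the text itself; IE 5.x needs height: 40px via a box-model hack instead */
         overflow: hidden;  /* the text, pushed below the padding, gets clipped away */
     }

     And the print stylesheet case I mentioned, again with invented ids:

     @media print {
         #leftnav, #topnav { display: none; } /* navigation is wasted ink on paper */
     }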

  29. Yes, I have hidden text in my Web page. Even though I know it’s against Google’s policy. Why? Not for spam reasons.
     My Web page is a single page web application based on the Google Maps API and written in Javascript. I never redraw the page, even if the user uses the application for a number of subsequent requests. The page has a desktop-like menu bar and a bunch of status and error messages that show up under certain conditions. The menu bar has sub-menus I hide as long as the user doesn’t open them (see the sketch at the end of this comment). Could I do this without hidden text? Yes, but then I would have to generate the text of the menus etc. from within the Javascript code, changing the text of the menu items when needed. Not only is that bad coding style (design elements inside the code), but it’s also cumbersome, because the page comes in multiple language versions that all use the same code. Or I could hide the text from within the code, thereby ‘obscuring’ the hidden attribute from the robot. But that leads to a very unpleasant visual effect, as all the text is first drawn when the page is rendered and then hidden.
     I understand (and agree with) the intentions of the Google policy about hidden text. But with more and more Ajax applications around, the implementation of the policy is becoming inadequate. Why not just IGNORE any hidden text instead of punishing the Web page by banning it completely from the index??? Ignoring hidden text renders it useless for spammers but allows using the hidden attribute where it leads to a better user experience.
    (BTW, I can’t tell if Google punished my page for using hidden text)
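
     To be concrete, the kind of rules I mean are roughly like this (class names invented here):

     /* sub-menus and status/error boxes exist in the page but start out hidden */
     .submenu,
     .status-message,
     .error-message {
         display: none;
     }

     /* the application’s Javascript then toggles a class like these to reveal them on demand */
     .submenu.open,
     .status-message.shown,
     .error-message.shown {
         display: block;
     }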

  30. So subscription cloaking is always fine? Good to know 🙂 (but tell me, how is serving the bots something different from what the average user gets not cloaking?) Isn’t “hidden text” just the same, only on a much smaller scale?

  31. I had to blog about this topic:-
    http://www.webworkshop.net/blog/?p=15

    JohnMu:
     No, what you refer to is not fine for Google’s users, imo, but it isn’t cloaking. We had a long discussion about this at SearchEngineWatch some time ago, and, although it took time, it was eventually agreed that it isn’t actually “cloaking”.

  32. Sorry – I forgot to include the SearchEngineWatch URL:-

    http://forums.searchenginewatch.com/showthread.php?t=12191

  33. S.E.W.: The day you actually make a coherent and correct point, you’ll have already shot yourself in the foot thanks to the hundred-someodd posts you’ve made like this one.

    As far as your example goes, it’s too far-fetched for one reason: large sites will and should have generally invested a great deal of money into their sites, including on their hosting. This is web design 101 stuff: keep your server as up-to-date with patches, fixes, etc. as you possibly can. And if you’re using a hosting company, do your homework and make sure they’re doing the same thing.

    So IF your scenario plays itself out, this would ultimately be the responsibility of the site owner.

    Not only that, contacting a site owner unless big G knows it’s a hack for sure sets a very dangerous precedent. All it would take is a scumbag webmaster to register two or more domains under different names, host them on different servers, and all the ingredients for a “hack” are in place.

    As far as whether or not users see hidden text, I can see where users would complain because they don’t see it. “I searched for dagblad trouw, ended up on trouw.nl, and there’s no information on dagblad trouw.” In other words, I don’t think they’re complaining because they see the hidden text. I think they’re complaining because they don’t see the hidden text.

    I’ve heard this complaint before, and even made it the odd time.

    By the way, Matt, I had to look up Peano’s axioms because of you (never heard of them before). You triggered the autodidactic nerve in me, and I hate you for that. 😀

     I do have a counterpoint to your comment but in the interest of trying to keep things from going tangential, I’m going to keep it to myself until you make that whole other post on the subject (which I expect will be very, very soon). 😉

  34. As far as whether or not users see hidden text, I can see where users would complain because they don’t see it. “I searched for dagblad trouw, ended up on trouw.nl, and there’s no information on dagblad trouw.” In other words, I don’t think they’re complaining because they see the hidden text. I think they’re complaining because they don’t see the hidden text.

    Good point. I hadn’t thought of it that way round.

  35. I wish I had so much help for my site, which is punished by the minus-31 penalty.

  36. JohnMu, I dislike all cloaking.

    S.E.W., I’m pruning you. I’ve warned you before about the crazy bolding and italics and stuff. When other commenters complain about you, that tells me it’s time again.

    Multi-Worded Adam, glad that I got to educate as well as entertain. 🙂 Everybody should start with Peano’s axioms and see if they can reconstruct up to, say, multiplication at some point in their life. It puts hair on your chest. 🙂

  37. Note to PhilC – The sites I report for javascript redirect/cloaking in the google webmaster tools are usually removed within a week to ten days.

    Matt – a suggestion about having multiple entries for spam listings.

     For example, usually when I report javascript redirection/cloaking in the google webmaster tools there are 4 to 6 spam listings per page of google results. I would like to be able to enter ALL those spam pages at once, instead of having to enter a separate report for each javascript redirect/cloaking listing on a single results page.

    Make reporting this crap easier and you’ll get more reports.

  38. You’re fortunate, Lots0. Many times we see forum posts that say the opposite.

  39. In the past, we’ve seen Google say that all reports are read and kept, but that they only deal with the worst spam by hand. Perhaps that’s why you have successes, Lots0.

  40. There’s a thread in the Google Groups where Adam details the steps taken after a spam report a tiny bit – the reports seem to go into a queue and are probably glanced over every now and then. “Really bad” things are handled manually; the rest is used to fine-tune the algorithms. The idea is to handle it in a scalable way: instead of manually penalizing sites based on the reports, they prefer to find a global way to penalize all sites using similar tricks. Spam reports from within the webmasters console (with your login) are treated with a bit more weight. There’s a Firefox extension that will help you to do it quicker (but it doesn’t go through the webmasters console): https://addons.mozilla.org/firefox/3875/ . Assuming the spam reports are mainly used to tune the algorithm, I think it would make sense to report any and all spam you spot in the serps – it gives them more datapoints. And to me that includes all WMW threads I spot 🙂

  41. >>>…they only deal with the worst spam by hand.

    That could be Phil, I usually only report the bad stuff.

    >>>Many times we see forum posts that say the opposite.

    Phil, you should know better than to believe everything you read on a forum… 😉

    >>>Assuming the spam-reports are mainly used to tune the algorithm, I think it would make sense to report any and all spam you spot in the serps…

     A lot of folks don’t have a clue what search engine spam really is; they just report sites that rank above theirs, cuz everyone knows ‘their’ site is the best and any site that outranks their site MUST be spamming…

  42. What if I did this on just one page?

  43. What about an automated email sent to the email address in the whois info, with a link to a google page with a generic message in many different languages?

    Domain name:
    trouw.nl

    Status: active

    Registrant:
    DAG000288-PINKR
    Dagblad Trouw
    Wibautstraat 131
    1091GL AMSTERDAM
    Netherlands

    Committed to ADR: yes

    Administrative contact:
    HON001817-PINKR
    B.A. Den Hond
    +31 (0)205623106
    postmaster at trouw.nl

    Registrar:
    Getronics PinkRoccade Nederland BV
    Paalbergweg 1-3
    1105AG AMSTERDAM ZUIDOOST
    Netherlands

    Technical contact(s):
    PIN001031-PINKR
    Getronics PinkRoccade BSS-CSD
    +31 (0)205704700
    tesdomeinen at getronics.com

    Domain nameservers:
    ns1.pinkroccade.net
    ns.megaplex.nl 80.79.97.1
    ns1.megaplex.nl 80.79.97.2

    Date registered: 09-08-1994
    Record last updated: 29-11-2006

  44. Hey Matt,
     I hope Google is very serious now about doing something about the spam in the Dutch language.
     Exactly 13 months ago (Nov 10, 2005), I asked you during the Jagger updates if it was worthwhile reporting all spam in the Dutch language. You replied “I would definitely feel free to do a spam report (even for Dutch sites) if they’re still there”.
    I noticed some small changes a bit later but after a month or so, all sites were reinstated and now, 13 months later, the spam is still there.

    From your post, especially about the webspam team in Europe, it becomes clear that spam in Europe is treated differently than in other countries or continents.
     Why not simply try a search on “website optimalisatie” (Dutch for web site optimization) using a Dutch interface language, and try to find 10 spam-free sites in the first 30 results?
     It is these so-called SEOs who stuff their clients’ sites with anything from links hidden in 1-pixel images, very small fonts, cloaking, text in noscript without a preceding script, etc. Name one trick from the ’90s and you’ll find it in Dutch, in prominent positions.
     I don’t really understand why someone has to come over from Mountain View to Amsterdam to let the European antispam team know what is spam and what is not.
     But well, I’ve waited 13 months, I can wait another half year…
     Send that man as fast as possible. He’ll be very welcome, and while he’s in Amsterdam, let him drop by in Belgium. I’ll keep some of our best beers in the fridge for him.

  45. @PhilC:
     Of course subscription cloaking is cloaking! Googlebot is given access to content based on the name of its user agent. What’s the difference from ‘normal’ cloaking then?

  46. JohnMu wrote…

    … the rest is used to fine-tune the algorithms. The idea is to handle it in a scalable way: instead of manually penalizing sites based on the reports, they prefer to find a global way to penalize all sites using similar tricks.

     Exactly. That’s what I posted. The problem is that, even though spammy hidden text is almost as old as search engines and a lot older than Google, Google (and the other engines) cannot yet find a way to deal with it programmatically, and the sites that use it continue to have an unfair advantage over sites that don’t want to risk using it – as Matt said. Since Google cannot find a way to deal with it programmatically, I suggest that it’s better to deal with it by hand when it is reported. NOT dealing with it when it is reported actually causes it to continue.

    @Sander:

    I don’t want to derail this thread with a discussion about cloaking, but if you read the Search Engine Watch thread that I linked to, you’ll understand what I mean. Even Danny Sullivan changed his view of cloaking during that thread 😉 The nutshell version is that all the pages are available to both spiders and people – they both receive the same pages – which means that it isn’t “cloaking”. It’s undesirable in the serps, but it isn’t cloaking.

  47. bong hitz, if it was the root page and the text was bad enough, the whole domain might still go away.

    Danny, one thing we’ll do is put a 30 day penalty in place for hidden text the first time, then 60 days if we see it come up again, and so on with the penalties getting longer each time.

  48. Hi Matt,
     This is my first comment on your blog. I have been reading it for quite some time now and found it very informative and engaging. You have written in your blog, “We made it possible for trouw.nl to confirm that they had a penalty via our Webmaster console.” My question is: how do I find out in the webmaster console if my site has a penalty, and if my site does have a penalty, how do I find out what it is for?
    vipin shetty

  49. Matt,

     this sounds like a reasonable approach. However, did you check for bouncing e-mail? Neither of those addresses existed at the time you must have sent them. We will create them today “just in case”.

     And may I state that it has never been the purpose of Trouw (or any other of our titles) to violate your rules; we do want to end up as high as possible within the bandwidth of your guidelines. At the moment we are investigating how this code was entered into our site in the first place; as you may have noticed, neither http://www.volkskrant.nl nor http://www.nrc.nl has violated the rules, even while maintained by the same team of webmasters.

  50. unrelated to this topic, but I’m asking since for this topic i really wanted to track the comments: would it be possible for you to install a subscribe to comments plugin like this one: http://txfx.net/code/wordpress/subscribe-to-comments/ ?

  51. Hi Matt,

     I plan to launch some kind of vocabulary site. The main info block per word will be visible via a:hover. It contains a word translation, synonyms, etc. in a kind of bulb div.

     It means I will load a lot of info which normally has a visibility: hidden; attribute in CSS (a rough sketch follows below). Is it a violation? Should I change my concept?
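
     Concretely, something along these lines (class names and layout are made up here):

     /* markup (hypothetical): <a class="word" href="#">huis<span class="bulb">house; dwelling; …</span></a> */
     a.word {
         position: relative;
     }
     a.word .bulb {
         visibility: hidden; /* translation/synonym bulb is in the page but invisible by default */
         position: absolute; /* floats over the surrounding text when shown */
         top: 1.2em;
         left: 0;
     }
     a.word:hover .bulb {
         visibility: visible; /* revealed only while the word is hovered */
     }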

  52. Joost, do you mean something like the “RSS feed for comments on this post” link right below the last comment? The one for this thread is: http://www.mattcutts.com/blog/communication-in-other-languages/feed/

     JohnMu: no, although that does half the job 🙂 that plugin mails you on new comments 🙂

  54. Hi Matt,
     I have a question – our site uses CSS to display layers of text, i.e. certain elements of the page are hidden, but by clicking on a link the visitor can display them. Is this likely to cause us problems? We had an SEO guy (specialising in Google) tell us we should cut them out, but we feel it would be detrimental to the user experience. We’ve been using them for a couple of years and the hidden text is indexed in Google, so I’m guessing that it is OK as long as there is a link to make the hidden text display?

  55. I’ve heard it said that Google will not ban a site for hidden text if the hidden text only emphasizes what the page is already about. For example, a site about Tampa Real Estate that has a whole pile of hidden keywords in a div layer – all having to do with Tampa Real Estate – would NOT be banned by Google even if it was reported via the webmaster console, because that site really should be showing up for “Tampa Real Estate” – in other words, the webmaster is not trying to gain relevance for terms that are unrelated to the site.

    My feeling is that this is still a bad practice and an unfair practice and should be cause for banning.

    Matt, can you provide any insight into this?

  56. Hi, I’ve warily added hidden text to my site for accessibility purposes. I say warily because I fear being penalised by search engines for this. It’s just the occasional heading to make the page structure more accessible to screen-reader users without messing with the visual appearance (the kind of rule I mean is sketched below). Am I safe with this, or do I risk being penalised by some automated robotic process or even by a human?
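
     To give a concrete example of the kind of rule I mean (simplified here, with a made-up class name): one common variant positions the heading off-screen rather than using display: none, so screen readers still read it:

     /* markup (hypothetical): <h2 class="screen-reader-only">Site navigation</h2> */
     .screen-reader-only {
         position: absolute;
         left: -9999px;    /* moved far off the visible canvas instead of being removed from rendering */
         width: 1px;
         height: 1px;
         overflow: hidden; /* screen readers still announce it; sighted users never see it */
     }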

  57. By the way, if you wanted to read some more about this in Dutch, the first blog that mentioned it that I know of is Planet Multimedia BlogNoot:
    http://www.pmmblognoot.nl/2006/12/trouw_doet_aan_.html

    And Usarchy.com also did a good job talking about this, including specifics:
    http://www.usarchy.com/2006/12/trouw-google-penalty/

    (I should add that Google knew about the issue before that, but these sources were among the first outside Google to talk about this situation.)

  58. Can’t believe that someone put it in a style sheet; I would never have thought of that one. Good to see something is being done in Europe – I have seen hundreds of spammy sites over here, I just thought that nothing ever got done about it. I can think of loads of sites that are still sitting there in G that would have been picked up long ago had they been in the USA.

  59. Johnmu wrote,”Assuming the spam-reports are mainly used to tune the algorithm, I think it would make sense to report any and all spam you spot in the serps – it gives them more datapoints. And to me that includes all WMW threads I spot :-)”

     I concur. A standard is not a standard unless it is applied across the board. WMW may be a fine site run by fine individuals, but they are breaking the rules that all others must comply with, and should be reported just the same.

    Reminds me of, “All animals are equal, but some animals are more equal than others.” George Orwell

     A standard is not a standard unless it is applied across the board. WMW may be a fine site run by fine individuals, but they are breaking the rules that all others must comply with, and should be reported just the same.

    I don’t know what makes you think that WMW is breaking Google’s rules. They aren’t. I don’t like being stopped by a login page any more than anyone else does, but don’t accuse them of breaking Google’s rules.

     Let’s be clear about it. There are many reasons to spot page requests from search engine spiders, and deal with them differently because they are search engine spiders, but that doesn’t break any search engine guidelines. Removing session IDs in clickable URLs is a very common example. Giving some people a login page is another example. Neither of them breaks any search engine rules/guidelines.

  61. PhilC is totally completely and absolutely correct in each and every post in this thread. I encourage you all to read that SEW thread where he was “trying” his very best to explain the REAL definition of “cloaking”. What it is, and what it is not.

     What the WMW forums do is NOT cloaking. Cloaking is “always” se spam. Period.

    PhilC is exactly right. Period. 🙂

  62. BTW: WMW is practicing a form of “content delivery” and doing so just like gazillions of websites do every day of the week. There are many forms of content delivery. “Cloaking” happens to be a specific form of content delivery. Cloaking is always se spam. What WMW is doing is simply detecting if the user agent is logged in OR not. If not logged in, you get the page that asks you to do so. If logged in OR subscribed, you get the other page with content. Easy stuff. NOT cloaking. Cloaking is always spam.

    Well done Phil.

  63. Thank you Doug 🙂

  64. Sorry you feel this way, but I still feel it is against google’s spam policy, be it true spam or not. Right in the spam report, one box that can be checked is “Page does not match Google’s description”. A sign-in page that is also a sales pitch for a membership that costs money is definitely different than the page the description was pulled from and the content that the SERP is based on.

     I know that you all like WMW, I read it as well. I don’t think they are doing anything spammy or evil; however, imagine if another site used the exact process that is very well documented. I understand their motives in protecting the content written by their members from not-so-good rogue bots. Imagine if there is another site that doesn’t have anything to do with webmastering, didn’t have the email addresses of the important players at google, didn’t sponsor conferences in concert with google. Would they be allowed to do the same thing? Would it be okay if they served googlebot the exact same pages that users would see after signing in for free with a valid email address? Would it be fair if the log-in page also had an upgrade option for the searcher to spend a little money on before they get to the content that was promised in the google snippet? Then would it also be fair for our webmaster to take the mailing lists he’s put together and sell them? Oh yeah, this site is about mesothelioma attorney. Doesn’t seem too fair now.

     The point is this. We hear that spam reports aren’t responded to on a case-by-case basis as they want to build a scalable solution to handle the spam in the index as a whole. We are also told, even in this very post, that google frowns on cloaking. WMW is cloaking, no matter how you look at it. The landing page is not the page the snippet is created from; it’s a redirect at best, but we know it’s cloaking as the practice is well documented within its own forum. So from this I must assume only a few possibilities: 1) cloaking is allowed and not automatically detected, or 2) it is not allowed and if detected can be manually overridden, or 3) it is only a manually applied penalty under some sort of judgment system. #2 and #3 seem the most likely, which actually is the most disturbing to me. What is this decision based on? Content? Relationship with the web spam team? Finances? Politics?

     Can Matt and the rest of the web spam team be assured that users of WMW are not being forced to pay anything to see the same pages that googlebot crawled? Yes. The same could be said about our mesothelioma attorney site, but they would never know if the webmaster is profiting from the practice on the side.

     If I want to serve my pages to googlebot and then make people accept cookies and sign in to view the pages, but not get banned by google, who do I call? Where do I apply for a dispensation from the webspam team? I’m not in a position to hold conferences with food and drink in Vegas; I’m just a lowly one-man band without these resources. Is there some sort of level a site needs to attain to be immune from cloaking penalties, like NY Times or WMW? Is this based on traffic? Pagerank? National presence? The phases of the moon? Will I get an email from someone telling me that my site is now important enough to google that I can feel free to start harvesting names and delivering sign-in pages for content crawled under a different URL? Will you?

     In the movie Awakenings there’s a line where they are discussing whether the catatonic patients were conscious during their outages, and the doctor says they weren’t, without any real proof: “Because the alternative would be unthinkable.” This reminds me of the blind acceptance we must have in some of the theoretical penalties imposed by google on sites. We cannot believe there is something that a competitor can do to your site, like link to you from crummy neighborhoods, as that “… would be unthinkable.”

     I’m not trying to rip google here; actually the opposite. I respect their intentions and tend to believe what I see more than what I read. Seeing that they allow WMW to cloak, or at least serve up a log-in page, on a click from natural SERPs leads me to believe that there is no such thing as a cloaking penalty. I cannot imagine a room of paid surfers making judgment calls on whether or not a site is worthy to be allowed to cloak. This is the beauty of google, that they are algorithmically based. You too could own the number one SERP position for “Ebay Auction”. Not very likely, not very probable, but possible, as they haven’t sold it to Ebay. Ebay has earned it. We have to believe this as it’s what gives us hope for better positions every day, and “because the alternative would be unthinkable.”

     By the way, I have no intention of sending visitors to a sign-in page; I convert quite well without tricks.

  65. I’m sure there are quite a few googlers who will gladly take you up on that drinking in the netherlands thing.

    I happen to know some very experienced drinkers at google-nl 😉

     And btw, I had to press back and fill in the spam protection thing properly; the answer was ’11’ and I entered eleven. What happened to your view on giving people the freedom to fill out forms any way they like?

  66. JLH:

    The mistake you are making is that you think the register page is the same URL as the one you click on in the serps, but it isn’t. That’s why you think that the “Page does not match Google’s description”. The page of the URL you click on does match the description, and you are free to view it if you want to.

     I’m not a WMW fan, as you suggested – I don’t use the place. I’m just stating that what WMW does isn’t cloaking. It is auto-redirecting to a different page under certain conditions, but not all conditions – not all people get the login page. It’s a conditional auto-redirect. I know that some newer people confuse what cloaking really is, because some confusing things have been written in forums, and newer people believe them. By “newer”, I mean people who arrived after Google arrived. So I’ll explain precisely what cloaking always was, and still is…

    Take a 10 page website as an example. Now create another 10 pages for it – one page for each of the normal pages – and design those new pages to rank highly in the search engines. We’ll call those new pages “engine pages”. When a person requests one of the pages, send it. When a search engine spider requests one of the pages, send the “engine page” equivalent of it. People never receive the “engine pages”, and engines never receive the normal pages. That’s what cloaking is. A set of “engine pages” were made for each search engine, because they all ranked on content, and they all had different algorithms.

    Confusion has arisen through the years because some people have misunderstood cloaking, and they have (mis)used the word for other things, such as IP delivery, auto-redirecting, and even hidden text. Unfortunately, people who weren’t around in the days when cloaking was commonplace and fully understood, have believed it, and we’ve ended up with the situation where the word “cloaking” is wrongly applied to all sorts of things and, since cloaking is a no-no in Google’s guidelines, all sorts of non-spam things have ended up being thought of as spam. But cloaking is cloaking, hidden text is hidden text, and auto-redirecting is auto-redirecting. WMW auto-redirects some people, but not all. They don’t cloak, and what they do isn’t spam.

    My personal view of being stopped by a login page is that, when the content is free and only requires me to register, I prefer that it is listed in the serps so that I know it’s there and available if I choose to register. But if the content is paid only (for money), I prefer it not to be in the serps. That’s a personal view though.

  67. PhilC, I agree with you on the terminology; you are right with the definitions, I was mixing terms. Cloaking, redirecting, or blue spider monkeys, whatever you call it, redirecting some people to different pages is wrong IF googlebot was led to believe all users would get that page. IP delivery (geo targeting, language, etc.) and user agent delivery and the such are all good uses for such a practice. I’m just against the deceptive nature of seeing the snippet in the SERP and then a user clicking the result expecting to see that page, only to be sent to some page that asks them to sign up. Others have spent time developing summary pages, their own snippet pages, etc. to give the user a feel for what’s inside so that they can make the decision to sign up or not. WMW is using google as a free auto-preview tool to entice new sign-ups. Granted this isn’t their intent, they are trying to combat content theft, but that is a net effect. The problem of course is that large well-known sites do this and google has to index them from a PR standpoint, and it’s their index so they can turn off the servers tomorrow if they want; they don’t owe us anything. With Matt calling out BMW publicly, I thought they had moved in a way to show us that all are treated equally.

  68. I don’t like being stopped by a login page either, but I’d rather know that the content is available if I choose to see it, than not know that it’s there at all – except when it has to be paid for. There are different views on it, and they are all right views. I think my view is in the minority, but it’s right for me.

    My take on the BMW case, and this Dutch newspaper one, is that they are examples of what can happen, possibly with a view to scaring people away from hidden text, but, unfortunately, I think that’s all they are.

    I want to reiterate that the ONLY reason why spammy hidden text continues to flourish, and put clean sites at an unfair disadvantage, is because search engines won’t deal with it. They can’t yet deal with it algorithmically, and they won’t deal with it by hand, even when it is reported to them. It’s not as though they can’t afford to deal with it by hand.

     If they would deal with it by hand when it is reported, hidden text would largely become a thing of the past – and quickly, simply because competitors would make using it too big a risk.

    Even though the engines don’t want hidden text, they are the ones that allow it to continue, and keep clean sites at an unfair disadvantage. These examples may make a few people think twice about using hidden text, but they won’t do anything in the overall scheme of things, and the unfair advantages will continue to flourish.

  69. http://blog.outer-court.com/forum/79322.html

     Philipp Lenssen has tackled the cloaking issue, maybe we’ll get a definitive answer 🙂

  70. Nice post. I just have to say that instead of Amsterdam, you should go to Schiedam, near Rotterdam, which is famous for its jenever. 😉

    http://en.wikipedia.org/wiki/Jenever

  71. Matt

    What do you propose for delivering the textual content of sites built entirely in Flash, by designers who don’t always understand the constraints of SEO, who don’t have the right skills but simply use Adobe/Macromedia Flash Studio properly to deliver the user experience their client is paying them for?
    These people end up with a nice site, but why should they pay for a full rewrite of their Flash to get indexed?
    What, other than hidden text, can help these sites get listed for their content?

    Thanks for your input (the webmaster guidelines remain pretty vague about this)

    Regards

  72. We already have a definitive answer.

  73. Time for a different view.

    Most webmasters out there (and I talk to many of them, working for big companies, web-building companies, or small entrepreneurs who build their own sites) don’t have a single clue about SEO. Most think search engine ranking is about meta tags and links. The content and code on the site are seldom recognised as factors.

    Apart from spam, hidden text on most sites can be anything from sloppy programming to helping a self-built site search system (as Trouw claims).

    Wouldn’t it be a lot easier for everyone, including Google, if Google just ignored hidden text? You can detect the text, you can detect it’s hidden, so why bother?

  74. Or Trouw could fix either its site search system or the copy on its site if it wants relevant results to be found (the former would be easier).

    I don’t think that’s an explanation most people would be prepared to stomach. If the page were about the words that were being stuffed into the hidden div, those words could just as easily have been included visibly. Not only that, the same logic that I used (I hate to toot my own horn, but in this case it’s relevant) about users not being able to see words that should appear in results still applies.

    If User A searches for Phrase B, and Page C doesn’t have it visible somewhere, User A might (and quite often does) complain. That right there is reason enough not to have it on any level…why upset the people you’re trying to market to?

  75. Hi Matt.
    Just wanted to ask: can it affect my ranking if a website has hidden divs, but those divs are not for keyword stuffing but for site features like a signup form and some information boxes? They are hidden until the user hits a button.

    BR,
    Armands

  76. “What do you propose to deliver the textual content of sites fully built with Flash, built by designers who cannot always understand the constraints of SEO, who don’t have the right skills but just use Adobe/Macromedia Flash studio properly to deliver the user experience their client is paying them for ?
    These people end up with a nice site, but why should they pay for a full rewrite of their Flash to get indexed ?
    What else than hidden text can help these sites to be listed for their content ?

    Thanks for your input (the webmaster guidelines remain pretty vague about this)"
    =========

    Hmm. That’s easy stuff. You can detect whether the user agent requesting the page/site has Flash installed or not. If no Flash, direct that agent to the HTML site. If Flash, direct that agent to the Flash site (a rough sketch follows at the end of this comment):
    yourdomain.com/flash/

    Easy stuff.

    My firm is building a brand new business “right now” that will do exactly what I have described, so you can all see it, view it, and actually learn from it. 🙂

    Any good designer or programmer, or SEO for that matter, should know exactly how to have both Flash and HTML. If you don’t know, well then, you ain’t any good. LOL
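
    For illustration only – a rough sketch of the kind of detect-and-redirect described above, not something from Matt or Google – assuming the Flash version lives at /flash/ as in the example path:

    <script type="text/javascript">
    // If the browser reports a Flash plug-in, send the visitor to the Flash
    // version; everyone else, including clients that don't run this script,
    // simply stays on the plain HTML version of the page.
    var hasFlash = false;
    if (navigator.plugins && navigator.plugins["Shockwave Flash"]) {
      hasFlash = true;
    } else if (window.ActiveXObject) {
      try {
        new ActiveXObject("ShockwaveFlash.ShockwaveFlash");
        hasFlash = true;
      } catch (e) { }
    }
    if (hasFlash) {
      window.location.href = "/flash/";
    }
    </script>

    Either way, the same textual content is available to users and to spiders via the HTML version, so no hidden text is needed.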

  77. "What do you propose for delivering the textual content of sites built entirely in Flash, by designers who don’t always understand the constraints of SEO, who don’t have the right skills but simply use Adobe/Macromedia Flash Studio properly to deliver the user experience their client is paying them for?"

    If you’re using Flash properly (i.e. not using it to build a full site unless it’s something along the lines of http://www.homestarrunner.com ), then you wouldn’t be asking this question.

    Remember, Flash is a plug-in, NOT a core browser component. So if you’re worried about the optimal user experience, you’ll also consider those users who cannot or will not install Flash.

    Once you’ve considered those users, you’ve simultaneously answered your own SEO question.

  78. It would be nice if Vincent Dekker, the writer of that article, would write a new article about the fact that Trouw received an alert about the hidden text. I haven’t seen that yet, but when he does, I will send it to you, Matt.

    By the way, is it standard procedure to send e-mails to alert webmasters? Or does this only happen when ‘leading’ websites don’t follow the guidelines?

  79. Can I be boring for a moment, and put the final nail in subscription cloaking’s coffin?

    The Google guideline that is claimed to show that what WMW does is cloaking is this:-

    Don’t … present different content to search engines than you display to users, which is commonly referred to as cloaking.

    Notice those words, “than you display to users”. When a person is looking at the serps, s/he is not in WMW, and is therefore not a user of WMW. Only when logged into WMW does the person become the site’s user. If the pages that the site’s USER receives are the same as what the search engine spiders receive, there is no cloaking.

    Consider that guideline, and especially the words “than you display to users”, and it is obvious that what WMW does is not cloaking.

  80. Hi Matt, I think it may be hard for an SEO noob (aka just a web developer) to get an answer here, but I’m giving it a try. I’m concerned about this topic, and especially about the "hidden" thing in CSS. My big question (I researched on the web and did not find anything clear): is it NOT OK to use CSS like "display: none; visibility: hidden;"? My worry is about some sites I design that have CSS drop-down menus, which use those CSS properties intensively. What about expandable DIV/P/CSS tabs? Is the best thing to avoid using those things at all?

    I know you are a busy guy, but please answer this 🙂

  81. Mauricio Quiros – The concern is really with content on a page that is deliberately hidden from a visitor but visible to search engines, in order to artificially try and boost ranks in the serps.
    You want to avoid using any text that is hidden from visitors because, let’s be fair – if your visitors don’t need to see it, it doesn’t have to be there at all 😉

    At the SES London conference recently it was stated (I believe by Yahoo!) that there are legitimate reasons for using hidden text, but personally I would just try to avoid it altogether. It is much better to make a site which doesn’t need to resort to such measures to overcome its technical problems.

    Btw – there is nothing wrong with hiding some elements in CSS, as long as you are doing it for a legitimate design purpose and not to hide text, keywords and other elements that are not required for the structure of the page (a typical legitimate example is sketched below).
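
    Purely as an illustration of that point (class names invented for the example, not an official recommendation): a drop-down submenu is typically hidden only until the visitor hovers over its parent item, so the hiding serves the visible design rather than stuffing keywords for crawlers:

    /* submenu starts out hidden */
    ul.nav li ul.submenu {
    display: none;
    }

    /* ...and is shown when the visitor hovers over the parent menu item */
    ul.nav li:hover ul.submenu {
    display: block;
    }

    (Older browsers may need a little JavaScript to support :hover on list items, but the principle is the same: every word in the submenu is something a visitor can actually see.)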

  82. There are still a huge number of sites that use this type of keyword spamming. I noticed droomhypotheek.nl finally got removed. The first complaints about that site were in 2005 or so, and it was still indexed a week ago.

    I’ve also seen a couple of sites that rank kinda high, similar to the droomhypotheek site (which used noscript tags), but they use noframes tags.

    How about Google just stops indexing noscript/noembed/noframes?

    I removed one URL from my site which was an href styled display:none; it was a link to an HTML-type sitemap for all those search engines that don’t yet support sitemaps.

    Ah well, good to read that Google is taking action on removing those spamming sites. Thumbs up on that.

  83. Hi again, I had no answer before and I’m still very curious. Do you know whether I risk being penalised for having a couple of hidden headings (such as "Navigation" and "Special Features") which are there purely for accessibility purposes? I’m really looking forward to some kind of informed insight…

  84. It amazes me that Google even warns such scam artists and gets them back in the index that fast.

  85. It’s a bunch of bull. I reported three cases on google.com.hk and I still see the spamming sites, with big piles of hidden div text, ranking in top positions after all this time! Google’s anti-spam team seems to have done nothing about it.

    For details, please check out my post here:

    http://elvis.hk/Google/Found-Spam-Sites-Of-Hong-Kong

  86. What is Google’s policy on using hidden text for ADA purposes? If a site contains a great deal of text within images, hidden text is what lets blind users accessing the site via screen readers such as JAWS get at the content contained within those images. Would such a site be penalized for this?

  87. Google spoils those spammers.
    If you are one of them, don’t worry. Google loves you!

  88. Hehehe, very good job, Matt. But I can say I also did these kinds of things in the beginning, as a way of keeping something from Google 🙂 or, for that matter, from users as well. I think we all did. We all know how these tricks became familiar – we are the ones making them – and if someone gets more traffic we start complaining about them 🙂 I don’t accept the policy of what they do, but I want to make it visible that we have all done it once or more in our history. Thanks.
    I did it here:
    http://www.warezworld.net – look at the footer, but they are not very hidden :p

    Very good article. It hasn’t been very long since I came across your blog, Matt, but I have already read many of your professional articles. Thanks.

  89. On the one side, I think Google does a good job crawling stylesheets to turn up hidden text. But on the other side, you are putting blackhats and webmasters trying to build an accessible and usable site on the same level.

    I hide content, yes. But I don’t hide it to push keywords… it’s unnecessary for people who have healthy eyes and can see the layout I made. Disabled users using a braille system need this extra text… so how will Google tell the difference between the two?

    I will not build my website layouts around Google’s guidelines… then they would look like the crap I see every day. So if Google thinks it’s god and can decide what’s good for users and what’s not, you will bring us back to HTML 4, FrontPage-made sites… thank you. 🙁
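
    On the accessibility questions above – just an observation about common practice, not an official Google position – many accessibility-minded designers position such helper headings off-screen instead of using display:none or visibility:hidden, because most screen readers skip content hidden with those properties (the class name here is invented for the example):

    /* keeps the heading in the document flow for screen readers,
       but moves it off-screen for sighted visitors */
    .screen-reader-only {
    position: absolute;
    left: -9999px;
    width: 1px;
    height: 1px;
    overflow: hidden;
    }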

  90. I understand the need to protect the index, but can someone give some advice on how to handle this issue in the following case:
    I’m trying to show my users a list of products; each product has an image and a text block.
    Above the list there are 3 buttons, and I use JS to change the dimensions of each product div.
    Button 1 shows only the image (4 columns), button 2 shows the image and an intro of the text block (2 columns), and button 3 shows everything in 1 column. At the moment the webpage defaults to button 2, i.e. there is more text in the div than the part I’m showing visitors.
    Am I correct to assume that Google will penalize me for this?
    (There are no Ajax calls; all text info is loaded with the first request.)
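
    To make the question concrete, here is a rough sketch of that setup (names invented; just an illustration of what is being described, not a verdict on it). The full text is in the page, but any visitor can reveal it with the third button:

    <script type="text/javascript">
    // The three buttons call setView("4col"), setView("2col") or setView("1col").
    // They only swap a class on each product div; the stylesheet then decides
    // how much of the image and text block is visible in that view.
    function setView(mode) {
      var divs = document.getElementsByTagName("div");
      for (var i = 0; i < divs.length; i++) {
        if (divs[i].className.indexOf("product") != -1) {
          divs[i].className = "product view-" + mode;
        }
      }
    }
    </script>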
