Grabbag Friday

My wife left me–temporarily. She’s going to China for two weeks, along with my mother and my mother-in-law. It sounds like the beginning of a bad joke, doesn’t it? But it’s true. And my forlorn, lonely, bereftness-osity means: it’s time for webmaster questions again!

Same guidelines apply as last time:

Ask whatever you want. I’ll tackle a few of the questions that are general. Please make sure you read the most recent comment guidelines so you know to avoid “what’s up with my specific site?” or other questions that won’t apply to most people.

Comments that ignore the comment guidelines will be pruned. I’ll add a couple more requests. First, please don’t ask me about legal stuff (“Dammit Jim, I’m an engineer, not a lawyer!”). And once someone has asked, “Dude, what’s up with topic X?” please don’t repeat the question–once is enough. I’ll let questions stream in today and tackle some of them this weekend.

192 Responses to Grabbag Friday

  1. I have a question right off the top:

    1) Why do your regular readers rank lower on the totem pole than your relatives? I don’t like this whole idea of being an afterthought. 😉

  2. Okay, now the serious question:

    Does that rule about not repeating a question apply to a question that was asked in another thread but not in this one?

  3. Hello, I hope your family has a nice trip.
    Some comments on Sitemaps, please: it seems updates in Sitemaps depend on a site’s pageviews?

  4. My question is: is there someone looking at Italian web spam reports? I reported 2 websites for spammy behaviour and they’re still there after months.

    Thanks

  5. Matt … thanks for whatever change was made yesterday. It fixed the SITE: command issues we’ve had since June 27th.

  6. What are some general guidelines and recommendations you would make to people who want to increase their site’s visibility on Google? There are 2 sides to this of course, one being the need for content. That aside, what tools, tips and “mechanical” things are most valuable? It is very hard to cut through the SEO “noise” and determine what actually does or doesn’t help, so any advice would be appreciated.

  7. Is anything being looked into in terms of spam in local searches? I’m not talking about the Local Search that comes up before the search results, which I think is great, but if I search for [term smalltownname] it’s almost always filled with sites that are generic and have been built around making a page for each town in the US.

    For example, if you search for [web design hackettstown] the entire first page doesn’t list any web design firms in Hackettstown which I’m assuming the user would be looking for. The same applies for almost any term like this [pizza hackettstown], [mortgage hackettstown].

  8. Can you put an end to some myths about having too many sites on the same server, or having sites with IPs too similar to each other, or having them all include the same JavaScript from a different site,

    or other weird things like that, and say whether they have any effect on ranking?

    I have a lot of current and previous employers who believe things like “we can’t put all our sites on the same server” or “we can’t make that form just one include” or “we can’t register them all with the same whois information”.

    Even though every one of these sites is a legit site completely following the guidelines, and they don’t even link to each other, and they’re not in any type of link scheme.

    That’d help here.

    Also, dispel any other crazy SEO myths out there…

  9. Hi Matt,

    What conditions cause Google to use the DMOZ snippet when there is already a valid META description tag on the page?

    If this is a trade secret, I understand, but please tell us that you cannot divulge this information if that is the case.

    Thanks,

    Brian M

  10. Hi Matt,

    Firstly, thank you for opening up for a QA session with us. We met in Boston as well, you seem to be a fan of one of my sites.

    I am having trouble understanding the problems that we face every time we launch a new country. Typically we launch a new country with millions of new pages at the same time, based upon our merchants’ needs. Additionally, due to our ambitious PR team we get tons of links from our network of sites as well as the press during every launch. For the second time we launched a new site and within about one month almost all results moved down 2-4 result pages.

    The first site that had this effect was French, and did not come back for 5 months. Now we have launched a site in Australia and the site has dropped down and has lost 90% of its traffic. Should we just plan not to get Algo traffic from Google for 6 months when we implement a new country?

    Thanks in advance for any help

    Aaron Shear

  11. Hi Matt,
    greetings from Germany.

    Why is Google still showing PageRank in the toolbar? Why show very old PR data? When is the right time to stop showing PageRank? It hasn’t any value anymore; it only confuses the little webmasters … or is that the purpose? It would be cool if Google terminated PageRank …

    Jojo

  12. Hi there, I’m wondering if there is any truth to the claim that Google favours [b] over [strong]? I am a baffled bunny as I am finding sooo many conflicting opinions, and can’t imagine the bots would not be able to realise they are the same – unless W3C is rearing its head in terms of SEO? Please please oh please can you help – or tell me where to go.. 😉 (double entendre was unintended…)

    and yes site: is working just dandy!!!

  13. Hi Matt,

    what’s Google’s stance on noscript content? Can a site using dhtml drop-down menus where the links are uncrawlable safely put all the navigational links in a noscript tag?

    Thanks

  14. Hi Matt,

    My simple question is this:

    Which do you find more important in developing and maintaining a website, Search Engine Optimization (SEO) or End User Optimization (EUO)?

    I will hang up and listen. 🙂

    cheers

  15. … stupid mistake… does Google favour bold over strong tags, and sorry that my whole comment is now bold … ahhh well, it’s Friday….

  16. Two Questions:

    1) Can you point us to some SPAM detection tools? I would like to monitor my sites to make sure they come up clean and also have a valid way to “rat out” my no-good, spamming (I am sure of it) competitors.

    2) What about the cleanliness of code (i.e. W3C)? Any chance that the accessible work will leak into the main algo?

    REMEMBER! “Yes dear, I miss you horribly!” even if you are able to watch TV in your tighty whities drinking beer and eating pizza.

    Cheers,
    Ted

  17. Hello Matt,

    I was wondering what the status is on Google Images and if we can expect to see an update on the indexing technology in the future. I know that it is important to include proper alt tags, filenames, and page name and content in order to get an image indexed correctly, but is there any type of research into this method to improve the relevancy?

  18. Thanks for the great site Matt,

    My question is:

    Does Google treat dynamic pages differently than static pages? My company writes software in Mason/Perl and all pages are dynamically created using args in the URLs. Many of my clients, SEO companies, are always complaining that they are hurting in PageRank because of this. Is there any truth in this?

    Example URL: http://www.someurl.com/index.mhtml?page=home_page
    Or for product pages
    Example URL: http://www.someurl.com/index.mhtml?sp=1&category=category_name&page=product_display&product_id=2725&product_category_id=3

    Thanks for your time.

  19. Hi Matt,

    Glad to have you back with us! 🙂

    Without giving away trade secrets, can you enlighten us as to what the most recent PR rollback was all about and the point of it? What purpose does it serve?

  20. Innocent sites get hacked (on shared hosting) and then links are placed on them to password-cracking, casino, pornographic sites, etc. Tens of thousands face this all the time and they’re never aware of it. These hacks destroy the webmaster’s SEO efforts.

    – Can Google inform the webmaster of this occurrence within Sitemaps? Maybe inform them that inappropriate pages were crawled, or that 90% of pages are proper and 10% are bad and out of nowhere?
    – Can we talk about common hacks that hurt webmasters, related to search engines?
    – A friend of mine had his site hacked because he allows users to upload images to their accounts on his website, and as a result the site was penalized by Google. He did not find out about this until months later when he saw a Google cache page, and it still took him 2 months to find a solution. When I searched Google for similar cached pages, I realized that hundreds or thousands of websites have been hurt or will be hurt…

    – So any pointers on common mistakes, common tips, and things to be aware of would be great, to help innocent webmasters avoid getting banned or penalized.

  21. Hi Matt

    In the fullness of time, I’d like to use geotargetting software to deliver different marketing messages to different people in different parts of the world (what I’m thinking of is some form of discounted pricing structure helping people in poor countries afford our products.) While we haven’t yet identified a solution, this would presumably involve some form of geographically-based IP redirection to region-specific sales pages.

    Are we safe to run with this sort of “plain vanilla” use of geotargetting software? Clearly we want to avoid any suspicions of cloaking…

    Anyway, hope your family have a great trip!

    James

  22. I have a competitor that is a web development firm for a specific niche. Each time they build a new site they add to the footer links a credit to their corporate site, which makes sense, but then they also add a link using a keyword phrase to a site that fits the niche. My question is this. Is it frowned upon to add an extra, not necessary link to a niche site for link pop purposes? Can it hurt the linked-out to site?

    Thanks Matt!

  23. Matt,

    I have a question. One of my clients is going to acquire a domain name that is very related to their business, and it has a lot of links going to that domain. They want to do a 301 redirect to their current website. It will roughly double their links and, as I said, the links are very related to their website. The question is: will Google ban or apply a penalty for doing this 301 redirect?

  24. Matt,
    What’s the best way to theme a site using directories? Do you put your main keyword in a directory or on the index page? If you’re using directories, do you use a directory for each set of keywords?

  25. If an ecommerce site’s URLs have too many queries (i.e. /?productId=1212312323astegkj984tdas?1212312323astegkj984tdas?asd2fsdf) and are un-indexable, is it acceptable under the Google Guidelines to serve static HTML pages to the bot to index instead? Thanks!

  26. Aw heck, how about a real question:
    Ginger or Maryann?

  27. I would like to use A/B split testing on my static HTML site. Will G and other search engines understand my PHP redirect for what it is, or will they penalize my site for perceived cloaking? If this is a problem, is there a better way to split test? Thanks!

  28. Matt,

    I have a very interesting question about redirects. I have one or more pages that have moved on various web sites. I use Classic ASP and have made redirects with the proper header tags as listed below. Despite this, they are still being displayed in the search engines with their old page names. I don’t want to give specific URLs on your blog for multiple reasons, but I’m curious as to why this happens. Example code is below. These redirects have been set up for over a year now. When I run a spider on them, it handles the redirect fine.

    Response.AddHeader "301","Moved Permanently"
    Response.Status = "301 Moved Permanently"
    Response.Redirect("newpage.asp")

    Thanks Matt!
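
On the redirect code in comment 28: in Classic ASP, Response.Redirect sends its own 302 status, so the 301 status set on the line before it may not be what a crawler actually receives. One way to check is to request the old URL without following the redirect and look at the status code that comes back. The Python sketch below does that under a few assumptions: `classify_redirect` and `fetch_status` are illustrative names, not part of any tool mentioned in this thread, and the URL in the usage note is a placeholder.

```python
import http.client
from urllib.parse import urlsplit


def classify_redirect(status: int) -> str:
    """Label an HTTP status code as a permanent or temporary redirect."""
    if status in (301, 308):
        return "permanent redirect"
    if status in (302, 303, 307):
        return "temporary redirect"
    return "not a redirect"


def fetch_status(url: str, timeout: float = 10.0):
    """Request `url` once, without following any redirect.

    Returns a tuple of (status code, Location header or None).
    http.client never follows redirects on its own, so the status
    returned here is the one the server sent for the original URL.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling `fetch_status("http://www.example.com/oldpage.asp")` and passing the returned status to `classify_redirect` would show whether the old page answers with a permanent or a temporary redirect. If it turns out to be temporary, a common Classic ASP pattern is to set the status to 301, add a Location header by hand, and end the response instead of calling Response.Redirect.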

  29. Matt,

    Should I be worried about this?

    site:tableandhome.com
    returns 10K records
    site:tableandhome.com -intitle:by
    returns 100K records, all supplemental

    The reason I ask is that we are having trouble getting the “pages we want,” namely our product pages in the index, which seems to be getting jammed up with lots and lots of different ways to slice and dice our product line.

    Thanks,
    -Dave

    PS. see you at SES

  30. Sean Stallings

    Does Google Analytics play a part in SERPs?

    Thanks!

  31. Dear Mister Cutts,

    It’s going to be a long weekend, I suppose, looking at all the questions already asked.

    But I have one too. When does Google detect duplicate content, and within which range will duplicate be duplicate?

    And following on from this question: when measuring the duplication, will link text be measured as content or not?

    Greetings from another Pizza shredder!

  32. I’d like to explicitly exclude a few of my sites from the default “moderate” safesearch filtering, but Google seems to be less of a prude than I’d prefer. Is there any hope for a tag, attribute, or some other snippet to limit a page to unfiltered results, or should I just start putting a few nasty words in the alt tags of blank images? Thanks!

  33. Sometimes I make a box spiderable by just putting links in the option elements: Bla. Normal browsers ignore them and spiders ignore the option en use the a.

    But since a while Google is using the Mozillabot, and that bot renders the page before he crawls it. I know that if the Mozilla engine renders the element he will remove the element from the DOM. Is the Mozillabot doing that also? Because then I have to find out an other way of making them spiderable.

  34. Hi Matt,

    Is it possible to search for just home pages? I would like to be able to do a search and filter out any page that’s not just the domain name. Right now I am trying to get that done by adding -inurl:html -inurl:htm -inurl:php -inurl:asp -inurl:doc -inurl:pdf etc. etc.

    But that doesn’t filter out enough.

    Is it possible to do this in some different way?

    Thanks,

    Peter

  35. William Douglas

    Over at WMW, there were quite a few people claiming their sites came back to normal on July 27th from the June 27th fallout. Unfortunately for me, I found all my listings totally gone on July 27th. From #1 to #200, many pages not even showing up anymore. I did notice all the pages that lost rank fell to the very end of site:

    When things like this happen, and people lose nearly all their listings, do you advise going through and looking for problems, re-writing content etc., or is this something that is more of a waiting game, and we should just keep working as usual? I’ve signed up for sitemaps, enquired about any penalties, but I don’t anticipate getting much out of it. My problem seems limited to all the article pages, regardless of topic or linkage. Just an unexplainable loss. All doing fine one day, all gone the next.

  36. Matt –

    “Wondering whether a new website is worth your time? Use the Toolbar’s PageRank™ display to tell you how Google’s algorithms assess the importance of the page you’re viewing”.

    I have never met a single person who uses TPR for saving time! In the real world it is used either to sell text links, sell sites or as a badge by link collectors. TPR serves no useful purpose that I can see except to erroneously validate a micro-economy, an economy that Google could well do without. So my question is – Why doesn’t Google do away with TPR?

    – Michael

  37. Matt,

    How is Google coming along with indexing Flash? How can a webmaster avoid duplicate content if they want their site to rank and have a snazzy flashy site? Any different guidelines?

    Thanks,

    Pickle

  38. I’d like to know more about the supplemental index. It seems while you were on vacation many sites got put in there. I have one site where this happened. It has a PR of 6, many links from dmoz, yahoo, wikipedia, .edu sites, etc. Online since 2001. In late May though it got stuck in the supplemental index, and from what I read on forums I’m not the only one.

  39. Does Google consider using sIFR cloaking? If running text through sIFR displays the same text as it would without running through the code, why would it be penalized?

  40. Will we ever see more kitty posts in the future? 😀

  41. Matt, what are Google SSD, Google Guess, Google RS2, Google Mobile Marketplace, Google Weaver and other services discovered by Tony Ruscoe? 😀
    http://ruscoe.net/blog/2006/07/whats-in-googles-sandbox.asp

    So, another question, because I suppose that you will not answer my first question 😉 : How does Google officially respond to the statistics showing Google services with a really small market share? How do you explain that? http://weblogs.hitwise.com/bill-tancer/2006/05/google_yahoo_and_msn_property.html
    http://weblogs.hitwise.com/bill-tancer/2006/07/google_properties_updated.html

  42. hello,

    I own a web site which uses national TLD suffix; just recently I have joined sitemaps and what happens is that for some reason googlebot is unable to reach my web site because of DNS timeout error

    I have checked, rechecked, analysed and my DNS records are fine, site is easily reachable from any browser, emails are flowing without a problem and yet, Google Sitemaps states that the last successful Googlebot scan was on June 15th

    I have tried to explain this particular problem on sitemaps newsgroup (no usable response), I have tried to send support request (no response) and I am simply out of ideas

    is it possible that googlebot for some reason uses different DNS approach, or that something changed in last 45 days?
    how can I learn what is the exact error message instead of “DNS timeout error” which I can not duplicate

    p.s. not to mention I have a few other domains hosted on the same server and they hold DNS records on the same DNS server, and the only difference is that all of my other domains are either .org or .net and not national domains

  43. you should switch the captcha here to kitten auth, but with pictures of your cats.

    that should provide everybody with enough kittens to last a while.

  44. Hey Matt,

    Can a redesign (improved from bad invalid markup to strict XHTML, CSS, etc.) affect rankings dramatically? i.e. don’t rock the boat

    Also when you end a page’s life and have no replacement to redirect to, what’s best to use? 410? or just 404?

  45. Matt, thanks for sharing answers!

    Here goes. Google says:
    “What can I do if I’m afraid my competitor is harming my ranking in Google?

    There’s almost nothing a competitor can do to harm your ranking or have your site removed from our index.”
    http://www.google.com/support/webmasters/bin/answer.py?answer=34449

    My question is about the “almost” in the last sentence… or differently put: So what can a competitor do to hurt one’s rankings, as Google doesn’t completely rule out the possibility of it happening?

  46. Matt – first, good for your mom and wife – that’s a super project.
    Second – how about a preview of what you’ll say in the many SES “Duplicate Content” discussions that’ll be going on? e.g.
    Many sites use similar content in different ways (e.g. Hotels.com affiliates using their database with maps or other added content) – what are the search pitfalls to avoid?

  47. Does Google index or rank blog sites any differently than regular websites? With WordPress being so great, many sites are dropping the static pages for a 100% blogsite. Are these seen or treated any differently?

  48. Matt,
    First, congratulations on your fight to bring back ‘organic’ SEO. It’s true you still need links to rank in Google, but they must be organic and in the spirit of what a link is supposed to be: ‘a vote for your site’. There is a huge push now to submit articles to get links, since reciprocal links and paid links are no longer as effective. We have always done articles and encouraged content distribution as a legitimate branding and marketing strategy. Now my concern is that the spam articles will become a target for Google, and before we know it article links are all wiped out and sites using them in the correct way and for the right reasons are taken down in the fallout, as has happened before.

    My question is this: can you please dispel the myth that spewed-out articles from ‘article banks’ distributed all over the place are a valid linking strategy for Google? It might stop things getting out of control and prevent the need for ‘article filters’ which would wipe out the value of ‘real’ articles to the web. I have conducted multiple tests with the article distribution being touted as a linking strategy and the results were SPAM, SPAM, SPAM……

  49. My question is regarding the total number of pages indexed for a given site in Google…
    We see this number bounce around a LOT for our site (via the [site:] command). Besides hitting a different datacenter, what could be the reasons for this? Some days all datacenters will have >2 million pages from our site, then some days all datacenters will have 200,000 pages from our site. The numbers seem to increase/decrease randomly each day. Overall, shouldn’t the trend be up?
    Thanks,
    Jarid

  50. How do we get new websites to show up?

  51. Pon Arun Kumar

    Hi Matt. This is my first ever comment on your blog, and I am writing it with shattered hopes. I received a mail from Google stating that my profile is not strong enough to suit an SQE. Moving forward, I’d like to know the basic skills expected.

  52. Hi Matt, thanks for taking questions again. 🙂

    Almost on a daily basis, I see someone on a forum mention their “sites”. They talk about having multiple sites on the same subject with “different content”.

    Any chance we can get some insight on how Google views these types of “multiple site collections” on the same subject? Is a larger comprehensive site the better option?

    I always suggest the larger site, but it would be nice to hear your thoughts as well.

  53. James Pettyjohn

    I’ve been seeing a rise in Social Bookmarking popularity and noticed these sites are showing up a lot now in standard Google results (places such as digg.com and netscape.com; I’ve seen them as high as #10 on a prominent keyword, in this instance “church”). With the rate of growth in that area accompanied by the speed of indexing, I was wondering if there is a supplemental Googlebot of sorts for social bookmarking. Along the same lines, I wanted to know if there is, or is any plan for, some special way of treating these items as to what weight they give other related sites and what value they themselves can collect.

  54. Is it possible for Google to have TWO different titles for the same page?
    It seems so…
    When I do a site: mydomain.com it shows some outdated info from years ago, (my company name) from when I first launched my site about 6 years ago.

    However, if you do a view source, you will see the actual title. Sometimes Google displays the old title, sometimes the actual title, depends upon their mood I guess.
    With the July 21 upset, they seem to have switched back to my old title and removed me from the serps.
    Any ideas on how this can be rectified?

  55. Hey Matt, while all the important ladies in your life are away, can you give us your thoughts on a simple point relating to writing for the Web:

    Prevailing wisdom from the end-user usability perspective goes something like: “People don’t read screens like paper. They’re usually in a hurry, they scan-read rather than read every word, and they want a short, snappy delivery of information, not screeds of Ph.D. thesis.”

    Prevailing wisdom from the search-engine perspective goes something like:
    “Search engines tend to rate pages with loads of keyword-laden text more highly than their scanty (but perhaps equally keyword-dense) counterparts. So 50 pithy bullet-point words of text are less good than 500.”

    Is there indeed a dichotomy there? We all know one should write for people, not for search-bots, but one can never quite shake off the feeling that a customer-pleasing, pithy page is likely to score less highly with Google than a more lengthy piece.

    Your thoughts?

  56. How important is LSI (themed content) in the current ranking algorithm?

  57. Something went wrong with my post, I didn’t know you parse the HTML. So here again:

    Sometimes I make a [select] box spiderable by just putting links in the [option] elements: [option value=website.com][a href=”website.com”]Bla[/a][/option]. Normal browsers ignore them and spiders ignore the [option] and use the [a].

    But for a while now Google has been using the Mozillabot, and that bot renders the page before it crawls it. I know that if the Mozilla engine renders the [option] element it will remove the [a] element from the DOM. Is the Mozillabot doing that also? Because then I have to find another way of making them spiderable.
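
The distinction drawn in comment 57, between a parser that only tokenizes markup and an engine that builds a DOM, can be illustrated with a small sketch. Python’s standard `html.parser` reports start tags as it meets them and does not enforce which elements may nest, so it still sees the [a] inside the [option]. This only illustrates the tokenizer case; it makes no claim about how Googlebot handles such markup, and `LinkCollector`, `extract_links`, and `SAMPLE` are names made up for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from every <a> start tag, wherever it appears."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # The parser hands us each start tag in document order; it does not
        # check whether an <a> is actually allowed inside an <option>.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(markup: str) -> list:
    """Return the href of every <a> tag found in `markup`."""
    parser = LinkCollector()
    parser.feed(markup)
    parser.close()
    return parser.hrefs


# The markup from the comment, written out as real HTML.
SAMPLE = (
    '<select>'
    '<option value="website.com"><a href="website.com">Bla</a></option>'
    '</select>'
)
```

Running `extract_links(SAMPLE)` shows what a simple tag-by-tag reader picks up from the select box. A rendering engine that builds a DOM first may discard the nested link, which is the behaviour the comment is worried about.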

  58. Hi Matt,
    Greetings from Portugal 🙂

    Does Google give the same link weight to an international second level TLD like gov.pt or edu.pt as to the US equivalents .gov/.edu?

    Thanks,
    Jose Fernandes

  59. I’m curious to hear about the 27th shakeup and the weird things that happened with the link operator for reporting backlinks.

  60. How much spam does Google allow? (just kidding.)

    Seriously, if a subdomain for every city and town in the US with a doorway on each one doesn’t get taken out ….. we can’t help thinking … why not do like competitors with some hidden keywords top and bottom of each page? … why not throw in some extra links with style “display:none” or (I love this one) divs with id=hidden.

    We try to stay clean but, man, it gets frustrating to see what other folks get away with. How about an occasional focused effort on one specific trick and then another? Maybe that could put some fear into the folks who think they can fly under the radar.

  61. Matt

    What’s your view of those co-op link networks?

    The reason I ask this question is that I’m seeing a site rise very high in the SERPs for some very competitive terms, and from what I can see they are achieving this by being part of a co-op link network and are showing thousands upon thousands of backlinks on Yahoo.

    Just thought this type of thing would have been high on your agenda and am amazed it seems to be working in 2006!

  62. Matt, for you, what has been the biggest challenge working at Google and why :)?

  63. Matt,

    How does one get their AdSense rep to reply to requests for AdSense documentation?

    The boss thinks ours went on permanent walkabout.

  64. Enjoy your newfound – albeit temporary – bachelorhood.

    Now the question:

    Google (allegedly) does not like subdomains that repeat content, as per the webmaster guidelines. So for example, if http://www.mysite.com and subdomain.mysite.com are exact copies, rumor has it that they will be dinged a few points.

    What happens in the instance of http://www.mysite.com vs. mysite.com?

  65. Hiya Matt – nice to have this chance again to get replies to puzzling questions.

    Mine concerns the .org extension. I always understood that this extension was for use only by either government agencies or official non-commercial sites and as such were given preferential treatment and heavier weighting in the search engines.

    But more and more often we are seeing these extensions used for sites that are commercial sites or even simply adsense carriers, such as:-

    http://www.babybedding.org.uk/

    Are they then getting preferential treatment, or is this one of those myths?

    ALSO – are sites, such as the one above considered spam? I can’t see where it adds any value and would only annoy visitors who stumbled across it – I understood Google was filtering out these type of sites.

    ps – very jealous of the women in your life 🙂

  66. Hi Matt,

    If you can post at all about the related:www.mysite.com information I would be appreciative. Google’s community view of a site seems increasingly important. What other information can a webmaster glean from the results of a related: search?

    Thanks in advance.

  67. Matt
    Enjoy the ‘alone time’ and then be glad when it’s over. “It is not good for the man to continue by himself” – Genesis

    Does some duplicate content hurt rankings? I’ve recently started a journal of designs ( http://www.signspecialist.com/journal/ )

    By the nature of the journal (displays of previous sign designs), some content (whole paragraphs) will be duplicated hundreds of times (once in each entry). The duplicate content is needed so the potential customer can be successfully ‘pitched’ (or get a feel for the process/product). But will this duplicate content hurt rankings – with Google seeing the content as a spam attempt? Or will Google be able to ‘see’ this ‘non keyword’ content and just ignore it?

    With that concern in mind, it would be great if there was some recognized tag for a ‘noindex’ for partial text within a page. Kind of like a no follow but for text.

    But, then again, if this type of duplicate content is ignored – even better. But is it?

  68. Blogger. WordPress. MT. TypePad. LiveJournal.

    Who does the Google Algorithm love the best in terms of indexing, especially for good keyphrases? The ones most semantically correct?

  69. Hi Matt,

    I have seen a lot of conflicting information on how Google crawls “stateful” web sites.

    If a site sets session cookies to maintain non-essential information, which is not required to view pages but alters the way they are displayed, will Google give such pages any kind of reduced coverage or penalty? Users with cookie support would see a slightly different page than bots. Would this be seen as cloaking to Google?

    A concrete example:

    When a user looks at a page showing all “Blue Widgets” their query is stored in the session so that after they choose a particular blue widget they can click [next] to move directly to the next result. The same widget could appear on a “Large Widget” result page and the [next] button would then move to the next big widget, but they are essentially the same page.

    It is also important that the page displaying the large blue widget is indexed by Google with a consistent URL to avoid duplicate content problems so I believe I should not store the query or session id in the URL itself.

    1 – Does setting a session cookie affect the crawl in any way? I have read that dynamic pages can receive reduced coverage?

    Depending on what the user searched for, I would like to make sure that relevant information is displayed prominently on the page so they can see why the result matched their query. From the “Blue Widgets” page, viewing a widget’s details should highlight the blueness of the widget but may not show the size at all.

    2 – Bots (and browsers) without session support would see all properties of the widget down to the finest detail. Is this cloaking?

    Thanks,

    John.

  70. I’d like to know if Google ever goes back and checks banned sites to see if they could be re-included without getting a request for re-inclusion.

    The background is that I bought an expired domain, built a new site using it, and immediately had good traffic from the major SEs. Googlebot looked at the site every day and I was happy for months, until I opened a Google Analytics account (Magic Tool!!) and discovered that the Google traffic was coming from a Google Directory listing and there was zero organic search engine traffic from google. I did a site:www.domain.com search and nothing showed up, so applied for reinclusion from the link on my sitemap page.

    3 weeks later there is a big jump in traffic and now the site: search shows about 40 pages in Google’s index.

    Would the reinclusion have happened anyway when Google realised that whatever had caused the ban had been fixed, or do we need to check a site and request re-inclusion if the site seems to be banned?

  71. Hi Matt, I’ve mentioned before that I’d love to see you do a define: type post, where you define terms that you Googlers use, that we non-Googlers may be confused about. Terms such as data refresh, orthogonal, etc. You may have defined them in various places, but one cheat sheet type of list would be great. Thanks.

  72. If I buy out a company’s website and 301 redirect the entire domain to my current domain will I get a penalty for doing that in Google organic search?

    The domain I am going to buy out has about 3300 links pointing to it and it is very relevant to my current website that has 2300 links pointing to it.

  73. Hi Matt,

    Tell the wife to get you a five toed Dragon in Beijing… You won’t regret it. 😉

    I asked before about javascript cloaking, but never got an answer, so I thought I would ask again.

    Is google working on ways for googlebot to spot javascript cloaking? Or maybe I should ask;
    How far along is google in finding a way to detect and remove javascript cloaked sites?

    Javascript cloaking is so easy for humans to spot (just surf with your browser’s javascript turned off and check the cache), but it seems to be beyond the grasp of googlebot.

    There are some highly competitive queries where I have seen as many as seven javascript cloaked pages in the top ten results… that’s seven out of ten, on just the first page…

    From what I see, javascript cloaking is the current way to ‘fool’ google and not get caught, unless a human reports it.

  74. Matt, I would like to know why some sites now no longer show the homepage at the top of the results for a “site:www.example.com” search. Sites that I’ve seen this happen to have taken a big drop in SERPs. Sometimes during the next data refresh things go back to normal, but not always.

    So what triggers a site’s homepage no longer showing at the top for a “site:www.example.com” query and what can be done for sites seemingly trapped in that condition?

  75. Google still seems to be having listing issues for EXACT company names. I stumbled upon this one recently that goes back to January (URL below). Today it brings up an Infoword article about the company.

    http://chuvakin.blogspot.com/2006/01/google-sucks-sorry-cant-resist.html

    Thoughts? Comments?

  76. Thanks for this

    Can you explain what is going on with
    the home pages disappearing.

    My HP has gone for some searches
    on google, but remains in place on
    google.co.uk

    It’s rather odd especially when people
    are searching for my company and it’s
    not there.

    I didn’t implement the noodp tag before
    although I have now as the description was
    weird – so maybe that will improve

    If you have time, the domain is the email
    addy I am supplying with this comment.

    Thanks Matt – and if you ever make it
    across the pond – let us all know and
    we can buy you some warm beer 🙂

  77. Hundreds of companies are having issues with one spammy site in particular. It is a self-proclaimed ‘consumer complaint’ site. I’ll call it Ror for short. Here is the deal with this particular site. It is not moderated and allows for anonymous comments to be posted. Ok, so where is the problem and why is this spam? The problem comes in when someone, whether it is a competitor, a disgruntled ex-employee or someone with a vendetta, decides they don’t like your company and decides to spam postings to Ror. This becomes kind of like blog comment spam except all on one site and it’s not for external link building but for on-topic content. It may be inaccurate content but content nonetheless.

    So let’s say the owner of Mom’s Donut Shack yells at an annoying kid skateboarding in front of her business one day. The kid gets ticked and goes to Ror and decides to post 30 messages using various names about how Mom’s Donut Shack (MDS for short) is really a brothel and poisons their customers with arsenic. Now even if MDS has the best website in the world and is number one for any search involving the MDS brand name, Ror will show up as number 3 and number 4.

    So my question is what can a company like MDS do about something like that or how do we get spam like Ror out of Google?

  78. Of dashes and plurals.

    When I’m optimizing a site that has terms that sometimes get entered with a dash (-), how does Google look at that relative to a space? Also, how does Google handle plural terms? Should I try to optimize for both the plural term and the singular term?

  79. Thanks for asking us for questions again…It’s very timely.

    I recently tried to send a client to Google Help regarding
    “What are supplemental results?” at – http://www.google.com/support/bin/answer.py?answer=12286
    and I find that it’s gone.

    I thought that was odd, and so I Googled “supplemental results” and that same page came up #1 (and missing).

    Cache showed the old answer, (which I posted on my blog), but now there are no results at all found when searching the Google help files, and even the cached page is gone.

    So…While I’m not really after info on “What are supplemental results?”, I AM interested in why the information was removed from the Google help files.

    I’ll also very quickly echo Ryan’s post above…

    I can’t believe that a legitimate business would have to suffer the inconvenience of having to spread out their websites on different IP blocks and web hosts just to rank a little bit better.

    There are now even “seo hosting companies” that offer different IP blocks for this very purpose. As a web host with four of my own servers, I find this frustrating…Say it ain’t so, Matt!

    thanks again, and enjoy your loneliness

  80. I have a .com site with sightseeing information and links to MS Word documents with more supplementary information (for easy printing – have no control of css on that site).
    I am now building a new .co.uk site where I would like to put all that information on regular pages because here I can manage the printing issue.
    Could I get banned for duplicate content when one set is on web pages on .co.uk and the same information is in MS Word documents on the .com site?

  81. Supplemental Challenged

    Matt, previously you said non-English spam would be targeted this summer. To say the least, spam has increased dramatically on all the Googles in July, but even more so on google.de. Free host blog pages and redirects of any blog page are all over the results. In other words, obviously no progress has been made in lessening spam on the non-English Googles in particular.

    Can we expect to see some attention paid to spam on the non-English Googles in August?

    (And please hurry up and get rid of the July blog garbage on google.com too… 🙂

  82. Hey Matt,

    For years there has been talk of being penalized by Search Engines for using CSS that somehow hides content. People often cite {display:none;} as a sure way to get penalized, and based on that assumption, recommend something like {position:absolute;left:-999em;} or {height:0;width:0;overflow:hidden;}, etc. It seems to me that Search bots would see right through these other techniques if they wanted to.

    Meanwhile, my experience is that there are plenty of valid reasons for hiding content — CSS dropdown navs, tabbed content, etc.

    My question is: is it really true that Search Engines will penalize pages which have content that is “hidden” with CSS, and if it depends, what does it depend on? Thanks

    Ken
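
    To illustrate the point that the “safer” hiding tricks are no harder to spot than display:none, here is a rough sketch that flags all of them in a string of CSS declarations. The patterns and names are illustrative assumptions only; nothing here describes how any search engine actually inspects CSS, and it cannot tell a dropdown menu from keyword stuffing.

```python
import re

# Illustrative patterns only; not a description of any crawler's real logic.
HIDING_PATTERNS = {
    "display:none": re.compile(r"display\s*:\s*none", re.I),
    "visibility:hidden": re.compile(r"visibility\s*:\s*hidden", re.I),
    "off-screen position": re.compile(r"(left|top|text-indent)\s*:\s*-\d{3,}", re.I),
    "zero-size box": re.compile(r"(height|width)\s*:\s*0(px)?\s*(;|$)", re.I),
}


def hiding_techniques(css_declarations: str) -> list[str]:
    """Return the names of the hiding techniques found in the declarations."""
    return [
        name
        for name, pattern in HIDING_PATTERNS.items()
        if pattern.search(css_declarations)
    ]
```

    Both of the alternatives quoted in the comment are caught by simple patterns, which suggests that detection is the easy part and judging intent is the hard part.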

  83. Supplemental is the topic!

    Trying to avoid the question, “why did my site go supplemental?” I thought I’d rephrase from a searcher’s point of view first.

    When I do a search using Google, often results show up that have the word “supplemental” by them. These results are not on page 38 of the results but rather right on top. Sometimes when I follow the result the page is exactly as shown in the snippet, other times it’s changed totally or doesn’t exist.

    What does Google mean by supplemental? If it is meant as a page that may or may not still be what it was when crawled some long time ago, then why does it show up in the first page of results when there are hundreds of thousands of others available (according to the 1 OF statement)? If that statement is true, then why does the resulting page sometimes have exactly what I want, but the cache date is from 2005?

    Now as a webmaster; I have sites that have gone 95% supplemental. My concern is that there is no pattern to it. Specific product detail pages with tons of content are supplemental while a worthless (in the eyes of a searcher) contact page or sitemap is not. The supplemental pages show up in the search results just fine, still receive search traffic, but are not crawled or updated at all. To combat this I’ve had to start designing the site for GOOGLE not the user, which is in fact against all logic when it comes to building a good site. Why bother updating a page if it’s gone supplemental? Just do the update and rename the page (to take it one step further and greyer I could make it a subdomain page!). That page will be indexed soon enough and the supplemental can go away. Seems like this practice is chewing up valuable processor time, crawler bandwidth, and storage space by Google and myself.

    So the real questions are: What differentiates a page in the supplemental index from one in the normal index? Is it a site-specific phenomenon, or page-specific? Is there anything I can do about it, generally speaking?

  84. Oh yeah, I rudely forgot to say thanks for this opportunity in my previous post. Thank you.

  85. Anonymous Coward

    Hi Matt,

    Is it true Google favors sites (forgive the personification) where content is changed frequently?

  86. Matt,

    First and foremost, if I may start by wishing you a safe return and reunion with your loved ones.

    In regards to the topics for the Q&A, could you touch on the duplicate content issue, especially for e-commerce sites?
    For example, would a Blue Widgets page and a Red Widgets page be considered duplicate pages in the eyes of G?

  87. Matt,

    Can you please shed some light on the area of RSS feeds? Specifically, why are RSS feeds not considered duplicate content, but copying and pasting content from one site to another is dup content? Is it because one person made their content available to others, while the person who copies and pastes is in actuality stealing?

    thanks,
    Mike

  88. A recent thread in WMW was concerned with the use of alt-text for images and whether it might have a positive or negative effect on rankings. Could you elaborate on this?

    Thanks!

  89. What just happened yesterday? Our top search phrases have rearranged in the biggest way I’ve ever seen, with many of the shortest ones dropping out altogether. Here’s the interesting part: the pages that got hurt by this were all in Google Analytics. Has the Chinese wall between Search and Analytics been breached in some way?

    BTW, my Analytics-measured referrals from Google are within 5% of each other Wednesday and Thursday, so it’s mainly traffic getting redistributed around my site on longer phrases, but it’s still the oddest thing I’ve seen in ten years of reading server stats.

  90. Hi Matt,

    I would be interested in some more thoughts on the sandbox. One of the most common questions around the forums is “Why doesn’t my 2 month old site rank for ‘blue widgets’ on google?”. Most of the time it’s because the site is cr@p 😉 but in some cases the site has good quality links and is just not old enough to rank for competitive terms. While SEO pros know about this it must be very frustrating for people with just one site trying to promote a new business. Maybe if there was some tool in sitemaps to show whether a site was sandboxed or not?

    Also (not that it matters) what’s up with Page Rank? A bunch of sites seem to have PR3 homepages and PR5 inner pages.

    Patrick

  91. Why has Google Suggest data/capabilities not been utilized on the main search?

    And a step further…
    Why not show the end user similar data on a SERP recommending links based on previous users’ click-throughs?

  92. TrustRank…

    What’s the likelihood that it will be shown in a future version of the ToolBar?

    Which is more important to Google calculations – PageRank or TrustRank?

    How does one keep one’s TrustRank score clean?

    Thanks!

  93. For the datacentre watchers out there:

    – should all the results across one Class-C IP block be the same most of the time (except for short out-of-sync periods where new data is being poured in), or are they supposed to be different because you are trying different things on them? Of course, different Class-C blocks are often very different in their results; this is expected.

    – would it make more sense to use the direct IP addresses or the 41 GFE datacentre names (that you have been using for the last 2 years since the demise of the old system) to communicate with Googlers?
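
    For anyone comparing datacentres this way, a small sketch of grouping IP addresses by “Class-C block”, which in this context is taken to mean the /24 network. The addresses in the usage note are reserved documentation ranges, not real Google datacentre IPs.

```python
import ipaddress
from collections import defaultdict


def group_by_slash24(addresses):
    """Group dotted-quad IPv4 addresses by their enclosing /24 network."""
    groups = defaultdict(list)
    for addr in addresses:
        # strict=False lets us pass a host address and get its network back.
        network = ipaddress.ip_network(f"{addr}/24", strict=False)
        groups[str(network)].append(addr)
    return dict(groups)
```

    Calling group_by_slash24(["192.0.2.10", "192.0.2.99", "198.51.100.7"]) puts the first two addresses under 192.0.2.0/24 and the third under 198.51.100.0/24, so results can be compared within and across blocks.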

  94. Matt,

    Are you tired of all the 27th questions? I know I am…..

  95. When a site with hidden text is listed on the first page, did Google exclude the hidden text, or is it “wise” to report that site?

    I reported a site twice in the past few months but nothing happened.

  96. Why do Google results often have two consecutive listings (second one indented) for the same site? Been seeing a lot of these where the second listing is just a Contact Us page or a FAQ page.

    I would think showing the single best page that matches the search query would be enough. If the site has good navigation, the user will find additional relevant information.

    This would allow more folks to rank higher and level the playing field.

    Ed

    1) I am seeing a lot of sites with “%09” (tab) and “%20” (space) in front of the URL in Google’s index. What can a site that is indexed like this do (since the URL will usually not work for a visitor and a 301-redirect is not possible)?

    Try http://www.google.com/search?q=site%3A%2509www and http://www.google.com/search?q=site%3A%2520www

    2) How can we speed up cross-domain 301’s? (moving domains)

    3) How can we speed up automatic page removal via 404? (would 410 be better?) Is it really “wait a year” or “do it manually”?

    4) Tell us something about the 30 day automated spam penalty. 😉

    5) Since the link:-command is purposely broken, why is it still available? Do you plan on fixing it? What use does it serve as it is now?
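
    On point 1 above, a short sketch of why the example search URLs contain %2509 and %2520: the literal text “%09” has to be percent-encoded again when it is placed in a query string. The helper names are made up for illustration, and the second function is only one possible way a site could clean such URLs in its own logs or links.

```python
import re
from urllib.parse import quote


def search_url(query: str) -> str:
    """Build a search URL, percent-encoding every reserved character."""
    return "http://www.google.com/search?q=" + quote(query, safe="")


def strip_leading_whitespace_escapes(raw_url: str) -> str:
    """Remove leading %09 / %20 escapes (or raw whitespace) from a URL."""
    # Only the leading run is removed; escapes later in the URL are kept.
    return re.sub(r"^(?:%09|%20|\s)+", "", raw_url, flags=re.I)
```

    Here search_url("site:%09www") reproduces the first example URL exactly, because “:” becomes %3A and “%” becomes %25.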

  98. TearingHairOut

    Why does a site subject to a removal request not get dropped from the index permanently? Why put a non-existent page back in the index after 6 months?

  99. TearingHairOut

    Sorry, I meant to say ‘Why does a page subject to a removal request…etc.’

  100. There appear to be changes in the way that site: and inurl: searches work in the last few weeks.

    The site: operator looks like it is now working more like the old inurl: operator used to, and the inurl: operator is simply completely broken as far as I can see.

    That is:

    A site:domain.com search used to return any pages that ended with domain.com at the end of the domain part of the matching URLs. You could add folders like site:domain.com/myfolder to cut the results down, but as far as I am aware you couldn’t do a site:www.domain search and get results, because in that query you had missed the right hand side of the domain URL off the end. The old site: search matched stuff only directly to the left of the first / in the URL, and optionally after it too.

    Now you can make a search like site:www.domain and it will return any URLs that have www.domain anywhere in the URL. The site: search is no longer matching the right hand side of the data, but now matches anywhere within it. It now works like you would expect inurl to work.

    A site:domain.com -inurl:www search should exclude www pages from the results, but now shows www pages that are supplemental. It looks like the inurl operator is broken too – maybe only with its interaction with supplemental results, but it doesn’t work as expected any more. There are several other ways that inurl now fails to return the expected results.

    Oh, and for anyone still reading, searches with hyphens in the query string have been broken for several months, and are still broken now. Try looking for something known to have a hyphen in it, and see that you don’t get many results. Then take the hyphen out and replace it with a space in the search query, and suddenly a massive amount of supplemental results appear.

    Matt, can you comment on how site: and inurl: are actually supposed to work, and how to make best usage of what you now get? The things that I described; are they bugs or features?
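
    The two behaviours described in this comment can be sketched as follows. This models only the commenter's description (suffix match on the host versus substring match anywhere), with the dot-boundary rule added as an assumption; it is not Google's actual implementation of site:.

```python
from urllib.parse import urlsplit


def old_site_match(url: str, term: str) -> bool:
    """Described old behaviour: host must end with the term's host part."""
    host_term, _, path_term = term.partition("/")
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Assumption: the suffix must align on a dot boundary.
    host_ok = host == host_term or host.endswith("." + host_term)
    path_ok = parts.path.startswith("/" + path_term) if path_term else True
    return host_ok and path_ok


def new_site_match(url: str, term: str) -> bool:
    """Described new behaviour: the term may appear anywhere in the URL."""
    return term in url.split("://", 1)[-1]
```

    Under this model, site:www.domain fails the old match for http://www.domain.com/myfolder/page.html but passes the new one, which is the change the comment reports.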

  101. Ed said>>>>Why do Google results often have two consecutive listings (second one indented) for the same site? Been seeing a lot of these where the second listing is just a Contact Us page or a FAQ page.

    Hey Ed, try turning off the javascript in your browser and then take a look at those indented listings…

  103. TearingHairOut

    Something that I’ve referred to before: When will Google provide some information on how it gauges a page under various categories (on page SEO, backlink quality, quality of neighbourhood, etc.)?

    Sitemaps is the ideal location to do this, and it would provide genuinely beneficial information to legit white hat webmasters, and encourage grey and black hat types to change their ways (maybe).

  104. I made my new websites on subdomains off my trusted older domain, they do ok because my older domain vouches for them.

    I would like to graduate them to their own unique domains and off these subdomains but fear that the trust will not be passed via 301 redirect.

    Question: (also see graywolf)

    If I move the sites from subdomains to fresh new registered domains will I be back to square #1 or is this the way to do this correctly?

    Thank You!

  105. Screw it, everyone asked questions that were already asked other places, so I’m gonna too just to make life extremely simple.

    http://labs.google.com/accessible

    What’s the story with this as far as development and rollout are concerned, and can we expect to see at least some degree of benefit applied to accessible, well-coded websites in the future?

    (All mah web standards geeks inna haus say “ALLLLLLLLLT! ALLLLLLLLLLT!”)

  106. Can you comment about Orion algorithm implementation?

    BTW, what do we do with a site still stuck in supplemental results despite updating .htaccess and addressing lingering canonical issues?

    Asking because the site is one about golf that I made for my mother and she wonders why Google hates her…lol!

    Greg

    ps – Love your blog! Keep sending the wife and associated tangential matriarchal figures to China!

    pss – My mother would definitely appreciate an invitation!

    ♣ Do Eric, Sergey and Larry read the comments on this blog – and have they, as well as Google’s Search Quality Engineers, been influenced by these comments to change ANYTHING 😕

    ♣ Concerns were expressed about the unfairness of the Link Popularity standard for newer sites – thus forcing Webmasters to deploy tactics that are frowned upon 🙁 – are new Algos being created that would EVEN the Playing field

    ♣ As expressed before, there is definitely a major flaw in the PR algos – What is being done to make them more stable.

    ♣ Could Chat be added to this Blog (or at the very least http://gabbly.com/www.mattcutts.com/blog) – or even a real-time Web conferencing setup among the blog regulars with Google’s Engineers? There are technologies available now that can do this without any cost to anyone. The age has come where it is not necessary to ONLY go to conferences or expos to talk with the public.

    Let’s Embrace the new advances in Communication!

    ♣ In previous topics, it was stated that the NOFOLLOW would be removed from the URI for trusted posters – when will that be implemented.
    This would please and bring such joy … 😛

  108. TRUST RANK

    What are the suggested steps that you would recommend to achieve this – please avoid any suggestions that could be imitated by spammers.

    Red herrings – has G ever considered this? Include into the algo something that obviously only a spammer would follow – allow the ranking to increase to encourage more spammers, then when they have all taken the bait to the ambush NUKE the lot of them.

  109. Surprised nobody has asked you the most important question Matt.

    Boxers or Briefs?!?

    P.S. Wikipedia says: ‘This phrase is frequently misquoted as being prefaced with “Dammit, Jim…”, but McCoy never used the clause on the television series, presumably since the profanity would not have been permitted by then prevailing standards.’
    http://en.wikipedia.org/wiki/Leonard_McCoy

    P.S. Ditto what Patrick Grote Said

  110. Boxers or Briefs?!?

    I for one would prefer not to know the answer to this…especially if it’s “Commando.”

  111. I guess everyone’s questions can be summarized by this question: how exactly does Google evaluate a site? lol…

    Seriously though, what keyword density would you suggest for a site to have? I heard it’s around 7%.
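
    For reference, keyword density is usually computed as a simple percentage of the words on a page. The sketch below uses one common definition (occurrences of the phrase, times the words in the phrase, divided by total words); other tools count differently, and nothing here implies that any particular percentage helps rankings.

```python
import re


def keyword_density(text: str, phrase: str) -> float:
    """Return the share of words in text taken up by phrase, as a percentage."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    # Count every position where the phrase occurs as a run of whole words.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return 100.0 * hits * n / len(words)
```

    On a 20-word page that mentions “widgets” once, this gives 5%; reaching the 7% figure quoted above would mean roughly one word in fourteen being the keyword.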

  112. Hi Matt,

    Thanks for the Q&A session. I have never posted and am usually just a “lurker” but the time has come that I make my first post.

    Is there any way of figuring out why a site gets continually affected by data pushes, etc? While others are never touched? When a site is clean, no spam and whatnot, it seems like there must be a way to protect a site from such changes, pushes, updates… If I saw it happen to a number of sites under the same keywords I wouldn’t think so much of it. However, when everyone else stays put, and my site gets hammered each time, it makes me wonder what is wrong and how to fix it.

    A recent “push” or something has once again caused turmoil for our home page (so far just home page) and yet it still shows up with the quotes under each keyword phrase we used to appear for. It is situations like this that continually happen and we would like to understand what we are doing wrong that we keep getting caught up in it. This is a 6 year old site that only started getting caught up in these “pushes” or “updates” starting Feb ’05.

    Thanks for reading our posts. You are gonna be busy!

    Regards,

    Kris

    P.S. My apologies ahead of time if I didn’t see the same question asked. My eyes have gone googlie. O.o

  113. Do you like hot sauce? Seriously…..

  114. Do innocent sites get caught up in spam filters?

    Safe travels for the Family!

  115. I realise there are a lot of questions here already, but…

    I’ve been more proactive recently in reporting comment/trackback spammers to their web hosts. I’ve noticed there are some web hosts which don’t reply to such complaints, and have servers which seem to be full of spam sites. In a situation where a web host’s IPs contain a lot of spammy sites like this, would Google consider doing a blanket ban of sites on their IP ranges in order to try to get them to change their ways?

  116. Hi Matt.

    I was asked this question earlier but I didn’t have an answer, so I figured I would ask you. It shouldn’t require an in-depth reply so I -do- expect an answer :p

    When we use a custom 404 error message on Apache by using:
    ErrorDocument 404 http://www.domain.com/404-error-page.php
    The server will then send a 302 status (temporarily moved) to send the visitor to the custom 404 page.

    Would this 302 status risk the site being penalized for duplicate content as a lot of URLS could be sending the user to the same page using a 302?
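
    Background that may help here: Apache issues a redirect when ErrorDocument points at a full URL, so the client never sees the 404 status, whereas a local path such as /404-error-page.php is served with the original 404. The sketch below, with a made-up helper name, checks a directive line for that pattern; it does not answer the duplicate content question itself.

```python
import shlex


def error_document_redirects(directive: str) -> bool:
    """Return True if an ErrorDocument line points at a full (remote) URL."""
    parts = shlex.split(directive)
    if len(parts) < 3 or parts[0].lower() != "errordocument":
        raise ValueError("expected: ErrorDocument <code> <target>")
    target = parts[2]
    # A target with a scheme makes Apache redirect instead of serving the error.
    return target.lower().startswith(("http://", "https://"))
```

    The directive quoted in the comment returns True, while the local form "ErrorDocument 404 /404-error-page.php" returns False.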

  117. When might 301s earn a penalty, if ever? Would the algo pick this up or must it be eyeballed?

    And specifically, might 301s from hundreds of no-longer-wanted subdomains back to the main domain, ie subdomain.domain.com to domain.com ever earn a penalty? Is there any point in those mass 301s being done in smaller batches?

    If a 301, or mass 301s, can earn a penalty from the algo, then wouldn’t it be the easiest of dastardly deeds to bowl out competitors?

  118. Okay, this is wonderful. Just wonderful. But please stop now. 😉

  119. Hi Matt,
    What is the deal with SmartPricing?

    I get the reason for it to exist. But I do not understand things like how one badly smart priced site can reduce the worth of all the other sites in the account. That makes little sense?

    Does the SmartPricing algo actually look at “sales” on the advertiser pages or just some “external…general” factors? If it does look at sales (which would be hard to believe since many of the Adwords advertisers are also Adsense spam) then should SmartPricing not be page wise not account wise?

    A lot of people want some SmartPricing answers from the horse’s mouth. (Check out the Webmaster World debates.) It would be great if you could go into some detail on this subject.

    Thanks.

  120. Were canonical issues fixed to Google’s satisfaction?

    Can we expect any more changes/improvements with how Google handles canonical pages?

  121. Ok, I have finally figured the entire thing out, it is all about helping Google organize the world’s information. 🙂

    (HAHA @ Matt, boy am I glad I am not you!)

  122. Hi Matt,

    Thanks for the Q&A session. I have never posted and am usually just a “lurker” but the time has come that I make my first post.

    I have been having a problem getting a site indexed. Google crawls the site regularly but only the index.html page gets indexed; the rest of the pages are supplemental from 2005. On Thursday, July 20th I started converting the site to PHP and put a 301 redirect to the new index.php page. Google indexed the new index page on July 24, but still did not index any other pages. It’s like Google doesn’t even see the links, much less the sitemap. I am even using Google Sitemaps and it doesn’t seem to help. Is there anything that would make Google leave the page before following the links?

  123. As per the thread on WMW, I’m wondering why a site:mysite search would pick up images of ours that have been hotlinked on people’s blogs, MySpace, etc. especially when the image won’t display on their site due to hotlink blocking.

  124. I would like to hear more about how Google views accessible websites
    Does the algorithm analyse a page’s accessibility? Surely, as Google wants to provide the ‘best’ webpages and experience, it should take these factors into account.

    Cheers

    Webecho

  125. A simple and straight question:

    what happened on June 27?

    Of course you and most readers know what I mean… that day LOTS of sites had huge losses in Google traffic… and that situation still exists.

  126. Matt Said,
    “Okay, this is wonderful. Just wonderful. But please stop now.”

    Not before you post a weather report about young JD including a recent picture. Has Emmy “unplugged” him or chased him out of the house, or something 🙂

  127. In the spirit of SES Latino, I would encourage you to expand upon a recent post by Vanessa Fox regarding tips for international sites – one of your “all but the kitchen sink” guides addressing both language and location issues would be welcome.

    Vanessa mentions using a country specific domain to appear in Google country restricted searches. She also suggests hosting a generic tld in the country of interest. This is a great start. Unfortunately, the waters get muddied rather quickly.

    As a start, consider a “simple” example: I want to market to ALL of the German speaking countries, thus a linguistic group (which may or may not be in the same geographical area).

    1. Does it matter if I use a .de, .at, or .ch domain?
    2. Assume I use a .ch domain as my company is Swiss (or have a .com multilingual site hosted in the US), can I tell Google the language of my content by flagging my content as German rather than French or Italian? Is Google aware of html lang= tags? http “Content-Language” headers? In the case of conflicting http headers and http-equiv meta tags, does the page level meta tag have the final say?
    3. I want to market to each of the major German-speaking areas, tweaking my marketing copy for each local site (.de, .at, .ch, each on the same server in the US….) to use the local language variants to ensure I’m resonating with my audience. I’ll flag my content as language de-de, de-ch etc. Along the lines of using “color” for the US and “color” for the UK (and NOT using fanny in the UK to cite perhaps the most colorful example). Is there a fool-proof way to ensure Google is able to detect such content as localized content, avoiding the risk of triggering some sort of “I plagiarized someone else’s content” duplicate content penalty?

    Thanks in advance!

    P.S. Ditto the question above on CSS. Technically, css dropdown menus are a nice non javascript navigation solution for end users – do search engines see it this way? FUD – it seems that we use them at our own peril…?!
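
    As a way to see the signals asked about in point 2 side by side, here is a small sketch that collects the html lang attribute, the http-equiv Content-Language meta tag, and an HTTP Content-Language header value passed in by the caller. The class and function names are made up, and the sketch deliberately does not pick a winner, since which signal a search engine trusts is exactly the open question.

```python
from html.parser import HTMLParser


class LanguageSignals(HTMLParser):
    """Collect the language declarations found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.html_lang = None
        self.meta_content_language = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)  # names arrive lowercased, values unchanged
        if tag == "html" and self.html_lang is None:
            self.html_lang = attributes.get("lang")
        elif tag == "meta":
            if (attributes.get("http-equiv") or "").lower() == "content-language":
                self.meta_content_language = attributes.get("content")


def language_signals(html, http_header=None):
    """Return all three declarations so conflicts are easy to spot."""
    parser = LanguageSignals()
    parser.feed(html)
    return {
        "http_header": http_header,
        "meta": parser.meta_content_language,
        "html_lang": parser.html_lang,
    }
```

    Running this over each local variant (.de, .at, .ch) would at least confirm that the pages declare de-de, de-at and de-ch consistently across all three places.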

  128. Just to clarify: the second occurrence of “color” above should have been colour.

  129. Finally I stumble across your blog – fantastic stuff. As a longtime SEO freak I feel at home here.

    My question: Are we in the midst of a Google update? One of my web sites – actually the one I included in my comment – had fantastic SERP for all related keywords/phrases for months, and this morning nearly vanished from all keyword related search results, although the site’s pages are still in the directory. This is a common side-effect for many of my sites before an update. Coincidentally I just completed a re-SEO of the site so naturally I get nervous that I botched something. Any news regarding an update, be it minor/major, directory or algorithm?

    Appreciate your time! Keep up the GREAT work.

  130. Can poor quality search results contribute to the decline of a programming language? I started thinking about this when I recently decided to learn Perl and do a small project using it. So many of the results are duplicate pages and / or irrelevant … it’s really frustrating. I find it hard to believe that a language that has been around so long doesn’t have webpages that document the answers to the questions I seek. Sorry. I guess this turned out to be more of a rant than a question …

  131. I’ll try to make this a non-my-site question:

    If you have a site that opens a pop-up virtual tour window when the “explore” link is clicked, is that site penalized?

    The virtual tour looks so much better in its own “clean” window. We’ve done this for five years without any penalty, but now…

    So, I am wondering, what are the rules for (good content) pop-up windows?

  132. Dave (Original)

    RE: “Okay, this is wonderful. Just wonderful. But please stop now.”

    LOL! I think you are being totally ignored on your own blog matt 🙂

    Just post a link to the Google Guidelines, it’s all there IF people would only read & understand them.

  133. Hi Matt

    Sorry I am late to the Party – but would love an explanation on this question.

    What is the score regarding where the root page (e.g. domain.com or http://www.domain.com) appears in a site:domain.com search? Sites which have ranking problems seem to suffer from domain.com not being first in a site:domain.com search, and the two refreshes on 27th June and July seem to have some effect on this positioning.

    For sites that had problems and where domain.com is now top on a site:domain.com search, would this be good news for that site going forward?

    Sorry this is so late to the thread – had a lie in on Saturday morning UK time 🙂

  134. Sitemaps!! Over at the sitemaps forums there is unending anguish and uncertainty about the usage of sitemaps. No one from Google is participating in the forum, which certainly does not help. Any info would be helpful. The main issue seems to be:

    * Adding a sitemap does nothing, and perhaps even hurts, your pagerank. Seems unlikely that using this fine google service would penalize some sites, but there have been numerous well documented accounts which seem to indicate just that. Many people report after deleting their site from sitemaps, their serps came back immediately. What’s the scoop?

  135. Hi Matt,

    I’d be interested to know the kind of numbers involved in contacting webmasters about problems with their sites through the webmaster sitemaps console as per your post http://www.mattcutts.com/blog/notifying-webmasters-of-penalties/

    Is it being put to good use by the sitemaps/spam teams or is it just lip service?

    Thanks,

    Philip

  136. Matt, pertaining to a comments thread on a previous post that I recently added to, could you tell us more about the process of removing spammy sites from the SERPs?

    It was demonstrated to me that finding hidden text can be a computationally expensive procedure, meaning that some forms of spam can only be detected/removed manually. I know sites can be reported for this, but it would be helpful to know what happens after that – what resources does Google have for this activity (UK-specific info would be handy for my particular situation) and what sort of response/timescales should we expect once a site is reported?

    I note above a post from an Italian lady asking a similar question.

    To TedZ re: spam detection tools – I had this link posted in a reply on another thread: http://www.detect-hidden-text.com/

  137. I want to bump up JohnMu’s question:

    *) Why does the link: command show an incorrect result? Yahoo Site Explorer and MSN show a far more accurate result.

    Thanks.

  138. Google seems not to show “new” sites for any “competitive” keywords. Does this help prevent search spam, or is it a clever marketing ploy for Google Adwords? [big grin] We SEOs keep debating this, and some think the “aging delay” is unavoidable, while others think that there are ways to minimize or avoid it.

  139. Thanks for this Q&A Matt!

    I have a question about internal linking and something I read about link churn:
    For usability reasons I think it is a good idea to have random links on a product page that point to other items in the same product category. The links would be randomly generated from a catalogue, but what if these pages had a 1-month cache? That would mean that a part of the internal links on the page change monthly. Would this be considered spam or is it okay to do that?

    Thanks 🙂

  140. “Just post a link to the Google Guidelines, it’s all there IF people would only read & understand them. ”

    Not quite so oh original one 😉 .

    Stuff sometimes gets assumed that should not be.

  141. Matt,

    I have a question that has been burning at me for quite some time. It is site specific, but I think I can post the question in generic enough terms that it has wide appeal.

    My site used to get TONS of traffic from Google. Then on one day just about two years ago, it was cut to less than 5% of what it was. Since then, it has trickled up but is now about as low as ever. During this time, my PageRank for the home page and for various subpages I test has remained steady. My home page is a 6 and subpages are either 4 or 5.

    If I search on a very specific term for something that is a service that only I offer, I am typically on page 5 or lower of the Google results. The same query on the “other guys” has me in the number one position. The first page of the Google results are sites that are linking back to me. Basically most of them are press releases about me. So I would think that should help my “link relevance”.

    I’ve asked Google for specific help in the past and have been directed to the Webmaster guidelines. Their assumption is that I’m breaking some rule. I’ve read the rules over and over and over and if I’m doing anything wrong, I’m baffled. I’ve asked several experts and they can’t see anything I’m violating. All of the things you are supposed to do for a high ranking, I’m doing….legally.

    For the most part my site is VERY content heavy. Sure, there are affiliate ads and AdSense and things to help bring in some money. But in the case of one page in particular, the only ads are AdSense and yet it still is buried in the rankings.

    About a month ago, you gave instructions on how to request to get a site off the “spam blacklist”. I followed those guidelines and have not yet gotten a response. I have used Sitemaps for the past 6-8 months and it hasn’t had any effect on getting more/less of my pages ranked or ranked higher.

    Since it has been two years since the big drop and it has had a very big impact on my site’s earnings, it is a constant headache for me. Remember, the site does very well on the other big guys. It just seems like Google has something against me and for the life of me I can’t figure out what it is.

    So if you have some general insight on this type of issue, I’d love to hear it.

    Foster D. Coburn III
    Unleashed Productions, Inc.

  142. Question: Is there a penalty/filter if I have several web sites that use the same site structure? For example DIV tag names, CSS class names, Tables names.

    The sites are copied from each other, it is just the subject matter is different.

  143. woops there goes your weekend Matt 🙂
    Just send me the google search algorithm and I’ll help you answer your questions.

    😉
    Danny

  144. Thanks for Q & A Matt.

    Can you explain Google’s preference with regard to 301s and 302s?

    Several clean sites (4-8 years old) that sit on Win boxes have the non-www 302 to www. A few weeks back Google started showing different PR and links for the non-www versions. Last Saturday many of these sites saw their www pages dropped and the non-www pages show up instead.

    With a few of my sites I decided to move to Apache and implement 301s. Now I’m seeing some sites that kept the 302s come back. Should I have kept the 302 or is the 301 better in the long haul?
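
    To make the mechanics behind this question concrete, here is a minimal sketch of a permanent (301) redirect from the bare host to the www host. The function name and `www.example.com` are illustrative assumptions, not anything from the commenter’s setup, and the sketch says nothing about which status code Google prefers.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_redirect(url, canonical_host="www.example.com"):
    """Return (301, target_url) if url is on the bare host, else None.

    canonical_host is a placeholder; substitute the preferred hostname.
    """
    parts = urlsplit(url)
    # Derive the bare (non-www) host from the canonical one.
    if canonical_host.startswith("www."):
        bare_host = canonical_host[4:]
    else:
        bare_host = canonical_host
    if parts.hostname == bare_host and bare_host != canonical_host:
        # Keep scheme, path, and query; swap only the host.
        target = urlunsplit(
            (parts.scheme, canonical_host, parts.path or "/",
             parts.query, parts.fragment)
        )
        return 301, target  # 301 = Moved Permanently
    return None


if __name__ == "__main__":
    print(canonical_redirect("http://example.com/page?x=1"))
    print(canonical_redirect("http://www.example.com/"))
```

    A 302 version would differ only in the status code: 302 marks the move as temporary, while 301 marks it as permanent.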

  145. Hi Matt,

    I run an eCommerce site with 10,000+ products for sale. I also list about 6,000 of the same products on eBay with mostly the same descriptive content. Will this cause a dupe penalty? Also VERY interested in your take on supplemental results (causes, corrections, 301’s, queries, etc.) . Will there ever again be hope for the masses of still valid, original content pages that are stuck there?

    Thanks in advance

  146. hi
    could you please explain what big daddy is? it’s always mentioned, but i can never find out wtf it is!!
    cheers man

  147. Is there a Google bias for www. pages and sites versus non-www. sites and pages? Do you hurt yourself by not having www. in your URLs?

  148. Holy guacamole this is a lot of questions. I can talk faster than I can type, so I’m thinking about trying to videotape me trying to answer some of these questions. I uploaded a short test video to Google Video and the status page says that they’re verifying it. If they can verify it in a few hours, then I might try to videotape me giving answers and upload it. If that doesn’t work, I might try YouTube, although they only allow 10 minute clips? I know nothing about video, so if the whole thing fails miserably, I’ll try just writing the answers.

  149. Matt,

    Thanks for doing another grab bag, I have a question about duplicate content on multiple sites.

    Let’s say that an ecommerce website has a multitude of products that are related to a specific industry, but because of their breadth and depth of product they have thought about creating sister websites that list a category of their products in order to better service that niche area. They would also like to keep the products on the main website as well, therefore having the products in two locations and accessible for customers. What duplicate content or search engine issues do you think would arise with this or any website that wants to list the same content on two different websites? As far as I can tell they would get hit with a duplicate content penalty because the content is exactly the same with website formatting differences. Is this true, and how much do you think this penalty would affect both websites? What ways, if any, do you see around this issue, or is it a non-issue?

    Thanks!

    -Zach

  150. Matt – there’s a lot of talk about subdomains these days. Can you tell us if legitimate sites using subdomains are going to be penalized? For instance, About.com uses subdomains as do Google and Yahoo. I hear so much fearmongering about this that I’d really love to know the truth!

  151. How to change domains?? A year ago I changed the domain on a site to a more relevant domain (which was just registered) and bam! All ranking disappeared despite dutifully following guidance and 301’ing all the pages etc. Took 7 months to get my traffic back.

    Just purchased a great content rich site which happens to have an awful domain but am terrified to change it over to a more descriptive shorter name as I really can’t afford to lose all the traffic for another 7 months.

    It is strange that Google has staff on hand to manually allow sites banned for spamming back into the index, but nothing for us law abiding folks who just need to change domains.

    OK – a bit more of a rant than a question I guess…. but guidance on what to do would really be appreciated

  152. Video, cool idea, sounds like a better way to communicate.
    (we can watch you sweat and squirm… 😉 )

  153. Matt if you need to hire another assistant to just answer questions… I’m your man.

  154. Poor Matt’s gonna regret this decision. All those questions.

    Matt, if you can answer all of them in 10 minutes or less and still do it coherently enough so that most people actually understand the answers, you will become an imperial God. I don’t think you’re gonna be able to pull it off though (no offense…there are just too many questions.)

  155. ** Holy guacamole this is a lot of questions. **

    I guess a lot of things went unanswered when you went AWOL for 6 weeks.

    Just playing “catchup” now.

    If you video the answers, I hope someone transcribes at least the highlights back into text.

  156. Kevin Heisler

    Maybe SeekingAlpha can do a transcript of the video since they’ve partnered with Google Finance.

    If not, maybe you can give Dragon Naturally Speaking 9.0 a shot. The NYTimes gave it a big thumbs up.

  157. I would prefer answers in text, because it will be easier to save as reference for future discussions here on his blog, and also on other webmasters forums.

  158. Hi Matt

    Is there such a thing as a Google ‘Trademark Suppression Penalty’? Webmasters will see their website in the Google index but buried in the results for particular keywords. Also, if you type in the name of the URL it will be suppressed to page 3 (or higher) of the results.

    How is it ‘tripped’, and is it possible to come out from under the penalty?

  159. Hi yall….
    Somehow this blog reminds me of the Holy Wall in Israel where people stand all day long praying, hoping that one day there will be an answer.
    Not sure about the wall in Israel but praying to this wall about answers is like talking to a deaf person. You’ll never get any answers and if you do then the answer is like a prophecy from Nostradamus 🙁

    Sorry guys, but I don’t think Matt can help you all 🙁

  160. Hi Matt,

    How about not having a top 10 result that is just a framed page showing the contents of the longstanding #1 result?

    Chances are that a user has already seen the #1 result by the time they hit #5, eh?

  161. ahh matt tell them to take the /ig advertisement off the homepage… where’s the simple google I love?

    yes I use /ig and yes I have one of the most popular modules for it… but knowing that the plain regular nothing but a search box google was there was cool. I liked using that version at work…

  162. Matt, I honestly went through all the messages. I resisted posting cause people here ask very important things and wouldn’t want to add, but this is something that has outraged quite a few.

    One thing that is important about Google and how to make it better….

    When will you be able to trace hardcore spammers like http://www.google.co.uk/search?num=100&hl=en&q=site%3Admost.info&btnG=Search&meta=

    ?

    I mean there are several domains which do this and are hosted at the same ip range which have tens of thousands of pages with just spam.

    And how about all those poor webmasters who have been severely damaged from these guys through what is supposed to be an easily detectable 302 redirecting trick?

    I mean this is something which after all those billions you have yet to detect. It affects us, but it also affects your searchers. And what about the domain auction site selling all these domains behind?

  163. Hello Matt,
    There are a lot of questions and comments about the site: command, and I also have one. What is going on with http://72.14.207.104/ ? I am seeing major drops in numbers for the site: command versus other d/c’s. It seemed to show up on July 27th, with some discussion over at DP. Also, there are different results across some d/c’s for site: command done with and without www, despite the use of 301 redirects in the .htaccess of the sites.
    Is there one d/c, one command, that gives the correct results right now?

  164. Matt,

    I honestly have concerns about anyone that is suddenly free from the harassment of all XX chromosomes, when it’s time to settle in for a pint with his bugs, instead invites all of this complaining and bitching, must miss the XX’s eh? 🙂

    Seriously tho, I have a few questions about some very specific things but nothing I can post publicly, security concerns and all that, sigh…

  165. I meant pint with his BUDS…. horrible placement of the G key 😉

  166. Sorry guys, but I don’t think Matt can help you all

    Sure he can. He’s Omnipotent Lord of All Things Google.

    Although…this does lead to a question that might save you a lot of trouble in the future, Matt:

    Where else could people searching for interactive communication sources with Google go (e.g. other Google engineer blogs, etc.) that would allow them to get answers in addition to this one?

    A list of about 10-15 of those would save you a ton of aggravation and allow users other avenues of communication.

  167. He’s Impotent Lord of All Things Google

    .

    Um, ah, no comment.

  168. Will somebody once and for all put an end to the following 2 questions.. yes MATT, I’m looking at you. This is never ever sorted or decided, and after all this time these two deserve definite answers.

    1. 301 from www to non-www: why does Google make such a mess of this? It’s like all these sites that don’t have the 301 are being penalised. Other search engines don’t mess up on it like Google, do they? Damn, I hate having to make sites all just for Google, it’s ridiculous.

    2. What about the claims that Google is ranking sites with AdSense ads on higher?

  169. Matt,

    Can you tell me where I put my keys? I can’t find them anywhere!

    Jess

  170. Do you think a Visual-block Page Segmentation model, or a variant of it, works to fight link spammers?

    Dave

  171. Dave (Original)

    RE: “LOL! I think you are being totally ignored on your own blog matt”

    Let me re-phrase that. LOL! You ARE being totally ignored on your own blog matt.

  172. I see you already have more questions than you want (I would have cut us off in the single digits! 😉), but I just wanted to throw in another vote favoring the video response.

  173. I guess, Matt went early to bed this evening after he had decided to answer the questions, on Sunday, in writing and not through a video response.

    Stay tuned for the greatest Grabbag Sunday 🙂

  174. Hi Matt, here are a few questions.

    1. Does the sandbox exist, or is what we see a “too many links too fast” penalty, since most webmasters do most of their link building when they launch a site?

    2. If my sites get scraped by spam sites, will I be penalized for duplicate content? Do you have any mechanisms in place that detect which came first?

    3. Can affiliate content hurt my rankings? For example, let’s say I have a blog about purple apples and I add a niche Amazon affiliate shop selling purple apples. If the shop goes supplemental, which is quite natural, can it cause the rest of the site to go supplemental as well and lose its rankings?

  175. Dave (Original)

    RE: “1. Does the sandbox exist, or is what we see a “too many links too fast” penalty, since most webmasters do most of their link building when they launch a site?”

    I’ll have a crack at that one and let Matt ignore, correct, or agree 🙂 IMO, the “sandbox” is a myth (1 of 1000’s) and simply a consequence of Google’s algo. In other words, new sites are analyzed and ranked by the SAME algo as established sites.

  176. Hi Matt,

    What is in Google’s eyes a good linking strategy?

  177. Matt,

    For the past three weeks, I thought my supplemental days were over. There were no supp pages showing and my real indexed pages were growing nearly every day. I was up to about 700 real pages out of a 2000 page site. I was happy. Last night disaster struck, the index reverted back to where it was a month ago… most of my pages are supplemental again.

    In the past you told me there was no penalty on my site, you said it was a matter of PR. We worked hard to gather respected, topical, quality links. We believe we were successful in this endeavor. The pagerank update came and, for the second consecutive time, left our homepage PR untouched. It remains a PR2. This is wrong! How can it be that most of our newer sections have a much higher PR than the homepage? How can it be that the PR of any page in our site which was assigned PR in the February update a) remains unchanged since and b) is ridiculously low? Pages added since February appear to have legitimate PR.

    How can it be that a one year old site, which has followed the rules to a tee, loaded with useful content, has a solid collection of quality inbound links, a) remains a PR2 and b) cannot even get in the Google index… much less rank?

    If this is not an example of something being broken with BD, what is?

    jim

  178. Hi there

    According to several statements, the only solution to get deeper website crawls is to get more high PR links and/or to get deep links to the website’s content.

    Our problem is similar to that from those folks who have problems getting their product pages into the index. We operate a car enthusiasts website that consists of a forum and other content pages like FAQ, links, events, reviews. Everything is _very_ well indexed and retrievable with the right keywords except for the forum topics. Not a single forum topic is retrievable from the Google index today. All the topics are supplemental.
    Of course we’re using a Google sitemap for the topics and I can see in the logfile that Googlebot crawls hundreds of them every day.

    It should be very obvious that it’s impossible to get deep high PR IBLs to every topic-URL. It’s also a fact that we can’t get that many links from high PR sites because we’re operating in a niche and the language of the site isn’t even English.

    It’s just like Google misses the most important parts of our website and indexes just the less important part. This isn’t good for us (we’re using AdSense) nor is it good for the folks searching for advice for their car-related problems nor is it good for Google (they make less money than they otherwise could make).

    So my question is: what else can we do in this situation?

    Cheers

  179. How can I challenge Sergey Brin to a game of table tennis at the Google Party?

  180. I see you’re participating in the duplicate content session in San Jose. Was hoping you might want to explain a bit of Google’s policy regarding duplicate content for the benefit of webmasters who struggle with this issue every day. I’ve read for years that search engines (including Google) would always penalize the newer content rather than the older original content, but have found from experience that this is not always the case and the original in fact may garner the penalty.

    Is there a way webmasters can identify true original work to/for the search engines?

    If a webmaster garners a penalty either for a direct copy of another website’s content or content Google considers too similar… what steps would you recommend for getting reinstated after the duplicate content is removed?

    How close/similar must content be to be considered duplicate? For example, if my website is about green widgets and there are 100 other websites about green widgets, won’t my website just by nature be so similar to the others outlining the details and specifications of our widgets that I would risk being considered a duplicate?

    I could go on and on but I think you get the idea.

    Thanks for your consideration.

  181. Google Opt In
    This means YOU chose to be in its ->index

  182. Could you ask the Image team this: Why are hotlinkers being represented in the Image search results? There’s nothing more insulting to me as a photographer than seeing one of my original photos in a site:mysite search as if it came from a xanga or myspace page. It’s pretty insulting to click on that image and see “Below is the image in its original context on the page” when it’s not even being pulled from that site. And it’s comical because it’s not even visible on that page due to our hotlinking protection.

  183. Matt,
    Where are you! i put my whole weekend on hold. waiting to seo.
    i hope all is well!

  184. Impatient_fanboy

    We are waiting for your response to the July 27th data refresh questions everyone asked!

  185. Hi Matt,
    Lots of questions here – what have you started! I don’t see much of a chance of my question being answered but I’ll give it a go anyway.
    Sitemaps – Is there a way to contact the sitemaps team directly with a specific site related question apart from the Google Sitemaps Group? Looking at many forums and Google Groups, there seem to be a lot of things going haywire at the moment.
    Thanks for your time

  186. Kevin Heisler

    one more reason to test Dragon Naturally Speaking 9.0 for your responses …MSFT has another 4 months to improve their speech recognition module in Vista

    http://googlesystem.blogspot.com/2006/07/vistas-speech-recognition-demo-video.html

  187. Bernie, I’m trying to record the answers on video, so that I can answer more of them than I could answer by typing.

  188. Supplemental Challenged

    ” Mr. DeMille, I’m ready for my close-up.”

  189. Matt, i was just looking for a response, making sure that you’re still with us.
    See you at the dance.

  190. Dave (Original)

    Yeah hurry up Matt. Forget your kids, wife, family, free time and week-end etc and hop to it buddy 🙂

  191. Matt,

    In a 2004 interview by Mike Grehan (http://www.e-marketing-news.co.uk/april_2004.html), Yahoo! Search Manager Jon Glick was asked about using the Keyword Meta Tag on web pages to let Yahoo!’s search engine know that a page is a candidate for possible misspellings:

    Mike Grehan:

    “So, the advice would be to use the meta keywords tag, as we used to do back in the old days, for synonyms and misspellings…”

    Jon Glick:

    “Yeah. So this is a great chance if you’re Nordstrom for example. Many people type in Nordstroms with an ‘s’ it’s a very common misspelling. You don’t want that kind of typo on your body text when you’re trying to promote your brand. But putting that misspelling in the meta keywords tag is very acceptable and also encouraged. It’s actually letting us know, hey by the way, we are a candidate page for that query.”

    As you may imagine, this statement continues to be referenced today as a “best practice.” I would like to know whether or not this use of the Keyword Meta Tag is also valid at Google or if this usage could create a potential conflict.

  192. Okay, that’s a Friday and a weekend for people to ask questions. I’m going to go ahead and close this thread to comments so that people know to stop.
