Q & A thread: March 27, 2006

Okay, let’s try tackling a few questions from the Grab bag thread. Just a hint for next time: if your question takes three paragraphs to ask, your odds of getting an answer go down. 🙂

Q: “Is Bigdaddy fully deployed?”
A: Yes, I believe every data center now has the Bigdaddy upgrade in software infrastructure, as of this weekend.

Q: “What’s the story on the Mozilla Googlebot? Is that what Bigdaddy sends out?”
A: Yes, I believe so. You will probably see less crawling by the older Googlebot, which has a User-Agent of “Googlebot/2.1 (+http://www.google.com/bot.html)”. I believe crawling from the Bigdaddy infrastructure has a new User-Agent, which is “Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)”
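
If you want to see which crawler is hitting your pages, here is a minimal sketch that sorts User-Agent headers from your logs into the two strings quoted above. The function name and the labels "legacy", "bigdaddy", and "other" are made up for illustration. User-Agent headers are trivially spoofed, so treat the result as a hint, not proof.

```python
# The two User-Agent strings quoted in the answer above.
OLD_GOOGLEBOT = "Googlebot/2.1 (+http://www.google.com/bot.html)"
NEW_GOOGLEBOT = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)


def classify_googlebot(user_agent: str) -> str:
    """Return 'bigdaddy', 'legacy', or 'other' for a User-Agent header.

    Hypothetical helper: the labels are ours, not Google's.
    """
    ua = user_agent.strip()
    if "Googlebot/" not in ua:
        return "other"
    # The newer crawler identifies itself with a Mozilla/5.0 prefix.
    if ua.startswith("Mozilla/5.0"):
        return "bigdaddy"
    return "legacy"
```

Running this over the User-Agent column of an access log gives a rough count of how much crawling comes from each Googlebot.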

Q: “Do you take Emmy with you to San Francisco?”
A: Nope, Emmy is a true indoors cat; she doesn’t like to travel.

Q: “Any new word on sites that were showing more supplemental results?”
A: An additional crawling change to show more pages from those sites was checked in late last week, but it may still take a little bit of time (another few days) for that to show up in the index. I’ll keep an eye on sites that people have given as examples to see how those sites are showing up.

Q: “Is the RK parameter turned off, or should we expect to see it again?”
A: I wouldn’t expect to see the RK parameter have a non-zero value again.

Q: “What’s an RK parameter?”
A: It’s a parameter that you could see in a Google toolbar query. Some people outside of Google had speculated that it was live PageRank, that PageRank differed between Bigdaddy and the older infrastructure, etc.

Q: “Now that Bigdaddy is out, will there be a new export of PageRank anytime soon?” and “Will the deployment of BigDaddy stabilise the rolling PR issues we are experiencing at present?”
A: I’ll ask around about that. If there aren’t any logistical obstacles, I’ll ask if we could make a new set of PageRanks visible within the next couple weeks. I’d expect that as Bigdaddy stabilizes everywhere, the variation in toolbar PR for individual urls is more likely to settle down too.

Q: “This datacentre http://64.233.185.104/ works differently to all of the others. Noticed just a few hours ago. . . . . Where does that DC fit into the scheme of things? Is it mainly made from newly spidered data?”
A: Sharp eyes, g1smd. That wouldn’t surprise me. As Bigdaddy cools down, that frees us up to do new/other things.

Q: “Not so much a question… GET A PSP!”
A: I got one today, TallTroll. I picked up Me and My Katamari (MAMK) and a PSP that turned out to have firmware v1.52 on it. So I could upgrade to 2.0, then downgrade to 1.5 so I could run homebrew programs. But I think MAMK requires firmware 2.5 or 2.6 to play, which means a one-way upgrade or maybe using RunUMD or a similar program. Suffice it to say I’m having fun just geeking around. 🙂

Q: “Can you give us a general way of getting a good idea in front of Google?”
A: If it’s bizdev, there’s a bizdev dept. at Google you could contact. If it’s not a business/patent/proprietary idea, I’d mention it here or blog about it somewhere. Writing a snail mail letter could work well too.

Q: “Did you check out the guys all painted in silver doing the robot on milk crates in San Fran?”
A: Nope, that’s down by Fisherman’s Wharf. We’re hanging near Union Square.

Q: “Why do you focus your attention so much on SEOs and not at webmasters who make actual quality websites?”
A: I think that’s an issue I have personally, because I spend so much of my time looking at spam. Lots of other people focus on helping general webmasters, like the Sitemaps team, for example. I have started to do “SEO Advice” posts instead of just “SEO Mistakes” posts, but you’re right: I personally could use a reminder to keep focusing on the sites that make quality content and how to pull those sites up, not just how to counter sites that cheat. Thanks for bringing that up.

Q: “My sitemap has about 1350 urls in it. . . . . its been around for 2+ years, but I cannot seem to get all the pages indexed. Am I missing something here?”
A: One of the classic signals that Google has used in deciding what to crawl is the amount of PageRank on your pages. So just because your site has been around for a couple of years (or because you submit a sitemap), that doesn’t mean that we’ll automatically crawl every page on your site. In general, getting good quality links would probably help us know to crawl your site more deeply. You might also want to look at the remaining unindexed urls: do they have a ton of parameters (we typically prefer urls with 1-2 parameters)? Is there a robots.txt? Is it possible to reach the unindexed urls easily by following static text links (no Flash, JavaScript, AJAX, cookies, frames, etc. in the way)? That’s what I would recommend looking at.
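
One quick way to act on the "1-2 parameters" hint is to count the query parameters on the urls that are not getting indexed. This is only a sketch: the function names are hypothetical, the default limit of 2 comes from the answer above, and the example urls are invented.

```python
from urllib.parse import parse_qsl, urlsplit


def count_query_params(url: str) -> int:
    """Count the key=value pairs in the query string of a url."""
    query = urlsplit(url).query
    # keep_blank_values=True so that "?a=&b=" still counts as two parameters.
    return len(parse_qsl(query, keep_blank_values=True))


def urls_with_many_params(urls, limit=2):
    """Return the urls that carry more than `limit` query parameters."""
    return [u for u in urls if count_query_params(u) > limit]


# Invented example urls, for illustration only.
unindexed = [
    "http://example.com/widgets/blue.html",
    "http://example.com/item?id=7&cat=3",
    "http://example.com/item?id=7&cat=3&sess=abc&sort=asc",
]
suspects = urls_with_many_params(unindexed)
```

Here `suspects` holds only the last url, the one with four parameters. Urls flagged this way are candidates for a shorter, static-looking form.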

Q: “When I change a robots.txt to exclude more existing files from being crawled, how long does it take for them to be removed from the index? Perhaps the answer is a function of how often the site is crawled and its PR?”
A: It is a function of how often the site is crawled. I believe in the past that every several hundred page fetches or several days, the bot would re-check the robots.txt. Note that for supplemental results, you need recrawling to happen by the supplemental Googlebot in order for the robots.txt file to take effect on those pages. If you’re really sure you never want those pages to be seen, you can use our url removal tool to remove urls for six months at a time. But I’d be very careful with the url removal tool unless you’re an expert. If you make a mistake and (for example) remove your entire site, that’s your responsibility. Google can sometimes clear out self-removals, but we don’t guarantee it.
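
Before waiting on a recrawl, it is worth confirming that the new robots.txt blocks what you think it blocks. Here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt rules and urls are invented, and the standard-library parser is only an approximation of how any particular crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# Invented robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""


def blocked_urls(robots_txt: str, urls, agent: str = "Googlebot"):
    """Return the urls that the given robots.txt disallows for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(agent, u)]


candidates = [
    "http://example.com/index.html",
    "http://example.com/private/report.html",
    "http://example.com/tmp/scratch.html",
]
blocked = blocked_urls(ROBOTS_TXT, candidates)
```

With these rules, `blocked` contains the two urls under /private/ and /tmp/, while the home page stays fetchable.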

Q: “I would love to be able to search for html code and see how that ranks.”
A: I would like that too. Indexing non-visible things like punctuation, JavaScript, and HTML would be great, but it would also bulk up the size of the index. Any time you’re considering a new feature (e.g. our numrange search), you have to trade off how much the index would get bigger versus the utility of the feature. My guess is that we wouldn’t offer this any time soon.

Q: “Seriously, How do you plan on picking which of these questions to answer?”
A: I’m tackling the ones that looked interesting, short, and general enough that more than one person would be interested.

Q: “I am seeing a lot of sites with “%09” (tab) and “%20” (space) in front of the URL in Google’s index.”
A: I’ll ask someone about that.
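
While that gets looked into, one thing a webmaster can check on their own side is whether any links carry stray whitespace in front of the url. That is only a guess at one possible cause, not a confirmed explanation. The sketch below strips leading tabs, spaces, and newlines, in literal or percent-encoded form, without touching escapes later in the url. The function name is hypothetical.

```python
import re

# Leading literal whitespace or its percent-encoded form:
# %09 tab, %20 space, %0A line feed, %0D carriage return.
_LEADING_WS = re.compile(r"^(?:%(?:09|20|0a|0d)|\s)+", re.IGNORECASE)


def strip_leading_whitespace_escapes(href: str) -> str:
    """Remove whitespace, literal or percent-encoded, from the front of a url."""
    return _LEADING_WS.sub("", href)


cleaned = strip_leading_whitespace_escapes("%09http://example.com/a%20b")
```

Only the prefix is removed, so `cleaned` is `http://example.com/a%20b`, with the encoded space inside the path left alone.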

Q: (paraphrasing) The sitemaps validation fetch seems to happen with a User-Agent of “-“? My auto-reject rules reject that user agent.
A: I’ll ask someone about that. You could whitelist the IP range that Googlebot comes from in the mean time.
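
A minimal sketch of what such a whitelist could look like in an auto-reject rule. The network below is an assumption for illustration only: it was picked because it contains the crawl-66-249-65-225.googlebot.com address a commenter reports further down, and it is not an official or complete list of Googlebot addresses. The function names are hypothetical.

```python
import ipaddress

# ASSUMPTION: placeholder range for illustration, not an official list.
ALLOWED_NETWORKS = [ipaddress.ip_network("66.249.64.0/19")]


def is_allowlisted(remote_ip: str, networks=ALLOWED_NETWORKS) -> bool:
    """Return True if remote_ip falls inside one of the allowed networks."""
    try:
        addr = ipaddress.ip_address(remote_ip)
    except ValueError:
        return False  # not a valid IP address at all
    return any(addr in net for net in networks)


def should_reject(remote_ip: str, user_agent: str) -> bool:
    """Reject empty or "-" user agents unless the IP is allowlisted."""
    suspicious = user_agent.strip() in ("", "-")
    return suspicious and not is_allowlisted(remote_ip)
```

The rule still rejects a "-" user agent from unknown addresses, but lets the same request through when it comes from the allowed range.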

Q: “If one were to offer to sell space on their site (or consider purchasing it on another), would it be a good idea to offer to add a NOFOLLOW tag so to generate the traffic from the advertisement, but not have the appearence of artificial PR manipulation through purchasing of links?”
A: Yes, if you sell links, you should mark them with the nofollow tag. Not doing so can affect your reputation in Google.
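
Strictly speaking, nofollow is a value of the `rel` attribute on a link rather than a tag of its own, e.g. `<a href="..." rel="nofollow">`. As a rough sketch, the standard-library parser below lists the links in a chunk of HTML that lack it, so a page's sold links can be audited. The class and function names are hypothetical and the sample markup is invented.

```python
from html.parser import HTMLParser


class NofollowAudit(HTMLParser):
    """Collect the href of every <a> tag that lacks rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        # rel can hold several space-separated tokens, e.g. "nofollow external".
        rel_tokens = (attr.get("rel") or "").lower().split()
        if href and "nofollow" not in rel_tokens:
            self.missing.append(href)


def links_missing_nofollow(html: str):
    """Return hrefs of links in `html` that are not marked nofollow."""
    audit = NofollowAudit()
    audit.feed(html)
    audit.close()
    return audit.missing


# Invented markup, for illustration only.
sample = (
    '<a href="http://sponsor.example/" rel="nofollow">Sponsor</a> '
    '<a href="http://other.example/">Other</a>'
)
unmarked = links_missing_nofollow(sample)
```

Here `unmarked` contains only `http://other.example/`. The script cannot tell which links were paid for, so the output is a list to review by hand.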

Q: “On sites directed to international audiences with the same (high quality) content in several languages is it better to do several TLDs like mydomain.com, mydomain.de, mydomain.fr, mydomain.eu and so on or do subdomains like en.mydomain.eu, de.mydomain.eu, fr.mydomain.eu or something else like mydomain.com/en, mydomain.com/de, mydomain.com/fr?”
A: Good question. If you’ve only got a small number of pages, I might start out with subdomains, e.g. de.mydomain.eu or de.mydomain.com. Once you develop a substantial presence or number of pages in each language, that’s where it often makes sense to start developing separate domains.

Q: “Any results on why IDN Domains don’t show pagerank?”
A: I’ve seen a couple that do, but I’ll check into why most don’t. My guess is that there’s a normalization issue somewhere in the toolbar PageRank pathway.
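
For readers unfamiliar with what "normalization" means for an IDN: the same domain has a Unicode form and an ASCII-compatible ("xn--") form, and two systems only agree if they convert to the same one before comparing. The sketch below shows that conversion with Python's built-in `idna` codec. It illustrates the general idea only and says nothing about how the toolbar handles it. The function names are hypothetical.

```python
def to_ascii_hostname(hostname: str) -> str:
    """Convert a hostname to its ASCII-compatible (punycode) form."""
    return hostname.encode("idna").decode("ascii")


def same_host(a: str, b: str) -> bool:
    """Compare two hostnames after normalizing both to the ASCII form."""
    return to_ascii_hostname(a) == to_ascii_hostname(b)


ascii_form = to_ascii_hostname("bücher.de")
```

`ascii_form` is `xn--bcher-kva.de`, and `same_host("bücher.de", "xn--bcher-kva.de")` is true. A lookup keyed on one form would miss data stored under the other unless both sides normalize first.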

Q: “Would it be possible to add a date range to queries? I might get 91,000,000 results, but the first 200 are 2-3 years old. I would like to limit results to items no more than 6-12 months old.”
A: Check out our advanced search page for this option. Tara Calishain also did some really interesting digging into this, e.g. this info she uncovered. Google Hacks is a pretty solid book if you’d like to read more fun Google hacks.

Q: “What about the problem of directories and shopping comparison spam overriding real pages?”
A: Fair feedback. I heard that recently from a Googler, too. Sometimes we think of spam as strictly things like hidden text, cloaking, etc. But users think of spam as noise: things that they don’t want. If they’re trying to get information, fix a problem, read reviews, etc., then sites like that aren’t as helpful.

Q: “Are you planning to visit/speak in the UK at all in the near future?”
A: Sadly not. I’m hitting the Boston Pubcon and SES San Jose, but I can only do 4-5 conferences a year.

Q: “The one thing that seems to be getting to people generally, is what are the post Big Daddy intentions? Fixes, spam issues, regeneration of ‘pure’ indices, supp. issues, PR and BL update, etc.”
A: I can’t give a timeline (e.g. “scaling up communication in April, more work on canonicalization in May”) because priorities can change, esp. depending on machine issues, deployments of new binaries, webspam developments, etc. Short-term, I wouldn’t be surprised to see some refreshing in supplemental results relatively soon, and potentially different PageRanks visible in the next couple weeks.

Q: “Even Matt is afraid to use a redirect from www.mattcutts.com/ to www.mattcutts.com/blog/ because Google might penalize his website and put it into supplemental hell.”
A: Heh. No, that’s not it. I’m deliberately leaving them separate as a test case to see how we do now and down the road.

Q: “Just like you told me a couple of months ago, the Supplemental Googlebot (SG) got around to my site and things got sorted out. Thanks. . . . . If you are in San Fran and want to check out the Monterey Aquarium, could you please write a short review? I’ve been thinking of visiting and wondering if it is worth the trip.”
A: I would definitely recommend the Monterey Bay Aquarium, especially if you can find a coupon or other good deal. I highly recommend the otters, the kelp forest, and the jellyfish area.

110 Responses to Q & A thread: March 27, 2006

  1. Hey Matt,

    That is some pretty impressive posting 😀

    I have noted that a couple of sites that I believe had canonical probs have come back – but only sites that have been sent to your engineers.

    Not sure if this is a coincidence or that a correction is starting to roll out. If it is a correction then cool 😀 – will it hit some sites before others – depending on crawl cycle etc ? – If it is an engineer’s intervention then when would you want reports of these ?

    Cheers.

  2. Oops – just to clarify what I would call a correction for these sites.

    EG: Site:domain.com – domain.com is first.

    domain.com as a phrase – domain.com is first

    etc – eg the Homepage returns to its true value – rest of the site seems to follow 😀

  3. Matt, some great answers there, thanks.

    This will help put to bed some of the crap that floats around about the Google mystique LOL.

    I know that the Supplemental hell and the Lack of deep crawling are especially important to some people 🙂

  4. I’m pretty sure that MAMK only requires firmware 2.0 to run, so you should be able to go back and forth as required. You need 2.0 for the browser though – depends how much surfing you want to do. AFAIK, the only game that requires 2.5 is EXIT, so you should be able to wait until a downgrade from the 2+ f/wares is available before going there.

    I find that Soulseek, a USB cable and a PSP is a memory hungry combo though…. need to get a 2Gb card soon 😉

  5. Matt, that was a fair amount of time spent on writing answers this night. Thanks.
    Apart from addressing supps, canonicals, pagerank re-calculation etc, will there be an imminent change in ranks as a result of these corrections?

  6. Hi Matt,

    As part of your review of the supplemental problem, are you also monitoring any sites whose pages have simply vanished (rather than gone supplemental)? I think the BD bug is responsible for both types of errant behaviour – sometimes it just refuses to index tens of thousands of pages, despite crawling them over and over again. That’s what we see anyway. None of the supplemental tweaks have yet made any difference to the missing pages problem.

  7. Well well, you can answer questions about Google and SEO very well, but you didn’t answer my “why are there no blue foods in nature” question?! I shalln’t be picking you as my phone-a-friend on Millionaire any time soon Mr Cutts… well, unless they start asking SERP questions in the next few shows! )

    P.S. Saw a mobile dog-grooming van drive past our office the other day, called “Mutt Cutts” – I had a little chuckle.

  8. Cheers for all these answers..

    I do have one question though, with so many different sources of Pagerank, Live Pagerank, future Pagerank etc. What would you suggest we use to see an accurate measurement?

  9. Please answer this:

    From: http://www.mattcutts.com/blog/miscellaneous-monday-march-27-2006/#comment-19408

    «For accessibility purposes, my site has ‘skip navigation’ etc… to allow screen readers to get straight to the content. [..] so I have ‘hidden’ these accessibility links using display:none in the stylesheet. [..] Will Google regard this as hidden text and penalise my site?»

  10. On TLDs and international audiences: When a site is in one language how should it be expressed to Google that it is for a global audience?

    For example restaurant reviews and shopping could be seen as local and localised respectively; but product reviews (where the product is available globally), encyclopaedia entries and reference material are more for a global audience.

    There are suggestions the site be duplicated at the various TLDs e.g. .com, .co.uk, .ca, .au, etc. But this wastes bandwidth for the site and the google bots, encourages link splitting and can confuse the users.

    The geo of the IP doesn’t always work as for example 1and1.co.uk gives out German IP addresses, and many other websites use US hosting for cheaper costs.

    Just wondering for a clarification on how this issue should be tackled as the various Google SERPs are becoming more and more local even if the user is not requesting pages only from their country (google.com vs. google.co.uk or even it seems google.com used from a US ip vs. google.com used from a UK ip).

    P.S. Keep up the good work!

  11. Matt thank you for taking the time to answer all these questions. What you are doing here says a lot about your character and commitment to the webmaster community.

    I didn’t get to ask a question but let me try now. If I agree to buy you Starbucks every morning could you place my website at the top of the results 🙂 Since my new site isn’t ranked yet, all I can afford is one cup per morning 😉

  12. Thanks for your time, Matt.

    Very generous of you. Much appreciated.

  13. Very disappointed no comments on expired domains.
    Looks like we will continue to see domains such as
    macalstr.edu/
    astronomy-national-public-observatory.org/
    rarestonemuseum.com
    iasicongress2005.org
    papyrusinternational.org/
    and many others in the adult serps.

    Seems like its all too hard for the webspam team and this reflects badly both on google and the adult internet industry.

  14. So how long does it take for 301’s to take effect across all the DC’s? Even Y*hoo and M*N don’t seem to have a problem with it. 🙂

  15. Eternal Optimist

    Matt, Firstly many thanks for both your time and efforts. I appreciate that you cannot be specific on certain points, due to the nature of privacy at Google.

    Is it within your power to explain exactly what the following GoogleBots do? [You already answered 5.0 above ] – thanks 🙂

    crawl-66-249-65-225.googlebot.com
    Mozilla/4.0 compatible ZyBorg/1.0
    Mozilla/4.0 compatible ZyBorg/1.0 Dead Link Checker
    Mozilla/5.0
    Googlebot/2.1

  16. Thanks for answering these questions! Great information.

    The URL Removal Tool has been broken for weeks. For example I’ve tried to remove directory.sysice.com from the index cause I took it down a few months ago, but I just get a Page Not Found when I try to submit it.

  17. 301 Redirect Problem

    The biggest problem that I’ve seen many worry about here and that google is way behind in addressing is 301 redirects with domain moves from domain1 to domain2 and Matt seems to forever be ignoring this question .. Even though it was asked about more than 3 to 4 times in the list of questions here and in many other comment posts by viewers Matt and google continue to ignore it or give vague answers about how or when google plans to address this..

    Matt can you please once and for all address the question and webmasters concerns of how and when we can expect to see googles / bigdaddy properly handle domain name moves using 301 redirects?

  18. One comment that you may not publish but I hope will read… WHAT is going on at blogger? It is google’s worst product by a country mile. Regularly unreliable and I can’t recall a single new feature that has been added since you brought it on board. It is dreadful and if I hadn’t been unfortunate to *start* using it I wouldn’t still be using it. I try and warn everyone away and it makes me sad 🙁

  19. Thank you Matt for taking the time to answer questions or even to look into the IDN Domain issue with the pagerank. These domains will truly advance the international internet experience.

  20. Hi Matt,

    Great effort answering so many questions, thank you.

    One thing I’m still curious about (so are many others):
    [blockquote]A: Yes, if you sell links, you should mark them with the nofollow tag. Not doing so can affect your reputation in Google.[/blockquote]
    Does this include linked images?

  21. Damn…. if there are 2 choices I always make the wrong one – lol – sorry about the

    blown tags

    🙁

  22. Hi,

    If BD is out now then how comes SERP’s are showing pages that haven’t existed for 9 months plus and return 404’s?

    Cheers,

    K

  23. Good job in answering so many questions, and I know you can’t answer every single one. But, it’s a shame you didn’t answer one of the most popular questions, about the loss of pages. Did you not want to answer it, or did you just miss it?

    Thanks

  24. No answer……………
    It has been three months since spam has taken over the majority of adult search results in Google.

    It’s strange to see “somewhat” relevant results one day Dec 26th then Dec 27th just about the whole adult white hat community was wiped out, filter maybe?.

    I believe that the adult serp problem is bigger than the supplementals – I just hope that it’s not being ignored.

    What I am saying here applies to the entire adult industry in Google not just my little ole site.

  25. Mike (Germany)

    ========
    Q: “Now that Bigdaddy is out, will there be a new export of PageRank anytime soon?” and “Will the deployment of BigDaddy stabilise the rolling PR issues we are experiencing at present?”
    A: I’ll ask around about that. If there aren’t any logistical obstacles, I’ll ask if we could make a new set of PageRanks visible within the next couple weeks. I’d expect that as Bigdaddy stabilizes everywhere, the variation in toolbar PR for individual urls is more likely to settle down too.
    ========

    Hi Matt,

    I think it would be better if the PageRank were not visible in the toolbar.

  26. Hi, can you post some photos of Emmy? We are cat lovers.

  27. I have reported several sites that use different spamming techniques. But nothing happens. For example look at this site http://www.kickoff-konferens.se/rw/ and go to the bottom of the site. They mention Mirror1, mirror2, mirror3 and mirror4. Why doesn’t Google exclude them? It feels like it’s ok to spam in Sweden and get top positions..

    // Not so fun being a white hat SEO in Sweden.

  28. and this reflects badly both on google and the adult internet industry.

    Ahh.. that’s why so many people say bad things about porn… expired domains. Here I was thinking it was some sort of morals issue.

    Mike, I agree with that.. Take visible pagerank out of everything. People put way more faith and dependance in it than they should, and it’s still easy to fake.

    Give some site a PR higher than 4 and they instantly think they’re worth millions and have hit the big time.

  29. >You could whitelist the IP range that Googlebot comes from in the mean time.
    Do you have a listing of all the Googlebot IP addresses?
    Thanks.

  30. Q: “What about the problem of directories and shopping comparison spam overriding real pages?”

    A: Fair feedback. I heard that recently from a Googler, too. Sometimes we think of spam as strictly things like hidden text, cloaking, etc. But users think of spam as noise: things that they don’t want. If they’re trying to get information, fix a problem, read reviews, etc., then sites like that aren’t as helpful.

    To balance that feedback: We maintain a niche B2B directory and customer feedback and high listing CTR seems to indicate that a large number of visitors are indeed looking to “buy” products when they type in a product keyword and the directory is indeed relevant.

    Google has to make an educated/algorithmic guess about the searchers intent (Information or Purchase). If an action keyword complimenting the product keyword is not specified in a search, the type of product itself can be used to yield a decent intent relevancy.

    SERPs should not be flooded with directories, but there is always bound to be more -ve feedback on directories, since there are a lot more individual site webmasters than there are directories!

  31. I have a question about one of your answers.

    You responded to a question about NOFOLLOW tags with:

    A: Yes, if you sell links, you should mark them with the nofollow tag. Not doing so can affect your reputation in Google.

    I was wondering if you could clarify something for me. I’m setting a website for my sailboat and I have the opportunity to join an affiliate program that places banners on my site.
    Is Google going to see my joining an affiliate program as selling space on my site? I would like some advice before I put my site live.

    Thanks for the great blog and information, I know you’ve helped me from making countless mistakes!

    -Rob

  32. Hi Matt

    Wow I can say I’ve been pretty close on what Googles been up to…even something so simple as robots.txt usage…amazing how many sites do not have one or one that is validated.

    Let me ask you a question…besides those selling links and not using nofollow tags getting filters…care to explain to everyone how artificial link building is what results in their being penalized or what they think is ‘sandboxed’…?? You would do Google and yourself a world of good…

    Also I like the SEO Focus as opposed to a single website as SEOs can reach many more websites than just one…Techniques ideas that work are always good though from either viewpoint.

    Thank you

    Clint

  33. Thanks for taking the time to answer the questions Matt! I greatly appreciate the time and energy it takes you to address all the questions that you did. I’m sure that all of us in the SEO Community feel the same way. 🙂

  34. Ahh.. that’s why so many people say bad things about porn… expired domains. Here I was thinking it was some sort of morals issue.

    Silly rabbit. What would make you think that?

    While we’re at it, the bad reputation associated with porn has nothing to do with the underage actresses, the B-grade production values, or the lack of frills such as plot and content.

  35. By the way, Matt, I thought of another question (not counting this one) that would relate to a lot of people.

    Should we keep asking questions now, or wait until the next grab bag thread?

  36. Aquariums are extremely cool, my son goes nuts when we visit them and so do I! 🙂

    Thanks Matt, a little vague on the RK thinger but I am sure PR obsessed Jim W. will figure that out, right Jim? 😉

  37. [Translated from Dutch:] The site uses third-party articles and places them, via short (RSS) feeds, on thousands of pages so that it is found for a variety of keywords.

    Matt, thanks for your time to answer all those questions.

    I have been a criticaster and will be, please don’t shoot me for that. 🙂

    Regarding a question about RSS-feeds that seem to have become more of value and are even placed above regular and content-rich sites, I would like to ask a supplementary question:

    More and more RSS is being used to post feeds on other websites regarding articles that were scraped off other sites and placed on the website that is sending feeds for them.

    I see websites shooting up in the rankings for specific keywords, that were nowhere before they scraped other sites and posted them in unbelievable quantities of RSS-feeds.

    I don’t think this is an ethical way to promote a website but don’t see any action taken by Google.

    Could you comment on this one.

    Thanks in advance,

    Tonnie

  38. Sorry for the Dutch that slipped in. 🙂

  39. Using PR as piece to decide how deep a website is being crawled and indexed and saying that in public just forces everyone to continue the quest for artificially getting incoming high PR backlinks.

    Not every high quality website receives many backlinks to get a high PR. Nor should a webmaster be forced to ask for backlinks or buy links to increase Page Rank. Isn’t there a better way for you guys to determine how deep a website will be crawled and indexed?

    Anyway – thanks for taking the time to answer so many questions.

    Christoph

  40. Hi Matt

    When are you going to make Google handle non English languages better ? For example, to make Google stop considering the usual stop words (in Greek) and the plurals as completely different words?

    By the way, keep on providing us with useful information.

  41. Lots of good information Matt, however, I hope the Wikipedia problem with the + and ” ” is fixed soon.

  42. Thank you Matt,

    I hate those directory sites and your post really makes it feel like someone at Google is listening. Your time on these posts is very much appreciated. It’s like we’re not flying blind anymore.

  43. Phrase: Dog Gift Basket
    Position #1: Directory
    Position #2: Same Directory
    Position #3: Regular Site
    Position #4: Affiliate of above “regular site” Boo!!!

    See the trend? If you are below those and have a real product you lose, those above sites appear to be very official and suck in the limited business.

    There is some truth to this pet conspiracy and seeing “SEO’s” in forums with their signature links to their gift basket affiliates makes me sick! Weee, let’s all sell the same freakin’ product, lame!

    Damn but here I go again, did Matt have the patience to read my rambling? It’s got some mad truth in it dudes.

    Word!

  44. Thanks Matt,

    I still don’t know why you fight Spam with one hand and monetize it with the other. Do the sales people even speak to the quality control people? It almost seems like the Spam fighting is being funded by the profits you earn from the Spam. This is why I don’t believe Google is serious about the problem.

    Why not fight Spam by not monetizing it in the first place?

    It seems simple enough to me.

    • Have a person look at the websites before you allow them to place Google ads.
    • Make reporting Spam easy and actively encourage it.
    • Have a person with a brain evaluate the Spam reports.
    • Remove offending websites by hand if necessary.
    • Impose legal and financial penalties for people who violate the terms and conditions
    • Delay payments by 30 days or so like some of the major affiliate programs. If a Spammer is caught you can deny up to 30 days of his Spam revenue.
    • Prevent the spammer from ever signing up again by blacklisting his SS# and Tax ID# for the rest of the guy’s life

    Or do I just not understand the problem?

  45. “Hawaii SEO” sounds so hot and sexy, anyone got that URL yet? Nope, you better grab it dude! 🙂

  46. Hi Matt,

    That was a fantastic set of Q&A. Bravo.

  47. BTW, I talked to someone from Sitemaps, and they’re working on a more descriptive user-agent for the next release.

    Also, welcome Memeorandum-ers. 🙂

  48. Hi Matt,

    I was curious what Google was planning on doing about these ‘link vaultage’ sites. These link vaultage sites basically require you to run some code on specific pages on your site that render static text links on your page. These links look like natural text links so there’s nothing that really gives it away that they’re automated through these link vaultage sites. In return, other sites enlisted into the same program post text links pointing to your site. They’re not reciprocal links, but just a very quick way to get many high PR inbound links right away.

    This is obviously unfair to those webmasters that actually email some other webmasters to be added to their link page or some other more manual process of adding a text link on a specific web page. How is google planning on leveling the playing field? One of my competitors is obviously enlisted in this program because they have over 3,000 backlinks all from pages that currently do not even have a text link pointing to their site.

    As a webmaster and business owner, I’m worried about this. I can not compete at the same level because they’re cheating…take a look at their back links to see what I’m talking about: hookah bzz with a u.

    I know other webmasters would be interested to hear what will happen in the future to sites who ‘cheat’ by using a link vaultage service.

  49. Matt,

    A number of people have expressed some concerns about your comment: “Yes, if you sell links, you should mark them with the nofollow tag. Not doing so can affect your reputation in Google.”

    It seems you’ve caused a growing panic among many people who pay Yahoo! to link to their sites from the Yahoo! directory. Can you elaborate somewhat on Google’s paid link policy?

    Thanks.

  50. Well done on the Q&A thread Matt. It’s been very good.

    In one of your answers, you suggested mentioning ideas here (in this blog), but there isn’t really a place for them. On a couple of occasions, I’ve jumped into threads that were vaguely on-topic, and both were successful, but there must be many times when people want to get something across, or to make a comment, but there isn’t a suitable thread, or what threads may be suitable are so old that you are unlikely to read new posts in them.

    So how about starting a “comments” thread that you will dip into regularly – similar to this one but perhaps more continuous?

  51. Hi Matt,

    do you know if this DC was converted to BigDaddy?

    http://64.233.187.104
    http://64.233.187.99

    Results on those 2 very different from others.

    Can you please comment?

    Thank you,

    Vick

  52. We’ve known for years that Google’s spidering has a lot to do with pages’ PageRanks, but why? Fair enough if it’s only the frequency of spidering that PageRanks affect, but in one of your answers, you said that it affects the depth of spidering. Why?

    Doesn’t Google want to index pages? If not, why not? It’s common sense that sites with lower PageRanks are every bit as useful as those with higher PageRanks. There are many types of site that other sites don’t naturally link to very much, so why keep their inner pages out? Why encourage them to go on link-building campaigns, knowing that they are pretty much forced to do it in unnatural ways – ways that you don’t want?

    Sorry, Matt, but I don’t see the sense in it. Either Google wants quality pages, or it doesn’t. Just because some of them are deeper than others doesn’t mean that they aren’t good quality. Imo, PageRank shouldn’t have any bearing on whether or not a page is spidered.

  53. Errr, I don’t believe this is another grab bag post already!

    I feel for you Matt, even when you answer umpteen questions it only serves for some to ask even more.

  54. One too-late question, but one that still bears asking, since it bears a direct effect on the future of the universe:

    Did you go to San Fran specifically to beat up the cast of Full House? I don’t think you could take Uncle Jesse, but I think you could smoke Danny Tanner straight up.

  55. Cheers for answering my question Matt. Whenever you’re ready to take a break from the SEOs, and walk among the regular webmasters, you can find loads of them on Sitepoint’s forums. 😉

  56. Good morning Matt

    It seems your cat Emmy is very famous among the folks at WMW 🙂

    Do you care to share with us a picture of Emmy girl?

    Have a great day.

  57. Matt, there appears to have been a PR export that started on or around Feb 18, which was in the middle of the BigDaddy upgrade that started Jan 4th.

    Many DCs have been showing different PR. Was there a PR export on Feb 18 during the BigDaddy update? Or was the last PR export on Dec 19th?

    I just want to get this list right.
    http://www.seocompany.ca/pagerank/page-rank-update-list.html

    Thanks!

  58. Great Q/A session Matt, really helpful. Can you give any insight into the new design of the Google homepage? I noticed it a couple of weeks ago and haven’t heard much more about it. Would love some more details on why it popped up, whether I’m on a beta testing account, or anything more about it. Thanks Matt!

  59. Great post, you should do Q&A stuff weekly. Posts like this are exactly why I keep visiting your blog.

  60. Hey Matt

    You should consider making this Q&A into a tradition. Would certainly bring me back each month, to read your answers, even if they are about your cat.

    By the way, where does the name “Big daddy” come from, are you “Big daddy” 🙂 ??

    Cheers
    Jakob Dræby

  61. Hi Matt,
    I’m very happy to have found your blog, and thanks for taking the time to do this.

    Question:
    Are you able to give some guidelines on what the difference is between a site with PR6 and one with PR10? Our company runs several public-access news websites which have PR5-6 on the top-level pages (like http://www.politics.co.uk and http://www.inthenews.co.uk). What things make http://www.bbc.co.uk/news better/higher value?

  62. Best Blog Post Ever, Thanks MATT.

    Can we know if any of those parameters will affect the new Google (web 2.0)… just wondering?
    Thanks.

    Some weird results are still appearing:
    Query Google Swedish http://www.google.com/search?hl=en&q=google+swedish&spell=1

    SERP: http://www.google.com/intl/xx-bork

    And concerning PRs: yes, many unique evaluations are showing up on unique sites’ pages. Some main index pages are getting PR 0 and the rest of the site’s pages PR 8 – 6 or 7.

  63. “Yes, if you sell links, you should mark them with the nofollow tag.”

    As Michael pointed out above, Yahoo doesn’t.

  64. Matt,

    I’m another who’d love just a little more background on the rel="nofollow" issue.

    How do you recognise a paid-for link?

    Presumably, if you are simply looking for sites with a lot of links and a paypal account (eg directories) there is still no way for Google to distinguish between ‘pay-for-inclusion’ and ‘pay-for-review’?

    So a genuine directory site full of human edited links that are published precisely and specifically because they are worthy sites of benefit to visitors (just the sort of “with-juice” vote that Google needs for indexing) also needs to wear the Google safe-linking condom to avoid being penalised?

  65. Q: “If one were to offer to sell space on their site (or consider purchasing it on another), would it be a good idea to offer to add a NOFOLLOW tag so as to generate the traffic from the advertisement, but not have the appearance of artificial PR manipulation through purchasing of links?”
    A: Yes, if you sell links, you should mark them with the nofollow tag. Not doing so can affect your reputation in Google.

    this answer raised my eyebrows. i’ve never heard of “reputation in Google”. could you please expand upon this.

    otherwise, this was an excellent post with some true merit. it might be seen as, *gasp*, quality content! :shocked: 😀
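
As a hedged illustration of the nofollow markup discussed in the comments above, here is a minimal Python sketch that renders a link and adds rel="nofollow" when the link is flagged as paid. The helper name `render_link` and its `paid` flag are hypothetical, invented for this example. How Google actually recognises or treats paid links is not specified anywhere in this thread.

```python
from html import escape


def render_link(href: str, text: str, paid: bool = False) -> str:
    """Render an HTML anchor; add rel="nofollow" when the link is paid.

    `paid` is a hypothetical flag set by the site owner. Nothing here
    detects paid links automatically.
    """
    rel = ' rel="nofollow"' if paid else ""
    return f'<a href="{escape(href, quote=True)}"{rel}>{escape(text)}</a>'


if __name__ == "__main__":
    # A sold link carries rel="nofollow"; an editorial link does not.
    print(render_link("http://example.com/", "Example sponsor", paid=True))
    print(render_link("http://example.com/", "Editorial pick"))
```

The only design choice is that the site owner decides which links are paid. The markup itself is just the extra attribute.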

  66. Matt,
    I am trying to wrap my head around this whole BigDaddy thing and I don’t understand why my site still has different rankings for different datacenters. Of the three IPs listed in February’s Q&A, my site comes up first for 66.249.93.104 and 216.239.51.104, but it isn’t listed at all at 64.233.179.104. Why would this be?

    Thanks…

  67. THANK YOU Matt!

    As soon as we saw your answer yesterday regarding supplemental URLs and robots.txt, we updated the robots.txt and crossed our fingers 🙂

    We really appreciate your being there.
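
For readers who want to sanity-check a robots.txt change like the one mentioned above, here is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt content and the URLs are made-up placeholders, since the commenter's actual rules are not given, and this shows only how the standard-library parser reads the rules, not how Googlebot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the path is illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/private/page.html"):
        print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```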

  68. Hello Matt,

    First of all thank you for your answers and I really appreciate your communication with people like me.

    I’ve got a very strange situation. Some of my affiliates set up redirects to my website, and Google shows their pages in search results higher than my website’s page (and sometimes my website is not in the search results at all). I’ve sent a few examples to your email (hope you will read it).

    Since I changed my domain name from olddomain.com to newdomain.com (about 3 weeks ago, at Microsoft’s request) I’m still getting results in Google with my old domain name! I’ve created a 301 redirect to the new one.

    Would you please let me know whether this is a result of the Big Daddy update and I should wait for the next update, or whether there is any chance of things getting back to “normal” in the next days/weeks?

    Also, would you please let me know how long I should expect it to take for the new domain to return to the “normal” positions in the Google index that the old one had? Are there any changes in this process after the Big Daddy update?

    Nowadays Google sometimes works like AltaVista in 1998… It’s really upsetting 🙁

    Thank you
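
The domain move described above can be sketched as a simple mapping from old URLs to new ones. This is only an illustration: `redirect_target` is a hypothetical helper, olddomain.com and newdomain.com are the commenter's own placeholders, and a real site would normally configure the 301 in its web server rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_target(url, old_host="olddomain.com", new_host="newdomain.com"):
    """Return (301, new_url) for URLs on the old host, else None.

    The path and query string are preserved so that each old page
    points at its direct equivalent on the new domain.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host in (old_host, "www." + old_host):
        new_url = urlunsplit(
            (parts.scheme, new_host, parts.path or "/", parts.query, parts.fragment)
        )
        return 301, new_url
    return None


if __name__ == "__main__":
    print(redirect_target("http://olddomain.com/page.html?x=1"))
    print(redirect_target("http://example.org/"))
```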

  69. Matt:
    Referring to your comment above, “Yes, if you sell links, you should mark them with the nofollow tag. Not doing so can affect your reputation in Google.” My two cents: selling text links on a site and getting penalized (… affected reputation…) shouldn’t be related at all. For example, I could have a great content site on which I offer advertising space (for a variety of reasons, including my UI preference, say I offer text link advertising). By putting a no follow for this content-rich page, wouldn’t search engine visitors be deprived of some valuable content that could/should be showing up on SERPs?
    Manoj

  70. Hey Matt!

    I am excited to see that you are going to be at PubCon – Boston! This must be a recent add? I look forward to meeting you there…

  71. OUCH Matt 🙂 – I see my post is gone, but I’m very confused as to why, because I read your guidelines and was not asking questions about my site, but about Google’s spam policy, and it is a question I have seen others ask 🙁 I posted a long list of links to give you an idea of how prevalent the issue is, but they were not my links….

    The post was about why Google allows link sites that have no real content and are not directories to collect sponsored links and make money off them… I was not sure if I was missing something about Google’s interpretation of spam..

    “I know for a fact that there are literally hundreds maybe thousands of these sites run by these same two companies ..”

    Well I do hope you answer because I think this is a real issue because these sites look relevant to the spider and yet are not and they often show up higher in search results than real sites do.. in addition they are earning a lot of money through this deceptive practice by the sheer numbers of sites they have on the net.

    Hope this does not count as a double post.. thought maybe you just didn’t like my links ..

    Thanks!
    KS

  72. Very, very useful thread… I will put all this info on my blog too….

    thanks

  73. By putting a no follow for this content-rich page, wouldn’t search engine visitors be deprived of some valuable content that could/should be showing up on SERPs?

    If it were that valuable and useful, then why wouldn’t you link to it for free? Seems to me a resource of that much weight should be shared with the world.

  74. Hi Matt, this truly was a first-rate interview, informative and consistent. I would love to see highlights like these more often. Thank you for it.

    Kind Regards

    Susan

  75. Hi Matt

    Take a look at the bottom of this PR6 page. If you have time you may wish to click some of those hundreds of links too.

    http://www.bizwiz.com/index_v09.html

    Should such pages be reported to the Google WebSpam Team?

    Thanks

  76. I wouldn’t post my URL in the sig, even without reading the guidelines, but I, like other Romanian webmasters, have a problem:

    One of my sites was on top for a few of its keywords yesterday (the 29th), and now it is gone for good (I cannot find it in the first 10 pages). That made me wonder whether the damn site had something wrong with it, so I checked it. It’s not a big deal, a personal blog, but a good one (maybe not as good as this one, thanks Matt!). Why are the Romanian datacenters showing completely different data?

    Thanks, hope it’ll get sorted out… (damn image codes, I always mix those up…)

  77. Hey Matt, I didn’t know you were into cats! My wife and I have three Abyssinians. If you ever get a chance, check out their unique personalities.

    I was hoping you could attend the Santa Clara University’s (Markkula Center for Applied Ethics) Conference on “The Ethics and Politics of Search Engines” held on 2/27/06. Peter Norvig, Director of Research for Google, had some interesting things to say about the “Randomization” of Google’s search results:
    “Yeah, and we do use some randomization and experimentation in our results. So at any one time, we’re probably running dozens of different experiments where we’re trying out variations to see is this variation going to be better than the standard one? So you do see a lot of turn and mix, both because of our changes in the algorithms and also because of the changes in the Web. So the results that are number one today may be different than the results tomorrow for very subtle reasons having to do with both changes in the link structure of the Web and with the changes we’re experimenting with.”

  78. hi matt,

    Scenario: a .com site hosted in the US but in a non-English language.

    Is pointing (redirecting?) the country-specific domain to the .com site enough to indicate to Googlebot that the .com site should be included in the country-specific search?

    cheers
    viggen

  79. Hi Matt,

    I’ve learned a lot from you. Thank you!

    What is the best way to make a page printer friendly for Google and SEO?
    I currently use a media=”print” css with display: none for nav bars etc. I just don’t want it to look like I am hiding things.

    Thanks again!

  80. Thanks Matt – very extensive and helpful information as always.

  81. Too bad about the RK :s, we’ll have to wait for the Google dance now :s

  82. So Matt,

    When will one of the major business magazines smarten up and put you on the cover as the poster boy for quality, non-spam search results?

  83. Matt,

    Just excellent! You should do this more often.

  84. I started linking with other sites a long while back. The majority of these links are non-themed. Should I leave them in place, or delete?

  85. Speaking of redirects… because many people ‘mis-hear’ my URL, I bought the ‘mis-heard’ URL and pointed (or redirected) it to my site. Does Google see this as a problem?

  86. A serious implication to the storing of old data has emerged.

    Australian police blundered by retaining sensitive anti-terrorism contacts online, and this morning’s report of it in the Sydney Morning Herald (http://www.smh.com.au) potentially involves the retention of old data by Google, possibly in its supplementary index. The article mentions that the data can be recovered on Google.

    Although this is yet to be verified, it does demonstrate a serious problem: site owners are unable to remove supplementary results, notwithstanding the original blunder and other issues.

    Without wanting to be melodramatic, I think the practice of storing old data online may have serious implications for some website owners who do not wish it to be there, or believe for good reason it should not be there.

    Matt – is there a way to better manage this in conjunction with Google ?

  87. Heh, I live in Monterey and would definitely suggest seeing the aquarium 🙂

    Matt, you can’t forget about the great white shark.
    http://mbayaq.org/efc/efc_smm/smm_meetBrowser.asp?tf=12

    They are the only aquarium in the world to have one on display 🙂

  88. Anyone notice the PR update Matt requested is now occurring!! Looks like only new sites got the new PR!!

  89. Hi Matt, sorry to leave this so late, but I wanted to clarify a point you made about getting sites indexed:

    “One of the classic crawling strategies that Google has used is the amount of PageRank on your pages. So just because your site has been around for a couple years (or that you submit a sitemap), that doesn’t mean that we’ll automatically crawl every page on your site. In general, getting good quality links would probably help us know to crawl your site more deeply.”

    Does this mean webmasters should have a linking strategy in place? To get organic links when people can’t find the pages in search engines in the first place IS rather difficult! I for one am loath to buy links or chase links as organic links are so much better, but it appears the only way to get properly indexed by Google?

  90. Sobriety Online

    Matt, can you delete this post and my prior post (3/31/06 3:27 pm) on this thread? Ever since I posted here I have had nothing but bad luck with my web-site and google. Thank you.

  91. Thanks for clearing up the Googlebot issue, we were beginning to wonder why our site was being ignored by Google!

  92. We are getting ready to move a very large site that we’ve been developing for some time into production. The client is a well-established, national lending company. The site is content rich and large, approximately 250k pages. Because of their preexisting relationships in the industry, they have 4 or 5 strategic partners linking to them that generate another 50k backlinks. What do you believe is the best way to deploy this site without getting torched and to minimize time in the sandbox? We are concerned about the strong search engine saturation and link popularity metrics even though they are ‘legitimate.’ Because lending is such a competitive space, do we need to worry about this? Should we move everything into production at once or in phases? Any thoughts?

  93. I used domain.com and http://www.domain.com
    I started linking to domain.com.
    In January I checked:
    site:domain.com = about 130 000
    In February I checked:
    site:domain.com = 80 000
    March:
    site:domain.com = 50 000
    April:
    site:domain.com = about 30 000
    May:
    site:domain.com = about 790

    I checked:
    link:domain.com = 0 !!!!
    link: http://www.domain.com
    site:www.domain.com = about 10 000

    I do not know what happened. Why did Google choose http://www.domain.com if my linking strategy was directed at domain.com?
    Why were my site’s pages and links thrown out of the Google index?

    My site does not have any outbound links.
    Please help.
    best regards

    Siluk
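
One way to picture the www versus non-www split described above is a small normaliser that maps both host variants onto a single preferred host. This is a sketch under assumptions: `canonicalize` is a hypothetical helper, domain.com is the commenter's placeholder, and it says nothing about how Google itself picks between the two hosts.

```python
from urllib.parse import urlsplit, urlunsplit


def _strip_www(host: str) -> str:
    """Drop a leading "www." so both host variants compare equal."""
    return host[4:] if host.startswith("www.") else host


def canonicalize(url: str, preferred_host: str = "domain.com") -> str:
    """Rewrite www/non-www variants of the site to `preferred_host`."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if _strip_www(host) == _strip_www(preferred_host):
        host = preferred_host
    return urlunsplit((parts.scheme, host, parts.path or "/", parts.query, parts.fragment))


if __name__ == "__main__":
    print(canonicalize("http://www.domain.com/page.html"))
    print(canonicalize("http://domain.com"))
```

Links and redirects that consistently use the output of one such function would all point at the same host variant.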

  94. Matt,
    Why is it that new products or new developments invariably show up in Google competitors’ search engines first? If I’m dealing with a new product, I invariably have to search a competing search engine in order to find it.
    Best regards,
    Gerald

  95. Does Googlebot crawl .pdf, .swf, etc. files?

  96. Hi Matt,

    Thank you. Thank you. Thank you.

    I broke a couple redesign rules: not installing a proper 301 (initially) and changing too many page urls (page content remained the same – change was file extension). Did my pages lose their credibility?

    Give Emmy a scratch under the chin for me.

    See ya,
    Bill

  97. Thanks for the info. A lot of great answers!

  98. Thanks for the clarification, Matt!

  99. This post is really useful. It helps me a lot.
    I have a question:
    I recently bought a few expired domains with PR. Will they lose it at the next update, or will Google just treat them as new domains? Thanks!!!

  100. Does Google crawl database-generated pages with session IDs?

  101. Hi Matt,

    can you tell us something about duplicate content? I publish very good recipes on my site http://www.besterezeptesuche.de, and I also give some recipes to yahoo.de, web.de and gmx.de for their health or lifestyle sites. In all recipes on our partners’ sites you can find a link with a copyright notice and “Quelle” (source).

    Is this a problem for us (bad listings in the Google index), and does Google check that the content comes from us?

    Thank you for your answer, if possible.

    thomas

  102. Does Google read the style sheets on a webpage? I have heard that it can help my website placement if I have H1 and H2 tags. I have created custom H1 and H2 tags for the purpose of making them look better. Does anyone know if Google still regards this as an “H1” tag?

  103. Thanks for the clarification, Matt!

  104. Hi Matt,
    We have a client who has one site for domestic visitors and another for international visitors. If a user originates from outside of Ireland according to their IP address, a layer will roll out before the homepage loads. A pop-up will allow the visitor to select their home country and click on a link to the relevant website. Do you have any idea what the SEO implications of this are?

  105. Hi. How do I pre-pay for AdWords from Nairobi, Kenya? I have no credit card.

  106. Hi There,

    We launched a dating site 45 days ago and we are trying to get the best exposure on the search engines. What I am still observing is that even though we have links, no PR is given to us yet (we have a few PR4-5 sites linking to us) and the generic searches are very few. Since our site is for dating, we have user-generated content in the form of profile pages. The question I have is: should we add all of the profile pages to the sitemap and allow them to be browsed by spiders? Do you believe this might help with the ranking? Also, do you recommend linking with higher-PR sites or with more but lower-PR sites? Google AdWords shows that almost every ‘spam ad site’ lists us, but they have no PR at all and I doubt that users from these sites actually convert into signups.

    Thanks a lot.

  107. Quote: See the trend? If you are below those and have a real product you lose, those above sites appear to be very official and suck in the limited business.

    I see a lot of this, i.e. where one vendor buys out all the ranking directory listings and gets affiliate links indexed. I found a really extreme example of this the other day. Try [tutorial builder]. Every single link on the front page is a freeware software site listing the same product.

  108. Very useful Q&As, thanks.

  109. Question and answer in comments is really helpful. I learned things instantly in just minutes of scanning comments from sites or personal pages from people like Matt. Kudos!
