Q & A thread: March 27, 2006

by Matt Cutts on March 28, 2006

in Google/SEO

Okay, let’s try tackling a few questions from the Grab bag thread. Just a hint for next time: if your question takes three paragraphs to ask, your odds of getting an answer go down. :)

Q: “Is Bigdaddy fully deployed?”
A: Yes, I believe every data center now has the Bigdaddy upgrade in software infrastructure, as of this weekend.

Q: “What’s the story on the Mozilla Googlebot? Is that what Bigdaddy sends out?”
A: Yes, I believe so. You will probably see less crawling by the older Googlebot, which has a User-Agent of “Googlebot/2.1 (+http://www.google.com/bot.html)”. I believe crawling from the Bigdaddy infrastructure has a new User-Agent, which is “Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)”
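
For anyone who wants to tell the two crawlers apart in their own server logs, here is a minimal sketch. It assumes only the two User-Agent strings quoted above; the function name and the labels "legacy", "bigdaddy", and "other" are made up for illustration. Keep in mind that a User-Agent header can be spoofed, so this is a log-reading aid and not a way to verify that a request really came from Google.

```python
# The two User-Agent strings quoted in the answer above.
OLD_GOOGLEBOT_UA = "Googlebot/2.1 (+http://www.google.com/bot.html)"
NEW_GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)


def classify_googlebot(user_agent: str) -> str:
    """Return 'bigdaddy', 'legacy', or 'other' for a User-Agent string.

    The labels are hypothetical names for this sketch, not official terms.
    """
    ua = user_agent.strip()
    if "Googlebot/2.1" not in ua:
        return "other"
    # The newer crawler identifies itself with a Mozilla/5.0 prefix.
    if ua.startswith("Mozilla/5.0"):
        return "bigdaddy"
    return "legacy"
```

Running each access-log line's User-Agent field through `classify_googlebot` and counting the labels would show whether crawling from the older bot is tapering off.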

Q: “Do you take Emmy with you to San Francisco?”
A: Nope, Emmy is a true indoors cat; she doesn’t like to travel.

Q: “Any new word on sites that were showing more supplemental results?”
A: An additional crawling change to show more pages from those sites was checked in late last week, but it may still take a little bit of time (another few days) for that to show up in the index. I’ll keep an eye on sites that people have given as examples to see how those sites are showing up.

Q: “Is the RK parameter turned off, or should we expect to see it again?”
A: I wouldn’t expect to see the RK parameter have a non-zero value again.

Q: “What’s an RK parameter?”
A: It’s a parameter that you could see in a Google toolbar query. Some people outside of Google had speculated that it was live PageRank, that PageRank differed between Bigdaddy and the older infrastructure, etc.

Q: “Now that Bigdaddy is out, will there be a new export of PageRank anytime soon?” and “Will the deployment of BigDaddy stabilise the rolling PR issues we are experiencing at present?”
A: I’ll ask around about that. If there aren’t any logistical obstacles, I’ll ask if we could make a new set of PageRanks visible within the next couple weeks. I’d expect that as Bigdaddy stabilizes everywhere, the variation in toolbar PR for individual urls is more likely to settle down too.

Q: “This datacentre http://64.233.185.104/ works differently to all of the others. Noticed just a few hours ago. . . . . Where does that DC fit into the scheme of things? Is it mainly made from newly spidered data?”
A: Sharp eyes, g1smd. That wouldn’t surprise me. As Bigdaddy cools down, that frees us up to do new/other things.

Q: “Not so much a question… GET A PSP!”
A: I got one today, TallTroll. I picked up Me and My Katamari (MAMK) and a PSP that turned out to have firmware v1.52 on it. So I could upgrade to 2.0, then downgrade to 1.5 so I could run homebrew programs. But I think MAMK requires firmware 2.5 or 2.6 to play, which means a one-way upgrade or maybe using RunUMD or a similar program. Suffice it to say I’m having fun just geeking around. :)

Q: “Can you give us a general way of getting a good idea in front of Google?”
A: If it’s bizdev, there’s a bizdev dept. at Google you could contact. If it’s not a business/patent/proprietary idea, I’d mention it here or blog about it somewhere. Writing a snail mail letter could work well too.

Q: “Did you check out the guys all painted in silver doing the robot on milk crates in San Fran?”
A: Nope, that’s down by Fisherman’s Wharf. We’re hanging near Union Square.

Q: “Why do you focus your attention so much on SEOs and not at webmasters who make actual quality websites?”
A: I think that’s an issue I have personally, because I spend so much of my time looking at spam. Lots of other people focus on helping general webmasters, like the Sitemaps team, for example. I have started to do “SEO Advice” posts instead of just “SEO Mistakes” posts, but you’re right: I personally could use a reminder to keep focusing on the sites that make quality content and how to pull those sites up, not just how to counter sites that cheat. Thanks for bringing that up.

Q: “My sitemap has about 1350 urls in it. . . . . its been around for 2+ years, but I cannot seem to get all the pages indexed. Am I missing something here?”
A: One of the classic signals that Google has used to decide what to crawl is the amount of PageRank on your pages. So just because your site has been around for a couple years (or because you submit a sitemap), that doesn’t mean that we’ll automatically crawl every page on your site. In general, getting good quality links would probably help us know to crawl your site more deeply. You might also want to look at the remaining unindexed urls; do they have a ton of parameters (we typically prefer urls with 1-2 parameters)? Is there a robots.txt? Is it possible to reach the unindexed urls easily by following static text links (no Flash, JavaScript, AJAX, cookies, frames, etc. in the way)? That’s what I would recommend looking at.
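
The parameter check is easy to automate. The sketch below counts query-string parameters per URL and flags anything above a threshold. The threshold of 2 comes from the "1-2 parameters" remark above; the function names and the example URLs are hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def count_query_params(url: str) -> int:
    """Count the query-string parameters in a URL (blank values included)."""
    query = urlsplit(url).query
    return len(parse_qsl(query, keep_blank_values=True))


def urls_with_many_params(urls, max_params=2):
    """Return the URLs that carry more than max_params query parameters."""
    return [u for u in urls if count_query_params(u) > max_params]


# Hypothetical list of unindexed URLs to audit.
unindexed = [
    "http://example.com/about.html",
    "http://example.com/item?id=7",
    "http://example.com/list?cat=2&sort=asc&page=3&sid=abc123",
]
flagged = urls_with_many_params(unindexed)
```

Here `flagged` would hold only the last URL, which is the kind of address worth simplifying or linking to in a cleaner form.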

Q: “When I change a robots.txt to exclude more existing files from being crawled, how long does it take for them to be removed from the index? Perhaps the answer is a function of how often the site is crawled and it’s PR?”
A: It is a function of how often the site is crawled. I believe in the past that every several hundred page fetches or several days, the bot would re-check the robots.txt. Note that for supplemental results, you need recrawling to happen by the supplemental Googlebot in order for the robots.txt file to take effect on those pages. If you’re really sure you never want those pages to be seen, you can use our url removal tool to remove urls for six months at a time. But I’d be very careful with the url removal tool unless you’re an expert. If you make a mistake and (for example) remove your entire site, that’s your responsibility. Google can sometimes clear out self-removals, but we don’t guarantee it.
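
Before waiting on a recrawl, it can help to confirm that the new robots.txt rules block what you expect. This is a minimal sketch using Python's standard `urllib.robotparser`; the rules and URLs are made-up examples, and the standard-library parser may not match Googlebot's own interpretation in every edge case. It only checks the rules themselves and says nothing about when pages drop out of the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text disallows url for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


blocked = is_blocked(ROBOTS_TXT, "Googlebot", "http://example.com/private/a.html")
allowed = not is_blocked(ROBOTS_TXT, "Googlebot", "http://example.com/index.html")
```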

Q: “I would love to be able to search for html code and see how that ranks.”
A: I would like that too. Indexing non-visible things like punctuation, JavaScript, and HTML would be great, but it would also bulk up the size of the index. Any time you’re considering a new feature (e.g. our numrange search), you have to trade off how much the index would get bigger versus the utility of the feature. My guess is that we wouldn’t offer this any time soon.

Q: “Seriously, How do you plan on picking which of these questions to answer?”
A: I’m tackling the ones that looked interesting, short, and general enough that more than one person would be interested.

Q: “I am seeing a lot of sites with “%09” (tab) and “%20” (space) in front of the URL in Google’s index.”
A: I’ll ask someone about that.
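
For reference, `%09` and `%20` are the percent-encoded forms of a tab and a space. A small sketch like the one below (the helper name is hypothetical) can be used to spot links that start with stray encoded whitespace, for instance when checking the URLs a site hands out in its own links or sitemap.

```python
from urllib.parse import unquote


def strip_leading_encoded_whitespace(url: str) -> str:
    """Remove leading percent-encoded tabs (%09) and spaces (%20) from a URL."""
    prefixes = ("%09", "%20")
    while url.startswith(prefixes):
        url = url[3:]  # each encoded character is exactly three characters long
    return url


# Decoding shows what the two sequences stand for.
decoded_tab = unquote("%09")
decoded_space = unquote("%20")
cleaned = strip_leading_encoded_whitespace("%09%20http://example.com/page.html")
```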

Q: (paraphrasing) The sitemaps validation fetch seems to happen with a User-Agent of “-”? My auto-reject rules reject that user agent.
A: I’ll ask someone about that. You could whitelist the IP range that Googlebot comes from in the mean time.
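
A rough sketch of what such a whitelist rule could look like follows. The post does not list any Googlebot address ranges, so the network below is a placeholder from the reserved documentation address space and must be replaced with ranges you have verified yourself. The function names are also made up for this example.

```python
import ipaddress

# PLACEHOLDER range (reserved for documentation). This is NOT a real
# Googlebot range; substitute ranges you have confirmed yourself.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_allowlisted(remote_addr: str, networks=ALLOWED_NETWORKS) -> bool:
    """Return True if remote_addr falls inside one of the allowed networks."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # not a parseable IP address
    return any(addr in net for net in networks)


def should_reject(remote_addr: str, user_agent: str) -> bool:
    """Reject empty or "-" user agents unless the client IP is allowlisted."""
    suspicious = user_agent.strip() in ("", "-")
    return suspicious and not is_allowlisted(remote_addr)
```

With this shape, the existing auto-reject rule stays in place for everyone else, and only requests from the allowlisted networks get through with a "-" user agent.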

Q: “If one were to offer to sell space on their site (or consider purchasing it on another), would it be a good idea to offer to add a NOFOLLOW tag so to generate the traffic from the advertisement, but not have the appearence of artificial PR manipulation through purchasing of links?”
A: Yes, if you sell links, you should mark them with the nofollow tag. Not doing so can affect your reputation in Google.
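
In markup terms, "the nofollow tag" means adding `rel="nofollow"` to the link's anchor element. Here is a minimal sketch of a template helper that does this for paid links; the function name and the `paid` flag are hypothetical.

```python
from html import escape


def render_link(url: str, text: str, paid: bool = False) -> str:
    """Build an HTML anchor, adding rel="nofollow" when the link is paid."""
    rel = ' rel="nofollow"' if paid else ""
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(text)}</a>'


ad_link = render_link("http://example.com/", "Example", paid=True)
editorial_link = render_link("http://example.com/", "Example")
```

`ad_link` comes out as `<a href="http://example.com/" rel="nofollow">Example</a>`, while `editorial_link` has no `rel` attribute.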

Q: “On sites directed to international audiences with the same (high quality) content in several languages is it better to do several TLDs like mydomain.com, mydomain.de, mydomain.fr, mydomain.eu and so on or do subdomains like en.mydomain.eu, de.mydomain.eu, fr.mydomain.eu or something else like mydomain.com/en, mydomain.com/de, mydomain.com/fr?”
A: Good question. If you’ve only got a small number of pages, I might start out with subdomains, e.g. de.mydomain.eu or de.mydomain.com. Once you develop a substantial presence or number of pages in each language, that’s where it often makes sense to start developing separate domains.

Q: “Any results on why IDN Domains don’t show pagerank?”
A: I’ve seen a couple that do, but I’ll check into why most don’t. My guess is that there’s a normalization issue somewhere in the toolbar PageRank pathway.

Q: “Would it be possible to add a date range to queries? I might get 91,000,000 results, but the first 200 are 2-3 years old. I would like to limit results to items no more than 6-12 months old.”
A: Check out our advanced search page for this option. Tara Calishain has also done some really interesting digging into this, e.g. this info she uncovered. Google Hacks is a pretty solid book if you’d like to read more fun Google hacks.

Q: “What about the problem of directories and shopping comparison spam overriding real pages?”
A: Fair feedback. I heard that recently from a Googler, too. Sometimes we think of spam as strictly things like hidden text, cloaking, etc. But users think of spam as noise: things that they don’t want. If they’re trying to get information, fix a problem, read reviews, etc., then sites like that aren’t as helpful.

Q: “Are you planning to visit/speak in the UK at all in the near future?”
A: Sadly not. I’m hitting the Boston Pubcon and SES San Jose, but I can only do 4-5 conferences a year.

Q: “The one thing that seems to be getting to people generally, is what are the post Big Daddy intentions? Fixes, spam issues, regeneration of ‘pure’ indices, supp. issues, PR and BL update, etc.”
A: I can’t give a timeline (e.g. “scaling up communication in April, more work on canonicalization in May”) because priorities can change, esp. depending on machine issues, deployments of new binaries, webspam developments, etc. Short-term, I wouldn’t be surprised to see some refreshing in supplemental results relatively soon, and potentially different PageRanks visible in the next couple weeks.

Q: “Even Matt is afraid to use a redirect from www.mattcutts.com/ to www.mattcutts.com/blog/ because Google might penalize his website and put it into supplemental hell.”
A: Heh. No, that’s not it. I’m deliberately leaving them separate as a test case to see how we do now and down the road.

Q: “Just like you told me a couple of months ago, the Supplemental Googlebot (SG) got around to my site and things got sorted out. Thanks. . . . . If you are in San Fran and want to check out the Monterey Aquarium, could you please write a short review? I’ve been thinking of visiting and wondering if it is worth the trip.”
A: I would definitely recommend the Monterey Bay Aquarium, especially if you can find a coupon or other good deal. I highly recommend the otters, the kelp forest, and the jellyfish area.

110 comments

Stephen March 29, 2006 at 12:35 am

Hey Matt,

That is some pretty impressive posting :D

I have noted that a couple of sites that I believe had canonical probs have come back – but only sites that have been sent to your engineers.

Not sure if this is a coincidence or if a correction is starting to roll out. If it is a correction then cool :D – will it hit some sites before others, depending on crawl cycle etc.? If it is an engineer’s intervention, then when would you want reports of these?

Cheers.

Stephen March 29, 2006 at 12:51 am

Oops – just to clarify what I would call a correction for these sites.

EG: Site:domain.com – domain.com is first.

domain.com as a phrase – domain.com is first

etc – eg the Homepage returns to its true value – rest of the site seems to follow :D

OWG March 29, 2006 at 1:03 am

Matt, some great answers there, thanks.

This will help put to bed some of the crap that floats around about the Google mystique LOL.

I know that the Supplemental hell and the Lack of deep crawling are especially important to some people :-)

TallTroll March 29, 2006 at 1:15 am

I’m pretty sure that MAMK only requires firmware 2.0 to run, so you should be able to go back and forth as required. You need 2.0 for the browser though – depends how much surfing you want to do. AFAIK, the only game that requires 2.5 is EXIT, so you should be able to wait until a downgrade from the 2+ f/wares is available before going there.

I find that Soulseek, a USB cable and a PSP is a memory hungry combo though…. need to get a 2Gb card soon ;)

McMohan March 29, 2006 at 1:22 am

Matt, that was a fair amount of time spent on writing answers this night. Thanks.
Apart from addressing supps, canonicals, pagerank re-calculation etc, will there be an imminent change in ranks as a result of these corrections?

jake March 29, 2006 at 1:34 am

Hi Matt,

As part of your review of the supplemental problem, are you also monitoring any sites whose pages have simply vanished (rather than gone supplemental)? I think the BD bug is responsible for both types of errant behaviour – sometimes it just refuses to index tens of thousands of pages, despite crawling them over and over again. That’s what we see anyway. None of the supplemental tweaks have yet made any difference to the missing pages problem.

Henry Elliss March 29, 2006 at 1:43 am

Well well, you can answer questions about Google and SEO very well, but you didn’t answer my “why are there no blue foods in nature” question?! I shan’t be picking you as my phone-a-friend on Millionaire any time soon Mr Cutts… well, unless they start asking SERP questions in the next few shows! )

P.S. Saw a mobile dog-grooming van drive past our office the other day, called “Mutt Cutts” – I had a little chuckle.

Paul Reilly March 29, 2006 at 1:55 am

Cheers for all these answers..

I do have one question though, with so many different sources of Pagerank, Live Pagerank, future Pagerank etc. What would you suggest we use to see an accurate measurement?

Asle Ommundsen March 29, 2006 at 2:00 am

Please answer this:

From: http://www.mattcutts.com/blog/miscellaneous-monday-march-27-2006/#comment-19408

«For accessibility purposes, my site has ‘skip navigation’ etc… to allow screen readers to get straight to the content. [..] so I have ‘hidden’ these accessibility links using display:none in the stylesheet. [..] Will Google regard this as hidden text and penalise my site?»

Jeremy March 29, 2006 at 2:19 am

On TLDs and international audiences: When a site is in one language how should it be expressed to Google that it is for a global audience?

For example restaurant reviews and shopping could be seen as local and localised respectively; but product reviews (where the product is available globally), encyclopaedia entries and reference material are more for a global audience.

There are suggestions the site be duplicated at the various TLDs e.g. .com, .co.uk, .ca, .au, etc. But this wastes bandwidth for the site and the google bots, encourages link splitting and can confuse the users.

The geo of the IP doesn’t always work as for example 1and1.co.uk gives out German IP addresses, and many other websites use US hosting for cheaper costs.

Just looking for clarification on how this issue should be tackled, as the various Google SERPs are becoming more and more local even if the user is not requesting pages only from their country (google.com vs. google.co.uk or even, it seems, google.com used from a US ip vs. google.com used from a UK ip).

P.S. Keep up the good work!

Wayne March 29, 2006 at 2:53 am

Matt thank you for taking the time to answer all these questions. What you are doing here says a lot about your character and commitment to the webmaster community.

I didn’t get to ask a question but let me try now. If I agree to buy you Starbucks every morning could you place my website at the top of the results :) Since my new site isn’t ranked yet, all I can afford is one cup per morning ;)

Harith March 29, 2006 at 3:21 am

Thanks for your time, Matt.

Very generous of you. Much appreciated.

HaHa March 29, 2006 at 3:57 am

Very disappointed no comments on expired domains.
Looks like we will continue to see domains such as
macalstr.edu/
astronomy-national-public-observatory.org/
rarestonemuseum.com
iasicongress2005.org
papyrusinternational.org/
and many others in the adult serps.

Seems like it’s all too hard for the webspam team and this reflects badly both on google and the adult internet industry.

Maria March 29, 2006 at 4:03 am

So how long does it take for 301s to take effect across all the DCs? Even Y*hoo and M*N don’t seem to have a problem with it. :)

Eternal Optimist March 29, 2006 at 4:36 am

Matt, Firstly many thanks for both your time and efforts. I appreciate that you cannot be specific on certain points, due to the nature of privacy at Google.

Is it within your power to explain exactly what the following GoogleBots do? [You already answered 5.0 above ] – thanks :-)

crawl-66-249-65-225.googlebot.com
Mozilla/4.0 compatible ZyBorg/1.0
Mozilla/4.0 compatible ZyBorg/1.0 Dead Link Checker
Mozilla/5.0
Googlebot/2.1

Chris Bartow March 29, 2006 at 5:11 am

Thanks for answering these questions! Great information.

The URL Removal Tool has been broken for weeks. For example I’ve tried to remove directory.sysice.com from the index cause I took it down a few months ago, but I just get a Page Not Found when I try to submit it.

301 Redirect Problem March 29, 2006 at 5:12 am

The biggest problem that I’ve seen many worry about here, and that Google is way behind in addressing, is 301 redirects with domain moves from domain1 to domain2, and Matt seems to forever be ignoring this question. Even though it was asked three or four times in the list of questions here and in many other comment posts by viewers, Matt and Google continue to ignore it or give vague answers about how or when Google plans to address this.

Matt, can you please once and for all address the question and webmasters’ concerns about how and when we can expect to see Google / Bigdaddy properly handle domain name moves using 301 redirects?

Andrew March 29, 2006 at 5:23 am

One comment that you may not publish but I hope will read… WHAT is going on at blogger? It is google’s worst product by a country mile. Regularly unreliable and I can’t recall a single new feature that has been added since you brought it on board. It is dreadful and if I hadn’t been unfortunate to *start* using it I wouldn’t still be using it. I try and warn everyone away and it makes me sad :-(

Olney March 29, 2006 at 5:43 am

Thank you Matt for taking the time to answer questions or even to look into the IDN Domain issue with the pagerank. These domains will truly advance the international internet experience.

ClickyB March 29, 2006 at 5:50 am

Hi Matt,

Great effort answering so many questions, thank you.

One thing I’m still curious about (so are many others):
[blockquote]A: Yes, if you sell links, you should mark them with the nofollow tag. Not doing so can affect your reputation in Google.[/blockquote]
Does this include linked images?

ClickyB March 29, 2006 at 5:53 am

Damn…. if there are 2 choices I always make the wrong one – lol – sorry about the

blown tags

:(

Kestrel March 29, 2006 at 5:59 am

Hi,

If BD is out now, then how come SERPs are showing pages that haven’t existed for 9 months plus and return 404s?

Cheers,

K

Ronald R March 29, 2006 at 6:04 am

Good job in answering so many questions, and I know you can’t answer every single one. But, it’s a shame you didn’t answer one of the most popular questions, about the loss of pages. Did you not want to answer it, or did you just miss it?

Thanks

Ulysee March 29, 2006 at 6:09 am

No answer……………
It has been three months since spam has taken over the majority of adult search results in Google.

It’s strange to see “somewhat” relevant results one day Dec 26th then Dec 27th just about the whole adult white hat community was wiped out, filter maybe?.

I believe that the adult serp problem is bigger than the supplementals – I just hope that it’s not being ignored.

What I am saying here applies to the entire adult industry in Google not just my little ole site.

Mike (Germany) March 29, 2006 at 6:18 am

========
Q: “Now that Bigdaddy is out, will there be a new export of PageRank anytime soon?” and “Will the deployment of BigDaddy stabilise the rolling PR issues we are experiencing at present?”
A: I’ll ask around about that. If there aren’t any logistical obstacles, I’ll ask if we could make a new set of PageRanks visible within the next couple weeks. I’d expect that as Bigdaddy stabilizes everywhere, the variation in toolbar PR for individual urls is more likely to settle down too.
========

Hi Matt,

I think it would be better if PageRank were not visible in the toolbar.

Your fan March 29, 2006 at 6:36 am

Hi, can you post some photos of Emmy? We are cat lovers.

SEO Swede March 29, 2006 at 6:37 am

I have reported several sites that use different spamming techniques. But nothing happens. For example, look at this site http://www.kickoff-konferens.se/rw/ and go to the bottom of the site. They mention Mirror1, mirror2, mirror3 and mirror4. Why doesn’t Google exclude them? It feels like it’s OK to spam in Sweden and get top positions.

// Not so fun being a white hat SEO in Sweden.

Ryan March 29, 2006 at 6:41 am

and this reflects badly both on google and the adult internet industry.

Ahh.. that’s why so many people say bad things about porn… expired domains. Here I was thinking it was some sort of morals issue.

Mike, I agree with that. Take visible pagerank out of everything. People put way more faith and dependence in it than they should, and it’s still easy to fake.

Give some site a PR higher than 4 and they instantly think they’re worth millions and have hit the big time.

JohnMu March 29, 2006 at 7:25 am

>You could whitelist the IP range that Googlebot comes from in the mean time.
Do you have a listing of all the Googlebot IP addresses?
Thanks.

Victor March 29, 2006 at 7:36 am

Q: “What about the problem of directories and shopping comparison spam overriding real pages?”

A: Fair feedback. I heard that recently from a Googler, too. Sometimes we think of spam as strictly things like hidden text, cloaking, etc. But users think of spam as noise: things that they don’t want. If they’re trying to get information, fix a problem, read reviews, etc., then sites like that aren’t as helpful.

To balance that feedback: We maintain a niche B2B directory and customer feedback and high listing CTR seems to indicate that a large number of visitors are indeed looking to “buy” products when they type in a product keyword and the directory is indeed relevant.

Google has to make an educated/algorithmic guess about the searcher’s intent (Information or Purchase). If an action keyword complementing the product keyword is not specified in a search, the type of product itself can be used to yield a decent intent relevancy.

SERPs should not be flooded with directories, but there is always bound to be more -ve feedback on directories, since there are a lot more individual site webmasters than there are directories!

Rob L March 29, 2006 at 7:47 am

I have a question about one of your answers.

You responded to a question about NOFOLLOW tags with:

A: Yes, if you sell links, you should mark them with the nofollow tag. Not doing so can affect your reputation in Google.

I was wondering if you could clarify something for me. I’m setting up a website for my sailboat and I have the opportunity to join an affiliate program that places banners on my site.
Is Google going to see my joining an affiliate program as selling space on my site? I would like some advice before I put my site live.

Thanks for the great blog and information, I know you’ve helped me from making countless mistakes!

-Rob

Clint Dixon March 29, 2006 at 7:55 am

Hi Matt

Wow I can say I’ve been pretty close on what Google’s been up to…even something so simple as robots.txt usage…amazing how many sites do not have one or one that is validated.

Let me ask you a question…besides those selling links and not using nofollow tags getting filters…care to explain to everyone how artificial link building is what results in their being penalized or what they think is ‘sandboxed’…?? You would do Google and yourself a world of good…

Also I like the SEO Focus as opposed to a single website as SEOs can reach many more websites than just one…Techniques ideas that work are always good though from either viewpoint.

Thank you

Clint

Liana Evans March 29, 2006 at 8:06 am

Thanks for taking the time to answer the questions Matt! I greatly appreciate the time and energy it takes you to address all the questions that you did. I’m sure that all of us in the SEO Community feel the same way. :)

Adam Senour March 29, 2006 at 8:18 am

Ahh.. that’s why so many people say bad things about porn… expired domains. Here I was thinking it was some sort of morals issue.

Silly rabbit. What would make you think that?

While we’re at it, the bad reputation associated with porn has nothing to do with the underage actresses, the B-grade production values, or the lack of frills such as plot and content.

Adam Senour March 29, 2006 at 8:55 am

By the way, Matt, I thought of another question (not counting this one) that would relate to a lot of people.

Should we keep asking questions now, or wait until the next grab bag thread?

Aaron Pratt March 29, 2006 at 8:57 am

Aquariums are extremely cool, my son goes nuts when we visit them and so do I! :)

Thanks Matt, a little vague on the RK thinger but I am sure PR obsessed Jim W. will figure that out, right Jim? ;)

Tonnie March 29, 2006 at 9:26 am

[Translated from Dutch:] The site uses third-party articles and places them, via short (RSS) feeds, on thousands of pages in order to be found for a variety of keywords.

Matt, thanks for your time to answer all those questions.

I have been a critic and will remain one, please don’t shoot me for that. :)

Regarding a question about RSS feeds that seem to have become more valuable and are even placed above regular and content-rich sites, I would like to ask a supplementary question:

More and more, RSS is being used to post feeds on other websites containing articles that were scraped off other sites and placed on the website that is sending the feeds.

I see websites shooting up in the rankings for specific keywords that were nowhere before they scraped other sites and posted them in unbelievable quantities of RSS feeds.

I don’t think this is an ethical way to promote a website but don’t see any action taken by Google.

Could you comment on this one?

Thanks in advance,

Tonnie

Tonnie March 29, 2006 at 9:27 am

Sorry for the Dutch that slipped in. :)

nsusa March 29, 2006 at 9:29 am

Using PR as a factor in deciding how deeply a website is crawled and indexed, and saying that in public, just forces everyone to continue the quest for artificially getting incoming high PR backlinks.

Not every high quality website receives many backlinks to get a high PR. Nor should a webmaster be forced to ask for backlinks or buy links to increase Page Rank. Isn’t there a better way for you guys to determine how deep a website will be crawled and indexed?

Anyway – thanks for taking the time to answer so many questions.

Christoph

Nikos Kapsomenakis March 29, 2006 at 9:29 am

Hi Matt

When are you going to make Google handle non English languages better ? For example, to make Google stop considering the usual stop words (in Greek) and the plurals as completely different words?

By the way, keep on providing us with useful information.

Gary R. Hess March 29, 2006 at 9:52 am

Lots of good information Matt, however, I hope the Wikipedia problem with the + and ” ” is fixed soon.

Nick March 29, 2006 at 9:54 am

Thank you Matt,

I hate those directory sites and your post really makes it feel like someone at Google is listening. Your time on these posts is very much appreciated. It’s like we’re not flying blind anymore.

Aaron Pratt March 29, 2006 at 10:12 am

Phrase: Dog Gift Basket
Position #1: Directory
Position #2: Same Directory
Position #3: Regular Site
Position #4: Affiliate of above “regular site” Boo!!!

See the trend? If you are below those and have a real product you lose, those above sites appear to be very official and suck in the limited business.

There is some truth to this pet conspiracy and seeing “SEO’s” in forums with their signature links to their gift basket affiliates makes me sick! Weee, let’s all sell the same freakin’ product, lame!

Damn but here I go again, did Matt have the patience to read my rambling? It’s got some mad truth in it dudes.

Word!

Hawaii SEO March 29, 2006 at 10:36 am

Thanks Matt,

I still don’t know why you fight Spam with one hand and monetize it with the other. Do the sales people even speak to the quality control people? It almost seems like the Spam fighting is being funded by the profits you earn from the Spam. This is why I don’t believe Google is serious about the problem.

Why not fight Spam by not monetizing it in the first place?

It seems simple enough to me.

• Have a person look at the websites before you allow them to place Google ads.
• Make reporting Spam easy and actively encourage it.
• Have a person with a brain evaluate the Spam reports.
• Remove offending websites by hand if necessary.
• Impose legal and financial penalties for people who violate the terms and conditions
• Delay payments by 30 days or so like some of the major affiliate programs. If a Spammer is caught you can deny up to 30 days of his Spam revenue.
• Prevent the spammer from ever signing up again by blacklisting his SS# and Tax ID# for the rest of the guy’s life

Or do I just not understand the problem?

Aaron Pratt March 29, 2006 at 10:55 am

“Hawaii SEO” sounds so hot and sexy, anyone got that URL yet? Nope, you better grab it dude! :)

Glenn March 29, 2006 at 11:18 am

Hi Matt,

That was a fantastic set of Q&A. Bravo.

Matt Cutts March 29, 2006 at 12:55 pm

BTW, I talked to someone from Sitemaps, and they’re working on a more descriptive user-agent for the next release.

Also, welcome Memeorandum-ers. :)

Armen Shagmirian March 29, 2006 at 12:59 pm

Hi Matt,

I was curious what Google was planning on doing about these ‘link vaultage’ sites. These link vaultage sites basically require you to run some code on specific pages on your site that render static text links on your page. These links look like natural text links so there’s nothing that really gives it away that they’re automated through these link vaultage sites. In return, other sites enlisted into the same program post text links pointing to your site. They’re not reciprocal links, but just a very quick way to get many high PR inbound links right away.

This is obviously unfair to those webmasters that actually email other webmasters to be added to their link page, or use some other more manual process of adding a text link on a specific web page. How is Google planning on leveling the playing field? One of my competitors is obviously enlisted in this program because they have over 3,000 backlinks, all from pages that currently do not even have a text link pointing to their site.

As a webmaster and business owner, I’m worried about this. I cannot compete at the same level because they’re cheating… take a look at their backlinks to see what I’m talking about: hookah bzz with a u.

I know other webmasters would be interested to hear what will happen in the future to sites who ‘cheat’ by using a link vaultage service.

Michael Martinez March 29, 2006 at 2:44 pm

Matt,

A number of people have expressed some concerns about your comment: “Yes, if you sell links, you should mark them with the nofollow tag. Not doing so can affect your reputation in Google.”

It seems you’ve caused a growing panic among many people who pay Yahoo! to link to their sites from the Yahoo! directory. Can you elaborate somewhat on Google’s paid link policy?

Thanks.

Reply

PhilC March 29, 2006 at 3:37 pm

Well done on the Q&A thread Matt. It’s been very good.

In one of your answers, you suggested mentioning ideas here (in this blog), but there isn’t really a place for them. On a couple of occasions, I’ve jumped into threads that were vaguely on-topic, and both were successful, but there must be many times when people want to get something across, or to make a comment, but there isn’t a suitable thread, or the threads that may be suitable are so old that you are unlikely to read new posts in them.

So how about starting a “comments” thread that you will dip into regularly – similar to this one but perhaps more continuous?

Reply

Vick March 29, 2006 at 3:55 pm

Hi Matt,

Do you know if this DC was converted to BigDaddy?

http://64.233.187.104
http://64.233.187.99

Results on those two are very different from the others.

Can you please comment?

Thank you,

Vick

Reply

PhilC March 29, 2006 at 3:59 pm

We’ve known for years that Google’s spidering has a lot to do with pages’ PageRanks, but why? Fair enough if it’s only the frequency of spidering that PageRanks affect, but in one of your answers, you said that it affects the depth of spidering. Why?

Doesn’t Google want to index pages? If not, why not? It’s common sense that sites with lower PageRanks are every bit as useful as those with higher PageRanks. There are many types of site that other sites don’t naturally link to very much, so why keep their inner pages out? Why encourage them to go on link-building campaigns, knowing that they are pretty much forced to do it in unnatural ways – ways that you don’t want?

Sorry, Matt, but I don’t see the sense in it. Either Google wants quality pages, or they don’t. Just because some of them are deeper than others doesn’t mean that they aren’t good quality. Imo, PageRank shouldn’t have any bearing on whether or not a page is spidered.

Reply

Dave March 29, 2006 at 5:35 pm

errr, I don’t this is another grab bag post already!

I feel for you Matt, even when you answer umpteen questions it only serves for some to ask even more.

Reply

Adam Senour March 29, 2006 at 8:25 pm

One too-late question, but one that still bears asking, since it bears a direct effect on the future of the universe:

Did you go to San Fran specifically to beat up the cast of Full House? I don’t think you could take Uncle Jesse, but I think you could smoke Danny Tanner straight up.

Reply

Peter T Davis March 29, 2006 at 8:43 pm

Cheers for answering my question Matt. Whenever you’re ready to take a break from the SEOs, and walk among the regular webmasters, you can find loads of them on Sitepoint’s forums. ;)

Reply

Harith March 29, 2006 at 9:03 pm

Good morning Matt

It seems your cat Emmy is very famous among the folks at WMW :-)

Do you care to share with us a picture of Emmy girl?

Have a great day.

Reply

bobmutch March 29, 2006 at 9:04 pm

Matt, there appears to have been a PR export started on or around Feb 18, which was in the middle of the BigDaddy upgrade that started Jan 4th.

Many DCs have been showing different PR. Was there a PR export on Feb 18 during the BigDaddy update? Or was the last PR export on Dec 19th?

I just want to get this list right.
http://www.seocompany.ca/pagerank/page-rank-update-list.html

Thanks!

Reply

Jonathan March 29, 2006 at 10:35 pm

Great Q/A session Matt, really helpful. Can you give any insight into the new design of the Google homepage? I noticed it a couple of weeks ago and haven’t heard much more about it. Would love some more details on why it popped up, whether I’m on a beta testing account, or anything more about it. Thanks Matt!

Reply

adam March 29, 2006 at 11:13 pm

Great post, you should do Q&A stuff weekly. Posts like this are exactly why I keep visiting your blog.

Reply

Jakob Boyer-Dræby March 30, 2006 at 12:38 am

Hey Matt

You should consider making this Q&A into a tradition. Would certainly bring me back each month, to read your answers, even if they are about your cat.

By the way, where does the name “Big daddy” come from, are you “Big daddy” :) ??

Cheers
Jakob Dræby

Reply

J March 30, 2006 at 3:58 am

Hi Matt,
I’m very happy to have found your blog, and thanks for taking the time to do this.

Question:
Are you able to give some guidelines on what the difference is between a site with PR6 and one with PR10? Our company runs several public access news websites which have PR5-6 on the top level pages (like http://www.politics.co.uk and http://www.inthenews.co.uk). What things make http://www.bbc.co.uk/news better/higher value?

Reply

ELIAS and Google KAI March 30, 2006 at 6:01 am

Best Blog Post Ever, Thanks MATT.

Can we know if any of those parameters will affect the new Google (Web 2.0)? Just wondering.
Thanks.

Some weird results are still appearing:
Query Google Swedish http://www.google.com/search?hl=en&q=google+swedish&spell=1

SERP: http://www.google.com/intl/xx-bork

And concerning PRs: yes, many unusual values are showing up on individual sites’ pages. Some main index pages are getting 0 while the rest of the site’s pages get PR 6, 7 or 8.

Reply

Nate March 30, 2006 at 6:36 am

“Yes, if you sell links, you should mark them with the nofollow tag.”

As Michael pointed out above, Yahoo doesn’t.

Reply

Nedguy March 30, 2006 at 6:41 am

Matt,

I’m another who’d love just a little more background on the rel=”nofollow” issue.

How do you recognise a paid-for link?

Presumably, if you are simply looking for sites with a lot of links and a paypal account (eg directories) there is still no way for Google to distinguish between ‘pay-for-inclusion’ and ‘pay-for-review’?

So a genuine directory site full of human edited links that are published precisely and specifically because they are worthy sites of benefit to visitors (just the sort of “with-juice” vote that Google needs for indexing) also needs to wear the Google safe-linking condom to avoid being penalised?
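
For anyone unsure what the markup in question looks like in practice: the attribute goes on the individual link, as in `<a href="..." rel="nofollow">`. Below is a minimal sketch, using only Python's standard library, of how a site owner might audit a page to see which outbound links already carry it. The class name, function name, and sample URLs are made up for illustration, and nothing here reflects how Google itself detects or treats paid links.

```python
from html.parser import HTMLParser


class LinkAuditor(HTMLParser):
    """Collect every <a href> and record whether it carries rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (href, is_nofollow) tuples

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel is a space-separated token list; compare case-insensitively.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def audit_links(html):
    """Return (href, is_nofollow) for each link found in the HTML string."""
    auditor = LinkAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.links


# Hypothetical page fragment: one sold link, one editorial link.
sample = (
    '<p><a href="http://example.com/sponsor" rel="nofollow">Sponsor</a> '
    '<a href="http://example.org/review">Editorial pick</a></p>'
)
for href, is_nofollow in audit_links(sample):
    print(href, "nofollow" if is_nofollow else "followed")
```

The audit only reports what is on the page; deciding which links count as paid is still a judgment the site owner has to make.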

Reply

Keith Ort March 30, 2006 at 7:53 am

Q: “If one were to offer to sell space on their site (or consider purchasing it on another), would it be a good idea to offer to add a NOFOLLOW tag so as to generate the traffic from the advertisement, but not have the appearance of artificial PR manipulation through purchasing of links?”
A: Yes, if you sell links, you should mark them with the nofollow tag. Not doing so can affect your reputation in Google.

this answer raised my eyebrows. i’ve never heard of “reputation in Google”. could you please expand upon this.

otherwise, this was an excellent post with some true merit. it might be seen as, *gasp*, quality content! :shocked: :D

Reply

fastballkid22 March 30, 2006 at 8:40 am

Matt,
I am trying to wrap my hands around this whole BigDaddy thing and I don’t understand why my site still has different rankings for different datacenters. Of the three IPs listed in February’s Q&A, my site comes up first for 66.249.93.104 and 216.239.51.104, but it isn’t listed at all at 64.233.179.104. Why would this be?

Thanks…

Reply

Sarah March 30, 2006 at 8:53 am

THANK YOU Matt!

As soon as we saw your answer yesterday regarding supplemental URLs and robots.txt, we updated the robots.txt and crossed our fingers :)

We really appreciate your being there.

Reply

Serge March 30, 2006 at 9:04 am

Hello Matt,

First of all thank you for your answers and I really appreciate your communication with people like me.

I’ve got a very strange situation. Some of my affiliates set up redirects to my website, and Google shows their pages higher in the search results than my website’s own page (and sometimes my website does not appear in the search results at all). I’ve sent a few examples to your email (I hope you will read them).

Since I changed my domain name from olddomain.com to newdomain.com (about 3 weeks ago, at Microsoft’s request), I’m still getting results in Google with my old domain name! I’ve created a 301 redirect to the new one.

Would you please let me know whether this is a result of the Big Daddy update and I should wait for the next update, or whether there is any chance of things getting back to “normal” in the next days/weeks?

Also, would you please let me know how long I should expect it to take for the new domain to return to the “normal” positions in the Google index that the old one had? Have there been any changes in this process after the Big Daddy update?

Google nowadays sometimes works like Altavista in 1998… It’s really upsetting :(

Thank you
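
For reference, a 301 is simply an HTTP response with status 301 and a `Location` header pointing at the new address. The sketch below, using Python's standard library, shows the idea of sending every path on the old domain to the same path on the new one. The host name is the placeholder from the comment above, the port and helper names are assumptions, and a real site would more likely configure this in its web server than in a script.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder taken from the comment; substitute the real new domain.
NEW_ORIGIN = "http://newdomain.com"


def redirect_target(path):
    """Map a request path on the old domain to the same path on the new one."""
    if not path.startswith("/"):
        path = "/" + path
    return NEW_ORIGIN + path


class PermanentRedirect(BaseHTTPRequestHandler):
    """Answer every GET or HEAD request with a 301 to the new domain."""

    def do_GET(self):
        self.send_response(301)  # 301 = Moved Permanently
        self.send_header("Location", redirect_target(self.path))
        self.end_headers()

    do_HEAD = do_GET


def serve(port=8080):
    """Run the redirect server until interrupted (port is an assumption)."""
    HTTPServer(("", port), PermanentRedirect).serve_forever()


print(redirect_target("/products/page.html?id=7"))
```

Keeping the path and query string intact means each old URL points at its direct equivalent rather than at the new home page.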

Reply

Manoj March 30, 2006 at 11:32 am

Matt:
Referring to your comment above ” Yes, if you sell links, you should mark them with the nofollow tag. Not doing so can affect your reputation in Google” — my two cents: Selling text links on a site and getting penalized (… affected reputation..) shouldn’t be related at all. For example, I could have a great content site, on which I offer advertising space (for a variety of reasons, including my UI preference, say I offer text link advertising). By putting a no follow for this content-rich page, wouldn’t search engine visitors be deprived of some valuable content that could/should be showing up on SERPs?
Manoj

Reply

kid disco March 30, 2006 at 2:59 pm

Hey Matt!

I am excited to see that you are going to be at PubCon – Boston! This must be a recent add? I look forward to meeting you there…

Reply

Schachin March 30, 2006 at 3:16 pm

OUCH Matt :) – I see my post is gone, but I’m very confused as to why, because I read your guidelines and was not asking questions about my site, but about Google’s spam policy, which I have seen others ask about too :( I posted a long list of links to give you an idea of how prevalent the issue is, but they were not my links….

The post was about why does Google allow link sites with no real content and that are not directories collect sponsored links and make money off them…was not sure if I was missing something about Google’s interpretation of spam..

“I know for a fact that there are literally hundreds maybe thousands of these sites run by these same two companies ..”

Well I do hope you answer because I think this is a real issue because these sites look relevant to the spider and yet are not and they often show up higher in search results than real sites do.. in addition they are earning a lot of money through this deceptive practice by the sheer numbers of sites they have on the net.

Hope this does not count as a double post.. thought maybe you just didn’t like my links ..

Thanks!
KS

Reply

Zee Mee March 30, 2006 at 8:58 pm

Very very useful thread… I will put this all info on my blog too….

thanks

Reply

Adam Senour March 30, 2006 at 9:00 pm

By putting a no follow for this content-rich page, wouldn’t search engine visitors be deprived of some valuable content that could/should be showing up on SERPs?

If it were that valuable and useful, then why wouldn’t you link to it for free? Seems to me a resource of that much weight should be shared with the world.

Reply

susan March 30, 2006 at 11:19 pm

Hi Matt, this truly was a first-rate interview, informative and consistent. I would like to see such highlights more often; thank you for it.

Kind Regards

Susan

Reply

Harith March 31, 2006 at 12:36 am

Hi Matt

Take a look at the bottom of this PR6 page. If you have time you may wish to click some of those hundreds of links too.

http://www.bizwiz.com/index_v09.html

Should such pages be reported to the Google WebSpam Team?

Thanks

Reply

chris March 31, 2006 at 2:05 am

I wouldn’t post my URL in the sig, even without reading the guidelines, but I, like other Romanian webmasters, have a problem:

One of my sites was on top for a few of its keywords yesterday (29), and now it is gone for good (I cannot find it in the first 10 pages). Which made me wonder if the damn site has something wrong with it, so I checked it. It’s not a big deal, a personal blog, but a good one (maybe not as good as this one, thanks Matt!). Why are the Romanian datacenters showing completely different data?

Thanks, hope it’ll sort itself out… (damn image codes, I always mix those up…)

Reply

Bill Kelm March 31, 2006 at 4:44 am

Hey Matt, I didn’t know you were into cats! My wife and I have three Abyssinians. If you ever get a chance, check out their unique personality.

I was hoping you could attend Santa Clara University’s (Markkula Center for Applied Ethics) conference on “The Ethics and Politics of Search Engines” held on 2/27/06. Peter Norvig, Director of Research for Google, had some interesting things to say about the “randomization” of Google’s search results:
“Yeah, and we do use some randomization and experimentation in our results. So at any one time, we’re probably running dozens of different experiments where we’re trying out variations to see is this variation going to be better than the standard one? So you do see a lot of turn and mix, both because of our changes in the algorithms and also because of the changes in the Web. So the results that are number one today may be different than the results tomorrow for very subtle reasons having to do with both changes in the link structure of the Web and with the changes we’re experimenting with.”

Reply

christian March 31, 2006 at 9:02 am

hi matt,

Scenario: a .com site hosted in the US but in a non-English language.

Is the pointing (redirect?) of the country specific domain to the .com site enough to indicate to Googlebot that the .com site should be included in the country specific search?

cheers
viggen

Reply

Sobriety Online March 31, 2006 at 3:27 pm

Hi Matt,

I’ve learned a lot from you. Thank you!

What is the best way to make a page printer friendly for Google and SEO?
I currently use a media=”print” css with display: none for nav bars etc. I just don’t want it to look like I am hiding things.

Thanks again!

Reply

Joseph Hunkins March 31, 2006 at 5:04 pm

Thanks Matt – very extensive and helpful information as always.

Reply

Video Gratuite April 1, 2006 at 8:35 am

Too bad about the RK :s, we’ll have to wait for the Google dance now :s

Reply

BostonScott April 2, 2006 at 6:09 pm

So Matt,

When will one of the major business magazines smarten up and put you on the cover as the poster boy for quality, non-spam search results?

Reply

Andre April 3, 2006 at 2:11 am

Matt,

Just excellent! You should do this more often.

Reply

Paul April 4, 2006 at 3:34 am

I started linking with other sites a long while back. The majority of these links are non-themed. Should I leave them in place, or delete?

Reply

Wendy April 4, 2006 at 10:54 am

Speaking of redirects… because many people ‘mis-hear’ my URL, I bought the ‘mis-heard’ URL and pointed (or redirected) it to my site. Does Google see this as a problem?

Reply

Al April 4, 2006 at 5:52 pm

A serious implication to the storing of old data has emerged.

An Australian police blunder of retaining sensitive anti-terrorism contacts online, and the subsequent reporting of it by the Sydney Morning Herald this morning ( http://www.smh.com.au ), potentially involves the retention of old data by Google, possibly in its supplemental index. The article mentions that the data can be recovered on Google.

Although this is yet to be verified, it does demonstrate a serious problem: the site owner is not able to remove supplemental results, notwithstanding the original blunder and other issues.

Without wanting to be melo-dramatic I think the practice of storing old data online may have serious implications for some website owners who do not wish it to be there, or believe for good reason it should not be there.

Matt – is there a way to better manage this in conjunction with Google ?

Reply

Al April 4, 2006 at 5:55 pm
Bradly April 5, 2006 at 10:14 am

Heh, I live in Monterey and would definitely suggest seeing the aquarium :)

Matt, you can’t forget about the great white shark.
http://mbayaq.org/efc/efc_smm/smm_meetBrowser.asp?tf=12

They are the only aquarium in the world to have one on display :)

Reply

Nintendo April 5, 2006 at 6:44 pm

Anyone notice the PR update Matt requested is now occurring!! Looks like only new sites got the new PR!!

Reply

Toby Sodor April 10, 2006 at 5:44 pm

Hi Matt, sorry to leave this so late, but I wanted to clarify a point you made about getting sites indexed:

“One of the classic crawling strategies that Google has used is the amount of PageRank on your pages. So just because your site has been around for a couple years (or that you submit a sitemap), that doesn’t mean that we’ll automatically crawl every page on your site. In general, getting good quality links would probably help us know to crawl your site more deeply.”

Does this mean webmasters should have a linking strategy in place? To get organic links when people can’t find the pages in search engines in the first place IS rather difficult! I for one am loath to buy links or chase links as organic links are so much better, but it appears the only way to get properly indexed by Google?

Reply

Sobriety Online April 22, 2006 at 5:21 am

Matt, can you delete this post and my prior post (3/31/06 3:27 pm) on this thread? Ever since I posted here I have had nothing but bad luck with my web-site and google. Thank you.

Reply

Matt April 24, 2006 at 1:58 am

Thanks for clearing up the Googlebot issue, we were beginning to wonder why our site was being ignored by Google!

Reply

Rich Smith April 24, 2006 at 9:02 am

We are getting ready to move a very large site that we’ve been developing for some time into production. The client is a well established, national lending company. The site is content rich and large, approximately 250k pages. Because of their preexisting relationships in the industry, they have 4 or 5 strategic partners linking to them that generate another 50k backlinks. What do you believe is the best way to deploy this site without getting torched, and to minimize time in the sandbox? We are concerned about the strong search engine saturation and link popularity metrics even though they are ‘legitimate.’ Because lending is such a competitive space, do we need to worry about this? Should we move everything into production at once or in phases? Any thoughts?

Reply

Siluk May 3, 2006 at 7:29 pm

I used domain.com and http://www.domain.com
I started linking to domain.com
In January I checked
site:domain.com = about 130 000
In February I checked
site:domain.com = 80 000
March
site:domain.com = 50 000
April
site:domain.com = about 30 000
May
site:domain.com = about 790

I checked
link:domain.com = 0 !!!!
link: http://www.domain.com
site:www.domain.com = about 10 000

I do not know what has happened. Why did Google choose http://www.domain.com if my linking strategy was directed at domain.com?
Why were my site’s pages and links thrown out of the Google index?

My site does not have any outbound links.
Please help
best regards

Siluk
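
The numbers above are the classic picture of one site being seen under two host names. One common approach is to pick a single preferred host and make every internal link (and redirect) use it. The Python sketch below shows the idea; the preferred host, the alias set, and the function name are all assumptions for illustration, not something Google prescribes.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed choice: the bare domain is preferred, since the links point there.
PREFERRED_HOST = "domain.com"
HOST_ALIASES = {"domain.com", "www.domain.com"}


def canonical_url(url, preferred=PREFERRED_HOST, aliases=HOST_ALIASES):
    """Rewrite a URL so that every known alias uses the preferred host."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in aliases:
        return url  # not one of our hosts; leave it untouched
    netloc = preferred
    if parts.port:
        netloc += f":{parts.port}"
    # An empty path and "/" are the same page, so normalise to "/".
    path = parts.path or "/"
    return urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))


for url in (
    "http://www.domain.com/page.html?x=1",
    "http://domain.com",
    "http://other-site.com/page.html",
):
    print(canonical_url(url))
```

Pairing this with a 301 from the non-preferred host to the preferred one keeps visitors and crawlers on a single version of each page.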

Reply

Gerald May 6, 2006 at 8:06 am

Matt,
Why is it that new products or new developments invariably show up in Google competitors’ search engines first? If I’m dealing with a new product, I invariably have to search a competing search engine in order to find it.
Best regards,
Gerald

Reply

Sima September 22, 2006 at 10:42 pm

Does Googlebot crawl .pdf, .swf, etc. files?

Reply

Bill November 10, 2006 at 7:39 am

Hi Matt,

Thank you. Thank you. Thank you.

I broke a couple redesign rules: not installing a proper 301 (initially) and changing too many page urls (page content remained the same – change was file extension). Did my pages lose their credibility?

Give Emmy a scratch under the chin for me.

See ya,
Bill

Reply

Sterling Silver Jewelry December 1, 2006 at 1:34 pm

Thanks for the info. A lot of great answers!

Reply

Scented Candles December 1, 2006 at 1:35 pm

Thanks for the clarification, Matt!

Reply

Xin Chern December 11, 2006 at 1:13 pm

This post is really useful. It helps me a lot.
I have a question:
recently I bought a few expired domains with PR. Will they lose it on the next update, or would Google just treat them as new domains? Thanks!!!

Reply

Decorative Pillows February 27, 2007 at 6:36 am

Does Google crawl database-generated pages with session IDs?

Reply

rezepte March 10, 2007 at 1:10 pm

Hi Matt,

can you tell us something about duplicate content? I publish very good recipes on my site http://www.besterezeptesuche.de, and I also give some recipes to yahoo.de, web.de and gmx.de for their health or lifestyle sections. On our partners’ sites, every recipe includes a link with a copyright notice and “Quelle” (source).

Is this a problem for us (bad listings in the Google index), and does Google check that the content is from us?

Thank you for your answer, if possible.

thomas

Reply

Justin May 21, 2007 at 2:04 pm

Does Google read the style sheets on a webpage? I have heard that it can help my website placement if I have H1 and H2 tags. I have created custom H1 and H2 tags for the purpose of making them look better. Does anyone know if Google still regards this as an “H1” tag?

Reply

Jetski June 8, 2007 at 12:45 pm

Thanks for the clarification, Matt!

Reply

Cara - Ireland August 21, 2007 at 2:04 am

Hi Matt,
We have a client who has one site for domestic visitors and another for international visitors. If a user originates from outside of Ireland according to their IP address, a layer will roll out before the homepage loads. A pop-up will allow the visitor to select their home country and click on a link to the relevant website. Do you have any idea what the SEO implications of this are?

Reply

safiya September 4, 2007 at 3:42 am

Hi. How do I pre-pay for AdWords from Nairobi, Kenya? I have no credit card.

Reply

Outdoor Dating January 7, 2008 at 11:54 am

Hi There,

We launched a dating site 45 days ago and we are trying to get the best exposure on the search engines. What I am still observing is that even though we have links, no PR has been given to us yet (we have a few PR4-5 sites linking to us) and we get very little traffic from generic searches. Since our site is for dating, we have user-generated content in the form of profile pages. The question I have is: should we add all of the profile pages to the sitemap and allow them to be crawled by spiders? Do you believe this might help with the ranking? Also, do you recommend linking with higher PR sites, or with more but lower PR sites? Using Google AdWords shows that almost every ‘spam ad site’ lists us, but they have no PR at all and I doubt that the users from these sites actually convert into signups.

Thanks a lot.

Reply

Aaron Newton September 9, 2008 at 12:25 am

Quote: See the trend? If you are below those and have a real product you lose, those above sites appear to be very official and suck in the limited business.

I see a lot of this, i.e. where one vendor buys out all the ranking directory listings and gets affiliate links indexed. I found a really extreme example of this the other day. Try [tutorial builder]. Every single link on the front page is a freeware software site listing the same product.

Reply

Bob Smith January 9, 2010 at 8:18 am

Very useful Q&As, thanks.

Reply

Shane Col March 18, 2010 at 4:05 am

Question and answer in comments is really helpful. I learned things instantly in just minutes of scanning comments from sites or personal pages from people like Matt. Kudos!

Reply
