SEO advice: url canonicalization

(I got my power back!)

Before I start collecting feedback on the Bigdaddy data center, I want to talk a little bit about canonicalization, www vs. non-www, redirects, duplicate urls, 302 “hijacking,” etc. so that we’re all on the same page.

Q: What is a canonical url? Do you have to use such a weird word, anyway?
A: Sorry that it’s a strange word; that’s what we call it around Google. Canonicalization is the process of picking the best url when there are several choices, and it usually refers to home pages. For example, most people would consider these the same urls:

  • www.example.com
  • example.com/
  • www.example.com/index.html
  • example.com/home.asp

But technically all of these urls are different. A web server could return completely different content for all the urls above. When Google “canonicalizes” a url, we try to pick the url that seems like the best representative from that set.
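To make "technically different" concrete, here is a minimal Python sketch. It assumes an http:// scheme on each url from the list above, and the helper name request_target is made up for illustration. It only shows what a client would ask a server for; it says nothing about how Google actually picks a canonical url.

```python
from urllib.parse import urlsplit

# The four urls from the list above, with an assumed http:// scheme added.
CANDIDATES = [
    "http://www.example.com",
    "http://example.com/",
    "http://www.example.com/index.html",
    "http://example.com/home.asp",
]


def request_target(url):
    """Return the (host, path) pair a client would actually request."""
    parts = urlsplit(url)
    # An empty path is sent as "/" in the HTTP request line.
    return parts.hostname, parts.path or "/"


# Each url maps to a different (host, path) pair, so a server is free
# to answer each one with completely different content.
targets = {request_target(u) for u in CANDIDATES}
```

All four urls produce distinct (host, path) pairs, which is why a crawler cannot simply assume they are the same page.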

Q: So how do I make sure that Google picks the url that I want?
A: One thing that helps is to pick the url that you want and use that url consistently across your entire site. For example, don’t make half of your links go to http://example.com/ and the other half go to http://www.example.com/ . Instead, pick the url you prefer and always use that format for your internal links.

Q: Is there anything else I can do?
A: Yes. Suppose you want your default url to be http://www.example.com/ . You can configure your webserver so that if someone requests http://example.com/, it does a 301 (permanent) redirect to http://www.example.com/ . That helps Google know which url you prefer to be canonical. Adding a 301 redirect can be an especially good idea if your site changes often (e.g. dynamic content, a blog, etc.).
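The redirect decision itself is small. Here is a rough Python sketch of the logic, not a real server configuration: PREFERRED_HOST, ALTERNATE_HOSTS and canonical_redirect are illustrative names, and ports are ignored for simplicity. In practice you would normally express the same rule in your webserver's own configuration.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.example.com"  # assumption: the host you chose as canonical
ALTERNATE_HOSTS = {"example.com"}   # assumption: other hostnames serving the same site


def canonical_redirect(url):
    """Return (301, new_url) if the request should be redirected, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in ALTERNATE_HOSTS:
        # Keep the path and query string; only the hostname changes.
        target = urlunsplit(
            (parts.scheme, PREFERRED_HOST, parts.path or "/", parts.query, "")
        )
        return 301, target
    return None
```

Note that the path and query are carried over, so a request for a deep page on the non-www host lands on the same deep page on the www host rather than on the home page.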

Q: If I want to get rid of domain.com but keep www.domain.com, should I use the url removal tool to remove domain.com?
A: No, definitely don’t do this. If you remove one of the www vs. non-www hostnames, it can end up removing your whole domain for six months. Definitely don’t do this. If you did use the url removal tool to remove your entire domain when you actually only wanted to remove the www or non-www version of your domain, do a reinclusion request and mention that you removed your entire domain by accident using the url removal tool and that you’d like it reincluded.

Q: I noticed that you don’t do a 301 redirect on your site from the non-www to the www version, Matt. Why not? Are you stupid in the head?
A: Actually, it’s on purpose. I noticed that several months ago but decided not to change it on my end or ask anyone at Google to fix it. I may add a 301 eventually, but for now it’s a helpful test case.

Q: So when you say www vs. non-www, you’re talking about a type of canonicalization. Are there other ways that urls get canonicalized?
A: Yes, there can be a lot, but most people never notice (or need to notice) them. Search engines can do things like keeping or removing trailing slashes, trying to convert urls with upper case to lower case, or removing session IDs from bulletin board or other software (many bulletin board software packages will work fine if you omit the session ID).
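As a rough illustration of those kinds of normalization, here is a Python sketch. The session parameter names in SESSION_PARAMS are guesses, and the function is not a description of what any search engine really does. It lower-cases only the scheme and hostname, because paths can be case-sensitive on some servers.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: query parameter names that commonly carry session IDs.
SESSION_PARAMS = {"sid", "phpsessid", "sessionid"}


def normalize(url):
    """Apply a few simple normalizations to a url."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Treat an empty path as "/", i.e. add the trailing slash on the root.
    path = parts.path or "/"
    # Drop session-ID parameters, keep everything else in order.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(kept), ""))
```

With this sketch, a forum url with and without its session ID normalizes to the same string, as do upper-case and lower-case spellings of the hostname.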

Q: Let’s talk about the inurl: operator. Why does everyone think that if inurl:mydomain.com shows results that aren’t from mydomain.com, it must be hijacked?
A: Many months ago, if you saw someresult.com/search2.php?url=mydomain.com, that would sometimes have content from mydomain. That could happen when the someresult.com url was a 302 redirect to mydomain.com and we decided to show a result from someresult.com. Since then, we’ve changed our heuristics to make showing the source url for 302 redirects much more rare. We are moving to a framework for handling redirects in which we will almost always show the destination url. Yahoo handles 302 redirects by usually showing the destination url, and we are in the middle of transitioning to a similar set of heuristics. Note that Yahoo reserves the right to have exceptions on redirect handling, and Google does too. Based on our analysis, we will show the source url for a 302 redirect less than half a percent of the time (basically, when we have strong reason to think the source url is correct).

Q: Okay, how about supplemental results. Do supplemental results cause a penalty in Google?
A: Nope.

Q: I have some pages in the supplemental results that are old now. What should I do?
A: I wouldn’t spend much effort on them. If the pages have moved, I would make sure that there’s a 301 redirect to the new location of pages. If the pages are truly gone, I’d make sure that you serve a 404 on those pages. After that, I wouldn’t put any more effort in. When Google eventually recrawls those pages, it will pick up the changes, but because it can take longer for us to crawl supplemental results, you might not see that update for a while.
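The moved-versus-gone rule can be sketched in a few lines of Python. The paths and the MOVED and LIVE tables below are hypothetical; a real site would get this from its server configuration or CMS.

```python
# Hypothetical site layout: one page has moved, everything else not
# listed in LIVE is treated as truly gone.
MOVED = {"/old-page.html": "/new-page.html"}
LIVE = {"/", "/new-page.html"}


def respond(path):
    """Return (status, location) for a requested path."""
    if path in MOVED:
        # The page has a new home: permanent redirect to it.
        return 301, MOVED[path]
    if path in LIVE:
        return 200, None
    # The page is gone and has no replacement.
    return 404, None
```

The point of the sketch is only that a moved page and a deleted page should give different answers, so a crawler can tell them apart when it eventually comes back.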

That’s about all I can think of for now. I’ll try to talk about some examples of 302’s and inurl: soon, to help make some of this more concrete.

177 Comments »

  1. Harith Said,

    January 4, 2006 @ 9:20 am

    Hi Matt

    Very informative, educational post!

    Power to you, Matt :-)

    So when do you expect the call for BigDaddy? A week from now?

  2. VJ Said,

    January 4, 2006 @ 9:25 am

    Hello Matt,

    Thanks for all the info. It is great to finally get this info from a ‘real’ source.

    My question, however, is: would it be OK for a page that no longer exists to be 301 redirected to the home page or site map? Or would this create problems?

    Thanks, and love the blog.

    V

  3. Nick Said,

    January 4, 2006 @ 9:27 am

    Hi Matt,

    Great article. I’ve been doing the 301 redirects to the full URL for a while now. If nothing else, it gives the site a much more professional feel. All the major sites do it and I don’t get why it’s not a standard on server configurations. Most seem to default to the multiple URL versions.

    Happy new year.

  4. Tim Linden Said,

    January 4, 2006 @ 9:27 am

    Thanks for the information! Subscribing to this blog has given me a lot of information and I haven’t been subscribed long ;-)

  5. Tony Hill Said,

    January 4, 2006 @ 9:30 am

    Hi Matt,

    Quick clarification… quote..

    “Suppose you want your default url to be http://www.example.com/ . You can make your webserver so that if someone requests http://example.com, it does a 301 (permanent) redirect to http://www.example.com”

    I noticed that when you say to do a 301 redirect you didn’t include a trailing slash. Is there a distinct difference between http://www.example.com and http://www.example.com/ (with the trailing slash) ?

    Would it be worthwhile to make sure that any links within our site that point to the homepage are consistent in using or not using the slash?

    Thanks

  6. Smoke2Much Said,

    January 4, 2006 @ 9:31 am

    “So how do I make sure that Google picks the url that I want?”

    Ok, so basically we tell the crawler which url we want to use by internally linking it consistently throughout the site? Would doing the same thing for external inbound links also help? Seems like it’s still a little out of our control. Can’t this be set in the sitemap? That would seem logical to me.

  7. zeus Said,

    January 4, 2006 @ 9:34 am

    That was some good info for the main webmaster, but what about those of us who are in trouble because Google could not handle 302 links? We now see on the test server that www.domain.com is back, caches from old 302 links are gone, and many pages have been spidered by the Mozilla bot. Can we expect to rank again when the real update starts?

    Thanks for your time

  8. Anh Said,

    January 4, 2006 @ 9:34 am

    Matt,
    Would you please give us some reasons which cause the URL-only listing? What should we do to correct this listing?

    Thanks

  9. Peter (Brane) Said,

    January 4, 2006 @ 9:35 am

    Hi Matt,

    Nice post, I am especially happy that you mention the supplemental results. What exactly are they? I can’t understand the purpose of them or how they benefit the searcher. They’re mostly a pain in the butt because they push other, often more valuable results, down.

    Thanks,

    Peter

  10. Dirson Said,

    January 4, 2006 @ 9:43 am

    Are we to wait so much for the next ‘BigDaddy’ update?

  11. Michael Martinez Said,

    January 4, 2006 @ 9:52 am

    Great post, Matt!

    As a programmer who has done a lot of text parsing, I’ve been lobbying SEOs and Webmasters for years to keep their URL formats consistent. I think that, coming from you, the message will be heard more loudly and clearly than before.

    This is the kind of stuff that makes your site so worthwhile to me. I know, you’re Inigo Montoya, and spammers all have six fingers, but this straight-forward, tell-it-like-it-is content is what I enjoy most.

  12. Nursing Bras Said,

    January 4, 2006 @ 10:02 am

    Hi Matt & Everyone,
    Just yesterday I’ve set my 301 redirect, but my question is about PR and links.
    If you have links that are supposed to bring you some PR pointing to, let’s say, http://domain.com instead of http://www.domain.com, would the links’ PR affect http://www.domain.com as well?

    If anyone knows the answer, please let me know. (I have that problem with an old site of mine)

  13. jestep Said,

    January 4, 2006 @ 10:14 am

    While I normally don’t comment, I think this is a great post that really gives the low-down on what webmasters should do. Anyone can speculate all they want, but it is great to hear this from a reputable source, and with all of the information all together.

  14. brewster Said,

    January 4, 2006 @ 10:20 am

    ..and he even puts to rest the definition of ‘Canonicalization’. Great information, Matt!

  15. german Said,

    January 4, 2006 @ 10:22 am

    Hi Matt,

    Nice explanation. Could you go on further with some case like mine:

    I have several sites. Each of the sites is hosting a different language version of the same site (you will find my main site from my e-mail address and click of any of the links of this placeholder site).
    Site 1 is my main site
    Site 2 is the site in another language.

    For some technical reason I have not yet been able to host site 2 at the same place as site 1 (and I do not plan on changing the host of site 1, since they have worked so well with us in the past). Some of the PHP pages, however, share the same database, so I asked the host of site 2 to 302 redirect to site 1 (thus visitors see the URL of site 2, yet the pages are served from a subdomain of site 1, because I want visitors to keep site 2 in mind, as I will remove the redirect as soon as I can).

    The funny thing is that because of the redirect, Google is not able to define the root of site 2. The site: command for site 2 is only showing the index page.
    All pages of site 2 appear as URL only under the site: command of site 1, although site 2 has its own independent directory.

    However,
    Your test DC shows URL only for the index page of site 2, fully indexed pages for site 2 UNDER the site: command of site 1.

    Does it mean I will only have the choice of losing either my index page or all of the other pages? Isn’t there a way for Google to determine that these subdirectories are independent from each other, just hosted together?

  16. Brian M Said,

    January 4, 2006 @ 10:25 am

    Hi Matt,

    This may be your best post ever!

    I only have one stupid question: Is it better to spell out “www.example.com/index.asp” everywhere within a site (and in googlemap.xml), or to just leave it blank (as in “www.example.com/”)?

    I can see that it makes a huge difference in the Page Rank of a “default.asp” page, so I am just curious…

    Thank you for this very informative post!

    Brian M

  17. SEOPod Said,

    January 4, 2006 @ 10:26 am

    As said, Matt, GREAT post. It is nice to hear this information from someone at the source. As far as your non-www issue I think the “contest” and the words of your friends are going to help ya there :-P

  18. Matt Said,

    January 4, 2006 @ 10:32 am

    Smoke2Much, that’s a good suggestion. It never hurts to ask external people to link to your site in your preferred manner as well.

    Anh, normally if you don’t see crawled snippets and only see urls, it usually means that it would help to have more links to your site. More PageRank helps to get more crawling. The other common explanation is that you’ve got a robots.txt which forbids crawling a url. We can see the references to the url, but the robots.txt doesn’t allow us to crawl the url, so we can only show the uncrawled url reference.

    Brian M, I would opt for http://www.example.com/ personally instead of http://www.example.com/index.asp. It’s easier for users to remember and to type in, so people linking naturally are more likely to link to the page without the index.asp.

  19. Matt Said,

    January 4, 2006 @ 10:36 am

    Tony Hill, great question about http://www.example.com vs. http://www.example.com/ . This is one of those cases where I’d pick a preferred format and stick with it uniformly. I would lean toward having the trailing slash, because that’s what most people expect. Maybe I’ll just duck in and update that in the post.. :)

  20. McMohan Said,

    January 4, 2006 @ 10:39 am

    Thanks for the post Matt. It will be very helpful if you have such Matt-Sessions once a month :-)

    About Canonicalization - What about https and http versions? I have a site that is indexed under https instead of http. I am sure this too is a form of canonicalization issue; how do you suggest we go about it?

    Thanks Matt, and wish you happy n prosperous Y2K6 :-)

  21. Rod Said,

    January 4, 2006 @ 10:52 am

    Hi Matt
    great post, many thanks. One of my older sites has a mix of www and non www internal links.
    The site is OK, and I would like to change to all www, but I dare not for fear of duplicate content problems etc. I am on a Windows server with no .htaccess.
    Would you advise changing the non www or leave well alone?
    Thanks in advance
    Regards
    Rod

  22. ummm.... Said,

    January 4, 2006 @ 11:18 am

    > It never hurts to ask external people to link to your site in your preferred manner as well.

    What if external people are linking in your non-preferred manner? On purpose? A lot?

    What safeguards exist for small sites?

  23. Scott Said,

    January 4, 2006 @ 11:37 am

    A few weeks ago one of our clients bought a new domain and wanted the traffic from the old domain to point to it. We set up a 301 redirect and the first domain completely dropped off the face of the earth in the search rankings within 24 hours.. (the first domain had a good page rank) - will the page rank for the old domain be redirected to the new one (it has basically the same content) in due time or will the SEO have to start again from scratch?

    I’m sure I’m not the only person who has experienced this, so it would be useful to find out more about it!

  24. George Said,

    January 4, 2006 @ 11:53 am

    Matt,

    Great and extremely informative write up once again. I guess in fact all of your posts about SEO will be informative to us because you know so much inside info.

    You’re doing a great service by showing webmasters how to “properly” SEO their sites and in doing so you likely contribute to the reduction of SE spam and make it a better place for everyone.

    I too am wondering about the https:// thing by the way.

  25. Matt Said,

    January 4, 2006 @ 11:55 am

    McMohan, Google can crawl https just fine, but I might lean toward doing a 301 redirect to the http version (assuming that e.g. the browser doesn’t support cookies, which Googlebot doesn’t).

  26. Supplemental Challenged Said,

    January 4, 2006 @ 1:35 pm

    “When Google eventually recrawls those pages, it will pick up the changes”

    This is still false, and it is hard to understand why you keep saying it Matt.

    Google does not obey 301s or 404s involving supplemental listings, even if those 301s and 404s are seen every single day.

    Apparently this is still the principal reality disconnect at Google. Printing this false information is not helpful. (The other stuff is very useful though.)

  27. jimbeetle Said,

    January 4, 2006 @ 2:07 pm

    Hey Matt,

    In all of the discussions on removing old pages from the supplementals everybody talks about serving a 404. Even you say, “If the pages are truly gone, I’d make sure that you serve a 404 on those pages.”

    This always confuses me. Why not serve a 410 so there can’t be any ambiguity as to whether the page is just ‘not found’ at the moment, or the webmaster is actually telling the bot, ‘Hey, this is gone, don’t look for it any longer’?

    Do some bots not handle a 410 correctly? Or am I (as usual) missing something simple stupid here?

  28. stuntdubl Said,

    January 4, 2006 @ 3:48 pm

    Very nice knowledgebase Matt. Thank you.

  29. Matt Said,

    January 4, 2006 @ 5:22 pm

    Supplemental Challenged, a better way to say it is “When Supplemental Googlebot recrawls those pages…” I believe it’s true that normal Googlebot crawling won’t affect supplemental results. Supplemental Googlebot can take (sometimes much) longer to get out and recrawl pages.

    jimbeetle, I’ll have to check to make sure that 410 (permanently gone) is handled correctly.

  30. Aaron Pratt Said,

    January 4, 2006 @ 7:28 pm

    You rock Matt!

    You answered this question perfectly for what I needed, I set 301 redirects just this week and was wondering if I had done the right thing. Looks like I have.

    Thank you

  31. Prestige Said,

    January 5, 2006 @ 12:34 am

    Matt,

    Is there something to do on cases when you’re getting backlinks to
    http://www.domain.com versus http://www.domain.com/ ?

    Does it have the same bad effect as domain.com versus http://www.domain.com?
    and if so, can I use an apache 301 redirect the same way for solving
    that problem?

  32. Krijn Hoetmer Said,

    January 5, 2006 @ 1:05 am

    I’m also very interested in the ‘410 Gone’ response and how Google handles that one. Normally it takes quite some time for Google to ‘forget’ about pages returning a 404. Perhaps with a 410 there’s no question about the gone-ness :)

  33. Alan Perkins Said,

    January 5, 2006 @ 5:14 am

    Matt

    You say “www.example.com” and “www.example.com/” are different.

    What would be the HTTP GET request for “www.example.com” and how would it differ from the HTTP GET request for “www.example.com/”?

    I have

    GET / HTTP/1.1
    Host: www.example.com

    for both.

    I agree that http://www.example.com/dir and http://www.example.com/dir/ are different, but I don’t think this applies at the root level of the domain.

  34. Aaron Pratt Said,

    January 5, 2006 @ 10:10 am

    This is all a little confusing actually.

    I changed my urls in my blog to be index.html, keyword1.html, keyword2.html and so on, BUT in blog format the categories end in http://www.domain/category/. Is this a bad thing?

    If you want to take a look click my URL in my blog. Anyone?

  35. Tim Said,

    January 5, 2006 @ 12:52 pm

    Very useful, I’m just in the market for a 301 because ‘someone’ decided to cache two versions of my homepage! No names….

    Surprisingly, my host told me they don’t support 301 redirects! So now I have to switch hosts too!

    My question however is related to some restructuring of my site, tidying up and putting files in folders etc.

    If I do not or am not able to implement 301’s for all of the moved pages, will these changes have an adverse effect? Will duplicate content flags start waving?

    Will Google eventually drop the old URL’s and only pay attention to the new ones and what will happen in the meantime?

    I am adopting 301’s as good practice from this moment forth but without them will sites suffer….?

  36. spike Said,

    January 5, 2006 @ 1:30 pm

    >> What if external people are linking in your non-preferred manner? On purpose? A lot?

  37. spike Said,

    January 5, 2006 @ 1:30 pm

    >> What if external people are linking in your non-preferred manner? On purpose? A lot?

    Adding the “base” tag to all pages of the site, and especially the index page, can help a lot. If you use absolute linking within your site (like “http://www.domain.com/folder/page.html”), or use relative links which count from the root (i.e. begin with a slash, like “/folder/folder/page.html”) then the URL in the “base” tag only needs to be “http://www.domain.com/” as in, the root domain.

    If you use relative links (like “folder/page.html” {NO leading “/” on URL} or like “../../../folder/page.html”) then the URL in the “base” tag MUST be the full URL of the page that it is on. The “base” tag can help canonicalisation even before you get a 301 redirect from non-www to www in place.

    Oh, and when I say “links” in this post, I mean URLs in clickable links, as well as paths to images, and any external CSS or JavaScript files too.

  38. David Said,

    January 5, 2006 @ 5:22 pm

    Thank you Matt.

    Great post. There aren’t many blogs where you feel smarter after you leave them… :)

    For some reason, my mind had put canonicalization into the cache date bin, but never researched much further… kinda like when you think a word means one thing until you start using it in public. You quickly realize it has the opposite meaning. Ever done that?

    Thank you for answering the question.

  39. Tim Said,

    January 6, 2006 @ 10:08 am

    In researching this to understand the necessity, I discovered my main competitor has both the www and non-www version of his homepage cached by Google. (Non-www version, PR0. www version PR4)

    He’s still ranking No.1 for a very popular generic search. As are many of his internal pages for other permutations.

    So I throw this into the melting pot and say - How big an issue is this…??

    I understand as a matter of cleanliness it’s good practice but as for its effect on rankings I’m now sceptical….

    I would not necessarily attribute a loss in rankings to this. There are other factors in play. This is not ‘the answer’. Just one of them…..

  40. Mary Johnson Said,

    January 6, 2006 @ 3:07 pm

    Some of the posts seem to be mixing apples and oranges.

    Please clarify whether it is important for links pointing to a site’s home page from outside to use the same canonical form as links pointing to it from inside the site. Example below.

    Matt, you obviously are talking about external links to a site when you replied “Brian M, I would opt for http://www.example.com/ personally instead of http://www.example.com/index.asp. ….”

    But what about internal links? Is it OK to use “index.asp” or “/index.asp” (and not just “/”) for relative addressing?

    For example, I prefer to use relative addressing for the ease of maintenance and testing locally. Because I use Dreamweaver, I don’t have the option to use “/” for the home page links. I must use either “/index.asp” or “index.asp”. Neither of these match the external link to the home page in the format http://www.example.com/.

    I need help in understanding if it is OK to use a different format for internal links vs external links for the home page. If the difference is minimal, then to me there is a much bigger benefit derived by still being able to use a tool that lets you test locally.

    I agree with Tim in questioning the true importance of all this in light of the big picture.

  41. Key_Master Said,

    January 11, 2006 @ 12:10 am

    Hi Matt, I think it’s great that the folks like yourself at Google take the time to work with the webmaster community.

    My question is, does Googlebot consider the HTML “base” element in web pages as part of the canonicalization process ?

  42. Brajeshwar Said,

    January 12, 2006 @ 10:31 pm

    Cool, this cleared up a doubt that had been troubling me for quite some time! Thanks

  43. vikasamrohi Said,

    January 13, 2006 @ 10:02 am

    Hi Matt,

    Thanks for all this, u looks dude in blue shirt :). Well Google Sitemap can help us in the caching of supplement pages.

    Thanks

    Vikas

  44. Tim McCormack Said,

    January 13, 2006 @ 11:06 am

    So, I’ve been thinking about what sites can do to provide canonical-URL hinting to search engines. I’m wondering if perhaps a might be an option. Obviously, since I just pulled it out of thin air, it wouldn’t do any good unless search engines began to look for it. I’ve written up more of my idea elsewhere to keep this comment short. What do you think?

  45. IceTea Said,

    January 16, 2006 @ 2:59 pm

    Dear Matt,

    I do not understand the problem.

    Some providers don’t support 301 redirects and so many people have no possibility to canonicalize (strange word :-) their url’s.

    I think it can’t be so difficult for a search engine to find out that the content is exactly the same with and without www and so the url’s are identical.

    It’s time to fix the problem.

  46. James Said,

    January 18, 2006 @ 1:42 am

    Matt - Scott posted the following question to you on 4 January, so far without reply:

    “A few weeks ago one of our clients bought a new domain and wanted the traffic from the old domain to point to it. We set up a 301 redirect and the first domain completely dropped off the face of the earth in the search rankings within 24 hours.. (the first domain had a good page rank) - will the page rank for the old domain be redirected to the new one (it has basically the same content) in due time or will the SEO have to start again from scratch?”

    We’re about to proceed down exactly the same path (i.e. changing domain as a result of a rebranding) and we’re keen to minimise the impact on our hard earned page ranking.

    Any words of wisdom that you can pass on?

  47. ProClub Said,

    January 18, 2006 @ 9:32 am

    Mary Johnson, why not use a local webserver? Apache, or if you have trouble setting it up, Xitami, for example? Both are free and really easy to use.

  48. ProClub Said,

    January 18, 2006 @ 9:33 am

    Matt, you really should do something about this security code thing. It doesn’t work without JS. And you exclude blind people as well…

  49. Adam Senour Said,

    January 19, 2006 @ 9:03 am

    Hi Matt,

    I found an example of a canonical URL issue.

    http://www.libreriapandevida.com
    libreriapandevida.com

    Two different PageRanks, same site.

    But in this case, I just redesigned this site and relaunched it Tuesday so that might have something to do with it (although I doubt it, I throw it out there as a possibility).

  50. Jonah Stein Said,

    January 19, 2006 @ 12:04 pm

    Matt:

    When we see two results from the same site (with the second one indented), isn’t that a c14n issue? online poster printing, http://64.233.179.104/search?q=online+poster+printing&ie=utf-8&oe=utf-8&num=10&hl=en&start=0, yields two results from the same site where the second one is the homepage.

    Shouldn’t we only see one result from each site or at least see the homepage suppressed in favor of the inside page which is presumably more exact/granular a match for the search.

    Thanks

  51. Marko Said,

    January 24, 2006 @ 7:20 am

    Thanks for these tips. I was thinking the same before but now you confirmed my thoughts and I am sure these are facts.

  52. rohit Said,

    January 31, 2006 @ 4:22 am

    dear Matt thanx for this wonderful information

    but can anyone help? My website was ranking in a good position in Google, but all of a sudden it disappeared from Google

    my website url is http://www.pramodmarutiparts.com

    can any body help me what to do

    regards
    rohit

  53. Clint Dixon Said,

    January 31, 2006 @ 5:18 am

    Hi Matt

    Wonderful post. So the spiders are now a pair that work in tandem??

    Would one be considered a quality control check of the sites links??

    Trying to clear up other peoples mistaken ideas and thoughts.

    Thank you.

  54. Karl Said,

    January 31, 2006 @ 6:29 am

    Hi Matt!

    Nice to hear about Google’s intentions, but there are more questions here than answers.

  55. Sherine Said,

    January 31, 2006 @ 11:15 am

    HI Matt.

    Always value your insight.
    I have http://example.com ranking PR5 with more in links and http://www.example.com ranking a PR4.

    I would prefer the www version. If I do a redirect, will the PR follow as well?

    Hope you can help clear this for me before I make the move.

    Thanks again

  56. Tea Lover Said,

    February 2, 2006 @ 7:28 am

    Thanks MATT, posts like these are helpful to business owners like me who tend to optimise their site on their own and avoid SEO companies. I am a learner in SEO, but am picking up.

    Matt is right about http://www.example.com and http://example.com
    Example site I have noticed is:
    http://teahistory.net and http://www.teahistory.net

    One question that may seem petty to you, but it’s burning in my mind: suppose I link my internal pages like “pages.html” instead of the complete url “http://www.example.com/pages.html”. Will this affect the results?

    The other thing is that linking requests come to my site and I never include a trailing slash and give them my URL as http://www.example.com and not http://www.example.com/ . If someone links to my site with http://www.example.com and the other links as http://www.example.com/ - will the two linkings be considered as linking to two different domains - just an info update that is not needed - its not important.

  57. Karl Said,

    February 6, 2006 @ 8:36 am

    maybe helpful or..

    think about words as circles in the theory of sets, and the momentary semantics of a word
    is the spot of the deposit

    lamp lamppa lampe lampada luminaire all the same circle and nearly same spot

    light Leuchte light fixture
    are next circle

  58. nayalbugs Said,

    February 7, 2006 @ 11:49 pm

    Very informative!!!! I am impressed… I am expecting a detailed article about the Bigdaddy update :-)

  59. Karl Said,

    February 8, 2006 @ 2:42 am

    And words have axes of covering

    For example look at

    http://www.google.com/search?hl=de&q=vesuv+ma%DFe&btnG=Suche&lr=

    This looks fine to me

    But all other may laugh or…

    Because maße
    Means dimensions or gages

    And this word/formula has an axis to the object of ALL
    By me it is to lamp

    In the search the object should be to Vesuv / Berg / mountain are next axis

  60. Karl Said,

    February 8, 2006 @ 2:59 am

    Another sorry but german is …

    and so the serp is a mix of errors

    There are Maße first more a spoken - dimensions

    and Masse scharper s and more end e means - bulk or compound

    http://dict.leo.org/ende?search=ma%DFe

    and say if it is enough!

  61. Karl Said,

    February 9, 2006 @ 5:24 am

    A search engine’s business is to recognize the pattern which the user is looking for and
    to give him reasonable terms.

    Therefore it must know about
    1. language specifics (see grammar)
    2. psychological affectations of humans ( in their cultures)
    3. your stats of searches (are the expression of 1+2)

    I can say you something about point 1

    To build up some little filters of context, you had to bring
    the grammar of the language to some short rules.

    Look at the system that works in You, from the eye to your consciousness.

    There are some arts of sensors in your eye.
    There are some filters before the signals comes into the brain, which
    works with the combined signals from the different sensors.

    So are you able to see first, what the first capital of importance for a biped is (motion).
    There are filters for edges; this is the first step of your search engine to build you the room you live in.

    Really in your mind comes only about 15% of all. If there were no compression, you could see less pictures per time and don’t act because of impulse flooding in your brain.

    And read about the strategies criminalists use for their searches.
    Who
    Where
    Why…..?
    Are axes of content between the words of any document and there are more axes like this
    Examples which can compress it.

    For my Site I can tell you a little rule you can sort them easy:

    The shorter the URL (measured not in letters or words), the higher it should stand;
    the more levels ( /../ ) it has, the further down it should sort.

  62. James Said,

    February 10, 2006 @ 5:33 pm

    11 Feb 2006.
    ————-
    Hi Matt, Hope you can help here…In your entry above it is written…
    Q: If I want to get rid of domain.com but keep http://www.domain.com, should I use the url removal tool to remove domain.com?

    You say NO, …So how SHOULD it be done? by a 301?
    I have had a 301 in place for several months now, and yet the non-www pages have NOT been removed, how long does it take? Is this the way to do it?

    and whilst on the 301 subject, is there a definitive way a 301 should be written?

    RewriteCond %{HTTP_HOST} ^domain\.com [NC]
    RewriteRule ^(.*)$ http://www.domain.com/$1 [L,R=301]

    OR

    RewriteCond %{HTTP_HOST} ^domain\.com
    RewriteRule ^(.*)$ http://www.domain.com/$1 [R=permanent,L]

    OR EVEN…

    RewriteCond %{HTTP_HOST} !^www\.yourdomain\.com
    RewriteCond %{SERVER_PORT} ^80
    RewriteRule (.*) http://www.yourdomain.com/$1 [R=301,L]

    which (if any) of the above are correct (seeing as all 3 have been recommended :-(

    Some assistance/clarification in this matter would be appreciated

    Regards
    James
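
    What the three rule sets above are trying to do can be sketched outside Apache. This is a minimal Python model, not mod_rewrite itself. It assumes www.domain.com is the preferred hostname and that each substitution ends in the $1 backreference (the captured path). The function name redirect_target is hypothetical.

```python
import re

# Assumption: www.domain.com is the preferred (canonical) hostname.
CANONICAL_HOST = "www.domain.com"


def redirect_target(host, path):
    """Return the Location for a 301 redirect, or None if the host is already canonical."""
    # Models the negated host condition used by the third rule set (case-insensitive).
    if re.fullmatch(r"www\.domain\.com", host, flags=re.IGNORECASE):
        return None
    # In .htaccess context, RewriteRule sees the path without its leading slash.
    relative = path[1:] if path.startswith("/") else path
    # Models the pattern ^(.*)$ with the capture appended to the canonical host.
    captured = re.match(r"^(.*)$", relative).group(1)
    return "http://" + CANONICAL_HOST + "/" + captured
```

    For example, redirect_target("domain.com", "/page.html") gives "http://www.domain.com/page.html", while a request that already uses the www host gives None, so no redirect is issued. As far as the flags go, R=301 and R=permanent name the same status code. The practical difference between the variants is the host condition: the first two match only the bare domain, while the negated form redirects every host that is not the www one.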

  63. sfisher Said,

    February 16, 2006 @ 3:03 am

    Matt,

    Would the following two be considered canonical URLs?
    (1) http://www.example.com/
    (2) http://www.example.com/?affid=123

    The 2nd URL is an affiliate link posted on the affiliate’s site. When a visitor comes to the example site, is it necessary that the affid be stripped off the 2nd URL and redirected to 1st URL?

    Thanks
    sfisher
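
    One way to picture the stripping step described above: a minimal Python sketch that removes the affiliate parameter and returns the URL a redirect would point to. The parameter name affid comes from the example; the set TRACKING_PARAMS and the function name strip_tracking are assumptions for illustration, and whether a site should redirect at all is a separate decision.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: these query parameters only track the referrer and do not change the page.
TRACKING_PARAMS = {"affid"}


def strip_tracking(url):
    """Return the URL with tracking parameters removed and everything else kept."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment))
```

    With this sketch, URL (2) maps to URL (1), and any other parameters are left in place. A site would normally record the affiliate id (for example in a cookie) before issuing the redirect.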

  64. Fred Said,

    February 24, 2006 @ 8:21 pm

    Matt,
    How critical is it to use the

    RewriteEngine On
    RewriteCond %{HTTP_HOST} … redirect to the www version?

    I was experimenting with going without it, so I do have both versions of my domain active, but I always use the www version for links. I have read some claims that other search engines such as Yahoo don’t like 301 redirects.

    I’m doing terribly on Google right now. I don’t suppose this www issue could be my problem? I know of other people whose sites are doing well on Google without it.

  65. Nirupam Said,

    March 6, 2006 @ 11:15 pm

    Recently I implemented a 301 permanent redirect from http://domain.com/ to http://www.domain.com/. All my internal pages except the home page are listed in Google without “www”. My internal pages are ranking high in the SERPs. After implementing the 301, will I see any changes in the SERPs or in PR, or will they remain the same? If there are changes, how long will it take Google to index all my pages with “www”?

  66. Colin Said,

    March 7, 2006 @ 2:13 pm

    Quality Guidelines - Basic Principles

    * Make pages for users, not for search engines.
    * … Does this help my Users? Would I do this if search engines didn’t exist?
    * … Webmasters who spend their energies upholding the spirit of the basic principles listed will provide a much better user experience and subsequently enjoy better ranking…

    Google are full of it! On the one hand they say produce your website for your users and not for search engines, and on the other they tell you that you must do certain things just for search engines. For example:

    * Allow search engines to crawl your site without session IDs or arguments that track their path through the site.
    * Don’t use “&ID” as a parameter.
    * Have other sites link to yours.
    * Fancy features [which enhance the user experience] such as JavaScript, cookies, session ID’s, frames, DHTML, or flash…search engines may have trouble crawling your site.

    …and now the latest gripe from the search engines, they can’t tell the difference between:

    example.com
    http://www.example.com
    example.com/index.htm
    http://www.example.com/index.htm

    ..and that is our fault, why?

    Google are full of it.

  67. Smallest Violin Said,

    March 8, 2006 @ 9:34 am

    Put in my vote that there should be a function in SiteMaps to let webmasters/siteowners choose which version they’d like to return in results (this would cut down on your db size considerably), assign all link value to one, etc. It would really help those that have a dmoz listing with non-www and, say, a business.com/yahoo listing with the www version. Maybe they already get the same link value as long as there is a link, but to be able to define it without going into htaccess or conf would be great!

  68. RG Said,

    March 10, 2006 @ 11:36 pm

    Google is behaving strangely to one of my sites’ PR…:(

    It is 2…3…5…6 ..keeps changing every hour!

    Looks like a problem w/ Canonical Issues..In the rankings, it has stopped showing “www” with the Url.

    Should I go for a 301 redirect from the non-www to the www version, or wait till the Google update is over? Any suggestions would be great!

    Thanks.

  69. Frady Rose Said,

    March 20, 2006 @ 8:56 am

    Thanks Matt, your information is very helpful. I was wondering if you can help. We have over 1,000 supplemental results due to a site redesign. We have worked very hard to make sure they all have the most relevant 301 redirects. So far Google has not removed the supplemental results. Is there anything I can do to speed up the process of having Google get rid of the supplemental results? Do supplemental results have any effect on rankings?

    Your help in this matter would be greatly appreciated.

  70. Tim Day Said,

    March 23, 2006 @ 4:26 am

    Canonicalization makes no difference. All this time people are chasing their tails worried about being penalized by the big ‘G’ for canonicalization issues.

    My major competitor has both versions of their URL cached by Google and yet they have been consistently ranking No.1 for ‘credit cards’ on Google.co.UK regardless.

    They even have some really neat additions to their search result, with other internal pages directly linked from their listing and a ‘more results from this site’ link. Google apparently really likes this site!!

    They have been ranking consistently for years (as I had) for this popular search and they still are.

    I have set up redirects and I no longer rank at the top like I used to. OK, so I had some bad neighborhood inbound links for a while which didn’t help but setting up the redirects hasn’t made Google forgive me by any means.

    So if you think it is the answer to getting back in the Google SERPs you are gravely mistaken.

    It is also highly unlikely to be the reason your site was removed in the first place.

    Personally I think this is all part of ‘G’s big power trip. Two guys from a garage that have forgotten their values….

  71. Tim Day Said,

    March 23, 2006 @ 5:00 am

    Thoughts on the trailing slash…

    Websites for visitors not for search engines.

    Go out in the high street with two pieces of paper. One with your URL showing a trailing slash and one with your URL and no trailing slash.

    Ask random people which one they are more likely to type into an address bar if they knew the site and wanted to find it.

    There’s your answer.

    Unless of course your site is wholly targeted towards knowledgeable developer types… Like this one…

  72. Chris Said,

    March 30, 2006 @ 3:22 pm

    Colin: I’m with you!

    I (and many, many other people) have spent my life building quality websites that work well for real people. I’m a consultant and people come to me for quality work and quality advice.

    Now I have to do what my clients’ SEO “gurus” tell them to tell me to do, and it regularly contradicts my interaction design decisions (not to mention usability and accessibility). My advice is not being sought in these cases: I am being told what to do.

    Not only that, but most of what I get told does not hold water. The contradictions about session IDs (if Google doesn’t index them how come there are so many in Google’s index?) and redirects (they’re bad, they’re good, they’re essential) are just a starting point… And the SEO people behind all this are typically not very technical so, in effect, they are spreading FUD to larger companies to make money from them. Isn’t that called scare tactics?

    I’m not being precious here: I’m pointing out that we have crossed a threshold where what’s good for people may not cut it anymore. Effective websites are now Google-friendly - and this means holding back on the innovation and user interface improvements.

    Is this how it should be?

    I mean, think on this: Google and the like are essentially page-based. Web 2.0 is not.

    So perhaps we should not bother with Web 2.0 applications..?

    Chris

  73. Brad Said,

    April 5, 2006 @ 6:34 pm

    I recently had a good experience with the 301 redirect of pretty much my entire site. After two and a half years I figured out the importance of sitemaps, redirects, keywords, etc. So this morning all of our newly updated pages received their updated PR. The first level navigation pages received PR5’s and the next level PR4. The only problem I have is my home page for the www url is PR3 and my non-www url is also PR3. My guess is my homepage should be a PR6. ???

    I use Windows Server and IIS and have used 301 redirects throughout my entire site structure with success, but I can’t figure out how to do a 301 redirect on either the www or the non-www in the IIS console. I was trying to do a 301 on the directory, but wondered if you do the redirect on the index.htm file? I know with Unix you can write it into your .htaccess file, but is that possible on IIS? Anyone know the answer here? Much appreciated.

    Great articles on this site by the way. Thanks Matt!

  74. Brad Said,

    April 5, 2006 @ 7:13 pm

    Oh, the 301 redirects from the old files to the new were done about a month ago. Right after the previous update and right in time for this latest update it appears. So not sure if I just had lucky timing with the update or if it takes a month normally. From what I’ve been reading my guess is a month is not the norm.

  75. Joe Hayes Said,

    April 18, 2006 @ 10:15 am

    So, what is the correct .htaccess line to resolve this issue?

  76. VB Said,

    April 27, 2006 @ 6:27 am

    How can I do a 301 redirect under Apache?

  77. Ryan Tongg Said,

    May 3, 2006 @ 1:31 pm

    Hi Matt,

    My question has to do with the www version vs. the non-www version. The non-www version of my site has a PR of 3. The www version has a PR of 0. Recently, I have been added to a ton of listing directories all linking to the www version. I don’t want to split backlinks between these two sites and would prefer to build on the non-www version’s PR. Should I do a 301 redirect from the www version to the non-www version? Will the backlinks from those directories still count towards my non-www version?

    Much Thanks,

    Ryan Tongg
    808-223-8833

  78. George Newman Said,

    May 3, 2006 @ 2:03 pm

    Matt,
    We have not used Base Hrefs in a long time (’cause we have not needed them) but is this not at least as valid if not a better way to take care of multiple domains responding to the same site? W3C is very clear about what a 301 re-direct should be used for AND this should also avoid 302 attacks.

  79. Sinan Osan Said,

    May 16, 2006 @ 4:34 pm

    My site is on shared hosting windows 2006

    I wanted to redirect the non-www version to www.domain

    First I used the code below

    This code gave an error message in Firefox.
    Error message: “Redirection limit for this URL exceeded. Unable to load the requested page. This may be caused by cookies that are blocked.”

    Second, I used this code, and it seems to work fine in Internet Explorer and Mozilla Firefox.

    My question: I hope I am not violating the Google guidelines; please help me if I am.

    working code
    "www.mysite.com" Then
        HTTP_PATH = request.ServerVariables("PATH_INFO")
        If Left(HTTP_PATH, 8) = "/default" Then
            HTTP_PATH = ""
        End If
        QUERY_STRING = request.ServerVariables("QUERY_STRING")
        theURL = "http://www.mysite.com" & HTTP_PATH
        if len(QUERY_STRING) > 0 Then
            theURL = theURL & "?" & QUERY_STRING
        end if
        Response.Clear
        Response.Status = "301 Moved Permanently"
        Response.AddHeader "Location", theURL
        Response.Flush
        Response.End
    end if
    %>

    thanks.

  80. Margaret Said,

    May 18, 2006 @ 7:41 pm

    I’m totally confused
    I wasn’t getting indexed, so I decided to put in a sitemap after making some file name changes to my site: I changed every page but the homepage, index.html. (The site is just about a year old.)
    The sitemap area showed a non-www url for my site and wouldn’t take a sitemap. I use the www version for all links within my site, out of my site, and for inbound links. So I deleted that domain from the sitemap area; to me it didn’t reflect the domain I paid for, which was the www one. (I’m pretty green to this stuff.) I added the www domain, and it accepted the sitemap and everything is crawling, complete with 404 errors. :)
    I did this and am now finding out the indexing probably wasn’t happening because I need more inbound links.
    So in deleting the non-www domain from the sitemap area, how much damage have I done? Will everything get pulled for 180 days? How soon will that happen if it does? If it does pull both, does the one resurface afterwards and the other disappear?
    I haven’t had any penalties, and the only guideline not followed is the lots-of-links one.

    Thanks for any help on this
    Margaret

  81. rck Said,

    May 20, 2006 @ 5:39 am

    Hi Matt, this is a very useful post for me. I’ve been wondering why, all of a sudden, links from Google changed from http://www.kiesler.at/ to http://kiesler.at/. I didn’t understand this.

    Knowing what’s in this post, I apparently need a permanent redirect from kiesler.at to http://www.kiesler.at. Now if I only knew how to do that on my Apache webserver…

  82. Guy H Said,

    May 30, 2006 @ 9:51 pm

    For rck
    There is an article here on Apache 301 redirects via htaccess.
    http://www.searchenginepromotionhelp.com/m/articles/search-engine-problems/domain-redirection.php I’m no programmer, but I pointed my programmers to this article & they were thankful. Hope it helps you.

  83. Jayne Said,

    June 3, 2006 @ 9:11 am

    I wonder. Were you ever struck by lightning whilst fishing ?

    BTW. It keeps saying (5th try) Invalid security code. Press your browsers back button and try again.

  84. TiMoGo Said,

    June 13, 2006 @ 6:07 am

    VERY happy I found this website! I get sick of some of the “speculation” on the forums and it is nice to have the “final word”.

  85. Stephane Said,

    June 13, 2006 @ 2:52 pm

    - Initially our site was configured solely as site.com and not http://www.site.com
    - Google initially indexed all pages starting with site.com (some http, other https)
    - Both https and http versions were accessible on the site.com domain (which I know is not good practice. This also needs to be fixed).
    - A redirect in the apache config file was put in place from non-www to www for all pages starting with http://site.com/
    - Ex: http://site.com redirects to http://www.site.com
    - By error, this was not put in place for pages starting with https://site.com
    - Ex: https://site.com does not redirect to https://www.site.com

    Result
    ——-
    - Since the pages starting with https://site.com were not redirected to https://www.site.com, both versions of these are accessible and indexed, creating duplicate content on Google.
    - Our home page, http://www.site.com, is the only page that has a page rank, as all other links starting with www were not indexed by Google.

    Questions:
    - Is the best way to redirect all https://site.com to http://www.site.com and restrict access to https only to those pages that need to be secured?

    Any advice or help would be appreciated.
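
    The policy asked about in the question above can be written down as a small mapping from any requested URL to the single URL that should serve it. This is only a sketch of that policy in Python, not Apache configuration. The host www.site.com comes from the comment; the list SECURE_PREFIXES and the function name canonical_url are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.site.com"
# Hypothetical: path prefixes of the pages that need to be served over https.
SECURE_PREFIXES = ("/checkout", "/account")


def canonical_url(url):
    """Map any scheme/host variant of a page to the one URL that should be indexed."""
    parts = urlsplit(url)
    path = parts.path or "/"
    scheme = "https" if path.startswith(SECURE_PREFIXES) else "http"
    return urlunsplit((scheme, CANONICAL_HOST, path, parts.query, ""))
```

    A request would get a 301 whenever canonical_url(url) differs from the URL that was requested. That covers the https://site.com pages that the original redirect missed, because the scheme and the host are decided in one place.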

  86. Martin Jones Said,

    June 13, 2006 @ 10:08 pm

    While in the process of changing my pages in .htaccess, part of my site was indexed. As a result, when I look at my Google sitemap account some pages are listed with both a trailing slash and no trailing slash.

    Should I be worried about the spider looking at this as duplicate content? I don’t want to be penalized. All pages now have a trailing slash.

  87. rck Said,

    June 20, 2006 @ 1:45 pm

    Hi Guy H, your tip worked great, thanks! :-)

  88. Adam Moro Said,

    June 27, 2006 @ 4:02 pm

    Is there a suggested canonicalization? I see most every authority site is now directing example.com/ to http://www.example.com/ and I’m wondering why. Isn’t the “www” counted in the character length of the URL?

  89. PJ Said,

    June 29, 2006 @ 1:00 am

    “but because it can take longer for us to crawl supplemental results, you might not see that update for a while.”

    A very crude cybersquatter hijacked my client’s domain for a short period of time, this was about a year ago. (I helped her get the domain back.) The cybersquatter used “bad neighborhood” keywords that are too disturbing to mention. I recovered the domain about a year ago and have done my best to repair the damage to her reputation. Now this respectable doctor (a year later) still has these raunchy keywords in Google’s supplemental results, long after the domain has been returned to the rightful owner and cleaned up. Is this why her PR is still 0 after a year and she can’t be found in the organic results?! (Not sure how long she owned it before the cybersquatter got his hands on her name…) Ugh!

    Is there anything else I can do to repair her reputation in Google? Anyone can agree that this cybersquatting hijacker has damaged her reputation in Google, terribly! I’m going to try the “automatic removal tool” next. I just can’t believe these raunchy results are still in Google after all this time. I feel terrible about it. Imagine how much business she has lost because of these old uncrawled results. Because of this she’s had to buy AdWords traffic, because it’s nearly impossible to find her buried in the organic results. I hope that Google will consider the webmasters of domains that have been hijacked and/or cybersquatted for a short period of time–it’s terrible what can happen to your reputation in a short period of time, in Google especially. Thank you so much for your time. Best regards.

    PS: She ranks well in MSN and not good in Yahoo
    PPS: I love this blog. I have learned so much here!

  90. Rita Afshar Said,

    July 28, 2006 @ 8:07 am

    hi Matt,

    I have two questions:
    Q1: I have been assigned to do a 301 redirect from
    http://www.mysite.com/default.asp to http://www.mysite.com

    When I use the following code:
    Response.Status = "301 Moved Permanently"
    Response.AddHeader "Location", "http://www.wsicorporate.com/"
    Response.End()

    it goes into an infinite loop. What else can I do about it?

    Q2: The search engine shows two different page rankings for the same page:
    http://www.mysite.com/default.asp and
    http://www.mysite.com/Default.asp

    How can I make sure that only one of them comes up in the search engine?

    Thanks a million
    Rita
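
    One plausible cause of a loop like the one in Q1, assuming the server answers requests for / by running default.asp, is that the redirect fires on every execution of the script, including the one triggered by the redirect itself. A minimal Python sketch of the decision that avoids this is below. It assumes the server can supply the path exactly as the client requested it; the function name homepage_redirect is hypothetical.

```python
def homepage_redirect(requested_path):
    """Return "/" if the client explicitly asked for the default document, else None.

    requested_path must be the path as the client sent it, not the script
    the server mapped the request to.
    """
    # Comparing in lower case also folds /Default.asp and /default.asp together (Q2).
    if requested_path.lower() == "/default.asp":
        return "/"
    return None
```

    A request for /default.asp or /Default.asp is redirected to /, while a request for / returns None and is served normally, so the chain stops after one hop.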

  91. Anand Said,

    September 3, 2006 @ 12:15 pm

    Thanks for the info. I always had a doubt about the URL’s which is now cleared.

  92. Rohit Said,

    September 15, 2006 @ 10:18 pm

    Dear matt

    thanx for the information

    i need your help to get all information abt google update is there any website where i can study all the information abt google updates till date

    Regards
    Rohit

  93. Will Johnson Said,

    September 16, 2006 @ 3:45 am

    Thanks for the info.

  94. BlueDevilMedia Said,

    September 20, 2006 @ 10:58 pm

    I’ve seen this article referenced many, many times…definitely a useful resource! As always thanks for the insight Matt!!!

  95. Wouter Said,

    September 21, 2006 @ 3:17 am

    It has been asked many times in this topic but I don’t see an answer.

    The question is:

    domain.com has PR4
    http://www.domain.com has PR2

    Now we are going to use 301 on domain.com to http://www.domain.com.
    What kind of effect will this have on the PR and the backlinks?

    Will you lose the PR4 and all of your backlinks? Or will they go to http://www.domain.com?

  96. Sima Said,

    September 24, 2006 @ 7:32 pm

    Dear Matt,
    thanks for explaining difference between domain syntax http://domain.com and http://www.domain.com in such a lucid way. I was confused to find different PR to my old site.

  97. Small Business Hosting Said,

    September 27, 2006 @ 6:12 am

    Will a 301 redirect transfer the pagerank to the new location?

  98. Anders Gustavson Said,

    September 27, 2006 @ 6:24 am

    Hi

    I have what I think is a similar problem. I have a site that is indexed on both http://www.domain.tld and the corresponding IP address. SiteMaps did solve the problem with www vs non-www but I can’t find anything regarding IP.

    Is there any safe way to remove the xxx.xxx.xxx.xxx pages from the index and keep them out?

    \Anders

  99. mike Said,

    October 5, 2006 @ 6:41 am

    I have the same problem. Some of my pages are indexed as www and half are indexed as the ip address. I would love to find an answer.

  100. Yankee Said,

    October 7, 2006 @ 8:47 am

    In my way i have the same problems.

  101. Lee Said,

    October 12, 2006 @ 1:47 am

    Hi,

    If I have a url:

    http://www.example.com/index.php?play=yes&test=ok

    and want to redirect to:

    http://www.example.com/index/play-ok

    Should I use redirectMatch or rewriterule to do this, as currently I can’t seem to redirect urls with get params.

    Thanks,

    Lee

  102. grey lantern Said,

    October 15, 2006 @ 4:23 am

    Thanks Matt for the continued explanations and advice about this stuff. I have been reading up on Canonical issues for a while (suffering from one myself due to not knowing about them before hand and not using 301 protection), I have set up a 301 and on server name resolution so that all requests for the main index page go to http://www.theurl.com/ (the trailing slash is always added anyway).

    Google still shows http://www.theurl.com/ and http://www.theurl.com/index.php in the serps and is docking my PR due to it. Will the 301 be “picked up” by the main googlebot and remove the “index.php” reference from the results in due course?

    Also I can’t fathom out why this sort of thing isn’t under the webmaster’s control? If I know that the result http://www.theurl.com/index.php is WRONG then there should be a system to remove JUST that reference? Is this impossible?

    I was thinking that a verification code tag could be uploaded to the page (as per sitemaps) to prove you are the owner and google could see at once that any reference in its database to THIS page should be removed and if need be replaced with the TLD (or specified alternative).

    thanks.

  103. grey lantern Said,

    October 15, 2006 @ 4:40 am

    To clarify, I realise you can remove URLs right now (supposedly) and set your tags etc to aid this process, but there seems to be no specific tool to address the removal of results that point to a valid page “index.php” that you most certainly do NOT want to remove from google but just want to point to the TLD instead.

    I can remove a page I don’t want, but not a page that is being returned twice in the results.

  104. John Clark Said,

    October 17, 2006 @ 12:31 am

    Personally, I think everyone should stop kissing google’s a**. I do just fine without google. Most of their search results seem to be crap anyway.

    Admit it, accept it, and make google follow the people that own the sites. 100 million people think they have to do what google wants? Ridiculous!

    Old fashioned style works for me. REFERRALS FROM THE CUSTOMERS… Much more productive than some damn search engine that wants to dictate how I should construct MY site. Then wants to charge me hundreds or thousands of dollars to buy advertising in their site.

    I am sure that 1000 “gurus” will be chomping at the bit to explain why I am wrong. Save your breath. I don’t care what you say. I could care less what “canonical” refers to. Google is on my site everyday and if they stop immediately then i just save a little bandwidth. I am not going out of my way to please someone who doesn’t work for me and is not a customer. F**k them. If all of you thought the same way then google wouldn’t be causing you so many problems.

    I say take care of what you have and they will take care of you.

  105. James Murray Said,

    October 26, 2006 @ 8:57 am

    What would cause Google’s crawlers to index our site under an https://www. url versus http://www.? Google has crawled and provided a PR only for pages that start with https:// rather than http://www.

    Our home page (http://) url has a PR of 4. I realize the pr does not mean much but it still is an indication of being crawled and recognized. The https:// has a PR of 2 as well as the inner pages of the site.

    The first item under ‘Indexed pages in your site’ list in Google is the following

    Title of the Website Site
    This is the description of the site and what it is all about, etc, etc.
    https://www.sitename.com/ - 43k - Cached - Similar pages

    Note the https:// instead of http://, previously it was the http:// in my old provider world.

    It has been 3 months since I switched providers and no pages are being re-indexed by Google on the http:. The site is over 3 years old and has never played in bad neighborhoods. This does not make any sense, it would seem that somewhere Google has been told that https:// is the main url to follow.

    Any insight would be greatly appreciated…

  106. Mike Helfand Said,

    November 8, 2006 @ 8:28 am

    Would Google deciding to show my site at http://domain.com vs. http://www.domain.com have an effect on Google rankings? We have sustained a drop for a bunch of various terms and have noticed that Google has made those pages non-www.

    I’m a bit of a novice, but also don’t understand how Google finds non-www pages when we have never created anything that wasn’t www.

  107. Afzal Said,

    November 13, 2006 @ 1:02 am

    Hi Matt,

    Great to read about canonicalization from a trusted source. I have a question and hope you’ll provide the solution ASAP.

    Q: Suppose I set up a 301 permanent redirect from ABC.com to XYZ.com, and XYZ.com in turn redirects to PQR.com. Will this cause a serious problem? I’m facing exactly this, which you could call a dual redirection problem: instead of a 301 redirect straight from ABC.com to PQR.com, there is one more redirect in between. Site search results are showing the pages as Supplemental. Please throw some light on this “double canonicalization” too.

    Regard’s

    Afzee
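
    A chain like the one described above can be collapsed by working out where each starting URL finally lands and pointing the first redirect straight there. Below is a minimal Python sketch, assuming the redirects are available as a simple mapping. The function name final_target and the max_hops limit are hypothetical.

```python
def final_target(redirects, start, max_hops=10):
    """Follow a mapping of source -> destination and return (final destination, hops)."""
    seen = {start}
    current = start
    hops = 0
    while current in redirects:
        current = redirects[current]
        hops += 1
        # Stop on a loop or an unreasonably long chain instead of spinning forever.
        if current in seen or hops > max_hops:
            raise ValueError("redirect loop or chain too long")
        seen.add(current)
    return current, hops
```

    With redirects = {"ABC.com": "XYZ.com", "XYZ.com": "PQR.com"}, final_target(redirects, "ABC.com") reports PQR.com after two hops. Rewriting the first entry to point directly at PQR.com would reduce that to one.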

  108. VW Mike Said,

    November 21, 2006 @ 4:00 pm

    Hi Matt,

    Thanks for the elucidating article!

    I have been using the 301 redirect since the start, but I wasn’t sure why…

    This makes it clearer.

    Mike

  109. Ha Bui Said,

    November 28, 2006 @ 6:24 pm

    Hello,

    My website, http://www.travelsvietnam.com, has page rank 5 in Google. But since last month I cannot find it in Google unless I enter my domain name, travelsvietnam.com, as the search query. Could anyone please show me the reason and how to resolve this problem?

    Thanks,
    Mr Ha,
    Vietnam Paradise Travel.

  110. Viz Said,

    December 2, 2006 @ 11:14 am

    I am using a dynamic IP from DynDns, and the only solution they offer to fix the problem is the 301 redirection.
    I guess using a static IP will improve web page rating

    Good and informative, thanks

  111. Colyn Said,

    December 8, 2006 @ 1:27 pm

    Matt, I can’t find any definitive answers so I’m counting on you to give me the final word.

    Does url parameter order matter?
    (http://www.domain.com/search.php?a=1&b=2 vs. http://www.domain.com/search.php?b=2&a=1)

    If it does, can you briefly explain? Thanks.
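
    As strings, the two URLs above are different, even if the server returns the same page for both. One way a site can avoid the question is to always emit its parameters in one fixed order. The Python sketch below does this by sorting them. It says nothing about how any search engine treats parameter order, and the function name normalize_query_order is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def normalize_query_order(url):
    """Return the URL with its query parameters sorted by name, then by value."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(params), parts.fragment))
```

    Both example URLs normalize to http://www.domain.com/search.php?a=1&b=2, so internal links built through such a function would always use a single form.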

  112. SGD Networks Said,

    December 22, 2006 @ 7:31 pm

    Hello Matt,

    Thanks for your information on canonicalization. Please give us a little advice about our blogs. We have heard that “.html” has more value than urls ending with “/”, so we set up url rewriting in WordPress and elsewhere.

    Now we have noticed that we haven’t received any PageRank or updates for our blogs yet, even though they are about 1 year old.

    So I request a little suggestion from you: should we remove the domain from the Google index, make changes to our urls, and then request reinclusion from Google?

    If yes, just the word “yes” will help us a lot. I hope your suggestion will also help many other blogs on the web.

    Thanks in advance

  113. venkat Said,

    December 23, 2006 @ 1:08 am

    Dear Matt,

    We previously had PR 5 for our website http://www.exaltinfo.com, and all its pages are static. When we check with iwebtool it shows 4. Is this because of canonicalization?

    thanks

  114. George Said,

    January 5, 2007 @ 3:19 pm

    I changed my preferred URL in web master tools and now a lot of non-www links are showing up in the search. It also seems to have affected my ranking since I get a lot less visitors after this.

  115. Jenny Baln Said,

    January 7, 2007 @ 5:31 pm

    Great article, I especially like the comments! Here’s what I’ve been using. From: Ultimate htaccess Article

    Options +FollowSymLinks
    RewriteEngine On
    RewriteBase /
    RewriteCond %{REQUEST_URI} !^/robots\.txt$
    RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
    RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

  116. mihomes Said,

    January 13, 2007 @ 4:22 pm

    Matt,

    I’ve been reading up on this, and my question extends beyond the domain itself to folders within the site as well…

    For instance, would you also recommend :

    http://www.mysite.com/folder/index.htm

    be changed to

    http://www.mysite.com/folder/

    ???
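
    The change asked about above amounts to dropping a default document name from the end of the path. Below is a minimal Python sketch of that mapping. The set INDEX_NAMES is an assumption about which file names a given server treats as the directory default, and the function name strip_index is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: file names the server serves by default for a directory request.
INDEX_NAMES = {"index.htm", "index.html", "index.php", "default.asp"}


def strip_index(url):
    """Return the URL with a trailing default document name removed."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    if filename.lower() in INDEX_NAMES:
        path = directory + "/"
    else:
        path = parts.path
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```

    Here http://www.mysite.com/folder/index.htm maps to http://www.mysite.com/folder/, and URLs that already end in a slash are returned unchanged.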

  117. IG Said,

    January 13, 2007 @ 10:00 pm

    I have also tried playing around with the link in setting but still I get zero link-in count on google search for my domain http://www.parcusgroup.com/index.html
    or
    http://www.parcusgroup.com/index.html
    I am positive there are links to it, if only in a few web business directories, but Google just does not seem to pick them up.
    Can something be done there?

  118. Robert Said,

    January 25, 2007 @ 2:35 pm

    In general practice, and for Google in particular, is it a problem to have two separate URLs for the exact same web page - i.e., the web site would be cloned to display with either URL in the address line. example: http://www.abc.com and http://www.AlphaBetaCorp.org both displaying identical content (except for the respective URL in the address line)?
    Thanks.

  119. inter-dev Said,

    February 6, 2007 @ 9:43 am

    Hi Matt,
    Thank you for sharing this info.
    I have only one question,
    which is the proper way to link to the main root:
    http://www.x.com
    or,
    http://www.x.com/
    is there a difference?
    Thanks again Matt
    Aia

  120. Web Design Said,

    February 28, 2007 @ 4:04 am

    Several weeks ago I found that Google Webmaster Tools has an option that lets you select how your url should be interpreted, with or without www. I’ve already activated it for my site, and you can check it here:
    http://www.bgpages.com

    thanks for that article

  121. Inetzeal Said,

    March 5, 2007 @ 9:31 pm

    I’ve used a 301 redirect for one of my sites. It works excellently for Google but has adversely affected my ranking in other search engines :(

    Any idea Matt?

  122. Catherine Said,

    March 8, 2007 @ 5:38 am

    Canonicalization? We had never heard of it before and never had a problem. But now we have a “canonical” problem, i.e. our site is accessible via both http://www.mongolia.co.uk and mongolia.co.uk (without the http://www.) and Google is indexing these separately (site:mongolia.co.uk returns 95 pages; site:www.mongolia.co.uk returns 64). We decided on the http://www.mongolia.co.uk version and also created a 301 redirect file. It seems to be working. But now our Google search result is gone. We thought maybe we had been dropped from the Google index again (as has happened several times recently), but we are there. What should we do to bring our site’s results back to how they were 2 days ago, before we did the 301 redirect? The next time the Google spider crawls the site, will the search results be the same as before? How did this canonical problem arise?

    Thank you.

  123. Jocuri Said,

    March 26, 2007 @ 12:19 pm

    Great explanation Matt. So in conclusion, it’s not good to use and promote both forms of urls?
    How long does it take Google to fix the problem if I choose in Google Webmaster Tools to use the domain with www?

    Thank you

  124. Jocuridivertis Said,

    March 29, 2007 @ 12:06 am

    Hi Matt, thank you for explanations.
    Does this also apply to subdomains? I mean, will Google see any difference between http://subdomain.domain.tld and http://www.subdomain.domain.tld ?

  125. Andy Said,

    April 2, 2007 @ 8:34 pm

    Very informative. I noticed a lot of sites with urls like ab.ab.abb. I was just wondering how they do that?

  126. MATT Said,

    April 3, 2007 @ 5:41 am

    Can we talk about subdomains?

    I use subdomains for tracking in Google AdWords and offline marketing, and sometimes on banners linking to the main site:

    pp1.example.co.uk (links to the home page)
    banner1.example.co.uk (links to the home page)
    links.example.com (links to the home page)

    Is this spamming? Will this lower our results?

    If this is classed as spamming, I do not want to lose the external links to the subdomains. Is it a good idea to canonicalize the subdomains?

    Please help

    Matt

  127. JoshK Said,

    April 4, 2007 @ 12:14 am

    Could sure use some help.

    I am working on a site which has multiple domain names. The main domain ranks well, and until recently the other domain did not show up in Google at all. However, in mid-February I changed the DNS around for the “alternate domain” to correct some problems I was having. During the past 2 weeks I have changed the layout of the site and added a bit of new content.
    Today the main domain moved a bit higher in rank, and suddenly the alternate domain’s “index.html” has also appeared in the rankings. From what I have read it seems having both domain names for the same site show up in google could cause problems for the site, if this is true what can I do to prevent any problems?

    Note: The alternate domain is the domain that the owner of the website would prefer to be shown in searches, however it is the one with very few backlinks. If a 301 redirect is the solution, would 301′ing the main name to the alternate cause me to lose position?

    Thanks in advance for any help in this matter :)

  128. Edward Mills Said,

    April 5, 2007 @ 8:18 pm

    Hi Matt. Great post. You seem to have opened up quite a can of worms here. And it appears that there are not any clear answers to a lot of the questions that have been asked. So into the pot of unanswered questions, I will add a few more!

    I recently put a 301 redirect on my blog to combine the non-www domain into the http://www.domain. I chose the www version because in my research most people seemed to agree that it was the best choice. Unfortunately, my non-www version had the page rank for my site (a 4 at the time I combined them). I also used Google’s webmaster tools to combine the two domains into the www version. My first question is, did I mess up? Should I have combined them into the version with the higher page rank and the majority of incoming and internal links? If so, should I now go back and change it? And if not, will Google eventually figure out that the new, combined, www version is the same and correct the page rank (currently unranked)?

  129. sherry Said,

    April 25, 2007 @ 12:24 am

    Hi Matt,
    Please suggest which is better:
    linking the Home anchor to the domain name, or
    linking the Home anchor to domainname.com/index.html?

    Will our index.html page go supplemental if we do not link to it from any page and use domainname.com to reach the index page?

    Regards,

    Sherry

  130. Iain Said,

    April 26, 2007 @ 3:28 am

    Hi Matt,

    Recently we have had a URL replacer upgrade for our CMS so that we can replace the horrible aspx?page=22 with a nice products/products. The only problem is that now 2 urls are showing for the same page! What is the risk of getting penalised for this by the search engines, and what ways are there around this?

    Regards,

    Iain

  131. Jocjocuri Said,

    April 29, 2007 @ 2:12 pm

    Hi,

    Does anybody know if this problem also affects Google PR? I mean, my site without www has PR 0, and with http://www. it has PR4.

    Regards
    Jocjocuri

  132. Jocuri Copii Said,

    May 1, 2007 @ 8:07 am

    So, it is better to use www if that was how you started! If you change in the future and cut off the www, will Google penalize you?

  133. Used Car Parts Guy Said,

    May 3, 2007 @ 12:03 pm

    How do we fix this? It seems that my web site automatically adds index.php to the end of my url even when I do not want it to show up, and everyone that links to me links in with just .com at the end.

  134. bonyuttams Said,

    May 4, 2007 @ 4:43 pm

    Thanks for the great tips here. I never knew that all these ( http://www.example.com, example.com/, http://www.example.com/index.html, example.com/home.asp) were actually different lol :)

  135. russell Said,

    May 8, 2007 @ 1:42 pm

    luckily it’s only my php pages which have these issues (which there are only a couple) thanks matttttt

  136. raheel Said,

    June 6, 2007 @ 8:02 am

    Hi,

    How can I check how many of my pages are in the supplemental index at Google?

  137. lvance Said,

    June 15, 2007 @ 10:18 am

    Matt,

    The vast majority of our inbound links contain “www” and we recently updated the site to have domain.com 301 to http://www.domain.com. But Google appears to already have a preference for our non-www listing. Will this 301 hurt us? Should we be concerned that Google prefers the non-www version?

    Thanks,
    LV

  138. DNVC Said,

    June 16, 2007 @ 3:44 am

    Hello, I have the same question that raheel asked. How can I check how many pages are in the supplemental index at Google? Please advise…

  139. Peter Said,

    July 5, 2007 @ 11:30 am

    Thank you for this excellent free facility. I was never aware of this issue with Google and spiders until now. I had unwittingly assumed that all the global differentials pointed to my site. I think I will take your advice and redirect the unnecessary linked pages back to my preferred domain.
    I note Igor’s concerns over these seemingly duplicate pages being viewed as spam. I certainly hope this is not true and that it does not detract from page rank and site performance.

    Thanks for the advice,

    Peter

  140. scott Said,

    July 9, 2007 @ 11:11 pm

    Thanks for the useful information. Lots of this seo stuff can be confusing and there are so many little things to know about to do it right.

    Please keep the good info coming. It is greatly appreciated.

    Regarding the ’supplementary’ results in Google: I had some old files show up as supplementary because I changed the name of the scripts but the content was the same.

    Should I convert these scripts to redirect to the renamed scripts? I’m doing that now but I’m concerned that redirecting will penalize me.

    Any thoughts?

  141. Greg Knighton Said,

    July 10, 2007 @ 8:31 am

    Hello Matt,

    This is my first time commenting or asking a question on your blog. I felt it was my last resort for an issue I am having with a form of canonicalization on my website. The problem I am having is that the googlebot is indexing pages in the secure form https as well as the non-secure form http. I am finding that this is creating duplicate pages in the Google index and is killing the pages’ ability to rank well, since they are showing as supplemental results. I found a blog where someone mentioned that using the following php code would stop the googlebot from indexing any https pages.
    ‘. “\n”;
    }
    ?>
    In your opinion, do you think this would work? Are there any other options?

    Best Regards,

    Greg Knighton

  142. TONY Said,

    July 12, 2007 @ 2:17 am

    Hello. I’m not really sure, but for example, on the internet in the Czech Republic a different system seems to be used. You can write the url both ways and you get the same result.

  143. Rohit kapur Said,

    August 7, 2007 @ 4:56 am

    The vast majority of our inbound links contain “www” and we recently updated the site to have domain.com 301 to http://www.domain.com. But Google appears to already have a preference for our non-www listing. Will this 301 hurt us? Should we be concerned that Google prefers the non-www version?

    Thanks,
    ASHOK

  144. Giannis Tsakiris Said,

    August 8, 2007 @ 5:07 am

    This is a great post; it clarified many things I was doubtful about. But I still have a question: I recently added the http://www.- prefix to all the internal links in my blog, and added a 301 response code to all non-www requests. I don’t want non-www URLs to show up in Google’s results any more. Will they ever go away?
    Thanks,
    Giannis

  145. Joey Said,

    August 21, 2007 @ 4:21 am

    Matt and rest of the fellows

    The issue of canonicalization can also be resolved by registering your domain at google.com/webmaster. You can then submit a sitemap for your site, and it also provides useful information. It also provides a feature called “Preferred domain”. If you select the option which states

    “http://www.domain.com and domain.com are same”
    that can solve your issue right away, because if you don’t state it explicitly, Google will by default take the www and non-www versions of a URL as different sites, and when it indexes them it will find similar content, which can cause spam and other issues.

    So try the webmaster features of Google, which help in many other ways.
    Regards
    Joey

  146. SEO Expert Dubai Said,

    August 26, 2007 @ 11:23 pm

    Thanks for sharing this info Matt, but when I redirected my website from the www to the non-www urls, all my links in Google Webmaster Tools dropped, and the same happened with link:domain.com in Google. So I am just wondering, is this normal?
    Thanks again this was a really helpful information.

  147. Neo Said,

    August 30, 2007 @ 1:40 am

    Hi, I found a scenario in which one company has registered multiple domains, about 15-20.

    For an example

    If you click the following link for the keyword “one stop motors”, you will see that the company has a number of websites. The most interesting thing is that these websites do not redirect to one site, but they have almost the same theme and content, and if you go to the contact us link they show the same telephone number, which suggests all these sites have the same parent company, regardless of whether they are hosted on the same server or on different servers.

    http://www.google.com/search?q=one+stop+motors&hl=en&client=firefox-a&rls=com.google:en-US:official&start=20&sa=N