I answered an interesting rel=canonical question over email today and thought I’d blog about it. If you’re not familiar with rel=canonical read these pages first. Then watch this video about rel=canonical vs. 301s, especially the second half:
Okay, I sometimes get a question about whether Google will always use the url from rel=canonical as the preferred url. The answer is that we take rel=canonical urls as a strong hint, but in some cases we won’t use them:
– For example, if we think you’re shooting yourself in the foot by accident (pointing a rel=canonical toward a non-existent/404 page), we’d reserve the right not to use the destination url you specify with rel=canonical.
– Another example where we might not go with your rel=canonical preference: if we think your website has been hacked and the hacker added a malicious rel=canonical. I recently tweeted about that case. On the “bright” side, if a hacker can control your website enough to insert a rel=canonical tag, they usually do far more malicious things like insert malware, hidden or malicious links/text, etc.
I wanted to talk today about another case in which we won’t use rel=canonical. First off, here’s a thought exercise: should Google trust rel=canonical if we see it in the body of the HTML? The answer is no, because some websites let people edit content or HTML on pages of the site. If Google trusted rel=canonical in the HTML body, we’d see far more attacks where people would drop a rel=canonical on part of a web page to try to hijack it.
Okay, so now we come to another corner case where we probably won’t trust a rel=canonical: if we see weird stuff in your HEAD section. For example, if you start to insert regular text or other tags that we normally only see in the BODY of HTML into the HEAD of a document, we may assume that someone just forgot to close the HEAD section. We don’t allow rel=canonical in the BODY (because as I mentioned, people would spam that), so we might not trust rel=canonical in those cases, especially if it comes after the regular text or tags that we normally only see in the BODY of a page.
But in general, as long as your HEAD looks fairly normal, things should be fine. If you really want to be safe, you can make sure that the rel=canonical is the first or one of the first things in the HEAD section. Again, things should be fine either way, but if you want an easy rule of thumb: put the rel=canonical toward the top of the HEAD.
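As a rough illustration of that rule of thumb, here is a minimal Python sketch that scans a page's source and reports where each rel=canonical sits. It is only an assumption about how one might self-check a page, not a description of what Google's parser does. The names `CanonicalAudit`, `audit_canonical`, and `BODY_ONLY_TAGS`, the tag list, and the three location labels are all made up for this example. It also assumes the page has explicit HEAD tags.

```python
from html.parser import HTMLParser

# Illustrative, non-exhaustive set of tags normally seen only in BODY (an assumption).
BODY_ONLY_TAGS = {"div", "p", "span", "img", "a", "h1", "h2", "h3",
                  "table", "ul", "ol", "form"}


class CanonicalAudit(HTMLParser):
    """Records each rel=canonical link and where it sits in the source."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.in_text_container = False  # inside <title>, <script> or <style>
        self.body_like_seen = False     # stray text or body-only tag seen in HEAD
        self.findings = []              # list of (href, location) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "body":
            self.in_head = False
        elif tag in ("title", "script", "style"):
            self.in_text_container = True
        elif tag in BODY_ONLY_TAGS:
            if self.in_head:
                self.body_like_seen = True
        elif tag == "link":
            attr = dict(attrs)
            if "canonical" in (attr.get("rel") or "").lower().split():
                if not self.in_head:
                    location = "outside-head"
                elif self.body_like_seen:
                    location = "head-after-body-content"
                else:
                    location = "head"
                self.findings.append((attr.get("href"), location))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
        elif tag in ("title", "script", "style"):
            self.in_text_container = False

    def handle_data(self, data):
        # Visible text directly inside HEAD suggests the HEAD was never closed.
        if self.in_head and not self.in_text_container and data.strip():
            self.body_like_seen = True


def audit_canonical(html_source):
    """Return a list of (href, location) tuples for every rel=canonical found."""
    parser = CanonicalAudit()
    parser.feed(html_source)
    parser.close()
    return parser.findings


page = ('<head><title>T</title>stray text<div>x</div>'
        '<link rel="canonical" href="http://example.com/a"></head><body></body>')
print(audit_canonical(page))
```

Anything reported as other than "head" is worth a second look. Moving the tag to the top of the HEAD avoids the question entirely.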
Hello Matt and thanks for this great article
I didn’t use “rel=canonical” tags until I saw Google archive my original link rather than the one I created with a rewrite (I’m using a custom permalink in WP). So I think it’s important because it tells Googlebot directly which link it should archive, right?
Thanks again
Regards
Hey Matt
Google has indexed the desired URL http://www.example.dk/X.htm and also the same URL in the undesired form http://www.example.dk/X.htm?abc, even though for months I have had a rel=canonical between [head] and [/head] on the desired URL http://www.example.dk/X.htm pointing to itself. How do I get Google to remove the undesired URL http://www.example.dk/X.htm?abc?
Thanks
Whoa! It’s great that you are also posting the transcription of this video… reading is so much faster than watching.
Given that Youtube does automatic captioning, how about posting the transcriptions for all webmaster videos? It would make them easier to search and also support accessibility.
I think you should also post your rule of thumb at http://www.google.com/support/webmasters/bin/answer.py?answer=139394 so that people can be more alert while applying canonical tag.
Thanks for the insight on rel=canonical. I was doing some extra research and came across your intro video: http://www.google.com/support/webmasters/bin/answer.py?answer=139394 and found it extremely helpful. Thanks for the breaking it down, I learned a great deal just from that video alone!
Hi Matt,
Thanks for the advice, I’ll put it into practice.
Thanks Matt that is helpful.. malformed page headers is a good clue for other problems.
On a related note (haha.. couldn’t resist) what about the following edge case? You publish original content, and include a rel canonical tag just to be safe. You then syndicate that content (via your RSS feed) to another site, who re-publishes it (with permission). They don’t use a rel canonical tag, but they heavily cross promote the content on their own site. They do reference the source, but they do it in an SEO-savvy way (linking out through jump scripts, maintaining a profile page for your site, your author name, etc).
Google decides they are the original, not you, despite the rel canonical and the time/date of publication. They outrank you, get their own Google News credit, and in many cases apparently cause your original URL to be dropped from the index.
I know I know.. an extreme edge case. But since it’s happening so much lately, maybe there’s a problem with Google?
I’m guessing that a lot of spam uses of rel=canonical include either a 3rd party domain name or an unlinked page on the same domain, so I would ask whether those are automatically ignored – seems logical, or are there occasions where I would legitimately post a cross domain rel=canonical instruction?
Hi Matt
Thanks for clearing this rel-can vs 301 issue up. For some reason, there’s been so much chatter about rel-can’s in the SEO community over the last few months, but opinions vary as to which is better (301 vs rel-can).
Keep up the good work 🙂
Derek
Hey Matt. Do you work closely with the Webmaster Tools folks? I have a bunch of dynamically generated pages which are being misflagged as “Soft 404″s by Webmaster Tools despite not being 404 at all. I think the rel canonical I have on them ought to be enough for Google to figure out what’s going on, but it seems at least Webmaster Tools is missing it. BTW, if you want to see the pages in question, Google [inverse graphing calculator], go to it, type in anything and hit enter… -Sam
HTML allows omitting the opening and closing tags for <html>, <head>, and <body>. Browsers automatically generate these elements. I assume Google does the same when interpreting such a document. Is that correct?
Hem, a customer had a problem with rel=canonical (forgot to close attribute), Google intervenes automatically or manually ? Without canonical, no more problem. 😉
Canonical = nofollow = rubber patch.
Hello Matt,
Very good post and really clarifying. That is the trouble with rel=canonical: if you use it when you should have done a 301, it badly affects the whole site’s search presence. Definitely a weapon of absolute last resort. Thanks
Better things to do than do a 301 redirect? Rubbish! 🙂
Cheers for the video, interesting to hear the rationale why some canonicals don’t work
Thanks for clearing all that up!
I am also very interested in the question above, by John Andrews and what the answer would be?
How do you deal with canonicals when there is a double head in the body? Even with the whole shazam of the doctype etc. Sounds very awkward but guess what… it happens. What would you advise (besides cleaning up their mess of the double heading) if they can’t because their site became too big (say 15.000 pages) to solve this problem?
Matt have you considered adding to GWT a panic button so that if canonical tags have been done incorrectly site owners can tell you to ignore them until they are sorted out?
no names no pack drill 🙂
Hey Matt, you’ve just addressed an issue that I was worried about for the last couple of days:) I want to move my blog from /tutorials to /blog – as this is a bit more user-friendly URL.
Thought it can hurt my rankings – seems that I was wrong! 🙂 Thanks for pointing this out!
I have canonicals in place, but Google has still indexed all of the pages… how normal is this? I have never seen this before.
Andy
haha that works in more than one context.
should canonical tag be used in conjuction with syndication meta tags?
Funny – I was just testing the non-existent and BODY tag cases on my sites, because we’ve had questions about both on SEOmoz. Thanks for the clarification. The foot-shooting with rel-canonical is definitely a significant problem.
You can run your code through the W3C validator to make sure you’ve closed your tag and don’t have any other major snafus that might confuse search engines (or browsers).
I had never heard about rel=canonical before reading this article.
It is really great to come here and read … every time I find something new 🙂
Thanks, Matt. Summarizing, here is what I understand you to be saying:
Google will treat the rel=canonical tag as a strong hint but will not necessarily always follow it. For example:
If the tag points to a 404 error page, then Google does not follow it.
Another example: if Google considers the site to be hacked, it will not take the canonical tag into account either, since the hacker will evidently redirect the link to malicious and spam sites. Matt recently tweeted about this case of hacked sites with rel=canonical tags.
And now Matt clarifies that another case in which the tag will be ignored is when it is placed in the body of the page because, among other things, there would be many more attacks if the canonical tag were accepted in the body.
And finally, they will not follow the tag either if Google sees strange things in the head, such as links pointing with regular text or similar things suspected of trying to influence searches.
I hope these lines capture what you meant, and we will try to put them into practice. In my case I have been using the tag to keep Google from treating content on my sites as duplicate.
Thanks for the new info, Matt. One of my takeaways is that since 301 Redirects do not pass along 100% of the link juice, it does make sense (where it is not too labor intensive) to try to get those older links to re-point to the correct (new) pages.
thanks Matt, we’re just running a project on a large scale ecommerce store and are using the rel canonical as there are literally too many routes to the actual products. We can use this to justify that Google does take notice of rel canonical in the head html near the start.
Thank you for this. I have been fighting with understanding the canonical links for a while, and it was actually some of the links that you have provided that are of more benefit. But the more we know the better, and it is always nice to have a face to a company, so good for you.
Cheers
mark palmer, the videos we post do get transcriptions (sometimes they take a few days though), so that’s helpful.
john andrews, there’s folks at Google looking at that sort of case. We’d definitely like to provide the authoritative source, but it can be tricky to determine things sometimes.
In other words, if you’re a fat you might not be trusted as much?
Dear Mr. Cutts,
The last paragraph of this article seems to be the most important, according to me. Especially the phrase “you can make sure that the rel=canonical is the first or one of the first things in the HEAD section”.
With this sentence, do you confirm that the order of elements in the HEAD section is important for Google(bot)?
Thanks in advance.
I’m working on a website with a search function that can produce a number of url variations for the same results. It’s difficult to work out how to use the rel canonical tag to get the best combination of urls and unique content! I’m tempted to noindex it!
How can there be regular text or body-only tags in the head? If you use an HTML parser, regular text or body-only tags will implicitly close the head and open the body. Specifically, the last case in the “in head” insertion mode of the HTML5 parser implies that any unexpected content will implicitly close the head element. This is how browsers work — is Google’s search engine really using a parser that’s so incompatible with browsers that it doesn’t implicitly close the head tag?
. . . oh, wait, I see. You mean that you actually do close the head element, and therefore in the DOM, the tags are actually in the body, even though in the source they look like they’re in the head. Got it, never mind.
matt, what if I use canonical attribute while commenting on a blog? 🙂 or what effect it’ll get if we use it within container?
It is very good to hear there is little difference between the two…I would always still go for 301 redirects mind…just what I prefer…
At last – Some authoritative advice/info on the rel=canonical case. Matt, you’ve saved a lot of IM and SEO forums valuable page space by addressing this topic. Thank you.
Regards, Jenny
I know it’s a little bit off topic, but why are you using Thesis rather than the default theme? I have several sites with Thesis but many more just using twenty/ten due to easy customization.
It’s a little bit like Microsoft employees using an iPhone, lol.
But on the post, just curious what makes Google think a site got hacked when looking at rel=canonical.
Very very insightful information when it comes to the trust factor of the rel canonical tag. Very important for me as I’ve been doing cross-domain canonicals (sp?).
Hi Matt,
I have a blog called newtechtips.org/. Before 9th of May 2011, I got 56.78% of visits from Google organic search. But I don’t know why from 10th May onward I got just 15.35% of visits from organic search.
I used the webmaster reconsideration tool and informed Google about this issue. But they replied that my site is working fine and no issue was found.
I couldn’t find the cause of this bad ranking. Can you please help me sort out this issue?
Regards,
Rupm
A little bit over my head, but useful never-the-less. I am liking the use of video in your posts!
My question is how do you determine which page is canonical if you publish an article and someone scrapes it? If they get indexed first, is it strictly a time based thing as to who gets the credit for the content?
Thanks for further clarifying this issue, I see it all over the forums and boards.
Hi Matt,
Have to admit – I wasn’t familiar with this topic.
thank you for sharing your knowledge and time with us 🙂
Helmuts
What is the right syntax for a cross-domain rel=”canonical”?
I have two domains pointing to the same hosting directory, with no ability to make a 301 redirect.
I want to point to the main domain name.
I tried to put a cross-domain canonical in my robots.txt.
My main website is abcd.com, and what I made is rel="canonical" href="http://abcd.com" />
It doesn’t seem to be the right way according to the robots.txt checking tool.
Please Matt or somebody show me the right way, the right syntax.
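For the cross-domain question above, here is a rough Python sketch of how a page template might build the tag. It assumes the tag is emitted as a link element in the HEAD of each page served from the secondary domain, since robots.txt has no rel=canonical directive. The host `mirror.example` and the helper name `cross_domain_canonical` are hypothetical; `abcd.com` is taken from the comment.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

MAIN_HOST = "abcd.com"  # main domain named in the comment above


def cross_domain_canonical(request_url, main_host=MAIN_HOST):
    """Build a canonical link element that points the same path at the main host."""
    parts = urlsplit(request_url)
    # Keep scheme, path and query; swap the host and drop any fragment.
    target = urlunsplit((parts.scheme, main_host, parts.path or "/", parts.query, ""))
    return '<link rel="canonical" href="{}" />'.format(escape(target, quote=True))


# Hypothetical request arriving on the secondary domain.
print(cross_domain_canonical("http://mirror.example/page.html"))
```

The returned string would go inside the HEAD of the page, ideally near the top as the post suggests.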
Thanks, now I know what a rel=canonical is. I think it is necessary to add a rel=canonical in the HEAD section.
Hi Matt,
I have a friend who is a Google adwords specialist and generates at the moment about NZ$30,000 a month in revenue for Google.
Recently he has been suspended because his website apparently ‘violates adwords site policy’ by ‘ad spam techniques’; apparently his website includes ‘business models designed to create artificial ad clicks .. or ads and landing pages that make claims or guarantees intended to mislead the user into believing that a third party has a unique relationship with Google which allows them to offer the following: discount on keywords etc etc’.
He has asked for clarification about this but no matter who he speaks to (and thats difficult in itself) no one can tell him what he has actually done to violate policy or what on his site violates policy. His website is at: tweqd.co.nz/
He relies on his site and his adwords account (he has PPC for his own site) to generate his income but is now in the position where he may need to close his business.
He is obviously keen to fix whatever the problem is and get on with his job but no one will tell him what the problem is. He has spoken to the adwords people in India but he can get no further.
Who does he contact to actually find out what is going on? Thanks
thanks… should canonical tag be used in conjuction with syndication meta tags??
I am not sure how much importance this matter has from the point of view of WEB-SPAM, but there is a huge number of websites that do ‘website analytics’… like alexa rank, how much a website is worth, where it is hosted… They generate a bunch of pages that are always similar and have 0 value but clutter my inbox, arriving daily in ‘web alerts’.
Like many, we were eaten up by panda. I thought because our apache server runs both .asp and .php extensions that our site may be appearing to produce duplicate content, so I included a simple redirect for all .asp pages to php. Yes, in the prehistoric days we did run on a windows box. I don’t want to lose any link value or “juice” as Matt called it. Keep in mind that some of the .asp links are 11 years old and from high ranking sites. What would you do? Should I change it back or try a canonical?
Sorry, Matty, I know you made it pretty clear to go with the 301 and call it good, but I was hoping someone could respond with some high level expertise on our specific situation. I’m not sure if it’s appropriate to include all the back info here, so feel free to disregard my next few paragraphs. I have tried the Webmaster Forum, as suggested, and it quickly turned into a WWE verbal wrestling match over there. I mean, a person can only read so many google conspiracy theory responses in one place. No thanks.
Here is the back-story… Proceeding the farmer update, the mighty, not so cuddly panda bear drastically decreased our site’s referral traffic. Ready for the sticker shock? We went from around 20,000 uniques to around 2,500 daily. After redirecting the .asp extensions, traffic doubled, but this is still way outside our usual traffic numbers. I know that the panda update was really focusing in on health sites and garbage info out there. We do write tons of articles on fad diets and try to debunk them with research and funny input about them. I hope the subject matter of our site alone is not somehow making us fall into this content farm site categorization. Any who, that is enough of my whining. Here are the real questions I have for you.
We don’t want to lose any link juice, even if just a tiny bit. I can always manually have the old links point to only .asp extensions and set everything else to .php. Will this affect performance in any way? Maybe I’m over thinking it? But at this point, I’m willing to try anything to get back into google’s good graces (say that three times fast). What would you do? Focus on SEO for yahoo and MSN, ha? I know they (the other search engines) don’t like 301 redirects whatsoever and it’s been speculated that a lot of “link juice”, if not all, is lost in translation. On a side note, feel free to let me know any gripes you have about our site content or site design. Maybe that would help us figure something else out. Thanks!
Hi Matt
What about putting rel=”canonical” on the parent page as well? For some reason our tech department is unable to put it only on child pages as they don’t know which one is the child and which one is the parent. Apparently some third party tool is causing duplicated content. They said that looking into it would take hours, so they wanted to ask me if it was OK to put rel=”canonical” on all pages including the original page (in that case it would reference itself).
If you could answer this, I would really appreciate it.
Al
Hi Matt, I am relatively new to the whole SEO ranking thing. We only recently put our website up and have been told by our webhosting company that we have to get a new domain bc Google black listed us as we have some dodgy links linking to us. Probably irrelevant to the post above and I will be asking the Google webmasters. just thought I would say hello.
Does rel=”canonical” help to distinguish http://example.com from http://www.example.com as well, and is it required for all pages?
Some CMS’s were created Before Google and unfortunately haven’t modified their core software to suit SEO. This is actually pretty helpful for the real-world use case where a publisher using one of these systems needs to fix duplicate content problems inherent in the CMS
Thanks for sharing this video, after watching this video, I realised that there are many duplicate content in my cms Joomla which I didn’t realise. But after this video, I have fixed it and my search engine ranking does improve.
Hello Matt,
Thanks for clearing up the 301 phenomenon.
I think this is the quote of the month:
But in general, as long as your HEAD looks fairly normal, things should be fine.
Great post. Thanks for the clarifications. My only question is what if a small part of the content on a page is derived from some other part of the site and otherwise the content is original. How do we get out of such a situation. To do or not to do (rel=canonical) is the question?
I am glad to hear that the juice lost on a 301 is less than I thought, was a bit worried that redirecting my root to the /content for my blog might hurt.
Matt, when are you guys going to update the Google PageRank on toolbar?
I have a six-month-old site and it has unique content and has naturally acquired links from New York Times, USA Today, etc. Yet its PR is 0. And that hurts me badly.
Why? Because when I try to ask for links from quality websites, the webmasters look at my site, see it’s a good site, then look at PR, see it’s 0, think “probably this guy is doing something shady” and according to Google commandment of “Don’t link into shady areas of Internet”, they don’t link, thinking it’s better to be safe than sorry.
I’d like to bring to your attention the fact that the toolbar PR shows a PR that is almost one year old. One Year Old. The PageRank was updated in January, but it’s misleading. Currently all articles in NYT that are more than a year old have a PR, and all those that are not have a PR of 0. That means the toolbar’s PR is one year old.
And because of that all webmasters of new websites are hurting, because it’s unfairly hard to get quality links when Google is showing PageRank that is one year old.
I think it would be fair to either get rid of PageRank or update it properly so that decent webmasters of new sites aren’t hurt in the process. You used to update PR every month. Why don’t you do it now?
This is important for me and I’m sure for many other webmasters of new quality sites and I look forward to your thoughts on this. Thanks in advance.
Hurrah for the transcription! Did it have anything to do with Graywolf’s post the other day entitled “Matt Cutts, Will You Please Stop Waking Up My Wife in Bed”?
Hi Matt
I always read your posts. Thanks for talking about ‘canonical’ and also for clearing up my doubts about it. Can you please tell me one more thing: which is better for building a website, a www or a non-www domain? Which is better for ranking? Can choosing www or non-www have any effect on ranking? Please tell me as well as all your fans.
Do you suggest that if the URL is properly redirected with a 301 redirect, then there is no need for a rel=canonical tag at all, as per your recommendation in the video? In my experience search engines never alert webmasters to exactly how many duplicate copies of a page they have in their belly that are affecting rankings. The problem is always there: neither search engines nor webmasters are aware of how many URL versions a site page can be opened with, as with WordPress sites.
And lastly for the webmaster community can you please add some clarity that one needs to keep canonical tag on separate pages as well along with home page. Thanks!
Now everything is clear :>.
Thanks for the clarification. I just had this exact conversation with someone about a week ago and told them that Google doesn’t use the canonical tag signal if it’s in the body of any HTML page. They didn’t believe me. Thanks again for covering this topic.
Thanks Matt! Just looked in my Websites code to find out that the famous WordPress SEO Plugin places the rel=canonical at the end of the head section. It’s probably better to do that manually.
If someone was to add a rel=canonical tag into the header of their site would any external links on that page be void of passing any PR authority?
I’m glad you guys are putting more emphasis on canonical URLs. It’s particularly challenging for us internet retailers, because many of these software applications fail to keep the URLs unique and as organized as we’d all like. For example, Magento is a great e-commerce platform but it lacks some out-of-the-box features such as this one. Luckily there are some extensions that can alleviate the problem, though… Hopefully as time goes on, the developers of these platforms will put more stock into canonical URLs as well. Here’s to a better web!
Thanks. For some folks, we’d like to keep one foot in and one foot out, so perhaps rel=canonical will work? If you 301, can you go back and resurrect the original page’s rank if you go back to that page?
Reason why I ask is that I am thinking of subsuming one website into another, broader spectrum site of related material – but want the option to split back up if it’s a bust.
Hi Matt,
Quick question.
On the basis of a site syndicating a summary of an article/blog and it is not a unique excerpt but the first few lines from the start of the article, would a site be penalised whilst using the rel=canonical tag?
In an ideal world and in accordance with webmaster guidelines I know the best practice is to ensure unique summaries are provided on each individual higher level category, but many sites, especially those that are blogs (using the default WordPress for example), would fall foul of this.
I’ll use your blog as an example. If you used rel=canonical on your homepage, then again on this particular blog post, would Google view that as trying to game the feature?
Would your recommendation be to solely use rel=canonical on the individual article page or is the above example permissible in Google’s eyes?
Thanks.
Hi Matt,
I had a question relating to how Googlebot crawls 301 redirects, which I posted on the Google Webmaster central forum http://www.google.com/support/forum/p/Webmasters/thread?tid=683e71557db7fd54&hl=en and I was wondering if we can get your input on this issue. It is basically to do with Googlebot not following 301 redirects. In fact in this specific instance Googlebot did not follow a 301 redirect of the robots.txt from the non-www to the www version. I am sorry that I had to resort to commenting on your blog but I don’t have your email and could do with some authoritative input from Google on this issue. I am also sure a lot more people will benefit from your comments on the issue eventually.
Thanks
Zan
Exactly the information I was looking for…. though still a little confused
A perfect example for use of canonical is vBulletin with its duplicate content in viewthread.php. The same page is served for every post in a thread, and they all get referenced by postid. You can add canonical in the head of these templates, and make it point to the one version that is referenced by the threadid.
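The forum case above could be sketched roughly as follows in Python. Everything here is an assumption for illustration: the query parameter names `p` (post id) and `t` (thread id), the `POST_TO_THREAD` lookup table, the host `forum.example`, and the helper name `thread_canonical` are hypothetical and are not taken from vBulletin itself.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# Hypothetical lookup from post id to the thread that contains it.
POST_TO_THREAD = {"1001": "42", "1002": "42"}


def thread_canonical(url, post_to_thread):
    """Return the thread-level URL for a post- or thread-level URL, or None."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    if "t" in query:
        thread_id = query["t"][0]
    elif "p" in query:
        thread_id = post_to_thread.get(query["p"][0])
    else:
        thread_id = None
    if thread_id is None:
        return None  # unknown post: better to emit no canonical than a wrong one
    # Rebuild the URL with only the thread id, dropping other parameters.
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode({"t": thread_id}), ""))


print(thread_canonical("http://forum.example/viewthread.php?p=1001", POST_TO_THREAD))
```

Every post-level URL in the same thread then resolves to one thread-level URL, which is the value the template would place in its rel=canonical.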
Hi Matt,
Just let you know that Panda doesn’t work the way it should.
A scraper copies my post and publishes it on his blogspot site and you know what happens?
His page is on the very first place in Google SERPs and mine… mine is nowhere! I guess it happens because blogspot has more authority than my site by default.
Just letting you know. Not like you really care about it, but anyway.
Very frustrating and discouraging experience.
I’ve never seen it as bad as this since I started my sites back in 2005.
mm… I have a question: What happens with content people grab from my site using the Creative Commons license I publish with. Should I need to somehow require them to use rel=canonical on their posts?
I mildly understand.
Nice post, Cutts.
Thanks for the post. Many people may be using 302 instead of 301 redirection.
Hi Matt,
Should an external-site rel=canonical be used as a measure to reduce bounce rates? So I can keep the content on my URL with a canonical tag … whereby I eliminate bounce rate issues?
Pls. shed some light on this.
Thanks.
Jaspal
Nice and clear video 🙂
We were accustomed to work hard on .htaccess rules and URLs rewriting and try to avoid duplicate content by hiding session ID’s from URLs… It’s not easy to change habits 😉
Thank you 🙂
One or two weeks ago I noticed some new messages in the webmaster tools of one of my projects. The diagnostic tool ignores the canonical-tag and suggests different meta-descriptions for the canonical redirected pages.
Is it important to write an individual description for redirected pages?
These pages shouldn’t exist for google because of the canonical tag.
The canonical tag redirects to homepage.
Sorry to be so persistent, but I’m still concerned about the site restructuring we are doing using 301s. We have close to 100,000 links according to webmaster tools and just recently redirected a large number of pages. Here is my concern; if link juice is lost, even if just a tad, would that not hurt us? Also, would google like so many darn pages 301-ed, I would not think so. I’m still going back and forth on how to proceed. Input or suggestions? What would Matt Cutts do?
Hi,
Thanks for the video. I had an issue with the canonical last month. On our new ecommerce website, we implemented the rel=canonical tag but there was a bug. As a consequence, a part of our website was not indexed. I had never understood that a canonical could block indexation. Can you confirm that this is the case? When I corrected the bug, the pages started to appear in the Google index! Thanks for your input.
It was all clear. I have a Joomla site, where rel=canonical is tricky to manage, but there must be some plugin to do the job. Thanks for the tips.
Greetings
I was wondering about how they would deal with pages being hacked and having this inserted into the html. I’m glad that they have thought of this and are working to make sure they can’t be hijacked.
Hi Matt,
Thanks for sharing this info about rel=canonical, its really helpful.
Graphixter Media | Website Design & Development Company
Thanks for the video.
So i hope i can use the canonicals correct in the future.
But i think it’s easier than a list of redirects in the .htaccess 😉
Nice purple background! 😉
Thanks for posting this Matt. It’s clear to me now 🙂
Matt Johnson,
“Also, would google like so many darn pages 301-ed” – I think Google understands the need for a site to make structural changes to its content hierarchy. This is why 301 exists and is considered to be best practices when moving content from one location to another. I don’t think you need to be concerned about how many pages you are redirecting, more that you are doing it right and in the best interests of your audience. Let’s face it, Google will not have an issue with the quantity of redirects as in the grander scheme, the redirects from your site would be minimal compared to the bigger picture of the web.
The only reason the number of 301’s would be an issue is if you are trying to exploit the function. Google will presumably use their own patterns to identify normal and perfectly reasonable behaviour from unscrupulous tactics.
In my view, I would focus on the method you are using rather than be concerned with the quantity.
There are some nice WordPress Canonical Plugins that help streamline this for SEO
Cheers for clearing up the whole 301 saga Matt, interesting to see why some of the re-directs don’t work.
Awesome description on the rel=canonical, it’s a ton of valuable information. Thx!
Angela, similarly Yoast has a decent one for Magento, for which I’ve run into some nightmare problems in the past!
If you use the tag, should you still use the rel=”canonical”?
I have a duplicate page that I want the search engines to completely ignore and someone told me I should use the rel=”canonical” instead of the “no index, no follow.”
Can you use both? Should I use both?
Hi Matt,
We run an ecommerce store that takes our product titles and creates a URL out of them, e.g. netpetshop.co.uk/p-37105-kong-wobbler-small-treat-dispenser.aspx
We sometimes change the name of a product. For example, we might rename this one to “food dispenser”, as that is more appropriate for food. This changes the URL but retains the old one, so we have used the rel=canonical tag on all of our products so that if we do change a name, the latest title becomes the most important. Having now watched your video, maybe doing a 301 is better, but that would be very time-consuming, as the store is not set up to handle this and it would be a manual process. Are we losing any link juice because of what we are doing?
Great post Cheers Chris.
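For a store like the one described above, where the URL carries a stable product ID followed by a title-derived slug, the 301 need not be a manual process. The following is a minimal sketch only: the URL pattern is inferred from the single example URL in the comment, and `redirect_target`, `current_slugs`, and the renamed slug are hypothetical names made up for illustration, not part of any real store platform.

```python
import re

# Assumed URL shape, inferred from the one example: /p-<id>-<slug>.aspx
PRODUCT_URL = re.compile(r"^/p-(\d+)-([a-z0-9-]+)\.aspx$")


def redirect_target(path, current_slugs):
    """Return the path to 301 to, or None if no redirect is needed.

    current_slugs is a hypothetical mapping of product ID -> current slug,
    e.g. built from the product table.
    """
    match = PRODUCT_URL.match(path)
    if match is None:
        return None  # not a product URL
    product_id, requested_slug = match.group(1), match.group(2)
    current = current_slugs.get(product_id)
    if current is None or current == requested_slug:
        return None  # unknown product, or already on the current URL
    return f"/p-{product_id}-{current}.aspx"


# Hypothetical rename from "treat dispenser" to "food dispenser".
slugs = {"37105": "kong-wobbler-small-food-dispenser"}
old_path = "/p-37105-kong-wobbler-small-treat-dispenser.aspx"
print(redirect_target(old_path, slugs))
```

Because the lookup keys on the product ID rather than the title, every older slug for the same product would resolve to whatever the current one is, without anyone maintaining a list by hand.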
Angela with regard to canonical plug-ins, Yoast offers one for anyone developing with Magento – I know I’ve run into problems with this issue with Magento in the past.
Matt I can’t figure out the answer to this anywhere. I have content that I made but I want to change the location. I don’t ever want to refer to the old location because I don’t want it there, so it seems like rel=canonical and 301 are not good solutions. If I delete it, how long should I wait to put it in its new location without upsetting the alg?
Can I say we finally got an answer about this mysterious rel=canonical? Thanks, Matt!
A company called “eToro” is somehow gaming Google to inject spam (in the form of URL variables) into the site URLs listed at the bottom of Google Blog Search results. Will using rel=canonical on affected sites do anything to stop this? It’s affecting my blog’s results, and I added rel=canonicals but the spam is still showing up for new blog posts.
A good writeup Matt, glad to hear that rel=canonical is not the be all and end all. It seems 301 redirects are still the way to go if possible.
It’s nice to hear from someone reputable about rel=canonicals and 301s. Cheers Matt.
Thanks G for creating the rel=canonical tag! I like it because it’s so easy to implement. I will also verify my placement to make sure it’s the first thing in the HEAD as you suggest.
Thanks Matt!
On the topic of duplicate content, I want to ask you a question (site linked to above)-
I have a single page, served to both larger screens and handheld devices. I use CSS @media to hide large sections outside the main content area (like ads, redundant navigation, footer, bookmarking links, etc.) from handheld devices, so the page fits better on these smaller screens and loads faster.
Does it amount to cloaking? I mean, it’s not being done with any malicious intent.
Thanks for your time.
Ah, it’s definitely still possible to shoot yourself in the foot and point your rel=canonical at a nonexistent page, as I foolishly did just that with around thirty pages. Needless to say, as far as search engines go, they just vanished, and took some time to get reestablished. Luckily I did have a custom error page, which had a correct rel=canonical pointed at my home page, but it did result in quite a lot of inappropriate queries landing on my home page.
Moral = check before you publish!!!
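A pre-publish check along these lines can be partly automated. The following is a rough sketch using only Python's standard library; `CanonicalFinder` and `head_canonicals` are names invented for this example. It reports only the rel=canonical links found between an explicit `<head>` and `</head>`, which is a simplification and not a description of how Google parses pages.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect rel=canonical hrefs and note whether each was inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.found = []  # list of (href, was_in_head) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link":
            attr = dict(attrs)
            rels = (attr.get("rel") or "").lower().split()
            if "canonical" in rels and attr.get("href"):
                self.found.append((attr["href"], self.in_head))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def head_canonicals(html):
    """Return canonical URLs declared inside the head; ignore any in the body."""
    finder = CanonicalFinder()
    finder.feed(html)
    return [href for href, in_head in finder.found if in_head]


page = (
    '<html><head><title>t</title>'
    '<link rel="canonical" href="http://example.com/a"></head>'
    '<body><link rel="canonical" href="http://evil.example/b"></body></html>'
)
print(head_canonicals(page))
```

Each URL returned could then be requested before publishing to confirm it does not answer with a 404; that network step is left out of the sketch.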
Hi Matt,
Thanks for reminding me of an issue that I was worried about a year or so ago but then got busy and forgot about. We just addressed it and fixed it. There never seems to be time to fix everything before things change and something else becomes important to mend.
Hi Matt,
Thanks for the video- we had been using rel=canonical links on our in house CMS, for copies of our pages with duplicate URLs- now switching them over to 301s. Some 1000+ sites!
Hi Matt,
Sorry for my previous comments.
Should I use a rel=canonical for the RSS feed?
Thanks 😉
Thank you for clarifying this. The videos are also very useful, makes understanding (for me anyway) much easier.
Good stuff, but covered in more depth at SEOmoz late last year in 301 Redirect or Rel=Canonical – Which One Should You Use?
Thank you for making this clear. I have naturally preferred 301 on some pages and have used rel=canonical in other cases. I have a case now where products change slightly every year but the keywords stay the same. We are using an automatically generated 301 redirect to point from the old to the new product. I guess as long as it’s possible, a 301 redirect is the best thing to do.
Hello Matt, very interesting article. I would also like to ask you a question that is really interesting for many people, and one they can’t get the right answer to. Let’s say there is a company ‘ABC’. It is doing pretty well in Google (first page for a couple of different words). It also has a lot of strong, thematic external links. Now the company changes its name (and also its domain) to, say, “XYZ”. How can this influence its position in Google? Is a 301 the solution for this problem? The company is afraid of losing its position (and sales) because of a drop in Google.
Hi Matt,
I’d like to use rel=canonical to solve a problem, but I don’t think it will help. Here’s why:
We have hundreds of pages of our site indexed and cached independently for both http:// and https:// versions. We are not even sure where Google gets the links from. Pages that use https are blocked and in addition, any links to an https:// version of a page from these pages are tagged noindex,nofollow. Google still indexes them, more frequently than the version we want.
But that’s not the main issue. All these https:// pages have 301 redirects to the http:// version. They have always had them, but still in the index they are cached independently. So my question is, how can rel=canonical help if it shouldn’t be possible for the crawler to actually load the https:// version of the page? And why does the index maintain caches of url’s that are 301 redirected, and have never not been? We will literally get both an http:// and an https:// version of a page in the index within a week after a new page is put up.
Once again Matt, great stuff. I’m a bit confused on one issue though: aren’t there situations where you would ONLY use the 301? Or does “rel=canonical” replace that? Like a rel “redirect” to a supported browser?
Matt – Long time follower. Sorry we missed each other in NC. My question is regarding all the buzz about Schema.org. It seems the information is a little sparse if not dry. What’s the poop?
Viewed the recommended video first at http://www.google.com/support/webmasters/bin/answer.py?answer=139394
Very useful. Fascinated by the session id problem. One website having hundreds of versions of the Terms & Conditions page in the Google database – each one with a different session ID!
How many other sites are there out there with just that problem???
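One way to picture the session ID problem is as a URL normalisation step: many URLs differ only in a session parameter and should all collapse to one canonical URL. The following is a minimal sketch, assuming the session ID travels as a query parameter; the parameter names in `SESSION_PARAMS` are guesses at common conventions, and `strip_session_params` is a name invented for this example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed parameter names; a real site would list its own.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def strip_session_params(url):
    """Return url with session-style query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


# Two visits to the same page with different session IDs collapse to one URL.
print(strip_session_params("http://example.com/terms?sid=abc123"))
print(strip_session_params("http://example.com/terms?sid=zzz999&lang=en"))
```

The cleaned URL is the one a page could name in its rel=canonical, or redirect to with a 301, so that hundreds of session variants point at a single address.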
Hey, Matt.
This might be a question better geared towards the webmaster help forum, but this caught my eye because rel=canonical is the only way that I can figure to fix the issue.
A client of mine is using squarespace as a CMS-catchall thing. And while the interface is great from a design standpoint, from a webmaster standpoint I feel like I’m perpetually shooting myself in the foot.
So here’s the deal. Said client uses squarespace. Has a custom domain through squarespace – but they have it as a CNAME to the subdomain automatically given on squarespace – which is giving duplicate content issues like woah. Or at least I think it is. How does Google recognize CNAMES?
But anyway. I can’t do 301 redirects to the custom domain, because it’s the same thing and it would give me a redirect loop with a CNAME. There’s no .htaccess file that I can get into, and I can’t FTP in. My only recourse seems to be rel=canonical.
This is all assuming that Google is treating these two domains as separate content. My research seems to indicate that is in fact the case.
Please correct me if I’m wrong. But long question short: how does Google classify a CNAME, and is this something that could be rectified with rel=canonical in a last-ditch effort?
This is a cool feature. In the past I have seen some sub-pages of my sites get indexed that I didn’t necessarily want to direct users to, while other very similar pages don’t necessarily get indexed. I’ll discuss with my programmers whether this might solve the issue.
Nice stuff, this issue is now cleared to me. Seen some WP plugin for this. Thanks Matt!
Hi Matt,
Thank you for this useful topic. But there are still questions. I wonder how other search engines (e.g. Yahoo or Bing) treat rel=”canonical”. Do they all support cross-domain rel=”canonical”? What can we do if the answer is no and we can’t implement a 301 redirect (in order to meet the redirect policy of Google AdWords)?
chenyutn
It is little things like this that people sometimes have a hard time understanding: which is more important to use when it comes to the technical aspect of how your site is displayed in the search results pages.
Good to know you guys have some leeway on this rather than taking it as set in stone. Thanks for clarifying, have often wondered which was best to use.
Well Matt, in your blog I saw rel=canonical twice, on two different URLs for the same content. Let me show you an example: http://www.mattcutts.com/blog/rel-canonical-html-head/ and http://www.mattcutts.com/blog/type/googleseo/.
So will Google consider this tag or ignore it?
I’ve lately been wondering why the 301 re-direct would be used instead of just a simple canonical tag. This question and subsequent answer has cleared it up somewhat, although I’m still new with all this kind of research so I’m still learning. Thanks, Adrian
Thanks for this info, Matt. I just recently used .htaccess 301 redirects on an entire website to move from .html to a dynamic (WordPress) based site, for simplicity. I was hoping I made the right choice in doing this, and you’ve really cleared that up. I also implement rel=canonical on all pages. I knew that wasn’t a redirect, though =p. Anyways take care!
Hello Matt,
As always this was a great video, and it’s going to help me a lot in fixing some duplicate content issues on my old site. But I was actually commenting because I am a 17-year-old college student who has had my own website for a little over a year now, and your videos have helped me put my black-hat ways to rest. Matt, I am currently in need of a mentor to help me, and you seem like such a cool, selfless guy. I scoured the internet fruitlessly for a way to contact you personally, and this was my last-ditch effort. I would appreciate it so very much if you could lend me a little bit of your valuable time; it would help my life tremendously. I’m willing to do anything it takes to ensure a better life for myself, economically and otherwise.
All due respect,
Arya Bina
I have used canonicals for some of my posts, but Google indexed those pages. How is that possible?
Thank you for “A rel=canonical corner case”; this is really very helpful for us. Now I know what a rel=canonical is, and I think it is necessary to add a rel=canonical in the head section.
Good info, but it brings up more questions on how to solve the issue. Angela commented that there are some nice WordPress canonical plugins that help streamline this for SEO, and this is what I need to delve further into to make sure I don’t mess up my previous efforts.
A few months ago I was about to study this ‘rel=canonical’ but, as I usually do, I neglected it. Now, after reading this, I’m sure I would like to do something with it. Thanks Matt.
Thanks for the great article and video. As someone whose site is hosted on a family member’s account, I don’t have access to set up a 301 redirect, so rel=canonical is the best solution for me at this moment. As a side note, I love your blog and hope the great tips keep coming.
Canonicalization should at least be done on a website’s main pages (especially for e-commerce). The true benefit of rel=canonical is control over the link juice that you pass down to your pages via a preferred, cleaner URL. I forgot about rel=canonical a long time ago, since it is a long, hideous process, but now I realize what I was missing. Thanks for bringing up the topic again, Matt! This taps me 🙂
Hey Matt,
Great blog! I have some questions about Google Website Optimizer.
1. Isn’t this classed as cloaking? For example, if I use it for my homepage or products page, won’t Google think I’m showing different content?
2. Will the spider crawl the page I’m testing? I want the spider to keep on crawling my original page until I decide to swap, if in fact I do swap 🙂
Thank you on this great job on the tool it looks awesome!
Cheers,
Cam.
Hi, after using canonical tag, will the duplicate page get de-indexed?
Thanks
Matt, you’re a saint amongst men….LOL….I dove into researching 301 redirects vs. rel=canonical and just happened across this video. Saved me time. Thank you!
Hi Matt,
Thanks for your post. Still, duplicate content is one of the things that you sometimes just can’t avoid. Until now, I thought the rel=canonical was always used as the preferred one by Google. If I got it right, there is no way of having duplicate content on the website without harming the rankings.
I have canonicals in place, but Google has still indexed all of the pages… how normal is this? I have never seen this before.
Hey Matt,
I’m suffering from a real canonical issue and am looking to resolve it, and I want to get some benefit for the website in the form of a higher rank.
My website has been crawled by search engines, but I will benefit from this article…
This is a really good post and I will benefit from it…
I also want to thank Ian Smith for sharing the video… thank you…:-)
http://www.google.com/support/webmasters/bin/answer.py?answer=139394
Very interesting post Matt, my intern is very grateful. Helped him to get to grips with it all.
Thanks Matt – makes sense – but what are the real possibilities of being penalised for duplicate content if you fail to do either?
Thanks for clearing things up about rel=canonical. I knew that Googlebot was smart enough to handle this. I think I can relax now and not be afraid of a hacker attack again.
Matt, I have a question, On my blog I want to offer a clutter free page to my readers. The urls are like this:
Original Post : ///
Clutter Free : /read/
I am thinking whether I should add canonical url to the clutter free post or block /read/* altogether in robots.txt
Very interesting article I have been trying to do the .htaccess file properly, the video is clear and easy to understand for me, thanks again.
JF-17, here is the answer to your question. According to the About rel=”canonical” page on Google, “A canonical page is the preferred version of a set of pages with highly similar content.” It also says that if Google knows the pages have the same content, they may only index one version of the duplicate pages in the Google index. The keyword is “may”, so I bet it varies per situation. In the same paragraph it goes on to say, “Our algorithms select the page we think best answers the user’s query. Now, however, users can specify a canonical page to search engines by adding a <link> element with the attribute rel=”canonical” to the <head> section of the non-canonical version of the page. Adding this link and attribute lets site owners identify sets of identical content and suggest to Google: “Of all these pages with identical content, this page is the most useful. Please prioritize it in search results.”” I suggest that if you want to find out the most information you can about your question, you read the first link that Google wrote about the rel=canonical tag and this link
I would advise, not to rely too much on “duplicate content” websites to reference the original source of the content. In the case of websites “stealing” content or duplicate content for various other “non-malicious” reasons, best would be if the original source of the content says “I am the source”. You could then focus on comparing which websites shows new content first (timestamps) and “kick out” all duplicate content pages -> even those who try to pretend they were the original source. Maybe it could be aligned with something like “webmasters site verification”?
I just started to dive in and study rel=canonical tags very recently and this was such a great post for me to find. Thanks for giving examples like you did. That really helped me get a better grasp of when and how to use them. The video was great too! Thanks Matt!
Hi Matt,
You did not say what the effect of using rel=canonical would be on SEO. Will it decrease the rank and PR, or will it increase it?
Thanks
Mark
A wonderful job explaining the use of the canonical along with where to place it in the head tag Mr Cutts. My question is does it matter if someone puts just the tag in the webpage code or do they also have to set the server to point to the preferred domain as well for Google to index it correctly and not penalize the website for duplicate content?
Hey Matt, did you notice your blog title and the Home tab link to:
http://www.mattcutts.com/blog
but that is redirected to:
http://www.mattcutts.com/blog/
?
But the rel=”canonical” saves the day :-).
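The /blog to /blog/ behaviour described above is a common normalisation rule. The following is a small sketch of it in Python, assuming the rule is "append a slash when the last path segment has no file extension"; `with_trailing_slash` is a name invented for this example and says nothing about how this blog's server is actually configured.

```python
from urllib.parse import urlsplit, urlunsplit


def with_trailing_slash(url):
    """Return url with a trailing slash added to directory-style paths."""
    parts = urlsplit(url)
    path = parts.path or "/"
    last_segment = path.rsplit("/", 1)[-1]
    # Leave file-like paths (page.html) and already-slashed paths alone.
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


print(with_trailing_slash("http://www.mattcutts.com/blog"))
print(with_trailing_slash("http://www.mattcutts.com/blog/"))
```

A server could compare the requested URL with this normalised form and answer with a 301 when they differ, while the page itself names the slashed form in its rel=canonical.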
I know these rel tags are useful, but wouldn’t they put white-hat but novice webmasters and non-tech-savvy content writers under pressure to learn HTML?
And to my understanding, the first page found by Google’s robots with a rel=canonical tag is accepted as the original, even if another blog later copies the content and adds rel=canonical. If that is the case, it’s quite logical.
Hi,
The story was good, but a question popped up in my mind. What if I have unique content on a page and someone steals it? If I have a rel=canonical there and the content stealer adds a rel=canonical on his page, how would Google figure out which one is the best canonical?
Secondly, can we use it in cases such as http://www.example.com/harley/ and http://www.example.com/harley/?p=125? Is redirecting the dynamic URL to the static version better, or can I preserve my static page’s worth by using a canonical?
It isn’t mentioned anywhere whether one is supposed to get an alert. I hope I will get an email notification whenever someone responds to this.
“But in general, as long as your HEAD looks fairly normal, things should be fine.”
I am so effen screwed here. My head looks like a mushy pumpkin. 🙁
Dear Matt,
It is amazingly interesting to see how Google evolves and how different ranking algorithms mature up to the point where they can really make a difference in the search experience.
However, I don’t understand something. I know a guy who registered 200 web domains, put them on 200 hosting accounts, and now he ranks #1 for everything. Why isn’t he caught by the Googlebots? I mean, all his sites are heavily interlinked. All of them are blank, low-content templates from templatemonster.com (nulled, I might also add).
Hi Matt,
I would like to thank you on these best practices about the rel=canonical tag.
In regard to rel=canonical I have a question yet to be answered.
I have always wondered about the effective use of the rel=canonical tag on categories with pagers (when the main page of the category just cannot hold all the items, so it is wise to put them on pages 2, 3, etc.). With my current CMS, I don’t have an option to set separate meta elements such as meta title, description and keywords for each category page. I would like to fix every HTML suggestion in my Webmaster Tools reports, and these are reported as duplicates. I wonder if putting rel=canonical on each of the pager pages would be my best option considering my situation, or whether it will only cause problems such as PR not flowing to the items on these pages. (Can that be true in such a case?)
Just read about you on “see blue” I’m the Google Ambassador for UK 2011-2012 and was so glad to read your alumni story! So awesome! Got to be the best alumni story yet.
Hi Matt
Hope your injury has healed, good to see you’re giving it time. Think it’s a blessing us folk in our mid to late 30’s get to enjoy.
You likely don’t remember our chat about a site having regionalisation at webmasterworld in Boston back in 2005 (from memory), I think we were at the urinal at the time.
Anyhow, one of the sites we talked about has somehow fallen foul of one of G’s filters/penalties, and I’m clueless as to how. Have made a post onto the google.com/webmasters url as you have suggested, and received one response, which was of no help. Subject is: ‘Why Google referrals suddenly stopped to a top ranking site ‘. Would be great if you could get members of your team to participate on the forum. I’m pretty sure it’s a filter/penalty because when you search for a phrase from any article on the site, G doesn’t even feature my site in the SERP’s, I see only other sites that have scraped my site instead.
Please have a member of your team look into the domain negotiationtraining.com.au
I’ve not touched the site for YEARS.
It’s a resource intensive site comprising free negotiation expert written articles and Q&A’s that I expect has been gathering organic links over the years.
Thanks Calum
what happens when this tag is not correctly implemented?
I try and set my websites up so there is only one version of a page. However, having worked on various CMS’s recently, I’ve realised the importance of the canonical tag.
What continually surprises me is how CMS’s are built without much thought being given to SEO structure.
Thanks Matt, good advice to put the ‘rel=canonical’ as one of the first things in the head section! I hadn’t thought of some of the possible pitfalls you outline but having it as one of the first items in the source code should avoid any ambiguity. Thanks.
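To verify placement, one could list whatever appears in the head ahead of the canonical link. The following is a rough sketch using Python's standard `html.parser`; `HeadOrder` and `tags_before_canonical` are names invented for this example, and the check relies on an explicit `<head>` tag being present.

```python
from html.parser import HTMLParser


class HeadOrder(HTMLParser):
    """Record the tags that open inside <head> before the canonical link."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.seen_canonical = False
        self.before = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
            return
        if not self.in_head or self.seen_canonical:
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels:
            self.seen_canonical = True
        else:
            self.before.append(tag)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def tags_before_canonical(html):
    """Return tag names preceding the canonical link, or None if there is none."""
    parser = HeadOrder()
    parser.feed(html)
    return parser.before if parser.seen_canonical else None


page = (
    '<head><meta charset="utf-8"><title>x</title>'
    '<link rel="canonical" href="http://example.com/"></head>'
)
print(tags_before_canonical(page))
```

An empty list means the canonical link is the first element in the head; anything unexpected in the list, such as tags that normally belong in the body, is worth a closer look.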
Al sefati, I think that if the tag is not implemented correctly it won’t take it. It’s like if you have a wrongly implemented 301, it just won’t happen.
I had a canonical issue on my site, but I solved it with a 302. Can you advise me which is better for a website, a 302 or a 301?
That sounds quite interesting but I made different experiences. Sometimes a canc.redirect is in my opinion the better solution.
it should fit to the page and the searchengine realizes that, imho!
cheers and best regards,
chris
Thanks for making it easier to understand Matt. Will put it into practise. Love the way you use videos!
Thanks for your helpful post.
Some days ago I saw a post on this matter in a forum. Many people answering there but it was not very clear to me. But now, I totally understand the ‘rel=canonical’ case. Thanks again….
Please tell me one thing – After using canonical tag, will the duplicate page get re-indexed or not?
Thanks for the information about ‘rel=canonical’. To be honest, I had no knowledge of this, but thanks to this information it is now quite clear. Thanks so much Matt.
Hi Matt
My ecommerce site doesn’t use the canonical. Do you think I need to add it to our site? I have read a few blogs where people have had pages de-indexed because of this tag, but I guess that’s just speculation?
Thanks for clarifying this issue Matt. Despite the horrid name, rel canonicals are no longer so scary 🙂 Keep up the good work – us webmasters really do appreciate it.
Good explanation of how to properly use 301 redirects. I always thought that rel=canonical was the best way to go, but this video explains it much more thoroughly. Thanks!
Quick question regarding the code.
Does it matter which way round the tag appears or do both of the following work?
OR
Hi,
I have a Magento-based website, and I have a question related to the meta description.
All the product pages on my website have “TOTALLY” different content, but what if all the product pages in a particular category have nearly the same meta description?
Example:
There are 2 products in the same category.
Both have totally different content on the page.
But they have nearly the same meta description.
So, will it penalize my website?
Thanks in advance.
If there is not much lost here with rel=canonical in relation to PageRank, why worry that much about it? If it’s very little difference, then why the concern over it? Just curious 😉
I lost all my link juice and PageRank when changing domains, even though I used 301s on all URLs to exactly the same content on the new domain.
Keywords are coming back at 200 a day, so it will take 8 weeks to get all keywords indexed again.
Without PR and domain authority the keywords won’t rank where they did in the SERPs, so I have to wait 3 months for the next PR update.
Here are the facts showing actual screenshots from Google Webmaster Tools
http://wpsites.net/seo/what-happens-to-your-keyword-search-queries-when-you-change-domains/
Your info is out of date, Matt.
Hi Matt,
Can you guide me?
How should we use the canonical tag on a static website?
Should it point towards the home page (index.html), or must every page have its own canonical?
Please tell me.
Hello Matt,
In a post-Panda world, how would you use canonicals for product variations of the same product?
– Let’s say, different colours and different sizes of essentially the same product. I have seen different approaches so far like:
1. setting up javascript with all the variations included
2. setting up different pages for each product redirecting to a canonical for the base product
3. setting up different pages with different metas and descriptions for each product
I would go for the last method, would you?
Please answer me, thanks a lot!