Learn More about the Canonical Link Element

February 25, 2009

in Google/SEO

A week or so ago I did a post about the canonical link tag, including a pointer to a 3-4 minute video about the tag.

If you enjoyed that video but wanted to learn more, last week I sat down and recreated the presentation that I did at SMX West. You can watch the “director’s cut” of the video (click in the lower-right of the video to get the high-quality version). Here’s the video:

And you can access the slides directly or follow along here:

One exciting new development even since we made the video is that Ask announced that they will support the canonical tag. This means that pretty much all the major search engines will support this as an open standard. That should make life easier for site owners, developers, and webmasters.

If you like this video, you might also want to check out our new webmaster video channel on YouTube.


Philipp Lenssen February 25, 2009 at 12:40 pm

> This means that pretty much all the major search engines
> will support this as an open standard.

Does that include more non-US focused engines, say Baidu or Yandex?

Matt Cutts February 25, 2009 at 12:48 pm

Philipp, I was speaking in terms of global market share rather than local markets, but the nice thing is that this is a completely open standard, and the data is live on the web (not locked in specifically to any one search engine). So Baidu or Yandex could easily add support for this tag and benefit from the standard as well. I would love it if Baidu or Yandex decided to do that.

Agent SEO February 25, 2009 at 12:48 pm

I love the video posts…gives a more personal feel to the blog….

Duplicate content can be such a bear to deal with. From the redirection issues to server stuff (www vs. non-www), it makes you want to pull your hair out.

I’ve been having issues with my WordPress blog when I place a post in two different categories, and only thanks to Google Webmaster Tools did I realize the problem. Luckily I was able to fix the issue and remove the 404 pages it caused from the index via one of the Webmaster Tools.

Daniel Tunkelang February 25, 2009 at 12:58 pm

Matt, great to see Google and friends (well, peers) doing this. We’ve run into this issue in faceted search for years.

http://thenoisychannel.com/2009/02/13/canonical-urls-and-faceted-search/

Morris Rosenthal February 25, 2009 at 3:12 pm

Matt,

Can I understand your presentation to mean that as long as I have submitted a sitemap, Google will automatically assign all crunchy goodness to the URL given in the sitemap, including whatever value links of the other type (relative vs absolute) might have generated?

Morris

Klaus Johannes Rusch February 25, 2009 at 3:23 pm

Matt,

thanks for sharing the presentation; this is most helpful.

One question I did not see answered (or asked) yet is whether the specified canonical URL itself may redirect to another page using a 302 redirect.

The use case that comes to mind is a canonical URL without a session identifier which redirects to the same URL with a session identifier, or a vanity URL that redirects to a lengthy string of unreadable garbage.

I would hope that the specified canonical URL will be used then — is that the case?

Morris Rosenthal February 25, 2009 at 3:51 pm

Matt,

Thinking about it, putting in the canonical tag will only take a couple hours, so that’s what I’ll do. Thanks, I can stop obsessing about this now:-)

The main reason I don’t want to go all absolute is that my favored HTML editor (from 1995) won’t let me navigate on my own hard drive if I use absolute URLs.

Morris

Dave (original) February 25, 2009 at 4:28 pm

Matt, shouldn’t the big 3 SE be promoting this new element to those who are NOT aware of it?

Morris Rosenthal February 25, 2009 at 4:39 pm

Matt,

An addition to the “ways to shoot yourself in the foot”. I almost stuffed the tag in my Blogger template, that would have been pretty stupid:-)

Morris

Matt Cutts February 25, 2009 at 8:37 pm

Morris Rosenthal, putting the url in the sitemap suggests or indicates a preference. We still reserve the right to make the final call, but we do take that preference into account.

Matt Cutts February 25, 2009 at 8:38 pm

Klaus Johannes Rusch, we’ll try to do something reasonable in this case, but it’s better if you can point to your preferred canonical url in just one hop and without redirects in the chain.

Dave, you have to start somewhere.
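
To illustrate the "one hop, no redirects" advice given to Klaus above, here is a minimal Python sketch of a check a site owner might run before publishing a canonical URL. It assumes you already have your site's redirect rules as a simple source-to-target mapping; the function name `count_redirect_hops` and the mapping are hypothetical, and nothing here describes how any search engine actually processes the tag.

```python
def count_redirect_hops(canonical, redirects, max_hops=10):
    """Count how many redirects sit behind a canonical URL.

    `redirects` is an assumed dict of source URL -> target URL taken from
    your own server configuration. A result of 0 means the canonical URL
    is served directly, which is the preferred situation.
    """
    seen = {canonical}
    url = canonical
    hops = 0
    while url in redirects:
        url = redirects[url]
        hops += 1
        # Guard against loops and unreasonably long chains.
        if url in seen or hops > max_hops:
            raise ValueError("redirect loop or chain too long")
        seen.add(url)
    return hops
```

A non-zero result suggests choosing a different canonical URL, or removing the redirect, so that the preferred URL resolves without a chain.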

Jag (SEO) February 25, 2009 at 8:51 pm

Matt, Dave is right: the big 3 SEs should find some way to promote the Canonical Link Element to users. Maybe some hint could be shown in search results, or Google R&D may have a better idea ;)

Sasa Ebach February 26, 2009 at 1:02 am

If I understand correctly, any kind of *external link juice* to pages like

http://www.example.com/page.html (canonical)
http://www.example.com/page.html?sort_by=1
http://www.example.com/page.html?tracking=123

will NOT be accumulated to the canonical. Correct?

Until it is, I see no use in recommending this practice. Being able to avoid duplicate content is great, but not at the cost of splitting link juice and losing ranking power that way.

I hope that the major SEs reconsider this, because in general I think that the canonical tag is a great thing. In the meantime I will use 301 redirects like always.
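
For URL variants like the ones listed in this comment, a site can compute the canonical form itself before emitting the tag. The following is a rough Python sketch under stated assumptions: the parameter names `sort_by` and `tracking` come from the example URLs above, while the function name `canonical_url` and the idea that these parameters never change the page content are assumptions that each site would need to verify for itself.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: these parameters only affect sorting or tracking,
# never which content the page shows.
IGNORED_PARAMS = {"sort_by", "tracking"}


def canonical_url(url, ignored=IGNORED_PARAMS):
    """Return `url` with the ignored query parameters and any fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

Parameters that do select different content (an `id`, for example) are left in place, so pages that are genuinely different keep distinct canonical URLs.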

Morris Rosenthal February 26, 2009 at 6:19 am

Matt,

As a future tweak, how about recognizing a canonical tag outside an HTML header, like dropped into the end of a PDF document? Could help Google keep straight where the true home of the document is, since most people won’t bother hacking it out.

Morris

Gerry February 26, 2009 at 6:40 am

Curious about something: I know it is possible to have two articles whose URLs differ only in case (i.e. lowercase vs. uppercase), but it is incredibly uncommon for them to have different content. Why do search engines consider this to be duplicate content?

angilina February 26, 2009 at 7:44 am

Hello Matt,

I think this new feature is something very special. I used to do so many redirects and things like that in my forum, all to prevent generation of multiple URLs for the same pages.

But Matt, I am a little bit confused:

If I have a page

matt-cutts.php

And I make a canonical link tag to tell Google that I want

mysite.com/matt-cutts.php

To be the preferred URL.
Now what if there are some other URLs like these?

matt-cutts.php?id=1
matt-cutts.php?id=2

These two pages are not the same as

matt-cutts.php

Is Google going to ignore all URLs except

mysite.com/matt-cutts.php

?

Regards

cnc turning February 26, 2009 at 8:06 am

Why are backlinks from nofollow links on Matt’s blog showing up at the top of site explorer checks? I thought nofollow means nofollow. What gives? I need honesty!

Ani López February 26, 2009 at 8:33 am

As Yusef Hassan Montero pointed out in his great talk at Search Congress in Barcelona, something “bad” about copyleft-licensed content is that it helps to create duplicated content all over the internet (from all other points of view it is marvellous, I should say). This creates 2 problems: authorship attribution, and getting search engine algorithms to attribute relevance to the original.

Although this canonical element can be very useful and is a nice first step, it does not work across domains, so it doesn’t help with the duplicate content/relevancy attribution problem.

Some (cross-domain) attribute that could be placed in any HTML tag to indicate the origin of the content would help more, e.g.:
<div rel="canonical" href="http://example.com/page.html"><p>bla bla bla</p></div>

Any thoughts about this?

Dan Stephenson February 26, 2009 at 9:05 am

Hi Matt,

I must say, I think that the canonical link tag is a fantastic idea! I’ve been waiting for something like this for a long time now.

Ian M February 26, 2009 at 9:08 am

Matt – there are two mistakes in your slides – please see http://www.mattcutts.com/blog/canonical-link-tag/#comment-249160

Dave (original) February 26, 2009 at 4:14 pm

Matt, I know Google have to start somewhere, but much like the nofollow, most Webmasters still have no idea it exists.

Google needs to devise a way to communicate NEW elements etc to the majority and not just the few who frequent SEO sites.

Sebastian February 28, 2009 at 4:11 am

Hi!

You said in your video that this tag doesn’t work across different domain names. But if I have to publish my content under different URLs, what can I do to tell Google which of them is the right one? Is it enough to put a source link inside the HTML code?

Thanks,
Sebastian

Aaron Shepard March 1, 2009 at 12:07 pm

Matt, the official syntax of this element generates an error message in BBEdit’s syntax checker when inserted into an HTML 4.01 document. I think you need to document that the final slash is only for XHTML. I haven’t seen that distinction made anywhere.
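
The difference raised here is that `link` is an empty element: in HTML 4.01 it is written without a trailing slash, while XHTML requires the self-closing form. A small Python sketch of a template helper that emits either form follows; the function name `canonical_link` and its `xhtml` flag are hypothetical, chosen only for illustration.

```python
from html import escape


def canonical_link(href, xhtml=False):
    """Render the canonical link element for an HTML 4.01 or XHTML document."""
    # XHTML needs the self-closing " />"; HTML 4.01 ends the tag with ">".
    closing = " />" if xhtml else ">"
    return '<link rel="canonical" href="%s"%s' % (escape(href, quote=True), closing)
```

Calling `canonical_link("http://www.example.com/page.html")` yields the HTML 4.01 form, and passing `xhtml=True` yields the form with the final slash.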

Steven Lewis March 1, 2009 at 5:19 pm

I just posted your YouTube video on my website and blog. You know, it might’ve been a long time before I heard about the Canonical Link Element; I’m glad I took the time to ‘play’ around online for once. I’m not sure how to go about ‘testing’ my site for duplicate URL content, but I added the link element to my blog and hopefully that will handle any major issues that have arisen (if there were any).

Tom March 2, 2009 at 3:31 am

Hi Matt,

We use “?example” to tag the links on our site so that we can get more info about where people are clicking when we view our site overlay on Google Analytics.

As a result, we have multiple URLs linking to the same page:

http://www.example.com/about.htm?topnav
http://www.example.com/about.htm?midpage

and of course:

http://www.example.com/about.htm

Would this result in duplicate content issues? If so, is there a “best practice” to “tag” links for better site analysis in Google Analytics’ overlay?

Thanks in advance.

Tom

Anthony Long March 2, 2009 at 11:05 am

Hi Matt,

I noticed w3.org mentions the LINK element can be used to provide a variety of information to search engines. Besides “rel=canonical”, what other LINK element attributes does Google honor?

What are your thoughts on some of the examples given there?: http://www.w3.org/TR/html4/struct/links.html#h-12.3.3

li March 2, 2009 at 12:48 pm

A couple questions for this example:

example.com/page1?page=3
example.com/page1?sortby_price_asc
example.com/page1?sortby_date_desc

all have a canonical link that points to:
example.com/page1

1. Will Google eventually or quickly remove all pages with canonical link element but just keep the page the canonical link points to?

2. Will this hurt the PageRank of example.com/page1? Each variant, with its own sort order, has its own unique characteristics, such as keyword density. Will example.com/page1 get credit for all of those characteristics, or will Google just evaluate it based on the content of example.com/page1?

Ian M March 4, 2009 at 7:33 am

By the way, it’s not just IIS that has case-sensitivity issues – Apache on Windows has the same issue. It’s the case-insensitive Windows filesystem that’s the problem.
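
One way to spot this on your own site is to look for URLs in a crawl or log export that differ only in letter case. A minimal Python sketch follows; the function name `case_collisions` is hypothetical, and lowercasing the entire URL is a simplifying assumption, since on a case-sensitive server two such URLs may legitimately be different pages.

```python
from collections import defaultdict


def case_collisions(urls):
    """Group URLs that are identical once letter case is ignored."""
    groups = defaultdict(list)
    for url in urls:
        groups[url.lower()].append(url)
    # Only groups with more than one spelling are potential duplicates.
    return [group for group in groups.values() if len(group) > 1]
```

Each returned group is a candidate for a single canonical URL or a redirect to one preferred spelling.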

Klaus Johannes Rusch March 9, 2009 at 7:48 am

@Ian M, the Windows file system can be set to be case-sensitive. See http://support.microsoft.com/kb/817921/ for details.

Bart Berlinski March 14, 2009 at 3:35 am

Hi Matt,

I’m an SEO specialist at the biggest auction site in CEE. I’ve got 2 questions about the canonical link, and the answers would help me a lot.

1. Can I use canonical for URLs with an added affiliate ID from our Affiliate Program? (i.e. if the URL is http://site.com/computers.php&affiliateID=10, will it pass the SEO link juice to http://site.com/computers.php?)

2. Can I use canonical on URLs of ended auctions? (i.e. when an auction ends, we’ll automatically point its canonical at the parent category the auction was in.)

Thanks,
Bart

Ani López April 25, 2009 at 10:46 am

Canonical can help sometimes, but first comes a good structure to avoid duplicated content, relevance dispersion and internal link leaking: http://dynamical.biz/blog/seo-content-optimization/web-structure-duplicate-content-canonical-12.html

Seo London September 2, 2009 at 6:24 am

Good video, and a nice addition to search engine algorithms.
