A rel=canonical corner case

I answered an interesting rel=canonical question over email today and thought I’d blog about it. If you’re not familiar with rel=canonical, read these pages first. Then watch this video about rel=canonical vs. 301s, especially the second half.

Okay, I sometimes get a question about whether Google will always use the url from rel=canonical as the preferred url. The answer is that we take rel=canonical urls as a strong hint, but in some cases we won’t use them:
- For example, if we think you’re shooting yourself in the foot by accident (pointing a rel=canonical toward a non-existent/404 page), we’d reserve the right not to use the destination url you specify with rel=canonical.
- Another example where we might not go with your rel=canonical preference: if we think your website has been hacked and the hacker added a malicious rel=canonical. I recently tweeted about that case. On the “bright” side, if a hacker can control your website enough to insert a rel=canonical tag, they usually do far more malicious things like insert malware, hidden or malicious links/text, etc.

I wanted to talk today about another case in which we won’t use rel=canonical. First off, here’s a thought exercise: should Google trust rel=canonical if we see it in the body of the HTML? The answer is no, because some websites let people edit content or HTML on pages of the site. If Google trusted rel=canonical in the HTML body, we’d see far more attacks where people would drop a rel=canonical on part of a web page to try to hijack it.

Okay, so now we come to another corner case where we probably won’t trust a rel=canonical: if we see weird stuff in your HEAD section. For example, if you start to insert regular text or other tags that we normally only see in the BODY of HTML into the HEAD of a document, we may assume that someone just forgot to close the HEAD section. We don’t allow rel=canonical in the BODY (because as I mentioned, people would spam that), so we might not trust rel=canonical in those cases, especially if it comes after the regular text or tags that we normally only see in the BODY of a page.
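If you want to sanity-check your own pages for this, here is a minimal sketch in Python. To be clear, this is not how Google decides anything; it's just a rough approximation of the idea above. The names `CanonicalLocator` and `locate_canonicals`, the list of "normal" HEAD tags, and the placement labels are all my own assumptions for illustration. It reports whether each rel=canonical sits in a clean HEAD, in a HEAD after body-like content, or outside the HEAD entirely.

```python
from html.parser import HTMLParser

# Assumption: tags we treat as "normal" inside <head>; anything else counts as body-like.
HEAD_TAGS = {"title", "meta", "link", "script", "style", "base", "noscript", "template"}
# Head elements that are expected to contain text (so their text is not "stray").
TEXT_OK = {"title", "script", "style", "noscript", "template"}


class CanonicalLocator(HTMLParser):
    """Hypothetical helper: records where each rel=canonical link appears."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.head_closed = False
        self.suspicious = False    # body-like content already seen inside <head>
        self.open_text_tag = None  # e.g. "title" while we are inside <title>
        self.findings = []         # list of (href, placement) tuples

    def _placement(self):
        if not self.in_head:
            return "outside-head"
        return "head-after-body-content" if self.suspicious else "head"

    def handle_starttag(self, tag, attrs):
        if tag == "head" and not self.head_closed:
            self.in_head = True
            return
        if tag == "body":
            self.in_head = False
            self.head_closed = True
            return
        if tag == "link":
            attr_map = dict(attrs)
            rels = (attr_map.get("rel") or "").lower().split()
            if "canonical" in rels:
                self.findings.append((attr_map.get("href"), self._placement()))
                return
        if self.in_head:
            if tag in TEXT_OK:
                self.open_text_tag = tag
            elif tag not in HEAD_TAGS:
                self.suspicious = True  # e.g. a <div> or <p> inside <head>

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
            self.head_closed = True
        if tag == self.open_text_tag:
            self.open_text_tag = None

    def handle_data(self, data):
        # Non-whitespace text directly inside <head> looks like body content.
        if self.in_head and self.open_text_tag is None and data.strip():
            self.suspicious = True


def locate_canonicals(html):
    parser = CanonicalLocator()
    parser.feed(html)
    parser.close()
    return parser.findings
```

Calling `locate_canonicals(page_source)` returns a list of `(href, placement)` pairs. Anything other than `"head"` is worth a second look at your template.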

But in general, as long as your HEAD looks fairly normal, things should be fine. If you really want to be safe, you can make sure that the rel=canonical is the first or one of the first things in the HEAD section. Again, things should be fine either way, but if you want an easy rule of thumb: put the rel=canonical toward the top of the HEAD.
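If you'd like to check that rule of thumb automatically, here's a small sketch, again in Python and again just an illustration under my own assumptions. The helper name `canonical_head_index` is made up for this example. It returns the zero-based position of the rel=canonical link among the elements that open inside the HEAD, or `None` if the HEAD has no canonical. A small number means the tag is near the top.

```python
from html.parser import HTMLParser


class HeadOrder(HTMLParser):
    """Hypothetical helper: counts elements opened inside <head>, in order."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.count = 0               # elements seen so far inside <head>
        self.canonical_index = None  # position of the first rel=canonical link

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
            return
        if tag == "body":
            self.in_head = False
            return
        if not self.in_head:
            return
        if tag == "link" and self.canonical_index is None:
            rels = (dict(attrs).get("rel") or "").lower().split()
            if "canonical" in rels:
                self.canonical_index = self.count
        self.count += 1

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def canonical_head_index(html):
    parser = HeadOrder()
    parser.feed(html)
    parser.close()
    return parser.canonical_index
```

For example, a HEAD that starts with a `meta` charset tag followed by the canonical link gives an index of 1, which easily counts as "one of the first things" in the HEAD.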

Google I/O 2011!!1!

This week brings the Google I/O conference. That page has a QR code that lets you install the official Google I/O Android app for the conference.

The conference has a ton of great talks scheduled. You can learn everything from “Building Aggressively Compatible Android Games” to “Cloud Robotics” to “Designing and Implementing Android UIs for Phones and Tablets” to a Google Checkout talk to “Honeycomb Highlights” to “How to NFC,” plus a ton more. Want to hear about Python from Guido van Rossum? He’ll be there. Want to hear how the Google Pac-Man logo happened? Those folks will be there. Web Fonts? Uh huh. You can even meet the Google Ventures team for some VC speed dating.

I’ll be doing an Ignite talk at 5pm on Tuesday about “Trying Something New for 30 Days.” If you see me at Google I/O, come up and say hello!

You can also follow @googleio on Twitter, and the hashtag is #io2011.

Even if you can’t make it to Google I/O in person, a lot of the talks will be livestreamed. They just announced that the keynotes will be about Android and Chrome. I think the videos of the talks should be up fairly quickly as well: the official blog post claims “Recorded videos from all sessions across eight product tracks will be available within 24 hours after the conference.” Here are the session videos from 2010, for example. Hope to see you at I/O!

Search Engineering at Google

I’m always a fan of Googlers doing more communication and more videos, so when some fellow search quality folks made a video about working at Google, I said I’d be happy to post it.

You can find out more info and apply to be a search engineer at Google if you’re interested.
