
SEO advice: discussing 302 redirects

In a previous post I talked a little bit about 302s. Let’s cover them in more detail. A 302 redirect can be on-domain or off-domain. On-domain is simple and not prone to hijacking, so let’s talk about that first. Suppose you go to www.xbox.com and the site does a 302 redirect to some really long url, or a url with a session ID (this is what xbox.com used to do a couple of years ago; now you end up at e.g. www.xbox.com/en-US/, but play along with me). Would you rather see www.xbox.com or www.xbox.com/home/redir/sess?session=23412341234124124231455423633 ? Yeah, I’d rather see just www.xbox.com too. That’s why for on-domain 302 redirects (that is, redirects in which the source page and the destination page are on the same domain), search engines will usually pick the shorter url. Hopefully that makes sense. I’d rather see www.example.com than www.example.com/deep/home/page?last=root&sessid=909345AF2343 , and I think most people would too.
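To make the on-domain rule concrete, here is a minimal Python sketch of "pick the shorter url". It is only an illustration: the function names are made up for this example, "same domain" is simplified to "same hostname", and real search engines weigh far more signals than url length.

```python
from urllib.parse import urlsplit


def same_host(url_a: str, url_b: str) -> bool:
    """Return True when both URLs have the same hostname.

    Simplifying assumption: "same domain" is treated as "same hostname".
    urlsplit() lowercases the hostname, so the comparison ignores case.
    """
    return urlsplit(url_a).hostname == urlsplit(url_b).hostname


def pick_on_domain_url(source: str, destination: str) -> str:
    """For an on-domain 302, prefer the shorter of the two URLs.

    Hypothetical helper for illustration, not any engine's real logic.
    """
    if not same_host(source, destination):
        raise ValueError("this sketch only handles on-domain redirects")
    # On a tie, keep the source, since that is the URL that was requested.
    return source if len(source) <= len(destination) else destination


source = "http://www.xbox.com/"
destination = (
    "http://www.xbox.com/home/redir/sess"
    "?session=23412341234124124231455423633"
)
print(pick_on_domain_url(source, destination))  # prints http://www.xbox.com/
```

Note that the urls need a scheme (http://) for the hostname to be parsed; a bare "www.xbox.com" would not work with this sketch.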

Q: Time out. I’ve got a question. What’s the deal with 302 vs. 301? What does that mean? What’s the difference?
A: The “302” refers to the HTTP status code that is returned to your browser when you request a page. For example, a 404 page is called a “404” because web servers return a status code of 404 to indicate that a requested page wasn’t found. The difference between a 301 and a 302 is that a 301 status code means that a page has permanently moved to a new location, while a 302 status code means that a page has temporarily moved to a new location. For example, if you try to fetch a page http://example.com/ and the web server says “That’s a 301. The new location is http://www.example.com/” then the web server is saying “That url you requested? It’s moved permanently to the new location I’m giving you.”
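If you want to see the codes side by side, Python’s standard library knows the official names. The sketch below is just an illustration; describe_redirect is a made-up helper. One detail worth knowing: the official reason phrase for 302 is “Found”, even though it is the code used for temporary moves.

```python
from http import HTTPStatus


def describe_redirect(status_code: int) -> str:
    """Label a status code as a permanent (301) or temporary (302) move.

    Illustrative helper only; other redirect codes (303, 307, 308) are
    deliberately left out to match the discussion above.
    """
    status = HTTPStatus(status_code)
    if status is HTTPStatus.MOVED_PERMANENTLY:
        kind = "permanent"
    elif status is HTTPStatus.FOUND:
        kind = "temporary"
    else:
        kind = "not a 301/302 redirect"
    return f"{status.value} {status.phrase}: {kind}"


for code in (301, 302, 404):
    print(describe_redirect(code))
# 301 Moved Permanently: permanent
# 302 Found: temporary
# 404 Not Found: not a 301/302 redirect
```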

Okay, back to our regular discussion. Now let’s talk about off-domain 302 redirects. By definition, those are redirects from one domain A.com to another domain B.com that are claimed to be temporary; that is, the web server on A.com could always change its mind and start showing content on A.com again. The vast majority of the time that a search engine receives an off-domain 302 redirect, the right thing to do is to crawl/index/return the destination page (in the example we mentioned, it would be B.com). In fact, if you did that 100% of the time, you would never have to worry about “hijacking”; that is, content from B.com returned with an A.com url. Google is moving to a set of heuristics that return the destination page more than 99% of the time. Why not 100% of the time? Most search engines reserve the right to make exceptions when they think the source page will be better for users, even though they will only do that rarely.

Let’s take an example from the tiny fraction of the time that we may reserve the right to show the source page for a 302 off-domain redirect. If you run wget on www.sfgiants.com, you’ll get a 302 redirect to a different domain, and the url that you’ll get is pretty ugly: http://sanfrancisco.giants.mlb.com/NASApp/mlb/index.jsp?c_id=sf . Please set aside that you are probably a site owner or webmaster for a second, and try to step into the shoes of a regular user on the street. If we had a taste test, how many users would prefer to click on “sfgiants.com” and how many would prefer to click on “sanfrancisco.giants.mlb.com/NASApp/mlb/index.jsp?c_id=sf” ? Normal users usually like short, clean urls. They are less likely to say “mlb.com? I wonder what that stands for? Hmm. Maybe major league baseball? Is that the officially licensed name, I wonder? It probably is. Yes, it looks like sanfrancisco.giants.mlb.com/NASApp/mlb/index.jsp?c_id=sf is the correct url from my query.”

Now you see the trade-offs. Go with the destination 100% of the time and you’ll get some ugly urls (but never any hijacking). On the other hand, if you sometimes return the source url you can show nicer urls (but with the possibility of source pages showing up when they shouldn’t). Different search engines have different policies that have evolved over time. Over the last year, Google has moved much more toward going with the destination url, for example, and the infrastructure in Bigdaddy continues in this direction.
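The trade-off can be sketched in a few lines of Python. This is a toy model, not how any search engine actually decides: the function name is invented, and the explicit exception set is a stand-in for whatever heuristics an engine uses to decide that a source url is better for users.

```python
from urllib.parse import urlsplit


def pick_off_domain_url(source, destination, source_exceptions=frozenset()):
    """Choose which URL to show for an off-domain 302 redirect.

    Default: the destination, which rules out hijacking.
    Exception: the source, but only when its hostname is explicitly
    listed in source_exceptions (a hypothetical stand-in for real
    heuristics).
    """
    source_host = urlsplit(source).hostname
    if source_host in source_exceptions:
        return source
    return destination


source = "http://www.sfgiants.com/"
destination = "http://sanfrancisco.giants.mlb.com/NASApp/mlb/index.jsp?c_id=sf"

# Destination 100% of the time: never hijacked, but the url is ugly.
print(pick_off_domain_url(source, destination))

# Allowing a rare exception gives the nicer url.
print(pick_off_domain_url(source, destination, {"www.sfgiants.com"}))
```

With an empty exception set this is the “always go with the destination” policy; every host added to the set buys a nicer url at the cost of having to be sure that the source deserves it.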

Let’s take a look at how different engines handle the [sf giants] query. Remember that sfgiants.com does a 302 redirect to a url on a different domain (sanfrancisco.giants.mlb.com/NASApp/mlb/index.jsp?c_id=sf). And remember that reasonable people can disagree on which url should show up at #1. I’m not trying to criticize any search engine here, but rather trying to point out that this is a weird corner case.

Current Google behavior: we return sfgiants.com at #1. But we also return http://sanfrancisco.giants.mlb.com/NASApp/mlb/sf/homepage/sf_homepage.jsp at #3, as an uncrawled url, which is definitely poor/suboptimal.

Current Ask behavior: Ask returns giants.mlb.com/NASApp/mlb/sf/homepage/sf_homepage.jsp at #1, sanfrancisco.giants.mlb.com/NASApp/mlb/index.jsp?c_id=sf at #2, and sanfrancisco.giants.mlb.com/NASApp/mlb/sf/homepage/sf_homepage.jsp at #3.

Current MSN behavior: MSN returns giants.mlb.com/NASApp/mlb/sf/homepage/sf_homepage.jsp at #1 and sanfrancisco.giants.mlb.com/NASApp/mlb/index.jsp?c_id=sf at #2.

Current Yahoo! behavior: Yahoo! returns www.sfgiants.com at #1, but also returns sanfrancisco.giants.mlb.com/NASApp/mlb/index.jsp?c_id=sf at #6. You might think that returning sfgiants.com at #1 isn’t what Yahoo! said that they would do with 302 off-domain redirects (i.e. always go with the destination), but if you read carefully, Yahoo! also reserves the right to make exceptions in handling redirects. That allows them to show a nice url at #1.

Current Google Bigdaddy behavior (data center at 64.233.179.104): Bigdaddy managed to find a short url on the destination domain of mlb.com, namely giants.mlb.com, and returns that. We return it at #1 with no other duplicate urls on the first page.

Please don’t take me listing the current results from each engine the wrong way; I think the results from all the search engines are great for this query, because a user would have gotten to the correct final location with any search engine that they tried. This is also an unusual case where reasonable people can disagree on what the best answer is. Also, I’m positive people can find places where the Bigdaddy data center handles things the wrong way. My only point is that the new infrastructure at the Bigdaddy data center will let us tackle canonicalization, dupes, and redirects in a much better way going forward compared to the current Google infrastructure. I’m not claiming that everything is perfect in Bigdaddy, just that it’s easier for us to make changes and improve search quality as we get feedback from you.

Okay, that’s about all the background I wanted to give. Next post will call for Bigdaddy feedback.

Switching things around

This weekend I decided to mix things up on my blog. So I switched things around:

– I took one of my domains, dullest.com, and moved it to TigerTech from pair Networks.
– I installed the latest version of WordPress on dullest.com and copied the MySQL database from mattcutts.com to dullest.com.
– I changed my blog layout to the excellent Thesis theme by Chris Pearson. Previously I was using the Almost Spring theme.
– I added an .htaccess file that will do 302 redirects from www.mattcutts.com/path/file.html to www.dullest.com/path/file.html .
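
The .htaccess file itself isn’t shown here, but the mapping it describes is simple: keep the path, swap the host, and answer with a 302. As a rough stand-in (Python instead of Apache configuration, with made-up names and an arbitrary local port), the same behavior looks something like this:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption for this sketch: plain http and the www hostname from the post.
NEW_ORIGIN = "http://www.dullest.com"


def redirect_location(request_path: str) -> str:
    """Map /path/file.html on the old host to the same path on the new host."""
    return NEW_ORIGIN + request_path


class TemporaryRedirectHandler(BaseHTTPRequestHandler):
    """Answer every GET or HEAD with a 302 pointing at the new host."""

    def do_GET(self):
        self.send_response(302)  # temporary; a true site move would use 301
        self.send_header("Location", redirect_location(self.path))
        self.end_headers()

    do_HEAD = do_GET


def serve(port: int = 8000) -> None:
    """Run the redirector locally; the port is arbitrary for this sketch."""
    HTTPServer(("127.0.0.1", port), TemporaryRedirectHandler).serve_forever()


print(redirect_location("/path/file.html"))
# prints http://www.dullest.com/path/file.html
```

Calling serve() and requesting any path on 127.0.0.1:8000 should return a 302 whose Location header carries the same path (and query string) on the new host.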

Note: changing your IP address, webhost, domain name, blog template, and blog version all at the same time is the exact opposite of what you should normally do. It’s better to change only one thing at a time so that if something goes horribly wrong, you can trace what caused it.

Also, if you were truly moving a site, a 302 redirect wouldn’t be the right redirect to use–a 301 (permanent) redirect would be better. But if I like these changes, I might migrate mattcutts.com to TigerTech and then migrate my blog from dullest.com back to mattcutts.com. So I’ll stick with a 302 for now.

Sometimes it’s fun to mix things up. It’s not as if I make any money from my blog, so I don’t mind if my search rankings drop for a while. In fact, it will be a pretty interesting experiment to see what happens with search engines and traffic.

I’m sure a bunch of stuff broke; let me know if you see anything especially horrible!

Virtual terminals not working? Check your keyboard.

(This is a boring post that I’m writing for people that have this same problem in the future. Just skip it.)

Every good Linux user knows that if you want to drop from X down into a text-based virtual terminal, you can press control-alt-F1 (or any other key up to F6), and control-alt-F7 returns you to the graphical mode. But what if that doesn’t work? In my case, it turned out to be my keyboard. My Microsoft Natural Ergonomic Keyboard 4000 has a key marked “F Lock”, and unless that F Lock key is activated (the “F” LED should be on), the wrong keystrokes are sent to my Ubuntu Linux machine running Intrepid Ibex. How can you debug this? Well, it took me a while.

After some Googling, here’s how I’d write the flowchart:

Try running “chvt 1” to switch your console to virtual terminal 1.
– If “chvt 1” does not work, you might get the message “Couldnt get a file descriptor referring to the console”. You probably need to be root. Once you run as root, that command should work.
– Maybe “chvt 1” fails in some other way. Dude, you’re outside my area of expertise. You could try typing “sudo modprobe vga16fb; sudo modprobe fbcon” . Or you could try typing the command “setupcon” to set up the font and keyboard on the Linux console. Or it’s possible that you need to edit /boot/grub/menu.lst and tweak the vga= setting or remove the “splash” parameter. Or you might want to check your /etc/gdm/gdm.conf file.

– Quick check: you might have the “DontVTSwitch” option set in your /etc/X11/xorg.conf file, which would disable virtual terminal switching.

If running “chvt 1” as root does work, then you probably have an issue with your keyboard mappings somehow.
– If you have a Microsoft Natural Ergonomic Keyboard 4000, make sure that the “F-Lock” key near the top-right of the main part of the keyboard is properly engaged. The “F” LED below the keyboard should be on.
– Next, run xev (possibly as root) to see raw xevents as you press keys. This thread helped, where the person said

Recently I tried to switch to VT (console) and I couldn’t – Ctrl+Alt+F1 didn’t work (and they used to couple of weeks ago). I don’t even know where to look for the problem; xev detects KeyRelease XF86_Switch_VT_1 event, /etc/inittab contain getty respawns.

When I ran xev myself, and pressed control-alt-F1, I saw an event like

KeyPress event, serial 38, synthetic NO, window 0x3400001,
root 0x1a6, subw 0x3400002, time 1848943, (42,37), root:(1751,59),
state 0x0, keycode 146 (keysym 0xff6a, Help), same_screen YES,
XLookupString gives 0 bytes:
XmbLookupString gives 0 bytes:
XFilterEvent returns: False

The fact that I saw a “Help” event rather than “XF86_Switch_VT_1” was what made me suspicious. Sure enough, activating the “F-Lock” key then triggered this event:

KeyRelease event, serial 34, synthetic NO, window 0x3400001,
root 0x1a6, subw 0x3400002, time 2873229, (38,51), root:(1747,73),
state 0xc, keycode 67 (keysym 0x1008fe01, XF86_Switch_VT_1), same_screen YES,
XLookupString gives 0 bytes:
XFilterEvent returns: False

and life was good. You might also consider tweaking xmodmap to return the values you expect. Or you might have a strange XkbModel or XkbLayout setting where switching your keyboard language or layout (e.g. from pc105 to pc104) might help.
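
If you’d rather not eyeball xev output, a few lines of Python can pull out the keysym names for you. This is a small sketch that assumes the output looks like the two events quoted above; the function names are invented for the example, and the regular expression only covers that “keycode N (keysym 0x…, Name)” shape.

```python
import re

# Matches fragments such as: keycode 146 (keysym 0xff6a, Help)
KEYSYM_RE = re.compile(r"keycode (\d+) \(keysym (0x[0-9a-fA-F]+), ([^)]+)\)")


def keysyms_in_xev_output(text):
    """Return a list of (keycode, keysym_name) pairs found in xev output."""
    return [(int(code), name) for code, _hex, name in KEYSYM_RE.findall(text)]


def saw_vt_switch(text, vt=1):
    """True if the output contains the XF86_Switch_VT_<vt> keysym."""
    wanted = f"XF86_Switch_VT_{vt}"
    return any(name == wanted for _code, name in keysyms_in_xev_output(text))


# The two "state" lines from the events quoted in this post.
f_lock_off = "state 0x0, keycode 146 (keysym 0xff6a, Help), same_screen YES,"
f_lock_on = (
    "state 0xc, keycode 67 (keysym 0x1008fe01, XF86_Switch_VT_1), "
    "same_screen YES,"
)

print(keysyms_in_xev_output(f_lock_off))  # [(146, 'Help')]
print(saw_vt_switch(f_lock_off))          # False
print(saw_vt_switch(f_lock_on))           # True
```

To use it on a live session you could save the xev output to a file while pressing control-alt-F1 and pass the file’s contents to saw_vt_switch.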

My Five Months With Google Chrome

Om Malik wrote an interesting post about Google Chrome one month after the public launch. While I was reading Om’s post, I realized that I wrote a post for the Google Chrome release that I never published. I’ll include it here, and then let’s meet at the bottom and compare notes. 🙂

Like many Google engineers, I’ve been running Google Chrome for several months. When I sat down with a blank piece of paper to write down why you should try Google Chrome, I ended up with several reasons, including speed, security, stability, and openness. I’ll run through them for you.

Speed. Google Chrome is wicked fast, especially if you use AJAX/JavaScript-heavy web applications such as Gmail. And it’s not just “benchmark fast,” it’s end-to-end fast. Google Chrome puts special emphasis on never making the user wait. Opening a tab is essentially instantaneous, and all the little pauses that would normally interrupt your workflow just don’t happen. Of course, sometimes a remote web server is slow to return data–there’s nothing that a web browser can do about that–but for everything else, the browser speeds along like lightning.

When Gmail came out, it took me months to switch over. Before Gmail, I used mutt and I had all kinds of crazy customizations and wild procmail rules, so it took quite a while for Gmail to convince me to switch. In contrast, it took less than a week for me to switch to Google Chrome. It’s so scary fast that I felt like I was taking smart pills because of all the extra work and email I could blast through.

Security. As the head of Google’s webspam team, I prowl around some pretty hairy places on the internet. Almost every day I encounter hacked pages, malware, porn, and generally scuzzy pages. The security model in Google Chrome is much stronger than most other browsers I’ve used. I’ve surfed through hundreds of seedy back alleys of the Internet over the last several months, and Google Chrome has safely kept me from being infected or affected by the junky web pages I encounter.

Stability. I loved my previous browser (and still do!), but I got used to killing my browser and restarting it daily to prevent memory leaks from hobbling my machine. I’ve run Google Chrome for weeks at a time with bunches of open tabs and it hasn’t crashed on me or bloated up my computer’s memory. I also love that Google has a “ChromeBot” which takes each new browser build and throws (put your pinky finger to your lips) one million webpages at the build as a torture test. That testing virtually guarantees that everyday web pages shouldn’t crash your browser. Google Chrome has been rock solid for me.

Openness. You aren’t locked in to using Google’s search; you can choose to use any major search engine in Google Chrome. Plus, as you click around the web, you don’t send surfing information to Google. Google Chrome is open-source under a BSD license, so you can check that for yourself. The cool bits of Google Chrome, including V8 (a from-the-ground-up JavaScript virtual machine), are open for anyone to take and use.

The comic book. Still not convinced? If you’re a geek, read the 40-page comic book about Google Chrome. It’s genuinely educational about the design choices that Google made. It turns out that a comic is one of the best ways to introduce a large piece of new software:

Ben Goodger talks about the Omnibox

You’ve all heard the acronym “RTFM,” right? It stands for Read The *cough* Fine Manual. The next time someone asks whether Google Chrome uses WebKit or something else, I can say RTFC–Read The Fine Comic. 🙂

Okay, how well does that post hold up after a month?

On speed, I think Chrome really holds up well. Om’s comments are filled with people who got hooked on the speedy and nice Google Chrome browser experience. A couple people who didn’t like it only tried it for a day; I really think you need to give Chrome a few days (maybe a week) to really notice the end-to-end difference.

On security, I was impressed that so few security holes were found, and most of them required the user to take some additional action or involved social engineering. I have seen very few (no?) attacks like “surf to a random page and your browser gets pwned.” That’s really nice to see; I’m sure the Chrome team was anxious to see what would happen when the outside world tried to attack Chrome. Chrome has been quite robust for a web browser that was only recently released into beta. I continue to surf to really dangerous places with no resulting hijacks or malware.

How about stability? I always thought this would be the weakest point of the Chrome launch, and not because of web pages that would crash Chrome, but because it’s hard to test on a wide variety of real-world hardware when you’re trying to keep a product secret before releasing it. And again, I was surprised that so few things broke. The fact that the Chrome team has released four updates to Chrome in four weeks tells me two things: 1) the worst bugs are going to get knocked down pretty quickly and 2) the Chrome team is very serious about iterating to improve the browser.

Openness is an interesting one. I think the EULA issue caused a short-term goodwill hit. Google corrected the terms in about a day, but it still provided material for the people who dislike the fundamental notion of the Chrome browser. I have to admit that I was surprised that people objected to the “Suggest” feature when you’re typing into the address bar, but it’s good that Google reacted quickly on that one as well. I had a conversation with Danny Sullivan where he urged Google employees to try to look at Google as if they were outside the company and didn’t work for Google. It’s excellent advice and definitely provides a helpful perspective. Ultimately, I think that the open-source nature of Google Chrome’s code should reassure most people and win over fans with time.

And the comic book? I still think it’s a cool way to explain a lot of complex design decisions. 🙂

I’ve been watching the Chrome team work, and I believe that they’re going to earn the respect and loyalty of a lot of surfers over time. Their ability to execute reminds me of how the Google Reader team won me over a couple years ago. If you’re running Windows and haven’t taken Chrome for a spin, try it for 5-6 days; I think you’ll like it too.

Engineering grouplets at Google

Google engineer Bharat Mediratta discussed some Google engineering customs in the New York Times yesterday. Bharat goes beyond 20% time to talk about some different aspects of being an engineer at Google:

  • Grouplets bring together like-minded engineers who care about things like documentation, improving our build system, or testing. It’s an informal process that lets engineers contribute on the topics that they care about the most.
  • Sometimes we have “Fixit days” where every Google engineer is encouraged to tackle a specific topic. From the article:

    Or my favorite: the Customer Happiness Fixit, when we fix all those little things that bug our users and make them sad – for example, when the hotkeys aren’t just right on mobile phones. Many of these events come with special T-shirts and gifts to reward the engineers who take a little time out to work on them.

    That particular fixit day was one of my favorites too.

  • Bharat also discusses the best way for an engineer to have an impact at Google:

    Google works from the bottom up. If you have a great technical idea, you don’t have your V.P. send out a memo telling everybody to use it. Instead, you take it to your fellow engineers and convince them that it’s good. Good ideas spread fast, and this approach keeps us from making technical mistakes. But it also means that the burden falls upon you to spread your idea.

I’d completely agree with that. I’ve noticed that a good way to accomplish something at Google is to convince other engineers and build consensus from there. Google’s culture also rewards those who take the initiative on their ideas.

Bharat also talks about how his testing grouplet hit on the idea of posting one-page stories about testing in the bathrooms. Just like that, “Testing on the Toilet” was born. 🙂 Now that the tradition has been discussed publicly, I don’t feel bad about linking to this picture that Niall Kennedy snapped while visiting Google a while ago. 🙂
