Archives for April 2008

Solved: another common site review problem

Okay, go read this post on the Google webmaster blog. In fact, if you read my site, you really should add the Official Google webmaster blog feed to your list of subscriptions, because that blog is almost 100% SEO/webmaster-related posts, and it is official. Done reading? Okay, I’ll give you my personal take on why I like this idea.

I’ve done a lot of site reviews in my time. Many of them go like this:

Webmaster: Matt, can I get a site review for ExampleCo?
Me: Hey, I’ve heard of Example. I really like your red widgets.
Webmaster: Thanks! We’re rolling out a new line of blue widgets this fall. The site is example.com.
Me: Okay, let’s take a quick look.

(small chat about blue widgets until the site loads.)

Me: Huh.
Webmaster: What? What does “Huh” mean?
Me: Well, when I visit www.example.com I get a map of the world, and then at the bottom of the page there’s a dropdown to select which country version of Example to go to next.
Webmaster: Right. Example is a big business with lots of different country-level domains, so we have to ask the user where they want to go. Why, is that a problem?
Me: It sort of is. Dropdown boxes and forms are kind of like a dead end for search engine spiders. Historically we haven’t crawled through them.
Webmaster: But it’s just a dropdown box with ten countries listed. You can’t just crawl that?
Me: Not really. Think of search engine spiders much like small children. They go around the web clicking on links. Unless there’s a link to a page, it can be hard for a search engine to find out about that page.
Webmaster: But it’s just ten countries. Couldn’t the search engine just pick one of those values and keep going?
Me: In theory you could do that, but in practice the major search engines don’t usually do that.
Webmaster: That sucks. I like how clean the page looks. Is there a way around that?
Me: Sure. You could put the list of countries at the bottom of the page and make them hyperlinks so that Googlebot can crawl through to the other urls. A good rule of thumb is to take a look at your site in a text browser like Links or an ancient browser with JavaScript/CSS/Flash turned off. If you can reach all your pages just by clicking regular links, your site should be pretty crawlable.

I’ve had this conversation a lot over the years. Savvy webmasters and SEOs know how to make a site crawlable, e.g. making sure that someone can reach every page on a site via normal HTML links. But the web is filled with sites that have a dropdown box or some other form that search engines typically didn’t know how to handle.
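To make the "spiders follow links" idea concrete, here is a minimal sketch of a link-following crawler's view of a page. It is not how Googlebot works; it just collects `<a href>` targets with Python's standard-library HTML parser. The two page snippets, the `/go` form action, and the example.com country URLs are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href targets from <a> tags, like a simple link-following crawler."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only plain anchors count; <option> values inside a form are ignored.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawlable_links(html):
    """Return every URL a crawler could reach by following regular links."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical country picker built only from a form dropdown.
DROPDOWN_PAGE = """
<form action="/go">
  <select name="country">
    <option value="us">United States</option>
    <option value="de">Germany</option>
  </select>
</form>
"""

# The same choices exposed as plain hyperlinks at the bottom of the page.
LINKED_PAGE = """
<ul>
  <li><a href="http://www.example.com/us/">United States</a></li>
  <li><a href="http://www.example.com/de/">Germany</a></li>
</ul>
"""

print(crawlable_links(DROPDOWN_PAGE))  # [] because there is nothing to follow
print(crawlable_links(LINKED_PAGE))    # both country URLs
```

The dropdown version gives this kind of crawler nowhere to go, while the hyperlinked version exposes every country page.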

Now Google is finding ways to crawl through forms and drop-down boxes. We only do this for a small number of high-quality sites right now, and we’re very cautious and careful to do the crawling politely and abide by robots.txt. If you’d prefer that Google not crawl urls like this, you can use robots.txt to block the urls that would be discovered by crawling through a form. But I hope that the dialog above is a pretty good example of why this new discovery method can be helpful to webmasters.
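As a rough sketch of the robots.txt option, the snippet below checks a couple of URLs against a rule using Python's standard `urllib.robotparser`. The `/select-country` path is an assumed form action, not something from a real site; you would substitute whatever URL pattern your own form produces.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: /select-country is an assumed form action.
ROBOTS_TXT = """\
User-agent: *
Disallow: /select-country
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, agent="Googlebot"):
    """Return True if the parsed robots.txt rules let this agent fetch the URL."""
    return parser.can_fetch(agent, url)


# A URL that would only be discovered by submitting the form is blocked...
print(allowed("http://www.example.com/select-country?country=de"))  # False
# ...while a page reachable through a regular link stays crawlable.
print(allowed("http://www.example.com/de/"))  # True
```

Checking a handful of URLs this way before deploying a rule is a cheap way to make sure you block only the form-generated urls and not the pages you want indexed.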

Danny asks a good question: if Google doesn’t like search results in our search results, why would Google fill in forms like this? Again, the dialog above gives the best clue: it’s less about crawling search results and more about discovering new links. A form can provide a way to discover different parts of a site to crawl. The team that worked on this did a really good job of finding new urls while still being polite to webservers.

By the way, I wanted to send out props to a couple people outside Google who noticed this. Michael VanDeMar emailed me a little while ago to ask about this, and Gabriel “Gab” Goldenberg recently noticed this behavior as well. I appreciate them discussing this because it encouraged Google to talk about this a little more. 🙂

I wanted to blog, but…

Honest. I wanted to write a big, in-depth blog post about X (pick whatever X you want), but then Emmy came and sat down beside the keyboard with a forlorn face. This is what she looked like:

(Photo: Emmy is waiting patiently for Matt)

Emmy was just waiting patiently for me to get off the computer so that we could play or hang out. How am I supposed to blog under those kinds of conditions? 🙂

Google App Engine: Launching a startup gets even easier

This is pretty cool. Google launched App Engine, which lets you write code for a web application, then Google takes care of the scaling/failover/logistics-type issues. You can store your data in a Google Bigtable using the Google File System (GFS). There’s a bunch of App Engine APIs to simplify things like sending email and fetching urls. Your application can authenticate users that are using Google Accounts, so you can avoid the whole “ask your users to create a new account” issue if you want.

The official blog post makes it clear that this is a preview release, so Google will be adding more functionality over time but they’re opening the program up now to start to allow real-world applications and to get real-world feedback. The first 10,000 developers to sign up get to play with it now.

My favorite part is that the usage model looks pretty solid:

During this preview period, applications are limited to 500MB of storage, 200M megacycles of CPU per day, and 10GB bandwidth per day. We expect most applications will be able to serve around 5 million pageviews per month. In the future, these limited quotas will remain free, and developers will be able to purchase additional resources as needed.

I checked out my pageview stats for the first three months of the year. If you subtract out a couple posts that got hit by digg, I’m running at about 500,000 pageviews a month. So you can scale your web app up to be ten times more popular than my blog (which is relatively well-trafficked) before you’d be looking at paying for storage/CPU/bandwidth. By then, you’d know that your start-up idea was on to a good thing.
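Here is the back-of-the-envelope math as a tiny script. The pageview and bandwidth figures come from the quota quote above; the 30-day month, the decimal gigabytes, and the per-pageview bandwidth figure are my own rough assumptions, not numbers from Google.

```python
# Figures quoted in the post.
FREE_PAGEVIEWS_PER_MONTH = 5_000_000  # App Engine's estimate for the free quota
BLOG_PAGEVIEWS_PER_MONTH = 500_000    # this blog's approximate monthly traffic
BANDWIDTH_GB_PER_DAY = 10             # free bandwidth quota

# How many times more popular than the blog an app can get before paying.
headroom = FREE_PAGEVIEWS_PER_MONTH / BLOG_PAGEVIEWS_PER_MONTH
print(headroom)  # 10.0

# Assumption: a 30-day month and decimal gigabytes (10**9 bytes).
DAYS_PER_MONTH = 30
monthly_bytes = BANDWIDTH_GB_PER_DAY * 10**9 * DAYS_PER_MONTH
bytes_per_pageview = monthly_bytes / FREE_PAGEVIEWS_PER_MONTH
print(bytes_per_pageview)  # 60000.0, i.e. roughly 60 KB per pageview
```

So under those assumptions the free bandwidth quota works out to about 60 KB per pageview, which is a useful sanity check if your pages are image-heavy.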

At this point, you might want to consider going to Google I/O, which is Google’s two-day developer event on May 28-29. If you’re a student or teacher it’s only $50 and there’s a bunch of different subjects on the agenda. Check out some of these sessions:
– Painless Python for Programmers
– Building Scalable Web Applications with Google App Engine
– HTML5, Brought to You by Gears (taught by Aaron Boodman of Greasemonkey fame)
– OpenSocial, OpenID, and OAuth: Oh, My!
– Building an Android Application 101
– Hands-On Maps API: Basic & Advanced

I believe that App Engine will make launching a startup easier than ever. At this point, you could build up a pretty killer startup incorporating technologies as simple as Gmail or as powerful as App Engine.

How to use a notebook: 7 quick tips

You never know when your brain is going to flash on an idea, a great gift, or something you need from the store. That’s why I carry a small notebook around with me most of the time. Here are some productivity tips on how to use a “hipster PDA” effectively.

  1. Get one. I got mine for about a buck at Office Depot. It looks like this:
     (Photo: Mead notepad)

  2. Write the date on the outside of the notebook. If you start using notebooks a lot, you’ll find it very handy to be able to sort notebooks by time.
  3. Clear out your brain. When you think of a task for work or a book that you want to buy, just write it down. This lets you concentrate on important things instead of remembering small items.
  4. Avoid the temptation to write on both sides of the page: just write on one side. You’ll see why in a minute.
  5. Keep each separate subject on a separate page. One page could be things to get done at work that day. Another page could be a meeting agenda. Another page could be books you want to read, or movies you want to see. Yet another could be things you want to blog about. But don’t mix the meeting agenda with your blogging to-do list. You’ll see why in a minute.
  6. When you’re finished with a page, yank out that page. Crumple it up and throw it away. Maybe you’re back from the grocery store and everything is crossed off your grocery list page. Try to finish out the notebook with almost all your pages ripped out.
  7. You want the notebook to be empty or nearly empty when you run out of blank paper. It’s very satisfying to yank a page out of the notebook when you’re done with a task. You want the page to pull away cleanly, so look for a notebook that is perfect bound. That is, the spine of the notebook is square and the pages are held in place with glue. I’ve found the “Square Deal” memo pads from Mead to be just right for me.
  8. A bonus tip: if you’re about to head to a big event like a conference and think you might take a lot of notes, feedback, or details, then start with a fresh notebook.

The observation here is pretty simple: the notepad is not your entire filing system. That notebook is just your short-term working memory. Ideally anything that you jot down in the notebook (e.g. movies to see) can eventually go onto a longer-term list, such as your Netflix queue for movies.

If you want to get advanced, you can store some small amount of info in the inside cover of the notebook. For example, there’s a cafe that I like with free WiFi. Their WEP password is their phone number. So I keep that WEP password on the inside cover of my notebook. Arguably things like passwords could go in your head or laptop or phone though. I really only use my “hipster PDA” to remember things until I can move them over into a better place or take care of them quickly.

Google to spin off search marketing side of Performics

I’m crunching on a bunch of work stuff today, but I wanted to point out this official Google blog post briefly:

Since we closed the acquisition of DoubleClick on March 11, we’ve been immersed in integration planning for each of our products and business units. Recently we completed this process for the DoubleClick Performics businesses, and have decided to split them into two separately-run business units: Affiliate Marketing and Search Marketing.

It’s clear to us that we do not want to be in the search engine marketing business. Maintaining objectivity in both search and advertising is paramount to Google’s mission and core to the trust we ask from our users. For this reason, we plan to sell the Performics search marketing business to a third party.

I have nothing but respect for the people that do search marketing for Performics, but I think this is the best decision. People hold Google to an unbelievably high standard, and I think it’s important that we try very hard to avoid any conflict of interest, or even the appearance of conflict of interest, in our business. Many people I respect have wanted Google to take this step and I think it’s the right call.

(As always, remember that this is my personal blog and personal opinions.)
