Archives for February 2009

Pointers for Google Japan paid-post story

I just did a joint post about the Google Japan issue on Google’s Japanese webmaster blog. There’s also a post on Google’s main Japanese blog. If you don’t read Japanese, you can also watch the video where I recently talked about this.

To the extent that I can speak on behalf of Google, I apologize that this happened. One of the messages I heard was that people wanted Google to take action in this instance, and we did. The toolbar PageRank for google.co.jp dropped from 9 to 5, and I expect that to stay for a while. That decrease in PageRank reflects a loss of trust in the google.co.jp domain. The PageRank change also has ripple effects for google.co.jp, in that we lose trust in the links for that domain. The team from google.co.jp will also need to submit a reconsideration request just like anyone else would.

One of the other messages I’m hearing is that Google needs to keep talking about these issues, especially to explain why we think paid posts that affect search engines are bad for the ecology of the web. But in the meantime, I wanted to provide the pointers to the Japanese posts and to the video about this.

Learn about the Canonical Link Element in 5 minutes

Last week Google, Yahoo, and Microsoft announced support for a new link element to clean up duplicate urls on sites. The syntax is pretty simple: An ugly url such as http://www.example.com/page.html?sid=asdf314159265 can specify in the HEAD part of the document the following:

<link rel="canonical" href="http://example.com/page.html"/>

That tells search engines that the preferred location of this url (the “canonical” location, in search engine speak) is http://example.com/page.html instead of http://www.example.com/page.html?sid=asdf314159265 .
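If you generate pages from code, a small helper can build the element for you. Here is a minimal sketch in Python 3 using only the standard library. The function names, the `drop_params` list, and the `preferred_host` argument are all hypothetical choices for illustration, not part of the announcement; which parameters are safe to strip depends entirely on your own site.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url, drop_params=("sid",), preferred_host=None):
    """Return url without the listed query parameters (hypothetical helper).

    drop_params: query parameters assumed not to change the page content.
    preferred_host: optional host name to use in place of the original one.
    """
    parts = urlsplit(url)
    # Keep every query parameter except the ones we were told to drop.
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key not in drop_params]
    host = preferred_host or parts.netloc
    # Rebuild the url with the filtered query string and no fragment.
    return urlunsplit((parts.scheme, host, parts.path, urlencode(kept), ""))


def canonical_link_element(url, **kwargs):
    """Return the link element to place in the HEAD of the page."""
    href = escape(canonical_url(url, **kwargs), quote=True)
    return '<link rel="canonical" href="%s"/>' % href


if __name__ == "__main__":
    print(canonical_link_element(
        "http://www.example.com/page.html?sid=asdf314159265",
        preferred_host="example.com"))
```

With the ugly url from above and `preferred_host="example.com"`, this produces the same link element shown earlier. Parameters that do change the page content should stay out of `drop_params`.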

I also did a three-minute video with WebProNews after the announcement to describe the tag, and you can watch the canonical link element video for another way to learn about it. Watching the video is the easiest way to learn about this new element quickly.

The search engines have also posted about this new open standard. You can read a blog post or help center documentation from Google, Yahoo’s blog post, or Microsoft’s blog post.

Also exciting is that Joost de Valk has already produced several plug-ins. Joost made a canonical plug-in for WordPress, a plug-in for the e-commerce software package Magento, and also a plug-in for Drupal. I’d expect people to make plug-ins for other software packages pretty soon, or modify the software to use this link element in the core software.

Thanks to the folks at Yahoo (e.g. Priyank Garg and others) and Microsoft (e.g. Nathan Buggia and others) who built consensus to support this open standard. On the Google side, Joachim Kupke did all the implementation and indexing work to make this happen; thanks for the heavy lifting on this, Joachim. I want to send a special shout-out to Greg Grothaus as well. Although people had discussed similar ideas in the past, Greg was a catalyst at Google and his proposal really got the ball rolling on this idea; read more about it on his blog.

If you’re interested, you can see the slides I presented last week to announce this new element:

I’ll be happy to try to answer questions if you’ve got ’em, or you can ask questions on the official Google webmaster blog. If you’re going to SES London this week, Google’s own Maile Ohye will be at SES London to answer questions as well.

Update: I had “value” instead of “href” in the link element. Serves me right for not double-checking, and thanks to the commenters who noticed!

Update, 2/23/2009: Ask just announced that they will support the canonical link element. That means all the major search engines will be supporting this tag, which is great news for site owners, developers, and webmasters. Yay!

Write to a Google Spreadsheet from a Python script

Suppose you want to write to a Google Spreadsheet from a Python script. Here’s an example spreadsheet that you might want to update from a script:

Example spreadsheet

I did some searching and found this page, which quickly led me to the Python Developer’s Guide for the Google Spreadsheet API.

There’s a simple “Getting started with Gdata and Python” page. The upshot is 1) make sure you have a recent version of Python (e.g. 2.5 or higher), then 2) install the Google Data Library. The commands I used were pretty much:

mkdir ~/gdata
cd ~/gdata
# download the latest Google Data Python library into the ~/gdata directory
unzip gdata.py-1.2.4.zip   # or whatever version you downloaded
cd gdata.py-1.2.4          # assumed name of the unpacked directory; adjust to match
sudo python setup.py install

That’s it. You can test that everything installed fine by running “./tests/run_data_tests.py” to verify that the tests all pass. The program “./samples/docs/docs_example.py” lets you list all of your Google Spreadsheets, for example. An extremely useful program that lets you insert rows right into a spreadsheet is “./samples/spreadsheets/spreadsheetExample.py” and someone has also got a really nice example of uploading a machine’s dynamic IP address to a spreadsheet.

The most painful thing is that InsertRow() must be called with a spreadsheet key and a worksheet key. If you find out those values, you could hardcode them into the script and probably cut the size of the script in half. Or you could just look in the url to see the key value. That’s what I did. So here’s a miniature example script to write to a Google Spreadsheet from a Python script:


#!/usr/bin/python

import time
import gdata.spreadsheet.service

email = 'youraccount@gmail.com'
password = 'yourpassword'
weight = '180'
# Find this value in the url with 'key=XXX' and copy XXX below
spreadsheet_key = 'pRoiw3us3wh1FyEip46wYtW'
# All spreadsheets have worksheets. I think worksheet #1 by default always
# has a value of 'od6'
worksheet_id = 'od6'

spr_client = gdata.spreadsheet.service.SpreadsheetsService()
spr_client.email = email
spr_client.password = password
spr_client.source = 'Example Spreadsheet Writing Application'
spr_client.ProgrammaticLogin()

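# Prepare the row to write. The keys are assumed to match the column
# headers in the spreadsheet. (Named 'row' rather than 'dict' so the
# built-in dict type isn't shadowed.)
row = {}
row['date'] = time.strftime('%m/%d/%Y')
row['time'] = time.strftime('%H:%M:%S')
row['weight'] = weight
print row

# InsertRow() appends the row to the given worksheet of the spreadsheet
entry = spr_client.InsertRow(row, spreadsheet_key, worksheet_id)
if isinstance(entry, gdata.spreadsheet.SpreadsheetsList):
  print "Insert row succeeded."
else:
  print "Insert row failed."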

That’s it. Run the script to append a new row to the current spreadsheet. By the way, if you make a chart from the spreadsheet data, you can right-click on the chart, select “Publish chart…” from the menu, and get a snippet of HTML to copy/paste that will embed the chart on a web page. It will look like this:

That’s a live image served up by Google, and when the spreadsheet gets new data, the image should update too.
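If you’d rather not copy the spreadsheet key out of the url by hand, a few lines of code can pull out the 'key=XXX' value mentioned in the script’s comments. This is only a sketch, written for Python 3’s standard library (unlike the Python 2 script above). The function name is hypothetical, and the url in the example is a made-up placeholder, so check it against what your browser actually shows.

```python
from urllib.parse import parse_qs, urlsplit


def spreadsheet_key_from_url(url):
    """Return the value of the 'key' query parameter (hypothetical helper)."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("key")
    if not values:
        raise ValueError("no 'key=' parameter found in %r" % url)
    # parse_qs returns a list per parameter; the first value is the key.
    return values[0]


if __name__ == "__main__":
    # Placeholder url for illustration only; paste your own spreadsheet url here.
    example_url = "http://spreadsheets.example.com/ccc?key=pRoiw3us3wh1FyEip46wYtW&hl=en"
    print(spreadsheet_key_from_url(example_url))
```

The value it returns is what you would assign to spreadsheet_key in the script.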

Where to find me, first half of 2009

Here’s the speaking/travel that I’m expecting to do in early 2009:

January 17-22, 2009: My wife and I journeyed to Washington D.C. to see the Presidential inauguration. It was very cold.

February 4-7, 2009: I just attended the TED conference down in Long Beach. It was pretty amazing. I got to pick Bill Gates’ brain on his foundation’s criteria for what to fund, and to talk to a man who had a near-death experience about a year ago and how it changed his life. The speakers were top notch too. If I get a chance, I may write more about TED later.

February 10-12, 2009: I’ll be at the SMX West conference this week. Don’t miss my session on “Ask the Search Engines” on Thursday, Feb. 12 — trust me. Looks like you can still register.

March 11-13, 2009: PubCon will be in Austin, and I’ll be doing a keynote (including Q&A) on Thursday, March 12th. Registration for PubCon is still open too.

March 13-17-ish, 2009: I’ve never been to South by Southwest, so I’m planning to attend that (it’s also in Austin, right after PubCon). Right now I’m not planning to speak, just to attend the “Interactive” part.

April 5-10, 2009: Taking a short vacation with my wife.

May 10-15, 2009: Taking a short vacation with my wife.

May 27-28, 2009: I’m speaking at Google I/O in San Francisco.

May 30, 2009: I’ll be back at WordCamp in San Francisco.

June 2-3, 2009: I’m speaking at SMX Advanced.

June 9-11, 2009: I was scheduled to speak at the Found conference in Burlingame, California, but the conference has been postponed.

July 19-23, 2009: On July 22nd I’ll be speaking in Boston on the Industry Track of the Special Interest Group on Information Retrieval (SIGIR).

August 23-28, 2009: Taking a vacation with my wife.

I’m not exactly sure how I backed into this much speaking/travel in 2009, but that’s how things are shaping up right now.

Google and Big Ideas

I love Om Malik and respect him greatly. I’m hoping to corner him for lunch sometime to pick his brain on ways that Google could improve. But today he happened to do a tweet that caught my attention just as my morning caffeine was kicking in. Om said “I think google has no big ideas. this morning they announced a to-do-list. …” So over on FriendFeed, I listed a few Big Ideas from the last week. 🙂

You can click on the hyperlinks over on the FriendFeed discussion to see what I thought Google’s Big Ideas were in the last week or so. Personally, I don’t think Google is out of big ideas. 🙂

Added: Be sure to read Om’s thoughtful response too.
