Archive for August, 2006

PageRank in academia

(Another post in the “back to school” theme.)

I was reading the June 2006 issue of Nature a few weeks ago back in Kentucky, and happened across a good article by Mark Buchanan. He discussed a recent paper in which scientists decided to rank papers not just by the raw number of citations, but by using a PageRank-like algorithm. One important paper by John Slater jumped from 1,853rd to 10th place:

The Slater determinant slipped into common usage and into a number of other papers that went on to become classics. Today, this paper gets few direct references, but scores points indirectly in Google terms as other papers that cited it long ago continue to accrue new citations.

I don’t see Buchanan’s article online, but a physics student did his own summary. I like the notion that Google sprang from academia, and that things like Google Scholar can make life a little easier for students in return. Besides, you know, free hosting and bug tracking for open-source projects, Summer of Code, Anita Borg scholarships, our free pizza for late-night hacking ambassadors, the free sitesearch and websearch that we offer to universities and non-profits, and stuff like that. Jeez, that’s a lot of stuff. And I forgot about the page for college students that collects our free services. The n-gram data we’re making available to researchers about phrases on the web. The 2006 Code Jam programming contest. Okay, I’m stopping because my head hurts. But it’s clearly a good time to be a student.

Comments (31)

Related articles in Google Scholar

Google Scholar recently added the ability to find related articles for a paper, so I decided to try it out with a paper from my former life as a graphics person. It was a paper in SIGGRAPH a few years ago called the “Office of the Future” and it dealt with projecting images onto surfaces that aren’t exactly flat. If you predistort an image before you project it, you can often cancel out the different surfaces as you project onto them, which lets you create the appearance of a flat screen again.

Here’s a picture that gives you the idea. It’s completely unrelated to me or the paper, but it demonstrates the concept really well. (I just did a Flickr search for [projection surface] and found a nice result. Thanks, Flickr!)

[Image: Projecting on a rock!]

I don’t read German, other than a few choice words like Private Krankenversicherung, Flugkarten, and Parkplatz, but it looks quite similar: someone is measuring the irregularity of the rock and then pre-distorting the projected image so that it looks flat when projected.

Anyway, back on topic. If you search for [office of the future] on Google Scholar, you get a page like this:

[Image: Search results for office of the future]

Cool, 300+ citations! Not bad. Now click on the “Related Articles” link and you see something like this:

[Image: Related articles for the SIGGRAPH paper]

I can attest that that’s a pretty relevant list of related articles.

One fun thing about working at Google is talking to different people. When I talk to grad students, a lot of them mention that they like Google Scholar, and I can understand why. If you’re looking for background research, Google Scholar is pretty helpful, especially with the new related articles feature. In the image above, for example, you can see the “CAVE” paper (4th one from the top) has 1026 citations. That’s a pretty seminal paper in the “projecting graphics on unusual displays” niche. Nice job making Scholar even better, A and A and whoever else worked on this. :)

Comments (18)

SEO Advice: Writing useful articles that readers will love

Okay SEOs, what can you learn from my previous post about changing the default printer for Firefox on Linux? In the last week someone wrote and said “I want you to talk about SEO, and don’t give me any of that crap about good content.” I’m going to beg to differ. :) I wrote that post mainly because I’ve looked for this information a couple times and never found exactly what I was looking for quickly. That tells me that in this small niche, I could utterly rock the search engines. Plus once I figured out the info, it was only 10-20% more time to package it up nicely. Now this short content post can act as an evergreen draw for searchers.

Notice what I did with keywords. I carefully chose keywords for the title and the url (note that I used “change” in the url and “changing” in the title). The categories on my post (“How to” and “Linux”) give me a subtle way to mention Linux again, and include a couple extra ways that someone might do a search: lots of users type “how to (do what they want to do).” I thought about the words that a user would type in when looking for an answer to their question, and tried to include those words in the article. I also tried to think of a few word variations and included them where they made sense (file vs. files, bash and bashrc, Firefox and Mozilla, etc.). I’m targeting a long-tail concept where someone will be typing several words, so I’m probably in a space where on-page keywords are enough to rank pretty well. I don’t need anchor-text for “linux default printer” or similar phrases; in the on-page space, I’d recommend thinking more about words and variants (the “long-tail”) and thinking less about keyword density or repeating phrases.

The meta-issues I’d mention would be:
1) The utility of an article is paramount. If you write 2000 words about mortgage loans and never discuss the industry landscape or impart some useful, concrete knowledge to your reader, that should set off a warning flag in your head. So use this advice only for good (high-quality articles), not for evil. :)
2) Be sure to study your niche. I just spent 10-15 minutes to tackle the “default printer in Linux/Firefox/Mozilla” space. Is that niche worth writing an article about? Well, it was for me, because I was looking for this information myself. In general, any time you look for an answer or some information and can’t find it, that should strike you as an opportunity.

But the larger point is that if you put in time and research to produce or to synthesize original content, think hard about what niches to target. My advice is not to start with an article about porn/pills/casinos/mortgages–it’s better to start with a smaller niche. If you become known as an expert on (say) configuring Linux or hacking gadgets, you could build that out with things like forums to create even more useful content. Look for a progression of niches so that you start out small or very specific, but you can build your way up to a big, important area over time.

There are a lot of niches that just take sweat equity. You could be the SEO that does interviews. Or the SEO that transcribes Matt’s videos. Or the SEO that makes funny lists. Or the SEO company that provides webmaster radio. Or the SEO that makes podcasting easy. Or the SEO that specializes in a certain content management system or shopping cart. Or the SEO company that specializes in Yahoo! stores. Or the SEO that specializes in accessibility. Or the company that mocks Silicon Valley and its companies. Or the SEO that specializes in AdWords API ROI tracking. Or you could be the SEOs that write up a summary of every panel at every search engine conference. Or the company that does cartoons. Or the SEO who pays attention to Google Base, Google Co-op, Yahoo! Answers, or Facebook. Or the SEO that provides Firefox plugins. Or the company that provides metrics and tracking for blogs. Or the SEO that talks about patents. Or the SEO that specializes in dynamic sites. Eye-tracking. Beginner SEO tutorials. Making maps mash-ups. Ajax SEO. SEO for non-profits. SEO for Second Life or MySpace. SEO to repair a company’s reputation. SEO for MySQL, Python, Ruby on Rails, WordPress blogs, or .NET sites. The SEO that surfaces databases or Flash sites. SEO for self-publishing authors. The SEO that does radio ads.

An infinite number of niches are waiting for someone to claim them. I’d ask yourself where you want to be, and see if you can find a path from a tiny specific niche to a slightly bigger niche and so on, all the way to your desired goal. Sometimes it’s easier to take a series of smaller steps instead of jumping to your final goal in one leap.

Comments (175)

Changing the default printer on Linux and Firefox

If your Linux system uses CUPS, you can change the default printer for Firefox or Mozilla with this command:

lpoptions -d printer-name

The -d option sets the default printer.
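If you don’t remember the exact name to use for printer-name, the lpstat command that comes with CUPS can list the printers it knows about along with the current default. Here’s a minimal sketch; the show_printers wrapper is just a hypothetical name I made up, and it assumes the CUPS client tools are on your path:

```shell
# Hypothetical wrapper: list the printers CUPS knows about, then the
# current default destination.
show_printers() {
  if ! command -v lpstat >/dev/null 2>&1; then
    echo "lpstat not found; CUPS client tools may not be installed" >&2
    return 1
  fi
  # -p: printers and their status, -d: current default destination
  lpstat -p -d
}
```

Run show_printers before and after lpoptions -d to confirm that the default actually changed.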

If you want to set a default printer for command-line programs, I’d set a variable in your shell start-up configuration file. If your shell is bash (which is the most common), you’d use this syntax in your ~/.bashrc file:

export PRINTER=printer-name
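If you’d rather script that change than open an editor, something like the sketch below appends the line only when it isn’t already there, so running it twice doesn’t leave duplicates. The add_line_once helper is a hypothetical name of mine, not a standard command, and printer-name is still a placeholder:

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present.
add_line_once() {
  file=$1
  line=$2
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
    printf '%s\n' "$line" >> "$file"
  fi
}

# Example (printer-name is a placeholder for your real queue name):
# add_line_once "$HOME/.bashrc" 'export PRINTER=printer-name'
```

New terminals pick up the setting automatically; in an existing one, source ~/.bashrc to load it.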

If you use csh or tcsh, put this in your ~/.cshrc or ~/.tcshrc files:

setenv PRINTER printer-name

If you’re a csh person, you might ask about the difference between set PRINTER=printer-name and setenv PRINTER printer-name. The difference is that set makes a regular variable, and setenv creates an environment variable. Regular variables are available only within csh. With an environment variable, any program that you run from csh can also see the variable.
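The same distinction exists in bash, where a plain assignment plays the role of set and export plays the role of setenv. Here’s a small sketch you can paste into a bash or sh session to see it; DEMO_PRINTER is a made-up variable name used only for the demonstration:

```shell
# Start clean so the demo does not depend on the caller's environment.
unset DEMO_PRINTER PRINTER

# Plain shell variable (like csh's "set"): visible only in this shell.
DEMO_PRINTER=printer-name
child_plain=$(sh -c 'echo "${DEMO_PRINTER:-unset}"')

# Exported variable (like csh's "setenv"): inherited by child processes.
export PRINTER=printer-name
child_exported=$(sh -c 'echo "${PRINTER:-unset}"')

echo "plain variable seen by child:    $child_plain"
echo "exported variable seen by child: $child_exported"
```

The child shell reports the plain variable as unset but sees the exported one, which is why PRINTER has to be exported (or set with setenv) before programs launched from the shell can use it.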

Comments (23)

Getting better..

My dad reminded me that it takes about seven days to get over a cold. But if you go see a doctor or take a lot of medicine, it only takes about a week. :)

I’m feeling better and should be ready to play rollerblade hockey again in a day or so. Thanks to everyone for the cold remedy suggestions, except the folks who suggested chugging a quart of boiling water. :)

Comments (63)
