PageRank in academia

(Another post in the “back to school” theme.)

I was reading the June 2006 issue of Nature a few weeks ago back in Kentucky, and happened across a good article by Mark Buchanan. He discussed a recent paper in which scientists decided to rank papers not just by the raw number of citations, but by using a PageRank-like algorithm. One important paper by John Slater jumped from 1,853rd to 10th place:

The Slater determinant slipped into common usage and into a number of other papers that went on to become classics. Today, this paper gets few direct references, but scores points indirectly in Google terms as other papers that cited it long ago continue to accrue new citations.
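To make the "scores points indirectly" idea concrete, here is a minimal sketch of a PageRank-style ranking over a citation graph. It is the textbook power-iteration form, not necessarily what the paper's authors used. The function name `citation_rank`, the damping factor of 0.85, and the toy paper names are all assumptions made up for illustration.

```python
def citation_rank(citations, damping=0.85, iterations=50):
    """Rank papers with a simplified PageRank.

    `citations` maps each paper to the list of papers it cites.
    Illustrative sketch only; names and parameters are hypothetical.
    """
    # Collect every paper that appears as a citer or as a cited work.
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    papers = sorted(papers)
    n = len(papers)

    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for paper in papers:
            cited = citations.get(paper, [])
            if cited:
                # A paper splits its score evenly among the works it cites.
                share = damping * rank[paper] / len(cited)
                for target in cited:
                    new_rank[target] += share
            else:
                # A paper that cites nothing spreads its score over everyone.
                share = damping * rank[paper] / n
                for target in papers:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Toy graph (invented): "old_paper" has one direct citation, from a classic
# that is itself widely cited; "other" has two citations from obscure papers.
toy_citations = {
    "classic": ["old_paper"],
    "new1": ["classic"],
    "new2": ["classic"],
    "new3": ["classic"],
    "new4": ["classic"],
    "minor1": ["other"],
    "minor2": ["other"],
}
ranks = citation_rank(toy_citations)
print(ranks["old_paper"] > ranks["other"])
```

In this toy graph the old paper has fewer direct citations than "other", yet it ranks higher because the classic that cites it keeps collecting citations of its own.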

I don’t see Buchanan’s article online, but a physics student did his own summary. I like the notion that Google sprang from academia, and that things like Google Scholar can make life a little easier for students in return. Besides, you know, free hosting and bug tracking for open-source projects, Summer of Code, Anita Borg scholarships, our free pizza for late-night hacking ambassadors, the free sitesearch and websearch that we offer to universities and non-profits, and stuff like that. Jeez, that’s a lot of stuff. And I forgot about the page for college students that collects our free services. The n-gram data we’re making available to researchers about phrases on the web. The 2006 Code Jam programming contest. Okay, I’m stopping because my head hurts. But it’s clearly a good time to be a student.

Related articles in Google Scholar

Google Scholar recently added the ability to find related articles for a paper, so I decided to try it out with a paper from my former life as a graphics person. It was a paper in SIGGRAPH a few years ago called the “Office of the Future” and it dealt with projecting images onto surfaces that aren’t exactly flat. If you predistort an image before you project it, you can often cancel out the different surfaces as you project onto them, which lets you create the appearance of a flat screen again.

Here’s a picture that gives you the idea. It’s completely unrelated to me or the paper, but it demonstrates the concept really well (I just did a flickr search for [projection surface] and found a nice result–thanks flickr!)

Projecting on a rock!

I don’t read German, other than a few choice words like Private Krankenversicherung, Flugkarten, and Parkplatz, but it looks quite similar: someone is measuring the irregularity of the rock and then pre-distorting the projected image so that it looks flat when projected.
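For the curious, the pre-distortion trick can be sketched in a few lines for the simplest case, a flat surface that is merely tilted relative to the projector. In that case the distortion is a 3x3 homography, and pre-distorting means pushing the image through the inverse of that matrix first. This is a deliberate simplification: an irregular surface like the rock needs a per-pixel mapping rather than one matrix. The matrix values and function names below are invented for illustration.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography (list of three rows)."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def invert_3x3(m):
    """Invert a 3x3 matrix using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is not invertible")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def predistort(H, desired_point):
    """Find where the projector must draw so the viewer sees `desired_point`."""
    return apply_homography(invert_3x3(H), desired_point)


# Hypothetical distortion: how the tilted surface moves each projector pixel.
surface_distortion = [
    [1.0, 0.2, 5.0],
    [0.1, 1.1, -3.0],
    [0.001, 0.0005, 1.0],
]

# Corners of the rectangle we want the viewer to see (made-up 640x480 image).
desired_corners = [(0.0, 0.0), (640.0, 0.0), (640.0, 480.0), (0.0, 480.0)]

for corner in desired_corners:
    drawn = predistort(surface_distortion, corner)
    seen = apply_homography(surface_distortion, drawn)
    print(corner, "->", seen)
```

Each corner is drawn at a pre-distorted position, the surface distorts it back, and the viewer sees the original rectangle, up to floating-point rounding.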

Anyway, back on topic. If you search for [office of the future] on Google Scholar, you get a page like this:

Search results for office of the future

Cool, 300+ citations! Not bad. Now click on the “Related Articles” link and you see something like this:

Related articles for the SIGGRAPH paper

I can attest that that’s a pretty relevant list of related articles.

One fun thing about working at Google is talking to different people. When I talk to grad students, a lot of them mention that they like Google Scholar, and I can understand why. If you’re looking for background research, Google Scholar is pretty helpful, especially with the new related articles feature. In the image above, for example, you can see the “CAVE” paper (4th one from the top) has 1026 citations. That’s a pretty seminal paper in the “projecting graphics on unusual displays” niche. Nice job making Scholar even better, A and A and whoever else worked on this. :)

Vacation books?

Okay, I’m looking for fun, light reading for my vacation. I don’t want search stuff, I don’t want heavy reading, I don’t want geopolitics or history.

Things like The Curious Incident of the Dog in the Night-Time. Or Terry Pratchett. Or early William Gibson. Cheesy cyberpunk if they don’t get the computer stuff too wrong. Neil Gaiman. Transmetropolitan.

Lazyweb, I invoke you! What should I read on vacation?

Most entertaining stuff of 2005

I was rooting around on my bookshelf today and noticed the book that I enjoyed the most last year: The Curious Incident of the Dog in the Night-Time, by Mark Haddon.

Most fun movie? I’m picking a dark horse this time: The 40-Year-Old Virgin. It surprised me by being better than I expected.

Most entertaining video game? I’m going to go with Guitar Hero (see my earlier post), but if I’d discovered Katamari Damacy or We ♥ Katamari, it would have been very close. Just when you think every genre of videogame has been invented, along come fresh new ideas. In the Katamari series of games, you roll a ball around; as you roll up more things, your Katamari grows larger, until finally you’re rolling up islands and countries. It’s so much fun that I’m ordering the Katamari soundtrack.

Put-down-able books

When your day job is trying to help Google organize the world’s information, you need as much sleep as you can get. If you get started on some book that you can’t put down, you’ll be bleary the next day. That has led me to seek out “put-down-able” books. I’m not talking about bad books, but tomes that you can stop reading at any time.

Without further ado, here is my list of put-down-able books, in case other webmasters or search engine reps need their sleep:

  • The Big Show: High Times and Dirty Dealings Backstage at the Academy Awards. I normally don’t care for celebrity stuff, but a friend was reading this and I nicked it from them. 400+ pages and you can read the chapters in any order.
  • The Other Hollywood: The Uncensored Oral History of the Porn Film Industry. A great record of the history of pornographic films, and another Hollywood-ish book I enjoyed. You can literally hop anywhere into this book and just start reading. (By the way, does anyone have recommendations to learn more about the online porn industry? Ynot? Netpond? Luke Ford? Where should I be reading to improve my understanding?)
  • The Baroque Cycle, by Neal Stephenson. I love Stephenson’s early work, and his In the Beginning was the Command Line was fantastic. Cryptonomicon was good reading, but it was pretty dense and intricate. The Baroque Cycle consists of three books: Quicksilver, The Confusion, and The System of the World, making for 960+832+912 = 2704 pages that you can pick up and put down at will.
  • Nightwork: A History of Hacks and Pranks at MIT. A fun little read with no tension or drama, so it’s easy to interrupt at any point.
  • Anything by Amy Tan. I’ve read most of her books and I’m about to start Saving Fish from Drowning. I love Tan’s deliberate pacing.
  • The “Stealing the Network” series (How to Own the Box, How to Own a Continent, and How to Own an Identity). I love fiction that teaches me something. For example, I had no idea how to use Nmap until I watched The Matrix ;). This series of computer security books is steeped in real-world facts. The books are easy to put down because each chapter is an independent little story that stands on its own, but the chapters still form a larger story.

Those are the ones that I can think of right now. What non-stressy or put-down-able books have you read recently?

(Yes, yes, comment approval will start again tomorrow. Or maybe Friday.)