Archives for November 2010

What would you do if you were CEO of Google?

Every so often I like to play a game I call “CEO of Google.” The idea of the game is that through some combination of weird circumstances, you are now the CEO of Google. Don’t worry about how you became CEO–maybe you won a contest to be CEO for a day. I’d like to know: what products would you launch? What ideas would you fund?

I’m specifically interested in Big Ideas. Things like data liberation or self-driving cars. Or what’s the one huge feature missing from a Google product? In essence, what are the big products, features, or ideas that you would like to see Google work on?

More broadly, I’d love ideas that would improve the world, the web, or Google itself. What would you like to see Google do in 2011 or beyond?

Open Kinect Contest: $2000 in prizes

I’m starting a contest for people who do cool things with a Kinect. See the details below.

Open Kinect Logo

Before I joined Google, I was a grad student interested in topics like computer vision, motion self-tracking, laser scanners–basically any neat or unusual sensing device. That’s why I was so excited to hear about the Kinect, which is a low-cost ($150) peripheral for the Xbox. The output from a Kinect includes:
– a 640×480 color video stream.
– a 320×240 depth stream. Depth is recovered by projecting invisible infrared (IR) dots into a room. You should watch this cool video to see how the Kinect projects IR dots across a room. Here’s a single frame from the video:

IR Projection

but you should really watch the whole video to get a feel for what the Kinect is doing.
– a 3-axis accelerometer.
– a controllable motor to tilt the sensor up and down, plus four microphones.

What’s even better is that people have figured out how to access data from the Kinect without requiring an Xbox to go with it. In fact, open drivers for the Kinect have now been released. The always-cool Adafruit Industries, which offers all sorts of excellent do-it-yourself electronics kits, sponsored a contest to produce open-source drivers for the Kinect:

First person / group to get RGB out with distance values being used wins, you’re smart – you know what would be useful for the community out there. All the code needs to be open source and/or public domain.

Sure enough, within a few days the contest was won by Héctor Martín Cantero, who is rolling his reward into tools and devices for the fellow white-hat hackers and reverse engineers he works with. That’s a great gesture. Okay, so where are we now? If I were still in grad school, I’d be incredibly excited: there’s now a $150 off-the-shelf device that provides depth plus color video and a lot more.

It’s time for a new contest

I want to kickstart neat projects, so I’m starting my own contest with $2000 in prizes. There are two $1000 prizes. The first $1000 prize goes to the person or team that writes the coolest open-source app, demo, or program using the Kinect. The second prize goes to the person or team that does the most to make it easy to write programs that use the Kinect on Linux.

Enter the contests by leaving a comment on this blog post with a link to your project, along with a very short description of what your project does or your contribution to Kinect hacking. The contest runs until the end of the year: that’s Dec. 31st, 2010 at midnight Pacific time. I may ask for outside input on who should win, but I’ll make the final call.

To get your ideas flowing, I’ll offer a few suggestions. Let’s start with the second contest: making the Kinect more accessible. In my ideal world, would-be hackers would type a single command, e.g. “sudo apt-get install openkinect”, and after that command finishes, several tools for the Kinect would be installed. Maybe a “Kinect snapshot” program that dumps a picture, a depth map, and the accelerometer values to a few files. Probably some sort of openkinect library plus header files so that people can write their own Kinect programs. I would *love* some bindings to a high-level language like Python so that would-be hobbyists can write 3-4 lines of Python (“import openkinect”) and start trying ideas with minimal fuss. To win the second contest, you could write any of these libraries, utilities, or bindings, or simplify installing them on recent versions of Linux/Ubuntu (let’s say 10.04 or greater).
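To make that concrete, here’s the flavor of script I’d want a hobbyist to be able to write. To be clear: the openkinect module and every call on it below are hypothetical. This is a sketch of the API I’m wishing for, not a library that exists today.

import openkinect                      # hypothetical module

kinect = openkinect.open()             # hypothetical: claim the first Kinect found
rgb = kinect.video_frame()             # hypothetical: one 640x480 color frame
depth = kinect.depth_frame()           # hypothetical: one 320x240 depth map
accel = kinect.accelerometer()         # hypothetical: current 3-axis reading
openkinect.save_snapshot("scan-001")   # hypothetical: dump all of the above to files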

Okay, how about some ideas for cool things to do with a Kinect? I’ll throw out a few to get you thinking.

Idea 1: A Minority Report-style user interface where you can open, move, and close windows with your movements.
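Even a crude version of Idea 1 is approachable: treat the nearest blob in the depth map as the user’s hand and map its centroid to a cursor position. Here’s a minimal sketch of that first step in Python with NumPy; the 5 cm blob threshold and the screen mapping are arbitrary assumptions, and the depth frame is synthetic:

import numpy as np

def hand_position(depth, screen_w=1920, screen_h=1080):
    # Crude gesture cursor: assume the hand is the blob of pixels
    # within 5 cm of the nearest point, then map its centroid from
    # depth-map coordinates to screen coordinates.
    near = depth < (depth.min() + 0.05)
    ys, xs = np.nonzero(near)
    h, w = depth.shape
    return int(xs.mean() / w * screen_w), int(ys.mean() / h * screen_h)

# Toy frame: a "hand" at 0.6 m against a 2.5 m background.
depth = np.full((240, 320), 2.5)
depth[100:140, 200:230] = 0.6
print("cursor at:", hand_position(depth))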

Idea 2: What if you move the Kinect around or mount it to something that moves? The Kinect has an accelerometer plus depth sensing plus video. That might be enough to reconstruct the position and pose of the Kinect as you move it around. As a side benefit, you might end up with a 3D model of your surroundings. The folks at UNC-Chapel Hill, where I went to grad school, built a wide-area self-tracker that relied on a Kalman filter to estimate a person’s position and pose. See this PDF paper for an example.
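To give a feel for the machinery involved, here’s a minimal one-dimensional constant-velocity Kalman filter in Python with NumPy. A real Kinect pose tracker would be a much bigger filter fusing the accelerometer, depth, and video into a full position-and-orientation state, but the predict/update loop has the same shape; the noise values here are made-up assumptions.

import numpy as np

dt = 1.0 / 30.0                        # assume 30 Hz sensor updates
F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity state transition
H = np.array([[1.0, 0.0]])             # we only measure position
Q = 0.01 * np.eye(2)                   # process noise (assumed)
R = np.array([[0.25]])                 # measurement noise (assumed)

def kalman_step(x, P, z):
    # Predict: propagate the state [position, velocity] forward one step.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update: correct the prediction with the new measurement z.
    y = z - H @ x                      # innovation
    S = H @ P @ H.T + R                # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)     # Kalman gain
    return x + K @ y, (np.eye(2) - K @ H) @ P

# Track a target moving at 1 unit/sec from noisy position measurements.
x, P = np.zeros((2, 1)), np.eye(2)
for t in range(100):
    z = np.array([[t * dt + 0.5 * np.random.randn()]])
    x, P = kalman_step(x, P, z)
print("estimated position, velocity:", x.ravel())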

Idea 3: Augmented reality. Given a video stream plus depth, look for discontinuities in depth to get a sort of 2.5-dimensional representation of a scene with layers. Then add new features into the video stream, e.g. a bouncing ball that goes between you and the couch, or behind the couch. The pictures at the end of this PDF paper should get you thinking.
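As a starting point, finding those depth discontinuities can be as simple as thresholding the gradient of the depth map. A rough sketch, assuming depth arrives as a NumPy array of per-pixel distances in meters (the 10 cm jump threshold is an arbitrary assumption):

import numpy as np

def depth_edges(depth, jump=0.1):
    # Mark pixels where depth changes by more than `jump` meters
    # between neighbors; these are candidate layer boundaries.
    dy, dx = np.gradient(depth.astype(float))
    return np.hypot(dx, dy) > jump

# Toy scene: a wall at 2.0 m with a "couch" at 1.0 m in front of it.
depth = np.full((240, 320), 2.0)
depth[120:, 80:240] = 1.0
print("edge pixels found:", int(depth_edges(depth).sum()))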

Idea 4: Space carver. Like the previous idea, but instead of learning the 2.5D layers of a scene from a single depth map, use the depth maps over time. For example, think about a person walking behind a couch. When you can see the whole person, you can estimate how big they are. When they walk behind the couch, they’re still just as big, so you can guess that the couch is occluding them and therefore sits in front of them. Over time, you could build up much more accurate discontinuities and layers for a scene by watching who walks behind or in front of what.
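One cheap way to exploit time: the farthest depth ever observed at each pixel approximates the static scene, because movers only ever show up closer than whatever is behind them. Here’s a toy sketch of that idea on synthetic frames; a real space carver would reason much harder, but this shows why the couch “survives” while the walking person gets carved away:

import numpy as np

def static_background(frames):
    # The largest depth seen at each pixel over time is a good guess
    # at the static scene: moving objects only ever appear closer.
    return np.max(np.stack(frames), axis=0)

frames = []
for t in range(60):
    d = np.full((240, 320), 3.0)       # back wall at 3.0 m
    d[150:, 60:260] = 1.5              # couch at 1.5 m, always visible
    col = 10 + 4 * t                   # person at 2.0 m walking left to right
    d[80:200, col:col + 20] = np.minimum(d[80:200, col:col + 20], 2.0)
    frames.append(d)

bg = static_background(frames)
print("couch depth:", bg[200, 150], " wall depth:", bg[50, 50])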

Idea 5: A 3D Hough transform. A vanilla Hough transform takes a 2D image, looks for edges in the image, and then runs some computation to determine the lines in the image. A 3D Hough transform finds planes in range data instead. I’ve done this with laser rangefinder data and it works. So you could take depth data from a Kinect and reconstruct the planes of the ground or walls in a room.
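For the curious, here’s a bare-bones 3D Hough transform in Python. Planes are parameterized by a normal direction (theta, phi) and a distance rho from the origin; every 3D point votes for the bins of all planes passing through it, and peaks in the accumulator are the dominant planes. The grid resolutions and the 5 m range are arbitrary choices for illustration:

import numpy as np

def hough_planes(points, n_theta=18, n_phi=9, n_rho=50, rho_max=5.0):
    # Vote each 3D point into a (theta, phi, rho) accumulator.
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    phis = np.linspace(0, np.pi / 2, n_phi)
    acc = np.zeros((n_theta, n_phi, n_rho), dtype=int)
    for i, th in enumerate(thetas):
        for j, ph in enumerate(phis):
            normal = np.array([np.sin(ph) * np.cos(th),
                               np.sin(ph) * np.sin(th),
                               np.cos(ph)])
            rho = points @ normal              # distance of each point's plane from the origin
            idx = np.round(rho / rho_max * (n_rho - 1)).astype(int)
            ok = (idx >= 0) & (idx < n_rho)
            np.add.at(acc[i, j], idx[ok], 1)   # accumulate votes
    return acc, thetas, phis

# Toy cloud: a noisy floor plane at z = 1.0.
pts = np.random.rand(2000, 3) * [4.0, 4.0, 0.0]
pts[:, 2] = 1.0 + 0.01 * np.random.randn(2000)
acc, thetas, phis = hough_planes(pts)
i, j, k = np.unravel_index(acc.argmax(), acc.shape)
print("best plane: theta=%.2f phi=%.2f rho=%.2f" % (thetas[i], phis[j], k * 5.0 / 49))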

Idea 6: What if you had two or more Kinects? You’d have depth or range data from the viewpoint of each Kinect and you could combine or intersect that data. If you put two Kinects at right angles (or three or four Kinects around a room, all pointing into the room), could you reconstruct a true 3D scene or 3D object from intersecting the range data from each Kinect?
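The classic way to intersect viewpoints is a voxel grid: each sensor carves away the cells it can see past, i.e. everything closer than the measured surface along each line of sight, and whatever survives every viewpoint approximates the solid scene. Here’s a deliberately simplified 2D version with two sensors at right angles; real Kinects would need calibrated positions and proper ray projection:

import numpy as np

N = 50
solid = np.zeros((N, N), dtype=bool)
solid[20:30, 20:30] = True             # a 10x10 object in the middle of the room

def carve_from_left(grid):
    # Sensor on the left looking right: along each row, every cell
    # in front of the first occupied cell is provably empty.
    keep = np.ones_like(grid)
    for r in range(grid.shape[0]):
        hits = np.flatnonzero(grid[r])
        first = hits[0] if hits.size else grid.shape[1]
        keep[r, :first] = False        # carve the free space up to the surface
    return keep

# Carve from the left, then from the top (a second Kinect at right angles).
keep_left = carve_from_left(solid)
keep_top = carve_from_left(solid.T).T
reconstruction = keep_left & keep_top
print("true cells:", solid.sum(), " reconstructed cells:", reconstruction.sum())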

I hope a few of these ideas get you thinking about all the fun things you could do with a Kinect. I’m looking forward to seeing what cool ideas, applications, and projects people come up with!
