Reverse engineering a Windows USB driver

For a while, I was really into reverse-engineering USB drivers. Don’t ask why. The heart wants what the heart wants. I didn’t finish this “hairball” post, but it has some info in it that still might be good.

I recently stumbled across this post and it inspired me. I decided to try to reverse engineer the USB protocol for my Omron pedometer, which can upload your step data, but only to a Windows computer.

There are two parts to writing a Linux driver for a new USB device: reverse-engineering the USB protocol, and writing the Linux program.

Reverse-engineering the USB protocol

Typically your problem is that a device only runs under Windows. Like it or not, that means that you’re going to need something that runs Windows. It can be a Windows computer, or you can get fancy and run Windows as a “guest” operating system using something like VMWare to do virtualization. That is, you’d install Linux, then install VMWare, then install Windows to run under VMWare. But let’s start simple.

Step 0. Find the Vendor ID and Product ID of your device

Every USB device should have a Vendor ID plus a Product ID (sometimes called a device ID) that identifies it. You’ll need to discover this information before you can talk to the device. I plugged my Omron pedometer into a Linux machine and typed “lsusb”. You’ll get a lot of information back. I saw a line like

Bus 002 Device 018: ID 0590:0028 Omron Corp.

That tells me that the vendorid is hexadecimal value 0x0590 (which is 1424 in decimal) and the productid is hex value 0x0028 (which is 40 in decimal). For other operating systems, this post tells you how to find your vendor id and product id under Mac and Windows. For Windows XP, it looks like you can run “msinfo32.exe” and then look under “Components” and then “USB” and look for “VID_” (vendor id) and “PID_” (product id).
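If you'd rather not do the hex conversion by hand, here is a minimal Python sketch that pulls the two IDs out of an lsusb line. The function name parse_lsusb_line and the regular expression are my own illustration, and it assumes the line contains an "ID vvvv:pppp" field like the one above.

```python
import re

# Matches the ID field of an lsusb line, e.g. "ID 0590:0028".
LSUSB_ID = re.compile(r"ID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})")


def parse_lsusb_line(line):
    """Return (vendor_id, product_id) as integers, or None if no ID field is found."""
    match = LSUSB_ID.search(line)
    if match is None:
        return None
    vendor_hex, product_hex = match.groups()
    # Both fields are hexadecimal, so convert with base 16.
    return int(vendor_hex, 16), int(product_hex, 16)


line = "Bus 002 Device 018: ID 0590:0028 Omron Corp."
vendor_id, product_id = parse_lsusb_line(line)
print(f"vendor=0x{vendor_id:04x} ({vendor_id}), product=0x{product_id:04x} ({product_id})")
# prints: vendor=0x0590 (1424), product=0x0028 (40)
```

Keeping the IDs as integers is handy later, since most USB libraries expect numeric vendor and product IDs rather than hex strings.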

1) The simple approach: a dedicated Windows computer

In the beginning, it’s easiest to just use a Windows computer and run some software to sniff on the USB packets as they go back and forth. The wild part is that the best open-source/free program I found is five years old (SnoopyPro). It still worked fine on Windows XP though. SnoopyPro is the program you want. There’s a whole long history of how it forked from USBSnoopy (evidently also called “sniff-bin”), and there’s another program called sniffusb which is related but different (I think both sniffusb and SnoopyPro are forks off of the original USBSnoopy/sniff-bin program). It’s all very confusing. I went with SnoopyPro and it worked fine for me.

Further reading on SnoopyPro and related USB sniffer programs:
Some documentation on how to use SnoopyPro
If you’re willing to shell out for a book, it looks like USB Complete, now in its third edition, is one of the best.
http://www.piclist.com/techref/usbs.htm – mentions all the different sniffers
http://hackspire.unsads.com/USB_Protocol#USB_traffic_analysis – talks about how to convert SnoopyPro (and SniffUsb) logs/traces into hexadecimal data.

Are there other options? Sure. USB Monitor from HHD Software is $85 and runs on Windows. Or you could spend $850-950 to buy a hardware USB protocol analyzer. Since I have only a casual interest, that’s a bit steep for me.

One last option is to run Windows as a virtual “guest” on a Linux system running something like VMWare. VMWare can let programs interact with USB devices. As the virtual version of Windows interacts with the USB device, the Linux computer gets to see everything that happens, because it sits between Windows and the USB device. In fact, Eric Preston presented a method that could log all the output of the Linux usbmon program as binary data. Eric changed usbmon to use relayfs so that large amounts of data could be quickly relayed from kernel space to user space, then wrote a user space program to dump that binary data to disk. Eric also wrote a dissector for ethereal so that he could view the USB data in real-time. Unfortunately the PDF of his slide presentation has disappeared from http://download.linuxmontreal.com/projects/usb/reveng/ where it used to be. In fact, all of linuxmontreal.com appears to be gone now. :(

By the way, Ethereal is now known as Wireshark, and it is a protocol analyzer that runs on many platforms and apparently supports USB traces. It looks like Wireshark has supported USB since version 0.99.4:

Wireshark now supports USB as a media type. If you’re running a Linux distribution with version 2.6.11 of the kernel or greater and you have the usbmon module enabled and you have a recent CVS version of libpcap (post-0.9.5) installed you can also do live captures. More details can be found at the USB capture setup page on the wiki.

Follow the link in the quote to find Wireshark’s USB wiki page.

On Ubuntu 7.10 (Gutsy Gibbon), I was able to do these commands:

sudo mount -t debugfs none_debugs /sys/kernel/debug
sudo modprobe usbmon
ls /sys/kernel/debug/usbmon
0s 0t 0u 1s 1t 1u 2s 2t 2u
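To make sense of that listing, here is a rough Python sketch that groups the file names by bus number. The suffix meanings in SUFFIX_MEANING and the idea that bus 0 stands for "all buses" are my reading of the kernel's usbmon notes, not something this post verifies, so treat them as assumptions. The helper name group_usbmon_files is made up for illustration.

```python
import re
from collections import defaultdict

# Assumed meaning of each suffix (my reading of the usbmon notes, not confirmed here).
SUFFIX_MEANING = {"s": "statistics", "t": "older text format", "u": "newer text format"}


def group_usbmon_files(listing):
    """Group names like '1u' by bus number, keeping the suffixes in listing order."""
    buses = defaultdict(list)
    for name in listing.split():
        match = re.fullmatch(r"(\d+)([stu])", name)
        if match:  # silently skip anything that is not <bus number><suffix>
            buses[int(match.group(1))].append(match.group(2))
    return dict(buses)


listing = "0s 0t 0u 1s 1t 1u 2s 2t 2u"
for bus, suffixes in sorted(group_usbmon_files(listing).items()):
    label = "all buses (assumed)" if bus == 0 else f"bus {bus}"
    print(label, [SUFFIX_MEANING[s] for s in suffixes])
```

With the lsusb output from earlier (Bus 002), the file to watch for the pedometer would presumably be one of the "2" entries.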

General USB Reading:
USB in a NutShell is a pretty good overview of how USB communication goes.
This Java and USB tutorial starts with a good overview of USB.
This USB and Linux tutorial starts to get into the nitty gritty of USB on Linux.

Example debunk post

Over the years I’ve written a lot of blog posts to debunk misconceptions or claims that weren’t true. Sometimes I publish the blog posts but often I don’t. This is a pretty typical example post. Someone claimed that Google was evil for removing a particular domain, when in fact the domain had been removed from Google’s index via a self-service user request to our url removal tool.

When we see misconceptions, we try to figure out where the confusion happened and how to prevent that type of confusion in the future. It’s also safe to assume when you read “Google cancelled my account” stories that there’s usually another side to the story, even if for some reason Google doesn’t go into the details.

My guess is that you haven’t seen this one unless you live in Switzerland. A few months ago, a friend noticed this complaint in Heute Online:

Benbit complaint

My ability to read German is, well, practically non-existent except for spammy words. So I asked a friend to translate it for me. Thanks, Johanna. :) Here’s roughly what it says:

Search giant kicks Swiss blogger out of the index

“Google is evil after all”

Zurich – On his blog, Benbit* from Zurich often discloses security holes of big companies. This makes him unpopular (see box) – so unpopular that Google kicked him out of the index.

heute: Congratulations, you are one of the first Swiss citizens to be kicked out by Google. Proud?
Benbit: Nowadays, everybody uses Google. So, it is not funny at all if you suddenly disappear completely from the search engine. To me, Google’s motto “Don’t be evil” is not right. Google is evil and misuses its power.

Why did Google delete your site?
I don’t have a clue. I sent emails and registered letters, but no one contacted me to give me reasons for this.

Might it be possible that this is connected to your hacker activities? Didn’t you publish the security holes of many companies on your blog?
I did, but this doesn’t violate Google’s guidelines. I am neither a spammer nor have I been doing illegal search engine optimisation for my blog. My only explanation is that I stepped on the toes of a Google advertising client who in turn complained about me.

Any idea who this might be?
Well, one of the companies that I mentioned on my blog. Among them are also powerful major banks.

As a small blogger, do you have any chance at all against Google?
What Google is doing is a clear case of censorship and violates Switzerland’s federal constitution. I demand from Google to provide me with information about the deletion from the index. Otherwise, I am also considering going to a justice of the peace.
* Name known to the editor. PS: Until our press deadline, Google did not comment.

http://blog.benbit.ch

Okay, let’s pause for a second. At this point in the story, I think we can all agree that Google is 100%, pure, concentrated eeeeeevil. How dare they squash that poor, hapless blogger at benbit.ch?

Except I haven’t told our side of the story. Our side of the story is pretty short: someone from benbit.ch used our automated url removal tool to remove benbit.ch themselves. Now why would someone from benbit.ch remove their own site (multiple times with multiple url patterns over multiple months, I might add), and then lay the blame at Google’s feet? I could speculate, but I genuinely have no idea.

One important thing to mention is that even with a really harsh story like this, we still look for ways to do better. For example, this incident happened in March of 2007 using our “old” url removal tool that had been up for years. In April 2007, the webmaster tools team rolled out a new version of the url removal tool. In my opinion, it kicks butt over the old tool in a couple ways:

1) site owners can easily see the urls that they’ve removed themselves.
2) site owners can easily revoke a url pattern that they’ve entered in the past.

Just to show you what I mean, here’s a snapshot where I’ve removed everything in the http://www.mattcutts.com/files/ directory of my site:

Url removal snapshot

As you can see, I can easily view the removal url patterns that I’ve submitted, and there’s a “Re-include” button if I decide to revoke a removal and start showing the urls again.

My takeaways from this post would be:

- Sometimes people say negative things about Google. Remember that there is often another side to the story.
- Even when people say negative things, folks at Google do listen and look for ways to improve. Case in point: the newer url removal tool resolves a whole class of potential misunderstandings like the one above.
- Google does provide a lot of useful tools for site owners. :)

I’m glad that the webmaster tools team works to make it easier to debug and to fix lots of issues for site owners. If the tool had launched just a month or two earlier, the folks at benbit.ch could have diagnosed their issue themselves — but at least everyone can benefit from the better tool now.

Google Chrome Tips

I started this blog post of Chrome tips in 2008. Even though this is a “hairball” post, some of these tips still work.

- control-shift-V will paste your selection as plain text

- control-shift-T will re-open the last tab you closed. You can repeat that to keep re-opening previously closed tabs.

- Hover over a tab to see the title for that page.

- shift-escape to bring up the Chrome process manager

- switch your default search engine: right-click in the Omnibox and select “Edit search engines…” . Select a search engine and click “Make Default”

- Chrome’s user-agent looks like “Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.2.149.24 Safari/525.13”

- Click on a tab and drag it to reorder tabs. To move the tab to a new window, click on the tab and drag it away from the tab bar until a “ghost image” of the tab appears.

- Use control-tab and control-shift-tab to shift your tab focus

- The address bar (referred to as an “Omnibox”) in Google Chrome is very smart. You can use it to type urls or to run searches. Once you type a space after a word, the browser will assume that you want to run a search. Once you type a ‘/’, the browser will assume that you want to navigate to a url.

- Here’s another omnibox trick. Visit Amazon.com and do a search for anything (say, Terry Pratchett). Your browser will see that you did a search and will learn that it can search amazon. Now start typing in the omnibox until “amazon.com” is offered as a suggestion, and then hit tab. You will be offered the ability to search directly on Amazon for what you want. So you could type “am” to bring up the “amazon.com” suggestion, then hit tab, and Chrome will say “Search amazon.com:”. If you then type “Little Brother” and hit return, you’ll be taken directly to Amazon’s search results for Little Brother.

- On Firefox, you’d use control-l to move the focus to the address bar and control-k to move the focus to the search box. Both shortcuts work on Google Chrome. Note that control-k adds a ‘?’ to the beginning of the address bar/omnibox, which is a shorthand way to write “Do a search.” So entering “?tax codes” would do a search for [tax codes]. After you get the hang of the omnibox, you’ll find yourself just typing searches and hitting enter, because you don’t really need the ‘?’ in front.

- Toggle the display of a bookmark bar on and off with control-shift-B. Even if the bookmark bar is off, it will conveniently appear for you on the “New Tab” window.

- Google Chrome doesn’t offer Google Bookmarks functionality, but if you want to use Google Bookmarks with your browser, you can visit google.com/bookmarks and there’s a bookmarklet at the bottom of the page that you can drag up to your bookmarks bar.

- If you delete a tab by accident, open up a new tab with control-t. In the bottom right is a section called “Recently closed tabs” where you can retrieve a tab. That section only lists three recently closed tabs though. You can re-open up to 10 closed tabs with control-shift-T.

- To maximize the Google Chrome window, you can double-click in any unused/blank part of the tab strip

- An Incognito window isn’t just useful for buying gifts or private porn surfing. If you have two different Google Accounts (maybe a work account and a personal account), you can use Incognito mode to keep two browser windows open and the two windows can each use a different Google Account. Open an Incognito window with ctrl-shift-N.

- control-h will open a history window so that you can search over your browser history

- To help prevent phishing, Google Chrome will bold the hostname of the url in the address bar.

- Attach a file in Gmail with simple drag-and-drop.

- Google Chrome has some neat internal pages that you can access. In the address bar, try entering “about:memory” to get a great breakdown of Chrome’s memory statistics. Enter “about:version” to get version information about Google Chrome. Enter “about:dns” to see the time you’ve saved with DNS prefetching. Enter “about:plugins” to find out more about your browser’s plugins. And “about:stats” shows all kinds of information.
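The Omnibox rules from the tips above (a leading ‘?’ means “do a search,” a space after a word means search, a ‘/’ means url) can be modeled with a toy Python sketch. This is only my illustration of those three rules, not Chrome’s actual logic. The function name classify_omnibox_input is made up, and the “whichever character comes first wins” tie-break is an assumption on my part.

```python
def classify_omnibox_input(text):
    """Toy classifier: return 'search', 'url', or 'undecided' from the three tips."""
    if text.startswith("?"):
        return "search"  # explicit "do a search" prefix, as added by control-k
    for ch in text:
        if ch == " ":
            return "search"  # a space showed up before any slash
        if ch == "/":
            return "url"  # a slash showed up before any space
    return "undecided"  # neither hint yet; the real browser has more signals


for example in ["?tax codes", "tax codes", "mattcutts.com/files/", "amazon"]:
    print(repr(example), "->", classify_omnibox_input(example))
```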

How Cuil generates its categories

This “hairball” post about Cuil isn’t really snarky, so I’ll post it. Cuil is no longer around, but it did spawn a funny post on Reddit about Cuil Theory.

Cuil launched this week. For a search engineer, a new search engine is like a Christmas present: you can’t wait to play with it. Most search engineers can get a good feel for the strengths/weaknesses of a new engine within 10-15 queries. And I’d like to think that with another 5-10 queries, I can usually figure out how I’d spam a search engine. It’s my job to protect Google’s index from spam, so naturally I’m intimately familiar with different webspam techniques. :)

What’s also fun is to figure out how a search engine provides various features. For example, for a Cuil search like [matt cutts] you’ll see the following categories:

Cuil categories

Where do those categories come from? Most people didn’t drill down that far, but it’s quite doable to figure out. If you want, take a few minutes to see if you can puzzle out how the categories are generated before reading on.

Google OS figured it out, for example: “Another interesting idea is the explorative category section that shows related Wikipedia categories and topics.” With a little work, it’s easy to verify that the right-hand box comes from Wikipedia category pages. For example, the string “matt cutts” occurs on the Wikipedia page for search engine optimization, and that page also includes a link to a search engine optimization consultants page. Sure enough, one of the categories listed for [matt cutts] is “Search Engine Optimization Consultants” and the entries under that category are from Wikipedia. Likewise, I think the Wikipedia page for Traffic Power and its link to a category page for black hat SEO probably accounts for why the category “Black_hat_seo” appears for my name.
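To make the idea concrete, here is a toy Python sketch of that kind of lookup: find pages whose text contains the query, then surface the categories those pages link to. This is my guess at the general shape, not Cuil’s actual code. The PAGES dictionary is a tiny hand-made stand-in for a Wikipedia index, its page text is placeholder text, and the third entry is purely hypothetical.

```python
# Toy stand-in for a Wikipedia index: page title -> (page text, categories linked from the page).
# The first two entries mirror the examples in this post; the text is abbreviated placeholder text.
PAGES = {
    "Search engine optimization": ("... Matt Cutts ...", ["Search Engine Optimization Consultants"]),
    "Traffic Power": ("... Matt Cutts ...", ["Black_hat_seo"]),
    "Some unrelated page (hypothetical)": ("... nothing relevant ...", ["Some other category"]),
}


def categories_for(query, pages):
    """Collect categories of every page whose text contains the query (case-insensitive)."""
    needle = query.lower()
    found = []
    for title, (text, categories) in pages.items():
        if needle in text.lower():
            for category in categories:
                if category not in found:  # keep first-seen order, drop duplicates
                    found.append(category)
    return found


print(categories_for("matt cutts", PAGES))
# prints: ['Search Engine Optimization Consultants', 'Black_hat_seo']
```

Notice that nothing in this lookup checks whether a category is actually about the query. It only checks that the query string appears somewhere on a page that links to the category.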

There’s nothing wrong with surfacing Wikipedia category pages, of course, but sometimes that can lead to some drift in topicality. For example, p2pnet wrote about a search for their name: “[The search query] p2pnet.net, however, gave Canadian copyright law, Project Gotham Racing Series, file sharing networks, Wired magazine people, and filesharing programs.” You can see the categories for the search [p2pnet.net] below:

Cuil categories for p2pnet

And this Wikipedia page has the string “p2pnet.net” and also has a category page for “Project Gotham Racing series”. The idea of surfacing Wikipedia category pages will have advantages and disadvantages depending on the user and the query.

Cheap internet-connected scale: Wii Balance Board + Linux

You can ignore this ancient “hairball” blog post. Gather round, kids, and witness this blog post from a time *before internet-connected scales*. That’s right. Back then, we had to hack our Wii balance boards to connect them to the internet. Of course now you can buy wifi-connected scales from Fitbit and Withings. But in the olden days, you had to hack something up or even write it down on paper!

You can easily make an internet-connected scale out of a Wii Balance Board and a Linux machine:

First, find a Bluetooth dongle and configure your Linux machine to talk to the Wiimote.

Next, apply a few extra patches so that your Linux machine can talk to a Wii Balance Board.

Finally, use some Python code to upload your weight to a Google Spreadsheet.

If you’d like to hear me describe how to hook everything together, you can watch me give a 7-8 minute talk about it (more info in that post), or you can watch it here:

Special thanks to Kevin Kelly and Gary Wolf for kickstarting the Quantified Self movement and encouraging me to talk about this project.
