Wanted: bookmarks.html merging program

You can ignore this “hairball” blog post. This post dates back to a time when people actually curated, saved, and managed their bookmarks.html files. Then Google Chrome introduced the ability to save and sync all your bookmarks, extensions, etc. in the cloud. Now I just sign in to Chrome and everything syncs automatically.

Over the years, I’ve accumulated lots of bookmarks.html files. I’d love someone to write an App Engine program that would let you upload bookmarks.html files and would merge them all into one master file. After that, you could prune useless bookmarks, especially the ones that a new browser installs by default.
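
To make the merge-then-prune idea concrete, here’s a minimal sketch in Python. It assumes each uploaded file has already been parsed into a list of (url, title) pairs; the function name `merge_bookmarks`, the `unwanted_urls` parameter, and the sample URLs are all made up for illustration. It treats two bookmarks as duplicates only when their URLs match exactly, which is just one possible rule.

```python
def merge_bookmarks(*bookmark_lists, unwanted_urls=()):
    """Merge lists of (url, title) pairs into one master list.

    Assumption: bookmarks with the same exact URL are duplicates, and
    the first title seen for a URL wins. URLs listed in unwanted_urls
    (for example, browser defaults) are pruned.
    """
    unwanted = set(unwanted_urls)
    merged = {}  # dicts keep insertion order in Python 3.7+
    for bookmarks in bookmark_lists:
        for url, title in bookmarks:
            if url in unwanted:
                continue  # prune bookmarks we never want in the master file
            merged.setdefault(url, title)  # keep the first title for each URL
    return list(merged.items())


# Hypothetical sample data standing in for two parsed bookmarks.html files.
laptop = [
    ("https://example.com/", "Example"),
    ("https://example.org/start", "Getting Started"),
]
desktop = [
    ("https://example.com/", "Example (duplicate)"),
    ("https://example.net/", "Example Net"),
]

master = merge_bookmarks(
    laptop, desktop, unwanted_urls=["https://example.org/start"]
)
print(master)
```

A real tool would probably also want to preserve folders and normalize URLs (trailing slashes, http vs. https) before comparing them; this sketch skips both.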

Why do it on Google App Engine?

Because it would be an easy way to get started. Essentially you want to upload a small set of files to one web location from several different computers, and then do something interesting with that data. App Engine is perfect for that kind of thing.

Can App Engine’s version of Python parse bookmarks.html files?

The Mozilla/Firefox bookmarks.html file format is a little strange, but not too strange. I found a few programs to parse bookmarks.html files. For example, one fellow wrote a Python program to merge bookmarks using sgmllib, which I’m guessing would work on App Engine.
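
For a feel of what parsing involves, here’s a rough sketch that pulls (url, title) pairs out of a bookmarks.html file. It is not the sgmllib program mentioned above: sgmllib was removed in Python 3, so this swaps in the standard library’s `html.parser` instead. The `BookmarkParser` class, the `parse_bookmarks_html` helper, and the sample file contents are all illustrative, and folders (the H3/DL nesting) are ignored.

```python
from html.parser import HTMLParser


class BookmarkParser(HTMLParser):
    """Collect (url, title) pairs from the <A HREF=...> links in a bookmarks file."""

    def __init__(self):
        super().__init__()
        self.bookmarks = []
        self._href = None        # URL of the link currently being read, if any
        self._title_parts = []   # text chunks seen inside the current link

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF> arrives as a/href.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._title_parts = []

    def handle_data(self, data):
        if self._href is not None:
            self._title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._title_parts).strip()
            self.bookmarks.append((self._href, title))
            self._href = None


def parse_bookmarks_html(text):
    """Return a list of (url, title) pairs found in bookmarks.html text."""
    parser = BookmarkParser()
    parser.feed(text)
    parser.close()
    return parser.bookmarks


# A tiny made-up file in the Netscape/Firefox bookmark format.
SAMPLE = """<!DOCTYPE NETSCAPE-Bookmark-file-1>
<TITLE>Bookmarks</TITLE>
<H1>Bookmarks</H1>
<DL><p>
    <DT><H3>Reading</H3>
    <DL><p>
        <DT><A HREF="https://example.com/" ADD_DATE="1200000000">Example Site</A>
        <DT><A HREF="https://example.org/docs">Docs &amp; Guides</A>
    </DL><p>
</DL><p>
"""

print(parse_bookmarks_html(SAMPLE))
```

The format never closes its DT and p tags, which is part of why it feels strange, but an event-style parser like this one doesn’t care because it only watches for the links.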

Digging into it more, it looks like several people like Beautiful Soup as a parser. First off, you can download it as a single Python file to work in App Engine. It also looks pretty easy to use. I like this short example of extracting favicons to .ico files from a bookmarks.html file using Beautiful Soup. At least one other person has released tools to manipulate bookmarks.html files with Beautiful Soup.

Can you upload files to Google App Engine?

Yes! There’s evidently a limit of 10MB on uploaded files, but my biggest bookmark file was about 500K, and I suspect most people have much smaller bookmark files. Stack Overflow has a good example of file uploading in Google App Engine, plus there are official examples, and people have helped each other to the point of showing live examples.

Plus browsers are getting better about uploading files to the web easily. Google Chrome supports really easy drag-and-drop file upload. I think Safari supports drag-and-drop file upload as well? And I know Firefox has the dragdropupload extension that eases uploading files to the web.

What about uploading Google Chrome bookmarks files?

Ah, a person after my own heart. The short answer is that Google Chrome can export bookmarks in a format that looks like Firefox to me. Click on the Wrench, then “Bookmark manager,” then Tools->Export Bookmarks… to get a bookmarks.html file. The more fun answer is that “C:\Documents and Settings\{$USER}\Local Settings\Application Data\Google\Chrome\User Data\Default” appears to have a “Bookmarks” file, and it appears to be in JSON format. Can Python parse JSON? It can; Yahoo mentions that simplejson is a great library to use, and it turns out that Google App Engine supports simplejson very easily. Just say “from django.utils import simplejson” to use simplejson. So it wouldn’t be hard to upload raw Chrome bookmark files either.
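
Here’s a rough sketch of reading a raw Chrome “Bookmarks” file. Two assumptions to flag loudly: it uses the `json` module that has shipped with Python since 2.6 rather than the simplejson import above, and it assumes the layout I believe Chrome uses, namely a top-level “roots” object whose folders hold a “children” list and whose bookmark entries have “type”, “name”, and “url” keys. The helper names and the sample data are made up.

```python
import json


def iter_chrome_bookmarks(node):
    """Recursively yield (url, title) pairs from one node of the bookmark tree."""
    if not isinstance(node, dict):
        return  # skip any non-folder values stored alongside the roots
    if node.get("type") == "url" and "url" in node:
        yield node["url"], node.get("name", "")
    for child in node.get("children", []):
        yield from iter_chrome_bookmarks(child)


def parse_chrome_bookmarks(text):
    """Return a list of (url, title) pairs from Chrome 'Bookmarks' JSON text."""
    data = json.loads(text)
    results = []
    # Assumed layout: data["roots"] maps names like "bookmark_bar" to folders.
    for root in data.get("roots", {}).values():
        results.extend(iter_chrome_bookmarks(root))
    return results


# Made-up sample that mimics the assumed file layout.
SAMPLE_JSON = """
{
  "roots": {
    "bookmark_bar": {
      "type": "folder",
      "name": "Bookmarks bar",
      "children": [
        {"type": "url", "name": "Example", "url": "https://example.com/"},
        {
          "type": "folder",
          "name": "Docs",
          "children": [
            {"type": "url", "name": "Guide", "url": "https://example.org/guide"}
          ]
        }
      ]
    },
    "other": {"type": "folder", "name": "Other bookmarks", "children": []}
  },
  "version": 1
}
"""

print(parse_chrome_bookmarks(SAMPLE_JSON))
```

Because the result is a plain list of (url, title) pairs, bookmarks read this way could be merged with ones parsed from bookmarks.html files without any special handling.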

Aren’t there existing websites to do this?

Maybe, but I don’t know of them. I thought that Foxmarks might be able to do this. Foxmarks (like the now-defunct Google Browser Sync) can synchronize bookmarks across multiple computers. And Foxmarks provides a my.foxmarks.com web interface that lets you manipulate and export your bookmarks, but you can’t upload a raw bookmarks.html file to Foxmarks; instead, you have to upload/sync bookmarks via a browser extension. If Foxmarks added the ability to upload bookmarks.html files (vote for that idea here), that would be pretty sweet.