BALUG: Mark Shuttleworth and Digital Tipping Point

Last night I drove into San Francisco for a meeting of the BALUG (Bay Area Linux Users Group). I’d never been to a BALUG meeting before, but Mark Shuttleworth (the founder of the Ubuntu distribution of Linux) was speaking and I wanted to size up Mark in person. He acquitted himself well. He spoke about the good, the bad, and the ugly of open-source as he sees it and then closed with some stories of being the first African into space. Here’s a (somewhat grainy) photo I took as Mark was speaking:

Mark Shuttleworth

A few impressions that I came away with:

– he cares a lot about the Linux desktop experience and likes to focus on that. That’s good, because a lot of people in the Linux community pay attention to the kernel and “user space” doesn’t interest them as much.
– he believes that collaboration should be a strong point of open source. Mark mentioned bug tracking as an example: bug reports and debugging logs should flow seamlessly to developers without a lot of extra work.
– Mark did a good job of giving props to Red Hat, Novell, and even Microsoft when he thought they deserved it. I thought this was an especially wise move and gave him more credibility than if he had taken potshots at competitors. Mark cited making software cheaper as one good thing Microsoft has done, although he didn’t see a need to license patents from them. I got the idea that Mark thinks that injecting venom into discussions about open-source doesn’t do favors for the community in the long-term.
– At the same time, Mark said that if open source believes that it has more powerful ideas long-term, open-source proponents shouldn’t shy away from engaging in productive/respectful conversations that may eventually win over (say) manufacturers of proprietary hardware so that they allow open-source drivers.

Overall, Shuttleworth seemed to espouse a nice balance of principles and pragmatism. He was a polished speaker and handled the after-speaking mob of people with grace and good humor, even when some folks wanted to talk about the minutiae of their favorite Linux project for a few minutes. I came away with a higher level of respect for Mark, Ubuntu, and Canonical, and my interest level was already pretty high.

The night brought a few other fortuitous surprises. I got a couple tips about where to start hacking on my OLPC, which just arrived a couple days ago (thanks for the pointers, Charles!). I found out about the Alameda County Computer Resource Center, which takes donations of old computers, refurbishes them, and then donates them to schools, non-profits, and other people that need a computer.

But my favorite surprise was walking by two people and hearing the phrase “Digital Tipping Point.” I’m a huge fan of the Digital Tipping Point blog. Officially, DTP is an “open source film project about the big changes that open source software will bring to our world.” So the film project includes a lot of individual interviews about open-source. But the reason that I love reading the DTP blog is that it provides anecdotes of open-source success without sarcasm, rancor, or the venom that some blogs have. If you’re a Linux fan, I think you’ll find that Digital Tipping Point is genuinely uplifting and cheerful. I keep Digital Tipping Point in my “fun” folder of Google Reader, so it was a pleasure to meet Christian Einfeldt, the producer of the documentary:

Digital Tipping Point

It was great to run into someone by coincidence and to be able to say “Hey, I love your blog. It brings a smile to my face and is a good example of what I like about the web.” All in all, it was a fun night.

Download, slice and dice podcasts on Linux

I’m trying to replace my Windows applications with Linux applications. On Windows, I use Juice to download podcasts as MP3s. Recently I decided to switch over to Linux for receiving podcasts. After looking around at various podcast catchers (especially ones that ran on the command-line, so that I could automate them with a cron job), I ran across Podracer. I decided to combine Podracer with a script to split long MP3s into shorter MP3s so that I could play them more easily in my car. Here’s what I did on my Ubuntu Linux machine:

Step 1: Install and configure podracer

I used these commands:
sudo apt-get install podracer
mkdir ~/.podracer
vim ~/.podracer/subscriptions
and add the url of a podcast, e.g. http://feeds.webmasterradio.fm/tdsc for The Daily SearchCast.

cp /etc/podracer.conf ~/.podracer/podracer.conf
Edit ~/.podracer/podracer.conf so that you can pick the download directory you want. I changed
#poddir=$HOME/podcasts/$(date +%Y-%m-%d)
to
poddir=$HOME/rawpodcasts
because I want all my podcasts in one directory where I can do a batch process over them afterwards. Go ahead and run “mkdir ~/rawpodcasts” to create the directory that podcasts will be stored in.

sudo vim /usr/bin/podracer
(it’s okay, Podracer is a shell script). Find the line that says
m3u=$(date +%Y-%m-%d)-podcasts.m3u
and comment it out so that podracer won’t automatically create an .m3u playlist as it downloads podcasts.

Run podracer in “catchup” mode to avoid downloading all the old podcasts from your subscriptions with “podracer -c”. podracer will create a file ~/.podracer/podcast.log to keep a record of all the podcasts that have been downloaded (the “-c” catchup mode creates this text file without actually downloading the MP3s). If you want to re-download a file (e.g. while you’re testing your configuration), you can edit the file ~/.podracer/podcast.log and just delete the line for any MP3 you want to re-download.
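If you'd rather not edit podcast.log by hand while testing, here's a minimal sketch of a helper that drops matching lines for you. The function name forget_podcast is made up for this example, and it assumes that each downloaded episode is recorded as one line containing its URL or filename:

```shell
# Remove every line mentioning a given string from a podracer-style log,
# keeping a .bak copy of the original.
# Assumption: each downloaded episode is one line that contains its URL.
forget_podcast() {
    log_file=$1
    pattern=$2
    cp "$log_file" "$log_file.bak"
    # -F treats the pattern literally; -v keeps the lines that do not match.
    # grep exits with status 1 when no lines are left, which is fine here.
    grep -vF -- "$pattern" "$log_file.bak" > "$log_file" || true
}
```

You might call it as forget_podcast ~/.podracer/podcast.log some-episode.mp3 and then run podracer again to re-download that file.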

Step 2: Install and configure mp3splt (optional)

At a terminal window, type “sudo apt-get install mp3splt”. In Step 1, we configured Podracer to download podcasts as MP3s into a “rawpodcasts” directory. In this step, we’re going to take those long MP3s and split them into individual segments into a new “finishedpodcasts” directory. Make the “finishedpodcasts” directory with the command “mkdir ~/finishedpodcasts”.

Make a file /home/username/download-mp3s-and-process.sh that looks like this.

#!/bin/bash

# Run podracer to download any new podcasts
/usr/bin/podracer

# Now split the podcasts into segments
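for i in /home/username/rawpodcasts/*.mp3
do
# Skip the unexpanded pattern if there are no MP3s to process
[ -e "$i" ] || continue
# Quote filenames so that podcasts with spaces in their names still work
nicename=$(basename "$i" .mp3)
# Send both stderr and stdout to /dev/null so that this is a quiet cron job
mp3splt -eqd /home/username/finishedpodcasts -o "$nicename-@n" "$i" &> /dev/null
done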

This script will run podracer to download any new podcasts. Then we list all the MP3 files in the rawpodcasts directory and run mp3splt on each podcast. If you had a file test.mp3, you would be running the command

“mp3splt -eqd /home/username/finishedpodcasts -o test-@n test.mp3 &> /dev/null”

for example. What do the options to mp3splt mean?

-e means “split on sync errors.” If someone created an mp3 by concatenating multiple mp3s (e.g. with a program such as mp3wrap), that could cause sync errors. mp3splt looks at those sync errors to split the concatenated mp3 back into multiple mp3 files.

-q stands for “quiet.” Don’t ask user to respond to any questions. Normally “-e” says something like

Mp3Splt 2.1 (2004/Sep/28) by Matteo Trotta
THIS SOFTWARE COMES WITH ABSOLUTELY NO WARRANTY! USE AT YOUR OWN RISK!
MPEG 1 Layer 3 – 44100 Hz – Joint Stereo – 256 Kb/s – Total time: 35m.04s
Processing file to detect possible split points, please wait…
Total tracks found: 6
Is this a reasonable number of tracks for this file? (y/n)

Quiet mode suppresses this interactive question on the last two lines above.

-d is the directory to place the split mp3s.

-o lets you specify an output file. “@n” stands for the track number after splitting. So if test.mp3 were made out of two mp3 files, the output of the command above would be two files (in the finishedpodcasts directory) named test.mp3-001.mp3 and test.mp3-002.mp3 . It doesn’t hurt to run mp3splt on existing mp3s because it will just overwrite any old files that had been created.
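Before letting a script like this loose on a directory full of podcasts, it can help to see which commands it would run. Here's a rough dry-run sketch that only prints the mp3splt command for each MP3 and never runs it. The function name print_split_commands and its two directory arguments are my own invention for illustration:

```shell
# Dry run: print the mp3splt command that would be run for each MP3
# instead of running it. Both directories are passed in as arguments.
print_split_commands() {
    raw_dir=$1
    finished_dir=$2
    for mp3 in "$raw_dir"/*.mp3; do
        # If the glob matched nothing, the literal pattern comes back; skip it.
        [ -e "$mp3" ] || continue
        nicename=$(basename "$mp3" .mp3)
        printf 'mp3splt -eqd %s -o %s-@n %s\n' "$finished_dir" "$nicename" "$mp3"
    done
}
```

For example, print_split_commands ~/rawpodcasts ~/finishedpodcasts lists one command per downloaded podcast, and prints nothing if the directory is empty.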

Step 3: Periodically download and process podcasts

To download podcast files periodically and process them, make a crontab entry for podracer or your script. This will make the cron daemon run your script every few hours to download new mp3s.

I typed “crontab -e” and made the file look like this:

# At 3:03 am, 8:03 am, 10:03 am, 12:03 pm, and 4:03 pm, run this script
3 3,8,10,12,16 * * * /home/username/download-mp3s-and-process.sh

Whenever you’re ready to put the podcasts on some type of media (SD Card, iPod, iPhone, whatever), just copy over anything from the finishedpodcasts directory (if you used mp3splt in step 2) or the rawpodcasts directory if you skipped step 2. Then delete anything left over in either directory.
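The copy-then-delete step can be sketched as a small helper that moves the MP3s onto the media in one go. The name move_podcasts is hypothetical, and so is any mount point you pass in (your SD card or player may show up somewhere else entirely):

```shell
# Move every MP3 from a source directory onto the media, so that the
# source directory is left empty for the next batch.
# Prints the number of files moved.
move_podcasts() {
    src_dir=$1
    media_dir=$2
    moved=0
    for mp3 in "$src_dir"/*.mp3; do
        # Skip the literal pattern when there is nothing to move
        [ -e "$mp3" ] || continue
        # Across filesystems, mv copies the file and then removes the original
        mv -- "$mp3" "$media_dir"/ && moved=$((moved + 1))
    done
    echo "$moved"
}
```

Something like move_podcasts ~/finishedpodcasts /media/sdcard would do the job, where /media/sdcard is only a placeholder. If you split your podcasts in step 2, the originals in rawpodcasts still need to be deleted separately.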

How to back up your Gmail on Linux in four easy steps

I really like Gmail, but I also like having backups of my data just in case. Here’s how to use a simple program called getmail on Unix to back up your Gmail or Google Apps email. We’ll break this into four steps.


Step 0: Why getmail?

If you browse around on the web, you’ll find several options to help you download and back up your email. Here are a few:

Step 1: Install getmail

On Ubuntu 7.10 (Gutsy Gibbon), you would type

sudo apt-get install getmail4

at a terminal window. Hey, that wasn’t so bad, right? If you use a different flavor of Linux, you can download getmail and install it with a few commands like this:

cd /tmp
[Note: wget the tarball download link found at http://pyropus.ca/software/getmail/#download ]
tar xzvf getmail*.tar.gz
cd (the directory that was created)
sudo python setup.py install

Step 2: Configure Gmail and getmail

First, turn on POP in your Gmail account. Because you want a copy of all your mail, I recommend that you choose the “Enable POP for all mail” option. On the “When messages are accessed with POP” option, I would choose “Keep Gmail’s copy in the Inbox” so that Gmail still keeps your email after you back up your email.

For this example, let’s assume that your username is bob@gmail.com and your password is bobpassword. Let’s also assume that you want to back up your email into a directory called gmail-archive and that your home directory is located at /home/bob/.

I have to describe a little about how mail is stored in Unix. There are a couple well-known methods to store email: mbox and Maildir. When mail is stored in mbox format, all your mail is concatenated together in one huge file. In the Maildir format, each email is stored in a separate file. Needless to say, each method has different strengths and weaknesses. For the time being, let’s assume that you want your email in one big file (the mbox format) and work through an example.

Example with mbox format

– Make a directory called “.getmail” in your home directory with the command “mkdir ~/.getmail”. This directory will store your configuration data and the debugging logs that getmail generates.
– Make a directory called gmail-archive with the command “mkdir ~/gmail-archive”. This directory will store your email.
– Make a file ~/.getmail/getmail.gmail and put the following text in it:

[retriever]
type = SimplePOP3SSLRetriever
server = pop.gmail.com
username = bob@gmail.com
password = bobpassword

[destination]
type = Mboxrd
path = ~/gmail-archive/gmail-backup.mbox

[options]
# print messages about each action (verbose = 2)
# Other options:
# 0 prints only warnings and errors
# 1 prints messages about retrieving and deleting messages only
verbose = 2
message_log = ~/.getmail/gmail.log

Added: Run the command “touch ~/gmail-archive/gmail-backup.mbox” . If you change the path in the file above, touch whatever filename you used. This command creates an empty file that getmail can then append to.

The file format should be pretty self-explanatory. You’re telling getmail to fetch your email from pop.gmail.com via a POP3 connection over SSL (which prevents people from seeing your email as it passes between Gmail and your computer). The [destination] section tells where to save your email, and in what format. The “Mboxrd” is a flavor of the mbox format — read this page on mbox formats if you’re really interested. Finally, we set options so that getmail generates a verbose log file that will help in case there are any snags.
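Since this config file holds your password in plain text, one option is to generate it from a script and lock down its permissions at the same time. This is only a sketch: the function name write_getmail_config and its arguments are invented for the example, and the chmod step is my addition rather than something getmail requires.

```shell
# Write a getmail config like the one above and make it readable only by
# its owner, since it holds a plain-text password.
# The chmod step is an addition of this sketch, not one of the original steps.
write_getmail_config() {
    config_file=$1
    mail_user=$2
    mail_password=$3
    mbox_path=$4
    # Create (or empty) the file and restrict it before the password goes in
    : > "$config_file"
    chmod 600 "$config_file"
    cat >> "$config_file" <<EOF
[retriever]
type = SimplePOP3SSLRetriever
server = pop.gmail.com
username = $mail_user
password = $mail_password

[destination]
type = Mboxrd
path = $mbox_path

[options]
verbose = 2
message_log = ~/.getmail/gmail.log
EOF
}
```

With the example account from this post, that would be write_getmail_config ~/.getmail/getmail.gmail bob@gmail.com bobpassword '~/gmail-archive/gmail-backup.mbox' (the quotes keep the shell from expanding the ~, so the path lands in the file exactly as shown in the config above).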

Example with Maildir format

Suppose you prefer Maildir instead? You’d still run “mkdir ~/.getmail” and “mkdir ~/gmail-archive”. But the Maildir format uses three directories (tmp, new, and cur). We need to make those directories, so type “mkdir ~/gmail-archive/tmp ~/gmail-archive/new ~/gmail-archive/cur” as well. In addition, change the [destination] section to say

[destination]
type = Maildir
path = ~/gmail-archive/

Otherwise your configuration file is the same.
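The directory setup for Maildir is easy to wrap in a tiny helper. Here's a minimal sketch (make_maildir is just a name I picked) that creates the three subdirectories and is safe to run more than once:

```shell
# Create the tmp, new, and cur subdirectories that the Maildir format uses.
make_maildir() {
    maildir=$1
    # -p creates missing parent directories and does not complain
    # if the directories already exist
    mkdir -p "$maildir/tmp" "$maildir/new" "$maildir/cur"
}
```

Running make_maildir ~/gmail-archive covers both the “mkdir ~/gmail-archive” step and the three subdirectories.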

Step 3: Run getmail

The good news is that step 2 was the hard part. Run getmail with a command like “getmail -r /home/bob/.getmail/getmail.gmail” (use the path to the config file that you made in Step 2). With any luck, you’ll see something like

getmail version 4.6.5
Copyright (C) 1998-2006 Charles Cazabon. Licensed under the GNU GPL version 2.
SimplePOP3SSLRetriever:bob@gmail.com@pop.gmail.com:995:
msg 1/99 (7619 bytes) from <info@example.com> delivered to Mboxrd /home/bob/gmail-archive/gmail-backup.mbox
msg 2/99 (6634 bytes) from <sales@example.com> delivered to Mboxrd /home/bob/gmail-archive/gmail-backup.mbox

99 messages retrieved, 0 skipped
Summary:
Retrieved 99 messages from SimplePOP3SSLRetriever:bob@gmail.com@pop.gmail.com:995

Hooray! It works! But wait, I have over 99 messages, you say. Why did it only download 99 messages? The short answer is that Gmail will only let you download a few hundred emails at a time. You can repeat the command (let getmail finish each time before you run it again) until all of your email is downloaded.
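Repeating the command by hand gets old fast, so here's a rough sketch of a loop that reruns a fetch command until it stops reporting new mail. It leans on an assumption: that the command prints a summary line of the form “N messages retrieved” like the sample output above, which means getmail has to run verbosely rather than with -q. The name fetch_until_done and the max_runs safety cap are mine.

```shell
# Rerun a fetch command until it no longer reports retrieved messages.
# Usage: fetch_until_done MAX_RUNS command [args...]
# Assumption: the command prints a line like "99 messages retrieved".
fetch_until_done() {
    max_runs=$1
    shift
    run=0
    while [ "$run" -lt "$max_runs" ]; do
        run=$((run + 1))
        # Capture the command's output; give up if the command itself fails
        if ! output=$("$@" 2>&1); then
            echo "fetch failed on run $run"
            return 1
        fi
        # A non-zero message count means there may be more mail waiting
        if ! printf '%s\n' "$output" | grep -Eq '[1-9][0-9]* messages.*retrieved'; then
            echo "done after $run run(s)"
            return 0
        fi
    done
    echo "stopped after $max_runs runs; mail may remain"
    return 2
}
```

For example: fetch_until_done 50 getmail -r /home/bob/.getmail/getmail.gmail. Each run finishes before the next one starts, and the cap keeps the loop from running forever if the output format turns out to be different from what I assumed.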

Step 4: Download new email automatically

A backup is a snapshot of your email at one point in time, but it’s even better if you download and save new email automatically. (This step will also come in handy if you have a ton of Gmail and don’t want to run the command from Step 3 over and over again for hours to download all your mail.)

We’re going to make a simple cron job that runs periodically to download new email and preserve it. First, make a very short file called /home/bob/fetch-email.sh and put the following text in the file:

#!/bin/bash
# Note: -q means fetch quietly so that this program is silent
/usr/bin/getmail -q -r /home/bob/.getmail/getmail.gmail

Make sure that the file is readable/executable with the command “chmod u+rx /home/bob/fetch-email.sh”. If you want to make sure the program works, run the command “/home/bob/fetch-email.sh”. The program should execute without generating any output, but if there’s new email waiting for you it will be downloaded. This script needs to be silent or else you’ll get warnings when you run the script using cron.

Now type the command “crontab -e” and add the following entry to your crontab:

# Every 10 minutes (at 7 minutes past the hour), fetch my email
7,17,27,37,47,57 * * * * /home/bob/fetch-email.sh

This crontab entry tells cron “Every 10 minutes, run the script fetch-email.sh”. If you wanted to check less often (maybe once an hour), change “7,17,27,37,47,57” to “7” and the cron job will run at 7 minutes after every hour. That’s it — you’re done! Enjoy the feeling of having a Gmail backup in case your net connection goes down.
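If you want a different interval, the minute list is simple to generate instead of typing it out. This is a small sketch; cron_minutes is a made-up helper name, and it takes the starting minute and the step in minutes:

```shell
# Print a comma-separated list of minutes for the first crontab field.
# Usage: cron_minutes START STEP
cron_minutes() {
    minute=$1
    step=$2
    # Refuse a step of zero or less, which would loop forever
    [ "$step" -ge 1 ] || return 1
    list=""
    while [ "$minute" -lt 60 ]; do
        # Add a comma before every entry except the first
        list="${list:+$list,}$minute"
        minute=$((minute + step))
    done
    echo "$list"
}
```

Running cron_minutes 7 10 prints 7,17,27,37,47,57, which is the list used in the crontab entry above, and cron_minutes 7 60 prints just 7 for the once-an-hour case.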

Bonus info: Back up in both mail formats at once!

As I mentioned, mbox and Maildir have different advantages. The mbox format is convenient because you only need to keep track of one file, but editing/deleting email from that huge file is a pain. And when one program is trying to write new email while another program is trying to edit the file, things can sometimes go wrong unless both programs are careful. Maildir is more robust, but it chews through inodes because each email is a separate file. It also can be harder to process Maildir files with regular Unix command-line tools, just because there are so many email files.

Why not archive your email in both formats just to be safe? The getmail program can easily support this. Just change your [destination] information to look like this:

[destination]
type = MultiDestination
destinations = ('[mboxrd-destination]', '[maildir-destination]')

[mboxrd-destination]
type = Mboxrd
path = ~/gmail-archive/gmail-backup.mbox

[maildir-destination]
type = Maildir
path = ~/gmail-archive/

Note that you’ll still have to run all the “mkdir” commands to make the “gmail-archive” directory, as well as the tmp, new, and cur directories under the gmail-archive directory.

Bonus reading!

What, you’re still here? Okay, if you’re still reading, here are a few pointers you might be interested in:
– The main getmail site includes a page with lots of getmail examples of configuration files. The getmail website has a ton of great documentation, too. Major props to Charles Cazabon for his getmail program.
– This write-up from about a year ago covers how to back up Gmail as well.
– The author of getmail seems to hang out quite a bit on this getmail mailing list. See the main site for directions on signing up for the list.
– If you’re interested in a more powerful setup (e.g. using Gmail + getmail + procmail), this is a useful page.
– For the truly sadistic, learn the difference between a Mail User Agent (MUA) and a Mail Transfer Agent (MTA) and how email really gets delivered in Unix.
– I’ve been meaning to write all this down for months. Jeff Atwood’s recent post finally pushed me over the edge. Jeff describes a program that offers to “archive your Gmail” for $29.95, but when you give the program your username/password it secretly mails your username/password to the program’s creator. That’s pretty much pure evil in my book. And the G-Archiver program isn’t even needed! Because Gmail will export your email for free using POP or IMAP, it’s not hard to archive your Gmail. So I wrote up how I back up my Gmail in case it helps anyone else. Enjoy!

Added March 16, 2008: Several people have added helpful comments. One of my favorites led me to a post by commenter Peng about how to back up Gmail with IMAP using getmail. Peng describes how to back up the email by label as well. He mentions that you could use the search “after:2007/1/1 before:2007/3/31” and assign the label FY07Q1 to the search results, for example. Then you can back up that single label/mailbox by making the getmail config file look like this:

[retriever]
type = SimpleIMAPSSLRetriever
server = imap.gmail.com
username = username
password = password
mailboxes = ("FY07Q1",)

[destination]
type = Mboxrd
path = ~/.getmail/gmail-backup-FY07Q1.mbox

Peng also mentions a nice bonus: since you’re backing up via IMAP instead of POP, there’s no download limit. That means that you don’t have to run the getmail program repeatedly. Thanks for mentioning that, Peng!

Summer of Code 2008: 21 potential projects

Yay! Google is opening up its “Summer of Code” for 2008. Google’s Summer of Code program encourages students to tackle open-source projects over the summer break. For 2-3 more days, sponsor organizations are invited to apply and then students can apply starting March 24th. I’ve been thinking about some projects that I’d enjoy seeing. If anyone needs ideas, I’d love to see some of these projects:

  1. Synergy is a fantastic way to control multiple computers with one mouse/keyboard. You can even move your mouse directly from one computer’s display seamlessly to the other computer’s display. But Synergy currently doesn’t support drag/drop between computers or file transfer between computers. Adding drag/drop sounds like a great summer project to me. 🙂
  2. Tweak Google Browser Sync to synchronize user-defined spelling dictionaries between computers.
  3. I don’t know about anyone else, but personally I’d love a Google Account sign-in for WordPress. If you wrote the plug-in in a pretty general way, you could no doubt also use it for OpenID, Yahoo accounts, Live IDs, TypeKey, etc. But mainly I’d like to use my Google Account to leave authenticated comments on WordPress blogs.
  4. Better tools to reverse engineer USB devices so that they can work on Linux in user space.
  5. Extend Firefox 3 Places to provide more complete tab history: when a tab was closed, whether a page was opened in a new tab or a new window, and basically anything to let a user see a “watertight” view of their surfing.
  6. Better screencasting support for Linux. Windows has Camtasia and CamStudio. The tools on Linux just aren’t as polished.
  7. Better video editing software in Linux. Or get parts of GIMPshop into the GIMP.
  8. Better podcasting support in Audacity.
  9. A Firefox extension or Greasemonkey script to report webspam to Google. The extension would let you report a spam webpage from a button on a toolbar. The extension would also add a “Report as Spam!” button to Google’s search results page. 🙂
  10. A simple tool to back up your entire Google account (email, calendar, docs, feeds, etc.).
  11. Work to make Ubuntu more suitable for the coming wave of computers with Linux pre-installed (Wal-mart PC, Asus EEE).
  12. The world always needs better, more streamlined virtualization.
  13. Beef up the features on brainstorm.ubuntu.com . For example, let each user see all the ideas that they’ve voted for.
  14. Help out on Google Gears or Android.
  15. Make Juice run better on Linux.
  16. Make Asterisk easier to install and configure.
  17. Implement the functionality of the dragdropupload extension directly into Firefox.
  18. Add the ability to drag/drop images in WordPress, and let WordPress handle uploading the image to a preferred location on your webhost.
  19. Make the standard version of Ubuntu boot even faster.
  20. Add some nice features to Wine, ReactOS or Abiword.
  21. How about a good open-source program to manage your book library? Something like the Delicious Library program, but that works with Linux?

What about you? If you could request a student to work on any open-source project this summer, what would you ask for?

Ubuntu annoyance: asks for DVD

Sometimes when you install Ubuntu (a flavor of Linux) and then try to install new packages, you get this annoying message:

Media change: please insert the disc labeled
‘Ubuntu 7.10 _Gutsy Gibbon_ – Release i386 (20071016)’
in the drive ‘/cdrom/’ and press enter

To fix that message, click on

System->Administration->Software Sources and uncheck the “CD-ROM/DVD” option at the bottom of the menu:

Uncheck the DVD option
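If you'd rather fix this from a terminal, the same change can be sketched as a one-line edit. This assumes the install disc shows up as a line starting with “deb cdrom:” in /etc/apt/sources.list, which is my understanding of how that checkbox is stored, so check the file before relying on it. The function name disable_cdrom_sources is made up for this example:

```shell
# Comment out install-disc entries in an apt sources file.
# Assumption: those entries are lines that start with "deb cdrom:".
disable_cdrom_sources() {
    sources_file=$1
    # -i.bak edits the file in place and keeps the original as FILE.bak;
    # "&" in the replacement stands for the text that was matched
    sed -i.bak 's/^deb cdrom:/# &/' "$sources_file"
}
```

To apply it for real, you would run the sed command with sudo against /etc/apt/sources.list and follow it with “sudo apt-get update”. Trying it on a copy of the file first is the safer route.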
