Finally… updates to the Mac OS X eReader scripts!

I’ve been using Kovid Goyal’s excellent Calibre application for a few months now to manage ebooks on my Sony eReader. It’s a great tool for managing my library of ebooks, and can also automatically download, convert, and install news from any RSS feed you throw at it. I’ve converted a lot of stuff I had in LIT format using it, and been very pleased with Kovid’s work overall.

At the same time, I’ve used Feedbooks as a source for ebooks, especially classic literature. Feedbooks provides all the formats you’d expect, and its ePub output looks really nice on the eReader. It also provides reading lists so I can tag content I want to download later, and has the makings of a social network for bookworms through those lists. I frequently download ePubs from Feedbooks and drop them on the eReader via Calibre, so I can keep both current events and literature on the device.

Last night, I took a look at the News section of Feedbooks, and was I impressed! They have a lot of RSS feeds they’re aggregating and formatting, and it looks great on the eReader. Sadly, since I’m using Mac OS X, I can’t use their News Stand application to automate content downloads when I attach the eReader.

But I liked the idea of getting the news formatted content straight from a server; not only would it be faster, but I wouldn’t have to either leave Calibre running or launch it every morning. Since automating the download and installation of Feedbooks news content is an extension of what I’d already done previously for PDF printing, I figured it was time to do a bit more hacking.

Feedbooks has a nice REST API, which they document on their site. While they don’t directly document their news endpoint yet, it’s based on the same service, so it’s easy to pull content using curl, which is really cool. In about three minutes, I was able to put together a little bash script that pulled a book from Feedbooks and dropped it on my eReader.

Before I began working on the automated solution, I figured it was time to move everything to launchd, as I said I’d do three months ago. Launchd can start scripts at predetermined times or when a volume is mounted, so it doesn’t rely on the Finder and Automator; instead I can just write a few scripts and stitch them together with launchd directly. I won’t bore you with the details, as there are a lot of good tutorials on how to use launchd; head over to Google and noodle around if you’ve never done anything with it. Instead, I’ll give you another hint: Lingon. With Lingon, writing a launchd plist is as easy as filling out a form. For example, I was able to ditch all the Automator junk to detect the eReader and instead create a launchd entry to do it by filling out a form, like this:

Lingon Configuration for Automating eReader PDF Transfers
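If you’d rather skip the form, here is a rough sketch of a shell function that prints the kind of job definition a mount-triggered launchd entry boils down to. Label, ProgramArguments, and StartOnMount are standard launchd keys; the label and script path in the example are placeholders I made up for illustration, not the values from my own setup.

```shell
#!/bin/bash
# Sketch: print a launchd job that runs a script whenever any volume mounts.
# The label and script path passed in are hypothetical placeholders.
make_mount_plist() {
    local label=$1 script=$2
    cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>${label}</string>
    <key>ProgramArguments</key>
    <array>
        <string>${script}</string>
    </array>
    <key>StartOnMount</key>
    <true/>
</dict>
</plist>
EOF
}

# Example (paths are assumptions):
#   make_mount_plist com.example.ereader-transfer "$HOME/bin/transfer.sh" \
#       > "$HOME/Library/LaunchAgents/com.example.ereader-transfer.plist"
#   launchctl load "$HOME/Library/LaunchAgents/com.example.ereader-transfer.plist"
```

For a job that should run on a timer instead, StartInterval (a number of seconds) takes the place of StartOnMount.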

I wrote a second script that pulls a few Feedbooks news items and drops them in a second news directory; launchd just calls this script on a regular basis so that I now have reasonably up-to-date news stories anytime I cable the eReader. Doing this correctly was a bit finicky; because all of this runs on my laptop, I wanted to make sure that a failed download couldn’t overwrite a previous copy. That way, if the laptop wasn’t on the network, the previous file could still be put on the eReader. I ended up using curl to pull content to working files in /tmp, and checking the result code from curl before copying the results to an intermediate spool directory where the results would wait until I tethered the eReader.
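The download-then-promote idea looks roughly like the sketch below. This is an illustration rather than the script itself: the function name, the URL, and the spool path in the example are all assumptions.

```shell
#!/bin/bash
# Sketch: download to a scratch file, and only replace the spooled copy
# if curl reports success. URL and directory names are hypothetical.
fetch_to_spool() {
    local url=$1 spool=$2 name=$3
    local tmp
    tmp=$(mktemp "/tmp/fetch.XXXXXX") || return 1

    # -f makes curl return non-zero on HTTP errors (e.g. 404),
    # -s silences progress, -S still prints real errors.
    if curl -f -s -S -o "$tmp" "$url"; then
        mkdir -p "$spool"
        mv "$tmp" "$spool/$name"      # previous copy replaced only on success
    else
        rm -f "$tmp"                  # offline or failed: keep the old file
        return 1
    fi
}

# Example (hypothetical URL and spool directory):
#   fetch_to_spool "http://example.com/news/today.epub" "$HOME/spool/news" "today.epub"
```

Without -f, curl exits with 0 even when the server answers with an error page, so the result-code check would happily spool a 404 page as if it were a book.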

Finally, I reworked the transfer script quite a bit to simplify things. It’s now two scripts: transfer.sh, which launchd starts whenever I connect the eReader, and update-ereader.sh, which actually does the updating of both PDF and ePub files from Feedbooks. The new transfer.sh script invokes an Automator action that prompts me once all the files have been moved to the eReader, so I can tether and go easily, or go back and start Calibre and move more content to the eReader.
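To give a feel for the update step, here is a minimal sketch of copying a spool directory to the reader. It is not the contents of update-ereader.sh; the function name and both paths in the example are hypothetical, and the real mount point depends on how your reader’s volume is named.

```shell
#!/bin/bash
# Sketch: copy everything in a spool directory to the reader, but only if
# the reader's volume is actually mounted. Both paths are hypothetical.
update_reader() {
    local spool=$1 dest=$2

    # Nothing to do if the destination isn't there (reader not attached).
    [ -d "$dest" ] || return 0

    local f
    for f in "$spool"/*; do
        [ -f "$f" ] || continue       # skips the literal "*" when the spool is empty
        cp -p "$f" "$dest/"           # spool copy stays put for the next sync
    done
}

# Example (assumed paths):
#   update_reader "$HOME/spool/news" "/Volumes/READER/news"
```

Quoting "$spool"/* rather than parsing ls output keeps file names with spaces in one piece.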

As is usual with my shell scripting, these scripts demonstrate my cargo cult mentality; there are probably better ways to do much of this using bash. But I’m pleased that I no longer need to leave Calibre running all the time and yet getting my news is as seamless as getting what I’ve printed. You’re welcome to use these scripts as you see fit, of course; grab this and have fun. A word of warning: expect to do some customization (you’ll want to edit the files and specify paths, and you’ll need to set up the launchd magic on your own). Although all of this is still Mac OS X centric, I suspect that if you’re using Linux there may be pieces you can use here, too, especially now that I’ve ditched most of the dependencies on Automator. (I’ve not looked into how to detect eReader mounts on Linux, though.)

Speaking of cargo cult programming, here are a few tidbits I picked up that may come in handy for others…

Most important to me from a cosmetic perspective was getting spaces back into file names and dropping the PDF extension on files. CUPS happily converts spaces in file names to underscores, which is great for scripting, but looks really ugly on the eReader display. A quick pass over a filename with sed fixed that:
dest=$(echo "$base" | sed -e 's/_/ /g' -e 's/\.pdf$//')
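The same cleanup can also be done with bash parameter expansion, which avoids spawning sed at all. This is an alternative sketch, not what the scripts use, and pretty_name is a name I made up for the example.

```shell
#!/bin/bash
# Sketch: turn a CUPS-style file name into a display name without sed.
pretty_name() {
    local name=${1%.pdf}             # drop a trailing ".pdf" only
    printf '%s\n' "${name//_/ }"     # replace every underscore with a space
}

pretty_name "Quarterly_Report.pdf"   # prints: Quarterly Report
```

Because ${1%.pdf} only strips the suffix, a name like notes.pdf_draft.pdf keeps the ".pdf" in the middle.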
Of course, if you’ve got spaces in file names, bash itself is going to get mad when you try to walk over the list of files in a directory; instead of something like
for f in `ls $spool/*`; do... done
write
find "${spool}" -type f -print0 | while IFS= read -r -d '' f; do ... done
The first will fail if any file name has whitespace, while the second uses find to grab each file name, terminate it with a null, and pass it to read, which reads the whole file name up to the trailing null appended by find and stashes the resulting file name in the variable f. Nifty trick, that.
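One wrinkle with the piped form: the while loop runs in a subshell, so any variables it sets are gone once the loop ends. A sketch of the same null-delimited loop fed by process substitution avoids that; list_files is a hypothetical helper name used only for this example.

```shell
#!/bin/bash
# Sketch: walk file names safely, even when they contain spaces.
list_files() {
    local dir=$1 f
    # Process substitution keeps the loop in the current shell, so variables
    # set inside it survive (a "find | while" pipeline runs in a subshell).
    while IFS= read -r -d '' f; do
        printf '[%s]\n' "${f##*/}"       # print just the base name, in brackets
    done < <(find "$dir" -type f -print0)
}

# Example:
#   list_files "$spool"
```

IFS= keeps leading and trailing whitespace in each name, and -r stops read from eating backslashes.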

One thing that still bugs me about this solution is that launchd starts the transfer operation on any mount operation, not just the eReader card mount. That’s a launchd behavior; it starts scripts on volume mounts, not a specific volume mount, and gets twitchy if you tell it to watch a file and the file comes and goes (so you can’t trigger on a directory being mounted). Because I have a card in the eReader, the eReader presents two separate mount events, and launchd invokes the transfer script twice. I’d hacked around that previously using a lock file, which is the usual UNIX pattern for dealing with this sort of thing, but didn’t really like it. This version of the scripts uses ps to see if the script is already running, like this:
me=$0
ps -ax | grep $me | grep -v grep | grep -v $$ > /dev/null
if [ $? -eq 0 ]; then exit 0; fi
I have this nasty feeling that spawning all those grep commands is actually slower than simply testing for a file on the file system, though. Oh, well — performance isn’t that important, as the entire process is bound by the speed of writing to the eReader’s flash file system over USB anyway.
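For comparison, here is a sketch of the lock approach done with a directory instead of a file. This swaps in a different technique from the ps check above: mkdir either creates the directory or fails, so the test and the claim happen in one step. The function names and the lock path are hypothetical.

```shell
#!/bin/bash
# Sketch: use a directory as a lock. mkdir either creates it or fails,
# so two scripts started at the same moment cannot both win.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

release_lock() {
    rmdir "$1" 2>/dev/null
}

# Example (lock path is hypothetical):
#   lock=/tmp/ereader-transfer.lock
#   acquire_lock "$lock" || exit 0      # another copy is already running
#   trap 'release_lock "$lock"' EXIT    # clean up however the script ends
#   ... do the transfer ...
```

The catch is the one that makes lock files annoying in general: if the script is killed hard enough that the trap never runs, the stale lock has to be removed by hand.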

Finally, by separating the update operation from the mount detection, I can update the eReader when other content generation operations (such as a scheduled news download or my printing something else) occur. The script that fetches my news kicks the update script, as does the CUPS printing job: I just stick a call to the update-ereader.sh script as the value of the PostProcessing directive in /etc/cups/cups-pdf.conf. (Actually, the script that runs there is bigger than that; it does some stuff to try to strip margins and growls completion of each print job… but I’m not ready to talk about that, as it’s shamefully fragile at the moment.)
