Being able to whip up a regular expression in the ordinary course of data wrangling is one of those skills that separates the computing neophyte from the skilled. Years ago, documentation on the subject was scanty at best, relegated to man(1) pages and the occasional *NIX book footnote or appendix. With the widespread adoption of regular expressions in programming languages and the rise of the web, there’s been an explosion of poorly written Web tutorials which purport to teach you regular expressions, but seldom do more than give your shift and number keys a workout.
Ben Forta’s Learning Regular Expressions, published by Addison-Wesley, aims to change this, and for the most part does an excellent job. Situated somewhere between the first page of Google search results for “regular expression” and Jeffrey Friedl’s Mastering Regular Expressions, it provides a step-by-step tutorial on using regular expressions to match text. It is illustrated with common problems, including matching phone numbers, postal codes (from three different countries, a nice addition to the tried-and-true example), email addresses, URLs, and snippets of HTML.
The book’s chapters are structured as lessons, each adding on to what you’ve learned in previous chapters. A careful study of the material and working through the examples will bring you not just a basic understanding of word and pattern matching, but more advanced use of regular expressions including position matching, backreferences, look-ahead and look-behind, and even conditional embedding. I’ve been using regular expressions (poorly, I admit, for the most part) for thirty years, and I had several “Oh, that’s how you can do that” moments when reading, especially in the last few chapters.
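To give a flavor of two of those advanced topics, here is a minimal sketch in Python's `re` module. These are my own illustrations, not examples taken from the book; the sample patterns and the function names `find_doubled_words` and `find_prices` are made up for this post.

```python
import re

# Backreference: \1 must repeat whatever group 1 captured,
# so this pattern finds a word immediately followed by itself.
DOUBLED = re.compile(r"\b(\w+)\s+\1\b")

# Look-behind and look-ahead: match an amount that is preceded by "$"
# and followed by " USD", without including either in the match itself.
PRICE = re.compile(r"(?<=\$)\d+(?:\.\d{2})?(?= USD)")


def find_doubled_words(text):
    """Return each word that appears twice in a row (hypothetical helper)."""
    return [m.group(1) for m in DOUBLED.finditer(text)]


def find_prices(text):
    """Return amounts written as $N USD or $N.NN USD (hypothetical helper)."""
    return PRICE.findall(text)


if __name__ == "__main__":
    print(find_doubled_words("This is is a test of of the pattern"))
    print(find_prices("Costs $19.99 USD or $5 USD, not 7 USD or $8 CAD"))
```

The look-around assertions check the surrounding text without consuming it, which is why the dollar sign and the currency code never show up in the results.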
Each chapter provides a set of increasingly sophisticated examples. I hesitate to call this a cookbook, because a competitor I will not name has largely co-opted that format, and truthfully, it’s more didactic than culinary. The presentation works, however; for each section, you’re presented with a problem, some sample text, a regular expression that may or may not solve the problem, and then a discussion of the regular expression and why it did or did not work. Including regular expressions that do not satisfy your goal permits Forta to link one section to another, building on your expectation of how things might work to how they do work. It turns out to be an effective way to present the material, and it’s easy to follow along using a tool such as grep.
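The "try a pattern, see why it falls short, then refine it" rhythm looks roughly like the sketch below. The sample text and both patterns are my own invention, not the book's, and I've used Python's `re` module rather than grep so the two attempts can sit side by side.

```python
import re

# Made-up sample text: one real phone number, one order number that
# happens to contain the same run of digits.
SAMPLE = "Call 555-1234 today. Order no. 1555-12345 shipped."

# First attempt: three digits, a hyphen, four digits.
# It also matches inside the longer order number, which is not what we want.
NAIVE = re.compile(r"\d{3}-\d{4}")

# Refined attempt: \b anchors the match to word boundaries (position matching),
# so digits embedded in a longer run of digits are rejected.
REFINED = re.compile(r"\b\d{3}-\d{4}\b")


def matches(pattern, text=SAMPLE):
    """Return every non-overlapping match of pattern in text."""
    return pattern.findall(text)


if __name__ == "__main__":
    print("naive:  ", matches(NAIVE))
    print("refined:", matches(REFINED))
```

Seeing the first pattern over-match is what motivates the boundary anchors in the second, which is much the same way the book moves from one section to the next.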
The book closes with an appendix on the differences between several popular environments’ implementations of regular expressions. This is helpful, because almost every reader will come to the book with a slightly different expectation of where they will be using what they learn.
I found Forta’s step-by-step presentation refreshing without being condescending. I would have preferred perhaps a few additional examples on some of the more advanced topics, like look-ahead and look-behind, and even subexpressions. However, he does the job and does it quickly; a motivated reader can go from knowing nothing about the subject to being proficient in just a few evenings, and I found it a quick read.
This took a bit longer than I expected it to, but that’s usually the way things work when you don’t know what you’re doing. Now that it’s done, it works quite well: I’ve got Winlink Express running on my MacBook under High Sierra with Wine, with no Parallels or VMware Fusion needed!
Here are the steps…
Continue reading Winlink on Mac OS X with a TH-D74 over Bluetooth…
More wrong with this than right, especially in the middle, but it’s the first I’ve recorded to completion and kept in almost a year, so it’s worth putting somewhere for posterity.
Recorded Thanksgiving 2017 in Apache Junction, Arizona using an OP-1.
- Can an AI be taught to explain itself? Cliff Kuang, New York Times Magazine
This is a good account of some of the problems we face with machine learning today. There is a clear disconnect between the results you get with good applications of ML, and understanding why they work the way they do. I am not convinced, however, that just adding a second network on the side to explain the first really will solve the problem; it raises the question of how we will understand what that network is doing.
- Come On Eileen, Dexys Midnight Runners. It’s worth finding different versions of this song and listening, because there are some fun intros and exits you don’t hear on the usual radio mix! See the Wikipedia page for a nice discussion.