Thoughtstream | Courses

Understanding Perl Regular Expressions

This one- or two-day tutorial will introduce beginner and intermediate Perl programmers to the full functionality of Perl's regular expressions.

The first day focuses on the pattern-matching features in Perl 5.6 and 5.8. The morning sessions explore the principles and mechanisms underlying all Perl regular expressions. You'll see how the highly compact syntax of Perl patterns controls a built-in pattern-matching "engine", and learn how to design and construct Perl regexes to drive that engine efficiently. We'll also look at the four principal uses of regexes in Perl, discussing a number of uniquely Perlish regex "idioms". By lunchtime, Perl's regexes will no longer seem like a mystery wrapped in an enigma wrapped in line-noise.

In the afternoon we will look at the more advanced and powerful features of Perl regular expressions, such as code embedding, user-defined assertions, regex recursion, and backtracking control. These high-end features are not covered in most Perl textbooks or classes, yet understanding and being able to apply them is essential when dealing with large, real-world data sets. During the class we will work through several everyday, yet challenging, problems in information processing and see how Perl's regular expression mechanism can be tamed and harnessed to solve them. By the end of the first day, the full (and surprising) power of Perl's regular expressions will be at your command.
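
As a taste of these features, here is a minimal sketch (not taken from the course materials) of code embedding with (?{ ... }), regex recursion with (??{ ... }), and backtracking control with an atomic group (?> ... ). The variable names and sample strings are illustrative assumptions only.

```perl
use strict;
use warnings;

# Package variables are used here because older Perls handled lexical
# variables inside regex code blocks unreliably.
our $count = 0;
our $balanced;

# Code embedding: (?{ ... }) runs Perl code each time the engine reaches it.
# Here it counts how many times the "a" inside the group has been matched.
my $matched = ('aaaa' =~ /^(?:a(?{ $count++ }))*$/) ? 1 : 0;

# Regex recursion: (??{ ... }) evaluates code and matches the result as a
# sub-pattern, so $balanced can refer to itself to match nested parentheses.
# Backtracking control: the atomic group (?> ... ) never gives characters
# back, which keeps failed matches from backtracking excessively.
$balanced = qr/ \( (?: (?> [^()]+ ) | (??{ $balanced }) )* \) /x;

# Illustrative input: extract the first balanced parenthesised group.
my $text = 'f(a, g(b, c), d) + h(e)';
my ($args) = $text =~ /($balanced)/;

print "count = $count\n";
print "args  = $args\n";
```

In this sketch $count ends up as 4 and $args holds the first complete parenthesised group, including its nested call.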

The optional second day concentrates on the many new regex (and related) features introduced in Perl 5.10, including internally recursive patterns, named captures, new capture-related special variables, deprecated regex-related special variables, hyper-greedy quantifiers, backtracking control, relative backreferences, floating-length look-behind, new escape characters, and smart-matching.
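
The short sketch below illustrates a handful of these Perl 5.10 features; it is an illustration under stated assumptions, not an excerpt from the course. It assumes that "hyper-greedy quantifiers" refers to possessive quantifiers such as a++, and that \K is the escape behind the "floating-length look-behind" effect. All variable names and sample strings are hypothetical.

```perl
use strict;
use warnings;
use 5.010;    # the regex features below need Perl 5.10 or later

# Named captures: (?<name>...) stores its match in the %+ hash.
my ($year, $month);
if ('2008-01-15' =~ /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/) {
    ($year, $month) = ($+{year}, $+{month});
}

# Relative backreference: \g{-1} refers to the nearest preceding group,
# so this pattern finds a word that is immediately repeated.
my $doubled = 'it was the the best of times' =~ /\b(\w+)\s+\g{-1}\b/
    ? $1
    : undef;

# Possessive quantifier (assumed meaning of "hyper-greedy"): a++ grabs
# every "a" and never gives one back, so the trailing "a" cannot match.
my $possessive_matches = ('aaa' =~ /^a++a/) ? 1 : 0;
my $greedy_matches     = ('aaa' =~ /^a+a/)  ? 1 : 0;

# Internally recursive pattern: (?1) re-enters capture group 1, which
# lets a single pattern match arbitrarily nested parentheses.
my $text = 'f(a, g(b, c), d) + h(e)';
my ($first_call_args) = $text =~ /( \( (?: [^()]++ | (?1) )* \) )/x;

# \K discards everything matched so far from the reported match, which
# acts like a look-behind of any length in a substitution.
my $price = 'price: 100';
$price =~ s/price:\s*\K\d+/200/;

say "year=$year month=$month doubled=$doubled";
say "possessive=$possessive_matches greedy=$greedy_matches";
say "args=$first_call_args";
say $price;
```

Here the possessive match fails while the ordinary greedy one succeeds, and the substitution changes only the digits, leaving the "price: " prefix intact.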

Course format

1-day or 2-day seminar

Who should attend

Perl programmers from all disciplines who are familiar with the basics of Perl's control flow, string handling, and simple data structures (scalars, arrays, hashes).