1. regex (Regular Expressions)

Lesson Content

Regular expressions are a powerful tool to do pattern based selection. It uses special notations similar to those we’ve encountered already such as the * wildcard.

We’ll go through a couple of the most common regular expressions, these are almost universal with any programming language.

Well use this phrase as our test string:

sally sells seashells 
by the seashore
  1. Beginning of a line with ^
^by
would match the line "by the seashore"
  1. End of a line with $
seashore$ 
would match the line "by the seashore"
  1. Matching any single character with .
b. 
would match "by" 
  1. Bracket notation with [] and ()

This can be a little tricky, brackets allow us to specify characters found within the bracket.

d[iou]g
would match: dig, dog, dug

The previous anchor tag ^ when used in a bracket means anything except the characters within the bracket.

d[^i]g
would match: dog and dug but not dig

Brackets can also use ranges to increase the amount of characters you want to use.

d[a-c]g
will match patterns like dag, dbg, and dcg

Be careful though as brackets are case sensitive:

d[A-C]g
will match dAg, dBg and dCg but not dag, dbg and dcg

And those are some basic regular expressions.

Exercise

Try to combine regular expressions with grep and search through some files.

grep [regular expression here] [file]

Quiz Question

# What regular expression would you use to match a single character? > - Regular Expression, or regex or regexp in short, is extremely and amazingly powerful in searching and manipulating text strings, particularly in processing text files. One line of regex can easily replace several dozen lines of programming codes. > - Regex is supported in all the scripting languages (such as Perl, Python, PHP, and JavaScript); as well as general purpose programming languages such as Java; and even word processors such as Word for searching texts. Getting started with regex may not be easy due to its geeky syntax, but it is certainly worth the investment of your time. 1. [ ] - 2. [ ] | 3. [ ] , 4. [x] .