
Regular Expressions (Regex)

A regex (short for regular expression) is a pattern that describes character sequences. Regexes are used to find matching text, which can then be replaced with something else or deleted.

You can test regexes at this site.

Regex Components

Anchors

Anchors are special characters that match a position in the input rather than an actual character.

Example Description
^string Selects “string” if it is at the start of the line
string$ Selects “string” if it is at the end of the line

Examples
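
A minimal sketch of both anchors, assuming Python's re module (the sample strings are made up for illustration):

```python
import re

# ^string: "string" must sit at the start of the line.
at_start = re.search(r"^string", "string theory") is not None
not_at_start = re.search(r"^string", "a string") is not None

# string$: "string" must sit at the end of the line.
at_end = re.search(r"string$", "a string") is not None
not_at_end = re.search(r"string$", "string theory") is not None

print(at_start, not_at_start)  # True False
print(at_end, not_at_end)      # True False
```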

Quantifiers

Quantifiers specify how many times the preceding character or group must occur for it to be selected.

Example Description
a+ Selects “a” one or more times.
a* Selects “a” zero or more times, including an empty string if no “a” is found.
a? Selects “a” one or zero times, including an empty string if no “a” is found.
a{X} Selects exactly X “a”s
a{X,Y} Selects between X and Y “a”s, inclusive
a{X,} Selects X or more “a”s

Examples
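
The sketch below checks each quantifier from the table against a few runs of “a”, assuming Python's re module. The helper name accepted and the sample list are hypothetical, and X and Y are replaced with the concrete values 2 and 3.

```python
import re

samples = ["", "a", "aa", "aaa", "aaaa"]

def accepted(pattern):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if re.fullmatch(pattern, s)]

print(accepted(r"a+"))      # ['a', 'aa', 'aaa', 'aaaa']
print(accepted(r"a*"))      # ['', 'a', 'aa', 'aaa', 'aaaa']
print(accepted(r"a?"))      # ['', 'a']
print(accepted(r"a{2}"))    # ['aa']
print(accepted(r"a{2,3}"))  # ['aa', 'aaa']
print(accepted(r"a{3,}"))   # ['aaa', 'aaaa']
```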

OR Operator

The OR operator “|” matches either the pattern on its left or the pattern on its right.

Examples
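
A short sketch of alternation, assuming Python's re module and made-up sample text. The second pattern wraps the alternatives in a group so that the OR applies to a single letter instead of the whole pattern.

```python
import re

text = "I have a cat, a dog, and a bird."

# Either "cat" or "dog".
pets = re.findall(r"cat|dog", text)
print(pets)  # ['cat', 'dog']

# The group limits the alternation to the middle letter.
spellings = re.findall(r"gr(?:a|e)y", "gray grey groy")
print(spellings)  # ['gray', 'grey']
```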

Character Classes

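A character class matches any single character listed inside “[]”. Starting the list with “^” negates it, so the class matches any single character that is not listed.

Character classes are often combined with quantifiers so that one quantifier applies to any of several characters.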

Example Description
[abc] Selects “a”, “b”, or “c”.
[^abc] Selects any single character except “a”, “b”, or “c”

Examples
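
A minimal sketch of a plain class, a negated class, and a class combined with a quantifier, assuming Python's re module and a made-up sample string:

```python
import re

text = "cab fee"

listed = re.findall(r"[abc]", text)      # any one of a, b, c
not_listed = re.findall(r"[^abc]", text) # any one character that is not a, b, c
runs = re.findall(r"[abc]+", text)       # one or more of a, b, c in a row

print(listed)      # ['c', 'a', 'b']
print(not_listed)  # [' ', 'f', 'e', 'e']
print(runs)        # ['cab']
```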

Bracket Expressions

Bracket expressions match any single character within a range, written inside “[]” as the first and last character separated by a hyphen.

Example Description
[0-9] Find any digit
[a-z] Find any lowercase letter
[A-Z] Find any uppercase letter

Examples
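
The ranges above can be tried out as follows. This is a sketch assuming Python's re module. The sample text is made up, and the last pattern shows that several ranges can share one pair of brackets.

```python
import re

text = "Room 42B, floor 7"

digits = re.findall(r"[0-9]", text)
lower_runs = re.findall(r"[a-z]+", text)
uppers = re.findall(r"[A-Z]", text)
digit_or_upper = re.findall(r"[0-9A-Z]+", text)

print(digits)          # ['4', '2', '7']
print(lower_runs)      # ['oom', 'floor']
print(uppers)          # ['R', 'B']
print(digit_or_upper)  # ['R', '42B', '7']
```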

Flags

Flags are optional settings put after the regex to change certain matching behavior. The set of available flags varies between regex engines.

Flag symbol Description
g Global. Finds every match rather than just the first occurrence.
m Multiline. ^ and $ match at the start and end of each line instead of only at the start and end of the whole input string.
i Case insensitive. “abc” is treated the same as “ABC”.
x Extended. Ignores whitespace within the regex.
s Single line. Allows the dot “.” to match newline characters.
u Unicode. Enables full Unicode matching, which is useful when working outside the ASCII range.
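
A sketch of how these flags look in Python's re module, which is an assumption of this example. Python passes flags as an argument instead of writing them after the regex, and it has no g flag: re.search returns the first match while re.findall returns all of them. The sample text is made up.

```python
import re

text = "cat\nCat\ncat"

# g: first match only versus every match.
first_only = re.search(r"cat", text).group()
every_match = re.findall(r"cat", text)

# i: ignore case.
any_case = re.findall(r"cat", text, re.IGNORECASE)

# m: ^ and $ match at the start and end of each line.
whole_input = re.findall(r"^cat$", text)
per_line = re.findall(r"^cat$", text, re.MULTILINE)

# s: the dot also matches newline characters.
dot_default = re.findall(r"cat.Cat", text)
dot_all = re.findall(r"cat.Cat", text, re.DOTALL)

# x: whitespace inside the pattern is ignored.
spaced = re.findall(r"c a t", text, re.VERBOSE)

print(first_only, every_match)  # cat ['cat', 'cat']
print(any_case)                 # ['cat', 'Cat', 'cat']
print(whole_input, per_line)    # [] ['cat', 'cat']
print(dot_default, dot_all)     # [] ['cat\nCat']
print(spaced)                   # ['cat', 'cat']
```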

Grouping and Capturing

Capturing groups allow you to create references which can be used later on.

Example Description
(abc) Captures “abc” as a group that can be referenced later
(?:abc) Groups “abc” without capturing it, so it is not added to the references

Back-references

Reference a capturing group with \1, \2, \3, etc. Groups are numbered by the order of their opening parentheses.

Examples
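
A minimal sketch of capturing groups and back-references, assuming Python's re module. The sample strings are made up. In Python the same \1, \2, \3 syntax also works inside the replacement text of re.sub.

```python
import re

# (\w+) captures a word and \1 requires the same text again.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test")
print(doubled.group(1))  # is

# Back-references in a replacement reorder the captured parts.
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", "2024-01-31")
print(swapped)  # 31/01/2024

# (?:ab) groups without capturing, so only (c) is numbered.
only_c = re.search(r"(?:ab)+(c)", "ababc").groups()
print(only_c)  # ('c',)
```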

Greedy and Lazy Match

Quantifiers are greedy by default, meaning they match as much as they can. Adding ? after the quantifier makes it lazy, meaning it matches as little as possible.

Example Description
a+? Selects as few “a”s as possible: a single “a” on its own, and more only if the rest of the pattern requires it
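
The difference is easiest to see side by side. This sketch assumes Python's re module and uses a made-up HTML snippet as input.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last ">" in the input.
greedy = re.findall(r"<.+>", html)
# Lazy: .+? stops at the first ">" it can.
lazy = re.findall(r"<.+?>", html)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']

# A lazy quantifier still takes more when the rest of the pattern needs it.
print(re.search(r"a+", "aaab").group())    # aaa
print(re.search(r"a+?", "aaab").group())   # a
print(re.search(r"a+?b", "aaab").group())  # aaab
```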

Boundaries

Boundaries allow you to find strings at the beginning of words or at the end of words.

Example Description
\bstring Find “string” if it is at the beginning of a word
string\b Find “string” if it is at the end of a word

Examples
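
A short sketch that reports where each boundary pattern matches, assuming Python's re module. The helper name starts and the sample text are hypothetical.

```python
import re

text = "stringent substring string"

def starts(pattern):
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

print(starts(r"\bstring"))    # [0, 20]  start of "stringent" and of "string"
print(starts(r"string\b"))    # [13, 20] end of "substring" and of "string"
print(starts(r"\bstring\b"))  # [20]     the whole word "string" only
```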

Look-ahead and Look-behind

Look-ahead and look-behind check whether a pattern matches ahead of or behind the current position without consuming any characters, so the checked text is not part of the match.

Example Description
abc(?=def) Selects “abc” only if it is followed by “def”, but doesn’t match the following “def”
abc(?!def) Selects “abc” only if it is not followed by “def”
(?<=def)abc Selects “abc” only if it is preceded by “def”, but doesn’t match the preceding “def”
(?<!def)abc Selects “abc” only if it is not preceded by “def”

Examples
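
The four forms can be compared on one input. This is a sketch assuming Python's re module. The helper name starts and the sample text are hypothetical.

```python
import re

text = "abcdef abcxyz defabc xyzabc"

def starts(pattern):
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

print(starts(r"abc(?=def)"))   # [0]         "abc" followed by "def"
print(starts(r"abc(?!def)"))   # [7, 17, 24] "abc" not followed by "def"
print(starts(r"(?<=def)abc"))  # [17]        "abc" preceded by "def"
print(starts(r"(?<!def)abc"))  # [0, 7, 24]  "abc" not preceded by "def"

# The looked-at text is checked but is not part of the match.
print(re.search(r"abc(?=def)", text).group())  # abc
```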

Metacharacters

Special characters with specific meanings.

Metacharacters Description
. Find any single character except a newline
.* Find any character zero or more times. Useful for selecting everything before or after another pattern.
\w Find a word character: a lowercase letter, uppercase letter, digit, or underscore.
\W Find anything that isn’t a word character.
\d Find any digit
\D Find any non-digit character
\s Find a whitespace character
\S Find any non-whitespace character
\0 Find null character
\n Find new line character
\f Find form feed character
\r Find carriage return character
\t Find tab character
\v Find vertical tab character
\ddd Find the character with the octal code ddd
\xYY Find the character with the hexadecimal code YY
\uYYYY Find the Unicode character with the hexadecimal code YYYY
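
A few of these metacharacters in action, as a sketch assuming Python's re module and a made-up sample string. Note that \w keeps “user_1” together because the underscore counts as a word character.

```python
import re

text = "user_1 paid $20\ton 3 May"

words = re.findall(r"\w+", text)
numbers = re.findall(r"\d+", text)
chunks = re.findall(r"\S+", text)
spaces = re.findall(r"\s", text)

print(words)    # ['user_1', 'paid', '20', 'on', '3', 'May']
print(numbers)  # ['1', '20', '3']
print(chunks)   # ['user_1', 'paid', '$20', 'on', '3', 'May']
print(spaces)   # [' ', ' ', '\t', ' ', ' ']

# The dot skips the newline by default.
dots = re.findall(r".", "a\nb")
print(dots)  # ['a', 'b']

# \x41 and \u0042 name the characters "A" and "B" by their hex codes.
by_code = re.fullmatch(r"\x41\u0042", "AB") is not None
print(by_code)  # True
```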

Regex Examples

What it’s matching Regex
Hex value /^#?([a-f0-9]{6}|[a-f0-9]{3})$/
Email /^([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,6})$/
URL /^(https?:\/\/)?([\da-z.-]+)\.([a-z.]{2,6})([\/\w .-]*)\/?$/
HTML Tag /^<([a-z]+)([^<]*)(?:>(.*)<\/\1>|\s+\/>)$/
HTML Comment //
Phone number /^\d{3}-\d{3}-\d{4}$/
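
A sketch of how two of these patterns could be used for validation, assuming Python's re module. Python has no /.../ delimiters, so only the pattern between the slashes is used. The phone pattern is anchored with $ at the end so that trailing characters are rejected, and the constant names and sample inputs are hypothetical.

```python
import re

HEX_VALUE = re.compile(r"^#?([a-f0-9]{6}|[a-f0-9]{3})$")
PHONE = re.compile(r"^\d{3}-\d{3}-\d{4}$")

def is_valid(pattern, value):
    """Return True if the compiled pattern matches the value."""
    return pattern.search(value) is not None

print(is_valid(HEX_VALUE, "#a3c113"))     # True
print(is_valid(HEX_VALUE, "4d8"))         # True
print(is_valid(HEX_VALUE, "#a3c11"))      # False (five digits)
print(is_valid(PHONE, "555-123-4567"))    # True
print(is_valid(PHONE, "555-123-45678"))   # False (extra digit)
```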

Backus-Naur Form Grammar (BNF)

BNF is similar to regex in that it is a notation for describing character patterns. Its rules are written as named definitions, and a rule can refer to itself.

Examples

<number> ::= <digit> | <number> <digit>
<digit>  ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
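
To make the recursion in the <number> rule concrete, here is a rough sketch of a recognizer written in Python. It assumes <digit> covers 0 through 9, and the names DIGITS and is_number are hypothetical. For this particular grammar the regex [0-9]+ describes the same strings, which the final loop checks on a few samples.

```python
import re

DIGITS = set("0123456789")  # assumption: <digit> covers 0 through 9

def is_number(s: str) -> bool:
    """Check s against <number> ::= <digit> | <number> <digit>."""
    if len(s) == 1:
        return s in DIGITS                            # <digit>
    if len(s) > 1:
        return is_number(s[:-1]) and s[-1] in DIGITS  # <number> <digit>
    return False                                      # empty input

for sample in ["7", "2024", "", "12a"]:
    same = is_number(sample) == bool(re.fullmatch(r"[0-9]+", sample))
    print(repr(sample), is_number(sample), same)
```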

Command line tools

See Linux