Master Regular Expressions
A structured guide to syntax and patterns.
The Basics
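Regular expressions match characters literally unless they are metacharacters (such as . * + ? ^ $ [ ] ( ) { } | \), which carry special meaning and must be escaped with a backslash to be matched literally.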
hello
  Matches the exact string 'hello'. Case-sensitive by default.
123
  Matches the exact digits '123'.
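As a minimal sketch of literal matching, assuming a JavaScript engine (the variable names and sample strings here are illustrative, not part of the guide):

```javascript
// Literal patterns match their exact characters; case matters by default.
const helloPattern = /hello/;
const digitsPattern = /123/;

const literalResults = {
  found: helloPattern.test("say hello"),           // true: 'hello' appears in the text
  wrongCase: helloPattern.test("Hello"),           // false: 'H' is not 'h'
  ignoreCase: /hello/i.test("Hello"),              // true: the i flag ignores case
  digits: digitsPattern.test("order 123 shipped")  // true
};

console.log(literalResults);
```

Note that test() succeeds when the pattern occurs anywhere in the input, not only when the whole input equals the pattern.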
Character Classes
Match a predefined set of characters using shorthand escape codes.
\d
  Matches any single digit (0-9).
\w
  Matches any 'word' character (letters, digits, underscore).
\s
  Matches any whitespace character (space, tab, newline).
.
  The wildcard. Matches any character except a newline by default.
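The shorthand classes can be tried out with a short sketch like the following, assuming JavaScript; the sample string and the classMatches name are made up for illustration:

```javascript
// A sample string containing letters, digits, a space, and a tab.
const classSample = "Room 42\tis open";

const classMatches = {
  digits: classSample.match(/\d/g),      // each single digit: "4", "2"
  words: classSample.match(/\w+/g),      // runs of word characters
  whitespace: classSample.match(/\s/g),  // the space, the tab, the space
  dotOnNewline: /a.b/.test("a\nb"),      // false: '.' skips newlines by default
  dotAllFlag: /a.b/s.test("a\nb")        // true: the s flag lets '.' match newlines
};

console.log(classMatches);
```

The s (dotAll) flag shown on the last line is a JavaScript feature; other engines expose the same behavior under different option names.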
Sets and Ranges
Create custom lists of characters to match.
[abc]
  Matches either 'a', 'b', or 'c'.
[a-z]
  Matches any lowercase letter from 'a' to 'z'.
[^0-9]
  The ^ inside brackets negates the set. Matches any character that is NOT a digit.
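A small sketch of sets, ranges, and negation, assuming JavaScript and invented sample strings:

```javascript
const setSample = "cab-2024";

const setMatches = {
  abc: setSample.match(/[abc]/g),            // every 'a', 'b', or 'c' in order
  lower: "Hello World".match(/[a-z]+/g),     // runs of lowercase letters only
  nonDigits: setSample.match(/[^0-9]+/g)     // runs of anything that is not a digit
};

console.log(setMatches);
// abc       -> ["c", "a", "b"]
// lower     -> ["ello", "orld"]  (the capitals 'H' and 'W' are outside a-z)
// nonDigits -> ["cab-"]
```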
Quantifiers
Specify how many times the preceding character, set, or group should repeat.
a+
  Matches one or more 'a's.
a*
  Matches zero or more 'a's.
colou?r
  The '?' makes the 'u' optional. Matches 'color' and 'colour'.
\d{3}
  Matches exactly 3 digits.
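The quantifiers can be compared side by side with a sketch along these lines (JavaScript assumed; the inputs are illustrative):

```javascript
const quantifierMatches = {
  plus: "caaat ct".match(/a+/g),                       // one or more 'a': ["aaa"]
  starAllowsZero: /ca*t/.test("ct"),                   // true: a* may match zero 'a's
  optional: "color colour colouur".match(/colou?r/g),  // ["color", "colour"]
  exactCount: "12345".match(/\d{3}/g)                  // ["123"]: "45" is too short
};

console.log(quantifierMatches);
```

In the last case the engine consumes "123" first, and the remaining "45" has only two digits, so it does not produce a second match.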
Anchors
Lock the pattern to specific positions in the text.
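^Start
  Matches 'Start' only at the beginning of the string, or at the beginning of each line when multiline mode is enabled.
End$
  Matches 'End' only at the end of the string, or at the end of each line when multiline mode is enabled.
\bword\b
  Word boundary. Ensures 'word' is matched as a whole word, not as part of 'sword'.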
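A brief sketch of how anchors behave, assuming JavaScript, where the m flag switches ^ and $ from string boundaries to line boundaries (sample text invented for illustration):

```javascript
const anchorSample = "Start here\nStart again";

const anchorResults = {
  // Without the m flag, ^ matches only at the start of the whole string.
  startOfString: anchorSample.match(/^Start/g).length,   // 1
  // With the m flag, ^ also matches after each newline.
  startOfLines: anchorSample.match(/^Start/gm).length,   // 2
  endMatches: /End$/.test("The End"),                    // true
  endRejected: /End$/.test("End of story"),              // false: 'End' is not last
  wholeWord: /\bword\b/.test("a word here"),             // true
  insideSword: /\bword\b/.test("a sword here")           // false: no boundary between 's' and 'w'
};

console.log(anchorResults);
```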
Lookaround Assertions
Lookarounds are zero-width assertions: they check what comes ahead of or behind the current position without including it in the match.
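\w+(?=\s+is\b)
  Positive Lookahead: Matches a word that is followed by whitespace and the word 'is'.
\b\d{3}(?!-\d{4})
  Negative Lookahead: Matches 3 digits NOT followed by a hyphen and 4 digits.
(?<=\$)\d+
  Positive Lookbehind: Matches digits preceded by a dollar sign.
\b\w+\b(?<!\bthe\b)
  Negative Lookbehind: Matches any word except 'the'. The lookbehind runs at the end of the word and rejects the match when the characters just before that position are the whole word 'the'. To skip words that are preceded by 'the', place the lookbehind first: (?<!\bthe\s)\b\w+.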
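The four lookaround patterns can be exercised with a sketch like this one, assuming a JavaScript engine with lookbehind support. The sample strings are invented, and the last pattern is an illustrative variant that is not part of the guide's list:

```javascript
const lookaroundMatches = {
  // Only "This" is followed by whitespace and the word 'is'.
  lookahead: "This is it. That was it.".match(/\w+(?=\s+is\b)/g),
  // The first "555" passes. The "555" before "-1234" is rejected, but "123"
  // (the first three digits of "1234") is not followed by a hyphen, so it matches.
  negativeLookahead: "ext 555, line 555-1234".match(/\b\d{3}(?!-\d{4})/g),
  // Only the digits directly after '$' match.
  lookbehind: "Cost: $45 for 3 items".match(/(?<=\$)\d+/g),
  // The lookbehind is checked at the END of each word, so it filters out
  // the word 'the' itself, not the words that come after it.
  negativeLookbehind: "the cat and the hat".match(/\b\w+\b(?<!\bthe\b)/g),
  // Illustrative variant (an assumption, not from the guide): checking BEFORE
  // the word skips words that directly follow 'the' plus one whitespace character.
  notAfterThe: "the cat and the hat".match(/(?<!\bthe\s)\b\w+/g)
};

console.log(lookaroundMatches);
// lookahead          -> ["This"]
// negativeLookahead  -> ["555", "123"]
// lookbehind         -> ["45"]
// negativeLookbehind -> ["cat", "and", "hat"]
// notAfterThe        -> ["the", "and", "the"]
```

Where the assertion sits in the pattern decides which position it inspects, which is why the last two patterns give such different results on the same input.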
Named Capture Groups
Give your capture groups meaningful names instead of relying on numeric indices, which makes patterns more readable and maintainable.
(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})
  Named groups for date parsing. Access captures by name instead of index.
(?<protocol>https?):\/\/(?<domain>[^/]+)(?<path>\/[^\s]*)?
  Named groups for URL parsing: protocol, domain, and optional path.
(?<area>\d{3})-(?<exchange>\d{3})-(?<number>\d{4})
  Named groups for phone number components.
(?<hours>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2})
  Named groups for time components.
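Reading named captures might look like the following sketch, assuming JavaScript, where a match object exposes them on its groups property. The input strings and variable names are illustrative:

```javascript
// Date pattern from the list above.
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const dateMatch = datePattern.exec("Released on 2024-03-15.");
const { year, month, day } = dateMatch.groups;

// Named groups can also be referenced in a replacement string with $<name>.
const reordered = "2024-03-15".replace(datePattern, "$<day>/$<month>/$<year>");

// URL pattern from the list above; the path group is optional.
const urlPattern = /(?<protocol>https?):\/\/(?<domain>[^/]+)(?<path>\/[^\s]*)?/;
const urlGroups = urlPattern.exec("https://example.com/docs/regex").groups;
// When an optional group does not take part in the match, its value is undefined.
const missingPath = urlPattern.exec("http://example.com").groups.path;

console.log(year, month, day);   // 2024 03 15
console.log(reordered);          // 15/03/2024
console.log(urlGroups.protocol, urlGroups.domain, urlGroups.path);
console.log(missingPath);        // undefined
```

Real code would usually check that exec() did not return null before reading groups; that check is left out here to keep the sketch short.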
Unicode Properties
Match characters by their Unicode properties, which works with international text, emoji, and special characters. In JavaScript, \p{...} requires the 'u' flag; other engines differ in whether and how they support it.
\p{L}+
  Unicode Letters: matches letters from any language, including accented characters and non-Latin scripts.
\p{N}+
  Unicode Numbers: matches numbers, including Roman numerals, fractions, and digits from other scripts.
[\p{Emoji}]
  Emoji Characters: matches any character with the Unicode Emoji property (if supported by the engine). That property also covers the digits 0-9, '#', and '*', so \p{Extended_Pictographic} is often a closer fit for pictographic emoji.
[^\p{L}\p{N}\s]+
  Non-alphanumeric symbols: matches runs of characters that are not letters, numbers, or whitespace, such as symbols and punctuation.
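A hedged sketch of the Unicode property patterns, assuming a JavaScript engine recent enough to support property escapes with the u flag. Some sample characters are written as \u escapes so that the code points are unambiguous:

```javascript
const unicodeMatches = {
  // "caf\u00E9" is 'cafe' with an acute accent on the e; the last word is Japanese.
  letters: "caf\u00E9 東京".match(/\p{L}+/gu),
  // \u00BD is the fraction one-half, \u2167 is Roman numeral eight,
  // \u0663 is the Arabic-Indic digit three.
  numbers: "5 \u00BD \u2167 \u0663".match(/\p{N}+/gu),
  emoji: "ok 👍 done".match(/[\p{Emoji}]/gu),
  // ASCII digits carry the Emoji property too, so this is true.
  digitCountsAsEmoji: /[\p{Emoji}]/u.test("7"),
  // Extended_Pictographic excludes plain digits.
  digitIsPictographic: /\p{Extended_Pictographic}/u.test("7"),
  symbols: "a+b=c!".match(/[^\p{L}\p{N}\s]+/gu)
};

console.log(unicodeMatches);
```

Without the u flag, JavaScript treats \p as an escaped letter 'p', so these patterns would silently stop matching Unicode properties rather than raising an error.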