Regex Cheat Sheet
Complete interactive reference of regular expression syntax organized by category. Search patterns, copy with one click, and test against live input instantly.
Character Classes (13)

| Syntax | Description | Example |
|---|---|---|
| `.` | Any character except newline (with s flag, matches newline too) | `/h.t/g` |
| `\d` | Digit, equivalent to `[0-9]` | `/\d+/g` |
| `\D` | Non-digit, equivalent to `[^0-9]` | `/\D+/g` |
| `\w` | Word character, equivalent to `[a-zA-Z0-9_]` | `/\w+/g` |
| `\W` | Non-word character, equivalent to `[^a-zA-Z0-9_]` | `/\W+/g` |
| `\s` | Whitespace: spaces, tabs, newlines | `/\s+/g` |
| `\S` | Non-whitespace character | `/\S+/g` |
| `[abc]` | Character set: matches any one character listed | `/[aeiou]/g` |
| `[^abc]` | Negated set: matches any character NOT listed | `/[^aeiou]+/g` |
| `[a-z]` | Character range: matches any character from a to z | `/[a-z]+/g` |
| `[A-Za-z0-9]` | Combined range: alphanumeric characters | `/[A-Za-z0-9]+/g` |
| `\t` | Tab character | `/\t/g` |
| `\n` | Newline character | `/\n/g` |

Quantifiers (10)

| Syntax | Description | Example |
|---|---|---|
| `*` | 0 or more, greedy (as many as possible) | `/go*/g` |
| `+` | 1 or more, greedy | `/go+/g` |
| `?` | 0 or 1, makes the preceding token optional | `/colou?r/g` |
| `{n}` | Exactly n repetitions | `/\d{3}/g` |
| `{n,}` | n or more repetitions | `/\d{2,}/g` |
| `{n,m}` | Between n and m repetitions (inclusive) | `/\d{2,3}/g` |
| `*?` | 0 or more, lazy (as few as possible) | `/<.*?>/g` |
| `+?` | 1 or more, lazy | `/<.+?>/g` |
| `??` | 0 or 1, lazy (prefers 0) | `/colou??r/g` |
| `{n,m}?` | Between n and m, lazy | `/\d{2,4}?/g` |

Anchors (6)

| Syntax | Description | Example |
|---|---|---|
| `^` | Start of string (or start of line with m flag) | `/^hello/g` |
| `$` | End of string (or end of line with m flag) | `/world$/g` |
| `\b` | Word boundary: position between a word character and a non-word character | `/\bcat\b/g` |
| `\B` | Non-word boundary: position not at a word boundary | `/\Bcat\B/g` |
| `^` | With m flag: matches start of each line | `/^\w+/gm` |
| `$` | With m flag: matches end of each line | `/\w+$/gm` |

Groups & References (6)

| Syntax | Description | Example |
|---|---|---|
| `(abc)` | Capturing group: captures the matched substring | `/(\w+)\s(\w+)/g` |
| `(?:abc)` | Non-capturing group: groups without creating a capture | `/(?:foo)+/g` |
| `(?<name>abc)` | Named capturing group: accessible by name in match result | `/(?<year>\d{4})-(?<month>\d{2})/g` |
| `\1` | Backreference: matches same text as first capture group | `/(\w+) \1/g` |
| `\k<name>` | Named backreference: matches same text as named capture group | `/(?<word>\w+) \k<word>/g` |
| `a\|b` | Alternation: matches a or b | `/cat\|dog\|bird/g` |

Lookaround (4)

| Syntax | Description | Example |
|---|---|---|
| `(?=abc)` | Positive lookahead: matches if followed by the pattern (not consumed) | `/\d+(?= dollars)/g` |
| `(?!abc)` | Negative lookahead: matches if NOT followed by the pattern | `/\d+(?! dollars)/g` |
| `(?<=abc)` | Positive lookbehind: matches if preceded by the pattern (not consumed) | `/(?<=\$)\d+/g` |
| `(?<!abc)` | Negative lookbehind: matches if NOT preceded by the pattern | `/(?<!\$)\d+/g` |

Flags (7)

| Flag | Description | Example |
|---|---|---|
| `g` | Global: find all matches, not just the first | `/\d+/g` |
| `i` | Case-insensitive: match regardless of letter case | `/hello/gi` |
| `m` | Multiline: ^ and $ match start/end of each line | `/^\w+/gm` |
| `s` | Dotall: . matches newline characters too | `/.+/gs` |
| `u` | Unicode: enables full Unicode support and Unicode escapes | `/\u{1F600}/gu` |
| `y` | Sticky: match starting at lastIndex only (no searching ahead) | `/\d+/gy` |
| `d` | Indices: provides start/end indices for each match and group (ES2022) | `/(\w+)/gd` |

You’re writing a regex to match email addresses and can’t remember: is it \w+ or [a-zA-Z0-9_]+? Does \b match at the start of a line or at a word boundary? Is (?<=...) a lookbehind or a lookahead? You need a quick-scan reference organized by category, not a 5000-word tutorial.
Why This Cheat Sheet (Not the Regex Tester or Library)
PureDevTools has a Regex Tester for testing expressions against text and a Regex Library with 35+ ready-made patterns. This cheat sheet is an interactive syntax reference — every regex construct organized by category (character classes, quantifiers, anchors, groups, lookaround, flags) with descriptions, examples, and copy buttons. Use this to look up syntax; use the tester to verify your regex works; use the library for common patterns.
What Is a Regular Expression?
A regular expression (regex or regexp) is a sequence of characters that defines a search pattern. Regex engines scan a string and find substrings that match the pattern. They are available in virtually every programming language, either built in or through a standard or widely used library (JavaScript, Python, Go, Java, Rust, PHP, Ruby, Perl), and in command-line tools like grep, sed, and awk and editors like VS Code and Vim.
Regex is used for:
- Validation: checking that an email, phone number, URL, or postal code matches the expected format
- Search and replace: globally substituting one pattern with another in code or text
- Parsing: extracting specific parts of structured text (log files, config files, HTML)
- Lexing: tokenising source code in compilers and syntax highlighters
- Routing: matching URL paths in web frameworks (Express, Django, Rails)
How the Regex Engine Works
The engine reads the pattern and the subject string character by character. At each position it asks: “does the pattern match starting here?” If not, it advances one character and tries again. This is called left-to-right scanning.
Two broad strategies determine how quantifiers behave:
Greedy quantifiers (*, +, {n,m}) match as many characters as possible, then back off if the overall match fails. <.+> applied to <b>bold</b> greedily consumes the whole string and backtracks until it finds the last >, producing <b>bold</b>.
Lazy quantifiers (*?, +?, {n,m}?) match as few characters as possible, expanding only when needed. <.+?> on the same input produces <b> because it stops at the first >.
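The greedy and lazy behaviour described above can be checked directly. This is a minimal sketch in JavaScript; the variable names are only illustrative.

```javascript
const html = "<b>bold</b>";

// Greedy: .+ runs to the end of the string, then backs off to the last ">".
const greedy = html.match(/<.+>/)[0];

// Lazy: .+? grows one character at a time and stops at the first ">".
const lazy = html.match(/<.+?>/)[0];

// With the g flag, the lazy version finds each tag separately.
const tags = html.match(/<.+?>/g);

console.log(greedy); // "<b>bold</b>"
console.log(lazy);   // "<b>"
console.log(tags);   // ["<b>", "</b>"]
```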
Character Classes
Character classes match a single character from a defined set.
| Syntax | Matches |
|---|---|
| `.` | Any character except newline (\n). With the s (dotall) flag, matches newline too |
| `\d` | Digit: `[0-9]` |
| `\D` | Non-digit: `[^0-9]` |
| `\w` | Word character: `[a-zA-Z0-9_]` |
| `\W` | Non-word character: `[^a-zA-Z0-9_]` |
| `\s` | Whitespace: space, tab, newline, carriage return, form feed |
| `\S` | Non-whitespace |
| `[abc]` | Exactly one of: a, b, or c |
| `[^abc]` | Any character except a, b, or c |
| `[a-z]` | Any lowercase letter a through z |
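As a quick illustration of the classes in the table, the sketch below runs a few of them against made-up sample strings (the strings and variable names are assumptions for the example).

```javascript
const sample = "Order #42: 3 apples, 10 pears_ok";

// \d+ collects runs of digits; \w+ collects runs of letters, digits, and underscores.
const digits = sample.match(/\d+/g);
const words = sample.match(/\w+/g);

// A set matches one listed character; a negated set matches anything else.
const vowels = "regex".match(/[aeiou]/g);
const nonVowels = "regex".match(/[^aeiou]+/g);

console.log(digits);    // ["42", "3", "10"]
console.log(words);     // ["Order", "42", "3", "apples", "10", "pears_ok"]
console.log(vowels);    // ["e", "e"]
console.log(nonVowels); // ["r", "g", "x"]
```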
Unicode Character Classes (ES2018+)
With the u flag you can use Unicode property escapes:
- \p{Letter}: any Unicode letter
- \p{Number}: any Unicode number character
- \p{Script=Latin}: characters from the Latin script
- \p{Emoji}: characters with the Emoji property, which includes the digits 0-9, #, and *
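A short sketch of how property escapes differ from ASCII ranges. The sample text is an assumption, and the accented letters are written as \u escapes so the example does not depend on how the source file is encoded.

```javascript
// "naïve café 123", with the accented letters written as escapes.
const text = "na\u00EFve caf\u00E9 123";

// An ASCII range splits words at accented letters.
const asciiWords = text.match(/[a-zA-Z]+/g);

// \p{Letter} with the u flag treats accented letters as letters.
const unicodeWords = text.match(/\p{Letter}+/gu);
const numbers = text.match(/\p{Number}+/gu);

// ASCII digits carry the Emoji property, so this is true.
const digitIsEmoji = /\p{Emoji}/u.test("7");

console.log(asciiWords);   // ["na", "ve", "caf"]
console.log(unicodeWords); // ["naïve", "café"]
console.log(numbers);      // ["123"]
console.log(digitIsEmoji); // true
```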
Quantifiers
Quantifiers specify how many times the preceding token must repeat.
| Syntax | Meaning | Greedy? |
|---|---|---|
| `*` | 0 or more | yes |
| `+` | 1 or more | yes |
| `?` | 0 or 1 | yes |
| `{n}` | Exactly n | n/a |
| `{n,}` | n or more | yes |
| `{n,m}` | n to m (inclusive) | yes |
| `*?` | 0 or more | lazy |
| `+?` | 1 or more | lazy |
| `??` | 0 or 1 | lazy |
| `{n,m}?` | n to m | lazy |
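Possessive quantifiers (*+, ++, ?+) are not available in JavaScript. They never give back what they have matched, which can help prevent catastrophic backtracking in flavors that support them (PHP’s PCRE, Java, Python 3.11+). .NET does not have possessive quantifiers but offers atomic groups (?>...) for the same purpose.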
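The counted quantifiers from the table can be compared side by side. A minimal JavaScript sketch, with a made-up input string:

```javascript
const nums = "1 22 333 4444";

// Greedy {2,3} takes three digits whenever it can.
const greedyRange = nums.match(/\d{2,3}/g);

// Lazy {2,3}? stops at two digits, so longer runs split into pairs.
const lazyRange = nums.match(/\d{2,3}?/g);

// {3} needs exactly three digits per match.
const exact = nums.match(/\d{3}/g);

// ? makes the single preceding token optional.
const spellings = "color colour colouur".match(/colou?r/g);

console.log(greedyRange); // ["22", "333", "444"]
console.log(lazyRange);   // ["22", "33", "44", "44"]
console.log(exact);       // ["333", "444"]
console.log(spellings);   // ["color", "colour"]
```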
Anchors
Anchors match a position in the string rather than a character.
| Syntax | Matches position |
|---|---|
| `^` | Start of string (or start of each line with m flag) |
| `$` | End of string (or end of each line with m flag) |
| `\b` | Word boundary: between \w and \W, or at start/end of string next to \w |
| `\B` | Non-word boundary |
\b is zero-width — it consumes no characters. /\bcat\b/ matches “cat” in “the cat sat” but not in “concatenate”.
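The sketch below checks the word-boundary example and shows how the m flag changes ^ and $. The two-line sample string is an assumption for illustration.

```javascript
const wholeWord = /\bcat\b/;
const inSentence = wholeWord.test("the cat sat");   // true
const inLongWord = wholeWord.test("concatenate");   // false

// \B is the opposite: "cat" must sit inside a longer word.
const insideWord = /\Bcat\B/.test("concatenate");   // true

const lines = "alpha one\nbeta two";

// Without m, ^ only matches at the very start of the string.
const firstWordOnly = lines.match(/^\w+/g);

// With m, ^ and $ also match at each line break.
const firstWordPerLine = lines.match(/^\w+/gm);
const lastWordPerLine = lines.match(/\w+$/gm);

console.log(firstWordOnly);    // ["alpha"]
console.log(firstWordPerLine); // ["alpha", "beta"]
console.log(lastWordPerLine);  // ["one", "two"]
```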
Groups and References
Capturing groups (...) serve two purposes:
- They group tokens so quantifiers can apply to the whole group: (ab)+ matches “ababab”
- They save the matched text so you can refer to it later
Backreferences repeat what was captured:
- \1 refers to the first group, \2 to the second, etc.
- \k<name> refers to a named group (?<name>...)
Example: /(\w+) \1/g finds repeated words like “the the” or “and and”.
Named groups (?<year>\d{4}) make patterns self-documenting. Access the captured value as match.groups.year in JavaScript.
Non-capturing groups (?:...) group without creating a capture slot — useful when you need grouping for quantifiers but don’t need the captured value.
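To make the difference between the group types concrete, here is a small sketch. The sample sentences are made up, and the \b anchors in the repeated-word pattern are an addition to the example above so that partial words are not paired.

```javascript
// Backreference: \1 must repeat exactly what group 1 captured.
const repeated = "it was the the best of of times".match(/\b(\w+) \1\b/g);

// Named backreference does the same by name.
const namedRepeat = /(?<word>\w+) \k<word>/.test("hello hello");

// Named groups can be reused in a replacement string with $<name>.
const swapped = "2024-03".replace(/(?<year>\d{4})-(?<month>\d{2})/, "$<month>/$<year>");

// A capturing group adds a slot to the result; a non-capturing group does not.
const withCapture = "ababab".match(/(ab)+/);
const withoutCapture = "ababab".match(/(?:ab)+/);

console.log(repeated);              // ["the the", "of of"]
console.log(namedRepeat);           // true
console.log(swapped);               // "03/2024"
console.log(withCapture.length);    // 2 (full match plus one group)
console.log(withoutCapture.length); // 1 (full match only)
```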
Lookaround
Lookahead and lookbehind assertions match a position where a pattern is (or isn’t) present, without consuming characters.
| Syntax | Name | Description |
|---|---|---|
| `(?=...)` | Positive lookahead | Match only if followed by pattern |
| `(?!...)` | Negative lookahead | Match only if NOT followed by pattern |
| `(?<=...)` | Positive lookbehind | Match only if preceded by pattern |
| `(?<!...)` | Negative lookbehind | Match only if NOT preceded by pattern |
Example: Extract prices without the currency symbol:
/(?<=\$)\d+(\.\d{2})?/g
Applied to "Total: $42.50", this matches 42.50 — the dollar sign is checked but not included in the match.
JavaScript supports lookbehind since ES2018 (Chrome 62, Firefox 78, Safari 16.4).
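The following sketch runs the price example and adds one caveat about negative lookbehind that is easy to miss. The sample strings are assumptions for illustration.

```javascript
// Positive lookbehind: the "$" must be there but is not part of the match.
const price = "Total: $42.50".match(/(?<=\$)\d+(\.\d{2})?/)[0];

// Positive lookahead: only numbers followed by " dollars".
const dollars = "5 dollars and 10 euros".match(/\d+(?= dollars)/g);

// Negative lookbehind alone still matches the tail of a price:
// "4" is rejected, but "5" is preceded by "4", not "$".
const naive = "3 items at $45".match(/(?<!\$)\d+/g);

// Also rejecting a preceding digit keeps whole numbers together.
const strict = "3 items at $45".match(/(?<![$\d])\d+/g);

console.log(price);   // "42.50"
console.log(dollars); // ["5"]
console.log(naive);   // ["3", "5"]
console.log(strict);  // ["3"]
```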
Flags
Flags modify overall matching behaviour and are written after the closing / in literal notation or as the second argument to new RegExp().
| Flag | Name | Effect |
|---|---|---|
| `g` | Global | Return all matches, not just the first. With exec() and test(), the regex advances lastIndex after each match |
| `i` | Case-insensitive | A matches a, B matches b, etc. |
| `m` | Multiline | ^ matches start of each line; $ matches end of each line |
| `s` | Dotall | . also matches line terminators such as \n and \r (ES2018) |
| `u` | Unicode | Treats the pattern as a sequence of Unicode code points; enables \p{...} |
| `y` | Sticky | Match only at lastIndex; does not search ahead |
| `d` | Indices | Adds an indices array to each match with [start, end] positions (ES2022) |
Common Flag Combinations
- gi: find all matches case-insensitively
- gm: find all matches across multiple lines
- gis: global, case-insensitive, dotall (match across newlines)
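A brief sketch of several flags in action, including the new RegExp() form with flags as the second argument. All sample strings and variable names are illustrative.

```javascript
// i: letter case is ignored; g: every match is returned.
const caseInsensitive = "Hello HELLO hello".match(/hello/gi);

// s: without it, "." refuses to cross a newline.
const dotDefault = /a.b/.test("a\nb");
const dotAll = /a.b/s.test("a\nb");

// y: the match must begin exactly at lastIndex.
const sticky = /\d+/y;
sticky.lastIndex = 3;
const atThree = sticky.exec("abc123"); // digits start at index 3
sticky.lastIndex = 0;
const atZero = sticky.exec("abc123");  // index 0 is "a", so no match

// Flags passed as the second argument; backslashes are doubled in the string.
const built = new RegExp("^\\w+", "gm");
const firstWords = "one two\nthree four".match(built);

console.log(caseInsensitive.length); // 3
console.log(dotDefault, dotAll);     // false true
console.log(atThree[0], atZero);     // "123" null
console.log(firstWords);             // ["one", "three"]
```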
Escape Special Characters
These characters have special meaning in regex and must be escaped with \ to match them literally:
. * + ? ^ $ { } [ ] | ( ) \
To match a literal period: \.
To match a literal dollar sign: \$
To match a literal backslash: \\
Inside a character class [...], most special characters lose their meaning. Only ], \, ^ (at start), and - (between chars) need escaping.
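When the text to match comes from a variable, escaping by hand is error-prone. The sketch below uses a small helper, escapeRegExp, which is a hypothetical name for this example and not a built-in; it prefixes each special character listed above with a backslash.

```javascript
// Hypothetical helper: escape every regex special character in a string.
function escapeRegExp(text) {
  // "$&" in the replacement stands for the matched character.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Unescaped, "." matches any character, so "1.5" also matches "125".
const unsafe = new RegExp("1.5").test("125");

// Escaped, only a literal period matches.
const safe = new RegExp(escapeRegExp("1.5")).test("125");
const exact = new RegExp(escapeRegExp("1.5")).test("1.5");

const escaped = escapeRegExp("1.5 (approx)");

console.log(unsafe);  // true
console.log(safe);    // false
console.log(exact);   // true
console.log(escaped); // 1\.5 \(approx\)
```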
Common Patterns Reference
Email (simplified): [a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}
IPv4 address (shape only; does not check that each octet is 0-255): (\d{1,3}\.){3}\d{1,3}
Hex color: #?([0-9a-fA-F]{6}|[0-9a-fA-F]{3})
UUID v4: [0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}
ISO 8601 date: \d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}(\.\d+)?Z?)?
Semantic version: (0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)
Slug: [a-z0-9]+(?:-[a-z0-9]+)*
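The patterns above are written without anchors, so they find a match anywhere in the input. For validation you would usually wrap them in ^ and $. The sketch below shows the difference; isIPv4 is a hypothetical helper that adds a numeric range check, since the IPv4 pattern on its own only checks the shape.

```javascript
const slugAnywhere = /[a-z0-9]+(?:-[a-z0-9]+)*/;
const slugWhole = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;
const semver = /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$/;
const ipv4Shape = /^(\d{1,3}\.){3}\d{1,3}$/;

// Hypothetical helper: shape check first, then each octet must be at most 255.
function isIPv4(text) {
  if (!ipv4Shape.test(text)) return false;
  return text.split(".").every(part => Number(part) <= 255);
}

console.log(slugAnywhere.test("My Post"));  // true (finds "y" inside the input)
console.log(slugWhole.test("My Post"));     // false
console.log(slugWhole.test("my-post-1"));   // true
console.log(semver.test("1.2.3"));          // true
console.log(semver.test("01.2.3"));         // false (leading zero)
console.log(ipv4Shape.test("999.1.1.1"));   // true (shape only)
console.log(isIPv4("999.1.1.1"));           // false
console.log(isIPv4("192.168.1.1"));         // true
```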
JavaScript Regex API
// Test if pattern matches
/^\d+$/.test("123") // true
// Find first match
"hello".match(/e(l+)/) // ["ell", "ll"]
// Find all matches (g flag)
"a1b2c3".match(/\d/g) // ["1", "2", "3"]
// Replace (string)
"hello world".replace(/o/g, "0") // "hell0 w0rld"
// Replace with function
"hello".replace(/[aeiou]/g, c => c.toUpperCase()) // "hEllO"
// exec() loop (for groups with g flag)
const re = /(\w+)/g;
let m;
while ((m = re.exec("foo bar")) !== null) {
  console.log(m[1]); // logs "foo", then "bar"
}
// Named groups (ES2018)
const { year, month } = "2024-03".match(/(?<year>\d{4})-(?<month>\d{2})/).groups;
Frequently Asked Questions
What is the difference between * and +?
* matches zero or more — the pattern can be absent entirely. + requires at least one occurrence. /ab*/ matches “a”, “ab”, “abb”; /ab+/ matches “ab”, “abb” but not “a”.
What is catastrophic backtracking?
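Catastrophic backtracking occurs when a pattern with nested or overlapping quantifiers forces a backtracking engine to try an exponential number of ways to split the input before it can report failure. Attacks that exploit this are called ReDoS (Regular expression Denial of Service). Example: /(a+)+$/ applied to a run of a's followed by b (such as “aaaaaab”) fails only after trying every way to divide the a's between the inner and outer +. The work roughly doubles with each extra a, so a few dozen a's can hang the engine. Making the group non-capturing does not help. The fix is to restructure the pattern so the same text cannot be matched in more than one way (here /a+$/ accepts exactly the same strings), or to use atomic groups or possessive quantifiers where supported.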
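One way to restructure the example pattern /(a+)+$/ is to drop the outer quantifier, since a+ already covers any run of a's. The sketch below checks, on a handful of short made-up inputs, that the flattened pattern gives the same answers. The inputs are kept short on purpose so the nested pattern stays fast.

```javascript
const nested = /(a+)+$/; // inner and outer + can split the same run many ways
const flat = /a+$/;      // one quantifier, one way to match

const inputs = ["aaaa", "aaaab", "baaa", "", "b"];

// Both patterns should accept and reject the same inputs.
const results = inputs.map(s => [nested.test(s), flat.test(s)]);
const agree = results.every(([n, f]) => n === f);

console.log(results.map(([n]) => n)); // [true, false, true, false, false]
console.log(agree);                   // true
```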
What is the difference between ^ inside and outside character classes?
Outside a character class, ^ anchors the match to the start of the string (or line in multiline mode). Inside a character class [^abc], it negates the class — matching anything except the listed characters.
Are regex patterns the same across languages?
No. Most languages use Perl-style syntax, and many flavors are modelled on PCRE (Perl Compatible Regular Expressions), but there are differences. JavaScript does not support \A (absolute start), \z (absolute end), conditional patterns, or atomic groups. Python’s re module writes named groups as (?P<name>...) rather than (?<name>...). Look up the specific flavor for your language.
How do I match a literal regex special character?
Escape it with a backslash: \. matches a literal period, \$ matches a literal dollar sign, \[ matches a literal bracket.