What is the difference between a capturing group and a non-capturing group?

A capturing group (abc) saves the matched text so you can reference it later with backreferences like \1 or read it from the match result (match[1] for numbered groups, match.groups.name for named groups). A non-capturing group (?:abc) groups tokens for quantifiers or alternation without creating a capture slot. Use non-capturing groups when you need grouping but not the captured value: they avoid the overhead of storing a capture and keep your backreference numbering clean.

What are lookahead and lookbehind assertions?

Lookahead (?=...) and lookbehind (?<=...) are zero-width assertions that check for a pattern at a position without consuming characters. Positive lookahead (?=X) matches only if followed by X. Negative lookahead (?!X) matches only if NOT followed by X. Lookbehind (?<=X) and (?<!X) work the same way but check what precedes the current position. JavaScript supports lookbehind since ES2018.

What does the global (g) flag do?

The global flag makes the regex find all matches in the string instead of stopping at the first one. In JavaScript, String.prototype.match() with the g flag returns an array of all matched strings. The exec() method with g advances the lastIndex property after each call, enabling loop-based matching.

What is a word boundary \b in regex?

\b matches a zero-width position between a word character (\w: [a-zA-Z0-9_]) and a non-word character (\W), or between a word character and the start/end of the string. It does not consume any characters. For example, /\bcat\b/ matches 'cat' in 'the cat sat' but not in 'concatenate' or 'cats'.

What is ReDoS and how do I avoid it?

ReDoS (Regular Expression Denial of Service) occurs when a regex with nested quantifiers causes catastrophic backtracking: the engine explores exponentially many match paths on inputs that almost match. Example: /(a+)+$/ fails quickly on 'aaaaaab', but the work roughly doubles with every additional 'a', so a few dozen of them before the 'b' can take seconds or minutes. Avoid nested quantifiers over the same characters, use atomic groups where the engine supports them (JavaScript does not), and test your patterns against adversarial inputs before deploying.

Regex Cheat Sheet

Complete interactive reference of regular expression syntax organized by category. Search patterns, copy with one click, and test against live input instantly.

All processing happens in your browser. No data is sent to any server.

Character Classes(13)

.

Any character except newline (with s flag, matches newline too)

Pattern:/h.t/g

Input:hat hit hot hut

Matches:"hat", "hit", "hot", "hut"

\d

Digit — equivalent to [0-9]

Pattern:/\d+/g

Input:abc 123 def 456

Matches:"123", "456"

\D

Non-digit — equivalent to [^0-9]

Pattern:/\D+/g

Input:abc123

Matches:"abc"

\w

Word character — equivalent to [a-zA-Z0-9_]

Pattern:/\w+/g

Input:hello world

Matches:"hello", "world"

\W

Non-word character — equivalent to [^a-zA-Z0-9_]

Pattern:/\W+/g

Input:hello, world!

Matches:", ", "!"

\s

Whitespace — spaces, tabs, newlines

Pattern:/\s+/g

Input:hello world

Matches:" "

\S

Non-whitespace character

Pattern:/\S+/g

Input:hello world

Matches:"hello", "world"

[abc]

Character set — matches any one character listed

Pattern:/[aeiou]/g

Input:hello

Matches:"e", "o"

[^abc]

Negated set — matches any character NOT listed

Pattern:/[^aeiou]+/g

Input:hello

Matches:"h", "ll"

[a-z]

Character range — matches any character from a to z

Pattern:/[a-z]+/g

Input:Hello World 123

Matches:"ello", "orld"

[A-Za-z0-9]

Combined range — alphanumeric characters

Pattern:/[A-Za-z0-9]+/g

Input:foo_bar-123!

Matches:"foo", "bar", "123"

\t

Tab character

Pattern:/\t/g

Input:a\tb (\t is a tab character)

Matches:"\t"

\n

Newline character

Pattern:/\n/g

Input:line1\nline2 (\n is a line break)

Matches:"\n"

Quantifiers(10)

*

0 or more — greedy (as many as possible)

Pattern:/go*/g

Input:g go goo gooo

Matches:"g", "go", "goo", "gooo"

+

1 or more — greedy

Pattern:/go+/g

Input:g go goo gooo

Matches:"go", "goo", "gooo"

?

0 or 1 — makes the preceding token optional

Pattern:/colou?r/g

Input:color colour

Matches:"color", "colour"

{n}

Exactly n repetitions

Pattern:/\d{3}/g

Input:1 12 123 1234

Matches:"123", "123"

{n,}

n or more repetitions

Pattern:/\d{2,}/g

Input:1 12 123 1234

Matches:"12", "123", "1234"

{n,m}

Between n and m repetitions (inclusive)

Pattern:/\d{2,3}/g

Input:1 12 123 1234

Matches:"12", "123", "123"

*?

0 or more — lazy (as few as possible)

Pattern:/<.*?>/g

Input:<b>bold</b>

Matches:"<b>", "</b>"

+?

1 or more — lazy

Pattern:/<.+?>/g

Input:<a><b>

Matches:"<a>", "<b>"

??

0 or 1 — lazy (prefers 0)

Pattern:/colou??r/g

Input:color colour

Matches:"color", "colour"

{n,m}?

Between n and m — lazy

Pattern:/\d{2,4}?/g

Input:12345

Matches:"12", "34"

Anchors(6)

^

Start of string (or start of line with m flag)

Pattern:/^hello/g

Input:hello world say hello

Matches:"hello"

$

End of string (or end of line with m flag)

Pattern:/world$/g

Input:world tour, hello world

Matches:"world"

\b

Word boundary — position between a word character and a non-word character

Pattern:/\bcat\b/g

Input:cat concatenate cats

Matches:"cat"

\B

Non-word boundary — position not at a word boundary

Pattern:/\Bcat\B/g

Input:cat concatenate cats

Matches:"cat"

^

With m flag: matches start of each line

Pattern:/^\w+/gm

Input:hello\nworld (\n is a line break)

Matches:"hello", "world"

$

With m flag: matches end of each line

Pattern:/\w+$/gm

Input:hello\nworld (\n is a line break)

Matches:"hello", "world"

Groups & References(6)

(abc)

Capturing group — captures the matched substring

Pattern:/(\w+)\s(\w+)/g

Input:hello world

Matches:"hello world"

(?:abc)

Non-capturing group — groups without creating a capture

Pattern:/(?:foo)+/g

Input:foofoofoo

Matches:"foofoofoo"

(?<name>abc)

Named capturing group — accessible by name in match result

Pattern:/(?<year>\d{4})-(?<month>\d{2})/g

Input:2024-01-15

Matches:"2024-01"

\1

Backreference — matches same text as first capture group

Pattern:/(\w+) \1/g

Input:hello hello world world

Matches:"hello hello", "world world"

\k<name>

Named backreference — matches same text as named capture group

Pattern:/(?<word>\w+) \k<word>/g

Input:hello hello

Matches:"hello hello"

a|b

Alternation — matches a or b

Pattern:/cat|dog|bird/g

Input:I have a cat and a dog

Matches:"cat", "dog"

Lookaround(4)

(?=abc)

Positive lookahead — matches if followed by the pattern (not consumed)

Pattern:/\d+(?= dollars)/g

Input:100 dollars and 50 euros

Matches:"100"

(?!abc)

Negative lookahead — matches if NOT followed by the pattern

Pattern:/\d+\b(?! dollars)/g

Input:100 dollars and 50 euros

Matches:"50"

(?<=abc)

Positive lookbehind — matches if preceded by the pattern (not consumed)

Pattern:/(?<=\$)\d+/g

Input:price: $42 and 10 cents

Matches:"42"

(?<!abc)

Negative lookbehind — matches if NOT preceded by the pattern

Pattern:/(?<![$\d])\d+/g

Input:price: $42 and 10 cents

Matches:"10"

Flags(7)

g

Global — find all matches, not just the first

Pattern:/\d+/g

Input:one 1 two 2 three 3

Matches:"1", "2", "3"

i

Case-insensitive — match regardless of letter case

Pattern:/hello/gi

Input:Hello HELLO hello

Matches:"Hello", "HELLO", "hello"

m

Multiline — ^ and $ match start/end of each line

Pattern:/^\w+/gm

Input:foo\nbar\nbaz (\n is a line break)

Matches:"foo", "bar", "baz"

s

Dotall — . matches newline characters too

Pattern:/.+/gs

Input:hello\nworld (\n is a line break)

Matches:"hello\nworld"

u

Unicode — enables full Unicode support and Unicode escapes

Pattern:/\u{1F600}/gu

Input:Hello 😀 World

Matches:"😀"

y

Sticky — match starting at lastIndex only (no searching ahead)

Pattern:/\d+/gy

Input:123abc456

Matches:"123"

d

Indices — provides start/end indices for each match and group (ES2022)

Pattern:/(\w+)/gd

Input:hello

Matches:"hello"

You’re writing a regex to match email addresses and can’t remember: is it \w+ or [a-zA-Z0-9_]+? Does \b match at the start of a line or at a word boundary? Is (?<=...) a lookbehind or a lookahead? You need a quick-scan reference organized by category, not a 5000-word tutorial.

Why This Cheat Sheet (Not the Regex Tester or Library)

PureDevTools has a Regex Tester for testing expressions against text and a Regex Library with 35+ ready-made patterns. This cheat sheet is an interactive syntax reference — every regex construct organized by category (character classes, quantifiers, anchors, groups, lookaround, flags) with descriptions, examples, and copy buttons. Use this to look up syntax; use the tester to verify your regex works; use the library for common patterns.

What Is a Regular Expression?

A regular expression (regex or regexp) is a sequence of characters that defines a search pattern. Regex engines scan a string and find substrings that match the pattern. They are built into virtually every programming language — JavaScript, Python, Go, Java, Rust, PHP, Ruby, Perl — and into command-line tools like grep, sed, awk, and editors like VS Code and Vim.

Regex is used for:

Validation: checking that an email, phone number, URL, or postal code matches the expected format
Search and replace: globally substituting one pattern with another in code or text
Parsing: extracting specific parts of structured text (log files, config files, HTML)
Lexing: tokenising source code in compilers and syntax highlighters
Routing: matching URL paths in web frameworks (Express, Django, Rails)

How the Regex Engine Works

The engine reads the pattern and the subject string character by character. At each position it asks: “does the pattern match starting here?” If not, it advances one character and tries again. This is called left-to-right scanning.

Two broad strategies determine how quantifiers behave:

Greedy quantifiers (*, +, {n,m}) match as many characters as possible, then back off if the overall match fails. <.+> applied to <b>bold</b> greedily consumes the whole string and backtracks until it finds the last >, producing <b>bold</b>.

Lazy quantifiers (*?, +?, {n,m}?) match as few characters as possible, expanding only when needed. <.+?> on the same input produces <b> because it stops at the first >.
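The greedy and lazy behaviour described above can be checked directly. This is a minimal sketch; the sample string `<b>bold</b>` and the variable names are chosen here for illustration.

```javascript
// Greedy vs lazy matching on the same input.
const html = "<b>bold</b>";

// Greedy: .+ runs to the end of the string, then backs off to the last ">".
const greedyMatches = html.match(/<.+>/g);

// Lazy: .+? grows one character at a time and stops at the first ">".
const lazyMatches = html.match(/<.+?>/g);

console.log(greedyMatches); // ["<b>bold</b>"]
console.log(lazyMatches);   // ["<b>", "</b>"]
```

With the g flag the lazy pattern continues after the first tag, which is why it also returns the closing tag as a second match.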

Character Classes

Character classes match a single character from a defined set.

Syntax	Matches
`.`	Any character except newline (`\n`). With the `s` (dotall) flag, matches newline too
`\d`	Digit — `[0-9]`
`\D`	Non-digit — `[^0-9]`
`\w`	Word character — `[a-zA-Z0-9_]`
`\W`	Non-word character — `[^a-zA-Z0-9_]`
`\s`	Whitespace — space, tab, newline, carriage return, form feed
`\S`	Non-whitespace
`[abc]`	Exactly one of: a, b, or c
`[^abc]`	Any character except a, b, or c
`[a-z]`	Any lowercase letter a through z

Unicode Character Classes (ES2018+)

With the u flag you can use Unicode property escapes:

\p{Letter}    — any Unicode letter
\p{Number}    — any Unicode number
\p{Script=Latin}   — letters from the Latin script
\p{Emoji}     — emoji characters
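As a rough illustration of why property escapes matter, the sketch below compares an ASCII-only class with `\p{Letter}` on text containing accented letters. The sample string is an assumption made for this example, written with escapes so the accented characters are unambiguous.

```javascript
// "naïve café 123", written with escapes to avoid encoding ambiguity.
const sample = "na\u00EFve caf\u00E9 123";

// ASCII-only class: accented letters split or truncate the words.
const asciiWords = sample.match(/[a-zA-Z]+/g);

// Unicode property escape: requires the u flag.
const unicodeWords = sample.match(/\p{Letter}+/gu);
const unicodeNumbers = sample.match(/\p{Number}+/gu);

console.log(asciiWords);     // ["na", "ve", "caf"]
console.log(unicodeWords);   // ["naïve", "café"]
console.log(unicodeNumbers); // ["123"]
```

Without the u flag, `\p` is not treated as a property escape, so the flag is not optional here.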

Quantifiers

Quantifiers specify how many times the preceding token must repeat.

Syntax	Meaning	Greedy?
`*`	0 or more	yes
`+`	1 or more	yes
`?`	0 or 1	yes
`{n}`	Exactly n	—
`{n,}`	n or more	yes
`{n,m}`	n to m (inclusive)	yes
`*?`	0 or more	lazy
`+?`	1 or more	lazy
`??`	0 or 1	lazy
`{n,m}?`	n to m	lazy

Possessive quantifiers (not in JavaScript) *+, ++, ?+ never back off and prevent catastrophic backtracking in engines that support them (PHP’s PCRE, Java). .NET does not have possessive quantifiers but offers atomic groups (?>...) for the same purpose.

Anchors

Anchors match a position in the string rather than a character.

Syntax	Matches position
`^`	Start of string (or start of each line with `m` flag)
`$`	End of string (or end of each line with `m` flag)
`\b`	Word boundary — between `\w` and `\W`, or at start/end of string next to `\w`
`\B`	Non-word boundary

\b is zero-width — it consumes no characters. /\bcat\b/ matches “cat” in “the cat sat” but not in “concatenate”.
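A short sketch of the word-boundary behaviour described above. The last line makes the zero-width positions visible by inserting a marker at each boundary; the sample strings are illustrative.

```javascript
const wholeWordCat = /\bcat\b/;

const inSentence = wholeWordCat.test("the cat sat"); // true
const inLongWord = wholeWordCat.test("concatenate"); // false
const inPlural = wholeWordCat.test("cats");          // false

// \b consumes nothing, so replacing it only inserts text at the boundaries.
const marked = "the cat".replace(/\b/g, "|");

console.log(inSentence, inLongWord, inPlural);
console.log(marked); // "|the| |cat|"
```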

Groups and References

Capturing groups (...) serve two purposes:

They group tokens so quantifiers can apply to the whole group: (ab)+ matches “ababab”
They save the matched text so you can refer to it later

Backreferences repeat what was captured:

\1 refers to the first group, \2 to the second, etc.
\k<name> refers to a named group (?<name>...)

Example: /(\w+) \1/g finds repeated words like “the the” or “and and”.

Named groups (?<year>\d{4}) make patterns self-documenting. Access the captured value as match.groups.year in JavaScript.

Non-capturing groups (?:...) group without creating a capture slot — useful when you need grouping for quantifiers but don’t need the captured value.
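The differences between the group types show up in the shape of the match result. The following is a sketch with made-up sample strings, not output from the cheat sheet itself.

```javascript
// Capturing groups: values are available by position.
const numbered = "2024-03".match(/(\d{4})-(\d{2})/);
console.log(numbered[1], numbered[2]); // "2024" "03"

// Named groups: the same values are available by name.
const named = "2024-03".match(/(?<year>\d{4})-(?<month>\d{2})/);
console.log(named.groups.year, named.groups.month); // "2024" "03"

// A repeated capturing group keeps only its last iteration...
const capturing = "ababab".match(/(ab)+/);
console.log(capturing.length, capturing[1]); // 2 "ab"

// ...while a non-capturing group adds no slot to the result at all.
const nonCapturing = "ababab".match(/(?:ab)+/);
console.log(nonCapturing.length); // 1

// Backreference: \1 must repeat exactly what group 1 matched.
const repeatedWords = "the the cat and and dog".match(/\b(\w+) \1\b/g);
console.log(repeatedWords); // ["the the", "and and"]
```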

Lookaround

Lookahead and lookbehind assertions match a position where a pattern is (or isn’t) present, without consuming characters.

Syntax	Name	Description
`(?=...)`	Positive lookahead	Match only if followed by pattern
`(?!...)`	Negative lookahead	Match only if NOT followed by pattern
`(?<=...)`	Positive lookbehind	Match only if preceded by pattern
`(?<!...)`	Negative lookbehind	Match only if NOT preceded by pattern

Example: Extract prices without the currency symbol:

/(?<=\$)\d+(\.\d{2})?/g

Applied to "Total: $42.50", this matches 42.50 — the dollar sign is checked but not included in the match.

JavaScript supports lookbehind since ES2018 (Chrome 62, Firefox 78, Safari 16.4).
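The sketch below applies the four lookaround forms to two illustrative strings (both invented for this example). It also shows a common pitfall: a negative lookaround on its own lets the engine backtrack into part of a number, so the negative patterns here add `\b` or `\d` to pin the match to whole numbers.

```javascript
const priceText = "Total: $42.50, deposit: $10, items: 3";

// Positive lookbehind: digits preceded by "$" (the "$" is not part of the match).
const prices = priceText.match(/(?<=\$)\d+(?:\.\d{2})?/g);

// Negative lookbehind: digits not preceded by "$", another digit, or ".".
const plainNumbers = priceText.match(/(?<![$\d.])\d+/g);

const currencyText = "100 dollars and 50 euros";

// Positive lookahead: digits followed by " dollars".
const dollarAmounts = currencyText.match(/\d+(?= dollars)/g);

// Negative lookahead: \b stops the engine from settling for "10" out of "100".
const otherAmounts = currencyText.match(/\d+\b(?! dollars)/g);

// Without \b the engine backtracks to a shorter number that passes the check.
const naiveAmounts = currencyText.match(/\d+(?! dollars)/g);

console.log(prices);        // ["42.50", "10"]
console.log(plainNumbers);  // ["3"]
console.log(dollarAmounts); // ["100"]
console.log(otherAmounts);  // ["50"]
console.log(naiveAmounts);  // ["10", "50"]
```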

Flags

Flags modify overall matching behaviour and are written after the closing / in literal notation or as the second argument to new RegExp().

Flag	Name	Effect
`g`	Global	Return all matches, not just the first. In `exec()` mode, updates `lastIndex`
`i`	Case-insensitive	`A` matches `a`, `B` matches `b`, etc.
`m`	Multiline	`^` matches start of each line; `$` matches end of each line
`s`	Dotall	`.` also matches `\n` and `\r` (ES2018)
`u`	Unicode	Treats the pattern as a sequence of Unicode code points; enables `\p{...}`
`y`	Sticky	Match only at `lastIndex`; does not search ahead
`d`	Indices	Adds `indices` array to each match with `[start, end]` positions (ES2022)

Common Flag Combinations

gi — find all matches case-insensitively
gm — find all matches across multiple lines
gis — global, case-insensitive, dotall (match across newlines)
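To make the flag effects concrete, here is a small sketch that runs the same kind of pattern with and without each flag. All sample strings and variable names are assumptions for illustration; the d flag needs a runtime with ES2022 support.

```javascript
// i: case-insensitive matching.
const anyCase = "Hello HELLO hello".match(/hello/gi);

// m: ^ matches at the start of every line, not just the start of the string.
const lines = "foo\nbar\nbaz";
const firstLineOnly = lines.match(/^\w+/g);
const everyLine = lines.match(/^\w+/gm);

// s: . also matches line breaks.
const stopsAtNewline = /.+/.exec("hello\nworld")[0];
const crossesNewline = /.+/s.exec("hello\nworld")[0];

// y: a sticky regex only matches exactly at lastIndex.
const stickyDigits = /\d+/y;
const subject = "123abc456";
const firstSticky = stickyDigits.exec(subject);  // "123"; lastIndex becomes 3
const secondSticky = stickyDigits.exec(subject); // null: index 3 is "a"; lastIndex resets to 0
stickyDigits.lastIndex = 6;
const thirdSticky = stickyDigits.exec(subject);  // "456"

// d: the result gains an indices array with [start, end] pairs.
const withIndices = /(\w+)/d.exec("say hello");

console.log(anyCase);                // ["Hello", "HELLO", "hello"]
console.log(firstLineOnly);          // ["foo"]
console.log(everyLine);              // ["foo", "bar", "baz"]
console.log(withIndices.indices[1]); // [0, 3]
```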

Escape Special Characters

These characters have special meaning in regex and must be escaped with \ to match them literally:

. * + ? ^ $ { } [ ] | ( ) \

To match a literal period: \.
To match a literal dollar sign: \$
To match a literal backslash: \\

Inside a character class [...], most special characters lose their meaning. Only ], \, ^ (at start), and - (between chars) need escaping.
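When the text to match comes from user input, escaping by hand is error-prone. One common approach is a small helper that escapes every special character before building the regex. `escapeRegExp` below is a hypothetical helper name for this sketch, not a built-in function.

```javascript
// Hypothetical helper: prefix every regex special character with a backslash.
function escapeRegExp(text) {
  // "$&" in the replacement stands for the matched character itself.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const literalText = "1+1=2 (approx.)";

// Unescaped, "+", "(", ")" and "." are interpreted as regex syntax.
const unescapedResult = new RegExp(literalText).test(literalText);

// Escaped, the pattern matches the text character for character.
const escapedResult = new RegExp(escapeRegExp(literalText)).test(literalText);

console.log(escapeRegExp("a.b")); // a\.b
console.log(unescapedResult);     // false
console.log(escapedResult);       // true
```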

Common Patterns Reference

Email (simplified):    [a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}
IPv4 address:          (\d{1,3}\.){3}\d{1,3}
Hex color:             #?([0-9a-fA-F]{6}|[0-9a-fA-F]{3})
UUID v4:               [0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}
ISO 8601 date:         \d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}(\.\d+)?Z?)?
Semantic version:      (0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)
Slug:                  [a-z0-9]+(?:-[a-z0-9]+)*
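The patterns above are written without anchors, so they find a match anywhere inside a longer string. For validation you would typically wrap them in ^ and $. The sketch below does that for three of them, using invented test values. It also shows that the IPv4 pattern checks only the shape of the address, not the 0 to 255 range of each part.

```javascript
const slugPattern = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;
const semverPattern = /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$/;
const ipv4ShapePattern = /^(\d{1,3}\.){3}\d{1,3}$/;

const slugOk = slugPattern.test("my-first-post"); // true
const slugBad = slugPattern.test("My_Post");      // false

const semverOk = semverPattern.test("1.4.0");     // true
const semverBad = semverPattern.test("01.4.0");   // false: leading zero

// Unanchored, the same pattern finds "1.4.0" inside "01.4.0".
const semverUnanchored = /(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)/.test("01.4.0"); // true

// Shape only: each part has 1 to 3 digits, but the value is not range-checked.
const ipv4OutOfRange = ipv4ShapePattern.test("999.1.1.1"); // true

console.log(slugOk, slugBad, semverOk, semverBad, semverUnanchored, ipv4OutOfRange);
```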

JavaScript Regex API

// Test if pattern matches
/^\d+$/.test("123")          // true

// Find first match
"hello".match(/e(l+)/)       // ["ell", "ll"]

// Find all matches (g flag)
"a1b2c3".match(/\d/g)        // ["1", "2", "3"]

// Replace (string)
"hello world".replace(/o/g, "0")     // "hell0 w0rld"

// Replace with function
"hello".replace(/[aeiou]/g, c => c.toUpperCase())  // "hEllO"

// exec() loop (for groups with g flag)
const re = /(\w+)/g;
let m;
while ((m = re.exec("foo bar")) !== null) {
  console.log(m[1]);   // "foo", "bar"
}

// Named groups (ES2018)
const { year, month } = "2024-03".match(/(?<year>\d{4})-(?<month>\d{2})/).groups;
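As an alternative to the exec() loop, String.prototype.matchAll() (ES2020) returns an iterator of full match objects, including groups and index. A minimal sketch, with an invented sample string:

```javascript
const dateText = "2024-03 and 2025-11";
const datePattern = /(?<year>\d{4})-(?<month>\d{2})/g; // matchAll requires the g flag

const foundDates = [];
for (const match of dateText.matchAll(datePattern)) {
  foundDates.push({
    year: match.groups.year,
    month: match.groups.month,
    index: match.index, // position of the match in the input
  });
}

console.log(foundDates);
// [ { year: "2024", month: "03", index: 0 }, { year: "2025", month: "11", index: 12 } ]
```

Unlike exec(), matchAll() works on a copy of the regex, so it does not leave lastIndex changed on the original.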

Frequently Asked Questions

What is the difference between * and +?

* matches zero or more, so the pattern can be absent entirely. + requires at least one occurrence. /ab*/ matches “a”, “ab”, “abb”; /ab+/ matches “ab”, “abb” but not “a”.

What is catastrophic backtracking?

Catastrophic backtracking (the cause of ReDoS, Regular Expression Denial of Service) occurs when a regex with nested quantifiers forces the engine to explore exponentially many match paths. Example: /(a+)+$/ applied to a run of a’s followed by “b” roughly doubles its work with each additional “a”, so a few dozen of them already mean millions of combinations. Making the inner group non-capturing does not help. The fix is to restructure the pattern so that only one quantifier can consume a given character (here /a+$/ matches the same strings), or to use atomic groups or possessive quantifiers where supported.
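One way to see the effect of restructuring is to compare the nested pattern with a flat equivalent. The sketch below first checks that both patterns accept and reject the same sample inputs, then times each on a deliberately small adversarial input. The helper name `timeTest`, the samples, and the input length are assumptions for this example; timings vary by engine, so no specific numbers are claimed.

```javascript
const nestedPattern = /(a+)+$/; // nested quantifiers over the same character
const flatPattern = /a+$/;      // same matches, only one quantifier

// Both patterns agree on what matches.
const samples = ["aaaa", "aaaab", "baaa", "", "ab"];
const sameResults = samples.every(
  (s) => nestedPattern.test(s) === flatPattern.test(s)
);
console.log(sameResults); // true

// Hypothetical helper: wall-clock time of a single test() call in milliseconds.
function timeTest(pattern, input) {
  const start = Date.now();
  pattern.test(input);
  return Date.now() - start;
}

// Kept short on purpose: each extra "a" roughly doubles the nested pattern's work.
const adversarial = "a".repeat(20) + "b";
console.log("nested:", timeTest(nestedPattern, adversarial), "ms");
console.log("flat:", timeTest(flatPattern, adversarial), "ms");
```

Increasing the repeat count by a few characters at a time shows the growth of the nested pattern, while the flat pattern stays fast.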

What is the difference between ^ inside and outside character classes?

Outside a character class, ^ anchors the match to the start of the string (or line in multiline mode). Inside a character class [^abc], it negates the class, matching anything except the listed characters.

Are regex patterns the same across languages?

No. Most languages use flavors derived from PCRE (Perl Compatible Regular Expressions), but there are differences. JavaScript does not support \A (absolute start), \z (absolute end), conditional patterns, or atomic groups. Python’s re module adds (?P<name>...) for named groups. Look up the specific flavor for your language.

How do I match a literal regex special character?

Escape it with a backslash: \. matches a literal period, \$ matches a literal dollar sign, \[ matches a literal bracket.
