Regex anchors and lookarounds in plain English

Regex anchors and lookarounds are the parts of a regular expression that don't consume any characters. They look like a match but they don't move the cursor. Once you understand that, every weird quirk falls into place. Here's the short guide.

What "doesn't consume" means

A normal character class like [a-z] consumes a character: it matches one letter and advances the position by one. An anchor like ^ matches a position (by default, the start of the string) without moving anywhere. After ^ matches, the next part of the pattern picks up at the same position that ^ was checking.

This distinction is the key to everything. Anchors and lookarounds are zero-width assertions. They check something about the current position and either succeed or fail without changing where in the string we are.
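A quick way to see "zero-width" for yourself is to look at what a match actually contains. The snippet below is a minimal sketch in JavaScript; the sample string "abc" and the variable names are made up for illustration.

```javascript
// Zero-width assertions report a position, not text.
const input = "abc";

// ^ succeeds at index 0 and the matched text is the empty string.
const anchorMatch = /^/.exec(input);
console.log(anchorMatch.index, anchorMatch[0].length); // 0 0

// [a-z] consumes one character, so the matched text has length 1.
const classMatch = /[a-z]/.exec(input);
console.log(classMatch.index, classMatch[0].length); // 0 1

// ^ followed by [a-z]: both parts look at the same character, index 0.
const combined = /^[a-z]/.exec(input);
console.log(combined.index, combined[0]); // 0 a
```

The anchor and the character class both report index 0, but only the class contributes text to the match.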

The six anchors

^ — start of the string (or start of a line in multiline mode).
$ — end of the string (or end of a line in multiline mode).
\b — word boundary (between a word character and a non-word character).
\B — non-word-boundary (anything that isn't a \b position).
\A and \z — true start and end of the string, regardless of multiline mode. (Not all flavors support these: Ruby, Java, and PCRE do; Python spells the end anchor \Z; JavaScript has neither.)

`^` and `$` and the multiline trap

In JavaScript, ^ matches only the start of the input by default. With the m flag, it matches the start of any line. Most people learn ^ as "start of input" and never learn the multiline switch, so they're surprised when their pattern ^\d+ doesn't match the first number on each line.
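Here is a small sketch of the multiline switch in JavaScript. The three-line sample text is invented; the only thing that changes between the two calls is the m flag.

```javascript
const text = "10 apples\n20 pears\n30 plums";

// Without m, ^ only matches at index 0, so only the first number is found.
const withoutM = text.match(/^\d+/g);
console.log(withoutM); // → ["10"]

// With m, ^ also matches right after each newline.
const withM = text.match(/^\d+/gm);
console.log(withM); // → ["10", "20", "30"]
```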

The most common ^/$ bug: writing ^password$ to mean "the whole value is exactly password" and then forgetting about a trailing newline. In JavaScript without the m flag, $ matches only at the very end of the input, so ^password$ rejects "password\n". In Python, PCRE, and Java, a default $ also matches just before a final newline, so the same pattern accepts "password\n"; use \z (\Z in Python) when you need the true end. With the m flag, $ matches before every newline in all of these flavors. Spec-level pedantry matters here; just test the corner case in the Regex Tester before you commit.
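The trailing-newline corner case is easy to check directly. This is a sketch of JavaScript's behavior only (other flavors treat a default $ differently), and the test strings are made up.

```javascript
// No flags: ^ and $ are pinned to the start and end of the whole input.
const exact = /^password$/;
const plainOk = exact.test("password");
const trailingNewline = exact.test("password\n");
console.log(plainOk, trailingNewline); // true false

// m flag: $ may match before any newline, ^ after any newline.
const perLine = /^password$/m;
const mTrailing = perLine.test("password\n");
const mMiddleLine = perLine.test("x\npassword\ny");
console.log(mTrailing, mMiddleLine); // true true
```

If the pattern guards something security-sensitive, the m-flag result is the one to worry about: any single line that equals "password" is enough to pass.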

`\b` — the most useful anchor

\b matches the position between a word character (\w, which is [A-Za-z0-9_]) and a non-word character, with the start and end of the string counting as non-word. It's how you say "match this word but not when it appears inside a longer word."

\bcat\b matches cat but not category or concat.
\bcat matches cat and catalog but not concat.
cat\b matches cat and concat but not catalog.
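The three patterns above can be checked against a small word list. A minimal sketch; the word list and the helper name matching are just for illustration.

```javascript
const words = ["cat", "category", "concat", "catalog"];

// Return the words in which the pattern finds a match somewhere.
const matching = (re) => words.filter((w) => re.test(w));

const both = matching(/\bcat\b/);
const leading = matching(/\bcat/);
const trailing = matching(/cat\b/);

console.log(both);     // → ["cat"]
console.log(leading);  // → ["cat", "category", "catalog"]
console.log(trailing); // → ["cat", "concat"]
```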

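The catch: \b's definition of "word character" is ASCII-only in many flavors, JavaScript included. There, é is a non-word character, so \bcafé\b fails on the standalone word café (there is no boundary after é) while \bcaf\b happily matches inside it. The u flag does not change this: \w and \b stay ASCII. What ES2018 did add is Unicode property escapes such as \p{L}, along with lookbehind, which let you build your own boundary out of lookarounds, for example (?<![\p{L}\p{N}_])café(?![\p{L}\p{N}_]) with the u flag.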
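It's worth checking how \b treats accented letters in your own engine. The sketch below assumes a JavaScript engine with ES2018 support (lookbehind and \p{…} escapes), and it writes é as \u00e9 so the result doesn't depend on how the source file is encoded. The hand-rolled boundary is one possible workaround, not the only one.

```javascript
const word = "caf\u00e9"; // "café" with a precomposed é

// \w is ASCII-only in JavaScript, so é is a non-word character and there is
// no boundary between é and the end of the string.
const asciiBoundary = /\bcaf\u00e9\b/.test(word);
const withUFlag = /\bcaf\u00e9\b/u.test(word);
console.log(asciiBoundary, withUFlag); // false false

// Hand-rolled boundary: not preceded and not followed by a letter, digit or _.
const unicodeWord = /(?<![\p{L}\p{N}_])caf\u00e9(?![\p{L}\p{N}_])/u;
const alone = unicodeWord.test(word);
const inSentence = unicodeWord.test("un caf\u00e9 noir");
const plural = unicodeWord.test("caf\u00e9s");
console.log(alone, inSentence, plural); // true true false
```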

The four lookarounds

A lookaround is a zero-width assertion that checks a pattern at the current position.

(?=…) — lookahead. Match here only if … would match starting at this position.
(?!…) — negative lookahead. Match here only if … would NOT match.
(?<=…) — lookbehind. Match here only if … matched ending at this position.
(?<!…) — negative lookbehind. Match here only if … would NOT match ending here.

The defining property is the same as for anchors: they don't consume characters. After a (?=foo) succeeds, the cursor is still in the same place; the next part of the pattern starts from where the lookahead started.
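To make "the cursor is still in the same place" concrete, compare a lookahead with the same text matched normally. A small sketch with a made-up input string:

```javascript
const input = "foobar";

// (?=foo) checks that "foo" starts here, then \w+ starts from the same index.
const peek = /(?=foo)\w+/.exec(input);
console.log(peek.index, peek[0]); // 0 foobar

// Without the lookahead, foo is consumed and the group only sees what is left.
const consume = /foo(\w+)/.exec(input);
console.log(consume[1]); // bar

// Lookbehind: "bar" only when "foo" ends right before it; foo is not in the match.
const behind = /(?<=foo)bar/.exec(input);
console.log(behind.index, behind[0]); // 3 bar

// Negative lookahead: a "foo" that is not followed by "bar".
const negOnFoobar = /foo(?!bar)/.test("foobar");
const negOnFoobaz = /foo(?!bar)/.test("foobaz");
console.log(negOnFoobar, negOnFoobaz); // false true
```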

Three useful examples

Password validation: must contain at least one digit, at least one letter, ≥8 chars.

^(?=.*\d)(?=.*[A-Za-z])[A-Za-z\d]{8,}$

Three zero-width assertions stack at the start: ^ pins the match to the start of the string, the first lookahead checks that a digit appears somewhere ahead, and the second checks that a letter does. Only then does the ordinary pattern [A-Za-z\d]{8,} consume the characters, and $ pins the end. Lookaheads here let you check multiple independent conditions without worrying about order.
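Wrapped in a function, the pattern can be exercised against a few candidates. This is only a sketch of the regex above, not a recommendation for real password policy; the function name isValidPassword and the sample values are hypothetical.

```javascript
const passwordPattern = /^(?=.*\d)(?=.*[A-Za-z])[A-Za-z\d]{8,}$/;

// True when the candidate is 8+ letters/digits with at least one of each.
const isValidPassword = (candidate) => passwordPattern.test(candidate);

console.log(isValidPassword("abc12345"));  // true
console.log(isValidPassword("abcdefgh"));  // false: no digit
console.log(isValidPassword("12345678"));  // false: no letter
console.log(isValidPassword("abc123"));    // false: too short
console.log(isValidPassword("abc 12345")); // false: space is not in the class
```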

Find numbers not preceded by a dollar sign.

(?<!\$)\b\d+\b

(?<!\$) rejects the position if the previous character was $. Then \b\d+\b matches a standalone number. The leading \b matters: without it, the engine would skip the 1 in $100 and match 00 instead.
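A short sketch of the pattern in use, assuming an engine with lookbehind support (ES2018 or later). The sample sentence is invented, and the last line shows a limitation to be aware of with decimals.

```javascript
const standaloneNumber = /(?<!\$)\b\d+\b/g;

const found = "Order 42 costs $100 for 3 items".match(standaloneNumber);
console.log(found); // → ["42", "3"]

// Caveat: the lookbehind only inspects one character, so the cents in a
// dollar amount are preceded by "." rather than "$" and still match.
const cents = "$1.50".match(standaloneNumber);
console.log(cents); // → ["50"]
```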

Replace cat with dog only when not at the end of a word.

cat(?!\b) matches cat only when it's followed by another word character, i.e., when it's the start or middle of a longer word like catalog. (cat\B and cat(?=\w) say the same thing.)
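As a replacement, that looks like the sketch below. The sample sentence is made up; the second call uses cat\B, which behaves the same way here because t is itself a word character.

```javascript
const sample = "cat catalog concat category";

// Replace "cat" only where a word character follows it.
const viaLookahead = sample.replace(/cat(?!\b)/g, "dog");
console.log(viaLookahead); // cat dogalog concat dogegory

// Same result with the non-word-boundary anchor.
const viaNonBoundary = sample.replace(/cat\B/g, "dog");
console.log(viaNonBoundary === viaLookahead); // true
```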

Variable-length lookbehind

JavaScript (since ES2018) and .NET support variable-length lookbehind: (?<=\w+) matches a position preceded by one or more word characters. Many other engines are stricter: Python's built-in re module requires a fixed-width lookbehind, and PCRE has traditionally required every alternative in a lookbehind to have a fixed length. If you target one of those, match the prefix as ordinary text and pull the part you want out of a capturing group instead.
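Here is a sketch of a variable-length lookbehind next to a capturing-group rewrite that avoids lookbehind entirely, for engines that only accept a fixed width. The "key: value" sample text is invented, and matchAll assumes a reasonably modern JavaScript runtime.

```javascript
const text = "price: 42, qty:   7";

// Variable-length lookbehind: the number of spaces after the colon varies.
const viaLookbehind = text.match(/(?<=:\s*)\d+/g);
console.log(viaLookbehind); // → ["42", "7"]

// Portable rewrite: consume the prefix, then read the number from group 1.
const viaGroup = Array.from(text.matchAll(/:\s*(\d+)/g), (m) => m[1]);
console.log(viaGroup); // → ["42", "7"]
```

The trade-off is that the second version consumes the prefix, so the overall match is ":   7" rather than "7"; you read the value from the group instead of from the match.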

Greedy, lazy, possessive — not the same as lookarounds

A common confusion: people lump the +? (lazy quantifier) and ++ (possessive quantifier, where supported) in with lookarounds because they look like they're "modifying" matching behavior. They're not zero-width — they consume. They control how many characters a quantifier matches; lookarounds control whether a position is allowed.
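The difference shows up clearly on a small input. JavaScript has no possessive quantifiers, so this sketch only contrasts greedy and lazy, then adds a lookahead for comparison; the HTML-ish sample string is made up.

```javascript
const html = "<b>one</b><b>two</b>";

// Greedy: .+ grabs as much as it can, then backs off to the last </b>.
const greedy = html.match(/<b>.+<\/b>/)[0];
console.log(greedy); // <b>one</b><b>two</b>

// Lazy: .+? grabs as little as it can, stopping at the first </b>.
const lazy = html.match(/<b>.+?<\/b>/)[0];
console.log(lazy); // <b>one</b>

// Lookahead: decides where a match is allowed, but </b> is not consumed.
const beforeClose = /\w+(?=<\/b>)/.exec(html)[0];
console.log(beforeClose); // one
```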

If your regex is doing something unexpected, ask: is the match being allowed or rejected at a position I didn't expect, or does it start in the right place but grab too much or too little text? Position problems point to anchors and lookarounds. Length problems point to quantifiers.

The one mistake almost everyone makes

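You write ^https?://.*$ and assume it matches a URL. It does, but only when the URL is the entire input. Then you run it against multi-line input expecting to extract every URL, one per line, and get nothing: without the m flag, ^ and $ are pinned to the start and end of the whole input, and . stops at the first newline, so .* can never reach $. Reach for the s flag to "fix" that and it gets worse: . now matches newlines, and the regex swallows the entire input as a single URL.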

Two fixes:

Add the m flag so ^ and $ match line boundaries (and the g flag so you get every match, not just the first).
Replace .* with [^\s]+ so the match can't span lines or include whitespace.

Combined: /^https?:\/\/[^\s]+$/gm.

The 60-second test for this kind of bug is: open the Regex Tester, paste a multi-line sample with two of the thing you want to match, and check that you get two matches, not one. If you get one or none, either your anchors are pinned to the whole input or your . is eating the newline.
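The same two-line test can be run in code. In this sketch the example.com and example.org URLs are placeholders, and the s (dotAll) flag assumes an ES2018-capable engine.

```javascript
const input = "https://example.com/a\nhttp://example.org/b";

// No m flag: $ needs the end of the input, but . cannot cross the newline.
const naive = input.match(/^https?:\/\/.*$/g);
console.log(naive); // null

// s flag: . crosses the newline, so one match spans both lines.
const dotAll = input.match(/^https?:\/\/.*$/gs);
console.log(dotAll.length); // 1

// m flag plus a class that cannot match whitespace: one match per line.
const perLine = input.match(/^https?:\/\/[^\s]+$/gm);
console.log(perLine); // → ["https://example.com/a", "http://example.org/b"]
```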

A cheat sheet to keep around

Symbol	Meaning
`^`	Start of string (or line with `m`)
`$`	End of string (or line with `m`)
`\b`	Word boundary
`\B`	Not a word boundary
`(?=x)`	Lookahead — followed by `x`
`(?!x)`	Negative lookahead — not followed by `x`
`(?<=x)`	Lookbehind — preceded by `x`
`(?<!x)`	Negative lookbehind — not preceded by `x`
`.`	Any character except newline (use `s` flag for any)
`\w`	Word character (`[A-Za-z0-9_]` in JavaScript, even with the `u` flag)

The mental model that makes all of this click: anchors and lookarounds check a position, the rest of regex matches characters. When in doubt, ask "is the cursor supposed to move here?" — if no, you want an anchor or a lookaround.