The Stack Overflow Regular Expressions FAQ
See also the regex tag details page for general hints and useful links.
Online tutorials
Quantifiers
- Zero-or-more: `*`: greedy, `*?`: reluctant, `*+`: possessive
- One-or-more: `+`: greedy, `+?`: reluctant, `++`: possessive
- `?`: optional (zero-or-one)
- Min/max ranges (all inclusive): `{n,m}`: between n & m, `{n,}`: n-or-more, `{n}`: exactly n
- Differences between greedy, reluctant (a.k.a. “lazy”, “ungreedy”) and possessive quantifiers:
  - Greedy vs. Reluctant vs. Possessive Quantifiers
  - In-depth discussion on the differences between greedy versus non-greedy
  - What’s the difference between `{n}` and `{n}?`
  - Can someone explain Possessive Quantifiers to me? php, perl, java, ruby
  - Emulating possessive quantifiers .net
  - Non-Stack Overflow references: From Oracle, regular-expressions.info
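As a rough illustration of the greedy/reluctant distinction and of inclusive ranges, here is a minimal sketch using Python's `re` module. The sample strings are made up for the example. Possessive quantifiers are left out because support depends on the flavor and version.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .* grabs as much as possible, so the match runs to the last '>'.
greedy = re.search(r"<.*>", text).group()

# Reluctant: .*? grabs as little as possible, so the match stops at the first '>'.
reluctant = re.search(r"<.*?>", text).group()

# Ranges are inclusive: {2,3} accepts two or three digits, nothing else.
ranges = re.findall(r"\b\d{2,3}\b", "7 42 123 4567")

print(greedy)     # <b>bold</b> and <i>italic</i>
print(reluctant)  # <b>
print(ranges)     # ['42', '123']
```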
Character Classes
- What is the difference between square brackets and parentheses?
- `[...]`: any one character, `[^...]`: negated/any character but those listed
- `[^]` matches any one character including newlines javascript
- `[\w-[\d]]` / `[a-z-[qz]]`: set subtraction .net, xml-schema, xpath, JGSoft
- `[\w&&[^\d]]`: set intersection java, ruby 1.9+
- `[[:alpha:]]`: POSIX character classes
- `[[:<:]]` and `[[:>:]]`: Word boundaries
- Why do `[^\\D2]`, `[^[^0-9]2]`, `[^2[^0-9]]` get different results in Java? java
- Shorthand:
  - Digit: `\d`: digit, `\D`: non-digit
  - Word character (Letter, digit, underscore): `\w`: word character, `\W`: non-word character
  - Whitespace: `\s`: whitespace, `\S`: non-whitespace
- Unicode categories (`\p{L}`, `\P{L}`, etc.)
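The shorthand classes can be tried out with a short sketch in Python's `re` module. The sample string is invented. The last pattern is one assumed way to get the effect of `[\w&&[^\d]]` in a flavor without set intersection: a negated class of "non-word or digit".

```python
import re

sample = "Room 42, floor_3!"

digits = re.findall(r"\d+", sample)       # runs of digits
words = re.findall(r"\w+", sample)        # runs of letters, digits, underscore
non_space = re.findall(r"\S+", sample)    # runs of non-whitespace

# "Word character but not a digit", written as a negated class.
word_not_digit = re.findall(r"[^\W\d]+", sample)

print(digits)          # ['42', '3']
print(words)           # ['Room', '42', 'floor_3']
print(non_space)       # ['Room', '42,', 'floor_3!']
print(word_not_digit)  # ['Room', 'floor_']
```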
Escape Sequences
- Horizontal whitespace: `\h`: space-or-tab, `\t`: tab
- Newlines:
  - `\r`, `\n`: carriage return and line feed
  - `\R`: generic newline php java-8
- Negated whitespace sequences: `\H`: Non horizontal whitespace character, `\V`: Non vertical whitespace character, `\N`: Non line feed character pcre php5 java-8
- Other: `\v`: vertical tab, `\e`: the escape character
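Python's `re` module has no `\R`, so the sketch below spells out an assumed equivalent by hand: CRLF as one unit, then the single-character line terminators. The name `GENERIC_NEWLINE` and the sample text are hypothetical.

```python
import re

# Assumed emulation of \R: try CRLF first so "\r\n" counts as one newline,
# then fall back to any single line-terminator character.
GENERIC_NEWLINE = re.compile(r"\r\n|[\n\v\f\r\x85\u2028\u2029]")

lines = GENERIC_NEWLINE.split("unix\nwindows\r\nold mac\rend")

print(lines)  # ['unix', 'windows', 'old mac', 'end']
```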
Anchors
| anchor | matches | flavors |
|---|---|---|
| `^` | Start of string | Common* |
| `^` | Start of line | Common (m) |
| `$` | End of line | Common (m) |
| `$` | End of text | Common* except javascript |
| `$` | Very end of string | javascript*, php (D) |
| `\A` | Start of string | Common except javascript |
| `\Z` | End of text | Common except javascript python |
| `\Z` | Very end of string | python |
| `\z` | Very end of string | Common except javascript python |
| `\b` | Word boundary | Common |
| `\B` | Not a word boundary | Common |
| `\G` | End of previous match | Common except javascript, python |
| Term | Definition |
|---|---|
| Start of string | At the very start of the string. |
| Start of line | At the very start of the string, and after a non-terminal line terminator. |
| Very end of string | At the very end of the string. |
| End of text | At the very end of the string, and at a terminal line terminator. |
| End of line | At the very end of the string, and at a line terminator. |
| Word boundary | At a word character not preceded by a word character, and at a non-word character not preceded by a non-word character. |
| End of previous match | At a previously set position, usually where a previous match ended. At the very start of the string if no position was set. |
“Common” refers to the following: icu java javascript .net objective-c pcre perl php python swift ruby
Legend: \* = default, m = multi-line mode, D = dollar-end-only mode.
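A minimal sketch of how these anchors behave in Python's `re` module, where `$` is "end of text" by default and `\Z` is "very end of string". The sample text is made up.

```python
import re

text = "first line\nsecond line\n"

# Default mode: ^ matches only at the very start of the string.
starts_default = re.findall(r"^\w+", text)
# Multi-line mode: ^ also matches after a non-terminal line terminator.
starts_multiline = re.findall(r"^\w+", text, flags=re.MULTILINE)

# Default $ is "end of text": it also matches just before a final newline.
end_of_text = re.search(r"line$", text) is not None
# Python's \Z is "very end of string": the trailing newline gets in the way.
very_end = re.search(r"line\Z", text) is not None

# \b keeps "line" from matching inside "outline".
boundaries = re.findall(r"\bline\b", "line outline line")

print(starts_default)         # ['first']
print(starts_multiline)       # ['first', 'second']
print(end_of_text, very_end)  # True False
print(boundaries)             # ['line', 'line']
```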
Groups
- `(...)`: capture group, `(?:)`: non-capture group
  - Why is my repeating capturing group only capturing the last match?
- `\1`: backreference and capture-group reference, `$1`: capture group reference
  - What’s the meaning of a number after a backslash in a regular expression?
  - `\g<1>123`: How to follow a numbered capture group, such as `\1`, with a number?: python
- What does a subpattern `(?i:regex)` mean?
- What does the ‘P’ in `(?P<group_name>regexp)` mean?
- `(?>)`: atomic group or independent group, `(?|)`: branch reset
  - Equivalent of branch reset in .NET/C# .net
- Named capture groups:
  - General named capturing group reference at regular-expressions.info
  - java: `(?<groupname>regex)`: Overview and naming rules (Non-Stack Overflow links)
  - Other languages: `(?P<groupname>regex)` python, `(?<groupname>regex)` .net, `(?<groupname>regex)` perl, `(?P<groupname>regex)` and `(?<groupname>regex)` php
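The group constructs above can be sketched in Python's `re` module as follows. All sample strings are invented, and the named-group syntax shown is the Python spelling.

```python
import re

# Capture group plus backreference: \1 must repeat what group 1 matched.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

# Named capture groups, Python spelling: (?P<name>...).
m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "released 2014-04-10")
year, month = m.group("year"), m.group("month")

# In a replacement string, \g<1> lets a literal digit follow the reference;
# \10 would be read as a reference to group 10 instead.
padded = re.sub(r"(\d)", r"\g<1>0", "a1b2")

# A repeated capture group keeps only the text of its last iteration.
last_only = re.search(r"(?:(\d),?)+", "1,2,3").group(1)

print(doubled)      # is
print(year, month)  # 2014 04
print(padded)       # a10b20
print(last_only)    # 3
```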
Lookarounds
- Lookaheads: `(?=...)`: positive, `(?!...)`: negative
- Lookbehinds: `(?<=...)`: positive, `(?<!...)`: negative
- Lookbehind limits in:
  - Lookbehinds need to be constant-length php, perl, python, ruby
  - Lookarounds of limited length `{0,n}` java
  - Variable length lookbehinds are allowed .net
- Lookbehind alternatives:
  - Using `\K` php, perl (Flavors that support `\K`)
  - Alternative regex module for Python python
  - The hacky way
  - JavaScript negative lookbehind equivalents External link
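Here is a small sketch of the four lookaround forms in Python's `re` module, with an invented sample string. The last part shows the constant-length restriction on lookbehinds noted above: the standard `re` module rejects a variable-width lookbehind at compile time.

```python
import re

prices = "cost: $30, tax: $5, qty: 12"

# Positive lookbehind: digits preceded by '$' (the '$' is not part of the match).
dollars = re.findall(r"(?<=\$)\d+", prices)

# Negative lookbehind: digit runs not preceded by '$' or by another digit.
plain = re.findall(r"(?<![\$\d])\d+", prices)

# Positive lookahead: a word that is followed by a colon.
labels = re.findall(r"\w+(?=:)", prices)

# Negative lookahead: "foo" not followed by "bar".
not_foobar = re.findall(r"foo(?!bar)", "foobar foobaz foo")

# Variable-width lookbehind is refused by Python's re module.
try:
    re.compile(r"(?<=a+)b")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

print(dollars)            # ['30', '5']
print(plain)              # ['12']
print(labels)             # ['cost', 'tax', 'qty']
print(not_foobar)         # ['foo', 'foo']
print(variable_width_ok)  # False
```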
Modifiers
| flag | modifier | flavors |
|---|---|---|
| `a` | ASCII | python |
| `c` | current position | perl |
| `e` | expression | php perl |
| `g` | global | most |
| `i` | case-insensitive | most |
| `m` | multiline | php perl python javascript .net java |
| `m` | (non)multiline | ruby |
| `o` | once | perl ruby |
| `S` | study | php |
| `s` | single line | ruby |
| `U` | ungreedy | php r |
| `u` | unicode | most |
| `x` | whitespace-extended | most |
| `y` | sticky ↪ | javascript |
- How to convert preg_replace e to preg_replace_callback?
- What are inline modifiers?
- What is ‘?-mix’ in a Ruby Regular Expression
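To make the flag and inline-modifier forms concrete, here is a minimal sketch in Python's `re` module. The sample strings are invented. `re.DOTALL` is used as Python's counterpart of a single-line modifier and `re.VERBOSE` as its whitespace-extended mode.

```python
import re

text = "Hello\nhello world"

# i: case-insensitive, passed as a flag...
flag_form = re.findall(r"hello", text, flags=re.IGNORECASE)
# ...or written as an inline modifier at the start of the pattern.
inline_form = re.findall(r"(?i)hello", text)

# Without DOTALL, '.' does not match a newline; with it, '.' does.
dot_default = re.search(r"Hello.hello", text) is not None
dot_all = re.search(r"Hello.hello", text, flags=re.DOTALL) is not None

# x: whitespace-extended, so the pattern can be laid out with comments.
verbose = re.compile(r"""
    (\d{3})   # first three digits
    -         # separator
    (\d{4})   # last four digits
""", re.VERBOSE)
parts = verbose.search("call 555-1234").groups()

print(flag_form)             # ['Hello', 'hello']
print(inline_form)           # ['Hello', 'hello']
print(dot_default, dot_all)  # False True
print(parts)                 # ('555', '1234')
```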
Other:
- `|`: alternation (OR) operator, `.`: any character, `[.]`: literal dot character
- What special characters must be escaped?
- Control verbs (php and perl): `(*PRUNE)`, `(*SKIP)`, `(*FAIL)` and `(*F)`
  - php only: `(*BSR_ANYCRLF)`
- Recursion (php and perl): `(?R)`, `(?0)` and `(?1)`, `(?-1)`, `(?&groupname)`
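A short sketch of alternation, the dot, and escaping in Python's `re` module. The sample strings and the `needle` variable are made up. `re.escape` is one convenient way to avoid escaping special characters by hand when the text to match comes from elsewhere.

```python
import re

# '|' is alternation: either branch may match.
pets = re.findall(r"cat|dog", "cat, dog, bird")

# An unescaped '.' matches any character; '[.]' matches only a literal dot.
any_char = re.findall(r"a.c", "abc a.c")
literal_dot = re.findall(r"a[.]c", "abc a.c")

# Escape a literal string so '+' and '?' lose their special meaning.
needle = "1+1=2?"
found = re.search(re.escape(needle), "is 1+1=2? yes") is not None

print(pets)         # ['cat', 'dog']
print(any_char)     # ['abc', 'a.c']
print(literal_dot)  # ['a.c']
print(found)        # True
```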
Common Tasks
- Get a string between two curly braces: `{...}`
- Match (or replace) a pattern except in situations s1, s2, s3…
- How do I find all YouTube video ids in a string using a regex?
- Validation:
  - Internet: email addresses, URLs (host/port: regex and non-regex alternatives), passwords
  - Numeric: a number, min-max ranges (such as 1-31), phone numbers, date
- Parsing HTML with regex: See “General Information > When not to use Regex”
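Two of these tasks can be sketched in Python as follows. This is one possible approach, not the answer given in the linked questions. The sample text, `DAY_OF_MONTH` and `is_day_of_month` are hypothetical names for the example, and the brace pattern deliberately ignores nested braces.

```python
import re

text = "Hello {name}, your order {order_id} has shipped."

# [^{}]* matches anything except braces, so each match stops at the
# nearest closing brace. Nested braces are not handled.
fields = re.findall(r"\{([^{}]*)\}", text)

# Min-max range 1-31, written out digit by digit; \Z pins the very end.
DAY_OF_MONTH = re.compile(r"(?:[1-9]|[12][0-9]|3[01])\Z")

def is_day_of_month(s):
    """Return True if s is a number from 1 to 31 without a leading zero."""
    return DAY_OF_MONTH.match(s) is not None

print(fields)                 # ['name', 'order_id']
print(is_day_of_month("31"))  # True
print(is_day_of_month("32"))  # False
```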
Advanced Regex-Fu
- Strings and numbers:
  - Regular expression to match a line that doesn’t contain a word
  - How does this PCRE pattern detect palindromes?
  - Match strings whose length is a fourth power
  - How does this regex find triangular numbers?
  - How to determine if a number is a prime with regex?
  - How to match the middle character in a string with regex?
- Other:
  - How can we match a^n b^n?
  - Match nested brackets
    - Using a recursive pattern php, perl
    - Using balancing groups .net
  - “Vertical” regex matching in an ASCII “image”
  - List of highly up-voted regex questions on Code Golf
  - How to make two quantifiers repeat the same number of times?
  - An impossible-to-match regular expression: `(?!a)a`
  - Match/delete/replace `this` except in contexts A, B and C
  - Match nested brackets with regex without using recursion or balancing groups?
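A few of these tricks can be sketched in Python's `re` module. The patterns below are common formulations and are not necessarily the ones used in the linked answers. The sample log and the `is_prime` helper are invented for the example.

```python
import re

# Lines that do NOT contain "error": the negative lookahead at the start of
# each line rejects the line if "error" appears anywhere on it.
log = "ok: started\nerror: disk full\nok: stopped"
clean = re.findall(r"^(?!.*error).*$", log, flags=re.MULTILINE)

# (?!a)a can never match: the lookahead forbids the very 'a' that must follow.
impossible = re.search(r"(?!a)a", "aaa banana")

# Unary primality test: the pattern matches a run of 1s whose length is NOT
# prime (length 0 or 1, or some block of two or more 1s repeated).
NOT_PRIME = re.compile(r"^1?$|^(11+?)\1+$")

def is_prime(n):
    return NOT_PRIME.match("1" * n) is None

primes = [n for n in range(20) if is_prime(n)]

print(clean)       # ['ok: started', 'ok: stopped']
print(impossible)  # None
print(primes)      # [2, 3, 5, 7, 11, 13, 17, 19]
```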
Flavor-Specific Information
(Except for those marked with *, this section contains non-Stack Overflow links.)
- Java
  - Official documentation: Pattern Javadoc ↪, Oracle’s regular expressions tutorial ↪
  - The differences between functions in `java.util.regex.Matcher`:
    - `matches()`: The match must be anchored to both input-start and -end
    - `find()`: A match may be anywhere in the input string (substrings)
    - `lookingAt()`: The match must be anchored to input-start only
    - (For anchors in general, see the section “Anchors”)
  - The only `java.lang.String` functions that accept regular expressions: `matches(s)`, `replaceAll(s,s)`, `replaceFirst(s,s)`, `split(s)`, `split(s,i)`
  - *An (opinionated and) detailed discussion of the disadvantages of and missing features in `java.util.regex`
- .NET
  - How to read a .NET regex with look-ahead, look-behind, capturing groups and back-references mixed together?
- Official documentation:
  - Boost regex engine: General syntax, Perl syntax (used by TextPad, Sublime Text, UltraEdit, …???)
  - JavaScript general info and RegExp object
  - .NET
  - MySQL
  - Oracle
  - Perl5 version 18.2
  - PHP: pattern syntax, `preg_match`
  - Python: Regular expression operations, `search` vs `match`, how-to
  - Rust: crate `regex`, struct `regex::Regex`
  - Splunk: regex terminology and syntax and regex command
  - Tcl: regex syntax, manpage, `regexp` command
  - Visual Studio Find and Replace
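The `matches()` / `lookingAt()` / `find()` distinction in Java has a close counterpart in Python's `fullmatch` / `match` / `search`, which is also what the `search` vs `match` link is about. A minimal sketch, with a hypothetical `classify` helper:

```python
import re

pattern = re.compile(r"\d+")

def classify(s):
    """Return (fullmatch, match, search) results as booleans."""
    return (
        pattern.fullmatch(s) is not None,  # whole string must match
        pattern.match(s) is not None,      # must match at the start only
        pattern.search(s) is not None,     # may match anywhere
    )

print(classify("123"))     # (True, True, True)
print(classify("123abc"))  # (False, True, True)
print(classify("abc123"))  # (False, False, True)
print(classify("abc"))     # (False, False, False)
```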
General information
(Links marked with * are non-Stack Overflow links.)
- Other general documentation resources: Learning Regular Expressions, *Regular-expressions.info, *Wikipedia entry, *RexEgg, Open-Directory Project
- DFA versus NFA
- Generating Strings matching regex
- Books: Jeffrey Friedl’s Mastering Regular Expressions
- When to not use regular expressions:
  - Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems. (blog post written by Stack Overflow’s founder)*
  - Do not use regex to parse HTML:
    - Don’t. Please, just don’t
    - Well, maybe…if you’re really determined (other answers in this question are also good)
Examples of regexes that can cause a regex engine to fail
- Why does this regular expression kill the Java regex engine?
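The usual culprit is nested quantifiers, which can make a backtracking engine try a very large number of ways to split the same text. The sketch below uses a textbook pattern, not the one from the linked question, and keeps the input short so it finishes quickly. The `timed` helper is hypothetical and the timings are printed without any claimed values.

```python
import re
import time

# Nested quantifiers: (a+)+ can split a run of 'a's in many different ways,
# and a backtracking engine may try them all when the final 'b' is missing.
risky = re.compile(r"(a+)+b")
# Same set of matching strings without the nesting.
safe = re.compile(r"a+b")

subject = "a" * 18  # kept short on purpose; longer input can take far longer

def timed(pattern, s):
    """Run pattern.search(s) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = pattern.search(s)
    return result, time.perf_counter() - start

risky_result, risky_seconds = timed(risky, subject)
safe_result, safe_seconds = timed(safe, subject)

print(risky_result, safe_result)  # None None
print(risky_seconds, safe_seconds)
```

Rewriting the pattern so that each piece of input can be consumed in only one way is generally the first thing to try.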
Tools: Testers and Explainers
(This section contains non-Stack Overflow links.)
- Online (* includes replacement tester, + includes split tester):
  - Debuggex (Also has a repository of useful regexes) javascript, python, pcre
  - *Regular Expressions 101 php, pcre, python, javascript, java
  - Regex Pal, regular-expressions.info javascript
  - Rubular ruby
  - RegExr
  - Regex Hero dotnet
  - *+ regexstorm.net .net
  - *RegexPlanet: Java java, Go go, Haskell haskell, JavaScript javascript, .NET dotnet, Perl perl, php PCRE php, Python python, Ruby ruby, XRegExp xregexp
  - freeformatter.com xregexp
  - *+ regex.larsolavtorvik.com php PCRE and POSIX, javascript
- Offline:
  - Microsoft Windows: RegexBuddy (analysis), RegexMagic (creation), Expresso (analysis, creation, free)
