Introduction
Regular expressions (regex) are a powerful tool used to match patterns in text. They are used in many programming languages and applications to search, edit, and manipulate text. This article will explain what a regex is and how to interpret a regex. It will also provide an example of a regex and explain what it means. Finally, it will provide some tips on how to use regex effectively.
Solution
^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$
This regex is a simple check for the general shape of an email address. It requires the address to start with one or more characters from the set of alphanumeric characters, underscores, periods, and hyphens, followed by an @ symbol, followed by one or more alphanumeric characters or hyphens, followed by a period, followed by one or more characters from the set of alphanumeric characters, hyphens, and periods.
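As a minimal sketch of how the pattern above behaves, it can be tried in Python's `re` module. The helper name `looks_like_email` and the sample addresses are assumptions made for illustration; they are not part of the original answer.

```python
import re

# Pattern copied verbatim from the section above.
EMAIL_RE = re.compile(r"^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$")


def looks_like_email(text: str) -> bool:
    """Return True if text matches the pattern (hypothetical helper)."""
    return EMAIL_RE.match(text) is not None


print(looks_like_email("john.doe@example.com"))  # True
print(looks_like_email("john.doe@example"))      # False: no period after the domain part
print(looks_like_email("john doe@example.com"))  # False: space is not in the allowed set
```

The pattern only checks the overall shape, so it will accept some strings that are not deliverable addresses.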
The Stack Overflow Regular Expressions FAQ
See also a lot of general hints and useful links at the regex tag details page.
Online tutorials
Quantifiers
- Zero-or-more: * :greedy, *? :reluctant, *+ :possessive
- One-or-more: + :greedy, +? :reluctant, ++ :possessive
- ? :optional (zero-or-one)
- Min/max ranges (all inclusive): {n,m} :between n & m, {n,} :n-or-more, {n} :exactly n
- Differences between greedy, reluctant (a.k.a. “lazy”, “ungreedy”) and possessive quantifier:
  - Greedy vs. Reluctant vs. Possessive Quantifiers
  - In-depth discussion on the differences between greedy versus non-greedy
  - What’s the difference between {n} and {n}?
  - Can someone explain Possessive Quantifiers to me? php, perl, java, ruby
  - Emulating possessive quantifiers .net
  - Non-Stack Overflow references: From Oracle, regular-expressions.info
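The difference between greedy and reluctant quantifiers is easiest to see side by side. The following is a small sketch in Python; the sample text is made up for illustration. Possessive quantifiers are left out because Python's `re` only accepts them from version 3.11 onward.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: + takes as much as it can, so the match runs to the last ">".
greedy = re.findall(r"<.+>", text)

# Reluctant: +? takes as little as it can, so each tag is matched separately.
reluctant = re.findall(r"<.+?>", text)

print(greedy)     # ['<b>bold</b> and <i>italic</i>']
print(reluctant)  # ['<b>', '</b>', '<i>', '</i>']
```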
Character Classes
- What is the difference between square brackets and parentheses?
- [...] : any one character, [^...] : negated/any character but
- [^] matches any one character including newlines javascript
- [\w-[\d]] / [a-z-[qz]] : set subtraction .net, xml-schema, xpath, JGSoft
- [\w&&[^\d]] : set intersection java, ruby 1.9+
- [[:alpha:]] :POSIX character classes
- [[:<:]] and [[:>:]] Word boundaries
- Why do [^\\D2], [^[^0-9]2], [^2[^0-9]] get different results in Java? java
- Shorthand:
  - Digit: \d :digit, \D :non-digit
  - Word character (Letter, digit, underscore): \w :word character, \W :non-word character
  - Whitespace: \s :whitespace, \S :non-whitespace
- Unicode categories (\p{L}, \P{L}, etc.)
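A short sketch of bracket classes and shorthand classes in Python follows; the sample strings are invented for illustration. Set subtraction, set intersection, POSIX classes and \p{...} are not shown because Python's built-in `re` module does not support them.

```python
import re

sample = "Room 42, floor 7_b!"

digits = re.findall(r"\d+", sample)       # runs of digits
words = re.findall(r"\w+", sample)        # runs of letters, digits, underscore
non_space = re.findall(r"\S+", sample)    # runs of non-whitespace

# Negated class: any character that is not a vowel, non-word char, digit or "_",
# which leaves the consonant letters.
consonants = re.findall(r"[^aeiou\W\d_]", "regex")

print(digits)      # ['42', '7']
print(words)       # ['Room', '42', 'floor', '7_b']
print(non_space)   # ['Room', '42,', 'floor', '7_b!']
print(consonants)  # ['r', 'g', 'x']
```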
Escape Sequences
- Horizontal whitespace: \h :space-or-tab, \t :tab
- Newlines:
  - \r, \n :carriage return and line feed
  - \R :generic newline php java-8
- Negated whitespace sequences: \H :Non horizontal whitespace character, \V :Non vertical whitespace character, \N :Non line feed character pcre php5 java-8
- Other: \v :vertical tab, \e :the escape character
Anchors
anchor | matches | flavors |
---|---|---|
^ | Start of string | Common* |
^ | Start of line | Common (m) |
$ | End of line | Common (m) |
$ | End of text | Common* except javascript |
$ | Very end of string | javascript*, php (D) |
\A | Start of string | Common except javascript |
\Z | End of text | Common except javascript python |
\Z | Very end of string | python |
\z | Very end of string | Common except javascript python |
\b | Word boundary | Common |
\B | Not a word boundary | Common |
\G | End of previous match | Common except javascript, python |
Term | Definition |
---|---|
Start of string | At the very start of the string. |
Start of line | At the very start of the string, and after a non-terminal line terminator. |
Very end of string | At the very end of the string. |
End of text | At the very end of the string, and at a terminal line terminator. |
End of line | At the very end of the string, and at a line terminator. |
Word boundary | At a word character not preceded by a word character, and at a non-word character not preceded by a non-word character. |
End of previous match | At a previously set position, usually where a previous match ended. At the very start of the string if no position was set. |
“Common” refers to the following: icu java javascript .net objective-c pcre perl php python swift ruby
* Default. (m) Multi-line mode. (D) Dollar end only mode.
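The anchor behaviour in the tables can be checked quickly in one flavor. This sketch uses Python, so it only illustrates the rows that apply to Python (for example, \Z meaning "very end of string"); the sample text is an assumption.

```python
import re

text = "first line\nsecond line\n"

# ^ matches only at the start of the string by default ...
start_default = re.findall(r"^\w+", text)
# ... and at the start of every line in multi-line mode.
start_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# $ is "end of text": it also matches just before a terminal newline.
dollar_matches = re.search(r"line$", text) is not None
# \Z in Python is "very end of string", so the trailing newline blocks it.
z_matches = re.search(r"line\Z", text) is not None

# \b keeps "line" from matching inside a longer word.
whole_words = re.findall(r"\bline\b", "line inline lines line")

print(start_default)    # ['first']
print(start_multiline)  # ['first', 'second']
print(dollar_matches)   # True
print(z_matches)        # False
print(whole_words)      # ['line', 'line']
```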
Groups
- (...) :capture group, (?:) :non-capture group
  - Why is my repeating capturing group only capturing the last match?
- \1 :backreference and capture-group reference, $1 :capture group reference
  - What’s the meaning of a number after a backslash in a regular expression?
  - \g<1>123 :How to follow a numbered capture group, such as \1, with a number?: python
- What does a subpattern (?i:regex) mean?
- What does the ‘P’ in (?P<group_name>regexp) mean?
- (?>) :atomic group or independent group, (?|) :branch reset
  - Equivalent of branch reset in .NET/C# .net
- Named capture groups:
  - General named capturing group reference at regular-expressions.info
  - java: (?<groupname>regex) : Overview and naming rules (Non-Stack Overflow links)
  - Other languages: (?P<groupname>regex) python, (?<groupname>regex) .net, (?<groupname>regex) perl, (?P<groupname>regex) and (?<groupname>regex) php
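Several of the group features listed above can be sketched in Python. The sample strings and variable names are invented for illustration; atomic groups and branch reset are omitted because they are not generally available in Python's `re`.

```python
import re

# Capture group plus backreference: find a doubled word.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test")
doubled_word = doubled.group(1)

# A repeated capture group keeps only its last iteration.
last_only = re.match(r"(\d)+", "123").group(1)

# Named groups use the (?P<name>...) syntax in Python.
date = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2024-05")
parts = date.groupdict()

# \g<1> lets a group reference be followed by a literal digit in a replacement.
appended = re.sub(r"(\d+)", r"\g<1>0", "7 apples")

print(doubled_word)  # is
print(last_only)     # 3
print(parts)         # {'year': '2024', 'month': '05'}
print(appended)      # 70 apples
```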
Lookarounds
- Lookaheads: (?=...) :positive, (?!...) :negative
- Lookbehinds: (?<=...) :positive, (?<!...) :negative
- Lookbehind limits in:
  - Lookbehinds need to be constant-length php, perl, python, ruby
  - Lookarounds of limited length {0,n} java
  - Variable length lookbehinds are allowed .net
- Lookbehind alternatives:
  - Using \K php, perl (Flavors that support \K)
  - Alternative regex module for Python python
  - The hacky way
  - JavaScript negative lookbehind equivalents External link
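The four lookaround forms can be compared on one input. This is a sketch in Python with a made-up sample string; note that the lookbehinds here are fixed-length, as Python's `re` requires.

```python
import re

prices = "cost: $30, tax: $5, total: 35 USD"

# Positive lookbehind: digits directly preceded by "$".
after_dollar = re.findall(r"(?<=\$)\d+", prices)

# Positive lookahead: digits directly followed by " USD".
before_usd = re.findall(r"\d+(?= USD)", prices)

# Negative lookbehind: digits not preceded by "$" or by another digit.
no_dollar = re.findall(r"(?<![\$\d])\d+", prices)

# Negative lookahead: digit runs not followed by another digit or by " USD".
not_usd = re.findall(r"\d+(?!\d| USD)", prices)

print(after_dollar)  # ['30', '5']
print(before_usd)    # ['35']
print(no_dollar)     # ['35']
print(not_usd)       # ['30', '5']
```

The extra `\d` inside the negative lookarounds stops the engine from backtracking into the middle of a number and reporting a partial match.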
Modifiers
flag | modifier | flavors |
---|---|---|
a | ASCII | python |
c | current position | perl |
e | expression | php perl |
g | global | most |
i | case-insensitive | most |
m | multiline | php perl python javascript .net java |
m | (non)multiline | ruby |
o | once | perl ruby |
S | study | php |
s | single line | ruby |
U | ungreedy | php r |
u | unicode | most |
x | whitespace-extended | most |
y | sticky ↪ | javascript |
- How to convert preg_replace e to preg_replace_callback?
- What are inline modifiers?
- What is ‘?-mix’ in a Ruby Regular Expression
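In Python, modifiers are passed as flag constants or written inline. The sketch below covers the i, m, s and x modifiers; the sample text is an assumption, and the scoped inline form `(?i:...)` needs Python 3.6 or later.

```python
import re

text = "Hello\nhello\nHELLO"

# i: case-insensitive matching.
any_case = re.findall(r"hello", text, re.IGNORECASE)

# i + m: ^ and $ now match at every line, not just the whole string.
whole_lines = re.findall(r"^hello$", text, re.I | re.M)

# Inline modifier scoped to a group: only the first letter ignores case.
first_letter = re.findall(r"(?i:h)ello", text)

# s (DOTALL in Python): "." also matches a newline.
across_lines = re.search(r"Hello.hello", text, re.DOTALL) is not None

# x (VERBOSE in Python): whitespace and comments inside the pattern are ignored.
phone = re.compile(r"""
    \d{3}   # prefix
    -       # separator
    \d{4}   # line number
""", re.VERBOSE)
number = phone.search("call 555-1234").group()

print(any_case)      # ['Hello', 'hello', 'HELLO']
print(whole_lines)   # ['Hello', 'hello', 'HELLO']
print(first_letter)  # ['Hello', 'hello']
print(across_lines)  # True
print(number)        # 555-1234
```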
Other:
- | :alternation (OR) operator, . :any character, [.] :literal dot character
- What special characters must be escaped?
- Control verbs (php and perl): (*PRUNE), (*SKIP), (*FAIL) and (*F)
  - php only: (*BSR_ANYCRLF)
- Recursion (php and perl): (?R), (?0) and (?1), (?-1), (?&groupname)
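Alternation, the dot, and escaping can be sketched in Python as follows. The sample strings are invented; `re.escape` is the standard-library helper for escaping special characters in literal text. Control verbs and recursion are not shown because Python's `re` does not provide them.

```python
import re

# | tries each alternative in turn.
pets = re.findall(r"cat|dog", "cat, dog, cow")

# . matches any character (except newline by default); [.] matches a literal dot.
any_char = re.findall(r"a.c", "abc a.c a-c")
literal_dot = re.findall(r"a[.]c", "abc a.c a-c")

# re.escape builds a pattern that matches the text literally,
# so "+" and "?" lose their special meaning.
literal = re.escape("1+1=2?")
found = re.search(literal, "is 1+1=2? yes") is not None

print(pets)         # ['cat', 'dog']
print(any_char)     # ['abc', 'a.c', 'a-c']
print(literal_dot)  # ['a.c']
print(found)        # True
```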
Common Tasks
- Get a string between two curly braces: {...}
- Match (or replace) a pattern except in situations s1, s2, s3…
- How do I find all YouTube video ids in a string using a regex?
- Validation:
  - Internet: email addresses, URLs (host/port: regex and non-regex alternatives), passwords
  - Numeric: a number, min-max ranges (such as 1-31), phone numbers, date
- Parsing HTML with regex: See “General Information > When not to use Regex”
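Two of the tasks above, extracting text between curly braces and validating a 1-31 range, can be sketched in Python. These are illustrative patterns written for this example, not necessarily the ones given in the linked answers.

```python
import re

# Text between curly braces: capture everything that is not a brace.
template = "Hello {name}, you are {age}"
fields = re.findall(r"\{([^{}]*)\}", template)

# Numeric range 1-31 without leading zeros: 1-9, 10-29, or 30-31.
DAY_RE = re.compile(r"[1-9]|[12]\d|3[01]")


def is_day(text: str) -> bool:
    """Hypothetical helper: True if text is an integer from 1 to 31."""
    return DAY_RE.fullmatch(text) is not None


print(fields)  # ['name', 'age']
print([d for d in ["0", "1", "10", "31", "32", "01"] if is_day(d)])  # ['1', '10', '31']
```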
Advanced Regex-Fu
- Strings and numbers:
  - Regular expression to match a line that doesn’t contain a word
  - How does this PCRE pattern detect palindromes?
  - Match strings whose length is a fourth power
  - How does this regex find triangular numbers?
  - How to determine if a number is a prime with regex?
  - How to match the middle character in a string with regex?
- Other:
  - How can we match a^n b^n?
  - Match nested brackets
    - Using a recursive pattern php, perl
    - Using balancing groups .net
  - “Vertical” regex matching in an ASCII “image”
  - List of highly up-voted regex questions on Code Golf
  - How to make two quantifiers repeat the same number of times?
  - An impossible-to-match regular expression: (?!a)a
  - Match/delete/replace this except in contexts A, B and C
  - Match nested brackets with regex without using recursion or balancing groups?
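Three of these tricks fit in a few lines of Python. The patterns below are the commonly cited forms of each trick and may differ from the linked answers; the word "hede" and the helper names are assumptions. The prime test works on a unary string of 1s and is only practical for small numbers.

```python
import re

# A line that does not contain a word: a negative lookahead at every position.
no_hede = re.compile(r"^(?:(?!hede).)*$")
lines = ["keep this", "drop the word hede", "keep that"]
kept = [line for line in lines if no_hede.match(line)]

# The impossible pattern from the list: "not followed by a" and then "a".
never = re.search(r"(?!a)a", "aaa")


def is_prime(n: int) -> bool:
    """Hypothetical helper: n is prime if "1" * n is NOT 0, 1, or a repeated block."""
    return re.fullmatch(r"1?|(11+?)\1+", "1" * n) is None


primes = [n for n in range(20) if is_prime(n)]

print(kept)    # ['keep this', 'keep that']
print(never)   # None
print(primes)  # [2, 3, 5, 7, 11, 13, 17, 19]
```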
Flavor-Specific Information
(Except for those marked with *, this section contains non-Stack Overflow links.)
- Java
  - Official documentation: Pattern Javadoc ↪, Oracle’s regular expressions tutorial ↪
  - The differences between functions in java.util.regex.Matcher:
    - matches(): The match must be anchored to both input-start and -end
    - find(): A match may be anywhere in the input string (substrings)
    - lookingAt(): The match must be anchored to input-start only
    - (For anchors in general, see the section “Anchors”)
  - The only java.lang.String functions that accept regular expressions: matches(s), replaceAll(s,s), replaceFirst(s,s), split(s), split(s,i)
  - *An (opinionated and) detailed discussion of the disadvantages of and missing features in java.util.regex
- .NET
  - How to read a .NET regex with look-ahead, look-behind, capturing groups and back-references mixed together?
- Official documentation:
  - Boost regex engine: General syntax, Perl syntax (used by TextPad, Sublime Text, UltraEdit, …???)
  - JavaScript general info and RegExp object
  - .NET MySQL Oracle Perl5 version 18.2
  - PHP: pattern syntax, preg_match
  - Python: Regular expression operations, search vs match, how-to
  - Rust: crate regex, struct regex::Regex
  - Splunk: regex terminology and syntax and regex command
  - Tcl: regex syntax, manpage, regexp command
  - Visual Studio Find and Replace
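The distinction drawn above for Java's matches(), find() and lookingAt() has a close counterpart in Python's `re` module, which is also what the "search vs match" entry is about. A minimal sketch, with made-up input strings:

```python
import re

number = re.compile(r"\d+")

# search: the match may be anywhere in the input (comparable to Java's find()).
anywhere = number.search("abc123")

# match: anchored to the start of the input only (comparable to lookingAt()).
at_start_fail = number.match("abc123")
at_start_ok = number.match("123abc")

# fullmatch: anchored to both start and end (comparable to matches()).
whole_fail = number.fullmatch("123abc")
whole_ok = number.fullmatch("123")

print(anywhere.group())     # 123
print(at_start_fail)        # None
print(at_start_ok.group())  # 123
print(whole_fail)           # None
print(whole_ok.group())     # 123
```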
General information
(Links marked with * are non-Stack Overflow links.)
- Other general documentation resources: Learning Regular Expressions, *Regular-expressions.info, *Wikipedia entry, *RexEgg, Open-Directory Project
- DFA versus NFA
- Generating Strings matching regex
- Books: Jeffrey Friedl’s Mastering Regular Expressions
- When to not use regular expressions:
  - Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems. (blog post written by Stack Overflow’s founder)*
  - Do not use regex to parse HTML:
    - Don’t. Please, just don’t
    - Well, maybe…if you’re really determined (other answers in this question are also good)
- Examples of regex that can cause regex engine to fail
  - Why does this regular expression kill the Java regex engine?
Tools: Testers and Explainers
(This section contains non-Stack Overflow links.)
- Online (* includes replacement tester, + includes split tester):
  - Debuggex (Also has a repository of useful regexes) javascript, python, pcre
  - *Regular Expressions 101 php, pcre, python, javascript, java
  - Regex Pal, regular-expressions.info javascript
  - Rubular ruby, RegExr, Regex Hero dotnet
  - *+ regexstorm.net .net
  - *RegexPlanet: Java java, Go go, Haskell haskell, JavaScript javascript, .NET dotnet, Perl perl php PCRE php, Python python, Ruby ruby, XRegExp xregexp
  - freeformatter.com xregexp
  - *+ regex.larsolavtorvik.com php PCRE and POSIX, javascript
- Offline:
  - Microsoft Windows: RegexBuddy (analysis), RegexMagic (creation), Expresso (analysis, creation, free)
Regular expressions, or regex for short, are a powerful tool used to find patterns in text. They are used in many programming languages, including JavaScript, Python, and Perl. Regex can be used to search for specific words or phrases, validate user input, and even extract information from a string. But what does a regex actually mean?
A regex is a sequence of characters that defines a search pattern, and a string of text either matches that pattern or does not. The pattern is composed of literal characters and special symbols that stand for different elements of the text. For example, the character “.” matches any single character, while the character “*” matches zero or more repetitions of the element that precedes it.
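The two symbols described above can be tried directly. This is a small Python sketch with invented sample strings; it also shows that “*” applies to a whole group when the preceding element is a group.

```python
import re

# "." stands for exactly one character of any kind.
dot = re.findall(r"c.t", "cat cut c t ct")

# "*" repeats the preceding element zero or more times.
candidates = ["ac", "abc", "abbbc", "adc"]
star = [s for s in candidates if re.fullmatch(r"ab*c", s)]

# When the preceding element is a group, the whole group repeats.
group_star = re.fullmatch(r"(ab)*c", "ababc") is not None

print(dot)         # ['cat', 'cut', 'c t']
print(star)        # ['ac', 'abc', 'abbbc']
print(group_star)  # True
```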
Regex is commonly used to validate user input, such as email addresses or phone numbers, to extract information from a string, such as a date or a price, and to search for a specific word or phrase.
Understanding how regex works and what each part of a pattern means will help you use it more effectively.