Based on https://docs.python.org/3/howto/regex.html
Character classes
^
: The beginning of the string (and of each line when re.MULTILINE is used)
$
: The end of the string or just before a trailing newline (and the end of each line when re.MULTILINE is used)
\A
: The beginning of a string
\Z
: The end of a string
\b
: A word boundary
\B
: Any position that is not a word boundary
.
: Any character except a newline (unless re.DOTALL is used)
\d and \D
: Any (non-)digit character
\s and \S
: Any (non-)whitespace character
\w and \W
: Any (non-)word character, where a word character is alphanumeric or an underscore
[ and ]
: Specify a character class, either as a set of characters (e.g., [abc]) or as a range (e.g., [a-c])
^
: As the first character in a character class, negates the set (e.g., [^abc] matches any character except a, b, or c).
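A minimal sketch of how a few of these anchors and classes behave with Python's re module. The sample strings and variable names are made up for illustration.

```python
import re

text = "cat 42\ndog 7"

# ^ matches only at the start of the string unless re.MULTILINE is passed.
first_word = re.findall(r"^\w+", text)
line_starts = re.findall(r"^\w+", text, re.MULTILINE)

# \d matches digits; [^abc] matches any character other than a, b, or c.
digits = re.findall(r"\d+", text)
not_abc = re.findall(r"[^abc]", "abcd")

# \b limits the match to whole words.
whole_word = re.findall(r"\bcat\b", "cat concat cats")

# . does not cross a newline unless re.DOTALL is passed.
dot_default = re.search(r"cat.*dog", text)
dot_all = re.search(r"cat.*dog", text, re.DOTALL)

print(first_word, line_starts, digits, not_abc, whole_word)
print(dot_default, dot_all is not None)
```

Raw strings (the r"..." prefix) keep Python from interpreting the backslashes before the regex engine sees them.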
Repetitions
*
: Match the preceding character or group zero or more times.
+
: Match the preceding character or group one or more times.
?
: Match the preceding character or group zero or one time.
{m,n}
: Match the preceding character or group at least m and at most n times.
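The quantifiers above can be tried out as follows. This is an illustrative sketch, and the patterns and inputs are invented for the example.

```python
import re

# * allows zero occurrences, + requires at least one.
zero_or_more = re.fullmatch(r"ab*", "a")
one_or_more = re.fullmatch(r"ab+", "a")

# ? makes the preceding character optional.
optional = re.findall(r"colou?r", "color colour")

# {2,3} accepts runs of two or three digits; \b keeps longer runs out.
bounded = re.findall(r"\b\d{2,3}\b", "1 22 333 4444")

print(zero_or_more is not None, one_or_more, optional, bounded)
```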
Operators
\
: Escape a metacharacter (e.g., \+ matches the character +).
|
: Alternation (the "or" operator): match either the expression on its left or the one on its right
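A short sketch of escaping and alternation, using made-up inputs. Because | applies to everything on either side of it, a group is often used to limit its scope.

```python
import re

# \+ matches a literal plus sign instead of acting as a quantifier.
plus_signs = re.findall(r"\+", "1+2+3")

# | tries the left alternative first, then the right.
pets = re.findall(r"cat|dog", "cat, dog, bird")

# Grouping restricts the alternation to the single letter that differs.
grays = re.findall(r"gr(?:a|e)y", "gray grey griy")

print(plus_signs, pets, grays)
```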
Groups
(...)
: Define a group
(?P<name>...)
: Define a named group
(?:...)
: Define a non-capturing group
(?=...)
: Positive lookahead assertion
(?!...)
: Negative lookahead assertion
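The group types listed above might be used along these lines. The strings and the names year, month, and files are illustrative only.

```python
import re

# Numbered groups are read back with group(1), group(2), ...
m = re.search(r"(\d{4})-(\d{2})", "released 2024-05")
year, month = m.group(1), m.group(2)

# Named groups can be read by name or collected with groupdict().
named = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "released 2024-05")
fields = named.groupdict()

# A non-capturing group is used for grouping only and is not reported by groups().
nc = re.search(r"(?:ab)+(c)", "ababc")
captured = nc.groups()

files = "main.py notes.txt util.py"
# Positive lookahead: names followed by ".py", without consuming the extension.
py_stems = re.findall(r"\w+(?=\.py\b)", files)
# Negative lookahead: file names whose extension is not "py".
non_py = re.findall(r"\b\w+\.(?!py\b)\w+", files)

print(year, month, fields, captured, py_stems, non_py)
```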
Regex Functions in Python
match
: Returns a match object if the RE matches at the beginning of the string, otherwise None
search
: Returns a match object for the first place the RE matches anywhere in the string, otherwise None
finditer
: Returns an iterator over match objects for all non-overlapping matches of the RE
sub
: Replaces all substrings that match the RE with a replacement string (or with the result of a function called on each match)
split
: Splits a string at every match to the RE
compile
: Compiles a regular expression into a pattern object that can be reused
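As a rough illustration, the functions above can be called as methods on a compiled pattern. The pattern and input string are arbitrary examples.

```python
import re

pattern = re.compile(r"\d+")  # compile once, reuse below
text = "a1b22c333"

at_start = pattern.match(text)    # the string does not start with a digit
anywhere = pattern.search(text)   # first run of digits anywhere in the string

# finditer yields match objects, so both the text and the position are available.
found = [(m.group(), m.start()) for m in pattern.finditer(text)]

masked = pattern.sub("#", text)   # replace every run of digits
parts = pattern.split(text)       # split at every run of digits

print(at_start, anywhere.group(), found, masked, parts)
```

The same operations exist as module-level functions (for example re.search(r"\d+", text)), which is convenient for one-off use.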