Basic Characters
| Expression | Explanations |
|---|
| ^ | Matches the expression to its right, at the start of a string before it experiences a line break |
| $ | Matches the expression to its left, at the end of a string before it experiences a line break |
| . | Matches any character except newline |
Quantifiers
| Expression | Explanations |
|---|
| + | Matches the expression to its left 1 or more times. |
| * | Matches the expression to its left 0 or more times. |
| ? | Matches the expression to its left 0 or 1 times |
| {p} | Matches the expression to its left p times, and not less. |
| {p, q} | Matches the expression to its left p to q times, and not less. |
| {p, } | Matches the expression to its left p or more times. |
| { , q} | Matches the expression to its left up to q times |
Character Classes
| Expression | Explanations |
|---|
| \w | Matches alphanumeric characters, that is a-z, A-Z, 0-9, and underscore(_) |
| \W | Matches non-alphanumeric characters, that is except a-z, A-Z, 0-9 and _ |
| \d | Matches digits, from 0-9. |
| \D | Matches any non-digits. |
| \s | Matches whitespace characters, which also include the \t, \n, \r, and space characters. |
| \S | Matches non-whitespace characters. |
| \A | Matches the expression to its right at the absolute start of a string whether in single or multi-line mode. |
| \Z | Matches the expression to its left at the absolute end of a string whether in single or multi-line mode. |
| \n | Matches a newline character |
| \t | Matches tab character |
| \b | Matches the word boundary (or empty string) at the start and end of a word. |
| \B | Matches where \b does not, that is, non-word boundary |
Sets
| Expression | Explanations |
|---|
| [abc] | Matches either a, b, or c. It does not match abc. |
| [a-z] | Matches any alphabet from a to z. |
| [A-Z] | Matches any alphabets in capital from A to Z |
| [a-p] | Matches a, -, or p. It matches – because \ escapes it. |
| [-z] | Matches – or z |
| [a-z0-9] | Matches characters from a to z or from 0 to 9. |
| [(+*)] | Special characters become literal inside a set, so this matches (, +, *, or ) |
| [^ab5] | Adding ^ excludes any character in the set. Here, it matches characters that are not a, b, or 5. |
| \[a\] | Matches [a] because both parentheses [ ] are escaped |
Groups
| Expression | Explanations |
|---|
| ( ) | Matches the expression inside the parentheses and groups it which we can capture as required |
| (?#…) | Read a comment |
| (?PAB) | Matches the expression AB, which can be retrieved with the group name. |
| (?:A) | Matches the expression as represented by A, but cannot be retrieved afterwards. |
| (?P=group) | Matches the expression matched by an earlier group named “group” |
Assertions
| Expression | Explanations |
|---|
| A(?=B) | This matches the expression A only if it is followed by B. (Positive look ahead assertion) |
| A(?!B) | This matches the expression A only if it is not followed by B. (Negative look ahead assertion) |
| (?<=B)A | This matches the expression A only if B is immediate to its left. (Positive look behind assertion) |
| (?<!B)A | This matches the expression A only if B is not immediately to its left. (Negative look behind assertion) |
| (?()|) | If else conditional |
Referecnes
https://cs50.harvard.edu/python/2022/notes/7/
https://www.geeksforgeeks.org/python-regex-cheat-sheet/