PHP Regular Expressions Cheat Sheet

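^ - Start of the string (or of each line when the m modifier is set)
$ - End of the string (or of each line when the m modifier is set)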

. - Any character except a line break (unless the s modifier is set)
[...] - Any one character from the listed set. Inside the square brackets other operators lose their special meaning, but escape sequences such as \d or \t can still be used. A hyphen specifies a range from the first character to the last. For example, [a-f] means any one of the letters a, b, c, d, e, f.
[^...] - Any one character that is not in the listed set. The same rules apply inside the brackets. For example, [^0-9] means any character except 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
\# - The literal character # that follows the backslash, where # is any character other than a letter or a digit. For example, \\ means the character \, \. means the character . (period), \$ means the character $, etc.
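
A minimal sketch of character sets and escaping with preg_match. The sample strings are hypothetical and chosen only for illustration.

```php
<?php
// [a-f] matches exactly one letter from a to f.
$inRange  = preg_match('/^[a-f]$/', 'c');       // 1
$outRange = preg_match('/^[a-f]$/', 'g');       // 0

// [^0-9] matches any character that is not a digit.
$noDigits = preg_match('/^[^0-9]+$/', 'abc');   // 1
$hasDigit = preg_match('/^[^0-9]+$/', 'ab1');   // 0

// \. matches a literal period instead of "any character".
$dotHit   = preg_match('/^3\.14$/', '3.14');    // 1
$dotMiss  = preg_match('/^3\.14$/', '3x14');    // 0
```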

\b - Word boundary (the start or the end of a word)
\B - Not a word boundary

[[:alnum:]] - alphanumeric characters
[[:digit:]] - decimal digits
[[:xdigit:]] - hexadecimal digits
[[:alpha:]] - alphabetic characters
[[:upper:]] - uppercase letters
[[:lower:]] - lowercase letters
[[:punct:]] - punctuation marks
[[:space:]] - whitespace characters
[[:blank:]] - tab and space characters
[[:print:]] - printable characters, including the space
[[:cntrl:]] - control characters
[[:graph:]] - printable characters, except the space
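
The POSIX classes above only work inside a bracket expression. A short sketch, assuming plain ASCII input; the sample strings are made up for illustration.

```php
<?php
$isAlnum = preg_match('/^[[:alnum:]]+$/', 'abc123');   // 1
$isHex   = preg_match('/^[[:xdigit:]]+$/', '1aF9');    // 1
$notHex  = preg_match('/^[[:xdigit:]]+$/', '1aG9');    // 0: G is not a hex digit

// Remove punctuation marks from a string.
$noPunct = preg_replace('/[[:punct:]]/', '', 'Hello, world!');

// Several classes and literal characters can share one pair of brackets.
$isConst = preg_match('/^[[:upper:][:digit:]_]+$/', 'MAX_SIZE_2'); // 1
```
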
\xNN - the character with hexadecimal code NN (\x20 - space, \x4A - J, \x6A - j, etc.)
\t - tab
\n - newline (line feed)
\r - carriage return
\f - form feed
\v - vertical tab (in current PCRE versions, \v matches any vertical whitespace character)
\a - bell (alarm)
\e - escape
\033 - character given by its octal code
\x1A - character given by its hexadecimal code
\cX - control character (for example, \cC is Ctrl-C)
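
A brief sketch of the character escapes in use. Patterns are written in single quotes so that PCRE, not PHP, interprets the backslashes; the subjects are hypothetical.

```php
<?php
$space  = preg_match('/^\x20$/', ' ');       // 1: \x20 is a space
$upperJ = preg_match('/^\x4A$/', 'J');       // 1: \x4A is J
$octal  = preg_match('/^\033$/', "\x1B");    // 1: octal 033 is the escape character
$escape = preg_match('/^\e$/', "\x1B");      // 1: \e is the same character
$ctrlC  = preg_match('/^\cC$/', "\x03");     // 1: Ctrl-C has code 3

// Split a tab-separated line into fields.
$fields = preg_split('/\t/', "a\tb\tc");
```
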
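\l - lowercase the next character
\u - uppercase the next character
\L - lowercase all characters up to \E
\U - uppercase all characters up to \E
\E - end of case modification (\L, \U) or of quoting (\Q)
\Q - treat the following characters as literals, not metacharacters, up to \E
Note: \l, \u, \L and \U are Perl features. PCRE, and therefore PHP's preg_* functions, does not support them; \Q...\E does work in PHP.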
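
A small sketch of \Q...\E quoting, with preg_quote shown as a possible alternative when the literal text comes from a variable. The sample text is hypothetical.

```php
<?php
// Between \Q and \E the + is a literal character, not a quantifier.
$quoted   = preg_match('/^\Q1+1=2\E$/', '1+1=2');   // 1

// Without quoting, "1+" means "one or more 1", so the same text fails.
$unquoted = preg_match('/^1+1=2$/', '1+1=2');       // 0

// preg_quote() escapes every metacharacter in a string for you.
$escaped  = preg_match('/^' . preg_quote('1+1=2', '/') . '$/', '1+1=2'); // 1
```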

\w - one alphanumeric character or '_'
\W - one character that is not alphanumeric or '_'
\s - one whitespace character
\S - one non-whitespace character
\d - one digit
\D - one non-digit
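
The shorthand classes can be sketched as follows; the input strings are invented for the example.

```php
<?php
// Collect every run of digits.
preg_match_all('/\d+/', 'Order 66 shipped 12 items', $found);
$numbers = $found[0];

// Split on any run of whitespace (spaces, tabs, line breaks).
$words = preg_split('/\s+/', "one  two\tthree");

// Drop everything that is not a letter, a digit or '_'.
$cleaned = preg_replace('/\W/', '', 'user_name-01!');
```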

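\b - word boundary
\B - not a word boundary
\A - start of the subject string (not affected by the m modifier)
\Z - end of the subject string, or just before a final line break (not affected by the m modifier)

\G - the position where the previous match ended (as with Perl's m//g); in PHP, the offset at which the current match attempt starts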
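
A sketch comparing the position assertions, assuming a two-line subject made up for the example. It shows that ^ follows the m modifier while \A does not.

```php
<?php
$text = "cat\ncatalog";

$wholeWord  = preg_match_all('/\bcat\b/', $text);   // 1: only the standalone "cat"
$lineStarts = preg_match_all('/^cat/m', $text);     // 2: ^ matches at each line start
$veryStart  = preg_match_all('/\Acat/m', $text);    // 1: \A ignores the m modifier

// \B requires that there is no word boundary on either side.
$inside     = preg_match_all('/\Bcat\B/', 'concatenate'); // 1

// \Z matches at the end, or just before a final line break.
$beforeNl   = preg_match('/cat\Z/', "a cat\n");     // 1
```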

(...) - Group the enclosed pattern and capture the text it matches
| - Alternation: the pattern on the left or the pattern on the right (logical "OR")

* - Zero or more times
+ - One or more times
? - Zero or one time
{n} - Exactly n times
{n,} - n or more times
{n,m} - From n to m times
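
Groups, alternation and quantifiers combined in one sketch. The date format and the file name are assumptions made for illustration.

```php
<?php
// Three capturing groups with exact repeat counts.
preg_match('/^(\d{4})-(\d{2})-(\d{2})$/', '2024-03-15', $date);

// Alternation inside a group; e? makes the letter e optional.
$isImage = preg_match('/\.(png|jpe?g|gif)$/', 'photo.jpeg'); // 1

// {2,3} allows two or three repeats, {2,} allows two or more.
$tooMany = preg_match('/^a{2,3}$/', 'aaaa');  // 0
$atLeast = preg_match('/^a{2,}$/', 'aaaa');   // 1
```
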
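(?<=pattern) - Lookbehind: the text just before the current position must match the pattern. In PCRE the lookbehind pattern generally has to match a fixed number of characters.
(?<!pattern) - Negative lookbehind: the text just before the current position must not match the pattern.
(?=pattern) - Lookahead: the text that follows must match the pattern.
(?!pattern) - Negative lookahead: the text that follows must not match the pattern.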
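
A sketch of the four lookaround forms in PCRE syntax. None of them consumes text, so only the digits end up in the result. The price strings are hypothetical.

```php
<?php
// Lookahead: digits that are followed by " USD".
preg_match('/\d+(?= USD)/', '100 USD or 200 EUR', $usd);

// Negative lookahead: whole numbers that are not followed by " USD".
preg_match_all('/\b\d+\b(?! USD)/', '100 USD or 200 EUR', $notUsd);

// Lookbehind: digits that are preceded by a dollar sign.
preg_match('/(?<=\$)\d+/', 'Price: $42', $afterDollar);

// Negative lookbehind: numbers that are not preceded by a dollar sign.
preg_match_all('/(?<!\$)\b\d+/', 'Buy 3 for $42', $noDollar);
```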

i - case-insensitive matching.
m - multiline mode: ^ and $ also match at the start and end of each line.
s - single-line mode: . also matches line breaks.
x - extended syntax: whitespace in the pattern is ignored and # starts a comment.
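
The four modifiers above, sketched one at a time. The phone-number layout in the last pattern is a made-up format used only to show comments in x mode.

```php
<?php
$ignoreCase = preg_match('/hello/i', 'HeLLo');               // 1

// Without s the dot stops at a line break; with s it does not.
$dotPlain   = preg_match('/a.b/', "a\nb");                   // 0
$dotAll     = preg_match('/a.b/s', "a\nb");                  // 1

// With m, ^ and $ match around every line.
$lineCount  = preg_match_all('/^\w+$/m', "one\ntwo\nthree"); // 3

// With x, whitespace is ignored and # starts a comment.
$extended = preg_match('/
    ^ \d{3}   # first block of digits
    -
    \d{4} $   # second block of digits
/x', '555-1234');                                            // 1
```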

e - (preg_replace only) after performing the standard substitutions in the replacement string, evaluates it as PHP code and uses the result as the replacement. This modifier was deprecated in PHP 5.5.0 and removed in PHP 7.0.0; use preg_replace_callback instead.
A - the pattern matches only if it matches at the very start of the subject string.
D - the $ metacharacter in the pattern matches only at the very end of the subject string. Without this modifier, $ also matches immediately before the final character if that character is a line break (but not before any other line breaks). This modifier is ignored if the m modifier is set. Perl has no equivalent modifier.
S - when a pattern is going to be used many times, extra analysis of the pattern is performed to speed up matching. At present this is useful only for non-anchored patterns that do not start with a single fixed character. As of PHP 7.3.0 this modifier has no effect.
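
A sketch of what can stand in for the removed e modifier, followed by the effect of D and A. The callback and the sample strings are illustrative assumptions, not part of the original text.

```php
<?php
// Instead of /e: compute each replacement in a callback.
$doubled = preg_replace_callback(
    '/\d+/',
    function (array $m): string {
        return (string) ((int) $m[0] * 2);   // double every number found
    },
    '3 apples and 10 pears'
);

// Without D, $ also matches just before a final line break.
$loose  = preg_match('/^\d+$/', "123\n");    // 1
$strict = preg_match('/^\d+$/D', "123\n");   // 0

// With A, the match has to begin at the start of the subject.
$anchored   = preg_match('/\d+/A', 'abc 123'); // 0
$unanchored = preg_match('/\d+/', 'abc 123');  // 1
```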

U - inverts the greediness of quantifiers: they are not greedy by default and become greedy when followed by '?'. This feature is not compatible with Perl. The U modifier can also be set inside a pattern with (?U).
X - enables additional PCRE functionality that is not compatible with Perl: any backslash in a pattern that is followed by a letter with no special meaning causes an error, because such combinations are reserved for future expansion. By default, as in Perl, a backslash followed by a letter with no special meaning is treated as a literal. At present no other features are controlled by this modifier.
u - enables additional PCRE functionality that is not compatible with Perl: the pattern and subject strings are treated as UTF-8. The u modifier is available in PHP 4.1.0 and higher on Unix platforms, and in PHP 4.2.3 and higher on Windows.
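
A sketch of the U and u modifiers. The HTML fragment is a made-up sample, and the accented letter is written as raw UTF-8 bytes so the example does not depend on the file's encoding.

```php
<?php
// Greedy by default: .+ runs to the last '>'.
preg_match('/<.+>/', '<b>bold</b>', $greedy);

// With U the same quantifier stops at the first '>'.
preg_match('/<.+>/U', '<b>bold</b>', $ungreedy);

// Without U, a trailing ? makes a single quantifier lazy.
preg_match('/<.+?>/', '<b>bold</b>', $lazy);

// "\xC3\xA9" is the letter e with an acute accent, two bytes in UTF-8.
$asBytes = preg_match('/^.$/', "\xC3\xA9");   // 0: . matches one byte
$asUtf8  = preg_match('/^.$/u', "\xC3\xA9");  // 1: . matches one character
```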

(?# comment) - a comment in the body of the pattern.
(?:pattern) - groups like '()', but without capturing (no backreference is created)
(?=pattern) - lookahead. For example, /\w+(?=\t)/ matches a word followed by a tab, but the '\t' character is not included in the result.
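
A sketch of the three constructs above; the names and strings are hypothetical.

```php
<?php
// (?:...) groups the alternatives but adds no entry to the result array.
preg_match('/(?:Mr|Ms)\. (\w+)/', 'Ms. Lovelace', $name);

// The lookahead checks for the tab without including it in the match.
preg_match('/\w+(?=\t)/', "name\tvalue", $word);

// (?# ...) is skipped by the regex engine.
$withComment = preg_match('/^\d+(?# the number)px$/', '12px'); // 1
```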

\NUMBER - Backreference inside the regexp to one of its own capturing groups, where NUMBER is the number of the desired group (counted by opening parenthesis). A backreference matches the exact text the group captured, not the group's pattern; if the group itself is repeated, the reference uses the text from its last repetition.
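
A sketch of backreferences with preg_match; the sentences are made up for the example.

```php
<?php
// \1 must repeat the exact word captured by the first group.
$found = preg_match('/\b(\w+) \1\b/', 'it is is wrong', $dup);

// The group may contain a quantifier: \1 repeats whatever a+ matched.
$even = preg_match('/^(a+)\1$/', 'aaaa');  // 1: "aa" followed by "aa"
$odd  = preg_match('/^(a+)\1$/', 'aaa');   // 0: cannot be split into two equal halves
```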