
PHP regular expressions cheat sheet

^ - Start of line
$ - End of line

. - Any character except a newline (unless the s modifier is set: /.../s)
[...] - Any one character from the listed set. Inside square brackets the other operators lose their special meaning, but escape sequences such as \d or \n can still be used. A hyphen specifies a range from the first character to the last. For example, [a-f] means any one of the letters a, b, c, d, e, f.
[^...] - Any one character not in the listed set. The same rules apply inside the brackets as for [...]. For example, [^0-9] means any character other than 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
\# - A backslash before a character # (other than a-z, A-Z and 0-9) makes that character literal. For example, \\ means \, \. means a literal dot (.), \$ means the symbol $, and so on.
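A minimal sketch of the character-set and escaping rules above, using preg_match(). The sample strings and variable names are invented for illustration only.

```php
<?php
// [a-f] accepts one letter from a to f; [^0-9] accepts one non-digit.
$isHexLetter = preg_match('/^[a-f]$/', 'c');      // "c" is inside the range
$isNotDigit  = preg_match('/^[^0-9]$/', '7');     // "7" is a digit, so no match

// An unescaped dot matches any character; an escaped dot matches only ".".
$looseDot  = preg_match('/^1.5$/', '1x5');
$strictDot = preg_match('/^1\.5$/', '1x5');

// \$ is a literal dollar sign, not the end-of-line anchor.
$hasPrice = preg_match('/\$[0-9]/', 'cost: $5');

var_dump($isHexLetter, $isNotDigit, $looseDot, $strictDot, $hasPrice);
```

preg_match() returns 1 on a match and 0 otherwise, so the five values printed should be 1, 0, 1, 0, 1.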

\b - Word boundary (the beginning or the end of a word)
\B - Not a word boundary
[[:alnum:]] - alphanumeric characters
[[:digit:]] - decimal digits
[[:xdigit:]] - hexadecimal digits
[[:alpha:]] - alphabetic characters
[[:upper:]] - uppercase letters
[[:lower:]] - lowercase letters
[[:punct:]] - punctuation characters
[[:space:]] - whitespace characters
[[:blank:]] - tab and space characters
[[:print:]] - printable characters, including the space
[[:cntrl:]] - control characters
[[:graph:]] - printable characters, excluding the space
\xNN - the character with hexadecimal code NN (\x20 is a space, \x4A is J, \x6A is j, and so on)
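A short sketch of the POSIX classes and \xNN codes, assuming PHP's preg_* functions. Note that a POSIX class is only valid inside a bracket expression, hence the double brackets. The sample data is hypothetical.

```php
<?php
// Strip everything that is not a decimal digit.
$digits = preg_replace('/[^[:digit:]]/', '', 'tel. 555-01-23');

// Split on any run of whitespace or punctuation.
$words = preg_split('/[[:space:][:punct:]]+/', 'a, b;c  d', -1, PREG_SPLIT_NO_EMPTY);

// Check that a string consists of hexadecimal digits only.
$isHex = preg_match('/^[[:xdigit:]]+$/', '1aF0');

// \x20 is a space and \x4A is "J".
$hasSpaceJ = preg_match('/\x20\x4A/', 'Hi Jo');

var_dump($digits, $words, $isHex, $hasSpaceJ);
```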

\t - tab character
\n - newline
\r - carriage return
\f - form feed
\v - vertical tab (in current PCRE versions, any vertical whitespace character)
\a - alarm (bell)
\e - escape
\033 - a character given by its octal code
\x1A - a character given by its hexadecimal code
\cX - control character (for example, \cM is Ctrl-M)
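\l - convert the next character to lowercase (Perl only; not supported by PCRE)
\u - convert the next character to uppercase (Perl only; not supported by PCRE)
\L - convert all characters up to \E to lowercase (Perl only; not supported by PCRE)
\U - convert all characters up to \E to uppercase (Perl only; not supported by PCRE)
\E - end of a case change or of a \Q sequence
\Q - treat the following characters as literals, not as metacharacters, up to \E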
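Of the sequences above, PCRE (and so PHP's preg_* functions) supports \Q...\E. A minimal sketch follows, with preg_quote() shown as the usual PHP-side alternative; the sample strings are invented.

```php
<?php
$subject = 'a.b*c';

// Without quoting, "." and "*" act as metacharacters and the literal text does not match.
$unquoted = preg_match('/^a.b*c$/', $subject);

// Between \Q and \E every character is taken literally.
$quoted = preg_match('/^\Qa.b*c\E$/', $subject);

// preg_quote() escapes the metacharacters before the pattern is built.
$escaped   = preg_quote($subject, '/');
$viaEscape = preg_match('/^' . $escaped . '$/', $subject);

var_dump($unquoted, $quoted, $escaped, $viaEscape);
```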

\w - an alphanumeric character or '_'
\W - any character other than an alphanumeric character or '_'
\s - one whitespace character
\S - one non-whitespace character
\d - one digit
\D - one non-digit character

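\b - word boundary
\B - not a word boundary
\A - the beginning of the string (only the very beginning, even in multi-line mode)
\Z - the end of the string, or the position before a final newline (even in multi-line mode)
\G - the position where the previous match ended (m//g in Perl); in PCRE, the position at which matching starts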
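A small sketch of the shorthand classes and anchors, assuming PCRE semantics as implemented by PHP. It shows in particular that \A and \Z refer to the whole string, while ^ and $ follow the m modifier. The test string is made up.

```php
<?php
$text = "cat concat\ncat";

// \b keeps "cat" from matching inside "concat".
$wholeWords = preg_match_all('/\bcat\b/', $text);

// \d+ collects runs of digits.
preg_match_all('/\d+/', 'room 12, floor 3', $numbers);

// With m, ^ matches at every line start; \A still matches only at the very start.
$lineStarts   = preg_match_all('/^cat/m', $text);
$stringStarts = preg_match_all('/\Acat/m', $text);

// $ with m matches at the end of the first line; \Z does not.
$lineEnd   = preg_match('/cat$/m', "cat\ndog");
$stringEnd = preg_match('/cat\Z/m', "cat\ndog");

var_dump($wholeWords, $numbers[0], $lineStarts, $stringStarts, $lineEnd, $stringEnd);
```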

(...) - Group part of the pattern and remember (capture) the matched text
| - Either the preceding or the following pattern (logical "OR")
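Grouping and alternation can be sketched as follows. The captured groups are returned in the third argument of preg_match(); the sample strings are hypothetical.

```php
<?php
// Each pair of parentheses captures its part of the match.
preg_match('/(\d+)-(\d+)/', 'date: 2024-05', $parts);
// $parts[0] is the whole match, $parts[1] and $parts[2] are the groups.

// Parentheses also limit the scope of "|".
$grouped   = preg_match('/^(cat|dog)$/', 'catfish');  // whole string must be cat or dog
$ungrouped = preg_match('/^cat|dog$/', 'catfish');    // "^cat" OR "dog$"

var_dump($parts, $grouped, $ungrouped);
```

Without the parentheses the alternation splits the whole pattern, anchors included, which is why the last call matches "catfish".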

* - Zero or more times
+ - One or more times
? - Zero or one time
{n} - Exactly n times
{n,} - n or more times
{n,m} - From n to m times (no space after the comma)
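A quick sketch of each quantifier applied to the preceding element, with invented test strings:

```php
<?php
$star    = preg_match('/^ab*c$/', 'ac');        // * allows zero "b"
$plus    = preg_match('/^ab+c$/', 'ac');        // + needs at least one "b"
$option  = preg_match('/^colou?r$/', 'color');  // ? makes "u" optional
$exact   = preg_match('/^\d{3}$/', '1234');     // exactly three digits
$atLeast = preg_match('/^\d{2,}$/', '1234');    // two or more digits
$range   = preg_match('/^\d{2,3}$/', '1234');   // two or three digits

var_dump($star, $plus, $option, $exact, $atLeast, $range);
```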
(?<=...) - Positive lookbehind: the text before the current position must match. The lookbehind pattern must have a fixed length.
(?<!...) - Negative lookbehind.
(?=...) - Positive lookahead.
(?!...) - Negative lookahead.
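In PCRE syntax the four look-around assertions are written (?<=...), (?<!...), (?=...) and (?!...). The sketch below tries each one; the strings and variable names are illustrative only.

```php
<?php
$text = 'EUR 10, $42';

// Lookbehind: digits preceded by a dollar sign (the sign is not part of the match).
preg_match('/(?<=\$)\d+/', $text, $afterDollar);

// Negative lookbehind: whole numbers NOT preceded by a dollar sign.
preg_match_all('/(?<!\$)\b\d+/', $text, $noDollar);

// Lookahead: a word followed by ".txt" (the extension is not part of the match).
preg_match('/\w+(?=\.txt)/', 'notes.txt', $baseName);

// Negative lookahead: a name that does not start with "tmp_".
$tmpRejected = preg_match('/^(?!tmp_)\w+$/', 'tmp_file');
$fileAllowed = preg_match('/^(?!tmp_)\w+$/', 'file');

var_dump($afterDollar[0], $noDollar[0], $baseName[0], $tmpRejected, $fileAllowed);
```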

i - case-insensitive matching (modifiers are written after the closing delimiter, as in /.../i).
m - multi-line mode: ^ and $ match at the beginning and end of every line, not only of the whole string.
s - single-line mode: the dot also matches a newline.
x - extended syntax: whitespace in the pattern is ignored and # starts a comment.
e - (preg_replace() only) after the standard substitutions are made in the replacement string, it is evaluated as PHP code and the result is used as the replacement. This modifier was deprecated in PHP 5.5.0 and removed in PHP 7.0.0; use preg_replace_callback() instead.
A - the pattern matches only at the beginning of the string being searched.
D - the $ metacharacter matches only at the very end of the string. Without this modifier, $ also matches just before the last character if that character is a newline (but not before any other newline). This modifier is ignored if the m modifier is set. Perl has no equivalent modifier.
S - additional analysis of the pattern is performed, which is worthwhile when a pattern is used many times. At present this helps only for non-anchored patterns that do not have a single fixed starting character.
U - inverts the greediness of the quantifiers: they are not greedy by default, but become greedy if followed by '?'. This is not compatible with Perl. The modifier can also be set inside the pattern with (?U).
X - turns on additional PCRE functionality that is not compatible with Perl: any backslash in the pattern followed by a letter that has no special meaning causes an error, which reserves these combinations for future expansion. By default, as in Perl, a backslash followed by a letter with no special meaning is treated as a literal. At present no other features are controlled by this modifier.
u - turns on additional PCRE functionality that is not compatible with Perl: patterns are treated as UTF-8 strings. The u modifier is available in PHP 4.1.0 and higher on Unix platforms, and in PHP 4.2.3 and higher on Windows platforms.
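The most common modifiers can be tried out as below. This is only a sketch with invented sample strings. Because the e modifier is not available in PHP 7 and later, the last call shows preg_replace_callback() doing the same kind of computed replacement.

```php
<?php
$caseless = preg_match('/php/i', 'PHP');                 // i: ignore case
$perLine  = preg_match_all('/^\w+$/m', "one\ntwo");      // m: ^ and $ per line
$dotAll   = preg_match('/a.b/s', "a\nb");                // s: dot matches newline
$noDotAll = preg_match('/a.b/', "a\nb");

// x: whitespace and #-comments inside the pattern are ignored.
$extended = preg_match('/ \d+   # first number
                          -     # separator
                          \d+   /x', '10-20');

// U: quantifiers become non-greedy by default.
preg_match('/<.+>/U', '<a><b>', $lazy);

// D: $ no longer matches before a trailing newline.
$dollarEnd = preg_match('/end$/D', "end\n");

// u: the subject is read as UTF-8, so one two-byte character matches a single dot.
$utf8 = preg_match('/^.$/u', "\xC3\xA9");

// Computed replacement without the e modifier.
$doubled = preg_replace_callback('/\d+/', function ($match) {
    return (string) ((int) $match[0] * 2);
}, 'a1 b20');

var_dump($caseless, $perLine, $dotAll, $noDotAll, $extended, $lazy[0], $dollarEnd, $utf8, $doubled);
```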

(?#comment) - a comment inside the pattern.
(?:pattern) - grouping like '()', but without capturing (no backreference is created)
(?=pattern) - lookahead. For example, /\w+(?=\t)/ matches a word followed by a tab, but the '\t' character is not included in the result.
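A brief sketch of these three constructs, reusing the /\w+(?=\t)/ example from the text; the other strings are made up.

```php
<?php
// Lookahead: the tab must follow, but is not part of the match.
preg_match('/\w+(?=\t)/', "name\tvalue", $word);

// (?:...) groups "ab" for the quantifier without capturing it,
// so the only captured group is (c).
preg_match('/(?:ab)+(c)/', 'ababc', $groups);

// (?#...) is skipped by the regex engine.
$withComment = preg_match('/a(?#the letter a)b/', 'ab');

var_dump($word[0], $groups, $withComment);
```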

\NUMBER - A backreference inside the regex to one of its own captured groups, where NUMBER is the number of the group (counting opening parentheses). It matches the same text that the group matched. If the referenced group is itself repeated, the backreference refers to the text matched by its last repetition.
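Backreferences can be sketched like this, assuming PCRE behaviour as exposed by preg_match(); the sample sentences are invented.

```php
<?php
// \1 must repeat exactly what group 1 matched: finds a doubled word.
preg_match('/\b(\w+) \1\b/', 'it is is fine', $repeat);

// The closing quote must be the same character as the opening one.
preg_match('/(["\'])(.*?)\1/', 'say "it\'s ok" now', $quoted);

// When the group is repeated, \1 holds the text of its last repetition.
preg_match('/(a|b)+\1/', 'abb', $last);

var_dump($repeat[1], $quoted[2], $last);
```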