
Some examples of regular expressions

Check whether the string is a number of 1 to 77 digits:

  if (ereg("^[0-9]{1,77}$", $string)) echo "yes"; else echo "no";

Check whether the string consists only of letters, digits and "_", and is 5 to 20 characters long:

  if (ereg("^[a-zа-я0-9_]{5,20}$", $string)) echo "yes"; else echo "no";

Check whether the string contains any characters other than the valid ones. Letters, digits and "_" are considered valid. The length cannot be checked here, except as an additional condition on strlen($string). Do not confuse this with the previous example: the result is the same, but the method is different, "by contradiction".

  if (!ereg("[^a-zа-я0-9_]", $string))
    echo "no foreign characters (OK)";
  else
    echo "there are foreign characters (FALSE)";

For case-insensitive matching, use eregi().
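The ereg() family was deprecated in PHP 5.3 and removed in PHP 7.0, so on current PHP versions the three checks above need preg_match(). A minimal sketch of the equivalents, assuming UTF-8 input (hence the /u modifier, so that а-я is treated as a range of characters rather than bytes). The function names are hypothetical, and the /D modifier is added so that "$" does not also match before a trailing newline:

```php
<?php
// Hypothetical helper names; the patterns mirror the ereg examples above.

// 1 to 77 digits
function is_number_string(string $s): bool {
    return preg_match('/^[0-9]{1,77}$/D', $s) === 1;
}

// only letters, digits and "_", 5 to 20 characters (case-insensitive)
function is_valid_name(string $s): bool {
    return preg_match('/^[a-zа-я0-9_]{5,20}$/iuD', $s) === 1;
}

// "by contradiction": true when no character outside the allowed set occurs
function has_only_valid_chars(string $s): bool {
    return preg_match('/[^a-zа-я0-9_]/iu', $s) === 0;
}
```

The /i modifier plays the role of eregi() here.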

Check whether the string contains the same character at least 3 times in a row (as in "abvgDDDee", but not "aabbaabb"):

  if (preg_match("/(.)\\1\\1/", $string)) echo "yes"; else echo "no";

Replace LINE1 with LINE2 everywhere in the text (this problem is solved without regular expressions):

  $string = str_replace("LINE1", "LINE2", $string);

Replace malformed line-break codes with normal ones: for this you only need to delete "\r".

Line breaks are normally written as "\n" or "\r\n" (both are valid, but they differ!).

There are also broken variants, such as "\r\r\n".

  $string = str_replace("\r", "", $string);

Replace every run of repeated spaces with a single space. Do not try to use str_replace here; it is a good function, but not for this task.

  $string = preg_replace("/XX+/", "X", $string); // put a space in place of each X
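To see why str_replace does not fit here, consider runs of three or more spaces: one pass of str_replace only shortens each run, so doubled spaces can survive. A small sketch with a made-up sample string:

```php
<?php
$text = "a    b   c";   // runs of 4 and 3 spaces

// One pass of str_replace shortens each run, but doubled spaces remain.
$once = str_replace("  ", " ", $text);

// The regular expression collapses every run in a single pass.
$collapsed = preg_replace("/ {2,}/", " ", $text);
```

Here $once still contains two spaces between the letters, while $collapsed contains exactly one.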

The text contains certain words, say "WORD" and "LYALYA" (and so on), which must be replaced with themselves plus some additions.

The words may be absent or may occur many times, in any letter case.

That is, if the text had "word" or "WORD" (or some other casing), it must be replaced with "<b>word</b>" or "<b>WORD</b>" (depending on how it was written).

In other words, you need to find a list of words in any case and insert fixed strings ("<b>" and "</b>") around each word found.

  $string = preg_replace("/(word1|word2|lyalya|word99)/si", "<b>\\1</b>", $string);
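When the word list comes from a variable rather than being typed into the pattern, each word should be escaped first. A minimal sketch, assuming a non-empty array of words; the helper name bold_words is hypothetical:

```php
<?php
// Hypothetical helper: wraps each listed word in <b>...</b>, keeping its original case.
// Assumes $words is not empty (an empty list would produce a pattern that matches everywhere).
function bold_words(string $text, array $words): string {
    // Escape each word so that regex metacharacters in it are matched literally.
    $quoted = array_map(function ($w) { return preg_quote($w, '/'); }, $words);
    $pattern = '/(' . implode('|', $quoted) . ')/i';
    return preg_replace($pattern, '<b>$1</b>', $text);
}

$result = bold_words("Word and WORD and lyalya", ["word", "lyalya"]);
// $result is "<b>Word</b> and <b>WORD</b> and <b>lyalya</b>"
```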

Find the text enclosed in a tag, for example <TITLE> ... </TITLE>, in an HTML file ($string is the source text).

  if (preg_match("!<title>(.*?)</title>!si", $string, $ok))
    echo "Tag found, text: $ok[1]";
  else
    echo "Tag not found";
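The same search can be wrapped in a function that returns the text instead of printing it. A minimal sketch; the helper name extract_title and the choice to return null when the tag is missing are assumptions made for illustration:

```php
<?php
// Hypothetical helper: returns the trimmed <title> text, or null if there is no such tag.
function extract_title(string $html): ?string {
    // "s" lets "." match newlines, "i" ignores case, ".*?" stops at the first closing tag.
    if (preg_match('!<title>(.*?)</title>!si', $html, $m)) {
        return trim($m[1]);
    }
    return null;
}
```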

Find the text enclosed in a tag and replace the tag with another one, for example replace <TITLE> ... </TITLE> with <MY_TAG> ... </MY_TAG> in an HTML file:

  $string = preg_replace("!<title>(.*?)</title>!si", "<MY_TAG>\\1</MY_TAG>", $string);

PHP code highlighting in messages

For example, you have a forum like vBulletin, where code is highlighted if you mark it specially: [PHP] any code [/PHP].

As a result, when the message is viewed, you get nicely colored PHP code.

So, if you want every fragment between [PHP] ... [/PHP] and <? ... ?> to be treated as code and colored, this is quite easy to do.

Program text:

 <?

 // Original message:
 // ------------------------------------------------------
 $str = '
 Pamagite, does not work at all! Here is an example:
 [php]
 // comment
 # comment
 phpinfo();
 [/php]

 lala lala lala

 [php]
 for ($i = 0; $i < 100; $i++) {
 ping("-f", "www.ru");
 }
 [/php]
 <?
 echo "<a href=http://shram.kiev.ua/>click here!</a>";
 phpinfo();
 ?>
 ';
 // ------------------------------------------------------

 // suppress warnings (there are glitches in highlight_string)
 error_reporting(0);

 // function that highlights one piece of text
 function _my_($s, $a1, $a2) {
  if ($a1 != "<?") { $a1 = "<?"; $a2 = "?>"; }
  $s = str_replace("\\\"", "\"", $s);
  ob_start();
  highlight_string($a1.$s.$a2);
  $s = ob_get_contents();
  ob_end_clean();
  return $s;
 }

 // look for all the pieces in the text between <? ... or [PHP] ...
 // (the /e modifier used here works only in PHP versions before 7.0)
 $str = preg_replace("!(\[php\]|<\?)(.*?)(\[/php\]|\?>)!ise", "_my_('\\2','\\1','\\3')", $str);

 echo $str;

 ?>

After running this program, the screen shows:

Pamagite, does not work at all! Here is an example: <?
// comment
# comment
phpinfo();
?> lala lala lala <?
for ($i = 0; $i < 100; $i++) {
ping("-f", "www.ru");
}
?> <?
echo "<a href=http://shram.kiev.ua/>click here!</a>";
phpinfo();
?>

As you can see, everything between the special markers was highlighted, and the surrounding text did not change at all. If you are going to use this on a forum, think about how line breaks should be handled.

If your entire message is solid code, use highlight_string directly, without searching for <? ... ?> in the text.
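The /e modifier used in the program above was deprecated in PHP 5.5 and removed in PHP 7.0. On current versions the same idea can be expressed with preg_replace_callback(). This is only a sketch under a few assumptions: the function names are hypothetical, each fragment is wrapped in the full "<?php" opening tag so the result does not depend on the short_open_tag setting, and highlight_string() is called with its second argument set to true so that it returns the markup instead of printing it:

```php
<?php
// Hypothetical callback: $m[1] is the code found between the markers.
function highlight_fragment(array $m): string {
    // Wrap the code in PHP tags so highlight_string() treats it as PHP source.
    return highlight_string("<?php\n" . $m[1] . "\n?>", true);
}

// Hypothetical wrapper: highlights every [php]...[/php] and short-tag fragment,
// leaving the rest of the message untouched.
function highlight_message(string $text): string {
    return preg_replace_callback(
        '!(?:\[php\]|<\?(?:php\b)?)(.*?)(?:\[/php\]|\?>)!is',
        'highlight_fragment',
        $text
    );
}
```

Because the callback receives the raw match, the str_replace() call that undid the quote escaping of /e is no longer needed. The exact markup produced by highlight_string() differs between PHP versions.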

URL validation check

This function is taken from the source code of a chat application.

It supports everything that can appear in a URL.

Remember that you must not only run the check but also take the new value returned by the function, because it adds "http://" if the protocol is missing.

  // helper function that removes dangerous characters
  function pregtrim($str) {
    return preg_replace("/[^\x20-\xFF]/", "", @strval($str));
  }
  //
  // checks the URL and returns:
  //  * +1 if the URL is empty
  //       if (checkurl($url) === 1) echo "empty"
  //  * -1 if the URL is not empty, but has errors
  //       if (checkurl($url) === -1) echo "error"
  //  * a string (the new URL) if the URL was recognized and parsed
  //       if (is_string(checkurl($url))) echo "everything is ok"
  //
  // If the protocol was not in the URL, it will be added ("http://")
  //
  function checkurl($url) {
    // strip invalid characters and leading/trailing spaces
    $url = trim(pregtrim($url));
    // if empty - exit
    if (strlen($url) == 0) return 1;
    // check the URL for correctness
    if (!preg_match("~^(?:(?:https?|ftp|telnet)://(?:[a-z0-9_-]{1,32}".
    "(?::[a-z0-9_-]{1,32})?@)?)?(?:(?:[a-z0-9-]{1,128}\.)+(?:com|net|".
    "org|mil|edu|arpa|gov|biz|info|aero|inc|name|[a-z]{2})|(?!0)(?:(?".
    "!0[^.]|255)[0-9]{1,3}\.){3}(?!0|255)[0-9]{1,3})(?:/[a-z0-9.,_@%&".
    "?+=\~/-]*)?(?:#[^ '\"&<>]*)?$~i", $url, $ok))
      return -1; // if not correct - exit
    // if there is no protocol - add it
    if (!strstr($url, "://")) $url = "http://".$url;
    // convert the protocol to lower case: hTtP -> http
    // (the /e modifier used here works only in PHP versions before 7.0)
    $url = preg_replace("~^[a-z]+~ie", "strtolower('\\0')", $url);
    return $url;
  }

Thus, the check should be used roughly like this:

  $url = checkurl($url); // overwrite the URL with the returned value
  if ($url === -1) exit("Invalid URL");
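The last step of checkurl() lower-cases the protocol with the /e modifier, which was removed in PHP 7.0. A minimal sketch of how the two final steps (adding "http://" and lower-casing the protocol) could look with preg_replace_callback(); the helper name normalize_scheme is hypothetical and it deliberately leaves out the validation itself:

```php
<?php
// Hypothetical helper covering only the last two steps of checkurl():
// add "http://" when no protocol is present, then lower-case the protocol.
function normalize_scheme(string $url): string {
    if (strpos($url, '://') === false) {
        $url = 'http://' . $url;
    }
    // The callback receives the matched protocol name in $m[0].
    return preg_replace_callback(
        '~^[a-z]+~i',
        function (array $m): string { return strtolower($m[0]); },
        $url
    );
}
```

Only the protocol is lower-cased; the host and path keep the case the user typed.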

E-mail validation

E-mail addresses are checked the same way as in the previous example.

  //
  // checks the e-mail address and returns
  //  * +1 if the address is empty
  //  * -1 if it is not empty, but has an error
  //  * a string if the address is valid
  //

  function checkmail($mail) {
    // strip invalid characters and leading/trailing spaces
    $mail = trim(pregtrim($mail)); // pregtrim() is defined in the previous example
    // if empty - exit
    if (strlen($mail) == 0) return 1;
    if (!preg_match("/^[a-z0-9_-]{1,20}@(([a-z0-9-]+\.)+(com|net|org|mil|".
    "edu|gov|arpa|info|biz|inc|name|[a-z]{2})|[0-9]{1,3}\.[0-9]{1,3}\.[0-".
    "9]{1,3}\.[0-9]{1,3})$/is", $mail))
      return -1;
    return $mail;
  }
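As a point of comparison rather than a replacement for the function above, PHP's built-in filter_var() with FILTER_VALIDATE_EMAIL performs a syntax check without a hard-coded list of top-level domains. A minimal sketch; the wrapper name checkmail_builtin and the mapping onto the same +1 / -1 / string convention are assumptions made for illustration:

```php
<?php
// Hypothetical wrapper around the built-in validator, using the same return convention:
//  +1 if empty, -1 if invalid, the address itself if valid.
function checkmail_builtin($mail) {
    $mail = trim((string) $mail);
    if ($mail === '') return 1;
    // filter_var() returns the address on success and false on failure.
    $ok = filter_var($mail, FILTER_VALIDATE_EMAIL);
    return $ok === false ? -1 : $ok;
}
```

The built-in filter accepts and rejects a somewhat different set of addresses than the regular expression above, so the two are not interchangeable without testing.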

Extracting URLs from text and malformed HTML files

Sometimes you need to extract URL or e-mail links from HTML text.

If your HTML is not obviously malformed, this is a very simple task for a regular expression such as:

  <a[^>]+href=([^ >]+)[^>]*>(.*?)</a>

But links come in many forms. How to write your program is up to you.

You can accept only 100% valid links, but then some sloppy links will be missed (even though they also work).

You can accept everything, but then some links will not be extracted quite correctly.

Program text:

 <?
 $str = "
 <a href=url1>name1</a>
 <a href=url2>name2</a>
 <a href='url3'>name3</a>
 <a href=url4><brackets></a>
 <a href=\"url5\"><b>bold</b></a>
 <a href=url6>\"quotes\"</a>
 <a target=\"<trick to fool the program> hahaha\" href=url7>77777</a>
 <a href=url8 target=\"<trick to fool the program> hahaha\">88888</a>";
 echo "<pre>Source:".htmlspecialchars($str)."</pre>";

 echo "---------------Option 1---------------<ul>";
 preg_match_all("!<a.*?href=\"?'?([^ \"'>]+)\"?'?.*?>(.*?)</a>!is", $str, $ok);
 for ($i = 0; $i < count($ok[1]); $i++) {
  echo "<li>".$ok[1][$i]." - ".$ok[2][$i];
 }

 echo "</ul>---------------Option 2---------------<ul>";
 preg_match_all("!<a[^>]+href=\"?'?([^ \"'>]+)\"?'?[^>]*>(.*?)</a>!is", $str, $ok);
 for ($i = 0; $i < count($ok[1]); $i++) {
  echo "<li>".$ok[1][$i]." - ".$ok[2][$i];
 }

 echo "</ul>---------------Option 3---------------<ul>";
 preg_match_all("!<a[^>]+href=\"?'?([^ \"'>]+)\"?'?[^>]*>([^<>]*?)</a>!is", $str, $ok);
 for ($i = 0; $i < count($ok[1]); $i++) {
  echo "<li>".$ok[1][$i]." - ".$ok[2][$i];
 }
 ?>

Result of running the example:

Source:
<a href=url1>name1</a>
<a href=url2>name2</a>
<a href='url3'>name3</a>
<a href=url4><brackets></a>
<a href="url5"><b>bold</b></a>
<a href=url6>"quotes"</a>
<a target="<trick to fool the program> hahaha" href=url7>77777</a>
<a href=url8 target="<trick to fool the program> hahaha">88888</a>
---------------Option 1---------------
  • url1 - name1
  • url2 - name2
  • url3 - name3
  • url4 - <brackets>
  • url5 - bold
  • url6 - "quotes"
  • url7 - 77777
  • url8 - hahaha">88888
---------------Option 2---------------
  • url1 - name1
  • url2 - name2
  • url3 - name3
  • url4 - <brackets>
  • url5 - bold
  • url6 - "quotes"
  • url8 - hahaha">88888
---------------Option 3---------------
  • url1 - name1
  • url2 - name2
  • url3 - name3
  • url6 - "quotes"
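For use inside a program, the matches are usually more convenient as an array than as printed list items. A minimal sketch built on the "Option 2" idea (the tag may not contain ">" before href, and the link text is taken lazily up to the closing tag); the helper name extract_links and the [url, text] pair format are assumptions made for illustration:

```php
<?php
// Hypothetical helper: returns a list of [url, text] pairs found in the HTML.
function extract_links(string $html): array {
    preg_match_all(
        '!<a[^>]+href="?\'?([^ "\'>]+)"?\'?[^>]*>(.*?)</a>!is',
        $html,
        $matches,
        PREG_SET_ORDER   // one array per link: [whole match, url, text]
    );
    $links = [];
    foreach ($matches as $m) {
        $links[] = [$m[1], $m[2]];
    }
    return $links;
}
```

As with Option 2 in the output above, a ">" hidden inside an attribute value still confuses this pattern, so the results should be treated as best-effort.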