PHP: \ "Quotes \". Creating mysql queries, slash, quoting.


  • Quick recommendations.
  • Rules for compiling MySQL queries
  • Dynamic query generation
  • Correct work with special characters when composing queries
  • Prepared expressions
  • SQL Injection
  • Features of work with
  • About slashes. How to get rid of them
  • Remarks
  • Note: Forms
  • Comments
  • Quick recommendations.

  • To prevent SQL injections, two simple rules should be observed:
    1. Do not put data into the database without processing.
    This can be done either with the help of prepared expressions, or by processing parameters manually.
    If the request is left manually, then
    - all numerical parameters must be brought to the desired type
    - all other parameters must be processed by the mysql_real_escape_string () function and enclosed in quotation marks.

    2. Do not put control structures and identifiers entered by the user into the query.
    And to prescribe in the script a list of possible options, and choose only from them.

    Read more about this - why it should be done exactly the way it works, you can lower it. And it is highly recommended .

    Rules for compiling MySQL queries
    To begin with - a little bit about why these slashes are needed at all.
    If we substitute any data into the query, then to distinguish these data from the SQL commands, they must be quoted.
    For example, if we write
    SELECT * FROM table WHERE name = Bill
    Then the database will decide that Bill is the name of another field, will not find it, and will give an error. Therefore, the inserted data (in this case Bill's name) must be enclosed in quotation marks - then the database will find it a string, the value of which should be assigned to the name field:
    SELECT * FROM table WHERE name = 'Bill'
    However, in the data itself there may be quotes too. For example,
    SELECT * FROM table WHERE name = 'Д'Артаньян'
    Here, the database will decide that the 'D' is the data, and Artagnan is the team that she does not know, and will also give an error. Therefore, we need to trace all the data to explain to the database that the quotes (and some other special characters) encountered in them refer to the data.
    As a result, we get the right query, which will not cause errors:
    SELECT * FROM table WHERE name = 'Д\'Артаньян'

    Thus, we found out that when you substitute data in a query, you should follow two rules:
    - All data inserted into the query must be enclosed in quotes (single or double, but more convenient and more often used single).
    - In all string variables special symbols should be escaped with slashes.

    It should be specially noted: the added slashes DO NOT go to the database. They are only needed in the query. When you hit the base, the slashes are discarded. Accordingly, a common mistake is the use of stripslashes when retrieving data from the database.

    In fact, all of the above applies to string type data and dates. Numbers can be inserted without trailing or surrounded by quotes. If you do so, it is MANDATORY! Force the data to the correct type before inserting it into the query, for example:
    $id = intval ( $id );
    However, for simplicity (and reliability), you can work with numbers as with strings (for mysql will still convert them to the correct type). Accordingly, we will be any data inserted into the query, to trace and enclose in quotation marks.

    Also, there is one more rule - optional, but it should be followed to avoid the appearance of errors:
    The names of fields and tables should be enclosed in backward single quotes - "" (the key with this symbol is on the standard keyboard to the left of the "1" key). After all, the field name can coincide with the mysql keywords, but if we use the backquote, MySQL will understand Everything is correct:
    SELECT * FROM `table` WHERE `date` = '2006-04-04'
    It is necessary to distinguish these quotes and not to confuse one with others. It should also be remembered that backquotes are not escaped by slashes.

    Dynamic query generation
    If the SQL query in the script is written entirely, and does not change in any way, for example
    SELECT * FROM `table`
    Then there will be no problems with it.
    But all the power of our scripts is in the dynamic composition of queries!
    Instead of writing ready requests for all occasions, we compose them based on the data entering the script.
    And here we are trapped in danger.
    Let's say we compose a query using a variable:
    SELECT * FROM table WHERE name = '$name'
    It seems that everything is all right?
    And if $ name we have D'Artagnan? The request will return an error!
    That is, the variable must be traced before being substituted into the query.
    This can be done in several ways.
    The simplest (and the wrong) is to rely on magic quotes. As you already guessed, it was for this case that they were invented. In order to save the SQL requests of forgetful programmers from errors, ALL incoming data to the script are traced indiscriminately.
    If you use someone else's code, it's better to use magic quotes . This can create some inconvenience and does not guarantee you from mistakes or hacking (because the query rules do not exhaust the query rule), but at least reduce the risk. Therefore, when using someone else's code, be sure that the magic quotes are included.
    If you write all the code yourself, you should learn how to compose the queries correctly.

    Correct work with special characters when composing queries
    So. As we have already learned above, in order to correctly formulate a query, you must enclose the data in quotes and trace them.
    With the first, everything is clear. When composing dynamic queries, we never forget to enclose all the data in quotation marks:
    $query = "INSERT INTO `table` VALUES(NULL,'$name','$date','$price')" ;
    If the variable $ price should be of type int and we bring it to this type, then it is possible not to enclose it in quotation marks. However, if we conclude, then there will be no special troubles, but it will be possible to do work with the data of uniform.

    The second - tearing - is the one for which, in fact, for the most part, and all this text is written. Because it causes most of all questions and difficulties.

    First we disable the magic quotes. So, as described at the very beginning.
    Why should this be done?
    For many reasons. The most obvious is logical. "Magic quotes" add slashes not where they are needed - when composing a query , and even before getting into the script! But the data is not necessarily thereafter to be inserted into the query. Maybe they will have to be displayed to the user, and the slashes will only interfere. Plus, added slashes prevent, for example, correctly check the length of the entered line. In addition, we need to track not only the data that came from the user, but in general any data inserted into the query - many this obvious fact did not even occur to us! The list can be continued, but the conclusion is one: you do not have to add slashes automatically, without parsing, before the script starts, but only where you really need it - when composing the query.
    There is one more reason: when using the Unicode encoding, which is gaining popularity, and in time will occupy a dominant position in the web, magic quotes can simply spoil the text by taking a part of the multibyte string for the special character.

    Now let's add a slash yourself.
    First, for the purpose of tracking, we use the function mysql_real_escape_string()
    It should be remembered that you can only apply it after establishing a connection to the database.
    This function does much more than obsolete addslashes and mysql_escape_string. First, it facilitates the maintenance and reading of the mysql logs, replacing, for example, the newline character with "\n" And some other characters on the escape sequence. Secondly, and most importantly, it works correctly with multibyte encodings, taking into account the current MySQL encoding and does not, therefore, corrupt the Unicode text.
    Secondly, we do not forget that we need to trace all the data that we have enclosed in the query in quotation marks:

    $name
    = mysql_real_escape_string ( $name );
    $age = mysql_real_escape_string ( $age );
    $query = "INSERT INTO table (name,age,class) VALUES ('$name','$age',11)" ;

    or:
    $query = "SELECT * FROM table WHERE name LIKE '" . mysql_real_escape_string ( $_GET [ 'name' ]). "%'" ;
    It can be seen that the code is rather cumbersome. To facilitate the compilation of queries, you can set off on various tricks-to make a function for composing queries from an array (very convenient for requests of type INSERT and UPDATE), to trace an array of data in a loop, and so on.
    You can write and your library or function for making requests.
    The main thing is to remember that strict adherence to the rules of querying guarantees you from hacking the database, as well as being aware that the application of "magic quotes", with the apparent ease of composing requests, does not give such a guarantee, but only hinders the normal operation of the application.

    Prepared expressions
    There is another way to send requests to the database, called "prepared statements" (prepared statements).
    Its essence lies in the fact that a query template is prepared, with special markers, in place of which dynamic components will be substituted. An example of such a template:
    SELECT * FROM table WHERE name=? The question mark here is the same marker. In another way, it is called a placeholder. The whole secret is that the data in its place substitutes a special function that "binds" the variable to the query.
    Here's what the code looks like in this case:
    $stmt = $mysqli -> prepare ( "SELECT District FROM City WHERE Name=?" );
    $stmt -> bind_param ( "s" , $city );
    $stmt -> execute ();
    In the first line, we prepare a query template.
    In the second - we attach to the marker the value of the variable $ city.
    In the third line, we execute the query thus prepared.
    In this case, the request and data go to the database not together, but separately, excluding the possibility of any error or malicious manipulation.

    It is clear that many questions arise. But neither the scope nor the subject matter of this article allows us to dwell on them in more detail. I recommend that you consult the documentation on mysqli libraries and < PDO , which implements this principle.
    Also, you can use the DbSimple library of Dmitry Koterov or PEAR :: DB. The main difference between the two is that they implement the mechanism of prepared expressions only externally. And inside they work in the old fashion - making a request and sending it to the database, taking over the work on processing the variables. And PDO and mysqli work as described above - that is, the query template and the data go to the database separately.

    SQL Injection

    So, we learned how to correctly insert data into a query.
    BUT! The dynamic composition of queries is not limited to the substitution of data. Often we have to substitute SQL commands and field names into the query. And here we are already turning to the topic of security:

    SQL Injection is a method of hacker attack, when the data transferred to the script is modified in such a way that the request formed in this script starts to perform completely differently than what it was intended for.
    The rules of protection against such attacks can be divided into two points:
    1. Working with data.
    2. Working with the query control elements.

    We considered the first point above. He, it can be said, is not, in fact, a defense. Compliance with the rules for adding to the query is dictated, first of all, by the requirements of SYNOPSIS SQL. And as a side effect, we also have protection against hacking.

    The second point is much more complicated, because there is no such single universal rule as for data - the back quote does not protect the field name from modification by the hacker. It is impossible to protect the table name, SQL statements, LIMIT command parameters, and other operators.
    Therefore, the basic rule when substituting control elements in a query is:
    If you want to dynamically substitute SQL statements or field, database, table names into a query, you can not directly insert them into the query under any circumstances.
    All variants of such additions must be written in your script in advance and selected based on what the user entered.
    For example, if you need to pass a field name to an order by operator, you can not in any way substitute it directly. You must first check it. For example, to make an array of valid values, and substitute in the query only if the passed parameter in this array is present:
    $orders =array( "name" , "price" , "qty" );
    $key = array_search ( $_GET [ 'sort' ], $orders ));
    $orderby = $orders [ $key ];
    $query = "SELECT * FROM `table` ORDER BY $orderby" ;
    We search for a word entered by the user in the array of previously described variants, and if we find it, we select the corresponding element of the array. If no match is found, the first element of the array will be selected.
    Thus, the query is not substituted for what the user entered, but what was written in our script.
    Similarly, we must act in all other cases
    For example, if the WHERE clause is dynamically generated:
    if (!empty( $_GET [ 'price' ])) $where .= "price='" . mysql_real_escape_string ( $_GET [ 'price' ]). "'" ;
    $query = "SELECT * FROM `table` WHERE $where" ;
    It's difficult for me to imagine a case where the table name can be substituted into the query dynamically, but if this happens, then the name should also be inserted only from the set set in the script.
    The parameters of the LIMIT statement must be forced to be cast to an integer type using arithmetic operations or the intval () function.
    It should not be thought that the examples listed here exhaust all the options for the dynamic compilation of queries. You just need to understand the principle, and apply it in all such cases.

    Features of working with the operator LIKE
    A completely separate case is the LIKE operator.
    First of all, in addition to the usual tracing, in variables that are substituted in LIKE, you must double slashes. That is, if the variable contains the character \, then it must be doubled, and after that perform the normal tearing, through mysql_real_escape_string.
    For example, if we are looking for a string
    The symbol \ is called "backslash" and we need an exact match, then we just use mysql_real_escape_string and the query is standard:
    SELECT * FROM test WHERE field = 'символ \\ называется \"backslash\"' If we want to substitute this string in LIKE, we first need to replace each slash with two, and then use mysql_real_escape_string. As a result,
    SELECT * FROM table WHERE field LIKE '% character \\\\ is called \ "backslash \"%'
    Secondly, it should be noted that none of the functions adding slashes adds them to the search metacharacters "%" and "_" used in the LIKE statement. Therefore, if you use this operator, and do not want the _ and% characters to be used as masks, then add the slashes manually. This can be done by a team
    $ Data = addCslashes ( $ data , '% _' ); Attention - these are not addslashes! In the name of this function there is an additional letter "c".

    Thus, it turns out that the variables used in the LIKE statement we must handle separately.
    First replace one slash by two, with the help of such, for example, the code:
    $ Var = str_replace ( '\\' , '\\\\' , $ var ); Then (you can, along with all the other data going into the query), we trace:
    $var = mysql_real_escape_string ( $var ); And then, if we want, that _ and% correspond exactly to ourselves, we make
    $var = addCslashes ( $var , '_%' );
    As a result, if we look for, for example, such a line
    символ \ называется "backslash", а символ _ называется "underscore" then after processing, in the request it should look like this:
    '%символ \\\\ называется \"backslash\", а символ \_ называется \"underscore\" That is, the slash that was in the string was initially quadrupled. The rest of the symbols have followed, as usual. Plus - the underscore symbol has traced.

    About slashes. How to get rid of them
    Slash, or backslash, from English back slash is a backslash ("\"), which suddenly appears by itself in your variables. It is added to some special characters, but mostly it is noticed because of the quotes.
    This is due to the special PHP settings that are normally enabled on the default hosting. Theoretically, these settings can improve the security of scripts working with the database. In practice, the automatic addition of slashes often results in confusion and inconvenience, both when working with the database, and in its absence.
    Below we will discuss these two cases in detail.

    The automatic addition of the slash is answered by the php.ini directives, which are commonly called "magic quotes":
    magic_quotes_gpc and magic_quotes_runtime If the first is enabled, PHP automatically adds slashes to the data that came from the user - from POST, GET requests and cookies (as well as to the login and password received via HTTP Authorization).
    If the second, the slashes are added to the data received during execution of the script - for example, from a file or database.

    If you work without a database, or you are working with the database correctly (as will be written below), extra slashes are only hindering you, and you need to get rid of them. The easiest and most correct way to disable automatic adding, in the PHP settings.
    This can be done either by correcting the appropriate directives in php.ini, if you have access to it, or by creating a .htaccess file in the site's .htaccess , and adding lines to it
    php_flag magic_quotes_gpc 0
    php_flag magic_quotes_runtime 0

    If you can not disable this way, then you'll have to write code of varying degrees of complexity to clear incoming data from the slash. (However, if you want to write a portable application that does not depend on the PHP settings, you'll have to write it anyway and include a separate block at the beginning of your scripts).

    With the data received during work, to understand the easiest: just at the beginning of the script write
    set_magic_quotes_runtime(0); For the data received from the user, everything is much more complicated. For this code we need two functions:
  • To check if PHP was added, you can use the get_magic_quotes_gpc function.
  • Removes the stripslashes function.
    Accordingly, with the help of the first one it is necessary to check, and if PHP added, then iterate over all incoming variables and clear with the second one.
    If you are working correctly, with register_globals = off , then it is enough to apply stripslashes to all arrays containing data coming from the browser.
    For example, you can include in all the scripts of the site this code:
    function strips (& $el ) {
    if (
    is_array ( $el ))
    foreach(
    $el as $k => $v )
    strips ( $el [ $k ]);
    else
    $el = stripslashes ( $el );
    }
    if (
    get_magic_quotes_gpc ()) {
    strips ( $_GET );
    strips ( $_POST );
    strips ( $_COOKIE );
    strips ( $_REQUEST );
    if (isset(
    $_SERVER [ 'PHP_AUTH_USER' ])) strips ( $_SERVER [ 'PHP_AUTH_USER' ]);
    if (isset(
    $_SERVER [ 'PHP_AUTH_PW' ])) strips ( $_SERVER [ 'PHP_AUTH_PW' ]);
    }
    In the case of incorrect register_globals settings, an acceptable solution will not be found at all, so it's better - I repeat - to work right away with the correct settings.

    Remarks
    • Among the reasons why you should not rely on "magic quotes", there is one more. Very unlikely, but still. To "magic quotes" refers, in fact, not two directives, but three. The third one is magic_quotes_sybase . Not only does it add a quotation mark instead of a slash - it also overrides the magic_quotes_gpc action. If by some miracle both these directives have the status 'on', then the latter will not work! That is, relying on "magic quotes", in this case we will get all the charms of incorrectly composed queries. In general, purely theoretically, one must take into account the existence of this directive, since it also presents such a surprise as ... changing the behavior of the addslashes and stripslashes functions! If magic_quotes_sybase = on , then these functions begin to add and remove a single quote instead of the slash, respectively.
    • Все приведенные примеры касаются только БД Mysql. Конкретные правила составления запросов могут отличаться для других СУБД, но общий принцип остается прежним:
      • если API для работы с БД или сторонняя библиотека предоставляет специальные функции для составления запросов , и есть возможность их использования, то пользоваться в первую очередь надо ими.
      • если таких функций нет, то следует искать в документации функции экранирования спецсимволов для этой СУБД.

    Примечание: формы
    При выводе value в тегах input форм, слеши не помогают.
    Чтобы текст в таком поле выводился целиком, value надо заключать в кавычки , а к выводимым данным применять функцию htmlspecialchars()
    Example:
    <input type="text" name="name" value=" <? echo htmlspecialchars ( $name , ENT_QUOTES ) ?> "> <textarea> <? echo htmlspecialchars ( $text , ENT_QUOTES ) ?> </textarea>
    Необходимо так же отметить (хоть это уже совсем не имеет отношения к кавычкам и слешам), что функцию htmlspecialchars следует применять при выводе в браузер вообще ко всем данным, которые получены от непроверенного пользователя. Почему это следует делать, можно почитать в гугле по запросу что такое XSS уязвимость
    by phpfaq.ru