All about CGI (Common Gateway Interface)
What is CGI?
Let us start with the terminology. CGI (Common Gateway Interface) is an interface that allows a web server to run programs at the request of a browser and to return the results of their work to the browser. A CGI program (script) is a program (script) that runs on the server and communicates with the browser through this interface. Since the terms are not strictly defined, "CGI" very often means the program (script) rather than the interface itself.
If it is a program, it must be in an executable format accepted by the operating system in question. Programs can be written in anything: C/C++, Pascal, Java, Visual Basic, Delphi, etc.
If it is a script, the operating system under which the web server runs must have the corresponding script interpreter: shell, perl, tcl/tk, command.com, etc.
The main requirement for a tool used to develop CGI programs (scripts) is that it can:
- read from the standard input stream (stdin)
- obtain the values of environment variables
- write to the standard output stream (stdout)
What CGI is used for:
- Working with help systems and databases.
- Creating dynamic HTML documents and resources (including counters, guestbooks, etc.)
- Remote administration of various systems.
- Simply working with various programs, because an HTML interface is easy to use, easy to build, and looks good.
The working mechanism of CGI programs
As already mentioned, a CGI program receives its input from the standard input stream (stdin) or from environment variables, and writes its results to the standard output stream (stdout). For those who do not know what these are:
Standard input stream (stdin) - where the program (script) reads its input from by default. Usually this is the keyboard, but it can be redirected so that the program reads from a file, a socket, or the output stream of another program.
Environment variables - variables defined in the environment of the server on which the CGI program runs.
Standard output stream (stdout) - where the program (script) writes the results of its work. Usually this is the screen, but it can be redirected to a file, a socket, the input stream of another program, a printer, etc.
Most examples in this guide are written in shell, purely to simplify the presentation.
Using the shell for CGI scripts is not recommended, for security reasons.
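As a minimal sketch of the three channels described above, the function below reads one environment variable, reads the first line of stdin, and writes both to stdout. The function name describe_request is hypothetical and exists only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: shows the three channels a CGI program uses.
describe_request() {
    # 1. An environment variable set by the web server
    printf 'method: %s\n' "${REQUEST_METHOD:-unknown}"
    # 2. Standard input (only the first line is read here)
    IFS= read -r first_line || true
    # 3. Standard output, which the server passes on to the browser
    printf 'stdin: %s\n' "$first_line"
}
```

From a terminal it can be tried by setting REQUEST_METHOD by hand and piping a line of text into the function.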
1.1 Calling CGI without parameters
A very simple script that prints the current date:
    #!/bin/sh
    echo Content-type: text/html
    echo
    echo "Today is"
    date
In an HTML document, a link to this script is written like a link to any other resource.
IMPORTANT NOTE: The basic mistake made by almost everyone who starts writing CGI programs or scripts is forgetting to output the type of the result, that is, the header of the output document. In the example this is the echo Content-type: text/html line.
The empty line in the output marks the end of the header and the beginning of the document itself.
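One way to make the header hard to forget is to wrap it in a small function. This is only a sketch; the name print_page is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: always emits the header, then the blank line,
# and only then the document body passed as the first argument.
print_page() {
    printf 'Content-type: text/html\n'
    printf '\n'
    printf '%s\n' "$1"
}
```

A script would then call, for example, print_page "Today is $(date)".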
1.2 Passing parameters to a CGI script or program
Parameters are passed as name=value pairs using one of two basic methods: GET and POST. Each has its advantages and disadvantages.
If you use GET, the parameters are appended to the requested URL after a "?" character, with the pairs separated by "&".
The text of the script itself:
    #!/bin/sh
    echo Content-type: text/html
    echo
    echo "You sent this here:"
    set | grep QUERY_STRING
However, the GET method must not be used to pass parameters that contain confidential information: with GET, all of this information is transmitted in the open as part of the URL.
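A script that receives GET parameters usually needs to split QUERY_STRING into separate pairs. The sketch below does only that; the function name split_query is hypothetical, and no URL-decoding is attempted, so "+" and %XX sequences are printed as they arrive.

```shell
#!/bin/sh
# Hypothetical helper: print each name=value pair of a query string
# on its own line as "name = value". No URL-decoding is performed.
split_query() {
    old_ifs=$IFS
    IFS='&'          # pairs are separated by "&"
    set -f           # do not expand wildcards found in the data
    for pair in $1; do
        name=${pair%%=*}
        value=${pair#*=}
        # A parameter without "=" has an empty value
        [ "$pair" = "$name" ] && value=
        printf '%s = %s\n' "$name" "$value"
    done
    set +f
    IFS=$old_ifs
}
```

Inside a CGI script it would be called as split_query "$QUERY_STRING".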
The POST method keeps the parameters out of the URL (although without encryption they still travel over the network unprotected). It passes the parameters to the script on the standard input stream, and you have to use a form to send them. The server does not send EOF to the script at the end of the data. Instead, use the CONTENT_LENGTH environment variable to determine how many bytes to read from stdin.
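Reading exactly CONTENT_LENGTH bytes can be sketched with dd, as below. The function name read_post_body is hypothetical; the byte count may be given as an argument and otherwise falls back to CONTENT_LENGTH.

```shell
#!/bin/sh
# Hypothetical helper: copy exactly N bytes from stdin to stdout.
# N is the first argument, or CONTENT_LENGTH if no argument is given.
read_post_body() {
    # bs=1 makes dd count single bytes, so it never reads past the body
    dd bs=1 count="${1:-${CONTENT_LENGTH:-0}}" 2>/dev/null
}
```

In a CGI script, body=$(read_post_body) would place the submitted form data in a variable.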
Building a counter
Recently, the number of people who want to add a visit counter to their page has been growing at a frantic pace. There are many places on the Internet where you can get counters for any operating system and attach them to your pages.
This chapter will be most useful to those who are interested in how counters work, since all the examples here are deliberately stripped down: they have no settings, no administration features, and so on. More sophisticated, ready-to-use counters can be found through AltaVista, Yahoo, and other search engines, or by asking in the relevant newsgroups (relcom.www.users, relcom.www.support; in the Fido echoes, ru.internet.*).
2.1 Types of counters
By their working mechanism, counters can be roughly divided into two types:
- CGI scripts that work through Server Side Include
- CGI scripts that do not use Server Side Include
A Server Side Include (SSI) call is written in the HTML document as a comment of the form:
    <!--#command tag="value"...-->
where #command is any of the commands understood by the web server. The most interesting one here is #exec, which runs a program and substitutes the result of its work. Documents that the web server analyzes in this way are called server-parsed documents.
2.2 A visit counter working as SSI
The algorithm:
- The server receives a request from the browser for an HTML document.
- The server scans the document for SSI calls.
- If such calls are found, their results are substituted in their place. For the #exec command, this is the output of the program specified in "value".
- The resulting HTML document is sent back to the browser.
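The steps above can be sketched as a small filter. This is a deliberate simplification, not how a real server is implemented: the hypothetical function expand_ssi only recognizes a line that consists entirely of one #exec cmd="..." call, runs that command, and passes every other line through unchanged.

```shell
#!/bin/sh
# Hypothetical filter: reads a document on stdin, writes the result to stdout.
# Only whole-line calls of the form <!--#exec cmd="..." --> are handled.
expand_ssi() {
    prefix='<!--#exec cmd="'
    suffix='" -->'
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            "$prefix"*"$suffix")
                cmd=${line#"$prefix"}
                cmd=${cmd%"$suffix"}
                # Run the command; keep it away from the document on stdin
                sh -c "$cmd" </dev/null
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done
}
```

Because the filter runs whatever command the document names, it also illustrates why SSI must be enabled with care.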
Required server settings (using the Apache server as an example):
- In the srm.conf file, add (if it is not already there):
    AddType text/html .shtml
    AddHandler server-parsed .shtml
  These directives tell the server that files with the .shtml extension are server-parsed documents.
- In the access.conf file, for the directory where the server-parsed documents will be kept, add the Includes option to Options.
- Give the files that contain SSI calls the .shtml extension (see item 1).
This counter is textual, i.e. the script returns plain text, which is displayed as is. Images can be inserted in the same way: replace each text digit with an img tag whose src points to the picture of the corresponding digit. An inquisitive reader will easily guess that the number of img tags equals the number of digits in the value returned by the counter.
The counter is called from the body of the document with the command: <!--#exec cgi="/cgi-bin/counter"-->
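What a text counter of this kind might do internally can be sketched as follows. The function name bump_counter and the idea of passing the data file as an argument are assumptions made for illustration. There is no file locking, so simultaneous requests could lose a count.

```shell
#!/bin/sh
# Hypothetical helper: add one to the number stored in a file,
# save it back, and print the new value.
bump_counter() {
    file=$1
    count=0
    [ -r "$file" ] && read -r count < "$file"
    case $count in
        ''|*[!0-9]*) count=0 ;;   # missing or damaged file: start from zero
    esac
    # Drop leading zeros so the shell does not treat the number as octal
    while [ "${count#0}" != "$count" ] && [ "${#count}" -gt 1 ]; do
        count=${count#0}
    done
    count=$((count + 1))
    printf '%s\n' "$count" > "$file"
    printf '%s\n' "$count"
}
```

A CGI script would print the Content-type header and the empty line first, and then call bump_counter with the path of its data file.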
2.3 A counter that does not use SSI
A counter that does not use SSI is simpler from the user's point of view but harder to program. It works as follows:
- The body of the HTML document contains:
    <img src=/cgi-bin/examples/counter.cgi>
  i.e. the requested picture is not static but is generated dynamically by a CGI script.
- The server, having received the request for the picture, runs the script specified in the src attribute of the img tag.
- The script increments the counter by one, generates a picture with the counter value, and returns it to the browser.
Since this type of counter is the most popular on the Internet, its algorithm is considered in more detail.
The script (counter.cgi), which is called in the body of the HTML document by the tag img src="...counter.cgi", is written in shell and has the following source code (line numbers are added only to simplify the explanation):
    1: #!/bin/sh
    2: now=`date -u`
    3: echo "Content-type: image/gif"
    4: echo "Expires: $now"
    5: echo
    6: counter|showdigits
What the script does, line by line:
1 - The header line of the script. It names the command interpreter that will execute it.
2 - Defines the variable now, which holds the start time of the script (the time the picture was created). The -u switch makes date print the time in GMT. Why this is needed is explained below.
3 - We start forming the header of the server's response. Specify the type of the returned data: image/gif
4 - Since this is a counter, we must make sure that the picture with its reading is not cached (otherwise what kind of counter would it be?). To do this, we indicate that the image received by the browser should expire immediately, using the now variable defined in line 2. Using Expires in this way follows the standard for HTTP version 1.1. But when Expires equals the creation time of the document, an interesting bug can appear if the client's clock lags a few minutes behind the server's clock (the picture then still looks fresh to the client). A dilemma: the standard prescribes it this way, but the result is not what it should be. What to do? In the previous version of the protocol (HTTP 1.0), Expires could be set to 0, and RFC 2068 states that HTTP 1.1 clients must treat this old form (Expires: 0) as already expired. So, dear readers, decide for yourselves.
5 - End of the response header: output an empty line.
6 - Generate the picture itself using two programs, counter and showdigits.
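HTTP expects dates in a fixed form such as "Sun, 06 Nov 1994 08:49:37 GMT", which plain date -u does not produce. As a hedged sketch, the header part of the script could be written as below; the function names http_date and image_headers are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: current time in the date format used by HTTP headers.
http_date() {
    # LC_ALL=C keeps the day and month names in English
    LC_ALL=C date -u '+%a, %d %b %Y %H:%M:%S GMT'
}

# Hypothetical helper: response header for an image that expires at once.
image_headers() {
    printf 'Content-type: image/gif\n'
    printf 'Expires: %s\n' "$(http_date)"
    printf '\n'   # the empty line ends the header
}
```

If the clock problem described for line 4 is a concern, the Expires line could instead print the value 0.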
The programs counter and showdigits are written in C using libgd, a library for working with GIF files. They will not compile without it. The latest version of the library can always be obtained at http://www.boutell.com/gd/ .
These programs work as follows:
- counter - reads from the file counter.rc the number representing the previous value of the counter, adds one to it, and writes it back. If the path to the digit pictures and the file name mask of these pictures are not specified, it uses the defaults defined in the body of the program. It then writes the new counter value and the path to the pictures to stdout, which forms the command line for showdigits.
- showdigits - this program actually builds the picture with the current counter reading. It uses a set of ready-made digit images (GIF format, all of the same size) and the data received on stdin from counter. Using the path and the mask, it picks the required digit pictures and assembles them into a single GIF. The result goes straight to ... stdout! The server then sends this stream to the browser, which displays it as a picture, since the response header says that it is a GIF.
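The step in which a path and a mask are turned into a list of digit pictures can be sketched in shell. The real programs are written in C and their mask syntax is not shown here, so the printf-style mask (for example %s.gif) and the function name digit_files are assumptions made only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: for each digit of a counter value, print the
# file name of the matching picture.
#   $1 = counter value, $2 = directory, $3 = printf-style mask (assumed)
digit_files() {
    value=$1
    dir=$2
    mask=$3
    case $value in
        ''|*[!0-9]*) return 1 ;;   # accept decimal digits only
    esac
    while [ -n "$value" ]; do
        rest=${value#?}            # everything after the first digit
        digit=${value%"$rest"}     # the first digit itself
        printf '%s/' "$dir"
        printf "$mask" "$digit"
        printf '\n'
        value=$rest
    done
}
```

For a three-digit value the function prints three file names, one per line, in the order in which the pictures would be joined.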
Server Side Includes
Static HTML documents are good, of course, but dynamically generated ones are even better.
3.1 What is SSI
As already mentioned in the previous chapter, Server Side Include (SSI) is a web server directive that lets the server substitute some data in place of the SSI call. In an HTML document, an SSI call looks like a comment: <!--#command tag="value"...-->
where #command is any of the SSI directives understood by the web server, and "value" is its parameter.
The inserted data can be either static or dynamically generated. Static data is ready-made and stored in files: fragments of text or HTML. It is convenient when the same fragment has to appear in several HTML documents. Dynamically generated data is the output of CGI scripts or of commands of the operating system on which the web server runs. This kind of data gives the web developer great possibilities. But, as the tiresome TV commercial goes, "Don't forget about sugar-free Orbit!" That is, REMEMBER TO KEEP ACCESS TO INFORMATION SECURE! Incorrect use of SSI can open the way to unauthorized access to information and, accordingly, to various grave consequences.
3.2 Basic SSI directives
- config - manages various aspects of parsing a document. Attributes:
  - errmsg - the error message returned to the client if an error occurs while the document is being parsed.
  - sizefmt - sets the format for file sizes (bytes, kilobytes, megabytes).
  - timefmt - sets the date/time format.
- echo - prints the value of one of the environment variables listed in section 3.3. Attributes:
  - var - the name of the variable to print.
- exec - executes the specified command or CGI script. Attributes:
  - cgi - specifies the (%-encoded) URL-relative path to the CGI script. If the path does not begin with a slash (/), it is taken relative to the current document. The CGI script is passed the values of the PATH_INFO and QUERY_STRING variables of the original client request.
  - cmd - the server executes the specified string using the operating system shell.
- fsize - prints the size of the specified file, formatted according to sizefmt. Attributes:
  - file - specifies the path to the file relative to the directory containing the document being parsed.
  - virtual - specifies the (%-encoded) URL-relative path to the file. If the path does not begin with a slash (/), it is taken relative to the current document.
- flastmod - prints the date/time of the last modification of the specified file, formatted according to timefmt. The attributes are the same as for fsize.
- include - inserts the text of another document or file into the document being parsed. It is very useful for fragments that repeat in different documents. Attributes:
  - file - specifies the path to the file relative to the directory containing the document being parsed.
  - virtual - specifies the (%-encoded) URL-relative path to the file. If the path does not begin with a slash (/), it is taken relative to the current document. In Apache, included files can be nested.
- printenv - prints a list of all existing variables and their values. It has no attributes. Example: <!--#printenv -->
- set - sets the value of a variable. Attributes:
  - var - the name of the variable to set.
  - value - the value to assign to it.
  Example: <!--#set var="variable_1" value="some_value_of_variable_1" -->
3.3 SSI environment variables
- DOCUMENT_NAME - the file name. In the body of the document: <!--#echo var="DOCUMENT_NAME" -->
- DOCUMENT_URI - the virtual path to the file. In the body of the document: <!--#echo var="DOCUMENT_URI" -->
- QUERY_STRING_UNESCAPED - the decoded query string, with all shell metacharacters preceded by "\". In the body of the document: <!--#echo var="QUERY_STRING_UNESCAPED" -->
- DATE_LOCAL - the current date and time (local). In the body of the document: <!--#echo var="DATE_LOCAL" -->
- DATE_GMT - the current date and time (GMT). In the body of the document: <!--#echo var="DATE_GMT" -->
- LAST_MODIFIED - the date and time of the last modification of the file. In the body of the document: <!--#echo var="LAST_MODIFIED" -->
3.4 Configuring the server
For the server to know where in a document to substitute data, it has to parse that document. Documents parsed by the server are called server-parsed documents.
First of all, you need to tell the server which documents it should parse. To do this, add the following parameters to the configuration file (srm.conf for the NCSA web server and for older versions of Apache; httpd.conf for newer Apache versions, for example 1.3.4).
Apache server (the same directives as in section 2.2):
    AddType text/html .shtml
    AddHandler server-parsed .shtml
The NCSA server:
"Why specify a separate extension for server-parsed documents?", the inquisitive reader will ask. We answer: of course, nothing prevents you from adding to the configuration file a line
It should also be remembered that it is pointless to put SSI calls in the output of a CGI program, because that output is not parsed by the server.
For more detailed information on configuring your server for SSI, read the documentation for your server.
Appendices
Appendix 1. Server environment variables
The following is a list of the main server environment variables with a brief description of each. The server used here is Apache 1.2.5 with the PHP/FI-2.0.1 module. For other web servers (MS IIS, Netscape, NCSA httpd, etc.), the variables may differ.
REMOTE_HOST - the name of the host that is connected to the server. When working through a proxy, the name of the proxy.
Example: REMOTE_HOST=lom.pvrr.ru
REMOTE_ADDR - the IP address of the host that is connected to the server. When working through a proxy, the IP address of the proxy.
Example: REMOTE_ADDR=194.87.186.11
REMOTE_PORT - the client port number.
Example: REMOTE_PORT=3381
HTTP_USER_AGENT - the name, version number, etc. of the client (browser). The use of this variable sometimes infuriates some Internet users, but it is in fact a very useful thing, for example, for auto-detecting Russian encodings.
Example: HTTP_USER_AGENT=Mozilla/4.07 [en] (X11; I; FreeBSD 2.2.6-RELEASE i386)
HTTP_ACCEPT - the data types, besides text/html, accepted by the client (browser).
Example: HTTP_ACCEPT=image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
HTTP_ACCEPT_CHARSET - which character sets the client (browser) understands.
Example: HTTP_ACCEPT_CHARSET=iso-8859-1,*,utf-8
HTTP_ACCEPT_LANGUAGE - which languages the client (browser) accepts.
Example: HTTP_ACCEPT_LANGUAGE=nl,nl-BE,en
* * *
SERVER_NAME - the server name corresponding to the IN A record in DNS, or the value of the ServerName (or similar) directive in the server configuration.
Example: SERVER_NAME=arche.pvrr.ru
HTTP_HOST - the name of the server or virtual host that the client accesses. The value of HTTP_HOST may be equal to the value of SERVER_NAME.
Example: HTTP_HOST=www.pvrr.ru
SERVER_SOFTWARE - the software used as the server.
Example: SERVER_SOFTWARE=Apache/1.2.5 PHP/FI-2.0.1
DOCUMENT_ROOT - the path to the "root" of the web server from the "root" of the file system of the computer on which it is running.
Example: DOCUMENT_ROOT=/usr/local/www/html
HTTP_CONNECTION - the connection type.
Example: HTTP_CONNECTION=keep-alive
SERVER_PROTOCOL - the protocol used to communicate with a particular client.
Example: SERVER_PROTOCOL=HTTP/1.0
REQUEST_URI - the name of the requested resource/document, including the path from the root of the web server. When the server root or a directory is requested, this variable contains the directory name, or "/" in the case of the server root.
Example: REQUEST_URI=/cgi-bin/tralala/script.cgi
DOCUMENT_URI - the name of the requested resource/document, including the path from the root of the web server. It is usually initialized when SSI is invoked. Unlike REQUEST_URI, when a directory or the server root is requested, this variable also contains the name of the file that serves as the Directory Index of that directory.
Example: DOCUMENT_URI=/tralala/index.shtml
HTTP_REFERER - the full URL of the document from which you followed a link to this server. This variable can be used when writing counters.
Example: HTTP_REFERER=http://lom.pvrr.ru/java/cgi/cgi_1.html
GATEWAY_INTERFACE - the name/version of the interface through which the server works with the script.
Example: GATEWAY_INTERFACE=CGI/1.1
SCRIPT_FILENAME - the name of the script, including the full path from the "root" of the file system.
Example: SCRIPT_FILENAME=/usr/local/www/cgi-bin/tralala/script.cgi
SCRIPT_NAME - the name of the script, including the path from the "root" of the web server.
Example: SCRIPT_NAME=/cgi-bin/tralala/script.cgi
REQUEST_METHOD - the method used by the client to send data to the server. Possible values include GET, HEAD, POST, PUT.
Example: REQUEST_METHOD=GET
QUERY_STRING - this variable is assigned a value when data is sent to the server with the GET method.
Example: QUERY_STRING=button=on
CONTENT_LENGTH - this variable is set to the number of bytes sent by the browser to the server when the POST method is used.
Example: CONTENT_LENGTH=9
REMOTE_USER - the user name. It is passed only if access to the CGI script is authenticated.
PATH_INFO - additional path information passed by the client. That is, the script can receive parameters containing information about some "path" to some data (for example, to a configuration file needed to process the request of this particular client). This path is "virtual", i.e. it starts from the "root" of the web server. The rest of the data can be passed as usual, with GET or POST.
Example: PATH_INFO=/some/path
PATH_TRANSLATED - the same as PATH_INFO, except that the path is physical, i.e. it starts from the "root" of the file system.
REMOTE_IDENT - if the HTTP server supports identification according to RFC 931, this variable is set to the user name obtained from the server.
SERVER_ADMIN - the e-mail address of the web server administrator.
Example: [email protected]
SERVER_PORT - the port that the web server "listens" on.
Example: SERVER_PORT=80
* * *
HTTP_X_FORWARDED_FOR - when working through a proxy, the IP address of the client behind the proxy.
Example: HTTP_X_FORWARDED_FOR=194.87.186.11
HTTP_VIA - the name, port number, and software type of the proxy server.
Example: HTTP_VIA=1.0 proxy1.pvrr.ru:8080 (Squid/2.1.PATCH1)
HTTP_CACHE_CONTROL - something related to the age of the document in the cache of the proxy server. I will not lie: I do not know.
Example: HTTP_CACHE_CONTROL=max-age=259200
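To see which of these variables a particular server really sets, a script can simply print a few of them. The sketch below is one way to do it; the function name show_env and the choice of variables are illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: print a fixed selection of CGI variables as NAME=value.
# Variables that the server did not set are printed with an empty value.
show_env() {
    for name in REQUEST_METHOD QUERY_STRING CONTENT_LENGTH REMOTE_ADDR SERVER_NAME; do
        # The names come from the fixed list above, so eval is safe here
        eval "value=\${$name-}"
        printf '%s=%s\n' "$name" "$value"
    done
}
```

In a CGI script the call would follow a Content-type: text/plain header and the empty line that ends the header.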