Thanks to the World Wide Web, almost anyone can provide information online in a form that is easy on the eyes and can be widely disseminated. You've no doubt surfed the Internet and seen other sites, and now you probably know that scary acronyms like "HTTP" and "HTML" are simply shorthand for "Web" and "the way information is expressed on the Internet." You may already have some experience presenting information on the Internet.
The Internet has proven to be an ideal medium for distributing information, as can be seen from its enormous popularity and rapid growth. Although some have questioned the usefulness of the Internet and attribute its growth and popularity mainly to intrusive advertising, the Internet is undeniably an important medium for presenting all kinds of information. Not only are there many services that provide the latest information (news, weather, live sporting events) and reference materials in electronic format, significant amounts of other types of data are also offered. The IRS, which distributed all of its 1995 tax return forms and other information via the World Wide Web, recently admitted to receiving fan mail for its Web site. Who would have thought that the IRS would ever receive fan mail? This happened not because its site was well designed, but because it turned out to be a truly useful tool for thousands, perhaps millions, of people.
What makes the Web unique and such an attractive information service? First of all, it provides a hypermedia interface to data. Think about your computer's hard drive. Typically, data is expressed in linear form, similar to a file system. For example, you have a number of folders, and inside each folder there are either documents or other folders. The Web uses a different paradigm for expressing information, called hypermedia. A hypertext interface consists of a document and links. Links are words that you click to see other documents or find other types of information. The Web expands the concept of hypertext to include other types of media, such as graphics, sounds, and video (hence the name "hypermedia"). Selecting text or graphics in a document allows you to see related information about the selected item in any number of forms.
Almost everyone can benefit from this simple and unique way of presenting and distributing information, from academics who want to immediately share data with their colleagues to business people who share information about their company with everyone. However, although providing information is extremely important, in the last few years many have come to feel that obtaining information is no less important.
Although the Web provides a unique hypermedia interface for information, there are many other effective ways to distribute data. For example, network services such as File Transfer Protocol (FTP) and Gopher existed long before the advent of the World Wide Web. Electronic mail has been the primary medium for communication and information exchange on the Internet and most other networks almost from the very beginning of these networks. Why has the Web become such a popular way of distributing information? The multimedia aspect of the Web has contributed significantly to its unprecedented success, but for the Web to be most effective, it must be interactive.
Without the ability to receive user input and provide information, the Web would be a completely static environment. The information would only be available in the format specified by the author. This would undermine one of the capabilities of computing in general: interactive information. For example, rather than forcing the user to view multiple documents as if he or she were looking through a book or dictionary, it would be better to allow the user to identify keywords on a topic of interest. Users can customize the presentation of data rather than relying on a rigid structure defined by the content provider.
The term "Web server" can be misleading because it can refer to both the physical machine and the software it uses to communicate with Internet browsers. When a browser requests a given Web address, it first connects to the machine via the Internet, sending the Web server software a request for the document. This software runs continuously, waiting for such requests to arrive and responding accordingly.
Although servers can send and receive data, the server itself has limited functionality. For example, the most primitive server can only send the requested file to the browser. The server usually does not know what to do with any additional input. Unless you tell the server how to handle such additional information, the server will most likely ignore the input.
In order for the server to be able to perform other operations besides searching and sending files to the Internet browser, you need to know how to expand the functionality of the server. For example, a Web server cannot search a database based on a keyword entered by a user and return multiple matching documents unless such a capability has been programmed into the server in some way.
What is CGI?
The Common Gateway Interface (CGI) is an interface to the server that allows you to extend the functionality of the server. Using CGI, you can interact with users who access your site. At a theoretical level, CGI allows the server to parse (interpret) input from the browser and return information based on the user's input. On a practical level, CGI is an interface that allows a programmer to write programs that communicate easily with a server.
Normally, to expand the server's capabilities, you would have to modify the server yourself. This solution is undesirable because it requires an understanding of low-level Internet Protocol network programming. It would also require editing and recompiling the server source code or writing a custom server for each task. Let's say you want to extend the server's capabilities so that it acts as a Web-to-e-mail gateway, taking user-entered information from the browser and e-mailing it to another user. You would have to insert code into the server to parse the input from the browser, forward it via e-mail to the other user, and send a response back to the browser over the network connection.
First, such a task requires access to the server code, which is not always possible.
Second, it is difficult and requires extensive technical knowledge.
Third, it works only for a specific server. If you need to move your server to another platform, you will have to start over or at least spend a lot of time porting the code to that platform.
Why CGI?
CGI offers a portable and simple solution to these problems. The CGI protocol defines a standard way for programs to communicate with the Web server. Without any special knowledge, you can write a program in any programming language that interfaces and communicates with the Web server. This program will work with all Web servers that understand the CGI protocol.
CGI communication is done using standard input and output, which means that if you know how to print and read data using your programming language, you can write a Web server application. Apart from parsing input and output, programming CGI applications is almost equivalent to programming any other application. For example, to program the "Hello, World!" program, you use your language's print functions and the format defined for CGI programs to print the corresponding message.
Selecting a programming language
Because CGI is a universal interface, you are not limited to any specific programming language. An important question that is often asked is: what programming languages can be used for CGI programming? You can use any language that allows you to do the following:
- Print to standard output
- Read from standard input
- Read from environment variables
Almost all programming languages and many scripting languages do these three things, and you can use any of them.
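To make these three requirements concrete, here is a minimal sketch in C, one of the two languages this book uses. The function name echo_request is hypothetical and chosen only for illustration. The sketch takes its input and output streams as parameters so that you can try it outside a Web server.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads one line of input from `in`, looks up the environment variable
 * `var_name`, and prints both to `out`. Returns 0 on success, or -1 if
 * no input could be read. In a CGI program, `in` would be stdin and
 * `out` would be stdout. (Hypothetical helper, for illustration only.) */
int echo_request(FILE *in, FILE *out, const char *var_name)
{
    char line[256];
    const char *value = getenv(var_name);      /* read an environment variable */

    if (fgets(line, sizeof line, in) == NULL)  /* read from standard input */
        return -1;
    line[strcspn(line, "\r\n")] = '\0';        /* strip the trailing newline */

    /* print to standard output */
    fprintf(out, "%s=%s\n", var_name, value != NULL ? value : "(unset)");
    fprintf(out, "input=%s\n", line);
    return 0;
}
```

A CGI program's main function could simply call echo_request(stdin, stdout, "REQUEST_METHOD").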
Languages fall into one of the following two classes: compiled and interpreted. A program written in a compiled language such as C or C++ is usually smaller and faster, while interpreted languages such as Perl or Rexx sometimes require a large interpreter to be loaded upon startup. Additionally, you can distribute binaries (code compiled into machine language) without source code if your language is compiled. Distributing interpreted scripts usually means distributing source code.
Before choosing a language, you first need to consider your priorities. You need to weigh the benefits of the speed and efficiency of one programming language against the ease of programming of another. If you have a desire to learn another language, instead of using the one you already know, carefully weigh the advantages and disadvantages of both languages.
The two most commonly used languages for CGI programming are C and Perl (both of which are covered in this book). Both have clear advantages and disadvantages. Perl is a very high-level yet powerful language, especially suitable for parsing text. Although its ease of use, flexibility, and power make it an attractive language for CGI programming, its relatively large size and slower execution sometimes make it unsuitable for some applications. C programs are smaller, more efficient, and provide lower-level system control, but they are more complex to program, do not have easy built-in text processing routines, and are more difficult to debug.
Which language is most suitable for CGI programming? The one that you find more convenient to program in. Both are equally effective for programming CGI applications, and with the proper libraries, both have similar capabilities. However, if you have a heavily loaded server, you might prefer smaller, compiled C programs. If you need to quickly write an application that requires a lot of text processing, you might prefer Perl instead.
Cautions
There are some important alternatives to CGI applications. Many servers now include a programming API, which makes it easier to program direct server extensions as opposed to standalone CGI applications. Server APIs are generally more efficient than CGI programs. Other servers include built-in functionality that can handle special non-CGI elements, such as database linking. Finally, some applications can be handled by new client-side (rather than server-side) technologies like Java. With such rapid changes in technology, will CGI quickly become obsolete?
Hardly. CGI has several advantages over newer technologies.
- It is versatile and portable. You can write a CGI application using almost any programming language on any platform. Some of the alternatives, such as the server API, limit you to certain languages and are much more difficult to learn.
- It is unlikely that client-side technologies such as Java will replace CGI, because there are some applications that server-side applications are much better suited to run.
- Many of the limitations of CGI are limitations of HTML or HTTP. As Internet standards as a whole evolve, so do CGI capabilities.
Summary
The Common Gateway Interface is the protocol by which programs interact with Web servers. The versatility of CGI gives programmers the ability to write gateway programs in almost any language, although there are many trade-offs associated with different languages. Without this ability, creating interactive Web pages would be difficult, at best requiring server modifications, and interactivity would be unavailable to most users who are not site administrators.
Chapter 2: Basics
Several years ago, I created a page for a college at Harvard where people could submit their comments. At the time, the Web was young and documentation was scarce. I, like many others, relied on sparse documentation and programs written by others to teach myself CGI programming. Although this method of study required some searching and many experiments, and raised many questions, it was very effective. This chapter is the result of my early work with CGI (with a few tweaks, of course).
Although it takes some time to fully understand and master the common gateway interface, the protocol itself is quite simple. Anyone who has some basic programming skills and is familiar with the Web can quickly learn to program fairly complex CGI applications just as I and others learned to do several years ago.
The purpose of this chapter is to present the basics of CGI in a comprehensive, albeit condensed, way. Each concept discussed here is presented in detail in subsequent chapters. However, after completing this chapter, you can immediately begin programming CGI applications. Once you reach this level, you can learn the intricacies of CGI, either by reading the rest of this book or simply experimenting on your own.
You can boil down CGI programming to two tasks: receiving information from the Web browser and sending information back to the browser. This is quite intuitive once you become familiar with the normal use of CGI applications. Often the user is asked to fill out some form, for example, to enter his or her name. Once the user fills out the form and presses Enter, this information is sent to the CGI program. The CGI program must then convert this information into something it understands, process it accordingly, and then send something back to the browser, be it a simple confirmation or the result of a search in a multi-purpose database.
In other words, programming CGI requires understanding how to receive input from the Web browser and how to send output back. What happens between the input and output stages of a CGI program depends on the developer's goal. You'll find that the main difficulty in CGI programming lies in this intermediate stage; once you learn how to work with input and output, you know essentially everything you need to become a CGI developer.
In this chapter, you'll learn the principles behind CGI input and output, as well as other basic skills you'll need to write and use CGI, including things like creating HTML forms and naming your CGI programs. This chapter covers the following topics:
- Traditional program "Hello, World!";
- CGI Output: Sending information back for display in an Internet browser;
- Configuring, installing, and running the application. You will learn about different Web platforms and servers;
- CGI Input: Interpretation of information sent by the Web browser. Introduction to some useful programming libraries for parsing such input;
- A simple example that brings together all the lessons in this chapter;
- Programming strategy.
Due to the nature of this chapter, I only touch lightly on some topics. Don't worry; all of these topics are covered in much more depth in other chapters.
Hello, World!
You start with a traditional introductory programming problem. You will write a program that displays "Hello, World!" on your Web browser. Before you write this program, you must understand what information the Web browser expects to receive from CGI programs. You also need to know how to run this program so you can see it in action.
CGI is language independent, so you can implement this program in any language. Several different languages are used here to demonstrate this language independence. In Perl, the "Hello, World!" program is shown in Listing 2.1.
Listing 2.1. Hello, World! in Perl.
#!/usr/local/bin/perl
# hello.cgi - My first CGI program
print "Content-Type: text/html\n\n";
print "<html><head><title>Hello, World!</title></head><body>\n";
print "<h1>Hello, World!</h1>\n";
print "</body></html>\n";
Save this program as hello.cgi, and install it in the appropriate location. (If you're not sure where it is, don't worry; you'll find out in the "Installing and Running a CGI Program" section later in this chapter.) For most servers, the directory you need is cgi-bin. Now, call the program from your Web browser. For most, this means opening the following uniform resource locator (URL):
http://hostname/directoryname/hello.cgi
Hostname is the name of your Web server, and directoryname is the directory where you put hello.cgi (probably cgi-bin).
Dissecting hello.cgi
There are a few things to note about hello.cgi.
First, you use simple print commands. CGI programs do not require any special file descriptors or output descriptors. To send output to the browser, simply print to stdout.
Second, note that the content of the first print statement (Content-Type: text/html) does not appear on your Web browser. You can send any information you want back to the browser (HTML page, graphics or sound), but first, you need to tell the browser what kind of data you are sending it. This line tells the browser what kind of information to expect - in this case, an HTML page.
Third, the program is called hello.cgi. You don't always need to use the .cgi extension in the name of your CGI program. Although source files in many languages have their own extensions, the .cgi extension does not indicate the language; it is a way for the server to identify the file as an executable rather than a graphic, HTML, or text file. Servers are often configured to try to execute only those files that have this extension and to display the contents of all others. Although using the .cgi extension is not required, it is still considered good practice.
In general, hello.cgi consists of two main parts:
- tells the browser what information to expect (Content-Type: text/html)
- tells the browser what to display (Hello, World!)
Hello, World! in C
To show the language independence of CGI programs, Listing 2.2 shows the equivalent of the hello.cgi program written in C.
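Listing 2.2. Hello, World! in C.
/* hello.cgi.c - Hello, World CGI */
#include <stdio.h>
int main(void) {
  /* The header and the blank line after it come first */
  printf("Content-Type: text/html\r\n\r\n");
  printf("<html><head><title>Hello, World!</title></head><body>\n");
  printf("<h1>Hello, World!</h1>\n");
  printf("</body></html>\n");
  return 0;
}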
Note
Note that the Perl version of hello.cgi uses
print "Content-Type: text/html\n\n";
while the C version uses
printf("Content-Type: text/html\r\n\r\n");
Why does the Perl print statement end with two newlines (\n) while the C printf ends with two carriage return and newline pairs (\r\n)?
Technically, headers (all output before the blank line) are expected to be separated by carriage returns and newlines. Unfortunately, on DOS and Windows machines, Perl translates \r as another newline rather than as a carriage return.
Although omitting the \r characters in Perl is technically incorrect, it works with almost all servers and is equally portable across all platforms. Therefore, in all the Perl examples in this book, I separate headers with newlines rather than carriage returns and newlines.
An appropriate solution to this problem is presented in Chapter 4, "Output."
Neither the Web server nor the browser cares what language is used to write the program. Although each language has advantages and disadvantages as a CGI programming language, it is best to use the language that you are most comfortable working with. (The choice of programming language is discussed in more detail in Chapter 1, “Common Gateway Interface (CGI)”).
CGI Output
Now you can take a closer look at the issue of sending information to the Web browser. From the "Hello, World!" example, you can see that Web browsers expect two sets of data: a header, which contains information such as what kind of data to display (for example, the Content-Type: line), and the actual data (what the Web browser displays). These two pieces of information are separated by a blank line.
The header is called the HTTP header. It gives important information about the data that the browser is going to receive. There are several different types of HTTP headers, and the most common is the one you've already used: the Content-Type: header. You can use different combinations of HTTP headers, separated by carriage returns and newlines (\r\n). The blank line separating the header from the data also consists of a carriage return and a newline (why both are needed is briefly discussed in the preceding note and detailed in Chapter 4). You'll learn about other HTTP headers in Chapter 4; for now, you are dealing only with the Content-Type: header.
The Content-Type: header describes the type of data that the CGI program returns. The appropriate format for this header is:
Content-Type: type/subtype
where type/subtype is a valid Multipurpose Internet Mail Extensions (MIME) type. The most common MIME type is the HTML type: text/html. Table 2.1 lists a few more common MIME types that will be discussed; a more complete listing and analysis of MIME types is provided in Chapter 4.
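The header format just described can be sketched in C as follows. The function name format_response is hypothetical; it builds the response in a buffer you supply, and it assumes that a single Content-Type: header is all you need.

```c
#include <stdio.h>

/* Formats a complete CGI response into buf: one Content-Type: header
 * ending in \r\n, a blank line (\r\n), then the body.
 * Returns the number of characters written, or -1 if buf is too small.
 * (Hypothetical helper, for illustration only.) */
int format_response(char *buf, size_t size,
                    const char *content_type, const char *body)
{
    int n = snprintf(buf, size, "Content-Type: %s\r\n\r\n%s",
                     content_type, body);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

A CGI program would then write the buffer to stdout, for example with fputs(buf, stdout).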
Note
MIME was originally invented to describe the contents of mail message bodies. It has become a fairly common way to represent Content-Type information. You can read more about MIME in RFC 1521. RFC stands for Request for Comments; RFCs are summaries of decisions made by groups on the Internet trying to set standards. You can view RFC 1521 at the following address: http://andrew2.andrew.cmu.edu/rfc/rfc1521.html
Table 2.1. Some common MIME types.
MIME Type      Description
text/html      Hypertext Markup Language (HTML)
text/plain     Plain text files
image/gif      GIF graphics files
image/jpeg     JPEG compressed graphics files
audio/basic    Sun *.au audio files
audio/x-wav    Windows *.wav files
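If your CGI program returns files of different kinds, it needs some way to pick the right MIME type. One simple approach, sketched here in C, is a lookup table keyed by filename extension. The function name mime_type_for is hypothetical, and the .html, .txt, .gif, .jpg, and .jpeg extensions are assumptions; Table 2.1 names only the *.au and *.wav extensions.

```c
#include <stddef.h>
#include <string.h>

/* One row of the lookup table: a filename extension and its MIME type. */
struct mime_entry {
    const char *extension;
    const char *mime_type;
};

/* MIME types from Table 2.1. All extensions except .au and .wav are
 * assumptions made for this sketch. */
static const struct mime_entry MIME_TABLE[] = {
    { ".html", "text/html"   },
    { ".txt",  "text/plain"  },
    { ".gif",  "image/gif"   },
    { ".jpg",  "image/jpeg"  },
    { ".jpeg", "image/jpeg"  },
    { ".au",   "audio/basic" },
    { ".wav",  "audio/x-wav" }
};

/* Returns the MIME type for filename based on its extension, or NULL
 * if the extension is missing or not in the table. The comparison is
 * case-sensitive. (Hypothetical helper, for illustration only.) */
const char *mime_type_for(const char *filename)
{
    const char *dot = strrchr(filename, '.');
    size_t i;

    if (dot == NULL)
        return NULL;
    for (i = 0; i < sizeof MIME_TABLE / sizeof MIME_TABLE[0]; i++) {
        if (strcmp(dot, MIME_TABLE[i].extension) == 0)
            return MIME_TABLE[i].mime_type;
    }
    return NULL;
}
```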
After the header and an empty line, you simply print the data in the form you need. If you are sending HTML, then print HTML tags and data to stdout after the header. You can also send graphics, sound and other binary files by simply printing the contents of the file to stdout. Several examples of this are given in Chapter 4.
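Sending a binary file works the same way as sending HTML: the header comes first, then the raw bytes. The following C sketch copies an already opened file to an output stream. The function name send_stream is hypothetical, and the caller is assumed to have opened the input in binary mode.

```c
#include <stdio.h>

/* Writes a Content-Type: header and a blank line to out, then copies
 * every byte of in after it. Returns the number of body bytes copied,
 * or -1 on a read or write error.
 * (Hypothetical helper, for illustration only.) */
long send_stream(FILE *out, FILE *in, const char *content_type)
{
    char chunk[4096];
    size_t n;
    long total = 0;

    if (fprintf(out, "Content-Type: %s\r\n\r\n", content_type) < 0)
        return -1;
    while ((n = fread(chunk, 1, sizeof chunk, in)) > 0) {
        if (fwrite(chunk, 1, n, out) != n)
            return -1;
        total += (long)n;
    }
    return ferror(in) ? -1 : total;
}
```

In a CGI program, out would be stdout. On DOS and Windows, keep in mind that stdout is normally opened in text mode, which can alter binary data.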
Installing and Running a CGI Program
This section deviates somewhat from CGI programming and talks about configuring your Web server to use CGI, installing and running programs. You'll be introduced to different servers for different platforms in more or less detail, but you'll have to dig deeper into your server's documentation to find the best option.
All servers require space for server files and space for HTML documents. In this book, the server area is called ServerRoot, and the document area is called DocumentRoot. On UNIX machines, ServerRoot is usually /usr/local/etc/httpd/, and DocumentRoot is usually /usr/local/etc/httpd/htdocs/. However, your system may use different locations, so replace all references to ServerRoot and DocumentRoot with your own ServerRoot and DocumentRoot.
When you access files using your Web browser, you specify the file in the URL relative to the DocumentRoot. For example, if your server address is mymachine.org and the file index.html is stored directly in DocumentRoot, then you access this file with the following URL: http://mymachine.org/index.html
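As a rough picture of what "relative to the DocumentRoot" means, the following C sketch joins a DocumentRoot and the path part of a URL into a file path. It is a simplification, not the way any particular server is implemented. The function name resolve_document_path is hypothetical, and rejecting every path that contains ".." is an assumed safety rule.

```c
#include <stdio.h>
#include <string.h>

/* Joins document_root and url_path (which must begin with '/') into buf.
 * Returns 0 on success, or -1 if url_path does not begin with '/',
 * contains "..", or the result does not fit in buf.
 * (Hypothetical helper, for illustration only.) */
int resolve_document_path(char *buf, size_t size,
                          const char *document_root, const char *url_path)
{
    size_t root_len = strlen(document_root);
    int n;

    if (url_path[0] != '/' || strstr(url_path, "..") != NULL)
        return -1;
    /* Drop a trailing slash on the root so the result has exactly one. */
    if (root_len > 0 && document_root[root_len - 1] == '/')
        root_len--;
    n = snprintf(buf, size, "%.*s%s", (int)root_len, document_root, url_path);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With the DocumentRoot used in this book, the URL path /index.html resolves to /usr/local/etc/httpd/htdocs/index.html.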
Configuring the server for CGI
Most Web servers are pre-configured to allow the use of CGI programs. Typically, the server uses one of two methods to decide whether a file is a CGI application:
- Designated directory. Some servers allow you to specify that all files in a designated directory (usually called cgi-bin by default) are CGI.
- File name extensions. Many servers are pre-configured to treat all files ending in .cgi as CGI.
The designated directory method is something of a relic of the past (the very first servers used it as the only method for determining which files were CGI programs), but it has several advantages.
- It keeps CGI programs centralized, preventing other directories from becoming cluttered.
- You're not limited to any particular filename extension, so you can name your files whatever you want. Some servers allow you to designate several different directories as CGI directories.
- It also gives you more control over who can write CGI. For example, if you maintain a server on a system with multiple users and don't want them to run their own CGI scripts before you have reviewed the programs for security reasons, you can designate only those files in a limited, centralized directory as CGI. Users will then have to give you their CGI programs to install, and you can first audit the code to make sure the program doesn't have any major security issues.
Identifying CGI by filename extension can be useful because of its flexibility. You are not limited to one single directory for CGI programs. Most servers can be configured to recognize CGI via the filename extension, although not all are configured this way by default.
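The two recognition methods amount to a simple test on the requested path, which the following C sketch illustrates. The function name is_cgi_path is hypothetical, and the /cgi-bin/ directory and .cgi extension are hard-coded here only because they are the usual defaults; a real server reads these settings from its configuration.

```c
#include <string.h>

/* Returns 1 if url_path should be treated as a CGI program, 0 otherwise.
 * (Hypothetical helper, for illustration only.) */
int is_cgi_path(const char *url_path)
{
    static const char cgi_dir[] = "/cgi-bin/";
    static const char cgi_ext[] = ".cgi";
    size_t path_len = strlen(url_path);
    size_t ext_len = sizeof cgi_ext - 1;

    /* Designated directory: everything under /cgi-bin/ is CGI. */
    if (strncmp(url_path, cgi_dir, sizeof cgi_dir - 1) == 0)
        return 1;
    /* Filename extension: anything ending in .cgi is CGI. */
    if (path_len >= ext_len &&
        strcmp(url_path + path_len - ext_len, cgi_ext) == 0)
        return 1;
    return 0;
}
```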
Warning
Remember the importance of security issues when you configure your server for CGI. Some tips will be covered here, and Chapter 9, Protecting CGI, covers these aspects in more detail.
Installing CGI on UNIX servers
Regardless of how your UNIX server is configured, there are several steps you need to take to ensure that your CGI applications run as expected. Your Web server will typically run as a nonexistent user (that is, the UNIX user nobody - an account that has no file permissions and cannot be logged in to). CGI scripts (written in Perl, the Bourne shell, or another scripting language) must therefore be world-readable and world-executable.
Tip
To make your files world-readable and world-executable, use the following UNIX permissions command: chmod 755 filename.
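To see what the number 755 stands for, it helps to decode it. Each octal digit holds the read, write, and execute bits for the owner, the group, and everyone else, in that order. The following C sketch turns such a mode into the familiar rwx string; the function name permission_string is hypothetical.

```c
/* Writes the nine-character rwx string for the low nine permission
 * bits of mode into out, followed by a terminating '\0'.
 * (Hypothetical helper, for illustration only.) */
void permission_string(unsigned int mode, char out[10])
{
    static const char flags[] = "rwxrwxrwx";
    int i;

    for (i = 0; i < 9; i++) {
        /* Bit 8 is owner-read; bit 0 is execute for everyone else. */
        out[i] = (mode & (1u << (8 - i))) ? flags[i] : '-';
    }
    out[9] = '\0';
}
```

Calling permission_string(0755, s) fills s with rwxr-xr-x: the owner can read, write, and execute the file, while the group and everyone else (including the user nobody) can read and execute it.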
If you are using a scripting language such as Perl or Tcl, provide the full path of your interpreter on the first line of your script. For example, a Perl script using perl in the /usr/local/bin directory would begin with the following line:
#!/usr/local/bin/perl
Warning
Never place the interpreter (perl, or Tcl Wish binary) in the /cgi-bin directory. This creates a security risk on your system. This is discussed in more detail in Chapter 9.
Some Common UNIX Servers
The NCSA and Apache servers have similar configuration files because the Apache server was originally based on the NCSA code. By default, they are configured so that any file in the cgi-bin directory (located by default in ServerRoot) is a CGI program. To change the location of the cgi-bin directory, you can edit the conf/srm.conf configuration file. The format for configuring this directory is
ScriptAlias fakedirectoryname realdirectoryname
where fakedirectoryname is the pseudo directory name (/cgi-bin) and realdirectoryname is the full path where the CGI programs are actually stored. You can configure more than one ScriptAlias by adding more ScriptAlias lines.
The default configuration is sufficient for most users' needs. Either way, check the ScriptAlias line in your srm.conf file and make sure that realdirectoryname is correct. If, for example, your CGI programs are located in /usr/local/etc/httpd/cgi-bin, the ScriptAlias line in your srm.conf file should be something like this:
ScriptAlias /cgi-bin/ /usr/local/etc/httpd/cgi-bin/
To access or link to CGI programs located in this directory, use the following URL:
http://hostname/cgi-bin/programname
where hostname is the name of the host of your Web server, and programname is the name of your CGI program.
For example, let's say you copy the hello.cgi program to your cgi-bin directory (for example, /usr/local/etc/httpd/cgi-bin) on your Web server called www.company.com. To access your CGI program, use the following URL: http://www.company.com/cgi-bin/hello.cgi
If you want to configure your NCSA or Apache server to recognize any file with a .cgi extension as a CGI, you need to edit two configuration files. First, in the srm.conf file, uncomment the following line:
AddType application/x-httpd-cgi .cgi
This associates the CGI MIME type with the .cgi extension. Next, you need to change the access.conf file so that CGI programs can run in any directory. To do this, add the ExecCGI option to the Options line. It will look something like the following line:
Options Indexes FollowSymLinks ExecCGI
Now, any file with a .cgi extension is considered CGI; access it as you would any file on your server.
The CERN server is configured in a similar way to the Apache and NCSA servers. Instead of ScriptAlias, the CERN server uses the Exec directive. For example, in the httpd.conf file, you will see the following line:
Exec /cgi-bin/* /usr/local/etc/httpd/cgi-bin/*
Other UNIX servers can be configured in the same way; this is described in more detail in the server documentation.
Installing CGI on Windows
Most servers available for Windows 3.1, Windows 95, and Windows NT use the filename extension method for CGI recognition. In general, changing the configuration of a Windows-based server simply requires running the server's configuration program and making the appropriate changes.
Sometimes configuring a server to run a script (such as Perl) correctly can be difficult. In DOS or Windows, you will not be able to specify the interpreter on the first line of the script, as is the case with UNIX. Some servers have a predefined configuration to associate certain filename extensions with the interpreter. For example, many Windows Web servers assume that files ending in .pl are Perl scripts.
If the server does not perform this type of file association, you can write a wrapper batch file that calls both the interpreter and the script. As with the UNIX server, do not install the interpreter in either the cgi-bin directory or any Web-accessible directory.
Installing CGI on Macintosh
The two best-known servers for the Macintosh are StarNine's WebStar and its predecessor, MacHTTP. Both recognize CGI by its filename extension.
MacHTTP understands two different extensions: .cgi and .acgi, which stands for asynchronous CGI. Regular CGI programs installed on a Macintosh (with a .cgi extension) will keep the Web server in a busy state until the CGI finishes running, causing the server to suspend all other requests. Asynchronous CGI, on the other hand, allows the server to accept requests even while it is running.
A Macintosh CGI developer using either of these Web servers should, if possible, use the .acgi extension rather than the .cgi extension. It should work with most CGI programs; if it doesn't work, rename the program to .cgi.
Executing CGI
Once you have installed a CGI program, there are several ways to execute it. If your CGI program is an output-only program, such as the Hello, World! program, then you can execute it simply by accessing its URL.
Most CGI programs run as the server-side application behind an HTML form. Before learning how to get information from these forms, first read a short introduction to creating them.
A quick tutorial on HTML forms
The two most important tags in an HTML form are the <FORM> and <INPUT> tags. Because the contents of the form are enclosed by the <FORM> tag, you need a closing </FORM> tag to indicate the end of the form. You cannot have a form within a form, although you can set up a form that allows you to present pieces of information in different places; this aspect is discussed extensively in Chapter 3.
The <INPUT> Tag
You can create text input fields, radio buttons, checkboxes, and other means of accepting input using the <INPUT> tag. This section covers only text input fields. To implement this field, use the <INPUT> tag with the following attributes:
<INPUT TYPE=text NAME="..." VALUE="..." SIZE= MAXLENGTH= >
NAME is the symbolic name of the variable that contains the value entered by the user. If you include text in the VALUE attribute, that text will be placed as default in the text input field. The SIZE attribute allows you to specify the horizontal length of the input field as it will appear in the browser window. Finally, MAXLENGTH specifies the maximum number of characters the user can enter into the field. Please note that the VALUE, SIZE, MAXLENGTH attributes are optional.
Form Submission
If you have only one text field within a form, the user can submit the form by simply typing information on the keyboard and pressing Enter. Otherwise, there must be some other way for the user to submit the information. The user submits information using a Submit button, created with the following tag:
<INPUT TYPE=submit>
This tag creates a Submit button inside your form. When the user finishes filling out the form, he or she can send its contents to the URL specified by the ACTION attribute of the form by clicking the Submit button.
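A CGI program can also generate a form itself. The following C sketch formats a minimal form with one text field and a Submit button. The function name format_text_form is hypothetical, and METHOD=POST is an assumption made for the example. The ACTION and NAME values are inserted without any escaping, so the sketch assumes they are trusted strings that contain no quotation marks.

```c
#include <stdio.h>

/* Formats a minimal HTML form into buf: a <FORM> tag with the given
 * ACTION URL, one text field with the given NAME, and a Submit button.
 * Returns the number of characters written, or -1 if buf is too small.
 * (Hypothetical helper, for illustration only.) */
int format_text_form(char *buf, size_t size,
                     const char *action_url, const char *field_name)
{
    int n = snprintf(buf, size,
                     "<FORM ACTION=\"%s\" METHOD=POST>\n"
                     "<INPUT TYPE=text NAME=\"%s\">\n"
                     "<INPUT TYPE=submit>\n"
                     "</FORM>\n",
                     action_url, field_name);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```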
Accepting input from the browser
The preceding sections showed how to write a CGI program that sends information from the server to the browser. In reality, a CGI program that only outputs data does not have many applications (some examples are given in Chapter 4). The more important ability of CGI is to receive information from the browser - the feature that gives the Web its interactive character.
The CGI program receives two types of information from the browser.
- First, it obtains various pieces of information about the browser (its type, what it can view, the remote host name, and so on), the server (its name and version, the port it runs on, and so on), and the CGI program itself (the program name and where it is located). The server gives all this information to the CGI program through environment variables.
- Second, the CGI program can receive user input. This information, after being encoded by the browser, is sent either through an environment variable (GET method) or through standard input (stdin - POST method).
Environment Variables
It is useful to know what environment variables are available to a CGI program, both for learning and for debugging. Table 2.2 lists some of the available CGI environment variables. You can also write a CGI program that outputs environment variables and their values to a Web browser.
Table 2.2. Some Important CGI Environment Variables
Environment Variable | Purpose
REMOTE_ADDR | IP address of the client machine.
REMOTE_HOST | The host name of the client machine.
HTTP_ACCEPT | Lists the MIME data types that the browser can interpret.
HTTP_USER_AGENT | Browser information (browser type, version number, operating system, etc.).
REQUEST_METHOD | GET or POST.
CONTENT_LENGTH | The size of the input if sent via POST. If there is no input or if the GET method is used, this parameter is undefined.
QUERY_STRING | Contains the input information when it is passed using the GET method.
PATH_INFO | Allows the user to specify a path from the CGI command line (for example, http://hostname/cgi-bin/programname/path).
PATH_TRANSLATED | Translates the relative path in PATH_INFO to the actual path on the system.
To write a CGI application that displays environment variables, you need to know how to do two things:
- Define all environment variables and their corresponding values.
- Print results to the browser.
You already know how to perform the last operation. In Perl, environment variables are stored in the associative array %ENV, which is keyed by the name of the environment variable. Listing 2.3 contains env.cgi, a Perl program that accomplishes our goal.
Listing 2.3. A Perl program, env.cgi, that prints out all the CGI environment variables.
#!/usr/local/bin/perl
print "Content-type: text/html\n\n";
print "<html> <head>\n";
print "<title>CGI Environment</title>\n";
print "</head> <body>\n";
# Print one "name = value" line per environment variable.
foreach $env_var (keys %ENV) {
    print "<b>$env_var</b> = $ENV{$env_var}<br>\n";
}
print "</body> </html>\n";
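A similar program could be written in C; the complete code is in Listing 2.4.
Listing 2.4. Env.cgi.c in C.
/* env.cgi.c */
#include <stdio.h>
extern char **environ; /* NULL-terminated array of "NAME=value" strings */
int main()
{
  char **p = environ;
  printf("Content-Type: text/html\r\n\r\n");
  printf("<html> <head>\n");
  printf("<title>CGI Environment</title>\n");
  printf("</head> <body>\n");
  while (*p != NULL)
    printf("%s<br>\n", *p++);
  printf("</body> </html>\n");
  return 0;
}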
GET or POST?
What's the difference between the GET and POST methods? GET passes the encoded input string through the QUERY_STRING environment variable, while POST passes it through stdin. POST is the preferred method, especially for forms with a lot of data, because it places no restriction on the amount of information sent, whereas with GET the amount of environment space is limited. GET does, however, have a certain useful property; this is covered in detail in Chapter 5, Input.
To determine which method is used, the CGI program checks the environment variable REQUEST_METHOD, which will be set to either GET or POST. If it is set to POST, the length of the encoded information is stored in the CONTENT_LENGTH environment variable.
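The check described above can be sketched in C. This is only an illustration under stated assumptions: read_form_data is a hypothetical helper (it is not part of cgihtml), and it takes the three environment values and the input stream as parameters so that it can be tried outside a web server.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of the encoded form data,
 * or NULL on error or for an unsupported method. The caller frees it. */
char *read_form_data(const char *method, const char *query_string,
                     const char *content_length, FILE *in)
{
    if (method == NULL)
        return NULL;

    if (strcmp(method, "GET") == 0) {
        /* GET: the encoded data arrives in QUERY_STRING. */
        const char *qs = query_string ? query_string : "";
        char *copy = malloc(strlen(qs) + 1);
        if (copy != NULL)
            strcpy(copy, qs);
        return copy;
    }

    if (strcmp(method, "POST") == 0) {
        /* POST: CONTENT_LENGTH says how many bytes to read from stdin. */
        long len = content_length ? strtol(content_length, NULL, 10) : 0;
        if (len < 0)
            return NULL;
        char *buf = malloc((size_t)len + 1);
        if (buf == NULL)
            return NULL;
        size_t got = fread(buf, 1, (size_t)len, in);
        buf[got] = '\0';
        return buf;
    }

    return NULL;
}
```

In a real CGI program the call would presumably look like read_form_data(getenv("REQUEST_METHOD"), getenv("QUERY_STRING"), getenv("CONTENT_LENGTH"), stdin).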
Coded Input
When a user submits a form, the browser first encodes the information before sending it to the server, which passes it on to the CGI application. When you use the <INPUT> tag, each field is given a symbolic name. The value entered by the user is represented as the value of that variable.
To do this, the browser uses the URL encoding specification, which can be described as follows:
- Separates different fields with an ampersand (&).
- Separates the name and values with equal signs (=), with the name on the left and the value on the right.
- Replaces spaces with plus signs (+).
- Replaces all "abnormal" characters with a percent sign (%) followed by a two-digit hex code for the character.
Your final encoded string will be similar to the following:
name1=value1&name2=value2&name3=value3 ...
Note: Specifications for URL encoding are found in RFC1738.
For example, let's say you had a form that asked for name and age. The HTML code that was used to display this form is shown in Listing 2.5.
Listing 2.5. HTML code to display the name and age form.
Let's say the user enters Joe Schmoe in the name field and 20 in the age field. The input will be encoded as the following input string:
name=Joe+Schmoe&age=20
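To make the encoding rules concrete, here is a minimal sketch of the reverse step in C: turning plus signs back into spaces and %XX sequences back into characters. The function name url_decode is an assumption made for this illustration; a parsing library would normally do this work for you.

```c
#include <ctype.h>

/* Convert one hex digit to its value; the caller has checked isxdigit(). */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower((unsigned char)c) - 'a' + 10;
}

/* Decode src into dst. dst must hold at least strlen(src) + 1 bytes,
 * because decoding never makes the string longer. */
void url_decode(char *dst, const char *src)
{
    while (*src != '\0') {
        if (*src == '+') {
            /* A plus sign stands for a space. */
            *dst++ = ' ';
            src++;
        } else if (*src == '%' && isxdigit((unsigned char)src[1])
                               && isxdigit((unsigned char)src[2])) {
            /* %XX is a two-digit hex code for one character. */
            *dst++ = (char)(hex_value(src[1]) * 16 + hex_value(src[2]));
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

A percent sign that is not followed by two hex digits is copied through unchanged; that is a design choice for this sketch, not something the encoding rules above prescribe.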
Parsing input
For this information to be useful, you need to convert it into something that your CGI programs can use. Strategies for parsing input are covered in Chapter 5. In practice, you will never have to think about how to parse input, because several experts have already written publicly available libraries that do the parsing. Two such libraries are presented in this chapter in the following sections: cgi-lib.pl for Perl (written by Steve Brenner) and cgihtml for C (written by me).
The general purpose of most libraries, whatever language they are written in, is to parse the encoded string and put the name and value pairs into a data structure. There is an obvious advantage to using a language that has built-in data structures, like Perl; however, most libraries for lower-level languages such as C and C++ include their own data structure implementation and routines.
It is not necessary to achieve a complete understanding of libraries; it's more important to learn how to use them as tools to make the CGI programmer's job easier.
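Even so, a rough picture of what such a library does can help. The following C sketch splits an encoded string into name/value pairs and stores them in a linked list. The pair type and the parse_pairs and pair_value functions are hypothetical names invented for this example; they are not the cgihtml definitions, and the values are left URL-encoded for brevity.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node for illustration; not the cgihtml types. */
typedef struct pair {
    char *name;
    char *value;
    struct pair *next;
} pair;

/* Copy n bytes of s into a fresh NUL-terminated string. */
static char *copy_n(const char *s, size_t n)
{
    char *out = malloc(n + 1);
    if (out != NULL) {
        memcpy(out, s, n);
        out[n] = '\0';
    }
    return out;
}

/* Split "name1=value1&name2=value2" into a list, in input order.
 * Returns NULL for empty input. */
pair *parse_pairs(const char *encoded)
{
    pair *head = NULL, **tail = &head;

    while (*encoded != '\0') {
        size_t field_len = strcspn(encoded, "&");   /* one field */
        const char *eq = memchr(encoded, '=', field_len);
        size_t name_len = eq ? (size_t)(eq - encoded) : field_len;

        pair *p = malloc(sizeof *p);
        if (p == NULL)
            break;
        p->name = copy_n(encoded, name_len);
        p->value = eq ? copy_n(eq + 1, field_len - name_len - 1)
                      : copy_n("", 0);
        p->next = NULL;
        *tail = p;
        tail = &p->next;

        encoded += field_len;
        if (*encoded == '&')
            encoded++;
    }
    return head;
}

/* Return the value stored under name, or NULL if there is none. */
const char *pair_value(const pair *list, const char *name)
{
    for (; list != NULL; list = list->next)
        if (list->name != NULL && strcmp(list->name, name) == 0)
            return list->value;
    return NULL;
}
```

A fuller version would also decode each name and value and free the list when it is no longer needed.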
Cgi-lib.pl
Cgi-lib.pl uses Perl associative arrays. The &ReadParse function parses the input string and stores each value under its name. For example, the line of Perl needed to decode the "name/age" input string just presented would be
&ReadParse(*input);
Now, to see the value entered for "name", you can access the associative array element $input{"name"}. Similarly, to access the value of "age", you look at $input{"age"}.
Cgihtml
C doesn't have any built-in data structures, so cgihtml implements its own linked list for use with its CGI parsing routines. It defines the entrytype structure as follows:
typedef struct {
  char *name;
  char *value;
} entrytype;
To parse the input string "name/age" in C using cgihtml, the following is used:
/* declare a linked list called input */
llist input;
/* parse input and place it in the linked list */
read_cgi_input(&input);
To access the age information, you can either walk the list manually or use the available cgi_val() function.
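#include <stdlib.h>
#include <string.h>
/* allocate exactly enough room for the value, then copy it */
char *age = malloc(sizeof(char) * strlen(cgi_val(input, "age")) + 1);
strcpy(age, cgi_val(input, "age"));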
The "age" value is now stored in the age string.
Note: Instead of using a simple fixed-size array (like char age[...];), I'm dynamically allocating memory space for the string age. Although this makes programming more difficult, it is nevertheless important from a security point of view. This is discussed in more detail in Chapter 9.
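The reason is that a user can type far more characters than any fixed-size array was sized for, and copying that input into the array would write past its end. A minimal sketch of the safer pattern, assuming a hypothetical helper named copy_value (it is not a cgihtml function):

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of value sized to fit it exactly, or NULL if
 * value is NULL or memory runs out. The caller frees the result. */
char *copy_value(const char *value)
{
    if (value == NULL)
        return NULL;
    size_t len = strlen(value);
    char *copy = malloc(len + 1);        /* +1 for the terminating NUL */
    if (copy != NULL)
        memcpy(copy, value, len + 1);    /* copies the NUL as well */
    return copy;
}
```

Because the buffer is sized from the input itself, a very long value costs memory but cannot overflow the destination.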
A simple CGI program
You are going to write a CGI program called nameage.cgi that handles the name/age form. Data processing (what I usually call "stuff") is minimal. Nameage.cgi simply decodes the input and displays the user's name and age. While there isn't much use for such a tool, it does demonstrate the most critical aspect of CGI programming: input and output.
You use the same form as above, with the name and age fields. Don't worry about robustness and efficiency just yet; solve the existing problem in the simplest way. The Perl and C solutions are shown in Listings 2.6 and 2.7, respectively.
Listing 2.6. Nameage.cgi in Perl
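#!/usr/local/bin/perl
# nameage.cgi
require "cgi-lib.pl";
&ReadParse(*input);
print "Content-Type: text/html\r\n\r\n";
print "<html> <head>\n";
print "<title>Name and Age</title>\n";
print "</head> <body>\n";
# Display the decoded name and age.
print "Hello, $input{'name'}. You are\n";
print "$input{'age'} years old.<p>\n";
print "</body> </html>\n";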
Listing 2.7. nameage.cgi in C
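/* nameage.cgi.c */
#include <stdio.h>
#include "cgi-lib.h"
int main()
{
  llist input;
  read_cgi_input(&input);
  printf("Content-Type: text/html\r\n\r\n");
  printf("<html> <head>\n");
  printf("<title>Name and Age</title>\n");
  printf("</head> <body>\n");
  /* Display the decoded name and age. */
  printf("Hello, %s. You are\n", cgi_val(input, "name"));
  printf("%s years old.<p>\n", cgi_val(input, "age"));
  printf("</body> </html>\n");
  return 0;
}
Please note that these two programs are almost equivalent. They both contain parsing routines that occupy only one line and process the entire input (thanks to the corresponding library routines). The output is essentially a modified version of your basic Hello, World! program.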
Try to run the program by filling out the form and clicking the Submit button.
General programming strategy
You now know all the basic principles required for CGI programming. Once you understand how CGI receives information and how it sends it back to the browser, the actual quality of your final product depends on your general programming abilities. Namely, when you program CGI (or anything at all, for that matter), keep the following qualities in mind:
- Simplicity
- Efficiency
- Versatility
The first two qualities are quite common: try to make your code as readable and efficient as possible. Versatility applies more to CGI programs than to other applications. When you start developing your own CGI programs, you will learn that there are several basic applications that everyone wants to make. For example, one of the most common and obvious tasks of a CGI program is to process a form and email the results to a specific recipient. You may have several separate forms to process, each with a different recipient. Instead of writing a CGI program for each individual form, you can save time by writing a more general CGI program that applies to all forms.
By covering all the basic aspects of CGI, I've provided you with enough information to get started with CGI programming. However, to become an effective CGI developer, you need to have a deeper understanding of how CGI communicates with the server and browser. The remainder of this book covers in detail the issues that were briefly mentioned in this chapter, as well as application development strategy and the advantages and limitations of the protocol.
Summary
This chapter briefly introduced the basics of CGI programming. You create output by formatting your data correctly and printing to stdout. Receiving CGI input is a bit more complex because it must be parsed before it can be used. Fortunately, several libraries already exist that perform parsing.
By now you should be fairly comfortable with programming CGI applications. The remainder of this book goes into more detail about specifications, tips, and programming strategies for more advanced and complex applications.
Chapter 9.
Programming with CGI
Including a section on CGI in a book on databases may seem as strange as including a chapter on car repair in a cookbook. Of course, to get to the grocery store you need a working car, but is it appropriate to talk about that here? A full discussion of CGI and Web programming in general is beyond the scope of this book, but a brief introduction to these topics is enough to expand the capabilities of MySQL and mSQL for presenting data on the Web.
This chapter is primarily intended for those who are studying databases but would like to gain some knowledge of Web programming. If your last name is Berners-Lee or Andreessen, you're unlikely to find anything here that you don't already know. But even if you're not new to CGI, having a quick reference guide can be very useful when diving into the mysteries of MySQL and mSQL.
What is CGI?
Like most acronyms, Common Gateway Interface (CGI) doesn't really say much. Interface with what? Where is this gateway? What kind of community are we talking about? To answer these questions, let's go back a little and take a look at the WWW as a whole.
Tim Berners-Lee, a physicist who worked at CERN, came up with the idea of the Web in 1990, although the plan dates back to 1988. The idea was to enable particle physics researchers to easily and quickly share multimedia data (text, images, and sound) through the Internet. The WWW consisted of three main parts: HTML, URL, and HTTP. HTML is a formatting language used to present content on the Web. A URL is the address used to retrieve HTML (or other) content from the web server. And finally, HTTP is a language that is understood by the web server and allows clients to request documents from the server.
The ability to send information of all types over the Internet was revolutionary, but another possibility was soon discovered. If you can send any text over the Web, then why can't you send text created by a program, rather than taken from a ready-made file? This opens up a sea of possibilities. A simple example: you can use a program that prints the current time, so that the reader sees the correct time every time they view the page. Several smart people at the National Center for Supercomputing Applications (NCSA), who were creating a web server, saw this opportunity, and CGI soon appeared.
CGI is a set of rules that allow programs on a server to send data to clients through a web server. The CGI specification was accompanied by changes to HTML and HTTP that introduced a new feature, known as forms.
If CGI allows programs to send data to the client, then forms extend this capability by allowing the client to send data to that CGI program. Now the user can not only see the current time, but also set the clock! CGI forms have opened the door to true interactivity in the Web world. Common CGI applications include:
- Dynamic HTML. Entire websites can be generated by one CGI program.
- Search engines that find documents containing user-specified words.
- Guest books and message boards where users can add their messages.
- Order forms.
- Questionnaires.
- Retrieving information from a database hosted on the server.
In subsequent chapters we will discuss all of these CGI applications, as well as some others. They all provide a great way to connect CGI to a database, which is what we're interested in in this section.
HTML Forms
Before exploring the specifics of CGI, it's useful to look at the most common way that end users get an interface to CGI programs: HTML forms. Forms are a part of the HTML language that provides the end user with fields of various types. Data entered into fields can be sent to the web server. Fields can be used to enter text or be buttons that the user can click or check. Here is an example of an HTML page containing a form:
<HTML><HEAD><TITLE>My forms page</title></head>
<BODY>
<P>This is a page with a form.
<P><FORM ACTION="...">
Enter your name: <INPUT NAME="..." SIZE=40><br>
<INPUT TYPE=SUBMIT>
</form>
</body></html>
This form creates a 40 character input line where the user can enter their name. Below the input line there is a button that, when clicked, transfers the form data to the server. Listed below are form-related tags supported by HTML 3.2, the most widely used standard today. Tag and attribute names can be entered in any case, but we adhere to the optional convention that start tags are written in upper case and closing tags are written in lower case.
The only input type we haven't used here is the IMAGE type for the <INPUT> tag. It could be used as an alternative form submission method. However, the IMAGE type is poorly supported by text-based and less capable browsers, so it's wise to avoid it unless your site has a graphic-heavy style.
Once you've learned the basics of HTML forms, you can start learning about CGI itself.
CGI Specification
So what exactly is the "set of rules" that allows a CGI program in, say, Batavia, Illinois, to communicate with a web browser in Outer Mongolia? The official CGI specification, along with a wealth of other information about CGI, can be found on the NCSA server at http://hoohoo.ncsa.uiuc.edu/cgi/. However, this chapter exists precisely so that you do not have to spend a long time searching for it yourself.
There are four ways in which CGI passes data between the CGI program and the web server, and therefore the web client:
- Environment variables.
- Command line.
- Standard input device.
- Standard output device.
With these four methods, the server forwards all the data sent by the client to the CGI program. The CGI program then does its magic and sends the output back to the server, which forwards it to the client.
This information is based on the Apache HTTP server. Apache is the most common web server, running on almost any platform, including Windows 9x and Windows NT. However, the information should apply to all HTTP servers that support CGI. Some proprietary servers, such as those from Microsoft and Netscape, may have additional features or operate slightly differently. As the face of the Web continues to change at an incredible rate, standards are still evolving and there will undoubtedly be changes in the future. However, when it comes to CGI, the technology appears to be settled, at the cost of being displaced by other technologies, such as applets. Any CGI programs you write using this information will almost certainly be able to run for many years on most web servers.
When a CGI program is called through a form, the most common interface, the browser sends the server a long string that begins with the path to the CGI program and its name. This is followed by various other data called path information, which is passed to the CGI program through the PATH_INFO environment variable (Figure 9-1). The path information is followed by a "?" character, followed by the form data, which is sent to the server using the HTTP GET method. This data is made available to the CGI program through the QUERY_STRING environment variable. Any data that the page sends using the HTTP POST method, which is the most commonly used method, is passed to the CGI program through standard input. A typical string that a server might receive from a browser is shown in Figure 9-1. A program named formread in the cgi-bin directory is called by the server with the additional path information extra/information and the query data choice=help, presumably as part of the original URL. Finally, the form data itself (the text "CGI programming" in the "keywords" field) is sent via the HTTP POST method.
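The way such a request string breaks down can be sketched in C. The web server does this work, not the CGI program; the split_request function below is a hypothetical illustration, and it assumes the path of the CGI program is already known.

```c
#include <string.h>

/* Hypothetical helper. Given the part of the URL after the host name and
 * the path of the CGI program, fill in path_info and query.
 * Returns 0 on success, -1 if request does not start with script or a
 * buffer is too small. */
int split_request(const char *request, const char *script,
                  char *path_info, size_t pi_size,
                  char *query, size_t q_size)
{
    size_t script_len = strlen(script);
    if (strncmp(request, script, script_len) != 0)
        return -1;

    const char *rest = request + script_len;  /* path info and query */
    size_t pi_len = strcspn(rest, "?");       /* everything up to "?" */
    const char *q = (rest[pi_len] == '?') ? rest + pi_len + 1 : "";

    if (pi_len >= pi_size || strlen(q) >= q_size)
        return -1;
    memcpy(path_info, rest, pi_len);
    path_info[pi_len] = '\0';
    strcpy(query, q);
    return 0;
}
```

For the request /cgi-bin/formread/extra/information?choice=help and the script path /cgi-bin/formread, the sketch yields /extra/information as the path information and choice=help as the query string.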
Environment Variables
When the server runs a CGI program, it first passes it some data in the form of environment variables. The specification officially defines seventeen variables, but many more are used informally through the mechanism described below, called the HTTP_ mechanism. A CGI program has access to these variables in the same way as to any shell environment variables when run from the command line. In a shell script, for example, the FOO environment variable can be accessed as $FOO; in Perl this looks like $ENV{'FOO'}; in C, getenv("FOO"); and so on. Table 9-1 lists the variables that are always set by the server, even if they are null. In addition to these variables, the data sent by the client in the request headers is assigned to variables of the form HTTP_FOO, where FOO is the name of the header. For example, most web browsers include version information in a header called USER_AGENT. Your CGI program can obtain this data from the HTTP_USER_AGENT variable.
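As a rough sketch of the HTTP_ naming rule in C: the header name is upper-cased, hyphens become underscores, and HTTP_ is put in front. The header_env_name and request_header functions are hypothetical helpers written for this illustration.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Build the CGI variable name for an HTTP request header, for example
 * "User-Agent" -> "HTTP_USER_AGENT". Returns 0 on success, -1 if out
 * is too small. */
int header_env_name(const char *header, char *out, size_t size)
{
    int n = snprintf(out, size, "HTTP_%s", header);
    if (n < 0 || (size_t)n >= size)
        return -1;
    /* Skip the 5-character "HTTP_" prefix, then normalize the rest. */
    for (char *p = out + 5; *p != '\0'; p++)
        *p = (*p == '-') ? '_' : (char)toupper((unsigned char)*p);
    return 0;
}

/* Look up a request header through the environment; NULL if unset. */
const char *request_header(const char *header)
{
    char name[128];
    if (header_env_name(header, name, sizeof name) != 0)
        return NULL;
    return getenv(name);
}
```

Under a web server, request_header("User-Agent") would then return whatever the browser sent; run from a plain shell it will usually return NULL because the variable is not set.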
Table 9-1. CGI Environment Variables
Environment variable | Description
CONTENT_LENGTH | Length of data transferred using the POST or PUT methods, in bytes.
CONTENT_TYPE | The MIME type of the data attached using the POST or PUT methods.
GATEWAY_INTERFACE | The version number of the CGI specification supported by the server.
PATH_INFO | Additional path information sent by the client. For example, for the request http://www.myserver.com/test.cgi/this/is/a/path?field=green the value of PATH_INFO will be /this/is/a/path.
PATH_TRANSLATED | Same as PATH_INFO, but the server performs every possible translation, for example expanding names like "~account".
QUERY_STRING | All data following the "?" in the URL. This is also the data passed when the form's REQUEST_METHOD is GET.
REMOTE_ADDR | IP address of the client making the request.
REMOTE_HOST | The host name of the client machine, if available.
REMOTE_IDENT | If the web server and client support identd identification, this is the username of the account that is making the request.
REQUEST_METHOD | The method the client uses to make the request. For the CGI programs we are going to create, this will usually be POST or GET.
SERVER_NAME | The host name (or IP address, if no name is available) of the machine running the web server.
SERVER_PORT | The port number used by the web server.
SERVER_PROTOCOL | The protocol used by the client to communicate with the server. In our case, this protocol is almost always HTTP.
SERVER_SOFTWARE | Information about the version of the web server running the CGI program.
SCRIPT_NAME | The path to the script to run, as specified by the client. Can be used when a URL refers to itself, and so that scripts referenced in different locations can behave differently depending on the location.
Here's an example of a CGI Perl script that prints out all the environment variables set by the server, as well as any inherited variables, such as PATH, set by the shell that started the server.
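#!/usr/bin/perl -w
# Print the HTTP header, a blank line, and the start of the page.
print <<HTML;
Content-type: text/html

<HTML><BODY>
HTML
# Print one "name: value" line per environment variable.
foreach (keys %ENV) { print "$_: $ENV{$_}<br>\n"; }
print <<HTML;
</body></html>
HTML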
All of these variables can be used and even modified by your CGI program. However, these changes do not affect the web server that runs the program.
Command line
CGI allows arguments to be passed to the CGI program as command line parameters. This is rarely used because its practical applications are few, and we will not dwell on it in detail. The bottom line is that if the QUERY_STRING environment variable does not contain the "=" character, then the CGI program will be executed with the command line parameters taken from QUERY_STRING. For example, http://www.myserver.com/cgi-bin/finger?root will run finger root on www.myserver.com.
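The rule can be expressed as a one-line test. The sketch below is written in C with a hypothetical function name, query_is_command_line; treating an empty query string as "no arguments" is an assumption of this example.

```c
#include <string.h>

/* Returns 1 if the query string would be handed to the CGI program as
 * command line arguments (it is non-empty and contains no '='),
 * 0 otherwise. */
int query_is_command_line(const char *query_string)
{
    return query_string != NULL
        && query_string[0] != '\0'
        && strchr(query_string, '=') == NULL;
}
```

For the finger example the query string is root, so the function returns 1; for ordinary form data such as choice=help it returns 0.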
There are two main libraries that provide a CGI interface to Perl. The first is cgi-lib.pl. The cgi-lib.pl library is very common because for a long time it was the only large library available. It was designed for Perl 4 but also works with Perl 5. The second library, CGI.pm, is newer and in many ways superior to cgi-lib.pl. CGI.pm is written for Perl 5 and uses a fully object-oriented design for working with CGI data. The CGI.pm module parses standard input and the QUERY_STRING variable and stores the data in a CGI object. Your program only needs to create a new CGI object and use simple methods like param() to retrieve the data you need. Example 9-2 serves as a short demonstration of how CGI.pm interprets the data. All Perl examples in this chapter will use CGI.pm.
Example 9-2. Parsing CGI Data in Perl
#!/usr/bin/perl -w
use CGI qw(:standard);
# The CGI.pm module is used. qw(:standard) imports the namespace of the
# standard CGI functions to give clearer code. This can be done if only
# one CGI object is used in the script.
$mycgi = new CGI; # Create a CGI object that will be the gateway to the form data
@fields = $mycgi->param; # Retrieve the names of all completed form fields
print header, start_html("CGI.pm test");
# The "header" and "start_html" methods, provided by CGI.pm, make it easier
# to produce HTML. "header" outputs the required HTTP header, and
# "start_html" outputs an HTML header with the given title, as well as
# the <BODY> tag.
print "<p>Form data:<br>";
foreach (@fields) { print $_, ": ", $mycgi->param($_), "<br>"; }
# For each field, print the name and the value obtained using
# $mycgi->param("fieldname").
print end_html; # Shorthand for printing the closing tags "</body></html>".
Processing input data in C
Since the core APIs for MySQL and mSQL are written in C, we won't completely abandon C in favor of Perl, and we will provide some C examples where appropriate. There are three widely used C libraries for CGI programming: cgic by Tom Boutell, cgihtml by Eugene Kim, and libcgi from EIT. We believe that cgic is the most complete and easy to use. What it lacks, however, is the ability to list all the form variables when you don't know them in advance. In fact, this can be added with a simple patch, but that is beyond the scope of this chapter. Therefore, in Example 9-3 we use the cgihtml library to repeat the above Perl script in C.
Example 9-3. Parsing CGI data in C
/* cgihtmltest.c - Typical CGI program for displaying keys and their values
   from data received from the form */
#include <stdio.h>
#include "cgi-lib.h" /* This contains all CGI function definitions */
#include "html-lib.h" /* This contains all HTML helper function definitions */
void print_all(llist l)
/* This function outputs the data submitted by the form in the same format
   as the Perl script above. Cgihtml also provides a built-in function,
   print_entries(), which does the same thing using HTML list format. */
{
  node *window;
  /* The "node" type is defined in the cgihtml library and refers to a
     linked list that stores all the form data. */
  window = l.head; /* Sets a pointer to the beginning of the form data */
  while (window != NULL) { /* Loop through the linked list to the last (first empty) element */
    printf("%s: %s<br>\n", window->entry.name, replace_ltgt(window->entry.value));
    /* Print data. replace_ltgt() is a function that understands the HTML
       encoding of text and ensures that it is correctly output to the
       client browser. */
    window = window->next; /* Move to the next list element. */
  }
}
int main() {
  llist entries; /* Pointer to parsed data */
  int status; /* Integer representing status */
  html_header(); /* HTML helper function that outputs the HTML header */
  html_begin("cgihtml test");
  /* An HTML helper function that prints the beginning of an HTML page
     with the specified title. */
  status = read_cgi_input(&entries); /* Reads and parses form data */
  printf("<p>Form data:<br>");
  print_all(entries); /* Calls the print_all() function defined above. */
  html_end(); /* HTML helper function that prints the end of the HTML page. */
  list_clear(&entries); /* Frees memory occupied by form data. */
  return 0;
}
Standard Output Device
The data sent by the CGI program to the standard output device is read by the web server and sent to the client. If the script name starts with nph-, then the data is sent directly to the client without intervention from the web server. In this case, the CGI program must generate the correct HTTP header that the client will understand. Otherwise, let the web server generate the HTTP header for you.
Even if you don't use an nph- script, the server needs to be given one directive that tells it about your output. This is usually the Content-Type HTTP header, but it can also be the Location header. The header must be followed by an empty line, that is, a line feed or a CR/LF combination.
The Content-Type header tells the server what type of data your CGI program is producing. If this is an HTML page, then the line should be Content-Type: text/html. The Location header tells the server a different URL, or a different path on the same server, to which the client should be directed. The header should look like this: Location: http://www.myserver.com/another/place/.
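A small C sketch of both headers follows. The format_content_type and format_location helpers are hypothetical names for this illustration; they build the header text in a buffer, including the empty line that must follow it, so the result can be written to standard output with fputs(buf, stdout).

```c
#include <stdio.h>

/* Write "Content-Type: <mime_type>" plus the terminating empty line into
 * buf. Returns the length written, or -1 if buf is too small. */
int format_content_type(char *buf, size_t size, const char *mime_type)
{
    int n = snprintf(buf, size, "Content-Type: %s\r\n\r\n", mime_type);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Write "Location: <url>" plus the terminating empty line into buf.
 * Returns the length written, or -1 if buf is too small. */
int format_location(char *buf, size_t size, const char *url)
{
    int n = snprintf(buf, size, "Location: %s\r\n\r\n", url);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The sketch uses the CR/LF form of the empty line; as noted above, a plain line feed is also accepted.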
After the HTTP headers and an empty line, you can send the actual data produced by your program: an HTML page, an image, text, or anything else. Among the CGI programs supplied with the Apache server are nph-test-cgi and test-cgi, which nicely demonstrate the difference between nph and non-nph style headers, respectively.
In this section we will use the CGI.pm and cgic libraries, which have functions to output both HTTP and HTML headers. This will allow you to focus on outputting the actual content. These helper functions are used in the examples given earlier in this chapter.
Important Features of CGI Scripts
You already know basically how CGI works. The client sends data, usually using a form, to the web server. The server executes the CGI program, passing data to it. The CGI program does its processing and returns its output to the server, which passes it on to the client. Now, from understanding how CGI programs work, we need to move on to understanding why they are so widely used.
Although you already know enough from this chapter to be able to put together a simple working CGI program, there are a few more important issues to cover before you can create actually working programs for MySQL or mSQL. First, you need to learn how to work with multiple forms. Next, you need to learn some security measures that will prevent attackers from illegally accessing or destroying your server files.
Storing state
Maintaining state is a vital means of providing good service to your users, and not just a way of fighting hardened criminals, as it may seem. The problem arises because HTTP is a so-called "stateless" protocol. This means that the client sends data to the server, the server returns the data to the client, and then everyone goes their own way. The server does not store data about the client that may be needed in subsequent operations. Likewise, there is no guarantee that the client will retain any data about the transaction that can be used later. This places an immediate and significant limitation on the use of the World Wide Web.
Scripting CGI with this protocol is like being unable to remember a conversation. Whenever you talk to someone, no matter how often you've talked to them before, you have to introduce yourself and look for a common topic of conversation. Needless to say, this does not help productivity. Figure 9-2 shows that whenever a request reaches a CGI program, it is an entirely new instance of the program, with no connection to the previous one.
On the client side, with the advent of Netscape Navigator, a seemingly hastily made solution called cookies appeared. It consists of creating a new HTTP header that can be sent back and forth between the client and server, similar to the Content-Type and Location headers. The client browser, upon receiving the cookie header, must store the data in the cookie, as well as the name of the domain in which the cookie operates. Then, whenever a URL within the specified domain is visited, a cookie header must be returned to the server for use by CGI programs on that server.
The cookie method is mainly used to store the user ID. Information about the visitor can be saved in a file on the server machine. This user's unique ID can be sent as a cookie to the user's browser, and then each time the user visits the site, the browser automatically sends this ID to the server. The server passes the ID to the CGI program, which opens the corresponding file and gains access to all data about the user. All this happens unnoticed by the user.
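As an illustration of the server side of this exchange, the C sketch below pulls one value out of a cookie header. It assumes the server exposes the browser's Cookie header to the CGI program as the HTTP_COOKIE environment variable, following the HTTP_ naming mechanism described earlier; the cookie_value function and the cookie name id are hypothetical.

```c
#include <string.h>

/* Find "name=value" in a cookie header such as "id=42; theme=dark" and
 * copy the value to out. Returns 0 if found, -1 if the cookie is absent
 * or out is too small. */
int cookie_value(const char *cookies, const char *name,
                 char *out, size_t size)
{
    size_t name_len = strlen(name);

    while (cookies != NULL && *cookies != '\0') {
        while (*cookies == ' ' || *cookies == ';')   /* skip separators */
            cookies++;
        size_t len = strcspn(cookies, ";");          /* one name=value pair */
        if (len > name_len && strncmp(cookies, name, name_len) == 0
                           && cookies[name_len] == '=') {
            size_t value_len = len - name_len - 1;
            if (value_len >= size)
                return -1;
            memcpy(out, cookies + name_len + 1, value_len);
            out[value_len] = '\0';
            return 0;
        }
        cookies += len;
    }
    return -1;
}
```

A CGI program might call cookie_value(getenv("HTTP_COOKIE"), "id", buf, sizeof buf) and then use the ID to locate the visitor's file. Because the ID comes from the client, it should be validated before being used to build a file name.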
Despite the usefulness of this method, most large sites do not use it as their only means of remembering state. There are a number of reasons for this. First, not all browsers support cookies. Until recently, the main browser for people with limited vision (not to mention people with insufficient Internet connection speeds) - Lynx - did not support cookies. It still doesn't "officially" support them, although some of its widely available "side branches" do. Secondly, and more importantly, cookies tie the user to a specific machine. One of the great benefits of the Web is that it is accessible from anywhere in the world. No matter where your web page was created or stored, it can be displayed from any Internet-connected machine. However, if you try to access a cookie-enabled site from someone else's machine, all of your personal information maintained by the cookie will be lost.
Many sites still use cookies to personalize user pages, but most complement them with a traditional login/password style interface. If the site is accessed from a browser that does not support cookies, the page contains a form in which the user enters the login name and password assigned to him when he first visited the site. Typically this form is small and unassuming, so as not to scare off the majority of users who are not interested in any personalization, but simply want to move on. After the user enters a login name and password into the form, CGI finds a file containing data about that user, as if the name was sent with a cookie. Using this method, the user can register on a personalized website from anywhere in the world.
Besides tracking user preferences and storing information about users over the long term, there is a more subtle example of remembering state, provided by popular search engines. When you search using a service such as AltaVista or Yahoo, you usually get many more results than can be displayed in an easy-to-read format. This problem is solved by showing a small number of results, usually 10 or 20, and providing some navigation facility for viewing the next group of results. While this behavior seems normal and expected to the average Web surfer, actually implementing it is nontrivial and requires state storage.
When a user first makes a query to a search engine, the engine collects all the results, perhaps capped at some predefined limit. The trick is to deliver these results a few at a time, while remembering which user requested them and which portion that user is expecting next. Leaving aside the complexities of the search engine itself, we are faced with the problem of presenting information to the user one page at a time. Consider Example 9-4, which shows a CGI script that prints ten lines of a file and gives the user the option to look at the next or previous ten lines.
Example 9-4. Saving State in a CGI Script
#!/usr/bin/perl -w
use CGI;
open(F, "/usr/dict/words") or die("Can't open! $!");
# This is the file that will be output; it can be anything.
$output = new CGI;

sub print_range {    # This is the main function of the program.
    my $start = shift;    # Starting line of the file.
    my $count = 0;        # Counter of lines read so far.
    my $line  = "";       # Current line of the file.
    print $output->header,
        $output->start_html("My dictionary");
    # Produces HTML with the title "My dictionary".
    print "<pre>\n";
    while (($count < $start) and defined($line = <F>)) { $count++; }
    # Skip all lines before the starting one.
    while (($count < $start + 10) and defined($line = <F>)) {
        print $line;
        $count++;
    }
    # Print the next 10 lines.
    my $newnext = $start + 10; my $newprev = $start - 10;
    # Set the starting lines for the "Next" and "Previous" URLs.
    my $self = $output->url;    # URL of this script, without a query string.
    print "</pre><p>\n";
    unless ($start == 0) {    # Include the "Previous" URL unless you
                              # are already at the beginning.
        print qq%<a href="$self?start=$newprev">Previous</a>%;
    }
    unless (eof(F)) {    # Include the "Next" URL unless you
                         # are at the end of the file.
        print qq% <a href="$self?start=$newnext">Next</a>%;
    }
    print <<HTML;
</body></html>
HTML
    exit(0);
}

# If there is no incoming data, start at the beginning.
if (not $output->param) {
    &print_range(0);
}
# Otherwise, start from the line specified in the data.
&print_range($output->param("start"));
In this example, the state is stored using the simplest method. There is no problem with saving data, since we keep it in a file on the server. We only need to know where to start output, so the script simply includes in the URL the starting point for the next or previous group of lines - all that is needed to generate the next page.
However, if you need more than the ability to simply page through a file, relying on a URL can be cumbersome. You can alleviate this difficulty by using an HTML form and including the state data in <INPUT> tags of type HIDDEN. This method has been used successfully on many sites, both to link related CGI programs and to extend the use of a single CGI program, as in the previous example. Instead of referring to a specific object, such as a home page, the URL data can refer to an automatically generated user ID.
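As a rough illustration of the hidden-field technique, the sketch below builds a form whose state travels in HIDDEN fields. The helper names escape_html and state_form, the script name read.cgi, and the field names start and user are hypothetical; none of them come from the examples in this chapter.

```perl
use strict;
use warnings;

# Escape the characters that are special inside an HTML attribute value.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Build a form that carries every state value along as a HIDDEN field.
# The action URL, button label and field names are supplied by the caller.
sub state_form {
    my ($action, $label, %state) = @_;
    my $html = '<form method="post" action="' . escape_html($action) . qq(">\n);
    for my $name (sort keys %state) {
        $html .= sprintf qq(<input type="hidden" name="%s" value="%s">\n),
            escape_html($name), escape_html($state{$name});
    }
    $html .= '<input type="submit" value="' . escape_html($label) . qq(">\n</form>\n);
    return $html;
}

# A "Next" button that remembers the starting line and a user ID.
print state_form('read.cgi', 'Next', start => 20, user => 'ab12');
```

When the user presses the button, the receiving CGI program reads start and user exactly as it would read parameters from a URL. Hidden fields are visible in the page source, so they should carry only data the user is allowed to see and change.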
This is how AltaVista and other search engines work. The first search generates a user ID, which is hidden behind the scenes in subsequent URLs. Associated with this ID are one or more files containing the results of the query. The URL includes two more values: your current position in the results file and the direction in which you want to navigate next in it. These three values are all that is needed for the powerful navigation systems of large search engines to work.
However, there is still something missing. The file used in our example, /usr/dict/words, is very big. What if we give up halfway through reading it but want to come back to it later? Unless you remember the URL of the next page, there is no way to go back; not even AltaVista allows it. If you restart your computer or use another computer, you won't be able to return to your previous search results without re-entering your search. However, this kind of long-term state storage is at the heart of the website personalization we discussed above, and it's worth looking at how it can be used. Example 9-5 is a modified version of Example 9-4.
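The navigation driven by those three values might look something like the sketch below. It is not how any real search engine is implemented: the parameter names id, pos and dir, the direction keywords next and prev, and the page size of 10 are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Given the current position, the requested direction and the total number
# of stored results, return the first result of the page to show next.
sub next_position {
    my ($position, $direction, $total, $page_size) = @_;
    $page_size ||= 10;
    my $new = $direction eq 'prev' ? $position - $page_size
                                   : $position + $page_size;
    $new = 0         if $new < 0;          # never move before the start
    $new = $position if $new >= $total;    # stay put when there is no next page
    return $new;
}

# Build the query string of a navigation link from the three values.
sub nav_query {
    my ($id, $position, $direction) = @_;
    return "id=$id&pos=$position&dir=$direction";
}

# A "Next" link on the page that starts at result 20 ...
my $link = nav_query('ab12', 20, 'next');
# ... and the position the server would compute when the link is followed,
# assuming 45 results are stored for this ID.
my $start = next_position(20, 'next', 45);
print "$link -> start at $start\n";
```

Keeping the arithmetic on the server means the link only has to say where the user is and which way to go; the results themselves stay in the file associated with the ID.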
Example 9-5. Saving State Persistently
#!/usr/bin/perl -w
use CGI;
umask 0;
open(F, "/usr/dict/words") or die("Can't open! $!");
chdir("users") or die("Can't change to directory! $!");
# This is the directory where all of the data about
# the user will be stored.
$output = new CGI;
if (not $output->param) {
    print $output->header,
        $output->start_html("My dictionary");
    print <<HTML;