CGI Errors and how to fix them

Short overview of the CGI calling process

Before you try to find answers to specific problems you have to understand how CGI programs are executed on the server. There is no magic behind the Common Gateway Interface: it is just a standardized specification for interfacing external applications with information servers, such as HTTP or Web servers. The current version of this specification is CGI/1.1, and all the information in this article refers to this version. All recent servers follow this specification, and if you want to get into technical detail (you should sooner or later) you can find it at http://hoohoo.ncsa.uiuc.edu/cgi/interface.html

This specification is usually called the Standard CGI Interface because it uses Standard Input (STDIN) and Standard Output (STDOUT) to read and send the data. There is another form of CGI called Win CGI, a specification for the same purpose but tailored to Windows servers. Instead of Standard Input/Output it uses spooled I/O. It originated in 16-bit Windows servers but has since been pushed forward to make use of Win32 features. Visual Basic and Delphi applications usually make use of the Win CGI interface. This article doesn't get into Win CGI, but you can get the specs at http://website.ora.com/wsdocs/32demo/windows-cgi.html

The process is simple: the webserver receives a request for a document in the form of a URL and a method type. For simplicity's sake we will only talk about the GET and POST methods, but you should be aware that those are not the only methods (at the time of this writing three other methods are in use: PUT, DELETE and HEAD). The URL is mapped internally to a file location. The type of the requested file determines what the webserver does next. To identify the type of the file, the filename suffix is looked up in a (customizable) table in the server configuration.

If the server doesn't find the suffix, it tries to spit out the requested file on STDOUT, and the trouble of identifying the file type is left to the browser. If the suffix is found but not registered as a CGI type, the server adds the MIME type that this suffix is mapped to to its output header and outputs the file. In this case the method POST is not valid; the server will complain if the request is a POST, usually with a message like "Method not implemented". The browser tries to map the MIME type to its own table of recognized file types and determines how to hand the file to the user, e.g. displaying the content in the browser window or launching an external application or plugin.
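
The decision described above can be sketched in a few lines. This is only an illustration in Python, not how any particular server is implemented: the table contents, the function name decide and the returned strings are all made up for the example, and a real server reads its table from its configuration.

```python
import os

# Hypothetical suffix table; a real server reads this from its configuration.
SUFFIX_TABLE = {
    ".html": "text/html",
    ".gif": "image/gif",
    ".cgi": "application/x-httpd-cgi",
}
CGI_TYPE = "application/x-httpd-cgi"


def decide(path, method):
    """Return a short description of what the server does with a request."""
    suffix = os.path.splitext(path)[1].lower()
    mime_type = SUFFIX_TABLE.get(suffix)
    if mime_type == CGI_TYPE:
        return "execute as CGI"
    if method == "POST":
        # Only CGI programs may be called with POST.
        return "501 Method not implemented"
    if mime_type is None:
        # Unknown suffix: send the file and let the browser guess the type.
        return "send file without a MIME type"
    return "send file as " + mime_type


print(decide("/docs/index.html", "GET"))
print(decide("/cgi-bin/form.cgi", "POST"))
print(decide("/docs/index.html", "POST"))
```

The third call shows the case that trips people up most often: a POST to a file the server does not treat as CGI.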

If the server encounters a file type that it considers to be a handler (Apache jargon) for a CGI program (usually that type would be application/x-httpd-cgi) it will execute the file. But before it executes the script, the server usually runs some security checks first. For example, whether a certain directory is allowed to contain CGI scripts is entirely up to the person configuring the server. If everything is fine the server finally tries to execute the script, and along the way it passes a variety of environment variables to the CGI program, like the method, the content length of the request, and other useful information about the server and the client. From this point on any output needs to be done by the CGI application that was just called.
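
To make the hand-over concrete, here is a minimal sketch of the receiving side, written in Python for illustration. REQUEST_METHOD, CONTENT_LENGTH and QUERY_STRING are variable names defined by CGI/1.1; the function name read_request is made up for this example.

```python
import os
import sys


def read_request(environ, stdin):
    """Collect the request data a CGI program receives from the server."""
    method = environ.get("REQUEST_METHOD", "GET")
    if method == "POST":
        # For POST the data arrives on STDIN; CONTENT_LENGTH says how much.
        # Note: a text stream counts characters; a real script would read
        # bytes from sys.stdin.buffer instead.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        data = stdin.read(length)
    else:
        # For GET the data is part of the URL and arrives in QUERY_STRING.
        data = environ.get("QUERY_STRING", "")
    return method, data


if __name__ == "__main__":
    method, data = read_request(os.environ, sys.stdin)
    sys.stdout.write("Content-type: text/plain\n\n")
    sys.stdout.write("method=%s data=%s\n" % (method, data))
```

Passing the environment and the input stream as arguments keeps the function testable from the command line, without a webserver in between.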

At this point there are already a lot of things that could have gone wrong: the extension must be mapped to a file type, and this type needs to be associated with CGI. Many servers can also recognize CGI programs by their location, so you can specify a directory in which every file is considered to be a CGI script. Administrators often make use of this and keep all CGI files in a cgi-bin directory. Also, since the server is the one executing the script, it needs to have the rights to do that. Finally, the script must actually execute and produce some output. Errors at this stage usually result in that famous 500 Server Error every CGI programmer has encountered often enough. The output needed consists of a valid header (at the very least you need to output a MIME type as the first line of the output, like this: Content-type: <MIME type>), followed by a blank line and finally some data (without any data you will get an error).
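
As an illustration, the smallest output that satisfies these rules can be produced like this. The sketch is in Python and the function name cgi_response is made up for the example.

```python
def cgi_response(body, mime_type="text/html"):
    """Build the smallest valid CGI output: header, blank line, data."""
    # The "\n\n" ends the header line and adds the required blank line.
    return "Content-type: %s\n\n%s" % (mime_type, body)


print(cgi_response("<html><body>Hello</body></html>"))
```

Note the hyphen in Content-type and the empty line before the data: leave out either one and the server cannot make sense of the output.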

Server Errors

Server errors are pretty general because the server only has an approximate idea why it couldn't serve the requested script. The most common ones are 500 Internal Server Error, 501 Method not implemented, 403 Forbidden, 404 File not found and Document contains no data. If you can't make out the cause right away, look at the server's error log; chances are you will get a more detailed description of the error there. There is unfortunately no sure way to say where the logs are located, but if you can't find them ask the administrator or use search tools (on unix machines try locate or find; NCSA and Apache logs tend to be named error_log).

If you have read the previous section, you already should have a basic idea why and when those errors occur. Let's break them down into two categories: Errors that occur before the script is invoked and those that occur while or after execution.

Misconfiguration

If you are getting a 404 File not found error but you believe the file to be in the correct location, check how the URL is mapped. Say your html directory is at /~yourname and you created a subdirectory called cgi-bin in this directory. Your first guess would be that files in this directory have the URL /~yourname/cgi-bin/file.cgi, right? That's usually the case, but as I said earlier, the process of mapping the URL to a file location is done by the webserver, and it's totally up to the server how it does the mapping. So don't take anything for granted: it might well be that a partial URL is mapped to a totally different directory. Virtual servers often make use of that in conjunction with /cgi-bin URLs, mapping those automatically to the same directory for all hosts on the system.

403 Forbidden errors should be rather easy to debug. Somewhere along the way to the requested file the server was instructed not to serve the file. Either an .htaccess directive said so, or a directory along the way was not accessible to the server. Check the permissions of the directories. Keep in mind that the server runs under its own userid, and that is most likely NOT root but a user with minimal rights. There are ways to run scripts suid, meaning with the ID of the owner of the file, but until the server gets to the file this doesn't matter - all directories up to that point must be accessible to the userid under which the server runs (on unix a directory needs its execute bit set before it can be entered). If you have verified the permissions and still cannot figure out why you get this error, chances are the problem lies in the server configuration, in the form of an access restriction on certain paths.
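
The directory check can be automated. The following Python sketch walks from the script's directory up to the root and reports every directory that users outside the owner and group cannot enter. It assumes a unix machine and assumes that the server's userid is neither the owner of the directories nor a member of their group; the function name blocked_directories is made up for the example.

```python
import os
import stat


def blocked_directories(file_path):
    """Return the directories above file_path that "other" users cannot enter."""
    blocked = []
    directory = os.path.dirname(os.path.abspath(file_path))
    while True:
        mode = os.stat(directory).st_mode
        # S_IXOTH is the execute (search) bit for users who are neither
        # the owner nor in the group of the directory.
        if not mode & stat.S_IXOTH:
            blocked.append(directory)
        parent = os.path.dirname(directory)
        if parent == directory:  # reached the filesystem root
            break
        directory = parent
    return blocked
```

An empty list means the directory permissions are not the problem, which points to an access restriction in the server configuration instead.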

We already covered one example of when the 501 Method not implemented error occurs. This error can also occur when the server configuration says not to execute CGI scripts in this directory. In this case the file is treated as a normal file but was called with the POST method - for the server this is a no-no, since only CGI scripts can be called with POST.

Now to the hardest server error to debug, the all-too-generic 500 Internal Server Error: this is basically the server's way of saying "I tried to access that file and it exists, but I either can't execute it or the output wasn't at all what I expected." There are many reasons for the first possibility (can't execute the file):

* The userID the server runs under doesn't have permission to execute the file. Check that you have set the file to be executable.
* The server configuration disallows execution of CGI scripts in this path. Find the configuration files and check whether that is the case, or better, ask the administrator who set up the server.
* An I/O error occurred while reading the file (unlikely). Try invoking the script again.

If the server executes the file but the output isn't what the server expects (a valid Header followed by a blank line followed by some data) we have another problem which we will examine next.

Syntax Errors

Let's consider the process of executing a CGI script written in Perl (on a unix machine). The first line in the script points to the local Perl interpreter, e.g. #!/usr/bin/perl. What really happens is that the server finds the script, sees it is executable and trusts that from now on everything will work flawlessly; it's out of the server's responsibility. But this is just a text file, not a compiled program. The so-called she-bang notation on the first line is part of any shell script that needs to invoke an interpreter; in this case it happens to be Perl. What it says is "Start up this program to run the remaining lines." Make sure that the path in the first line is correct. A common error is to have the first line point to something like /usr/local/bin while the program actually resides in /usr/bin, or the other way around.

If the script can't be interpreted because of syntax errors, the interpreter will output some error message and quit. The webserver gets this output and looks for a correct header, but of course something like "Syntax error on line 5" isn't a valid header, so the server has no choice but to bail out with a 500 Server Error.

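
To see why an error message fails where a real header passes, here is a simplified sketch of the check, written in Python. It is not the code of any actual webserver, it only looks for the Content-type header discussed in this article, and the function name check_cgi_output is made up for the example.

```python
def check_cgi_output(output):
    """Return None if output looks like valid CGI output, else a reason."""
    # The header block ends at the first blank line.
    text = output.replace("\r\n", "\n")
    head, separator, body = text.partition("\n\n")
    if not separator:
        return "no blank line after the header"
    names = []
    for line in head.split("\n"):
        name, colon, _ = line.partition(":")
        # A header line is "Name: value"; the name contains no spaces.
        if not colon or not name or " " in name:
            return "malformed header line: %r" % line
        names.append(name.lower())
    if "content-type" not in names:
        return "no Content-type header"
    if not body:
        return "no data after the header"
    return None
```

Capture the output of your script on the command line and feed it to this function: an interpreter error message is rejected at the first step, because it is not followed by a blank line.
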
So before you ever test a script via the webserver with your browser, check it for syntax errors first! In Perl you can do that by invoking the script with the -c switch from the command line, e.g. perl -c myscript.cgi. Be aware that the path to your Perl interpreter has to be right as well, so it is better to test with the same path that you have defined in your script (in our example /usr/bin/perl -c myscript.cgi). Whenever possible, test on the same machine your script will be running on, so you'll notice newbie mistakes like uploading the script source in binary mode instead of ascii right away.
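
Some of these checks can be done before the interpreter is even started. The Python sketch below looks at the things that most often go wrong on a unix machine: the executable bit, the first line, and the DOS line endings that a binary-mode upload from Windows leaves in the file. The function name preflight and the wording of the messages are made up for the example. Also note that os.access tests the permissions of the user running the check, which is usually you and not the server's userid.

```python
import os


def preflight(script_path):
    """Return a list of problems that would keep the script from starting."""
    problems = []
    if not os.access(script_path, os.X_OK):
        problems.append("script is not executable")
    with open(script_path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        problems.append("first line is not a #! line")
        return problems
    if first_line.endswith(b"\r\n"):
        # The carriage return becomes part of the first line and breaks it.
        problems.append("first line ends with CR LF (DOS line ending)")
    parts = first_line[2:].split()
    if not parts:
        problems.append("no interpreter path after #!")
    elif not os.path.exists(parts[0].decode()):
        problems.append("interpreter not found: " + parts[0].decode())
    return problems
```

An empty list only means the script can be started. It says nothing about syntax errors, so the -c check is still needed.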

Perl scripts on Windows platforms are a different issue. Here we don't have a unix shell that will let us use the she-bang notation, but the problem with syntax errors is still the same. On Windows it depends on the webserver how it implements Standard CGI. Microsoft's IIS® for example needs to have the information about which interpreter to call (and how) in the registry; O'Reilly's WebSite® on the other hand just needs the extension to be registered with the Windows Explorer. (Correction: version 2.0 of WebSite Pro uses its own internal file mapping, which can be set in the Server Properties.) So if this information isn't given or is wrong, the interpreter will never be executed or will not be executed correctly, resulting in unexpected output and most of the time a standard 500 Server Error. If you have trouble with CGI on Windows you should definitely check out the Perl for Win32 FAQ at http://www.activestate.com/ActivePerl/docs/Perl-Win32/perlwin32faq.html. Among other things it contains detailed configuration instructions for all common Windows webservers.

All contents copyright © TheCGIpath.com all rights reserved