2006/08/23
$web/etc/handler
The file "$web/etc/handler" maps a requested URI path to an action.
The user can define relations between path patterns and scripts. This mechanism is used to define the form of CGI files, SSI (Server Side Include), and the auto-indexing service for specific directories.
The relations between path patterns and scripts are defined in $web/etc/handler. The following is the content of my configuration (http://plan9.aichi-u.ac.jp).
# path mimetype hctl execpath arg ...
/netlib/*/index.html text/html 0 /bin/ftp2html
*.http - 1 $target
*.cgi text/html + $target
*.html text/html 0 $target
*.tt text/html 0 /bin/peep $target
Example execution handler
The first field is a path pattern, the second field is the default MIME type, the third field is the level of control the script has over the HTTP headers, and the fourth field is the path to a script. The fourth field may be followed by arguments to the script.
The requested path is compared with the path patterns from the top line down. Comparison stops at the first pattern that matches the requested path.
In a path pattern, the directory separator "/" is not special. (Therefore this pattern matching is not the same as that of the shell.) There is one exception: the pattern "/*/" also matches "/". Therefore the pattern
/netlib/*/index.html
matches /netlib/index.html as well as, for example, /netlib/cmd/backup/index.html.
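The matching rules above can be sketched in Python. This is only an illustration under stated assumptions, not the Pegasus implementation: the names RULES, pattern_to_regex and find_rule are hypothetical, "*" is assumed to be the only wildcard, and a pattern is assumed to have to match the whole requested path.

```python
import re

# Rules copied from the example handler: (pattern, mimetype, hctl, command).
RULES = [
    ("/netlib/*/index.html", "text/html", "0", ["/bin/ftp2html"]),
    ("*.http", "-", "1", ["$target"]),
    ("*.cgi", "text/html", "+", ["$target"]),
    ("*.html", "text/html", "0", ["$target"]),
    ("*.tt", "text/html", "0", ["/bin/peep", "$target"]),
]


def pattern_to_regex(pattern):
    """Translate a handler pattern into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("/*/", i):
            # Exception from the text: "/*/" matches "/" as well as "/<anything>/".
            parts.append("/(?:.*/)?")
            i += 3
        elif pattern[i] == "*":
            # "/" is not special, so "*" matches any run of characters.
            parts.append(".*")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def find_rule(path, rules=RULES):
    """Return the first rule whose pattern matches the whole path, else None."""
    for rule in rules:
        if pattern_to_regex(rule[0]).fullmatch(path):
            return rule
    return None
```

With these rules, find_rule("/netlib/index.html") and find_rule("/netlib/cmd/backup/index.html") both select the /bin/ftp2html line, while find_rule("/index.html") falls through to the "*.html" line because the search stops at the first match.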
The second field gives the default value of the HTTP header "Content-Type". If the field is "-", the script must set the header.
The third field, named "hctl", takes the values '1', '+', and '0', which set the level of control the script has over the HTTP headers. The meanings are:
1 full control by the script
+ partial control by the script
0 no control by the script
If '1' is specified, the script is responsible for writing all HTTP headers; such a script is called a non-parsed header (NPH) script in CGI/1.1. The HTTP headers must be separated from the HTML content by a single blank line: a line that contains only "\n". The trailing "\n" of each of these header lines is converted to "\r\n" by Pegasus.
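The line-ending conversion for hctl '1' can be pictured with a small Python sketch. It is an illustration only: to_wire_format is a hypothetical name, and the sketch assumes the script writes bare "\n" line endings and that the content after the blank line is passed through unchanged.

```python
def to_wire_format(script_output):
    """Rewrite the header section of script output with CRLF line endings."""
    # The first blank line separates the headers from the content.
    head, sep, body = script_output.partition("\n\n")
    if not sep:
        raise ValueError("no blank line between headers and content")
    # Drop any "\r" the script may already have written, then join with CRLF.
    header_lines = [line.rstrip("\r") for line in head.split("\n")]
    return "\r\n".join(header_lines) + "\r\n\r\n" + body
```

For example, to_wire_format("Content-Type: text/html\n\n<html>\n</html>\n") changes only the two newlines that end the header section; the newlines inside the HTML are left alone.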
If '0' is specified, the script must not write HTTP headers. The headers are provided by httpd.
If '+' is specified, the script may emit HTTP headers in compliance with CGI/1.1: if a header name is absent from the regular HTTP headers, the header is added; if the header name is already present in the regular HTTP headers, the header is replaced by the one from the script.
An example is shown below.
Set-Cookie: cookie=something; expire=Sun, 6-Aug-2006 11:43:57 GMT; domain=ar.aichi-u.ac.jp; path=/test4; secure

<html>
<head>
<title>Cookie sample</title>
</head>
<body>
...
</body>
</html>
When the hctl field is '+', an HTTP "Content-Length" header written by the script is ignored, because the server takes care of that header.
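The add-or-replace rule for hctl '+' can be sketched as follows. This is a rough model under assumptions, not the server's code: split_cgi_output and merge_headers are hypothetical names, header names are compared case-insensitively, and a repeated header name is collapsed into one entry (a real server would need to keep repeated headers such as several Set-Cookie lines).

```python
def split_cgi_output(output):
    """Split CGI-style output into a list of (name, value) headers and a body."""
    # Assumes the output starts with headers followed by one blank line.
    head, _, body = output.partition("\n\n")
    headers = []
    for line in head.split("\n"):
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return headers, body


def merge_headers(server_headers, script_headers):
    """Merge script headers into the server's headers, ignoring Content-Length."""
    merged = {name.lower(): (name, value) for name, value in server_headers}
    for name, value in script_headers:
        if name.lower() == "content-length":
            continue  # ignored: the server takes care of this header
        # An absent name is added; a present name is replaced.
        merged[name.lower()] = (name, value)
    return list(merged.values())
```

Given server headers Content-Type: text/html and Content-Length: 42, a script that writes Content-Type: text/plain, Set-Cookie: cookie=something and Content-Length: 999 ends up with the script's Content-Type, the server's Content-Length, and the added Set-Cookie.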
The reserved word $target in or after the fourth field denotes the absolute path to the requested document. Note that $target in the fourth field itself means that the requested path is an executable program.
Clients can add arguments to a CGI request sent to Pegasus. These arguments are passed to the script automatically, without being described in the handler file.
/bin/ftp2html in this example is a script that is used in
http://plan9.aichi-u.ac.jp/netlib/
to handle my FTP directories. Other servers such as Apache have an option to show a directory index if index.html is absent. ftp2html does this too, but also does much more: if a README file is present, its content is shown, and if an INDEX file is present, its content is shown with an appropriate action tag attached to each index label.
The ramdisk is private to the script and vanishes automatically as soon as the script finishes or is terminated.
If "text/html" is specified for mimetype and the hctl value is '0', the format of CGI is:
<html> ... </html>
That is, don't start with "Content-Type:" as Apache does:
Content-Type: text/html

<html>
...
</html>
Apache-style CGI is also supported. In the example execution handler, files with the suffix "cgi" are handled as CGI/1.1.
When "text/html" is specified for "mimetype" in the execution handler, Pegasus automatically sends the HTTP headers to the client. The response status then follows these rules:
"200 OK" is sent if the exit status is not given.
"500 Internal Error" is sent and the connection is closed.
"200 OK" for compatibility reasons.
These rules seem to work well; however, the connection can also be controlled directly by specifying "keep" or "close" after "#" in the exit status:
exit '403 Forbidden # keep'
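How such an exit status could be taken apart is sketched below in Python. The function name parse_exit_status is hypothetical, and the sketch makes two assumptions that the text does not state: whitespace around "#" is optional, and any word other than "keep" or "close" is treated as if no directive were given.

```python
def parse_exit_status(status):
    """Return (status_line, connection) from a script's exit status string."""
    text, _, directive = status.partition("#")
    text = text.strip()
    directive = directive.strip()
    if not text:
        # Rule from the text: no exit status means "200 OK".
        text = "200 OK"
    if directive not in ("keep", "close"):
        # Assumption: without a valid directive the server applies its own rule.
        directive = None
    return text, directive
```

Applied to the example, parse_exit_status("403 Forbidden # keep") yields the status line "403 Forbidden" and the directive "keep".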
Both stdout and stderr are passed to the client.
Params as defined in HTTP/1.0 and HTTP/1.1 are supported by Pegasus. According to the RFC specification, the format is:
path;params?query
params = param[;params]
Some traditional web servers ignore params and pass the query to CGI as arguments. Pegasus rejects this traditional practice and treats params as the argument part that should be passed to CGI. On the other hand, Pegasus takes no part in interpreting the query and passes it to CGI as an environment variable without translation.
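The split of a request into path, params, and query can be illustrated with a short Python sketch. The name split_request and the sample URI are made up for the example, and the sketch assumes that params appear only once, directly before the "?".

```python
def split_request(uri):
    """Split "path;params?query" into (path, list of params, raw query)."""
    # The query is everything after the first "?" and is kept untranslated.
    path_and_params, _, query = uri.partition("?")
    # params = param[;params], so the params form a ";"-separated list.
    path, *params = path_and_params.split(";")
    return path, params, query
```

For the made-up request "/bin/search.cgi;alice;bob?x=1&y=2" this gives the path "/bin/search.cgi", the params ["alice", "bob"] that would become script arguments, and the query "x=1&y=2" that would be handed over as an environment variable.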
Pegasus has many environment variables. However, most of them are only experimental. The stable variables are the following:
AUTH_TYPE
CONTENT_LENGTH
CONTENT_TYPE
GATEWAY_INTERFACE
PATH_INFO
PATH_TRANSLATED
QUERY_STRING
REMOTE_ADDR
REMOTE_HOST
REMOTE_USER
REQUEST_METHOD
SCRIPT_NAME
SERVER_NAME
SERVER_PORT
SERVER_PROTOCOL
SERVER_SOFTWARE
and all the attributes in the HTTP header, such as
HTTP_HOST
HTTP_REFERER
HTTP_USER_AGENT
with original header
HTTP_HEADER
Additionally we have
REQUEST_PATH # requested path (see Note)
REQUEST_URI # requested path (see Note)
home # /doc
query # same as QUERY_STRING
target # requested path in service space
name # basename of target
cputype # 386
objtype # 386
date # date such as "{Mon, 04 Mar 2002 07:32:40 GMT}"
Note: REQUEST_URI might end with "/" if it is a directory. On the other hand, REQUEST_PATH is the file that is effectively requested. target is expressed in the notation of rc: target = /doc$REQUEST_PATH
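The relation between these variables can be written out as a small Python sketch. It only restates the comments in the list above; describe_request is a hypothetical helper, the sample values are invented, and it assumes that the "/doc" prefix of target is the value of home.

```python
def describe_request(env):
    """Derive target, name and query from a dict of environment variables."""
    home = env.get("home", "/doc")
    # target = /doc$REQUEST_PATH, assuming home holds the "/doc" prefix.
    target = home + env.get("REQUEST_PATH", "")
    return {
        "target": target,
        "name": target.rsplit("/", 1)[-1],  # basename of target
        "query": env.get("QUERY_STRING", ""),  # same as QUERY_STRING
    }
```

With REQUEST_PATH set to "/netlib/index.html", the sketch gives the target "/doc/netlib/index.html" and the name "index.html".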
Other environment variables might be discarded or renamed in the future.
When POSTed data is received by the server from the client, the server checks Content-Length while receiving the data. The server then passes the data to the CGI script on stdin.
A timeout is defined to prevent buggy programs from waiting for data for too long. The value can be specified as an httpd option or in /sys/lib/httpd.conf. The default is 5 seconds. I think this value is sufficient because the data is already held by the server.
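On the script side, reading the POSTed data might look like the following Python sketch. It is an assumption-laden illustration: read_post_body is a hypothetical helper, the environment is passed in as a dict, and an in-memory stream stands in for stdin so that the example runs anywhere (a real script would pass sys.stdin.buffer).

```python
import io


def read_post_body(env, stream):
    """Read exactly CONTENT_LENGTH bytes of POSTed data from a binary stream."""
    length = int(env.get("CONTENT_LENGTH") or 0)
    data = stream.read(length)
    if len(data) != length:
        # The server has already checked the length, so this should not happen.
        raise ValueError("expected %d bytes, got %d" % (length, len(data)))
    return data


# In-memory stand-in for the data the server would supply on stdin.
demo = read_post_body({"CONTENT_LENGTH": "10"}, io.BytesIO(b"name=Alice"))
```

Because the server already holds the complete data before starting the script, a read like this should return at once, which is why a short timeout is enough.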