PSGI - Perl Web Server Gateway Interface Specification
This document specifies a standard interface between web servers and Perl web applications or frameworks, to promote web application portability and reduce duplicated effort by web application framework developers.
Keep in mind that PSGI is not Yet Another web application framework. PSGI is a specification to decouple web server environments from web application framework code. PSGI is also not the web application API. Web application developers (end users) are not supposed to run their web applications directly using the PSGI interface, but instead are encouraged to use frameworks that support PSGI, or use the helper implementations like Plack (more on that later).
Servers are web servers that accept HTTP requests, dispatch the requests to the web applications and return the HTTP response to the clients. In the PSGI specification, a server is a Perl process running inside an HTTP server (e.g. mod_perl in Apache), a daemon process called from a web server (e.g. a FastCGI daemon) or a pure Perl HTTP server.
Servers are also called PSGI implementations or backends.
Applications are web applications that actually get HTTP requests and return HTTP responses. In PSGI an application is a code reference: see below.
Middleware is a PSGI application, which is a code reference, but it also acts like a server to run other applications. It can be thought of as a plugin that extends a PSGI application: see below.
Framework developers are authors of web application frameworks. They need to write adapters (or engines) that read the PSGI input, run the application logic and return a PSGI response to the server.
Web application developers are developers who write code using one of the web application frameworks that support the PSGI interface. They usually don't need to deal with or care about the PSGI protocol at all.
A PSGI application is a Perl code reference. It takes exactly one argument, the environment, and returns an array reference of exactly three values.
    sub app {
        my $env = shift;
        return [
            '200',
            [ 'Content-Type' => 'text/plain' ],
            [ "Hello World" ], # or IO::Handle-like object
        ];
    }
The environment MUST be a hash reference that includes CGI-like headers. The application is free to modify the environment. The environment is required to include these variables (adopted from PEP333, Rack and JSGI) except when they'd be empty, but see below:
REQUEST_METHOD: The HTTP request method, such as "GET" or "POST". This cannot ever be an empty string, and so is always required.
SCRIPT_NAME: The initial portion of the request URL's path that corresponds to the application, so that the application knows its virtual "location". This may be an empty string if the application corresponds to the "root" of the server.
PATH_INFO: The remainder of the request URL's "path", designating the virtual "location" of the request's target within the application. This may be an empty string if the request URL targets the application root and does not have a trailing slash. This value should be URI decoded by servers to be compatible with RFC 3875.
REQUEST_URI: The undecoded, raw request URL line. It is the raw URI path and query part that appears in the HTTP GET /... HTTP/1.x line, and it does not contain the URI scheme or host name.
Unlike PATH_INFO, this value SHOULD NOT be decoded by servers and hence it is an application's responsibility to properly decode paths to map URL to application handlers, when using REQUEST_URI over PATH_INFO.
QUERY_STRING: The portion of the request URL that follows the ?, if any. May be empty, but is always required.
SERVER_NAME, SERVER_PORT: When combined with SCRIPT_NAME and PATH_INFO, these variables can be used to complete the URL. Note, however, that HTTP_HOST, if present, should be used in preference to SERVER_NAME for reconstructing the request URL. SERVER_NAME and SERVER_PORT can never be empty strings, and so are always required.
SERVER_PROTOCOL: The version of the protocol the client used to send the request. Typically this will be something like "HTTP/1.0" or "HTTP/1.1" and may be used by the application to determine how to treat any HTTP request headers.
HTTP_ Variables: Variables corresponding to the client-supplied HTTP request headers (i.e., variables whose names begin with HTTP_). The presence or absence of these variables should correspond to the presence or absence of the appropriate HTTP header in the request.
If there are multiple header lines sent with the same key, the server should treat them as if they were sent in one line, i.e. combine them with ", " (a comma followed by a space) as in RFC 2616.
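The mapping from request headers to environment keys described above can be sketched as follows. This is only an illustration, not part of the specification: the helper name headers_to_env is hypothetical, and it assumes the server has already parsed the request into a flat list of name and value pairs. Content-Type and Content-Length are stored under their CGI names without the HTTP_ prefix.

```perl
use strict;
use warnings;

# Hypothetical helper: fold a flat list of (name, value) request header
# pairs into CGI-style environment keys.
sub headers_to_env {
    my @pairs = @_;
    my %env;
    while (my ($name, $value) = splice(@pairs, 0, 2)) {
        (my $key = uc $name) =~ tr/-/_/;
        # Content-Type and Content-Length keep their CGI names (no HTTP_ prefix).
        $key = "HTTP_$key" unless $key =~ /\A(?:CONTENT_TYPE|CONTENT_LENGTH)\z/;
        # Repeated headers are combined as if they were sent on one line.
        $env{$key} = exists $env{$key} ? "$env{$key}, $value" : $value;
    }
    return \%env;
}

my $env = headers_to_env(
    'Host'   => 'example.com',
    'Accept' => 'text/html',
    'Accept' => 'application/json',
);
print "$_ = $env->{$_}\n" for sort keys %$env;
```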
In addition to this, the PSGI environment MUST include these PSGI-specific variables:
psgi.version: An array ref [1,0] representing this version of PSGI.
psgi.url_scheme: A string http or https, depending on the request URL.
psgi.input: the input stream. See below.
psgi.errors: the error stream. See below.
psgi.multithread: true if the application may be simultaneously invoked by another thread in the same process, false otherwise.
psgi.multiprocess: true if an equivalent application object may be simultaneously invoked by another process, false otherwise.
The PSGI environment MAY include these optional PSGI variables:
psgi.run_once: true if the server expects (but does not guarantee!) that the application will only be invoked this one time during the life of its containing process. Normally, this will only be true for a server based on CGI (or something similar).
psgi.nonblocking: true if the server is calling the application in a non-blocking event loop.
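As an illustration of how the variables above fit together, the request URL might be rebuilt roughly like this. The helper name request_url is hypothetical. The sketch prefers HTTP_HOST over SERVER_NAME as the text recommends, omits the port when it is the default for the scheme, and does not re-escape PATH_INFO, so the result is meant for display or logging rather than for exact round-tripping.

```perl
use strict;
use warnings;

# Hypothetical helper: rebuild the request URL from a PSGI environment.
sub request_url {
    my $env    = shift;
    my $scheme = $env->{'psgi.url_scheme'};
    my $host   = $env->{HTTP_HOST};
    unless (defined $host && length $host) {
        # Fall back to SERVER_NAME, adding the port only when non-default.
        $host = $env->{SERVER_NAME};
        my $default_port = $scheme eq 'https' ? 443 : 80;
        $host .= ":$env->{SERVER_PORT}" if $env->{SERVER_PORT} != $default_port;
    }
    my $path  = ($env->{SCRIPT_NAME} // '') . ($env->{PATH_INFO} // '');
    my $url   = "$scheme://$host$path";
    my $query = $env->{QUERY_STRING};
    $url .= "?$query" if defined $query && length $query;
    return $url;
}

my $url = request_url({
    'psgi.url_scheme' => 'http',
    SERVER_NAME       => 'localhost',
    SERVER_PORT       => 5000,
    SCRIPT_NAME       => '/app',
    PATH_INFO         => '/users',
    QUERY_STRING      => 'page=2',
});
print "$url\n";
```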
The server or the application can store its own data in the environment, too. The keys MUST contain at least one dot, and should be prefixed uniquely. The prefix psgi. is reserved for use with the PSGI core implementation and other accepted extensions and MUST NOT be used otherwise. The environment MUST NOT contain the keys HTTP_CONTENT_TYPE or HTTP_CONTENT_LENGTH (use the versions without HTTP_). The values of the CGI keys (those named without a period) MUST be scalars containing strings. The following restrictions apply:
psgi.version MUST be an array reference of integers.
psgi.url_scheme MUST be a scalar variable containing either the string http or https.
There MUST be a valid input stream in psgi.input.
There MUST be a valid error stream in psgi.errors.
The REQUEST_METHOD MUST be a valid token.
The SCRIPT_NAME, if non-empty, MUST start with /
The PATH_INFO, if non-empty, MUST start with /
The CONTENT_LENGTH, if given, MUST consist of digits only.
One of SCRIPT_NAME or PATH_INFO MUST be set. PATH_INFO should be / if SCRIPT_NAME is empty. SCRIPT_NAME should never be /, but should instead be empty.
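The restrictions above lend themselves to a mechanical check. The following is a rough sketch only: env_problems is a hypothetical name, the "valid token" test uses the HTTP token character set, and the stream check merely confirms that something is present rather than probing for read or print.

```perl
use strict;
use warnings;

# Hypothetical checker: returns a list of problems found in $env.
sub env_problems {
    my $env = shift;
    my @errors;

    my $version = $env->{'psgi.version'};
    push @errors, 'psgi.version must be an array ref of integers'
        unless ref($version) eq 'ARRAY' && @$version
            && !grep { !defined($_) || $_ !~ /\A[0-9]+\z/ } @$version;
    push @errors, 'psgi.url_scheme must be http or https'
        unless ($env->{'psgi.url_scheme'} // '') =~ /\Ahttps?\z/;
    for my $stream (qw(psgi.input psgi.errors)) {
        push @errors, "$stream is missing" unless defined $env->{$stream};
    }
    push @errors, 'REQUEST_METHOD must be a valid token'
        unless ($env->{REQUEST_METHOD} // '') =~ /\A[!#\$%&'*+.^_`|~0-9A-Za-z-]+\z/;
    for my $key (qw(SCRIPT_NAME PATH_INFO)) {
        my $value = $env->{$key};
        push @errors, "$key must start with /"
            if defined $value && length $value && $value !~ m{\A/};
    }
    push @errors, 'SCRIPT_NAME should be empty rather than /'
        if ($env->{SCRIPT_NAME} // '') eq '/';
    push @errors, 'CONTENT_LENGTH must consist of digits only'
        if defined $env->{CONTENT_LENGTH} && $env->{CONTENT_LENGTH} !~ /\A[0-9]+\z/;
    push @errors, 'HTTP_CONTENT_TYPE and HTTP_CONTENT_LENGTH are not allowed'
        if grep { exists $env->{$_} } qw(HTTP_CONTENT_TYPE HTTP_CONTENT_LENGTH);
    return @errors;
}

my @problems = env_problems({
    'psgi.version'    => [ 1, 0 ],
    'psgi.url_scheme' => 'http',
    'psgi.input'      => \*STDIN,
    'psgi.errors'     => \*STDERR,
    REQUEST_METHOD    => 'GET',
    SCRIPT_NAME       => '',
    PATH_INFO         => 'missing-slash',
});
print "$_\n" for @problems;
```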
The input stream in psgi.input is an IO::Handle-like object which streams the raw HTTP POST or PUT data. If it is a file handle then it MUST be opened in binary mode. The input stream MUST respond to read and MAY implement seek.
The built-in filehandle or IO::Handle based objects should work fine everywhere. Application developers SHOULD NOT inspect the type or class of the stream, but instead just call read to duck type.
Application developers SHOULD NOT use the built-in read function to read from the input stream, because the built-in read only works with a real IO object (a glob ref based file handle or PerlIO) and makes duck typing difficult. Web application framework developers who know the input stream will be used with the built-in read() in upstream code they can't touch SHOULD use PerlIO or a tied handle to work around this problem.
$input->read($buf, $len [, $offset ]);
Returns the number of characters actually read, 0 at end of file, or undef if there was an error.
$input->seek($pos, $whence);
Returns 1 on success, 0 otherwise.
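A minimal sketch of reading a request body through the method-call interface follows. The helper name slurp_body and the 8192-byte chunk size are illustrative choices, and the in-memory filehandle merely stands in for whatever stream a real server would supply.

```perl
use strict;
use warnings;
use IO::Handle;    # makes the read() method available on plain filehandles

# Hypothetical helper: read the whole request body with $input->read,
# never the built-in read().
sub slurp_body {
    my $env       = shift;
    my $input     = $env->{'psgi.input'};
    my $remaining = $env->{CONTENT_LENGTH} || 0;
    my $body      = '';
    while ($remaining > 0) {
        my $want = $remaining < 8192 ? $remaining : 8192;
        my $got  = $input->read(my $chunk, $want);
        die "error reading psgi.input\n" unless defined $got;
        last if $got == 0;    # input ended before CONTENT_LENGTH was reached
        $body      .= $chunk;
        $remaining -= $got;
    }
    return $body;
}

# An in-memory filehandle stands in for the server-provided stream.
my $data = 'name=psgi&lang=perl';
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
binmode $fh;
my $body = slurp_body({ 'psgi.input' => $fh, CONTENT_LENGTH => length($data) });
print "$body\n";
```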
The error stream in psgi.errors is an IO::Handle-like object to print errors. The error stream must implement print.
The built-in filehandle or IO::Handle based objects should work fine everywhere. Application developers SHOULD NOT inspect the type or class of the stream, but instead just call print to duck type.
$errors->print($error);
Returns true if successful.
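Because the error stream is duck typed, any object with a print method can serve. The collector class below is a hypothetical example (My::ErrorCollector is not part of PSGI or Plack), of the kind that might be used to capture application errors in a test.

```perl
use strict;
use warnings;

# Hypothetical error stream: any object with a print method qualifies.
package My::ErrorCollector;
sub new   { my $class = shift; return bless { lines => [] }, $class }
sub print { my $self = shift; push @{ $self->{lines} }, @_; return 1 }
sub lines { my $self = shift; return @{ $self->{lines} } }

package main;

# The application only calls print and never inspects the object's class.
my $app = sub {
    my $env = shift;
    $env->{'psgi.errors'}->print("something went wrong\n");
    return [ 500, [ 'Content-Type' => 'text/plain' ], [ "Internal Server Error" ] ];
};

my $errors = My::ErrorCollector->new;
my $res    = $app->({ 'psgi.errors' => $errors });
my @logged = $errors->lines;
print @logged;
```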
The status, the first element of the response, is an HTTP status code. It is an integer and MUST be greater than or equal to 100.
The headers MUST be an array reference (and NOT a hash reference!) containing key and value pairs. Its number of elements MUST be even. A header key MUST NOT be Status, MUST NOT contain : or newlines, and MUST NOT end in - or _. Keys MUST consist only of letters, digits, _ or - and MUST start with a letter. The value of each header MUST be a scalar value that contains a string. The value string MUST NOT contain characters below chr(31) (octal 037).
If the same key name appears multiple times in an array ref, those header lines MUST be sent to the client separately (e.g. multiple Set-Cookie lines).
There MUST be a Content-Type except when the Status is 1xx, 204 or 304, in which case there MUST be none given.
There MUST NOT be a Content-Length header when the Status is 1xx, 204 or 304.
If the Status is not 1xx, 204 or 304 and there is no Content-Length header, servers MAY calculate the content length by looking at the Body, in case it can be calculated (i.e. if it's an array ref of body chunks or a real file handle), and append it to the outgoing headers.
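The optional Content-Length calculation could look something like the sketch below for the array reference case. The function names are hypothetical, the real filehandle case is left out, and the code assumes the body chunks are byte strings, so that length returns a byte count.

```perl
use strict;
use warnings;

# Hypothetical helper: true for statuses that carry no Content-Length.
sub status_has_no_body {
    my $status = shift;
    return $status < 200 || $status == 204 || $status == 304;
}

# Hypothetical server-side step: add Content-Length when the body is an
# array reference and the application did not supply the header.
sub add_content_length {
    my $res = shift;
    my ($status, $headers, $body) = @$res;
    return $res if status_has_no_body($status);
    return $res unless ref($body) eq 'ARRAY';
    # Headers are a flat list of pairs, so keys sit at even indexes.
    for (my $i = 0; $i < @$headers; $i += 2) {
        return $res if lc($headers->[$i]) eq 'content-length';
    }
    my $length = 0;
    $length += length($_) for @$body;
    push @$headers, 'Content-Length' => $length;
    return $res;
}

my $res = add_content_length(
    [ 200, [ 'Content-Type' => 'text/plain' ], [ "Hello\n", "World\n" ] ]
);
print join(', ', @{ $res->[1] }), "\n";
```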
The response body is returned from the application as one of the following two types of scalar value.
An array reference containing the body as lines.
my $body = [ "Hello\n", "World\n" ];
Note that the elements of the array reference are NOT REQUIRED to end in a newline. Servers SHOULD write each element as is to the client, and SHOULD NOT care whether it ends with a newline.
So, when you have a big chunk of HTML in a single scalar $body,
[ $body ]
is a valid response body.
An IO::Handle-like object or a built-in filehandle.
    open my $body, "</path/to/file";
    open my $body, "<:via(SomePerlIO)", ...;
    my $body = IO::File->new("/path/to/file");
    my $body = SomeClass->new(); # mock class that implements getline() and close()
Servers SHOULD NOT check the type or class of the body but instead just call getline (i.e. duck type) to iterate over the body and call close when done.
Servers MAY check if the body is a real filehandle using fileno and Scalar::Util::reftype and if it's a real filehandle that has a file descriptor, it MAY optimize the file serving using techniques like sendfile(2).
The body object MAY respond to a path method that returns the local file system path, which MAY be used by some servers to switch to a more efficient file serving method using the file path instead of a file descriptor.
Servers are RECOMMENDED to set the $/ special variable to a reference to the buffer size (for example \4096) when reading content from $body using the getline method, in case it's a binary filehandle. An application that returns a mock object implementing getline is NOT REQUIRED to respect the $/ value.
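Put together, a server's handling of the two body types might look roughly like this. Everything named here is hypothetical: My::ChunkedBody is a mock class of the kind the text allows, write_body is an illustrative helper, and the $write callback stands in for however a real server sends data to the client. The sendfile and path optimizations are not shown.

```perl
use strict;
use warnings;

# Hypothetical mock body: yields its chunks through getline() and
# records when close() is called.
package My::ChunkedBody;
sub new {
    my ($class, @chunks) = @_;
    return bless { chunks => [@chunks], closed => 0 }, $class;
}
sub getline { my $self = shift; return shift @{ $self->{chunks} } }
sub close   { my $self = shift; $self->{closed} = 1; return 1 }

package main;

# Hypothetical server-side writer: $write sends one chunk to the client.
sub write_body {
    my ($body, $write) = @_;
    if (ref($body) eq 'ARRAY') {
        $write->($_) for @$body;    # elements are written as is
        return;
    }
    local $/ = \4096;    # real filehandles are read in 4096-byte records
    while (defined(my $chunk = $body->getline)) {
        $write->($chunk);
    }
    $body->close;
    return;
}

my $out  = '';
my $body = My::ChunkedBody->new("Hello ", "World");
write_body($body, sub { $out .= shift });
write_body([ "!", "\n" ], sub { $out .= shift });
print $out;
```

Checking defined rather than truth in the loop matters, since a chunk such as "0" is a legitimate piece of body data.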
Middleware is itself a PSGI application but it takes an existing PSGI application and runs it like a server, mostly to do pre-processing on $env or post-processing on the response objects.
Here's a simple example that appends a special HTTP header, X-PSGI-Used, to the response of a PSGI application.
    # $app is a simple PSGI application
    my $app = sub {
        my $env = shift;
        return [ '200', [ 'Content-Type' => 'text/plain' ], [ "Hello World" ] ];
    };

    # $xheader is a middleware to wrap $app
    my $xheader = sub {
        my $env = shift;
        my $res = $app->($env);
        push @{$res->[1]}, 'X-PSGI-Used' => 1;
        return $res;
    };
Middleware itself MUST behave exactly like a PSGI application: take $env and return $res.
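Since middleware takes $env and returns $res like any application, layers can be stacked. One way to make a middleware reusable is a factory function that closes over the wrapped application. The sketch below assumes a hypothetical add_header function, and the X-Layer header is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical middleware factory: wraps any PSGI application and
# appends one header to its response.
sub add_header {
    my ($app, $name, $value) = @_;
    return sub {
        my $env = shift;
        my $res = $app->($env);
        push @{ $res->[1] }, $name => $value;
        return $res;
    };
}

my $hello = sub {
    my $env = shift;
    return [ 200, [ 'Content-Type' => 'text/plain' ], [ "Hello World" ] ];
};

# Each layer is itself a PSGI application, so layers compose.
my $wrapped = add_header(
    add_header($hello, 'X-PSGI-Used' => 1),
    'X-Layer' => 'outer',
);
my $res = $wrapped->({ REQUEST_METHOD => 'GET' });
print join(', ', @{ $res->[1] }), "\n";
```

The inner layer runs first on the way out, so its header appears before the one added by the outer layer.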
Some parts of this specification are adopted from the following specifications.
PEP333 Python Web Server Gateway Interface http://www.python.org/dev/peps/pep-0333
Rack http://rack.rubyforge.org/doc/SPEC.html
JSGI Specification http://jackjs.org/jsgi-spec.html
I'd like to thank the authors of these great documents.
Tatsuhiko Miyagawa <miyagawa@bulknews.net>
The following people have contributed to the PSGI specification and Plack implementation by committing their code, sending patches, reporting bugs, asking questions, offering useful advice, nitpicking, chatting on IRC or commenting on my blog (in no particular order):
Tokuhiro Matsuno, Kazuhiro Osawa, Yuval Kogman, Kazuho Oku, Alexis Sukrieh, Takatoshi Kitano, Stevan Little, Daisuke Murase, mala, Pedro Melo, Jesse Luehrs, John Beppu, Shawn M Moore, Mark Stosberg, Matt S Trout, Jesse Vincent, Chia-liang Kao, Dave Rolsky, Hans Dieter Pearcey, Randy J Ray, Benjamin Trott, Max Maischein, Slaven Rezić, Marcel Grünauer, Masayoshi Sekimura, Brock Wilcox, Piers Cawley, Daisuke Maki, Kang-min Liu, Yasuhiro Matsumoto, Ash Berlin, Artur Bergman, Simon Cozens, Scott McWhirter, Jiro Nishiguchi, Masahiro Chiba, Patrick Donelan
Copyright Tatsuhiko Miyagawa, 2009.
This document is licensed under the Creative Commons license by-sa.