Module Version: 1.2

NAME

TUWF - The Ultimate Website Framework

DESCRIPTION

TUWF is a small framework designed for writing websites. It provides an abstraction layer to various environment-specific tasks and has common functions to ease the creation of both small and large websites.

For a gentle introduction to TUWF, see TUWF::Intro.

Main features and limitations

TUWF may be The Ultimate Website Framework, but it is not the perfect solution to every problem. This section introduces you to some main features and limitations you will want to know about before using TUWF.

TUWF is small.

I have seen many frameworks being advertised as "small" or "minimal", yet they either require loads of dependencies or are not small at all. TUWF, on the other hand, is quite small. Its total codebase is significantly smaller than the primary code of CGI.pm, and TUWF requires absolutely no extra dependencies to run.

Some optional features, however, do require extra modules:

  • DBI: For the TUWF::DB database handling methods.
  • FCGI: To run TUWF in a FastCGI environment.
  • HTTP::Server::Simple: To run the standalone HTTP server.
  • JSON::XS: If you need to handle requests with a JSON body, or wish to output JSON yourself.
  • PerlIO::gzip: For output compression.

The generated response is buffered.

This allows you to change the response completely while generating another one, which is extremely useful if your code decides to throw an error while a part of the response has already been generated. In such a case your visitor will properly see your error page and not some messed up page that does not make sense. Thanks to this buffering, you will also be able to set cookies and send other headers after generating the contents of the page. And as an added bonus, your pages will be compressed more efficiently when output compression is enabled.

On the other hand, this means that you can't use TUWF for applications that require Websockets or other forms of streaming dynamic content (e.g. a chat application), and you may get into memory issues when sending large files.

Everything is UTF-8.

All TUWF functions (with some exceptions) will only accept and return Unicode strings in Perl's native encoding. All incoming data is assumed to be encoded in UTF-8 and all outgoing data will be encoded in UTF-8. This is generally what you want when developing new applications. If, for some very strange reason, you want all I/O with the browser to be in anything other than UTF-8, you won't be able to use TUWF. It is possible to use external resources which use other encodings, but you will have to decode() that data into Perl's native encoding before passing it to any TUWF function.
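
As a minimal sketch of that last point, assuming a hypothetical external resource that delivers Windows-1252 bytes, the core Encode module can convert them into a Perl character string before they reach TUWF. The variable names and the sample bytes are illustrative only.

  use strict;
  use warnings;
  use Encode 'decode';

  # Hypothetical input: the bytes "5 " followed by 0x80, which is the
  # euro sign in Windows-1252.
  my $bytes = "5 \x80";

  # Decode into Perl's native character representation.
  my $text = decode('cp1252', $bytes);

  # $text now holds three characters ("5", " " and U+20AC) and can be
  # passed to TUWF functions.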

Designed for FastCGI environments.

TUWF is designed and optimized to be run in a FastCGI environment, but it can also be used for plain old CGI scripts or run as a standalone HTTP server.

Due to the singleton design of TUWF, you should avoid running TUWF websites in persistent environments that allow multiple websites to share the same process, such as mod_perl.

One (sub)domain is one website.

TUWF assumes that the website you are working on resides directly under a (sub)domain. That is, the homepage of your website has a URI like http://example.com/, and all sub-pages are directly beneath it. (e.g. http://example.com/about would be your "about" page).

While it is possible to run a TUWF website in a subdirectory (i.e. the homepage of the site would be http://example.com/mysite/), you will have to prefix all HTML links and registered URIs with the name of the subdirectory. This is neither productive, nor will it be fun when you wish to rename that directory later on.

One website is one (sub)domain.

In the same way as the previous point, TUWF is not made to handle websites that span multiple (sub)domains and have different behaviour for each one. It is possible - quite simple, even - to have a different subdomain affect some configuration parameter while keeping the structure and behaviour of the website the same as for the other domains. An example of this could be a language setting embedded in a subdomain: en.example.com could show the English version of your site, while de.example.com will have the German translation.
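
The language-subdomain idea could be sketched as a small helper that maps a hostname to a language code. The helper name, the list of languages and the English fallback are assumptions made for illustration; in a real site the hostname would come from the current request, for example inside a before hook.

  use strict;
  use warnings;

  # Hypothetical set of translations offered by the site.
  my %languages = map { $_ => 1 } qw(en de);

  # Return the language encoded in the first label of the hostname,
  # falling back to English for unknown or missing subdomains.
  sub lang_from_host {
    my ($host) = @_;
    my ($label) = lc($host // '') =~ /^([a-z]{2})\./;
    return $label && $languages{$label} ? $label : 'en';
  }

  print lang_from_host('de.example.com'), "\n";   # de
  print lang_from_host('www.example.com'), "\n";  # en

Only the configuration value differs per subdomain here; the routes and handlers stay the same for every host.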

Things will become messy as soon as you want (sub)domains to behave differently. If you want forum.example.com to host a forum and wiki.example.com to be a wiki, you will want to avoid programming both subdomains in the same TUWF script. A common solution is to write a separate script for each subdomain. It is still possible to share code among both sites by means of modules.

General structure of a TUWF website

A website written using TUWF consists of a single Perl script, optionally accompanied by several modules. The script is responsible for loading, initializing and running TUWF, and can be used as a CGI script, FastCGI script, or standalone server. For small and simple websites, this script may contain the code for the entire website. Usually, however, the actual implementation of the website is spread among the various modules.

The script can load the modules by calling TUWF::load() or TUWF::load_recursive(). TUWF configuration variables can be set using TUWF::set(), and routing handlers can be registered through TUWF::get() and friends. These functions can also be called by the loaded modules. In fact, for larger websites it is common for the script to only initialize TUWF and load the modules, while routing handlers are registered in the modules.

The framework is based on callbacks: At initialization, your code registers callbacks to the framework and then passes the control to TUWF using TUWF::run(). TUWF will then handle requests and call the appropriate functions you registered.
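
Put together, the flow described above could look like the following single-file sketch. The log file path and the page content are placeholders, and resHeader() and resFd() are assumed to behave as described in TUWF::Response.

  #!/usr/bin/perl
  use strict;
  use warnings;
  use TUWF;

  # Initialization: configure the framework (placeholder path).
  TUWF::set(logfile => '/tmp/mysite.log');

  # Register a callback for "GET /".
  TUWF::get '/' => sub {
    tuwf->resHeader('Content-Type' => 'text/plain; charset=UTF-8');
    my $fd = tuwf->resFd;
    print $fd "Hello from TUWF\n";
  };

  # Hand control to TUWF; it calls the handler above for matching requests.
  TUWF::run();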

The TUWF Object

While TUWF can not really be called object oriented, it does use one major object, called the TUWF object. This object can be accessed using the tuwf() function that is exported by default, or by accessing $TUWF::OBJ directly. Even though it is an "instance" of TUWF::Object, you are encouraged to use it as if it is the main object for your website: You can use it to store global configuration settings and other shared data.

All modules loaded using TUWF::load() and its recursive counterpart can add their own methods to the TUWF Object. This makes it easy to split the functionality of your website among different functions and files, without having to constantly load and import your utility modules in each file that uses them.

Of course, with all these methods being imported into a single namespace, this does call for some method naming conventions to avoid name conflicts and other confusing issues. The main TUWF methods use camelCase and are often prefixed with a short identifier to indicate to which module or section they belong. For example, the TUWF::Request methods all start with req and TUWF::Response with res. It is a good idea to adopt this style when you write your own methods.

Be warned that the data in the TUWF object may or may not persist among multiple requests, depending on whether your script is running in CGI, FastCGI or standalone mode. In particular, it is a bad idea to store session data in this object, assuming it to be available on the next request. Storing data specific to a single request in the object is fine, as long as you make sure to reset or re-initialize the data at the beginning of the request. The before hook is useful for such practice.

Utility functions

Besides the above mentioned methods, TUWF also provides various handy functions. These functions are implemented in the TUWF submodules (e.g. TUWF::Misc) and can be imported manually through these modules. Check out the SEE ALSO below for the list of submodules.

An alternative, and more convenient, approach to importing these functions into your code is also available: you can import functions from multiple submodules at once by adding their names and/or tags to the use TUWF; line.

The following two examples are equivalent:

  # the simple approach
  use TUWF ':xml', 'uri_escape', 'sqlprint';

  # the classic approach
  use TUWF;
  use TUWF::XML ':xml';
  use TUWF::Misc 'uri_escape';
  use TUWF::DB 'sqlprint';

The first use TUWF; line of the classic approach is not required if all you need is to import the functions. Omitting this line from your main website script, however, will cause the main TUWF code to not be loaded into memory, and the global functions (listed below) will then not be available. The simple approach does not suffer from this problem and is therefore recommended.

EXPORTED FUNCTIONS

By default, TUWF exports a single function.

tuwf()

Returns the global TUWF object. This allows for convenient DSL-like access to methods in the TUWF object:

  # Get the client IP:
  my $remote_ip = tuwf->reqIP;

  # Send a 404 response:
  tuwf->resNotFound;

GLOBAL FUNCTIONS

The main TUWF namespace contains several functions used to initialize the framework and register the callbacks for your website.

TUWF::any(\@methods, $path, $sub)

Register a method and path to a subroutine. @methods is an array of HTTP methods to accept. $path can either be a literal string or a regexp. $sub is called on incoming HTTP requests if the method is in the given array and if the $path fully matches reqPath(). It is common to use the qr{} operator to quote the regex, which prevents you from having to escape slashes in the path as would be required with qr//. If there are multiple route handlers that match a single request, then the one that has been registered first will be called. HTTP methods are case-insensitive. If the $path is a regex, then the pattern groups will be available through tuwf->capture.

  TUWF::any ['post'], '/', sub {
    # This code will be called on a "POST /"
  };

  TUWF::any ['get'], '/literal.json', sub {
    # This code will be called on "GET /literal.json"
    # But NOT on "GET /literalXjson"
  };

  TUWF::any ['get'], qr{/regex.json}, sub {
    # This code will be called on "GET /regex.json"
    # And also on "GET /regexXjson"
  };

  TUWF::any ['get','head'], qr{/user/(?<uid>\d+)}, sub {
    # This code will be called on a "GET /user/123".
    # The user's ID will be available in
    #   tuwf->capture('uid')
    # And
    #   tuwf->capture(1);
  };

TUWF::del($path, $sub)

Register a handler for DELETE $path. Equivalent to TUWF::any(['delete'], @_).

TUWF::get($path, $sub)

Register a handler for GET and HEAD on $path. Equivalent to TUWF::any(['get','head'], @_).

  TUWF::get '/', sub { ... };

  TUWF::get qr{/game/(.*)}, sub {
    my $gameid = tuwf->capture(1);
  };

TUWF::hook($hook, $sub)

Add a hook. This allows you to run a piece of code before or after a request handler. Hooks are run in the same order as they are registered. The following hooks are supported:

before

The subroutine will be called before the request handler. If this handler calls tuwf->done, then TUWF will assume that the handler has generated a suitable response, and any subsequent before handlers and request handlers will not be called.

This replaces the pre_request_handler setting.

after

Called after the request handler has run, but before the result has been sent to the client. This hook is always called, even if a before hook has called tuwf->done or if a route handler threw an exception. (The only time an after hook may not be called is when a preceding after hook threw an exception).

This replaces the post_request_handler setting.
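
A short sketch of both hooks follows. The maintenance switch, the per-request key stored in the TUWF object and the choice of response header are made up for illustration; resHeader() is assumed to work as described in TUWF::Response.

  use strict;
  use warnings;
  use TUWF;

  my $maintenance = 0;  # hypothetical site-wide switch

  TUWF::hook before => sub {
    # Reset per-request data kept in the TUWF object (hypothetical key).
    tuwf->{mysite_user} = undef;

    if($maintenance) {
      # Generate a maintenance page here, then skip all further
      # before hooks and route handlers:
      tuwf->done;
    }
  };

  TUWF::hook after => sub {
    # Runs even when a before hook called tuwf->done.
    tuwf->resHeader('X-Frame-Options' => 'DENY');
  };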

TUWF::load(@modules)

Loads the listed module names and optionally imports their exported functions to the TUWF::Object namespace (see the "import_modules" setting). The modules must be available in any subdirectory in @INC.

  # make sure the website modules are available from @INC
  use lib 'mylib';
  
  # load mylib/MyWebsite/HomePage.pm
  TUWF::load('MyWebsite::HomePage');
  
  # load two other modules
  TUWF::load('MyWebsite::Forum', 'MyUtilities');

Note that your modules must be proper Perl modules. That is, they should return a true value (usually done by adding 1; to the end of the file) and they should have the correct namespace definition.

TUWF::load_recursive(@modules)

Works the same as TUWF::load(), but this also loads all submodules.

  # the following will load MyWebsite.pm (if it exists) and all modules below
  # the MyWebsite/ directory (if any).
  TUWF::load_recursive('MyWebsite');

Note that all submodules must be in the same parent directory in @INC.

TUWF::options($path, $sub)

Register a handler for OPTIONS $path. Equivalent to TUWF::any(['options'], @_).

TUWF::patch($path, $sub)

Register a handler for PATCH $path. Equivalent to TUWF::any(['patch'], @_).

TUWF::post($path, $sub)

Register a handler for POST $path. Equivalent to TUWF::any(['post'], @_).

TUWF::put($path, $sub)

Register a handler for PUT $path. Equivalent to TUWF::any(['put'], @_).

TUWF::register(regex => subroutine, ..)

This is a legacy function, and only exists for backwards compatibility. It is similar to TUWF::any(), with the following differences:

TUWF::set(key => value, ..)

Get or set TUWF configuration variables. When called with only one argument, will return the configuration variable with that key. Otherwise the number of arguments must be a multiple of 2, setting the configuration parameters.

Note that most settings don't like to be changed from within route handlers and other callbacks. You should only need to set configuration variables during initialization.
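
Both calling conventions can be sketched as follows; the log file path is a placeholder.

  use strict;
  use warnings;
  use TUWF;

  # Set several configuration variables at once (an even number of
  # arguments, given as key/value pairs).
  TUWF::set(
    debug   => 1,
    logfile => '/tmp/mysite.log',   # placeholder path
  );

  # Read a single variable back by passing only its key.
  my $debug = TUWF::set('debug');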

content_encoding

Set the default output encoding. Supported values are none, gzip, deflate, auto. See TUWF::Response for more information. Default: auto.

cookie_defaults

When set to a hashref, will be used as the default options to resCookie(). These options can still be overruled by each individual call to resCookie(). This can be useful when globally setting the cookie domain:

  TUWF::set(cookie_defaults => { domain => '.example.org' });
  tuwf->resCookie(foo => 'bar');
  
  # is equivalent to:
  tuwf->resCookie(foo => 'bar', domain => '.example.org');
  # for each call to resCookie()

Default: undef (disabled).

cookie_prefix

When set to a non-empty string, its value will be used as prefix to all cookie names used by reqCookie() and resCookie(). reqCookie() will act as if all cookies not having the configured prefix never existed, and removes the prefix when used in list context. resCookie() will simply add the prefix to all outgoing cookies. Default: undef (disabled).

db_login

Sets the login information for the TUWF::DB functions. Can be set to either an arrayref or a subroutine reference.

In the case of an arrayref, the array should have three elements, containing the first three arguments to DBI::connect(). Do not include the last options argument, TUWF will set the appropriate options itself. When necessary, however, it is still possible to set options using the DSN string itself, see the DBI documentation for more information. TUWF::DB will automatically enable the unicode/utf8 flag for DBD::mysql, DBD::Pg and DBD::SQLite.

When setting this to a subroutine reference, the subroutine will be called when connecting to the database, with the main TUWF object as only argument. The subroutine is expected to return a DBI instance. It is the responsibility of the subroutine to set the correct DBI options. In particular, it is important to have RaiseError enabled and AutoCommit disabled. It is also recommended to enable unicode support if your database driver has such an option.
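
The two forms could look like this. The DSN, username and password are placeholders, and a real script would use one form or the other, not both. pg_enable_utf8 is the DBD::Pg attribute for Unicode support; other drivers use different names.

  use strict;
  use warnings;
  use TUWF;
  use DBI;

  # Arrayref form: DSN, username, password. TUWF sets the options itself.
  TUWF::set(db_login => ['dbi:Pg:dbname=mysite', 'mysite_user', 'secret']);

  # Subroutine form: the callback creates the handle and sets the options.
  TUWF::set(db_login => sub {
    my $self = shift;  # the main TUWF object
    return DBI->connect(
      'dbi:Pg:dbname=mysite', 'mysite_user', 'secret',
      { RaiseError => 1, AutoCommit => 0, pg_enable_utf8 => 1 },
    );
  });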

Default: undef (disabled).

debug

Set to a true value to enable debug mode. When debug mode is enabled and logfile is specified, TUWF will log page generation times for each request. This flag can be easily read through the debug() method, so you can also use it in your own code. Default: 0 (disabled).

error_400_handler

Similar to error_404_handler, but is called when something in the request data did not make sense to TUWF. In the current implementation, this only happens when the request data contains non-UTF8-encoded text. A warning is written to the log file when this happens.

WARNING: The warning of "error_500_handler" applies here as well.

error_404_handler

Set this to a subroutine reference if you want to write your own 404 error page. The subroutine will be called with the TUWF object as only argument, and is expected to generate a response.

error_405_handler

Similar to error_404_handler, but is called when the HTTP request method is something other than HEAD, GET or POST. These requests are usually generated by bots or applications which don't actually read the response contents, so overriding the default 405 error page makes little sense in most situations. If you do override it, do not forget to add an Allow HTTP header to the response, as required by the HTTP standard.

WARNING: The warning of "error_500_handler" applies here as well.

error_413_handler

Similar to error_404_handler, but is called when the POST body exceeds the configured max_post_body.

WARNING: The warning of "error_500_handler" applies here as well.

error_500_handler

Set this to a subroutine reference if you want to write your own 500 error page. The subroutine will be called with the TUWF object as first argument and the error message as second argument. When logfile is set, a detailed error report will be written to the log. It is recommended to ignore the error message passed to your subroutine and to enable the log file, so you won't risk sending sensitive information to your visitors.

WARNING: When this handler is called, the database and request objects may not be in a usable state. This handler may also be called before any of the hooks have been run. It's best to keep this handler as simple as possible, and only have it generate a friendly response.

http_server_port

Port to listen on when running the standalone HTTP server. This defaults to the TUWF_HTTP_SERVER_PORT environment variable, or 3000 if it is not set.

import_modules

This setting controls whether TUWF::load() and TUWF::load_recursive() will import all public functions into the TUWF::Object namespace.

For example, with import_modules enabled, a module can add a tuwf->htmlFramework() method as follows:

  package My::Package;
  use TUWF;
  use Exporter 'import';

  our @EXPORT = ('htmlFramework');
  sub htmlFramework {
    # ..
  }

The same can still be done with import_modules set to a false value, but it will work slightly differently:

  package My::Package;
  use TUWF;

  sub TUWF::Object::htmlFramework {
    # ..
  }
  # Or:
  *TUWF::Object::htmlFramework = sub {
    # ..
  };

This setting defaults to true, but that's only for historical reasons. The latter style is recommended for new projects.

logfile

To enable logging, set this to a string that indicates the path to your log file. The file must of course be writable by your script. TUWF automatically logs all Perl warnings, and when one of your callbacks throws an exception a full request dump with useful information will be logged, allowing you to easily locate and fix the problem. You can also write information to the log yourself using the log() method. Default: undef (disabled).

log_format

Set to a subroutine reference to influence the default log format. The subroutine is passed three arguments: the main TUWF object, the URI of the current request (or '[init]' if log() was called outside of a request), and the log message. The subroutine should return the string to be written to the log, including trailing newline.

Be warned that your subroutine can be called even when no request is being processed or before some resources have been initialized, so you should avoid using such resources. In particular, do not call any database functions from this subroutine, as the database connection may not be in a defined state. Any Perl warnings generated by this subroutine will not be logged in order to avoid infinite recursion.
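
A formatter following these rules might look like the sketch below. The subroutine name and the chosen line layout are assumptions; it relies only on core functions, so it does not touch the database or any request state.

  use strict;
  use warnings;
  use POSIX 'strftime';

  # Receives the TUWF object, the request URI (or '[init]') and the message.
  sub my_log_format {
    my ($self, $uri, $msg) = @_;
    # Timestamp, URI and message on one line, with a trailing newline.
    return sprintf "%s %s: %s\n",
      strftime('%Y-%m-%d %H:%M:%S', localtime), $uri, $msg;
  }

  # Registration during initialization:
  #   TUWF::set(log_format => \&my_log_format);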

log_slow_pages

Setting this to a number will log all pages that took longer to generate than the time indicated in the number, in milliseconds. The format of the log line is the same as used when the debug option is enabled. This option is ignored when debug is enabled, since in that case all pages will be logged anyway. Default: 0 (disabled).

log_queries

Setting this to a true value will write all database queries to the log file. Useful when debugging queries, but can generate a lot of data. Default: 0 (disabled).

mail_from

The default value of the From: header of mail sent using mail(). Default: <noreply-yawf@blicky.net>.

mail_sendmail

The path to the sendmail command, used by mail(). Default: /usr/sbin/sendmail.

max_post_body

Maximum length of the contents of POST requests, in bytes. This prevents people from uploading large files and potentially causing your script to run out of memory. Set to 0 to disable this limit. Default: 10MB.

mime_types

Hash of file extension to MIME type. This is used by resFile() to set the appropriate Content-type header. The default already comes with a few extensions, but you can easily add or override your own using:

  TUWF::set('mime_types')->{exe} = 'application/x-msdownload';

mime_default

The default MIME type for extensions not covered in mime_types.

pre_request_handler

(Deprecated) Similar to a before hook, see TUWF::hook(). Unlike the before hook, this subroutine should not call tuwf->done or tuwf->pass, but instead return a false value to send the response and prevent further processing, or return a true value to continue processing further handlers.

post_request_handler

(Deprecated) Equivalent to an after hook, see TUWF::hook(). One notable difference is that this callback will not run when a before hook or request handler threw an exception or called tuwf->done().

validate_templates

Hashref, templates for the kv_validate() function when called using formValidate(). The recommended way to add new templates is to call TUWF::set() with a single argument:

  TUWF::set('validate_templates')->{$key} = \%validate_options;

xml_pretty

Passed to the pretty option of TUWF::XML->new(). See TUWF::XML. Default: 0 (disabled).

TUWF::run()

After TUWF has been initialized, all modules have been loaded and all URIs have been registered, the last thing that remains is to execute TUWF::run(). This function will start processing requests and call the appropriate callbacks at the appropriate stages.

Whether this function ever returns or not depends on the environment your script is running in; if you're running your script in a CGI environment, TUWF::run() will return as soon as the request has been processed. If, on the other hand, you are running the script as a FastCGI script or standalone server, it will keep waiting for new incoming requests and will therefore never return. It is a bad idea to assume either way, so you want to avoid putting any run-time code after calling TUWF::run().

BASIC METHODS

This section documents the basic TUWF object methods provided by TUWF.pm. The TUWF object provides many other methods as well, which are implemented and documented in the various sub-modules. See the documentation of each sub-module for the methods it provides.

capture(key)

Returns the capture group from the route regex. Both positional captures and named captures can be used, for example:

  TUWF::get qr{/user/(?<uid>[1-9][0-9]*)} => sub {
    # capture(1) is equivalent to $1 from the regex.
    my $uid = tuwf->capture(1);

    # capture('uid') is equivalent to $+{uid} from the regex,
    # so this returns the same value as above.
    my $same_uid = tuwf->capture('uid');
  };

Note that capture(0) is not available; this would be equivalent to tuwf->reqPath, save for the leading slash.

debug()

Returns the value of the debug setting.

done()

Calling this method will immediately abort the current handler, run the after hooks, and output the response. When called from a before hook, this will prevent running any further before hooks or request handlers.

Calling this method from a request handler is equivalent to a normal return from the handler, but it can still be useful to force a response from a nested function call, e.g.:

  sub require_admin {
    if(!user_is_admin()) {
      # Generate a friendly error page here.
      # ...and send it to the client:
      tuwf->done;
    }
  }

  TUWF::get '/admin' => sub {
    require_admin();
    # At this point we can be sure that the user is an administrator, and
    # continue to generate our page.
  };

Calling this method from an after hook has no effect other than prematurely aborting that particular hook.

This method calls die(), so be sure to re-throw the error when run inside an eval block.

log(message)

Writes a message to the log file configured with logfile. When no log file is configured, log() will do nothing. The message argument may contain newlines, which will be nicely (re-)formatted before logging, in order to avoid ambiguity with other log entries. By default the log message will be prefixed with the date and URI of the request, but this can be changed with the log_format setting.

This function is not used very often in practice, since it is easier to simply use Perl's warn() function instead. TUWF automatically writes all warnings to the log file.

pass()

Calling this method will immediately abort the current handler and move on to the next handler (if any). Any side effects (e.g. setting response headers, generating output) will remain. If the final handler calls pass(), then any response data is discarded and a 404 response is generated instead.

Calling this method from an after hook has no effect other than prematurely aborting that particular hook.

This method calls die(), so be sure to re-throw the error when run inside an eval block.
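
Since done() and pass() are implemented with die(), code that wraps part of a handler in an eval block for cleanup purposes has to propagate whatever was thrown. A minimal sketch, in which with_cleanup() is a hypothetical helper and not part of TUWF:

  use strict;
  use warnings;

  # Run $code, always run $cleanup afterwards, and re-throw any exception
  # (including the one used internally by done() and pass()).
  sub with_cleanup {
    my ($code, $cleanup) = @_;
    my $ok  = eval { $code->(); 1 };
    my $err = $@;   # save it before the cleanup code can overwrite $@
    $cleanup->();
    die $err if !$ok;
    return;
  }

  # Inside a route handler this could be used as:
  #   with_cleanup(sub { require_admin() }, sub { unlink $tmpfile });

Re-throwing unconditionally keeps the helper independent of how TUWF represents these exceptions internally.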

SERVER CONFIGURATION

Since a website written using TUWF consists of a single Perl script that acts as the main script for your site, the only thing you have to do is to run the script, or tell your webserver to run it for you. There are generally two things you should take care of:

  1. Make sure you have the right modules installed: CGI is supported out of the box, but for FastCGI you will need FCGI and for the standalone web server, you'll need HTTP::Server::Simple.
  2. You have to make sure all requests to non-existing files are passed to your script, in order for the URI rewriting in TUWF to work.

The following examples show how to configure your server to run the examples/singlefile.pl script from the TUWF distribution. I assume the TUWF distribution is unpacked in /tuwf and the site runs on the hostname test.example.com.

Standalone

The easiest way to run a TUWF project is to simply... run it. TUWF will automatically start an HTTP server on port 3000, and your site will be accessible on http://localhost:3000/.

This mode is intended for development use only, and it can only process a single request at a time.

Examples for Apache (2.2 or 2.4)

CGI mode:

  <VirtualHost *:*>
    ServerName test.example.com
    DocumentRoot /tuwf/examples
    AddHandler cgi-script .pl

    # %{REQUEST_FILENAME} does not seem to always work inside a <VirtualHost>
    # But it should be equivalent to "%{DOCUMENT_ROOT}/%{REQUEST_URI}"
    RewriteEngine On
    RewriteCond "%{DOCUMENT_ROOT}/%{REQUEST_URI}" !-s
    RewriteRule ^/ /singlefile.pl
  </VirtualHost>

It is possible to move the mod_rewrite statements into a .htaccess file, in which case you can remove the Rewrite* lines in the above example and put the following in your .htaccess file:

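  RewriteEngine On
  RewriteCond %{REQUEST_FILENAME} !-s
  # In .htaccess context the leading slash is stripped from the path
  # before matching, so match any path with "^" instead of "^/".
  RewriteRule ^ /singlefile.pl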

FastCGI mode, using mod_fcgid. With this configuration it is possible to have the documentroot point to a different directory than where the TUWF script resides, which could improve security.

  <VirtualHost *:*>
    ServerName test.example.com
    DocumentRoot /whatever/you/want
    AddHandler fcgid-script .pl
    FcgidWrapper /tuwf/examples/singlefile.pl virtual

    # same as above example, except 'singlefile.pl' can be anything,
    # as long as it ends with '.pl'
    RewriteEngine On
    RewriteCond "%{DOCUMENT_ROOT}/%{REQUEST_URI}" !-s
    RewriteRule ^/ /singlefile.pl
  </VirtualHost>

Again, it is possible to move the rewrites into a .htaccess. All of the above examples assume the referenced directories have the appropriate options set using a <Directory> clause.

Examples for lighttpd (1.4)

CGI mode:

  $HTTP["host"] == "test.example.com" {
    server.document-root = "/tuwf/examples"
    cgi.assign = ( ".cgi" => "" )
    server.error-handler-404 = "/singlefile.pl"
  }

FastCGI:

  fastcgi.server = (
    ".singlefile" => ((
      "socket"            => "/tmp/perl-tuwf-singlefile.socket",
      "bin-path"          => "/tuwf/examples/singlefile.pl",
      "check-local"       => "disable"
    ))
  )

  $HTTP["host"] == "test.example.com" {
    server.document-root = "/whatever/you/want"
    cgi.assign = ( ".cgi" => "" )
    server.error-handler-404 = "/something.singlefile"
  }

SEE ALSO

TUWF::Intro, TUWF::DB, TUWF::Misc, TUWF::Request, TUWF::Response, TUWF::XML.

The homepage of TUWF can be found at https://dev.yorhel.nl/tuwf.

TUWF is available on a git repository at https://g.blicky.net/tuwf.git/.

COPYRIGHT

Copyright (c) 2008-2018 Yoran Heling.

This module is part of the TUWF framework and is free software available under the liberal MIT license. See the COPYING file in the TUWF distribution for the details.

AUTHOR

Yoran Heling <projects@yorhel.nl>
