NAME

Daizu::Gen - default generator class

DESCRIPTION

This class, and subclasses of it, are responsible for deciding which URLs should be created (generated) from each file or directory in a working copy, and generating the output which will be served for those URLs. This class itself is used by default, but you can use a different generator class by setting the daizu:generator property to the name of a Perl class. If you set it on a file, it will affect only that file. If you set it on a directory then it will affect that directory and all its descendants, unless they themselves have a daizu:generator property.

The name of the generator class used for each file and directory is stored in the generator column of the wc_file table in the database.

When an object of a generator class is instantiated, it must be given a 'root file', which is the file on which the daizu:generator property was set (or a top-level file or directory, if no such property applies).

This class creates URLs based on the daizu:url property and the names of files and directories. The resulting URLs are similar to those the files would have if they were served directly from the filesystem by a web server. Files with names like _index.html (anything starting with _index followed by a dot) are special in that the filename will not appear as part of the URL. Instead the URL will end with a trailing slash (/).

With this generator class only files generate URLs. Directories are ignored, except when a sitemap XML file is configured as described below.

CONFIGURATION

The only configuration information which this generator currently makes use of is the xml-sitemap element shown here:

    <config path="example.com">
     <generator class="Daizu::Gen">
      <xml-sitemap />
     </generator>
    </config>

The sitemap URL will be generated from the directory at the path indicated. It must be a directory, not a plain file. In this case, the sitemap is likely to have a URL like http://example.com/sitemap.xml.gz. You can give this URL to Google, or any other search engine which supports the sitemaps format, to help their robots find URLs on your website.

The xml-sitemap element may have an optional url attribute, which should be a relative or absolute URL at which to publish the sitemap file. Its default value is sitemap.xml.gz.

SUBCLASSING

To write your own generator class, inherit from this one and override some of the following methods:

custom_base_url

If you want to modify the basic URL scheme then you might want to provide your own algorithm for deciding what URLs to use. You could instead override base_url itself, but usually it's best to leave that alone. It will handle URLs explicitly set with the daizu:url property, ignore things in _hide directories, and just call your custom_base_url method for the rest.

custom_urls_info

You would only need to override this if you want to make fairly big changes to the URL scheme. If you just want to change the URLs of a particular type of file then you might be able to do that by overriding one of the simpler *_urls_info functions listed next. The base-class implementation of this function just chooses between them.

You almost certainly don't want to override urls_info, since that's just a wrapper around this function which tidies up the results.

article_urls_info, unprocessed_urls_info, dir_urls_info, root_dir_urls_info

Override one or more of these to change which URLs are produced for particular types of files, such as articles or directories. For example the blog generator overrides root_dir_urls_info to add URLs for the blog homepage, feeds, etc.

article_template_overrides, article_template_variables

These are called by the article method. The base-class ones don't do anything, but you can override them to provide extra information to the templates or to replace a standard template with a different one (if you want to change one aspect of the page structure for your articles). Doing this should allow you to avoid writing your own article generator method.

navigation_menu

Override this to change the menu items which will be displayed by the nav_menu.tt template. Of course if you want to provide a radically different kind of navigation then you may need to replace that template with a different one. If you do that, it's probably a good idea to override this method with one that does no work, to avoid generating menu items which won't be used.

The constructor can accept additional options, and will just store them in the object hash, so you probably won't need to override that.

METHODS

Daizu::Gen->new(%options)

Return a new generator object. Requires the following options:

cms

A Daizu object.

root_file

A Daizu::File object for the file on which this generator was specified, or a top-level directory if there was no specification of which generator was in use. So usually this file will have a daizu:generator property naming this class.

config_elem

The XML DOM node (an XML::LibXML::Element object) of a generator element in the Daizu CMS configuration file, or undef if there is no appropriate configuration provided.

$gen->base_url($file)

Return a single URL for $file, as a URI object. This 'base URL' is typically used as the basis for any other URLs the file might generate.

Files with a daizu:url property will take that as their base URL.

Directories can have base URLs even if they don't actually generate any URLs in the publication process, since those URLs are used to build URLs for any content they contain. Directory URLs end in a forward slash.

Files with names starting with _index. have a base URL identical to their parent directory.

Returns undef if there is no URL for this file. This can happen if the file's name is _hide or _template, or if it is contained in a directory with a name like that, or if there is no daizu:url property for the file or any of its ancestors.

Subclasses should typically not override this, but instead override custom_base_url(), as the blog generator does for example.

$gen->custom_base_url($file)

Override this method in a subclass if you want to use a custom URL scheme, for example one based on publication dates instead of file and directory names.

This method is called by base_url(). By the time it has been called, checks have already been done for the daizu:url property, the special names like _hide, and the base URL of the parent directory, if any. If these don't determine the URL, or absence of one, then the custom_base_url() method should supply one, or return undef if the file shouldn't have a base URL.

If this is called then $file is guaranteed to have a parent, but its parent's base URL hasn't been determined, so it may not have one.

The default implementation just uses the base URL of the parent and the name of the file or directory in the obvious way.
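
Taken together, the naming rules described for base_url() and the default custom_base_url() can be sketched in a few lines of plain Perl. This is an illustration only, not Daizu's code: sketch_default_url is a hypothetical helper, it works on plain strings rather than URI objects, and it ignores the daizu:url property, the _hide and _template names, and any escaping of file names.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Daizu.
# $parent_url: base URL of the containing directory, ending in '/',
#              or undef if the parent has no base URL.
# $name:       name of the file or directory.
# $is_dir:     true if $name is a directory.
sub sketch_default_url {
    my ($parent_url, $name, $is_dir) = @_;

    # No parent URL means there is nothing to build on.
    return unless defined $parent_url;

    # _index.* files share the URL of their parent directory.
    return $parent_url if !$is_dir && $name =~ /^_index\./;

    # Everything else appends its name; directory URLs end in a slash.
    return $parent_url . $name . ($is_dir ? '/' : '');
}

print sketch_default_url('http://example.com/docs/', 'intro.html', 0), "\n";
# http://example.com/docs/intro.html
print sketch_default_url('http://example.com/docs/', '_index.html', 0), "\n";
# http://example.com/docs/
print sketch_default_url('http://example.com/docs/', 'images', 1), "\n";
# http://example.com/docs/images/
```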

$gen->urls_info($file)

Return a list of URLs generated by $file (a Daizu::File object). May return nothing if the file doesn't generate any URLs.

This method calls the base_url() and custom_urls_info() methods to do the actual work. All it does is resolve relative URLs and fill in some missing information, so you're more likely to need to override those two, or one of the *_urls_info methods below, if you want to build a new generator class with a different URL scheme. This is what the Daizu::Gen::Blog generator does.

Each URL value returned is actually a reference to a hash containing the following keys, which are all required:

url

The actual URL as a URI object. This will always be an absolute URL.

generator

The name of the class of generator which was used to create these URLs.

method

The name of the method which should be called to generate the output for this file at this URL.

TODO - reference to docs for API of generator methods

argument

Some value which determines exactly which one of a set of URLs of the same basic type this is. For example if there were several URLs for an article, one for each of several pages, then they would probably have the same generator and method, but the page number would be stored as the argument.

The argument is always defined. It will be the empty string if custom_urls_info() didn't supply an argument value.

type

The MIME type which the resource should be served with.

This method returns nothing if the file has no URLs, for example if it has no base URL (which might happen if it is in an _hide directory).

$gen->custom_urls_info($file)

This is called by the urls_info() method above, and does the actual work of supplying the URLs. It should also return a list of hashes for the URLs generated by $file, but is allowed to be a bit more lazy. Its return value may differ from that of urls_info() in the following ways (although it is permissible for this method to return exactly the same values as urls_info() if it wishes):

  • The url value doesn't have to be an absolute URL, and doesn't have to be a URI object. If the URL desired is the same as the value returned by the base_url() method, then this value can simply be the empty string.

  • The generator value may be omitted or undefined, in which case it will default to the class name of $gen.

  • The argument value may be omitted or undefined, in which case it will default to the empty string.
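
The tidying-up implied by the list above might look roughly like the following. This is only a sketch under stated assumptions, not the urls_info() implementation: fill_url_info_defaults is a hypothetical helper, the base URL is passed in as a plain string, and relative URLs other than the empty string are left untouched, whereas the real method resolves them into absolute URI objects.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Daizu.
# $info:      one hash reference as returned by custom_urls_info().
# $gen_class: class name of the generator (stands in for ref($gen)).
# $base_url:  the file's base URL, as a plain string in this sketch.
sub fill_url_info_defaults {
    my ($info, $gen_class, $base_url) = @_;
    my %full = %$info;    # copy, so the caller's hash is not modified

    # An empty url means "the same as the base URL".
    $full{url} = $base_url
        if defined $full{url} && $full{url} eq '';

    # Missing or undefined generator and argument get their defaults.
    $full{generator} = $gen_class unless defined $full{generator};
    $full{argument}  = ''         unless defined $full{argument};

    return \%full;
}

my $lazy = { url => '', method => 'article', type => 'text/html' };
my $full = fill_url_info_defaults($lazy, 'Daizu::Gen', 'http://example.com/a/');
print "$full->{url} $full->{generator} [$full->{argument}]\n";
# http://example.com/a/ Daizu::Gen []
```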

The Daizu::Gen implementation of the method simply calls the four *_urls_info methods listed next as appropriate, so usually subclasses should override those instead of this method.

$gen->article_urls_info($file)

Return a list of URLs for an article. $file must be a Daizu::File object for a file which is an article. Uses the article_urls() method in Daizu::File to do the work, so this is just a simple wrapper to allow subclasses to override it.

The return value is as specified for custom_urls_info().

$gen->unprocessed_urls_info($file)

Return a list of URLs for the non-article non-directory file in $file, which must be a Daizu::File object.

This base-class implementation returns a single URL which uses the unprocessed() method in this class.

The return value is as specified for custom_urls_info().

The content type, if not defined by the file, will default to application/octet-stream.

$gen->dir_urls_info($file)

Return a list of URLs for the directory specified by $file, which should be a Daizu::File object. This base-class implementation returns no URLs.

The return value is as specified for custom_urls_info().

$gen->root_dir_urls_info($file)

Return a list of URLs for the directory $file, which should be a Daizu::File object for the root directory of the generator (the directory which has the daizu:generator property or a top-level directory). This base-class implementation returns no URLs unless the configuration specifies that an XML sitemap should be published, in which case it returns a single URL for the sitemap file, using the xml_sitemap() method.

If a file, rather than a directory, has a daizu:generator property, then this method isn't called and the file isn't distinguished in any way for being the 'root file'.

The return value is as specified for custom_urls_info().

If you override this to add other URLs you can still allow sitemaps to be published from the root directory by calling the superclass version, like this:

    sub root_dir_urls_info
    {
        my ($self, $file) = @_;
        my @url = $self->SUPER::root_dir_urls_info($file);

        # Add your own URLs here:
        push @url, { ... };

        return @url;
    }

$gen->generate_web_page($file, $url, $template_overrides, $template_vars)

Use Template Toolkit to do the generation of the content for $file into the URL in $url (which must be a reference to a URL info hash). $template_vars should be a reference to a hash, and is passed to the template, as are the values 'cms' (the Daizu object), 'file' ($file), and 'url' ($url).

If $template_overrides is defined it should be a reference to a hash containing template rewriting instructions. Whenever a template is loaded its name will be looked up in the hash. If an entry is found, the template named by the corresponding value is loaded instead of the original template.
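
The lookup described above amounts to a single hash access. The sketch below shows the idea in isolation. resolve_template_name is a hypothetical helper rather than anything in Daizu, the template names other than nav_menu.tt are made up, and it assumes a replacement name is not itself looked up again, which the description above does not settle.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Daizu.
# Returns the name of the template to load in place of $name.
sub resolve_template_name {
    my ($overrides, $name) = @_;

    # No overrides at all, or none for this template: load it as asked.
    return $name
        unless defined $overrides && exists $overrides->{$name};

    return $overrides->{$name};
}

my $overrides = { 'nav_menu.tt' => 'my_nav_menu.tt' };
print resolve_template_name($overrides, 'nav_menu.tt'), "\n";   # my_nav_menu.tt
print resolve_template_name($overrides, 'page.tt'), "\n";       # page.tt
```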

Daizu::TTProvider is used for loading the templates, so they will get loaded directly from the working copy $file is from.

TODO - exactly what format do these URL hashes have to be in? There are several alternatives in use in various places now. Ah, no, these ones need a filehandle at least.

$gen->article_template_overrides($file, $url_info)

Returns a reference to a hash of template rewriting instructions for articles. Each key should be the name of a template which is expected to be loaded (perhaps by a Template Toolkit INCLUDE directive), and the value is the name of a different template which should be loaded instead.

These rewrites will be done for all articles generated by the article() method.

The base-class implementation returns an empty hash reference.

$gen->article_template_variables($file, $url_info)

Returns a reference to a hash of template variable values which should be passed to Template Toolkit when an article page is generated. The keys should be the names of variables which are expected to be present by a template, and the values are passed in as-is.

The base-class implementation returns an empty hash reference.
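
A subclass that overrides both of these hooks might look something like this. The class name My::Gen, the replacement template my_nav_menu.tt, and the site_slogan variable are all invented for illustration; only nav_menu.tt is a template named in this documentation.

```perl
package My::Gen;    # hypothetical subclass name

use strict;
use warnings;

# -norequire lets this sketch compile without Daizu installed. Real code
# would say "use parent 'Daizu::Gen';" so that the base class is loaded.
use parent -norequire, 'Daizu::Gen';

# Load a different navigation template for every article page.
sub article_template_overrides {
    my ($self, $file, $url_info) = @_;
    return { 'nav_menu.tt' => 'my_nav_menu.tt' };    # made-up replacement
}

# Pass an extra variable through to the article templates.
sub article_template_variables {
    my ($self, $file, $url_info) = @_;
    return { site_slogan => 'Example slogan' };      # made-up variable
}

1;
```

Because the article() method calls both hooks, a subclass like this should not need its own article generator method.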

$gen->url_updates_for_file_change($wc_id, $guid_id, $file_id, $status, $changes)

This is called by the publishing code in Daizu::Publish when a file has been changed. It should return a reference to an array of GUID IDs for files which should have their own URLs updated. The URLs for the file which has changed are always updated anyway.

This is used in the Daizu::Gen::Blog generator, for example, to ensure that new URLs appear for archive pages the first time a new article is published in a given month.

$status will be A when a new file has been added to the content repository, M when an existing file has been modified in some way, and D when it has been deleted. If the status is D then the live working copy will no longer have information about this file, so $file_id will be undef, and this method will be called on a generator object with a 'fake' root file (so don't expect to be able to do anything with the root_file value in the generator object).

Note that there must always be an array reference returned, even if it's an empty array.

$changes will be a reference to a hash containing various keys with information about the changes that were made to the file since the last time the sites were updated. Most keys are the names of Subversion properties which have been changed. The values for those will be the old value of the property. Unless a file has been deleted, the new values can be looked up in the live working copy. For files which have been added, the property values supplied will all be undef, since there were no old values.

There are also some values in the $changes hash with special names. These all start with underscores. (If there are any real properties whose names start with underscore, changes to them won't be registered.) The following special values are available:

_status

Same as $status.

_new_issued

A DateTime value, containing the publication time of the file in its new state. Present only for newly added files, and for modified files whose publication time has changed. The value will be based on either the dcterms:issued property or the time at which the file was first committed.

_old_issued

Same as _new_issued, except that it refers to the publication time before the changes we are considering. Available only for deleted files or modified files where the dcterms:issued property was changed.

_article_url and _urls

TODO - these aren't implemented yet

An entry for _urls is present (with a value which is always undef) if any of the URLs for the file have been changed. The same applies to _article_url except that it is only present if the URL for the first page of an article has been changed (one with a method of article and no argument).

_old_article and _new_article

These keys are always present, no matter what value $status has. The value is either 0 or 1, to indicate false or true respectively. They are true only if the file was (for _old_article) or now is (for _new_article) an article.

_old_path and _new_path

The full path of the file in Daizu working copies before and after the changes. If the file has been added or deleted then only one of these will be present.

_content

TODO - this may be removed in the future for performance reasons, and some other way of getting the information provided.

This method is called before any URL updating has actually been done, even for the file it is called for.

This particular implementation of the method forces URL updates when the file has had its daizu:url or daizu:generator properties changed.
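
When reading the $changes hash in an override, two details from the description above matter: the special keys all start with an underscore, and the old value of a property is undef for newly added files, so exists is a safer test than defined. The helpers below are hypothetical and not part of Daizu. They only sketch how such a hash could be inspected.

```perl
use strict;
use warnings;

# Hypothetical helper: names of the changed Subversion properties,
# leaving out the special keys that start with an underscore.
sub changed_properties {
    my ($changes) = @_;
    return sort grep { !/^_/ } keys %$changes;
}

# Hypothetical helper: true if $property is recorded as changed.
# exists() is used because the old value may legitimately be undef.
sub property_changed {
    my ($changes, $property) = @_;
    return exists $changes->{$property} ? 1 : 0;
}

# Example data in the shape described above, for a newly added file.
my $changes = {
    _status          => 'A',
    _old_article     => 0,
    _new_article     => 1,
    'daizu:url'      => undef,
    'dcterms:issued' => undef,
};

print join(', ', changed_properties($changes)), "\n";
# daizu:url, dcterms:issued
print property_changed($changes, 'daizu:generator'), "\n";
# 0
```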

$gen->publishing_for_file_change($wc_id, $guid_id, $file_id, $status, $changes)

This method is called by the publishing code when a file has been changed, to see if any extra URLs need to be republished to reflect the changes made. All the URLs for any modified files are republished anyway.

The return value should be a reference to an array of URLs (either as strings or URI objects) which Daizu knows how to publish. It should always return an array reference even if it's empty.

$changes is a reference to a hash, in the same format as for the url_updates_for_file_change() method.

This method is called after the URLs for all modified files have been updated, but before any publication takes place.

This particular implementation publishes files which may reference the changed file in their navigation menu, if the file's title, short-title, or URL have been changed. It won't always get every file which could be affected though.

$gen->publishing_for_url_change($wc_id, $status, $old_url_info, $new_url_info)

This is called by the publishing code when a URL has been changed. It should indicate any URLs which need publishing in addition to the ones which have actually changed.

The return value should be a reference to an array of URLs (either as strings or URI objects) which Daizu knows how to publish. It should always return an array reference even if it's empty.

The values of $old_url_info and $new_url_info will be either undef (if not available) or a reference to a URL info hash, including the actual URL as a URI object in the url key.

This method will be called on the generator specified for the new URL, except when an old URL has been deactivated.

The value of $status will be one of the following:

A for 'activated'

A new URL has appeared which wasn't previously published. In this case the new URL's information will be supplied, and there will be no old URL info.

M for 'modified'

A URL has been changed (as in, Daizu thinks that what was previously available at the old URL is now being published at the new one). In this case Daizu will generate a redirect. It will supply both the previous and new URL information to this method.

D for 'deactivated'

A URL which previously had content published by Daizu is no longer generated. Daizu will delete its content. The old URL information will be passed in, but obviously there isn't any new information.

This method is called after the URLs for all modified files have been updated, but before any publication takes place.

This base-class implementation always returns an empty array.

$gen->article($file, $urls)

A standard generator method for generating an article (a file with its daizu:type property set to article). It calls the generate_web_page() method to handle the templating. You can pass all the URLs for the different pages of a multi-page article in at once.

Subclasses can provide template rewriting and extra template variables by overriding the methods article_template_overrides() and article_template_variables().

$gen->unprocessed($file, $urls)

Generate an 'unprocessed' file. This is a standard generator method which simply prints the file's data to each URL's file handle.

$gen->xml_sitemap($file, $urls)

A standard generator method which generates an XML sitemap file, gzip compressed.

The XML namespace URL used in XML sitemaps is available in the variable $Daizu::Gen::SITEMAP_NS.

The format of XML sitemaps is documented here:

http://www.sitemaps.org/protocol.html

$gen->scaled_image($file, $urls)

A standard generator method which generates a scaled version of an image file. $file must represent an image in a format which can be understood by Image::Magick, unless the GUID ID value is included in the argument, in which case there must be a file with that GUID ID in the working copy which is of an appropriate type.

The argument should consist of two or three numbers: the desired width and height of the resulting image, and optionally the GUID ID of the image file if it isn't the file the URL is actually generated from. These should be separated by single spaces.
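
As an illustration of the argument format, the following sketch splits such a string into its parts. It is not the parser Daizu uses: parse_scaled_image_argument is a hypothetical helper, and it assumes that all three numbers are non-negative integers.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Daizu.
# Returns (width, height, guid_id); guid_id is undef when not supplied.
sub parse_scaled_image_argument {
    my ($argument) = @_;

    # Two or three integers, separated by exactly one space each.
    my ($width, $height, $guid_id) =
        $argument =~ /\A(\d+) (\d+)(?: (\d+))?\z/
        or die "bad scaled_image argument '$argument'\n";

    return ($width, $height, $guid_id);
}

my ($w, $h, $guid) = parse_scaled_image_argument('120 80 42');
print "width=$w height=$h guid=$guid\n";
# width=120 height=80 guid=42
```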

$gen->navigation_menu($file, $url)

Return a recursive data structure describing a suitable menu for displaying on a page associated with $file, which must be a Daizu::File object. $url is the URL info for the page being generated.

This is called from the default nav_menu.tt template to generate the menu to put in the right-hand column.

The menu will not include the homepage (because that is presumably already linked from the top of the page or something, and it would be a waste of an extra level in the menu), and will not include any 'retired' articles.

The return value is a reference to an array of zero or more hashes, each of which will contain the following keys:

The URL of the page the menu item refers to, relative to $url. That is, this may not be an absolute URL, but it should get you to the right place from the page this menu was intended for.

This value will not be present for a menu item which refers to the current URL, because that shouldn't be a link (it's bad usability practice to link to the current page, because people might wonder why nothing happened).

title

The full title of the page the item refers to, if any.

short_title

An alternative title which might be more suitable for display in a menu. It will usually be the same as title, but sometimes the user (or a plugin) might provide an abbreviated title which is better in this kind of context.

children

A reference to an array of zero or more hashes, in the same format as the top-level ones, for items which should be presented as 'children' of this menu item, typically as a nested list.
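
Because the structure is recursive, code that consumes it is most naturally recursive too. The sketch below walks a hand-built menu and prints it as an indented plain-text list. menu_as_text is a hypothetical helper, the sample data is invented, and only the title, short_title, and children keys are used.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Daizu.
# Renders a menu structure as an indented text list, one item per line.
sub menu_as_text {
    my ($items, $depth) = @_;
    $depth ||= 0;

    my $out = '';
    for my $item (@$items) {
        # Prefer the short title, falling back to the full one.
        my $label = $item->{short_title} // $item->{title} // '';
        $out .= ('  ' x $depth) . "- $label\n";

        # Children use the same format, so recurse one level deeper.
        $out .= menu_as_text($item->{children} || [], $depth + 1);
    }
    return $out;
}

# Invented sample data in the shape described above.
my $menu = [
    {
        title       => 'About this site',
        short_title => 'About',
        children    => [
            { title => 'Contact', short_title => 'Contact', children => [] },
        ],
    },
    { title => 'News', short_title => 'News', children => [] },
];

print menu_as_text($menu);
# - About
#   - Contact
# - News
```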

COPYRIGHT

This software is copyright 2006 Geoff Richards <geoff@laxan.com>. For licensing information see this page:

http://www.daizucms.org/license/