NAME

 Document::PageId - Page Identifier Class.

SYNOPSIS

 use Document::PageId;

 $obj            = Document::PageId->new   ($string);
 ($pageid,$rest) = Document::PageId->parse ($lexeme,$pflag);

 $string         = $obj->get;
 $string         = $obj->canonical;
 $pagenum        = $obj->pagenumber;
 $subpagenum     = $obj->subpagenumber;
 $side           = $obj->side;
 $side           = $obj->pagealpha;
 $flg            = $obj->ispairable;
 $flg            = $obj->haspagenumber;
 $flg            = $obj->hassubpagenumber;
 $flg            = $obj->haspagealpha;

Inheritance

 UNIVERSAL

Description

Page identifiers usually have a simple format like "001", but over the years I've found need for variants such as "010.01" and "000a". Not yet even thought about are the many locational descriptions for pageids which should be moved out of the basic filename: foo-a-toprt, foo-b-botleft, and all sorts of naming conventions I've used for cases where a single document page has been scanned as multiple sub-pages. It will probably be a major effort to sort this out.

Just as a warning... do not get in the habit of directly accessing the pageid ivar. This class and subclasses do lazy evaluation, so if you do not retrieve the value through the approved methods, you may not get what you expect.
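The warning above can be illustrated with a minimal sketch. Sketch::LazyPageId, its dirty flag, its set_pagenumber method and the fixed three digit width are all hypothetical and are not taken from Document::PageId; the point is only that a lazily rebuilt ivar can be stale when read directly.

```perl
use strict;
use warnings;

# Hypothetical class, not Document::PageId itself: the pageid ivar is only
# rebuilt when the get accessor is called after a component has changed.
package Sketch::LazyPageId;

sub new {
    my ($class, $pagenum) = @_;
    return bless { pagenum => $pagenum, pageid => undef, dirty => 1 }, $class;
}

sub set_pagenumber {
    my ($self, $n) = @_;
    $self->{pagenum} = $n;
    $self->{dirty}   = 1;      # pageid is stale until get is called
    return $self;
}

sub get {
    my ($self) = @_;
    if ($self->{dirty}) {
        # Assumed format: three digits, zero padded.
        $self->{pageid} = sprintf '%03d', $self->{pagenum};
        $self->{dirty}  = 0;
    }
    return $self->{pageid};
}

package main;

my $obj = Sketch::LazyPageId->new(1);
$obj->get;                       # the pageid ivar is now filled in
$obj->set_pagenumber(2);
my $stale = $obj->{pageid};      # direct ivar access sees the old value
my $fresh = $obj->get;           # the accessor rebuilds it first
```

Reading the hash slot directly skips the rebuild step, which is why the approved methods are the only safe way to retrieve the value.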

Examples

 None.

Class Variables

 None.

Instance Variables

 pageid            Page identification string, eg 050, 000a, 10.1, 100.01b.
                   Default is undef.
 pagenum           Integer page number part of page id, eg the 100 in 100.01b
 subpagenum        Integer subpage number, eg the 01 in 100.01b
 side              Page side, eg the b in 100.01b
 pn_digits         Number of digits in the page number. Not accessible at
                   present.
 spn_digits        Number of digits in the subpage number. Not accessible at
                   present.
 type              Type of format of this pageid. Internal use.

Class Methods

$obj = Document::PageId->new ($string)

Create a new pageid object around $string. It returns undef and does not create an object if $string cannot be parsed as a pageid.

Will I need a last page class as well? Perhaps subclass first and last? I might either detect the leading 'p' here and set a switch, or detect it outside and not worry about it. That might be the cleanest option. I will need a way to autodetect the field widths of pagenum and subpagenum. I still have not started to deal with segmented pages and their top/mid/bot/left/right nomenclature. Sometimes a and b are front and back; sometimes they are a fill-in for publications whose page 1 is many pages inside, so I have 00a...00g,01 as page numbers in the front.
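As a rough illustration of what parsing a pageid into its parts might involve, here is a minimal sketch. The helper split_pageid and its regular expression are hypothetical and are not the module's actual implementation; the hash keys simply mirror the ivar names listed above. It only handles the numeric forms such as 050, 000a, 10.1 and 100.01b, and records the digit counts as one possible way to autodetect field widths.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Document::PageId: split a pageid string
# into its parts, or return undef (in scalar context) if it does not match.
sub split_pageid {
    my ($string) = @_;
    return unless defined $string;

    # Assumed shape: digits, optional ".digits", optional lowercase letter.
    my ($pn, $spn, $side) = $string =~ /\A(\d+)(?:\.(\d+))?([a-z])?\z/
        or return;

    return {
        pagenum    => $pn + 0,
        subpagenum => defined $spn ? $spn + 0 : undef,
        side       => $side,
        pn_digits  => length $pn,
        spn_digits => defined $spn ? length $spn : 0,
    };
}

my $parts = split_pageid('100.01b');
printf "page %d, subpage %d, side %s\n",
    $parts->{pagenum}, $parts->{subpagenum}, $parts->{side};
```

Special names such as 000.spine are deliberately left out of this sketch and would need their own handling.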

($pageid,$rest) = Document::PageId->parse ($lexeme,$pflag)

Parse the string contained in $lexeme. If $pflag is set, page number formats assume a leading 'p', eg 'p001'. The default is to not require the leading 'p'.

If $lexeme contains a right justified pageid string, that is returned as $pageid and any remaining chars are placed in $rest. If no pageid is found, $pageid is undef and all of $lexeme is returned in $rest; if $lexeme is empty or undef to start with, both values are undef.
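The right justified matching described above might look roughly like the following sketch. The function parse_trailing_pageid is a hypothetical stand-in, not the module's parse method. It assumes that the leading 'p' is dropped from the returned pageid, that undef marks a missing value, and that only the numeric pageid forms are recognised.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parse class method, for illustration only.
# Returns ($pageid, $rest); undef for a missing value is an assumption.
sub parse_trailing_pageid {
    my ($lexeme, $pflag) = @_;
    return (undef, undef) unless defined $lexeme && length $lexeme;

    # Require a literal 'p' before the digits only when $pflag is set.
    my $prefix = $pflag ? 'p' : '';

    # The lazy (.*?) takes as little as possible, so the pageid is the
    # longest run at the right hand end that still looks like a pageid.
    if ($lexeme =~ /\A(.*?)${prefix}(\d+(?:\.\d+)?[a-z]?)\z/s) {
        return ($2, $1);
    }
    return (undef, $lexeme);
}

my ($pageid, $rest) = parse_trailing_pageid('foo-p010.01b', 1);
print "pageid=$pageid rest=$rest\n";
```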

Instance Methods

$string = $obj->canonical

Return the pageid in a canonical format. Currently this is the same as the verbatim string returned by get. This will be useful when I start constructing new pageids on the fly and want to generate the "perfect" string according to current rules.

[Also useful if I someday want to update all files to use a standard format. Use get for the original page and canonical for the way it should be.]

$string = $obj->get

Return the original pageid.

$flg = $obj->haspagealpha

True if there is an alpha part to the pageid.

$flg = $obj->haspagenumber

True if there is a page number part to the pageid.

$flg = $obj->hassubpagenumber

True if there is a subpage number part to the pageid.

$flg = $obj->ispairable

Returns true if the pageid contained in this object may be part of a sequential pair like p001-002. At present the only non-pairable pageid is 000.spine.

$side = $obj->pagealpha

Return the page side, or the alphabetic suffix used for otherwise unnumbered pages within a page-numbered document. In most cases a single sheet of paper or a flyer has a side a and a side b; however, I also use this as a way of handling publications whose page 1 is 10 pages from the front cover. So I just have a lot of page 000's: 000a,000b,000c....001.

$pagenum = $obj->pagenumber

Return the page number.

$side = $obj->side

A synonym for the pagealpha method.

$subpagenum = $obj->subpagenumber

Return the sub-page number. This is often used for inserted booklets in magazines, or updates in manuals.

Private Class Methods

None.

Private Instance Methods

$obj = $obj->_init ($pageid)

Subclass use only. This method does the bulk of the work for the new class method. Parses $pageid into components and sets up all the associated ivars. Returns undef if the pageid is not parseable.

$obj = $obj->_update

Subclass use only. Using the format information recorded by _init, recreate the pageid from its parts. This is a way to rebuild the pageid after a subclass modifies one of its components.
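Rebuilding a pageid from its parts could be sketched as follows. The helper rebuild_pageid and the hash layout are hypothetical; the key names mirror the ivars listed earlier, and the zero padded layout is an assumption about what the recorded format information contains.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's _update: rebuild a pageid string
# from a hash of parts, padding numbers to the recorded digit widths.
sub rebuild_pageid {
    my ($parts) = @_;

    # '%0*d' takes the field width from the argument list.
    my $id = sprintf '%0*d', $parts->{pn_digits}, $parts->{pagenum};

    $id .= sprintf '.%0*d', $parts->{spn_digits}, $parts->{subpagenum}
        if defined $parts->{subpagenum};

    $id .= $parts->{side} if defined $parts->{side};
    return $id;
}

# A subclass-style change: bump the subpage number, then rebuild.
my %parts = (pagenum => 100, pn_digits => 3,
             subpagenum => 1, spn_digits => 2, side => 'b');
$parts{subpagenum}++;
print rebuild_pageid(\%parts), "\n";
```

Keeping the digit widths alongside the integer values is what lets the rebuilt string match the width of the original.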

KNOWN BUGS

 See TODO.

SEE ALSO

 None.

AUTHOR

Dale Amon <amon@vnl.com>
