Module Version: v0.4.8 (latest release: EBook-Tools-v0.5.4)

NAME

EBook::Tools::IMP - Object class for manipulating the SoftBook/GEB/REB/eBookWise .IMP and .RES e-book formats

SYNOPSIS

 use EBook::Tools::IMP qw(:all);
 my $imp = EBook::Tools::IMP->new();
 $imp->load('myfile.imp');

CONSTRUCTOR AND INITIALIZATION

new($filename)

Instantiates a new EBook::Tools::IMP object. If $filename is specified, it will also immediately initialize itself via the load method.

load($filename)

Loads a .imp file, parsing it into the various object attributes. Returns 1 on success, or undef on failure.

load_resdir($dirname)

Loads a .RES resource directory, parsing it into the object attributes. Returns 1 on success, or undef on failure.

author()

Returns the full name of the author of the book.

Author information can either be found entirely in the $self->{firstname} attribute or split up into $self->{firstname}, $self->{middlename}, and $self->{lastname}. If the last name is found separately, the full name is returned in the format "Last, First Middle". Otherwise, the full name is returned in the format "First Middle".
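The name-formatting rule above can be sketched in a few lines. This is a minimal illustration only, not the module's own code: format_author is a hypothetical helper, and it assumes the three name parts are passed in a plain hashref.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of EBook::Tools::IMP: builds a display name
# from a hashref holding firstname, middlename, and lastname.
sub format_author {
    my ($attr) = @_;

    # Keep only the given-name parts that are defined and non-empty.
    my $given = join ' ',
        grep { defined($_) && length($_) } @{$attr}{qw(firstname middlename)};

    # A separate last name switches the output to "Last, First Middle".
    my $last = $attr->{lastname};
    return "$last, $given" if defined($last) && length($last);
    return $given;
}

# Prints "Public, John Q."
print format_author({ firstname => 'John', middlename => 'Q.', lastname => 'Public' }), "\n";
# Prints "John Q. Public"
print format_author({ firstname => 'John Q. Public' }), "\n";
```

The second call shows the common case where the whole name lives in the first-name field, so it comes back unchanged.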

bookproplength()

Returns the total length in bytes of the book properties data, including the trailing null used to pack the C-style strings, but excluding any ETI server data appended to the end of the standard book properties.

filecount()

Returns the number of resource files as stored in $self->{filecount}. Note that this does NOT recompute that value from the actual number of resources in $self->{resources}. To do that, use "create_toc_from_resources()".

find_image_type($id,@excluded)

Goes through all stored images searching for one with the specified id value, returning the first image type found or undef if there were no matches or if no image id was specified. If the optional argument @excluded is specified, any types in the list will be skipped during the search.

Expected types are 'png', 'jpg', 'gif', and 'pic', searched for in that order.

This can be used to attempt to locate an alternate image for an undisplayable PICT image.
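The search order and exclusion list described above might look like the following sketch. It assumes image data is held in a hash of the form type => { id => {...} }; first_image_type and the sample data are hypothetical and stand in for the object's own attributes.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parsed image attributes: type => { id => {...} }.
my %images = (
    jpg => { 1 => { data => '...' } },
    pic => { 1 => { data => '...' }, 2 => { data => '...' } },
);

# Return the first type holding $id, trying types in the documented order
# (png, jpg, gif, pic) and skipping any type listed in @excluded.
sub first_image_type {
    my ($images, $id, @excluded) = @_;
    return unless defined $id;

    my %skip = map { $_ => 1 } @excluded;
    for my $type (qw(png jpg gif pic)) {
        next if $skip{$type};
        next unless $images->{$type};
        return $type if exists $images->{$type}{$id};
    }
    return;
}

# Look for a displayable alternative to PICT image 1 by excluding 'pic'.
my $alternate = first_image_type(\%images, 1, 'pic');
print defined $alternate ? "alternate type: $alternate\n" : "no alternate found\n";
```

Excluding 'pic' is what makes the lookup useful for replacing an undisplayable PICT image: only the other formats are considered.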

find_resource_by_name($name)

Takes as a single argument a resource name and if a resource with that name exists in $self->{resources} returns the resource type used as the hash key.

Returns undef if no match was found or a name was not specified.

image($type,$id)

Returns the image data stored in the resource of the specified type (specifically, stored in $self->{$type}->{$id}->{data} as parsed from the JPEG resource) corresponding to the 16-bit identifier provided as $id.

Valid values for $type are 'gif', 'jpg', and 'png'.

Carps a warning and returns undef if $type is not provided or is not valid, or if $id is not provided.

image_hashref($type,$id)

Returns the raw object hashref used to store parsed image data for the specified type, as stored in $self->{$type}. Valid types are 'gif', 'jpg', and 'png'.

Carps a warning and returns undef if $type is not provided or is not valid.

If $id is not specified, the keys of the returned hash are the image IDs for the specified image type, and the values are hashrefs pointing to hashes containing the following keys:

If the optional argument $id is specified, only the hash for that specific ID is returned, rather than the entire hash of hashrefs.

image_ids($type)

Returns a list of the 16-bit integer IDs of the specified type of image data stored in the associated resource (specifically, stored in $self->{$type} as parsed from the JPEG resource).

Valid types are 'gif', 'jpg', and 'png'. The method will carp a warning and return undef if another type is specified, or no type is specified.

is_1150()

Returns 1 if $self->{device} == 2, returns 0 if it is some other value, and undef if it is undefined. This matters because resources packed for an EBW 1150 or GEB 1150 are in a different format than resources packed for other IMP readers.
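Because the result has three possible states, callers need to distinguish 0 from undef. The sketch below shows that pattern with a hypothetical free-standing function, device_is_1150, which takes the device value directly rather than reading it from an object.

```perl
use strict;
use warnings;

# Hypothetical free-standing version of the three-state check described above.
sub device_is_1150 {
    my ($device) = @_;
    return unless defined $device;    # device unknown: undef in scalar context
    return $device == 2 ? 1 : 0;
}

for my $device (2, 1, undef) {
    my $result = device_is_1150($device);
    printf "device %s => %s\n",
        defined $device ? $device : 'undef',
        defined $result ? $result : 'undef';
}
```

A plain truth test treats "not an 1150" and "device unknown" identically, so code that must tell them apart should check defined() first.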

offsetelement($offset)

Returns the text of the element corresponding to the given text offset as stored in $self->{offsetelements}, or undef if no such element exists.

pack_imp_book_properties()

Packs object attributes into the 7 null-terminated strings that constitute the book properties section of the header. Returns that string.

Note that this does NOT pack the ETI server data appended to this section in encrypted books downloaded directly from the ETI servers, even if that data was found when the .imp file was loaded. This is because the extra data can confuse the GEBLibrarian application, and is not needed to read the book. The "bookproplength()" and "pack_imp_header()" methods also assume that this data will not be present.
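Packing a fixed number of C-style strings can be sketched with Perl's pack and the 'Z*' template. The seven values below are placeholders chosen for illustration; the actual field order is defined by the IMP format and is not reproduced here.

```perl
use strict;
use warnings;

# Seven placeholder values; these do not reflect the real field order.
my @properties = ('one', 'two', '', 'four', 'five', '', 'seven');

# 'Z*' appends a terminating null to each string, C-style.
my $packed = pack 'Z*' x 7, @properties;

# Total length is the sum of the string lengths plus one null per string,
# which is the quantity a method like bookproplength() would report.
my $expected = scalar @properties;
$expected += length($_) for @properties;

printf "packed %d bytes, expected %d\n", length($packed), $expected;
```

Empty fields still contribute one byte each, since the terminating null is always written.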

pack_imp_header()

Packs object attributes into the 48-byte string representing the IMP header. Returns that string on success, carps a warning and returns undef if a required attribute did not contain valid data.

Note that in the case of an encrypted e-book with ETI server data in it, this header will not be identical to the original -- the resdiroffset value is recalculated for the position with the ETI server data stripped. See "bookproplength()" and "pack_imp_book_properties()".

pack_imp_resource(%args)

Packs the specified resource stored in $self->{resources} into a data string suitable for writing into a .imp file, with a header format determined by $self->{version}.

Returns a reference to that string if the resource was found, or undef if it was not.

Arguments

pack_imp_rsrc_inf()

Packs object attributes into the data string that would be the content of the RSRC.INF file. Returns that string.

pack_imp_toc()

Packs the $self->{toc} object attribute into a data string suitable for writing into a .imp file. The format is determined by $self->{version}.

Returns that string, or undef if valid version or TOC data is not found.

resdirbase()

In scalar context, this returns the basename of $self->{resdirname}. In list context, it actually returns the basename, directory, and extension as per fileparse from File::Basename.
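The context-sensitive return can be illustrated with wantarray and fileparse. This is a sketch under assumptions: dir_base is a hypothetical helper, the sample path is made up, and the suffix pattern passed to fileparse is a guess rather than the one the module uses.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper mirroring the context-sensitive behaviour described
# above. The suffix pattern is an assumption made for this sketch.
sub dir_base {
    my ($dirname) = @_;
    my @parts = fileparse($dirname, qr/\.[^.]*/);
    return wantarray ? @parts : $parts[0];
}

# Scalar context: just the basename.
my $base = dir_base('books/mybook.RES');

# List context: basename, directory, and extension.
my ($name, $dir, $ext) = dir_base('books/mybook.RES');

print "$base\n";
print "$name | $dir | $ext\n";
```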

resdirlength()

Returns the length of the .RES directory name as stored in $self->{resdirlength}. Note that this does NOT recompute the length from the actual name stored in $self->{resdirname} -- for that, use "set_resdirlength()".

resdirname()

Returns the .RES directory name stored in $self->{resdirname}.

resource($type)

Returns a hashref containing the resource data for the specified resource type, as stored in $self->{resources}->{$type}.

Returns undef if $type is not specified, or if the specified type is not found.

resources()

Returns a hashref of hashrefs containing all of the resource data keyed by type, as stored in $self->{resources}.

text()

Returns the uncompressed text originally stored in the DATA.FRK (' ') resource. This will only work if the text was unencrypted.

title()

Returns the book title as stored in $self->{title}.

tocentry($index)

Takes as a single argument an integer index to the table of contents data stored in $self->{toc}. Returns the hashref corresponding to that TOC entry, if it exists, or undef otherwise.

version()

Returns the version of the IMP format used to determine TOC and resource metadata size as stored in $self->{version}. Expected values are 1 (10-byte metadata) and 2 (20-byte metadata).
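One way to use the version-to-size mapping is to cut raw TOC data into fixed-size records. The sketch below does only that: split_toc_entries is a hypothetical helper, and it deliberately leaves the fields inside each record uninterpreted.

```perl
use strict;
use warnings;

# Metadata sizes per format version, as stated above.
my %entry_size = (1 => 10, 2 => 20);

# Hypothetical helper: split raw TOC data into fixed-size records without
# interpreting the fields inside each record.
sub split_toc_entries {
    my ($version, $tocdata) = @_;
    return unless defined $version && defined $tocdata;

    my $size = $entry_size{$version} or return;    # unknown version
    return if length($tocdata) % $size;            # partial entry: treat as invalid
    return unpack "(a$size)*", $tocdata;
}

# Sixty bytes of version 2 data yield three 20-byte records.
my @entries = split_toc_entries(2, "\0" x 60);
print scalar(@entries), " entries\n";
```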

write_images(%args)

Writes the images, if any, to the specified output directory. Filenames are in the format JPEG_XXXX.jpg or PNG_XXXX.png where XXXX is the image ID for that image type formatted as four hexadecimal characters.
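The filename pattern can be reproduced with sprintf. In this sketch image_filename is a hypothetical helper, and the use of uppercase hexadecimal digits is an assumption, since the description only specifies four hexadecimal characters.

```perl
use strict;
use warnings;

# Filename prefixes for the two image types named above.
my %prefix_for = (jpg => 'JPEG', png => 'PNG');

# Hypothetical helper. Uppercase hex digits are an assumption; the text
# above only says "four hexadecimal characters".
sub image_filename {
    my ($type, $id) = @_;
    my $prefix = $prefix_for{$type} or return;
    return sprintf '%s_%04X.%s', $prefix, $id, $type;
}

print image_filename('jpg', 26),  "\n";    # JPEG_001A.jpg
print image_filename('png', 256), "\n";    # PNG_0100.png
```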

Arguments

write_imp($filename)

Takes as a sole argument the name of a file to write to, and writes a .imp file to that filename using the object attribute data.

Returns 1 on success, or undef if required data (including the filename) was invalid or missing, or the file could not be written.

write_resdir()

Writes a .RES resource directory from the object attribute data, using $self->{resdirname} as the directory name.

write_text(%args)

Writes the uncompressed text, if any, to the specified output directory and file.

Arguments

create_toc_from_resources()

Creates appropriate table of contents data from the metadata in $self->{resources}, in the format specified by $self->{version}. This will also set $self->{filecount} to match the actual number of resources.

Returns the number of resources found.

parse_eti_server_data($data)

Parses ETI server data, as potentially found appended to the end of .imp book properties or a RSRC.INF resource file on encrypted books downloaded directly from ETI servers.

Takes as a single argument a string containing just the extra appended data, and stores the parsed values in $self->{etiserverdata} as a hash. Note that parsing requires knowledge of the length of the book properties at the time this data was inserted; if the book properties have not been properly parsed or have been modified, the resulting behaviour of this method is not defined.

Returns the number of bytes handled, zero if no data was provided.

The data has the following format and keys:

parse_imp_book_properties($propdata)

Takes as a single argument a string containing the book properties data. Sets the object variables from its contents, which should be seven null-terminated strings in the following order:

Note that the entire name is frequently placed into the "First Name" component, and the "Last Name" and "Middle Name" components are left blank.

In addition, ETI server data may be appended to this data on encrypted books downloaded from ETI servers. If present, that data will be stored in the hash $self->{etiserverdata}. See "parse_eti_server_data($data)" for details.

A warning will be carped if the length of the parsed properties (including the C null string terminators) is not equal to the length of the data passed.
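The parsing and the length check can be sketched with unpack. The property data here is a placeholder built for illustration and does not reflect the real field order; anything left over after the seven strings is where appended data such as ETI server data would appear.

```perl
use strict;
use warnings;

# Placeholder property data: seven null-terminated strings with
# illustrative values only.
my $propdata = join '', map { "$_\0" } ('a', 'bb', '', 'dddd', '', 'f', 'ggg');

# Each 'Z*' consumes one string up to and including its terminating null.
my @fields = unpack 'Z*' x 7, $propdata;

# Bytes accounted for: the string lengths plus one null apiece.
my $parsed = scalar @fields;
$parsed += length($_) for @fields;

# Whatever follows the seven strings is extra appended data.
my $extra = substr $propdata, $parsed;

warn "parsed $parsed bytes but received ", length($propdata), "\n"
    if $parsed != length($propdata);

print scalar(@fields), " fields, $parsed bytes, ", length($extra), " extra bytes\n";
```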

parse_imp_header()

Parses the first 48 bytes of a .IMP file, setting object variables. The method croaks if it receives any more or less than 48 bytes.
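A caller can guard against the croak by reading exactly 48 bytes before parsing. The sketch below uses a hypothetical check_header function and an in-memory filehandle standing in for a real .imp file; it shows only the length check, not the header fields.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical guard mirroring the length check described above.
sub check_header {
    my ($headerdata) = @_;
    my $length = defined($headerdata) ? length($headerdata) : 0;
    croak "IMP header must be 48 bytes, got $length" unless $length == 48;
    return 1;
}

# Read exactly 48 bytes from an in-memory stand-in for a .imp file.
my $fake_file = "\0" x 100;
open my $fh, '<:raw', \$fake_file or die "open failed: $!";
read $fh, my $headerdata, 48;
close $fh;

print "header length ok\n" if check_header($headerdata);
```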

Header Format

parse_resource_cm()

Parses the !!cm resource loaded into $self->{resources}, if present, extracting the LZSS uncompression parameters into $self->{lzssoffsetbits} and $self->{lzsslengthbits}.

Returns 1 on success, or undef if no !!cm resource has been loaded yet or the resource data is invalid.

parse_resource_images()

Parses the image data resources loaded into $self->{resources}, if present, placing the image data and metadata of each image found into $self->{jpg} and $self->{png}, keyed by 16-bit image resource ID.

Returns the total number of images found and parsed.

This method is called automatically by "load()" and "load_resdir()".

See also accessor methods "image($type,$id)" and "image_hashref($type,$id)".

parse_resource_imrn()

Parses the index of text offsets to all images as stored in $self->{resources}->{'ImRn'}, if present, storing them in $self->{imrn} as a hash of hashrefs indexed by its 32-bit integer offset to the 0x0F control code in the uncompressed text stored in the DATA.FRK resource.

Returns the total number of offsets found and parsed.

The hash keys of each offset hash are:

This method is called automatically by "load()" and "load_resdir()".

parse_text()

Parses the ' ' (DATA.FRK) resource loaded into $self->{resources}, if present, extracting the text into $self->{text}, uncompressing it if necessary. LZSS uncompression will use the $self->{lzsslengthbits} and $self->{lzssoffsetbits} attributes if present, and default to 3 length bits and 14 offset bits otherwise.

HTML headers and footers are then applied, and control codes replaced with appropriate tags.

Returns the length of the raw uncompressed text before any HTML modification was done, or undef if no text resource was found or the text was encrypted.
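The fallback to default LZSS parameters can be sketched as follows. lzss_parameters is a hypothetical helper operating on a plain hashref; it covers only the choice of parameters, not the decompression itself.

```perl
use strict;
use warnings;

# Hypothetical helper: pick LZSS parameters from an object-like hashref,
# falling back to the documented defaults of 3 length bits and 14 offset bits.
sub lzss_parameters {
    my ($self) = @_;
    my $lengthbits = defined $self->{lzsslengthbits} ? $self->{lzsslengthbits} : 3;
    my $offsetbits = defined $self->{lzssoffsetbits} ? $self->{lzssoffsetbits} : 14;
    return ($lengthbits, $offsetbits);
}

# No !!cm resource parsed: defaults apply.
my ($default_length, $default_offset) = lzss_parameters({});
print "length bits: $default_length, offset bits: $default_offset\n";

# Values taken from a parsed !!cm resource override the defaults.
my ($custom_length, $custom_offset) =
    lzss_parameters({ lzsslengthbits => 4, lzssoffsetbits => 12 });
print "length bits: $custom_length, offset bits: $custom_offset\n";
```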

parse_imp_toc_v1($tocdata)

Takes as a single argument a string containing the table of contents data, and parses it into object attributes following the version 1 format (10 bytes per entry).

Format

parse_imp_toc_v2($tocdata)

Takes as a single argument a string containing the table of contents data, and parses it into object attributes following the version 2 format (20 bytes per entry).

Format

set_book_properties(%args)

Sets the specified book properties. Returns 1 on success, or undef if no properties were specified.

Arguments

Example

 $imp->set_book_properties(title => 'My Best Book',
                           category => 'Fiction',
                           firstname => 'John Q. Public');

PROCEDURES

All procedures are exportable, but none are exported by default.

detect_resource_type(\$data)

Takes as a sole argument a reference to the data component of a resource. Returns a 4-byte string containing the resource type if detected successfully, or undef otherwise.

Detection will not work on the DATA.FRK (' ') resource. That one must be detected separately by name/type.

parse_imp_resource_v1()

Takes as a sole argument a string containing the data (including the 10-byte header) of a version 1 IMP resource.

Returns a hashref containing that data separated into the following keys:

parse_imp_resource_v2()

Takes as a sole argument a string containing the data (including the 20-byte header) of a version 2 IMP resource.

Returns a hashref containing that data separated into the following keys:

BUGS AND LIMITATIONS

AUTHOR

Zed Pobre <zed@debian.org>

THANKS

Thanks are due to Nick Rapallo <nrapallo@yahoo.ca> for invaluable assistance in understanding the .IMP format and testing this code.

Thanks are also due to Jeffrey Kraus-yao <krausyaoj@ameritech.net> for his work reverse-engineering the .IMP format to begin with, and the documentation at http://krausyaoj.tripod.com/reb1200.htm.

LICENSE AND COPYRIGHT

Copyright 2008 Zed Pobre

Licensed to the public under the terms of the GNU GPL, version 2.
