Search results for "distribution:WARC JACOB"
WARC - Web ARChive support for Perl
The "WARC" module is a convenience module for loading basic WARC support. After loading this module, the "WARC::Volume" and "WARC::Collection" classes are available. Overview of the WARC reader support modules WARC::Collection A "WARC::Collection" ob...
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Date - datestamp objects for WARC library
"WARC::Date" objects encapsulate the details of the required format for timestamps in WARC headers. These objects have overloaded string and number conversions. As a string, a "WARC::Date" object produces the [W3C-NOTE-datetime] format, while convers...
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Volume - Web ARChive file access for Perl
A "WARC::Volume" object represents a WARC file in the filesystem and provides access to the WARC records within as "WARC::Record" objects. Methods $volume = mount WARC::Volume ($filename) Construct a "WARC::Volume" object. The parameter is the name o...
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Index - base class for WARC index classes
"WARC::Index" is an abstract base class for indexes on WARC files and WARC-alike files. This class establishes the expected interface and provides a simple interface for building indexes. Methods $index = attach WARC::Index::File::* (...) Construct a...
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Record - one record from a WARC file
"WARC::Record" objects come in two flavors with a common interface. Records read from WARC files are read-only and have meaningful return values from the methods listed in "Methods on records from WARC files". Records constructed in memory can be upd...
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Fields - WARC record headers and application/warc-fields
The "WARC::Fields" class encapsulates information in the "application/warc-fields" format used for WARC record headers. This is a simple key-value format closely analogous to HTTP headers; however, the differences are significant enough that the "HTTP::He...
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Builder - Web ARChive construction support for Perl
The "WARC::Builder" class is the high-level interface for writing WARC archives. It is a very simple interface, because, at this level, WARC is a very simple format: a simple sequence of WARC records, which "WARC::Builder" accepts as "WARC::Record" o...
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Record::Stub - WARC record delayed loading stub
This is an internal class used to delay loading of "WARC::Record::FromVolume" objects returned from searching indexes. All but the most trivial of accesses to these objects result in loading the actual record and replacing the object with the full ob...
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Collection - Interface to a group of WARC files
The "WARC::Collection" class is the primary means by which user code is expected to use the WARC library. This class uses indexes to efficiently search for records in one or more WARC files. Search Keys The "search" method accepts a list of parameter...
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Record::Block - data block from a WARC file
This is an internal class used to implement the "open_block" instance method on "WARC::Record" objects. This class provides tied filehandles and the methods are documented in "Tying FileHandles" in perltie and perlfunc....
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Index::Entries - combine information from multiple WARC::Index entries
See "Common Methods" in WARC::Index::Entry for accessor methods. Constructor $combined_entry = coalesce WARC::Index::Entries ( [ ... ] ) Return a coalesced index entry by combining multiple "WARC::Index::Entry" objects, presumably from different inde...
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Record::Replay - WARC record replay registry and autoloading
This is an internal module that provides a registry of protocol replay support modules and an autoloading facility.
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Record::Payload - tied filehandle for reading decoded record payload
This is an internal class used to implement the "open_payload" instance method on "WARC::Record" objects. This class provides tied filehandles and the methods are documented in "Tying FileHandles" in perltie and perlfunc....
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Index::File::SDBM - SDBM index support for WARC library
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Index::Entry - abstract base class for WARC::Index entries
Common Methods Entries from all index systems support these methods: @report = $entry->distance( ... ) $distance = $entry->distance( ... ) In list context, return a detailed report mapping each search *key* to a distance value. In scalar context, ret...
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Index::Volatile - in-memory volume index for WARC library
The "WARC::Index::Volatile" class provides an in-memory index implementation suitable for small-scale applications. Unusually for index systems, a volatile index object is also its own index builder. Loading this module also registers a handler that a...
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Index::Builder - abstract base class for building indexes
"WARC::Index::Builder" is an abstract base class for constructing indexes on WARC files. The interface is documented here, but implemented in specialized classes for each index type. Some common code has also been moved to this class, and is also doc...
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Record::Sponge - data sponge for WARC records
"WARC::Record::Sponge" objects provide a streaming interface for constructing WARC records as data is received, using a temporary file to store the record content. This makes it possible to record content that exceeds available memory. This class provides objects...
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Record::Logical - reassemble multi-segment records
This is an internal class used to implement "WARC::Record" objects representing continued records in WARC files. A "continued record" is also referred to as a "logical record" in the WARC specification and is a record that has one or more "continuati...
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC
WARC::Record::FromVolume - WARC record from a WARC file
This is an internal class used to implement "WARC::Record" objects representing records in WARC files. Methods in this class are documented as part of "WARC::Record"....
JCB/WARC-v0.0.1 - 17 Apr 2020 01:48:44 UTC