NAME

Parse::Matroska::Reader - a low-level reader for EBML files

VERSION

version 0.003

SYNOPSIS

    use Parse::Matroska::Reader;
    my $reader = Parse::Matroska::Reader->new($path);
    $reader->close;
    $reader->open(\$string_with_matroska_data);

    my $elem = $reader->read_element;
    print "Element ID: $elem->{elid}\n";
    print "Element name: $elem->{name}\n";
    if ($elem->{type} ne 'sub') {
        print "Element value: ", $elem->get_value, "\n";
    } else {
        while (my $child = $elem->next_child) {
            print "Child element: $child->{name}\n";
        }
    }
    $reader->close;

DESCRIPTION

Reads EBML data, the binary format used by Matroska files. This is a low-level reader meant to serve as a backend for higher-level readers. TODO: write the high-level readers :)

METHODS

new

Creates a new reader. Calls "open($arg)" with its arguments if provided.

open($arg)

Creates the internal filehandle. The argument can be:

  • An open filehandle or IO::Handle object.

    The filehandle is not dup()ed, so calling "close" on this object will close the given filehandle as well.

  • A scalar containing a path to a file.

  • On perl v5.14 or newer, a scalarref pointing to EBML data.

    For similar functionality in older perls, give an IO::String object or the handle to an already opened scalarref.
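
The "already opened scalarref" alternative can be sketched with core Perl alone. The sample data below is only the four-byte EBML header ID, and the call into the reader appears as a comment because it assumes the module is installed:

```perl
use strict;
use warnings;

# Four bytes of sample EBML data (the EBML header ID).
my $data = "\x1A\x45\xDF\xA3";

# Open a read-only filehandle backed by the scalar.
open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";
binmode $fh;

# The handle could now be given to the reader in place of a scalarref:
#   $reader->open($fh);

# Reading from it behaves like reading from a file.
read($fh, my $buf, 4);
my $hex = uc unpack('H*', $buf);
close $fh;
```

Because the filehandle is not dup()ed, closing the reader would also close $fh.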

close

Closes the internal filehandle.

readlen($length)

Reads $length bytes from the internal filehandle.

read_id

Reads an EBML ID atom in hexadecimal string format, suitable for passing to "elem_by_hexid($id)" in Parse::Matroska::Definitions.
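
As an illustration of how an EBML ID is laid out, here is a minimal sketch that is not the module's own code. The helper name sketch_read_id is hypothetical, and the uppercase hex output is an assumption about the string format:

```perl
use strict;
use warnings;

# Hypothetical helper: read one EBML ID from $fh and return it as an
# uppercase hex string, or undef at end of input.
sub sketch_read_id {
    my ($fh) = @_;
    read($fh, my $first, 1) or return;
    my $byte = ord $first;

    # The number of leading zero bits in the first byte, plus one,
    # gives the total length of the ID in bytes (1 to 4).
    my $len  = 1;
    my $mask = 0x80;
    until ($byte & $mask) {
        $mask >>= 1;
        $len++;
        die "Invalid EBML ID" if $len > 4;
    }

    my $rest = '';
    if ($len > 1) {
        my $got = read($fh, $rest, $len - 1);
        die "Truncated EBML ID" unless defined $got && $got == $len - 1;
    }

    # Unlike data sizes, IDs keep their length marker bit.
    return uc unpack('H*', $first . $rest);
}
```

For the bytes 1A 45 DF A3, the first byte has three leading zero bits, so the ID is four bytes long.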

read_size

Reads an EBML Data Size atom, which immediately follows an EBML ID atom.

This returns an array consisting of:

  1. The length of the Data Size atom.

  2. The value encoded in the Data Size atom, which is the length of all the data following it.
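
The two returned values can be illustrated with a minimal sketch of the variable-length decoding. The helper name sketch_read_size is hypothetical, and the real method may differ, for example in how it treats very large or unknown sizes:

```perl
use strict;
use warnings;

# Hypothetical helper: read one EBML Data Size from $fh and return
# (length of the size atom in bytes, decoded value).
sub sketch_read_size {
    my ($fh) = @_;
    read($fh, my $first, 1) or return;
    my $byte = ord $first;

    # Leading zero bits in the first byte, plus one, give the
    # length of the atom in bytes (1 to 8).
    my $len  = 1;
    my $mask = 0x80;
    until ($byte & $mask) {
        $mask >>= 1;
        $len++;
        die "Invalid EBML data size" if $len > 8;
    }

    # Drop the marker bit; the remaining bits are the most
    # significant part of the value.
    my $value = $byte & ($mask - 1);
    for (2 .. $len) {
        read($fh, my $next, 1) or die "Truncated EBML data size";
        $value = $value * 256 + ord $next;
    }
    return ($len, $value);
}
```

On a perl without 64-bit integers, this sketch loses precision for values above 2**53.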

read_str($length)

Reads a string of length $length bytes from the internal filehandle. The string is already decoded from UTF-8 (see "decode" in Encode), which is the standard Matroska string encoding.

read_uint($length)

Reads an unsigned integer of length $length bytes from the internal filehandle.

Returns a Math::BigInt object if $length is greater than 4.
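
A minimal sketch of the conversion, assuming big-endian byte order as EBML specifies. The helper name sketch_decode_uint is hypothetical, and it works on bytes that have already been read rather than on a filehandle:

```perl
use strict;
use warnings;
use Math::BigInt;

# Hypothetical helper: turn a big-endian byte string into an unsigned
# integer, using Math::BigInt when there are more than 4 bytes.
sub sketch_decode_uint {
    my ($bytes) = @_;
    if (length($bytes) > 4) {
        return Math::BigInt->new('0x' . unpack('H*', $bytes));
    }
    my $value = 0;
    $value = $value * 256 + $_ for unpack('C*', $bytes);
    return $value;
}
```

Callers that do arithmetic on the result should be prepared for either a plain number or an object.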

read_sint($length)

Reads a signed integer of length $length bytes from the internal filehandle.

Returns a Math::BigInt object if $length is greater than 4.
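
Signed values can be sketched the same way, assuming big-endian two's complement. The helper name sketch_decode_sint is hypothetical:

```perl
use strict;
use warnings;
use Math::BigInt;

# Hypothetical helper: turn a big-endian two's complement byte string
# into a signed integer, using Math::BigInt for more than 4 bytes.
sub sketch_decode_sint {
    my ($bytes) = @_;
    my $length = length $bytes;
    return 0 if $length == 0;

    # The top bit of the first byte is the sign bit.
    my $negative = ord($bytes) & 0x80;

    if ($length > 4) {
        my $value = Math::BigInt->new('0x' . unpack('H*', $bytes));
        $value->bsub(Math::BigInt->new(2)->bpow(8 * $length)) if $negative;
        return $value;
    }

    my $value = 0;
    $value = $value * 256 + $_ for unpack('C*', $bytes);
    $value -= 2 ** (8 * $length) if $negative;
    return $value;
}
```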

read_float($length)

Reads an IEEE floating point number of length $length bytes from the internal filehandle.

Only lengths 4 and 8 are supported (C float and double).
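
A minimal sketch of the two supported cases, assuming big-endian IEEE 754 data and perl v5.10 or newer for the ">" modifier of unpack. The helper name sketch_decode_float is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: decode a big-endian IEEE float (4 bytes) or
# double (8 bytes) from a byte string.
sub sketch_decode_float {
    my ($bytes) = @_;
    my $length = length $bytes;
    return unpack('f>', $bytes) if $length == 4;
    return unpack('d>', $bytes) if $length == 8;
    die "Unsupported float length $length";
}
```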

read_ebml_id($length)

Reads an EBML ID when it's encoded as the data inside another EBML element, that is, when the enclosing element's type is ebml_id.

This returns a hashref with the EBML element description as defined in Parse::Matroska::Definitions.

skip($length)

Skips $length bytes in the internal filehandle.

getpos

Wrapper for "$io->getpos" in IO::Seekable, called on the internal filehandle.

Returns undef if the internal filehandle can't getpos.

setpos($pos)

Wrapper for "$io->setpos" in IO::Seekable, called on the internal filehandle.

Returns undef if the internal filehandle can't setpos.

Croaks if setpos does not seek to the requested position, that is, if calling getpos does not yield the same object as the $pos argument.

read_element($read_bin)

Reads a full EBML element from the internal filehandle.

Returns a Parse::Matroska::Element object initialized with the read data. If $read_bin is not present or is false, the contents of binary type elements are delay-loaded, that is, they are only loaded when get_value is called on the returned Parse::Matroska::Element object.

Does not read the children of the element if its type is sub. See the Parse::Matroska::Element interface for details on how to read child elements.

Pass a true $read_bin if the stream being read is not seekable (getpos is undef) and the contents of binary elements are desired; otherwise seeking errors or internal filehandle corruption might occur.
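
The following sketch shows one way to process children as soon as a sub element is found. The helper name describe_element is hypothetical; it relies only on the name and type fields and the get_value and next_child methods used in the SYNOPSIS. The commented usage assumes that read_element returns a false value at the end of input, which is not stated above:

```perl
use strict;
use warnings;

# Hypothetical helper: collect one indented line per element, descending
# into the children of a 'sub' element before returning to the caller.
sub describe_element {
    my ($elem, $depth, $lines) = @_;
    $depth ||= 0;
    $lines ||= [];
    my $indent = '  ' x $depth;

    if ($elem->{type} eq 'sub') {
        push @$lines, $indent . $elem->{name};
        while (my $child = $elem->next_child) {
            describe_element($child, $depth + 1, $lines);
        }
    } else {
        # For binary elements this triggers the delayed load.
        push @$lines, $indent . $elem->{name} . ': ' . $elem->get_value;
    }
    return $lines;
}

# Intended use, assuming the module is installed and $path names a file:
#   my $reader = Parse::Matroska::Reader->new($path);
#   while (my $elem = $reader->read_element) {
#       print "$_\n" for @{ describe_element($elem) };
#   }
#   $reader->close;
```

Printing the value of a binary element emits raw bytes, so a real caller would likely summarize those instead.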

NOTE

The API of this module is not yet considered stable.

CAVEATS

Child elements have to be processed as soon as an element with children is found, or else ignored with "skip" in Parse::Matroska::Element. Failing to do so does not cause errors, but results in an invalid structure in which every element has a depth of 0.

To work correctly with unseekable streams, either the contents of binary-type elements have to be ignored or the $read_bin flag passed to read_element has to be true.

AUTHOR

Kovensky <diogomfranco@gmail.com>

COPYRIGHT AND LICENSE

This software is Copyright (c) 2012 by Diogo Franco.

This is free software, licensed under:

  The (two-clause) FreeBSD License