NAME

Data::BitStream::Base - A Role implementing the API for Data::BitStream

SYNOPSIS

  use Moo;
  with 'Data::BitStream::Base';

DESCRIPTION

A role written for Data::BitStream that provides the basic API, including generic code for almost all functionality.

This is used by particular implementations such as Data::BitStream::String and Data::BitStream::WordVec.

DATA

pos

A read-only non-negative integer indicating the current position in a read stream. It is advanced by read, get, and skip methods, as well as changed by to, from, rewind, and erase methods.

len

A read-only non-negative integer indicating the current length of the stream in bits. It is advanced by write and put methods, as well as changed by from and erase methods.

writing

A read-only boolean indicating whether the stream is open for writing or reading. Read methods such as read, get, skip, rewind, and exhausted are not allowed while writing. Write methods such as write and put are not allowed while reading.

The write_open and erase_for_write methods will set writing to true. The write_close and rewind_for_read methods will set writing to false.

The read/write distinction allows implementations more freedom in internal caching of data. For instance, they can gather writes into blocks. It also can be helpful in catching mistakes such as reading from a target stream.

mode

The stream mode. Especially useful when given a file. The mode may be one of

  r    (read)
  ro   (readonly)
  w    (write)
  wo   (writeonly)
  rdwr (readwrite)
  a    (append)

file

The name of a file to read or write (depending on the mode).

fheaderlines

Only applicable when reading a file. Indicates how many header lines exist before the data.

fheader

When writing a file, this is the header to write before the data. When reading a file, this will be set to the header, if fheaderlines was given.

CLASS METHODS

maxbits

Returns the number of bits in a word, which is the largest allowed size of the bits argument to read and write. This will be either 32 or 64.

maxval

Returns the maximum value we can handle. This should be 2 ** maxbits - 1, that is, 0xFFFF_FFFF for 32-bit and 0xFFFF_FFFF_FFFF_FFFF for 64-bit.
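
As a rough illustration of how the word size and maximum value relate, the sketch below derives both from the running Perl. The helper names word_bits and word_max are hypothetical, and the role's own maxbits and maxval may be computed differently. Using ~0 keeps the result an exact integer, whereas evaluating 2 ** 64 - 1 literally can pass through floating point on a 64-bit Perl.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helpers for illustration only; not part of Data::BitStream.
sub word_bits { return $Config{ivsize} * 8 }   # width of a native integer in bits
sub word_max  { return ~0 }                    # all bits set, an exact integer

printf "%d-bit words, maximum value %u (0x%X)\n",
    word_bits(), word_max(), word_max();
```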

OBJECT METHODS (reading)

These methods are only valid while the stream is in reading state.

rewind

Moves the position to the stream beginning.

exhausted

Returns true if the stream is at the end. Rarely used.

read($bits [, 'readahead'])

Reads $bits from the stream and returns the value. $bits must be between 1 and maxbits.

Returns undef if the current position is at the end of the stream.

Croaks with an off stream error if not enough bits are left in the stream.

The position is advanced unless the second argument is the string 'readahead'.

Note for implementations: You have to implement this.

readahead($bits)

Identical to calling read with 'readahead' as the second argument. Returns the value of the next $bits bits (between 1 and maxbits). Returns undef if the current position is at the end. Allows reading past the end of the stream (fills with zeros as necessary). Does not advance the position.

skip($bits)

Advances the position $bits bits. Typically used in conjunction with readahead.
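
The peek-then-skip pattern can be modelled on a plain string of '0' and '1' characters. This is only a sketch of the behaviour described above (zero fill past the end, undef at the end, no change of position). The function peek_bits is hypothetical and is not how any implementation stores its data.

```perl
use strict;
use warnings;

# Hypothetical model of readahead on a string of '0'/'1' characters.
# Returns undef at the end of the data, otherwise the next $n bits as an
# integer, padding with zeros when fewer than $n bits remain.
sub peek_bits {
    my ($bits, $pos, $n) = @_;
    return undef if $pos >= length $bits;
    my $chunk = substr($bits, $pos, $n);
    $chunk .= '0' x ($n - length $chunk);   # zero fill past the end
    return oct("0b$chunk");
}

my $data = '10110';
my $pos  = 3;
my $peek = peek_bits($data, $pos, 3);   # sees '10' plus one fill bit
$pos += 2;                              # "skip" only the bits actually consumed
```

Here the caller looks at three bits, decides that only two of them belong to the current code, and advances by two.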

get_unary([$count])

Reads one or more values from the stream in 0000...1 unary coding. If $count is 1 or not supplied, a single value will be read. If $count is positive, that many values will be read. If $count is negative, values are read until the end of the stream.

In list context this returns a list of all values read. In scalar context it returns the last value read.

Note for implementations: You should have efficient code for this.
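
The $count and context rules can be sketched on a plain string of '0' and '1' characters. The function unary_decode below is a hypothetical stand-in, not the role's code, and it assumes the position is passed by reference.

```perl
use strict;
use warnings;

# Hypothetical decoder for 0000...1 unary values stored in a string of
# '0'/'1' characters. $$pos_ref is advanced past every value read.
sub unary_decode {
    my ($bits, $pos_ref, $count) = @_;
    $count = 1 unless defined $count;
    my @vals;
    while ($count != 0 && $$pos_ref < length $bits) {
        my $one = index($bits, '1', $$pos_ref);   # terminating 1 bit
        die "off stream\n" if $one < 0;
        push @vals, $one - $$pos_ref;             # number of leading zeros
        $$pos_ref = $one + 1;
        $count-- if $count > 0;                   # negative: read to the end
    }
    return wantarray ? @vals : $vals[-1];
}

my $pos  = 0;
my @two  = unary_decode('0001101', \$pos, 2);    # first two values
my @rest = unary_decode('0001101', \$pos, -1);   # everything that remains
```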

get_unary1([$count])

Like get_unary, but using 1111...0 unary coding. Less common.

get_binword($bits, [$count])

Reads one or more values from the stream as fixed-length binary numbers, each using $bits bits. The treatment of count and return values is identical to get_unary.

read_string($bits)

Reads $bits bits from the stream and returns them as a binary string, such as '0011011'.

OBJECT METHODS (writing)

These methods are only valid while the stream is in writing state.

write($bits, $value)

Writes $value to the stream using $bits bits. $bits must be between 1 and maxbits, unless $value is 0 or 1, in which case $bits may be larger than maxbits.

The stream length will be increased by $bits bits. Regardless of the contents of $value, exactly $bits bits will be used. If $value has more non-zero bits than $bits, the lower bits are written. In other words, $value will be masked before writing.

Note for implementations: You have to implement this.
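
The masking rule can be illustrated with a small helper. The name mask_to_bits is hypothetical, and the guard for a full-width shift is an assumption made for portability across Perl versions rather than something taken from the role.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: keep only the low $bits bits of $value.
sub mask_to_bits {
    my ($bits, $value) = @_;
    # Shifting by the full word width is not portable on older Perls,
    # so a full-width request returns the value unchanged.
    return $value if $bits >= $Config{ivsize} * 8;
    return $value & ((1 << $bits) - 1);
}

# 0x1F5 is 1_1111_0101 in binary; only the low four bits survive.
my $low4 = mask_to_bits(4, 0x1F5);
printf "%04b\n", $low4;
```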

put_unary(@values)

Writes the values to the stream in 0000...1 unary coding. Unary coding is only appropriate for relatively small numbers, as it uses $value + 1 bits.

Note for implementations: You should have efficient code for this.
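
To make the cost of unary coding concrete, here is a sketch that builds the 0000...1 form as a string of '0' and '1' characters. The function unary_encode is hypothetical and only mirrors the coding described above.

```perl
use strict;
use warnings;

# Hypothetical encoder: each value becomes that many 0 bits followed by a 1.
sub unary_encode {
    return join '', map { ('0' x $_) . '1' } @_;
}

my $bits = unary_encode(3, 0, 1);   # 4 + 1 + 2 bits in total
print "$bits\n";
```

A value of 1000 would take 1001 bits in this form, which is why the coding suits small numbers only.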

put_unary1(@values)

Like put_unary, but using 1111...0 unary coding. Less common.

put_binword($bits, @values)

Writes the values to the stream as fixed-length binary values. This is just a loop inserting each value with write($bits, $value).

put_string(@strings)

Takes one or more binary strings, such as '1001101', '001100', etc. and writes them to the stream. The number of bits used for each value is equal to the string length.

put_stream($source_stream)

Writes the contents of $source_stream to the stream. This is a helper method that might be more efficient than doing it in one of the many other possible ways. Some functionally equivalent methods:

  $self->put_string( $source_stream->to_string );  # The default for put_stream

  $self->put_raw( $source_stream->to_raw, $source_stream->len );

  my $bits = $source_stream->len;
  $source_stream->rewind_for_read;
  while ($bits > 0) {
    my $wbits = ($bits >= 32) ? 32 : $bits;
    $self->write($wbits, $source_stream->read($wbits));
    $bits -= $wbits;
  }

OBJECT METHODS (conversion)

These methods may be called at any time, and will adjust the state of the stream.

to_string

Returns the stream as a binary string, e.g. '00110101'.

to_raw

Returns the stream as packed big-endian data. This form is portable to any other implementation on any architecture.

to_store

Returns the stream as a scalar holding the data in an implementation-specific way. This may or may not be portable, but it can always be read by the same implementation. It might be more efficient than the raw format.

from_string($string)

The stream will be set to the binary string $string.

from_raw($packed [, $bits])

The stream is set to the packed big-endian vector $packed which has $bits bits of data. If $bits is not present, then length($packed) will be used as the byte-length. It is recommended that you include $bits.
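
The reason for passing $bits can be seen with Perl's built-in pack and unpack. This sketch assumes the raw form is laid out like pack's 'B' template (most significant bit first within each byte). That assumption is consistent with "packed big-endian" but is not confirmed here.

```perl
use strict;
use warnings;

my $str    = '1011001';                            # 7 bits of data
my $packed = pack('B*', $str);                     # one byte, zero-padded on the right
my $guess  = unpack('B*', $packed);                # byte length only: 8 bits
my $exact  = unpack('B' . length($str), $packed);  # explicit bit count: 7 bits

print "guess=$guess exact=$exact\n";
```

Without the bit count, the trailing padding bit is indistinguishable from data.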

from_store($blob [, $bits])

Similar to from_raw, but using the value returned by to_store.

OBJECT METHODS (other)

erase

Erases all the data, while the writing state is left unchanged. The position and length will both be 0 after this is finished.

Note for implementations: You need an 'after' method to actually erase the data.

read_open

Reads the current input file, if one exists.

write_open

Changes the state to writing with no other API-visible changes.

write_close

Changes the state to reading, and the position is set to the end of the stream. No other API-visible changes happen.

erase_for_write

A helper function that performs erase followed by write_open.

rewind_for_read

A helper function that performs write_close followed by rewind.
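
The state transitions above can be modelled with a toy class. ToyStream is hypothetical. It stores bits as a string and implements only enough to show how erase_for_write and rewind_for_read combine the simpler methods, and it is not how the role or its implementations are written.

```perl
use strict;
use warnings;

package ToyStream;   # hypothetical model, not part of Data::BitStream
use Carp qw(croak);

sub new     { return bless { bits => '', pos => 0, writing => 0 }, shift }
sub writing { return $_[0]{writing} }
sub len     { return length $_[0]{bits} }

sub erase       { my $s = shift; $s->{bits} = ''; $s->{pos} = 0; return 1 }
sub write_open  { my $s = shift; $s->{writing} = 1; return 1 }
sub write_close { my $s = shift; $s->{writing} = 0; $s->{pos} = $s->len; return 1 }
sub rewind {
    my $s = shift;
    croak "rewind while writing" if $s->{writing};
    $s->{pos} = 0;
    return 1;
}
sub erase_for_write { my $s = shift; $s->erase;       return $s->write_open }
sub rewind_for_read { my $s = shift; $s->write_close; return $s->rewind }

sub put_string {
    my ($s, $str) = @_;
    croak "write while reading" unless $s->{writing};
    $s->{bits} .= $str;
    return 1;
}
sub read_string {
    my ($s, $n) = @_;
    croak "read while writing" if $s->{writing};
    croak "off stream" if $s->{pos} + $n > $s->len;
    my $out = substr($s->{bits}, $s->{pos}, $n);
    $s->{pos} += $n;
    return $out;
}

package main;

my $t = ToyStream->new;
$t->erase_for_write;               # empty stream, writing state
$t->put_string('00110101');
$t->rewind_for_read;               # reading state, position 0
my $first = $t->read_string(4);
```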

INTERNAL METHODS

These methods are used by roles. As a stream user you should not be using these.

code_pos_start
code_pos_end
code_pos_set

Used to handle exceptions for codes that call other codes. Generally used in get_* methods. The primary reasoning for this is that we want to unroll the stream location back to where the caller tried to read the code on an error. That way they can try again with a different code, or examine the bits that resulted in an incorrect code. code_pos_start starts a new stack entry, code_pos_set sets the start of the current code so we know where to go back to, and code_pos_end indicates we're done so the code stack entry can be removed.
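
The rollback idea can be sketched with a position stack around a composite code. All names below (%stream, @code_stack, read_bit, get_pair) are hypothetical, and the sketch folds code_pos_start and code_pos_set into a single push for brevity.

```perl
use strict;
use warnings;

# Hypothetical toy stream and code stack; not the role's internals.
my %stream = (bits => '0001', pos => 0);
my @code_stack;

sub read_bit {
    die "off stream\n" if $stream{pos} >= length $stream{bits};
    return substr($stream{bits}, $stream{pos}++, 1);
}

# Decode two unary values as one composite code. On failure the position
# is restored to where the composite code began.
sub get_pair {
    push @code_stack, $stream{pos};      # remember where this code started
    my @pair;
    my $ok = eval {
        for (1 .. 2) {
            my $n = 0;
            $n++ while read_bit() eq '0';
            push @pair, $n;
        }
        1;
    };
    my $start = pop @code_stack;         # done with this stack entry
    unless ($ok) {
        $stream{pos} = $start;           # roll back before reporting the error
        die $@;
    }
    return @pair;
}
```

With the four bits shown, the first value decodes but the second runs off the stream, so the position returns to 0 and the caller can retry with a different code.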

code_pos_is_set

Returns the code stack or undef if not in a code. This should always be undef for users. If it is not, some code routine finished abnormally and did not remove its error stack.

error_off_stream

Croaks with a message about reading or skipping off the stream. If this happens inside a get_ method, it should indicate the outermost code that was used. The stream position is restored to the start of the outer code.

error_stream_mode

Croaks with a message about the wrong mode being used. This is what happens when an attempt is made to write to a stream opened for reading, or read from a stream opened for writing.

error_code

Central routine that captures code errors, including incorrect parameters, values out of range, overflows, range errors, etc. All errors cause a croak except assertions, which will confess (since they indicate a serious internal issue). Some additional information is also included if possible (e.g. the outermost code being used, the allowed range, the value, etc.).

SEE ALSO

Data::BitStream
Data::BitStream::String
Data::BitStream::WordVec

AUTHORS

Dana Jacobsen <dana@acm.org>

COPYRIGHT

Copyright 2011-2012 by Dana Jacobsen <dana@acm.org>

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.