NAME

Parse::Readelf::Debug::Info - handle readelf's debug info section with a class

SYNOPSIS

  use Parse::Readelf::Debug::Info;

  my $debug_info = new Parse::Readelf::Debug::Info($executable);

  my @item_ids = $debug_info->item_ids('l_object2a');
  my @structure_layout1 = $debug_info->structure_layout($item_ids[0]);
  my @some_item_ids = $debug_info->item_ids_matching('^var', 'variable');
  my @all_item_ids = $debug_info->item_ids_matching('');
  my @all_struct_ids = $debug_info->item_ids_matching('', '.*structure.*');

ABSTRACT

Parse::Readelf::Debug::Info parses the output of readelf --debug-dump=info and stores its interesting details in an object to ease access.

DESCRIPTION

Normally an object of this class is constructed with the file name of an object file to be parsed. Upon construction the file is analysed and all relevant information about its debug info section is stored inside the object. This information can be accessed afterwards through the getter methods described below.

AT THE MOMENT ONLY INFORMATION REGARDING THE BINARY ARRANGEMENT OF VARIABLES (STRUCTURE LAYOUT) IS SUPPORTED. Other data is ignored for now.

Currently only output for Dwarf versions 2 and 4 is supported. Please contact the author for other versions and provide some example readelf outputs.

EXPORT

Nothing is exported by default, as there is normally no need to modify any of the variables declared in the following export groups:

:all

all of the following groups

:command

$command

is the variable holding the command to run readelf to get the information relevant for this module, normally readelf --debug-dump=info.

:config

$display_nested_items

is a variable which controls whether nested items (e.g. sub-structures) are always displayed or only where they are actually used (e.g. as the data type of a member of their parent). The default is 0, which displays them only where they are used; any other value switches on the unconditional display, which might confuse the reader.

$re_substructure_filter

is a regular expression that allows you to cut away the details of all substructures whose type names match the filter. This is useful if you have types that you consider so basic that you want to hide their details, e.g. the internal representation of a complex number datatype. By default the filter has the value ^string$, which matches C++ standard strings.
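As a rough sketch of how such a filter selects type names, the following snippet applies an extended filter to a few made-up type names. The filter value, the type names and the suggested assignment to the package variable are assumptions for illustration; only the default ^string$ is taken from the text above.

```perl
use strict;
use warnings;

# Assumption: a filter that hides C++ standard strings and a hypothetical
# "complex" type.  With the module loaded it would presumably be assigned
# to the package variable, e.g.
#   $Parse::Readelf::Debug::Info::re_substructure_filter = $filter;
my $filter = qr/^(?:string|complex)$/;

# Hypothetical type names as they might appear in a structure layout.
my @type_names = ('string', 'complex', 'my_struct', 'substring_t');

# Types matching the filter would have their substructure details cut away.
my @hidden = grep { $_ =~ $filter } @type_names;
my @shown  = grep { $_ !~ $filter } @type_names;

print "details hidden: @hidden\n";
print "details shown:  @shown\n";
```

Because the pattern is anchored at both ends, a name such as substring_t is not filtered even though it contains the word string.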

:constants

The following constants can be used to access the elements of the result of the method "structure_layout" (see below).

$LEVEL
$NAME
$TYPE
$SIZE
$LOCATION
$OFFSET
$BITSIZE
$BITOFFSET
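A minimal sketch of how these constants might be used to index one layout entry follows. It assumes that the constants hold the indices 0 to 7 in the order listed above and that each entry is an array reference; the entry itself is made-up sample data, not real readelf output.

```perl
use strict;
use warnings;

# Assumption: the constants are plain array indices in the listed order.
# With the module installed they would instead be imported with
#   use Parse::Readelf::Debug::Info qw(:constants);
our ($LEVEL, $NAME, $TYPE, $SIZE, $LOCATION, $OFFSET, $BITSIZE, $BITOFFSET)
    = (0 .. 7);

# Hypothetical layout entry for a bit field member (sample data only).
my $entry = [1, 'flags', 'unsigned int', 4, [1, 2, 17], 8, 3, 29];

# Turn one entry into a hash keyed by readable names.
sub entry_as_hash {
    my ($e) = @_;
    my %h = (
        level  => $e->[$LEVEL],
        name   => $e->[$NAME],
        type   => $e->[$TYPE],
        size   => $e->[$SIZE],
        offset => $e->[$OFFSET],
    );
    # Bit size and bit offset are only present for bit fields.
    if (defined $e->[$BITSIZE]) {
        $h{bit_size}   = $e->[$BITSIZE];
        $h{bit_offset} = $e->[$BITOFFSET];
    }
    return %h;
}

my %member = entry_as_hash($entry);
printf "%s: %s, %d bytes at offset %d\n",
    @member{qw(name type size offset)};
```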

:fixed_regexps

$re_section_start

is the regular expression that recognises the start of the info debug output of readelf.

$re_section_stop

is the regular expression that recognises the start of another debug output of readelf.

$re_unit_offset

is the regular expression that recognises the first line of a compilation unit in an info debug output of readelf. This line states the offset of the compilation unit itself. So this offset must be a hexadecimal string which will (must) be stored in $1 without any leading 0x. Usually it's 0 for the first unit.

$re_dwarf_version

is the regular expression that recognises the Dwarf version line in an info debug output of readelf. The version number must be an integer number which will (must) be stored in $1.

$re_unit_signature

is the regular expression that recognises the hexadecimal signature line at the start of a compilation unit in an info debug output of readelf. The signature ID must be a string which will (must) be stored in $1.

$re_type_offset

is the regular expression that recognises the type offset line at the start of a compilation unit in an info debug output of readelf. The offset must be a string which will (must) be stored in $1 without any leading 0x.
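To illustrate the capture conventions described above, here is a small sketch that extracts the unit offset and the Dwarf version from a few header lines. The two patterns are illustrative stand-ins and not the module's own $re_unit_offset and $re_dwarf_version; the header lines are sample data modelled on typical readelf output.

```perl
use strict;
use warnings;

# Illustrative patterns only; the module's own regular expressions may
# be written differently.
my $re_offset  = qr/^\s*Compilation Unit \@ offset 0x([0-9a-fA-F]+):/;
my $re_version = qr/^\s*Version:\s+(\d+)\s*$/;

# Sample header lines (made up for this sketch).
my @lines = (
    '  Compilation Unit @ offset 0x0:',
    '   Length:        0x4f2 (32-bit)',
    '   Version:       4',
);

my %unit;
for my $line (@lines) {
    # $1 holds the hexadecimal offset without the leading 0x ...
    if    ($line =~ $re_offset)  { $unit{offset}  = $1 }
    # ... or the version number as an integer.
    elsif ($line =~ $re_version) { $unit{version} = $1 }
}

print "unit at offset 0x$unit{offset}, Dwarf version $unit{version}\n";
```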

:versioned_regexps

These regular expressions recognise the currently supported tags of the item nodes of a readelf debug info output. Each of them is actually a list using the Dwarf version as index:

@re_item_start

recognises the start of a new item in the debug info list. $1 is the level, $2 the internal (unique) item ID, $3 the numeric type ID and $4 the type tag.
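As a hedged sketch of these four captures, the snippet below matches a typical item start line. The pattern is a stand-in written for this example, not an element of @re_item_start, and it assumes that the numeric type ID corresponds to the abbreviation number printed by readelf.

```perl
use strict;
use warnings;

# Illustrative pattern; the module's own expressions may differ.
my $re_item =
    qr/^\s*<(\d+)><([0-9a-f]+)>:\s+Abbrev Number:\s+(\d+)\s+\((\w+)\)/;

# Sample line modelled on readelf --debug-dump=info output.
my $line = ' <1><1a8>: Abbrev Number: 5 (DW_TAG_structure_type)';

# In list context the match returns the four captured groups.
my ($level, $id, $type_id, $tag) = $line =~ $re_item
    or die "line does not start an item\n";

print "level $level, item ID $id, type ID $type_id, tag $tag\n";
```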

@re_bit_offset

recognises the bit offset tag of an item. $1 will contain the offset.

@re_bit_size

recognises the bit size tag of an item. $1 will contain the size.

@re_byte_size

recognises the byte size tag of an item. $1 will contain the size.

@re_comp_dir

recognises the compilation directory tag of an item. $1 will contain the compilation directory as string.

@re_const_value

recognises the const value tag of an item. $1 will contain the value.

@re_containing_type

recognises the containing type tag of an item. Either $1 will contain the normal internal item ID or $2 will contain the Dwarf-4 signature of the containing type.

@re_decl_file

recognises the declaration file tag of an item. $1 will contain the number of the file name (see Parse::Readelf::Debug::Line).

@re_decl_line

recognises the declaration line tag of an item. $1 will contain the line number.

@re_declaration

recognises the declaration tag of an item. $1 will usually contain a 1 indicating that it is set.

@re_encoding

recognises the encoding tag of an item. $1 will contain the encoding as text.

@re_external

recognises the external tag of an item. $1 will usually contain a 1 indicating that it is set.

@re_language

recognises the language tag of an item. $1 will contain the language as text.

@re_linkage_name_tag

recognises the linkage name tag of an item. $1 will contain the name.

@re_location

recognises the data member location tag of an item. $1 will contain the offset.

@re_member_location

recognises the data location tag of an item. $1 will contain the hex value (with spaces between each byte).

@re_name_tag

recognises the name tag of an item. $1 will contain the name.

@re_producer

recognises the producer tag of an item. $1 will contain the producer as string.

@re_signature_tag

recognises the signature tag of an item. $1 will contain the leading <0x in case of a signature referring to the same compilation unit, $2 will contain the hexadecimal signature.

@re_specification

recognises the specification tag of an item. $1 will contain the internal item ID of the specification.

@re_type

recognises the type tag of an item. Either $1 will contain the normal internal item ID or $2 will contain the Dwarf-4 signature of the type.

@re_upper_bound

recognises the upper bound tag of a subrange item. $1 will contain the upper bound.

@re_ignored_attributes

recognises all attributes that are currently ignored.

The last two lists are a bit different: they control what is parsed by this module. They are also arrays using the Dwarf version as index. The content of each of these arrays is described below:

@tag_needs_attributes

holds hashes of the type tags that are processed. Each element points to a list of the absolutely needed attributes for that type of item.

@ignored_tags

is a list of the type tags (see @re_item_start above) that are currently ignored.

new - get readelf's debug info section into an object

    $debug_info = new Parse::Readelf::Debug::Info($file_name,
                                                 [$line_info]);

example:

    $debug_info1 = new Parse::Readelf::Debug::Info('program');
    $line_info = new Parse::Readelf::Debug::Line('module.o');
    $debug_info2 = new Parse::Readelf::Debug::Info('module.o',
                                                   $line_info);

parameters:

    $file_name          name of executable or object file
    $line_info          a L<Parse::Readelf::Debug::Line> object

description:

    This method parses the output of C<readelf --debug-dump=info> and
    stores its interesting details internally to be accessed later by
    getter methods described below.

    If no L<Parse::Readelf::Debug::Line> object is passed as second
    parameter the method creates one internally as it is needed to
    locate the source files.

global variables used:

    The method uses all of the variables described above in the
    L</"EXPORT"> section.

returns:

    The method returns the blessed Parse::Readelf::Debug::Info object
    or throws an exception in case of an error.
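Since the constructor signals errors with an exception, callers may want to wrap it. The following is a minimal sketch using a hypothetical helper, load_debug_info, which is not part of the module; it assumes that a missing file (or a missing module) surfaces as an exception that eval can catch.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (object, undef) on success and
# (undef, error message) on failure.
sub load_debug_info {
    my ($file_name) = @_;
    my $debug_info = eval {
        # Loading at run time keeps the sketch usable even if the
        # module is not installed; that case is reported as an error.
        require Parse::Readelf::Debug::Info;
        Parse::Readelf::Debug::Info->new($file_name);
    };
    return ($debug_info, undef) if defined $debug_info;
    my $error = $@ || 'unknown error';
    chomp $error;
    return (undef, $error);
}

my ($info, $error) = load_debug_info('/nonexistent/program');
print defined $info ? "parsed\n" : "could not parse: $error\n";
```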

item_ids - get object ID(s) of (named) item

    @item_ids = $debug_info->item_ids($identifier);

example:

    @item_ids = $debug_info->item_ids('my_variable');

parameters:

    $identifier         name of item (e.g. variable name)

description:

    This method returns the internal item ID of all identifiers with
    the given name as array.

returns:

    If a name is unique, the method returns an array with exactly one
    element, if a name does not exist it returns an empty array and
    otherwise an array containing the IDs of all matching items is
    returned.

item_ids_matching - get object IDs of items matching constraints

    @item_ids = $debug_info->item_ids_matching($re_name, [$re_type_tag]);

example:

    @some_item_ids = $debug_info->item_ids_matching('^var', 'variable');
    @all_item_ids = $debug_info->item_ids_matching('');
    @all_structure_ids = $debug_info->item_ids_matching('', '.*structure.*');

parameters:

    $re_name            regular expression matching name of items
    $re_type_tag        regular expression matching type tag of items

description:

    This method returns an array containing the internal item ID of
    all identifiers that match both the regular expression for their
    name and their type tags.  Note that an empty string will match
    any name or type tag, even missing ones.  Also note that type tags
    in Dwarf 2 always begin with C<DW_TAG_>.

returns:

    If a name is unique, the method returns an array with exactly one
    element, if a name does not exist it returns an empty array and
    otherwise an array containing the IDs of all matching items is
    returned.  The IDs are sorted alphabetically according to their
    names.
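The matching rules described above can be mimicked on a small hand-made table. This is a mock of the documented behaviour under stated assumptions, not the module's implementation: the item table, its layout and the function ids_matching are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical item table: ID => [name, type tag].  The module's
# internal storage will differ.
my %items = (
    '1a8' => ['variable_b', 'DW_TAG_variable'],
    '2f0' => ['variable_a', 'DW_TAG_variable'],
    '3c4' => ['Structure1', 'DW_TAG_structure_type'],
    '40b' => [undef,        'DW_TAG_structure_type'],  # anonymous item
);

sub ids_matching {
    my ($re_name, $re_type_tag) = @_;
    $re_type_tag = '' unless defined $re_type_tag;
    my @ids = grep {
        # A missing name or tag is treated as an empty string, so an
        # empty pattern still matches it.
        my ($name, $tag) = map { defined $_ ? $_ : '' } @{ $items{$_} };
        # (?:...) avoids Perl's special case for an empty pattern,
        # which would reuse the last successfully matched expression.
        $name =~ /(?:$re_name)/ && $tag =~ /(?:$re_type_tag)/;
    } keys %items;
    # Sort alphabetically by name, falling back to the ID for ties.
    return sort {
        ($items{$a}[0] // '') cmp ($items{$b}[0] // '') or $a cmp $b
    } @ids;
}

my @variables  = ids_matching('^var', 'variable');
my @everything = ids_matching('');
my @structures = ids_matching('', '.*structure.*');

print "variables:  @variables\n";
print "everything: @everything\n";
print "structures: @structures\n";
```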

structure_layout - get structure layout of variable or data type

    @structure_layout =
        $debug_info->structure_layout($id, [$initial_offset]);

example:

    @structure_layout1 =
        $debug_info->structure_layout('1a8');
    @structure_layout2 =
        $debug_info->structure_layout('2f0', 4);

parameters:

    $id                 internal ID of item
    $initial_offset     offset to be used for the beginning of the layout

description:

    This method returns the structure layout of a variable or data
    type with the given item ID (which can be found with the method
    L<"item_ids"> or L<"item_ids_matching">).  For each element of a
    structure it returns a sextuple containing (in that order)
    I<relative level>, I<name>, I<data type>, I<size>, I<location in
    source file> and I<offset> although some of the information might
    be missing (which is indicated by an empty string).  For bit
    fields two additional fields are added: I<bit-size> and
    I<bit-offset> (either both are defined or none at all).

    I<location in source file> is a triplet.  The first two elements
    (object ID of module and source number) are needed to get the file
    name from
    L<Parse::Readelf::Debug::Line::file|Parse::Readelf::Debug::Line/file>.
    The third is the line number within the source.  If in Dwarf 4 the
    last two elements are not provided, they will be replaced by the
    fixed string C<signature> and the signature ID of the compilation
    unit instead.

    Note that named indices for the result are defined in the
    L</":constants"> export (see above).

returns:

    The method returns an array of the sextuples described above.
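A short sketch of how such a result might be printed follows. It assumes that each element is an array reference in the documented order, that the indices run from 0 to 7, and that bit fields carry two extra elements; the layout shown is made-up sample data and format_layout is a hypothetical helper.

```perl
use strict;
use warnings;

# Assumption: indices in the documented order (see the :constants group).
my ($LEVEL, $NAME, $TYPE, $SIZE, $LOCATION, $OFFSET, $BITSIZE, $BITOFFSET)
    = (0 .. 7);

# Hypothetical result of structure_layout() for a small structure.
my @layout = (
    [0, 'l_object', 'Structure1',   8, [1, 1, 10], 0],
    [1, 'm_count',  'int',          4, [1, 1, 12], 0],
    [1, 'm_flag',   'unsigned int', 4, [1, 1, 13], 4, 1, 31],
);

# Format each entry as one line, indented by its relative level.
sub format_layout {
    my @lines;
    for my $entry (@_) {
        # %s is used throughout as missing values are empty strings.
        my $line = sprintf '%s%s (%s): offset %s, size %s',
            '  ' x $entry->[$LEVEL],
            @{$entry}[$NAME, $TYPE, $OFFSET, $SIZE];
        # Bit size and bit offset exist only for bit fields.
        $line .= sprintf ', bits %d at bit offset %d',
            @{$entry}[$BITSIZE, $BITOFFSET]
            if defined $entry->[$BITSIZE];
        push @lines, $line;
    }
    return @lines;
}

my @lines = format_layout(@layout);
print "$_\n" for @lines;
```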

KNOWN BUGS

For references as well as pointers outside of structures the size of the referenced data is shown, not the internal size of the reference itself. This is a feature. (Note that this means that pointers to functions outside of structures always have the size 0.)

Only Dwarf versions 2 and 4 are currently supported. Please contact the author for other versions and provide some example readelf outputs. Without examples support of other versions will not be possible.

This has only been tested in Unix-like environments, namely Linux and Solaris.

SEE ALSO

Parse::Readelf, Parse::Readelf::Debug::Line and the readelf man page

AUTHOR

Thomas Dorner, <dorner (AT) cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2007-2020 by Thomas Dorner

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.6.1 or, at your option, any later version of Perl 5 you may have available.