Parse::Readelf::Debug::Info - handle readelf's debug info section with a class
    use Parse::Readelf::Debug::Info;

    my $debug_info = new Parse::Readelf::Debug::Info($executable);

    my @item_ids = $debug_info->item_ids('l_object2a');

    my @structure_layout1 = $debug_info->structure_layout($item_ids[0]);

    my @some_item_ids = $debug_info->item_ids_matching('^var', 'variable');
    my @all_item_ids = $debug_info->item_ids_matching('');
    my @all_struct_ids = $debug_info->item_ids_matching('', '.*structure.*');
Parse::Readelf::Debug::Info parses the output of readelf --debug-dump=info and stores its interesting details in an object to ease access.
Normally an object of this class is constructed with the file name of an object file to be parsed. Upon construction the file is analysed and all relevant information about its debug info section is stored inside of the object. This information can be accessed afterwards using a bunch of getter methods, see "METHODS" for details.
AT THE MOMENT ONLY INFORMATION REGARDING THE BINARY ARRANGEMENT OF VARIABLES (STRUCTURE LAYOUT) IS SUPPORTED. Other data is ignored for now.
Currently only output for Dwarf versions 2 and 4 is supported, and the support for version 4 is still experimental. Please contact the author for other versions and provide some example readelf outputs.
Nothing is exported by default as it's normally not needed to modify any of the variables declared in the following export groups:
all of the following groups
is the variable holding the command to run readelf to get the information relevant for this module, normally readelf --debug-dump=info.
is a variable that controls whether nested items (e.g. sub-structures) are displayed only when they are actually used (e.g. as the data type of members of their parent) or are always displayed, which might confuse the reader. The default is 0; any other value switches on the unconditional display.
is a regular expression that allows you to cut away the details of all substructures whose type names match the filter. This is useful if you have a bunch of types that you consider so basic that you would like to hide their details, e.g. the internal representation of a complex number datatype. By default the filter has the value ^string$, which hides the details of C++ standard strings.
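To illustrate how an anchored filter such as the default ^string$ selects type names, here is a minimal sketch. The variable `$filter` and the helper `is_filtered` are hypothetical names made up for this example; only the pattern itself is taken from the text above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the module's filter variable; only the
# default value ^string$ is taken from the documentation.
my $filter = qr/^string$/;

# Returns 1 if the details of a type with this name would be hidden,
# 0 otherwise.
sub is_filtered {
    my ($type_name) = @_;
    return $type_name =~ $filter ? 1 : 0;
}

for my $type_name (qw(string wstring string_view)) {
    printf "%-12s %s\n", $type_name,
        is_filtered($type_name) ? 'details hidden' : 'details shown';
}
```

Because the pattern is anchored at both ends, only the exact name string matches; a filter meant to cover several basic types would need an alternation.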
The following constants can be used to access the elements of the result of the method "structure_layout" (see below).
is the regular expression that recognises the start of the info debug output of readelf.
is the regular expression that recognises the start of another debug output of readelf.
is the regular expression that recognises the first line of a compilation unit in an info debug output of readelf. This line states the offset of the compilation unit itself, so the offset must be a hexadecimal string, which will (must) be stored in $1 without any leading 0x. Usually it's 0 for the first unit.
is the regular expression that recognises the Dwarf version line in an info debug output of readelf. The version number must be an integer number which will (must) be stored in $1.
is the regular expression that recognises the hexadecimal signature line at the start of a compilation unit in an info debug output of readelf. The signature ID must be a string which will (must) be stored in $1.
is the regular expression that recognises the type offset line at the start of a compilation unit in an info debug output of readelf. The offset must be a string which will (must) be stored in $1 without any leading 0x.
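The capture convention described for these expressions (the value of interest ends up in $1, without a leading 0x) can be sketched as follows. The sample lines imitate the header printed by a recent binutils readelf and may look different in other versions. The patterns `$my_re_offset` and `$my_re_version` are assumptions written for this example, not the expressions shipped with the module.

```perl
use strict;
use warnings;

# Sample header lines in the style of a recent binutils readelf;
# the exact wording can differ between versions.
my @lines = (
    'Contents of the .debug_info section:',
    '',
    '  Compilation Unit @ offset 0x1f4:',
    '   Length:        0x4e (32-bit)',
    '   Version:       2',
);

# Hypothetical patterns, NOT the ones shipped with the module.
# Each stores the interesting value in $1; the offset loses its 0x.
my $my_re_offset  = qr/^\s*Compilation Unit \@ offset 0x([0-9a-f]+):/i;
my $my_re_version = qr/^\s*Version:\s+(\d+)\s*$/;

my %unit;
for my $line (@lines) {
    if    ($line =~ $my_re_offset)  { $unit{offset}  = $1 }
    elsif ($line =~ $my_re_version) { $unit{version} = $1 }
}
print "offset=$unit{offset} version=$unit{version}\n";
```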
These regular expressions recognise the currently supported tags of the item nodes of a readelf debug info output. Each of them is actually a list using the Dwarf version as index:
recognises the start of a new item in the debug info list. $1 is the level, $2 the internal (unique) item ID, $3 the numeric type ID and $4 the type tag.
recognises the bit offset tag of an item. $1 will contain the offset.
recognises the bit size tag of an item. $1 will contain the size.
recognises the byte size tag of an item. $1 will contain the size.
recognises the compilation directory tag of an item. $1 will contain the compilation directory as string.
recognises the const value tag of an item. $1 will contain the value.
recognises the containing type tag of an item. Either $1 will contain the normal internal item ID or $2 will contain the Dwarf-4 signature of the containing type.
recognises the declaration file tag of an item. $1 will contain the number of the file name (see Parse::Readelf::Debug::Line).
recognises the declaration line tag of an item. $1 will contain the line number.
recognises the declaration tag of an item. $1 will usually contain a 1 indicating that it is set.
recognises the encoding tag of an item. $1 will contain the encoding as text.
recognises the external tag of an item. $1 will usually contain a 1 indicating that it is set.
recognises the language tag of an item. $1 will contain the language as text.
recognises the linkage name tag of an item. $1 will contain the name.
recognises the data member location tag of an item. $1 will contain the offset.
recognises the data location tag of an item. $1 will contain the hex value (with spaces between each byte).
recognises the name tag of an item. $1 will contain the name.
recognises the producer tag of an item. $1 will contain the producer as string.
recognises the signature tag of an item. $1 will contain the leading <0x in case of a signature referring to the same compilation unit, $2 will contain the hexadecimal signature.
recognises the specification tag of an item. $1 will contain the internal item ID of the specification.
recognises the type tag of an item. Either $1 will contain the normal internal item ID or $2 will contain the Dwarf-4 signature of the type.
recognises the upper bound tag of a subrange item. $1 will contain the upper bound.
recognises all attributes that are (as yet) simply ignored.
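The idea of version-indexed expressions can be sketched like this: each expression is an array, and the Dwarf version selects the pattern to apply. The arrays `@my_re_item_start`, `@my_re_name` and `@my_re_byte_size` and their patterns are assumptions for illustration only. The sample lines imitate typical readelf output for a small structure.

```perl
use strict;
use warnings;

# Sample item lines in the style of readelf --debug-dump=info.
my @lines = (
    ' <1><2d>: Abbrev Number: 2 (DW_TAG_structure_type)',
    '    <2e>   DW_AT_name        : Structure1',
    '    <39>   DW_AT_byte_size   : 8',
);

# Hypothetical patterns, indexed by Dwarf version as described above.
my (@my_re_item_start, @my_re_name, @my_re_byte_size);
$my_re_item_start[2] =
    qr/^\s*<(\d+)><([0-9a-f]+)>:\s+Abbrev Number:\s+(\d+)\s+\((DW_TAG_\w+)\)/;
$my_re_name[2]      = qr/^\s*<[0-9a-f]+>\s+DW_AT_name\s*:\s+(.+?)\s*$/;
$my_re_byte_size[2] = qr/^\s*<[0-9a-f]+>\s+DW_AT_byte_size\s*:\s+(\d+)/;

my $version = 2;
my %item;
for my $line (@lines) {
    if ($line =~ $my_re_item_start[$version]) {
        # $1 level, $2 item ID, $3 numeric type ID, $4 type tag
        @item{qw(level id type_id tag)} = ($1, $2, $3, $4);
    }
    elsif ($line =~ $my_re_name[$version])      { $item{name} = $1 }
    elsif ($line =~ $my_re_byte_size[$version]) { $item{size} = $1 }
}
print join(', ', map { "$_=$item{$_}" } sort keys %item), "\n";
```

Supporting another Dwarf version in such a scheme would mean adding one more element to each array rather than changing the parsing loop.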
The last two lists are a bit different: they control what is parsed by this module. They are also arrays using the Dwarf version as index. What is inside each of these arrays is described below:
holds hashes of the type tags that are processed. Each element points to a list of the absolutely needed attributes for that type of item.
is a list of the type tags (see @re_item_start above) that are currently ignored.
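One way to picture these two control structures is the following sketch. The names `@my_tag_needs`, `@my_ignored_tags` and `classify_item`, as well as the tags and attributes listed, are hypothetical; the module's real tables may contain different entries.

```perl
use strict;
use warnings;

# Hypothetical table, indexed by Dwarf version:
# type tag => attributes an item of that type must have.
my @my_tag_needs;
$my_tag_needs[2] = {
    DW_TAG_base_type      => [qw(name byte_size)],
    DW_TAG_structure_type => [qw(byte_size)],
    DW_TAG_member         => [qw(name type)],
};

# Hypothetical list of type tags that are skipped, also per version.
my @my_ignored_tags;
$my_ignored_tags[2] = [qw(DW_TAG_subprogram DW_TAG_lexical_block)];

# Returns 'ignored', 'unknown', 'incomplete' or 'ok' for a parsed item.
sub classify_item {
    my ($version, $item) = @_;
    my $tag = $item->{tag};
    return 'ignored' if grep { $_ eq $tag } @{ $my_ignored_tags[$version] };
    my $needed = $my_tag_needs[$version]{$tag} or return 'unknown';
    my @missing = grep { !defined $item->{$_} } @$needed;
    return @missing ? 'incomplete' : 'ok';
}

print classify_item(2, { tag => 'DW_TAG_member', name => 'm_count' }), "\n";
```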
$debug_info = new Parse::Readelf::Debug::Info($file_name, [$line_info]);
    $debug_info1 = new Parse::Readelf::Debug::Info('program');

    $line_info = new Parse::Readelf::Debug::Line('module.o');
    $debug_info2 = new Parse::Readelf::Debug::Info('module.o', $line_info);
    $file_name          name of executable or object file
    $line_info          a Parse::Readelf::Debug::Line object
This method parses the output of readelf --debug-dump=info and stores its interesting details internally, to be accessed later by the getter methods described below. If no Parse::Readelf::Debug::Line object is passed as second parameter, the method creates one internally, as it is needed to locate the source files.
The method uses all of the variables described above in the "EXPORT" section.
The method returns the blessed Parse::Readelf::Debug::Info object or throws an exception in case of an error.
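Since the constructor signals errors with an exception, a caller may want to wrap it in eval. The following is a minimal sketch; the helper name `load_debug_info` is made up for this example, and the sketch assumes that a missing module and an unreadable file should be handled in the same way.

```perl
use strict;
use warnings;

# Returns a Parse::Readelf::Debug::Info object, or undef (after printing
# a warning) if the module is not installed or the constructor died.
sub load_debug_info {
    my ($file_name) = @_;
    my $debug_info = eval {
        require Parse::Readelf::Debug::Info;
        Parse::Readelf::Debug::Info->new($file_name);
    };
    warn "cannot read debug info of $file_name: $@"
        unless defined $debug_info;
    return $debug_info;
}

my $debug_info = load_debug_info('/nonexistent/file');
print defined $debug_info ? "loaded\n" : "not loaded\n";
```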
@item_ids = $debug_info->item_ids($identifier);
@item_ids = $debug_info->item_ids('my_variable');
$identifier name of item (e.g. variable name)
This method returns the internal item IDs of all identifiers with the given name as an array.
If the name is unique, the method returns an array with exactly one element. If the name does not exist, it returns an empty array. Otherwise it returns an array containing the IDs of all matching items.
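The three possible outcomes (no ID, exactly one ID, several IDs) could be handled by a small helper like the one below. The name `pick_item_id` and the policy of taking the first ID of an ambiguous name are assumptions made for this sketch. The IDs passed in are made-up sample values.

```perl
use strict;
use warnings;

# Picks one ID from a list as returned by item_ids(); $name is only
# used for the messages. Returns nothing if the list is empty.
sub pick_item_id {
    my ($name, @item_ids) = @_;
    if (@item_ids == 0) {
        warn "no item named '$name'\n";
        return;
    }
    warn "'$name' is ambiguous, using the first of ",
        scalar(@item_ids), " IDs\n"
        if @item_ids > 1;
    return $item_ids[0];
}

# With a real object this would be:
#   my $id = pick_item_id('my_variable',
#                         $debug_info->item_ids('my_variable'));
my $id = pick_item_id('my_variable', '1a8', '2f0');
print "using item $id\n";
```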
@item_ids = $debug_info->item_ids_matching($re_name, [$re_type_tag]);
    @some_item_ids = $debug_info->item_ids_matching('^var', 'variable');
    @all_item_ids = $debug_info->item_ids_matching('');
    @all_structure_ids = $debug_info->item_ids_matching('', '.*structure.*');
    $re_name            regular expression matching name of items
    $re_type_tag        regular expression matching type tag of items
This method returns an array containing the internal item IDs of all identifiers that match both the regular expression for their name and the one for their type tag. Note that an empty string will match any name or type tag, even missing ones. Also note that type tags in Dwarf 2 always begin with DW_TAG_.
If exactly one item matches, the method returns an array with exactly one element. If nothing matches, it returns an empty array. Otherwise it returns an array containing the IDs of all matching items. The IDs are sorted alphabetically according to their names.
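The matching rules described here can be mimicked on a small in-memory table. This is only a sketch of the documented behaviour, not the module's implementation; the table `%items`, its contents and the function `ids_matching` are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical items: ID => [name, type tag]; a real object fills its
# own tables from the readelf output.
my %items = (
    '1a8' => ['var_b',      'DW_TAG_variable'],
    '0c4' => ['var_a',      'DW_TAG_variable'],
    '2f0' => ['Structure1', 'DW_TAG_structure_type'],
    '310' => [undef,        'DW_TAG_structure_type'],   # anonymous
);

# Both patterns must match, an empty pattern matches everything (even
# a missing name), and the result is sorted by name.
sub ids_matching {
    my ($re_name, $re_type_tag) = @_;
    $re_type_tag = '' unless defined $re_type_tag;
    my %name_of;
    for my $id (keys %items) {
        my ($name, $tag) = map { defined $_ ? $_ : '' } @{ $items{$id} };
        # (?:...) keeps an empty pattern from triggering Perl's
        # "reuse the last successful pattern" special case
        $name_of{$id} = $name
            if $name =~ /(?:$re_name)/ and $tag =~ /(?:$re_type_tag)/;
    }
    return sort { $name_of{$a} cmp $name_of{$b} or $a cmp $b }
        keys %name_of;
}

print join(' ', ids_matching('^var', 'variable')), "\n";
```

Wrapping the interpolated pattern in a non-capturing group is a design choice of this sketch: a bare empty pattern in a Perl match would reuse the last successfully matched pattern instead of matching everything.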
@structure_layout = $debug_info->structure_layout($id, [$initial_offset]);
    @structure_layout1 = $debug_info->structure_layout('1a8');
    @structure_layout2 = $debug_info->structure_layout('2f0', 4);
    $id                 internal ID of item
    $initial_offset     offset to be used for the beginning of the layout
This method returns the structure layout of a variable or data type with the given item ID (which can be found with the method "item_ids" or "item_ids_matching"). For each element of a structure it returns a sextuple containing (in that order) relative level, name, data type, size, location in source file and offset, although some of the information might be missing (which is indicated by an empty string). For bit fields two additional fields are added: bit-size and bit-offset (either both are defined or none at all).
Location in source file is a triplet. The first two elements (object ID of module and source number) are needed to get the file name from Parse::Readelf::Debug::Line::file. The third is the line number within the source. If in Dwarf 4 the last two elements are not provided, they are replaced by the fixed string signature and the signature ID of the compilation unit instead.
Note that named indices for the result are defined in the ":constants" export (see above).
The method returns an array of the sextuples described above.
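A result of this shape could be printed as an indented listing as sketched below. The sketch assumes that each entry is an array reference in the documented field order. The `IDX_` constants are hypothetical names defined locally (the module's :constants export provides its own), and the `@layout` data is hand-written sample data, not real readelf output.

```perl
use strict;
use warnings;

# Hypothetical index names following the documented field order.
use constant {
    IDX_LEVEL   => 0, IDX_NAME      => 1, IDX_TYPE     => 2,
    IDX_SIZE    => 3, IDX_LOCATION  => 4, IDX_OFFSET   => 5,
    IDX_BITSIZE => 6, IDX_BITOFFSET => 7,
};

# Hand-written sample in the documented shape (not real readelf data).
my @layout = (
    [0, 'l_object', 'Structure1', 8, [1, 1, 10], 0],
    [1, 'm_count',  'int',        4, [1, 1, 11], 0],
    [1, 'm_flag',   'unsigned',   4, [1, 1, 12], 4, 1, 31],
);

# Formats one entry, indented by its relative level; bit field
# information is appended only when it is present.
sub format_entry {
    my ($entry) = @_;
    my $text = sprintf '%s%s : %s, size %s, offset %s',
        '  ' x $entry->[IDX_LEVEL],
        @{$entry}[IDX_NAME, IDX_TYPE, IDX_SIZE, IDX_OFFSET];
    $text .= sprintf ', bits %s at bit %s',
        @{$entry}[IDX_BITSIZE, IDX_BITOFFSET]
        if defined $entry->[IDX_BITSIZE];
    return $text;
}

print format_entry($_), "\n" for @layout;
```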
For references as well as pointers outside of structures, the size of the referenced data is shown, not the internal size of the reference itself. This is a feature. (Note that this means that pointers to functions outside of structures always have the size 0.)
Only Dwarf versions 2 and 4 are currently supported. Please contact the author for other versions and provide some example readelf outputs. Without examples support of other versions will not be possible. Note that the support of Dwarf version 4 is still experimental.
This has only been tested in Unix-like environments, namely Linux and Solaris.
Parse::Readelf, Parse::Readelf::Debug::Line and the readelf man page
Thomas Dorner, <dorner (AT) cpan.org>
Copyright (C) 2007-2013 by Thomas Dorner
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.6.1 or, at your option, any later version of Perl 5 you may have available.