HTML::TreeBuilder::XPath - add XPath support to HTML::TreeBuilder
    use HTML::TreeBuilder::XPath;

    my $tree = HTML::TreeBuilder::XPath->new;
    $tree->parse_file("mypage.html");

    my $nb = $tree->findvalue('/html/body//p[@class="section_title"]/span[@class="nb"]');
    my $id = $tree->findvalue('/html/body//p[@class="section_title"]/@id');

    my $p          = $tree->findnodes('//p[@id="toto"]')->[0];
    my $link_texts = $p->findvalue('./a');   # the texts of all a elements in $p
This module adds typical XPath methods to HTML::TreeBuilder, to make it easy to query a document.
Extra methods added both to the tree object and to each element:
findnodes ($path)
Returns a list of the nodes found by $path. In scalar context, returns an XML::XPathEngine::NodeSet object.
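The difference between list and scalar context can be illustrated with a minimal sketch. The markup is made-up sample data, and it is parsed from a string with HTML::TreeBuilder's parse_content rather than from a file, purely so the example is self-contained.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical sample markup (not from the module's documentation).
my $html = <<'HTML';
<html><body>
  <p class="section_title" id="s1">Intro</p>
  <p class="section_title" id="s2">Usage</p>
  <p>Plain paragraph</p>
</body></html>
HTML

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_content($html);

# List context: a plain list of element objects.
my @titles = $tree->findnodes('//p[@class="section_title"]');
print scalar(@titles), " title paragraph(s)\n";
print $_->attr('id'), ": ", $_->as_text, "\n" for @titles;

# Scalar context: a node set object; size() reports how many nodes matched.
my $nodeset = $tree->findnodes('//p');
print $nodeset->size, " paragraph(s) in total\n";
```

The synopsis form `findnodes(...)->[0]` relies on the scalar-context node set behaving like an array reference; the list form above avoids that dependency.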
findnodes_as_string ($path)
Returns the text values of the nodes found by $path.
findvalue ($path)
Returns either an XML::XPathEngine::Literal, an XML::XPathEngine::Boolean or an XML::XPathEngine::Number object. If the path returns a NodeSet, $nodeset->xpath_to_literal is called automatically for you (and thus an XML::XPathEngine::Literal is returned). Stringification is overloaded for each of these objects, so you can print the value found, or manipulate it as you would a normal Perl value (e.g. with regular expressions).
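As a rough illustration of treating the result of findvalue like an ordinary Perl value, the sketch below prints it, matches it against a regular expression, and uses it in a numeric comparison. The markup is invented sample data, and the code deliberately does not depend on the exact class of the returned value.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical sample markup, parsed from a string for self-containment.
my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_content(
    '<html><body><p id="toto">Price: <span class="nb">42</span></p></body></html>'
);

# A path that selects nodes: the value is the text of the matched node.
my $nb = $tree->findvalue('//span[@class="nb"]');
print "nb is $nb\n";                              # used as a string
print "nb looks numeric\n" if $nb =~ /^\d+$/;     # used with a regex

# A path that evaluates to a number rather than to nodes.
my $count = $tree->findvalue('count(//span)');
print "one span found\n" if $count == 1;          # used as a number
```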
exists ($path)
Returns true if the given path exists.
matches ($path)
Returns true if the element matches the path.
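A short sketch of the two boolean tests, assuming the methods described here are the module's exists and matches. The markup is invented for the example: exists is called on the tree to check for the presence of nodes, and matches is called on individual elements to ask whether each one is selected by a path.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical sample markup, parsed from a string for self-containment.
my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_content(
    '<html><body><p id="toto"><a href="/a">one</a></p><p>two</p></body></html>'
);

# exists: is there at least one node for this path?
print "document has links\n"     if     $tree->exists('//a[@href]');
print "document has no tables\n" unless $tree->exists('//table');

# matches: is this particular element among the nodes selected by the path?
my @verdicts;
for my $p ($tree->findnodes('//p')) {
    my $hit = $p->matches('//p[@id="toto"]') ? 1 : 0;
    push @verdicts, $hit;
    print $p->as_text, ": ", ($hit ? "matches" : "does not match"), "\n";
}
```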
find ($path)
The find function takes an XPath expression (a string) and returns either an XML::XPathEngine::NodeSet object containing the nodes it found (empty if no nodes matched the path), or one of XML::XPathEngine::Literal (a string), XML::XPathEngine::Number, or XML::XPathEngine::Boolean. It should always return something, and you can use ->isa() to find out what it returned. To check how many nodes it found, call $nodeset->size. See XML::XPathEngine::NodeSet.
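The following sketch shows one way to dispatch on the result of find with ->isa(). The markup and the list of expressions are invented for the example, and value() is assumed to be the accessor provided by the XML::XPathEngine result classes for non-node-set results.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical sample markup, parsed from a string for self-containment.
my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_content('<html><body><p>one</p><p>two</p></body></html>');

# Expressions chosen to produce each kind of result.
my @paths = ('//p', '//table', 'count(//p)', 'string(//p[1])', 'count(//p) > 1');

for my $path (@paths) {
    my $result = $tree->find($path);
    if ($result->isa('XML::XPathEngine::NodeSet')) {
        # Node sets may be empty, so check size() before using the nodes.
        printf "%-16s node set with %d node(s)\n", $path, $result->size;
    }
    else {
        # Literal, Number or Boolean: report the class and the plain value.
        printf "%-16s %s: %s\n", $path, ref($result), $result->value;
    }
}
```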
HTML::TreeBuilder
XML::XPathEngine
Michel Rodriguez, <mirod@cpan.org>
Copyright (C) 2006 by Michel Rodriguez
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
To install HTML::TreeBuilder::XPath, copy and paste the appropriate command into your terminal.
cpanm
cpanm HTML::TreeBuilder::XPath
CPAN shell
perl -MCPAN -e shell
install HTML::TreeBuilder::XPath
For more information on module installation, please visit the detailed CPAN module installation guide.