YAX::Query - Query the YAX DOM
    use YAX::Query;

    $q = YAX::Query->new( $node );
    $q->select( $expr );

    # method interface
    $q->parent();
    $q->descendants();
    $q->children( $type );
    $q->child( $tag_name );
    $q->attributes;
    $q->attribute( $name );
    $q->filter( \&code );
This module implements a tool for querying a YAX DOM tree. It supports an expression parser for simple querying of the DOM using an E4X-ish syntax, as well as a method interface.
It is useful to note that a YAX::Query object is a blessed array reference and that the resulting nodes matching the query are stored in this array reference. Therefore all query methods return the query object itself, and to access the results you simply inspect this object. For example, the following searches for all text nodes which are children of `em' elements, which in turn are children of all `div' descendants:
    my $q = YAX::Query->new( $node );
    $q->select(q{..div.em.#text});
    for my $found ( @$q ) {
        # $found is a YAX::Text node
    }
The select method returns the query object itself, so the following, which selects all `li' descendants that have a `foo' attribute equal to "bar", also works:
    for my $item ( @{ $q->select(q{..li.(@foo eq "bar")}) } ) {
        ...
    }
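The "object is its own result set" design can be illustrated without YAX at all. The following is a minimal sketch, assuming a hypothetical class `Demo::Set` (not part of YAX) whose object is a blessed array reference and whose `filter` method returns the object so that calls chain:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a query class; not part of YAX.
package Demo::Set;

sub new {
    my ( $class, @items ) = @_;
    return bless [ @items ], $class;    # the object *is* the result array
}

# Keep only items for which the code reference returns true,
# then return the object itself so calls can be chained.
sub filter {
    my ( $self, $code ) = @_;
    @$self = grep { $code->( $_ ) } @$self;
    return $self;
}

package main;

my $set = Demo::Set->new( 1 .. 10 );
$set->filter( sub { $_[0] % 2 == 0 } )->filter( sub { $_[0] > 4 } );
print "@$set\n";    # prints "6 8 10"
```

Because the results live in the object itself, no separate accessor is needed: dereferencing with `@$set` is enough.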
A query expression is constructed of a sequence of tokens separated by a literal `.' (dot). Each successive token represents an operation on the resulting set of the application of the previous token's operation.
In the initial state, the set of nodes contains only the context node passed to the constructor: YAX::Query->new( $node ).
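The idea of each token operating on the set produced by the previous one can be sketched as a simple pipeline. This is only an illustration under assumptions: the nodes are plain hashes rather than YAX nodes, the helpers `children_of` and `named` are hypothetical, and the expression is given as a ready-made list of operations instead of being parsed from a string.

```perl
use strict;
use warnings;

# Toy nodes: plain hashes with a name and a list of children (not YAX nodes).
my $tree = {
    name     => 'div',
    children => [
        { name => 'em', children => [] },
        { name => 'p',  children => [ { name => 'em', children => [] } ] },
    ],
};

# Hypothetical operations; each maps the current set of nodes to a new set.
sub children_of { return map { @{ $_->{children} } } @_ }

sub named {
    my $name = shift;
    return sub { grep { $_->{name} eq $name } @_ };
}

my @set = ( $tree );    # initial state: only the context node
for my $op ( \&children_of, named('em') ) {
    @set = $op->( @set );    # each step consumes the previous step's result
}
print scalar( @set ), "\n";    # prints "1"
```

Only the direct `em` child of the context node survives; the nested `em` is not reached because no descendant step was applied.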
Filters are enclosed in `(' and `)', and generally contain Perl expressions with the exception that tokens of the form /\@(\w+)/ are replaced with $_->{$1} where `$_' is the current node in the loop which is applying the filter.
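To make the substitution concrete, here is a minimal sketch of how such a filter expression could be turned into a code reference. The helper `compile_filter` is hypothetical and is not the module's actual implementation; the nodes are plain hashes standing in for elements.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's actual code.
sub compile_filter {
    my ( $expr ) = @_;
    ( my $perl = $expr ) =~ s/\@(\w+)/\$_->{$1}/g;    # @foo  ->  $_->{foo}
    my $code = eval "sub { $perl }"
        or die "cannot compile filter '$expr': $@";
    return $code;
}

my @nodes  = ( { foo => 'bar' }, { foo => 'baz' }, { foo => 'bar' } );
my $is_bar = compile_filter( '@foo eq "bar"' );
my @hits   = grep { $is_bar->() } @nodes;    # $_ is the current node
print scalar( @hits ), "\n";    # prints "2"
```

Since the rewritten expression is compiled as ordinary Perl, anything valid in a Perl expression can appear inside the parentheses, which also means filter strings should come only from trusted sources.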
The following is a list of valid tokens:
descendants of
all element children of
all elements named element_name
all attributes of
NOTE: This adds the hash reference of the element itself, and not a list of attribute values. Moreover, adding a node selector after this in sequence is meaningless since attributes cannot have children. An exception will be raised if this occurs.
all attributes named attribute_name
NOTE: This adds a list of attribute values to the set. As above, node selectors following this are meaningless, and will raise an exception.
parent nodes of the set
all text children
all processing instruction children
all CDATA children
all child nodes of
all comment children of
Apply the filter $expr by turning it into a Perl code reference. Expressions are Perl with the exception that tokens of the form /\@(\w+)/ are replaced with $_->{$1} where `$_' is the current node in the loop which is applying the filter.
the n-th element of the set
new( $node ): Constructor.
select( $expr ): Evaluates $expr and returns the query object itself. The results are simply the elements of the query object, which is a blessed array reference. This allows for chaining and piecemeal querying. The following shows some different ways of achieving the same thing:
    my $q = YAX::Query->new( $node );
    $q->select('..div.*');     # get all children of all `div' descendants
    $q->filter( \&filter );    # filter the set obtained on the line above

    $q->select('..div.*')->filter( \&filter ); # same as the two lines above

    # or the equivalent
    @ids = grep { filter( $_ ) } @{ $q->select('..div.*') };
parent(): Selects the parent nodes of the set (see the `.parent()' token above).
children( $type ): Selects child nodes of type $type (see YAX::Constants for valid types). The `#text', `#cdata', `#processing-instruction' and `#comment' selectors are implemented with children(...).
child( $name ): Selects elements named $name.
attribute( $name ): Selects attribute values named $name.
attributes(): Selects the attributes hash for each element in the set.
descendants(): Selects descendants for each element in the set.
filter( \&code ): Applies the passed code reference to each element in the set, keeping the element in the resulting set if and only if the code reference returns a true value.
Syntax errors in the expressions are currently not handled very well. If the expression doesn't parse, an exception is raised, but because of the simplicity of the lexer, the information required to inform the user of exactly what went wrong is unavailable.
Changing this requires a more complex parser which will significantly impact performance, and so I'm reluctant to implement this since query expressions tend to be short enough for debugging by inspection.
Result sets from a query are not "live". That is, if a node is removed from or added to the DOM tree after the query is performed, these changes will not be reflected in the query result set.
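The snapshot behaviour can be seen with plain Perl arrays. This is an illustrative sketch only: `@children` stands in for a node's child list and `grep` stands in for a query, neither of which uses YAX.

```perl
use strict;
use warnings;

my @children = ( 'a', 'b', 'c' );                  # stand-in for a node's child list
my @result   = grep { $_ ne 'b' } @children;       # the "query" copies matches out

push @children, 'd';     # the tree changes after the query ran
print "@result\n";       # prints "a c"
```

The practical consequence is the same as for a real query: after modifying the tree, run the query again to get an up-to-date result set.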
See t/03-query.t in the test suite for an extensive list of examples.
Richard Hundt
This program is free software and may be used and distributed under the same terms as Perl itself.