NAME

HTML::Element::Library - HTML::Element convenience functions

SYNOPSIS

  use HTML::Element::Library;
  use HTML::TreeBuilder;

DESCRIPTION

This module provides API calls for common actions on trees when using HTML::Tree.

METHODS

The test suite contains examples of each of these methods in a file named t/$method.t.

Positional Querying Methods

$elem->siblings

Return a list of all nodes under the same parent.

$elem->sibdex

Return the index of $elem in the array of its siblings. HTML::ElementSuper calls this method addr, but I don't think that is a descriptive name, and it is deceptively close to the address method of HTML::Element. However, in the interest of backwards compatibility, both methods are available.

$elem->addr

Same as sibdex

$elem->position()

Returns the coordinates of this element in the tree it inhabits. This is accomplished by successively calling addr() on ancestor elements until either a) an element that does not support these methods is found, or b) there are no more parents. The resulting list is the n-dimensional coordinates of the element in the tree.
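The walk up the tree can be sketched with stock HTML::Element calls only. This is an illustration, not the module's implementation: the helper name coordinates and the sample markup are assumptions, and pindex() (the index of a node within its parent) stands in for addr().

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

my $tree = HTML::TreeBuilder->new_from_content(
    '<html><head></head><body><ul><li>a</li><li id="x">b</li></ul></body></html>'
);

# Hypothetical helper (not part of the library): collect the index of each
# node within its parent, walking upward until the root is reached.
sub coordinates {
    my ($node) = @_;
    my @coords;
    while (defined(my $index = $node->pindex)) {    # pindex is undef at the root
        unshift @coords, $index;
        $node = $node->parent;
    }
    return @coords;
}

my $li     = $tree->look_down(id => 'x');
my @coords = coordinates($li);
print "@coords\n";
```

Each number is one level of the tree, outermost first, so the last number is the element's index among its own siblings.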

Element Decoration Methods

HTML::Element::Library::super_literal($text)

In HTML::Element, Sean Burke discusses super-literals. They are text which does not get escaped. Great for including JavaScript in HTML. Also great for including foreign language into a document.

So, you basically toss super_literal your text and back comes your text wrapped in a ~literal element.

One of these days, I'll get around to writing a nice EXPORT section.
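To see what a ~literal element buys you, the following sketch builds one by hand with HTML::Element->new and compares it with an ordinary text child. The sample script text and variable names are assumptions made for illustration; super_literal itself is not called here.

```perl
use strict;
use warnings;
use HTML::Element;

my $script = 'if (a < b && b > c) { go(); }';

# Ordinary text child: as_HTML escapes <, > and &.
my $escaped = HTML::Element->new('div');
$escaped->push_content($script);

# '~literal' pseudo-element: its text attribute is emitted untouched.
my $literal = HTML::Element->new('~literal', text => $script);
my $raw     = HTML::Element->new('div');
$raw->push_content($literal);

my $escaped_html = $escaped->as_HTML;
my $raw_html     = $raw->as_HTML;
print $escaped_html, "\n", $raw_html, "\n";
```

The first line printed contains entity-escaped text, while the second carries the script exactly as written.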

Tree Rewriting Methods

$elem->replace_content(@new_elem)

Replaces all of $elem's content with @new_elem.

$elem->wrap_content($wrapper_element)

Wraps the existing content in the provided element. If the provided element happens to be a non-element, a push_content is performed instead.

$elem->set_child_content(@look_down, $content)

This method looks down $elem using the criteria specified in @look_down, via the HTML::Element look_down() method.

After finding the node, it detaches the node's content and pushes $content as the node's content.

$tree->content_handler($sid_value , $content)

This is a convenience method. Because the look_down criteria will often simply be:

   id => 'fixme'

to find things like:

   <a id=fixme href=http://www.somesite.org>replace_content</a>

You can call this method to shorten your typing a bit. You can simply type

   $elem->content_handler( fixme => 'new text' )

Instead of typing:

  $elem->set_child_content(id => 'fixme', 'new text')
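The find-then-replace step described for set_child_content can be approximated with stock HTML::Element methods. This is a sketch under assumptions: set_content_of is a hypothetical helper, not the library's code, and the sample markup is made up for the example.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

my $tree = HTML::TreeBuilder->new_from_content(
    '<p>See <a id="fixme" href="http://www.somesite.org">replace_content</a></p>'
);

# Hypothetical stand-in for set_child_content: the last argument is the new
# content, everything before it is passed to look_down as search criteria.
sub set_content_of {
    my ($root, @criteria) = @_;
    my $content = pop @criteria;
    my $node    = $root->look_down(@criteria) or return;
    $node->detach_content;            # drop the old children
    $node->push_content($content);    # install the new content
    return $node;
}

my $node = set_content_of($tree, id => 'fixme', 'new text');
print $node->as_text, "\n";
```

Only the children of the matched node change; its attributes, such as href, are left alone.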

$tree->highlander($subtree_span_id, $conditionals, @conditionals_args)

This allows for "if-then-else" style processing. Highlander was a movie in which only one would survive. Well, in terms of a tree when looking at a structure that you want to process in if-then-else style, only one child will survive. For example, given this HTML template:

 <span klass="highlander" id="age_dialog"> 
    <span id="under10"> 
       Hello, does your mother know you're  
       using her AOL account? 
    </span> 
    <span id="under18"> 
       Sorry, you're not old enough to enter  
       (and too dumb to lie about your age) 
    </span> 
    <span id="welcome"> 
       Welcome 
    </span> 
 </span> 
 

We only want one child of the span tag with id age_dialog to remain based on the age of the person visiting the page.

So, let's set up a call that will prune the subtree as a function of age:

 sub process_page {
   my $age  = shift;
   my $tree = HTML::TreeBuilder->new_from_file('t/html/highlander.html');

   # keep only the first child of age_dialog whose test returns true
   $tree->highlander
     (age_dialog =>
      [
       under10 => sub { $_[0] < 10 },
       under18 => sub { $_[0] < 18 },
       welcome => sub { 1 }
      ],
      $age
     );

   return $tree;
 }

And there we have it. If the age is less than 10, then the node with id under10 remains. For age less than 18, the node with id under18 remains. Otherwise our "else" condition fires and the child with id welcome remains.
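For readers who want to see the pruning logic spelled out, here is a minimal sketch of the same if-then-else selection using only stock HTML::Element calls. The helper keep_first_match and the shortened markup are assumptions for illustration and are not the module's implementation.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

my $html = '<span id="age_dialog">'
         . '<span id="under10">Too young</span>'
         . '<span id="under18">Not old enough</span>'
         . '<span id="welcome">Welcome</span>'
         . '</span>';

# Hypothetical helper: keep the first child whose test passes, delete the rest.
sub keep_first_match {
    my ($tree, $container_id, $conditionals, @args) = @_;
    my $container = $tree->look_down(id => $container_id) or return;

    my @pairs       = @$conditionals;    # copy, so the caller's list survives
    my $survivor_id = '';
    while (my ($id, $test) = splice @pairs, 0, 2) {
        if ($test->(@args)) { $survivor_id = $id; last }
    }

    for my $child ($container->content_list) {
        next unless ref $child;          # skip plain text nodes
        my $id = $child->attr('id');
        $child->delete unless defined $id && $id eq $survivor_id;
    }
    return $container;
}

my $tree   = HTML::TreeBuilder->new_from_content($html);
my $dialog = keep_first_match(
    $tree,
    age_dialog => [
        under10 => sub { $_[0] < 10 },
        under18 => sub { $_[0] < 18 },
        welcome => sub { 1 },
    ],
    15,
);
print $dialog->as_text, "\n";
```

Because the tests are tried in order, the order of the pairs matters: an age of 15 fails under10 and stops at under18, so the welcome branch is never consulted.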

$tree->highlander2($subtree_span_id, $conditionals, @conditionals_args)

Right around the same time that table2() came into being, Seamstress began to tackle tougher and tougher processing problems. It became clear that a more powerful highlander was needed... one that not only snipped the tree of the nodes that should not survive, but one that allows for post-processing of the survivor node.

Thus highlander2().

So let's look at our HTML which requires post-selection processing:

  <span id=book_status>
    <span id=checked_out>
      Checked out to <a id=borrower_url href="">borrower_name</a><br/>
      <a id=return_url href="">Return it</a>
    </span>
    <form id=not_checked_out id=checkout_url action="" method="post">
          <input type="submit" value="Check out to"/>
    </form>
  </span>

In this case, it is not enough to have the checked_out or not_checked_out branch survive. We must take either segment of HTML and rewrite the URLs in either. Here is how we use highlander2 to do so:

  $tree->highlander2(book_status => [
    checked_out     => [
      sub { $stash->{item}->borrower },
      sub { 
        my $branch = shift; 
        my $url = $c->uri_for('borrower/view', $borrower_id);
        $branch->look_down(id => 'borrower_url')->replace_content($url);
      }
     ],
    not_checked_out => [
      sub { 1 },
      sub { more_rewriting }
     ]
   ]);

Each lookdown_id to highlander2() takes an arrayref of subs (as shown above) or a single sub.

$tree->overwrite_attr($mutation_attr => $mutating_closures)

This method is designed for taking a tree and reworking a set of nodes in a stereotyped fashion. For instance let's say you have 3 remote image archives, but you don't want to put long URLs in your img src tags for reasons of abstraction, re-use and brevity. So instead you do this:

  <img src="/img/smiley-face.jpg" fixup="src lnc">
  <img src="/img/hot-babe.jpg"    fixup="src playboy">
  <img src="/img/footer.jpg"      fixup="src foobar">

and then when the tree of HTML is being processed, you make this call:

  my %closures = (
     lnc     => sub { my ($tree, $mute_node, $attr_value)= @_; "http://lnc.usc.edu$attr_value" },
     playboy => sub { my ($tree, $mute_node, $attr_value)= @_; "http://playboy.com$attr_value" },
     foobar  => sub { my ($tree, $mute_node, $attr_value)= @_; "http://foobar.info$attr_value" }
  );

  $tree->overwrite_attr(fixup => \%closures) ;

and the tags come out modified like so:

  <img src="http://lnc.usc.edu/img/smiley-face.jpg" fixup="src lnc">
  <img src="http://playboy.com/img/hot-babe.jpg"    fixup="src playboy">
  <img src="http://foobar.info/img/footer.jpg"      fixup="src foobar">

$tree->mute_elem($mutation_attr => $mutating_closures, [ $post_hook ] )

This is a generalization of overwrite_attr. overwrite_attr assumes the return value of the closure is supposed to overwrite an attribute value and does it for you. mute_elem is a more general function which does nothing but hand the closure the element and let it mutate it as it jolly well pleases :)

In fact, here is the implementation of overwrite_attr to give you a taste of how mute_elem is used:

 sub overwrite_action {
   my ($mute_node, %X) = @_;

   $mute_node->attr($X{local_attr}{name} => $X{local_attr}{value}{new});
 }


 sub HTML::Element::overwrite_attr {
   my $tree = shift;
  
   $tree->mute_elem(@_, \&overwrite_action);
 }
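The attribute-driven rewriting can also be pictured end to end with stock HTML::Element calls. The sketch below is an assumption-laden illustration, not the module's code: rewrite_attrs is a hypothetical helper, the host name is a placeholder, and the closure arguments simply mirror the ($tree, $mute_node, $attr_value) convention used in the closures above.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

my $tree = HTML::TreeBuilder->new_from_content(
    '<img src="/img/footer.jpg" fixup="src archive">'
);

# Hypothetical closure table; the host name is a placeholder.
my %closures = (
    archive => sub {
        my ($root, $node, $attr_value) = @_;
        return "http://archive.example.org$attr_value";
    },
);

# Hypothetical helper: for each node carrying $mutation_attr, read
# "<attribute name> <closure name>", run the closure, and store its result.
sub rewrite_attrs {
    my ($root, $mutation_attr, $closures) = @_;
    for my $node ($root->look_down($mutation_attr => qr/\S/)) {
        my ($attr_name, $closure_name) = split ' ', $node->attr($mutation_attr);
        next unless defined $closure_name && $closures->{$closure_name};
        my $new_value = $closures->{$closure_name}->($root, $node, $node->attr($attr_name));
        $node->attr($attr_name => $new_value);
    }
    return $root;
}

rewrite_attrs($tree, fixup => \%closures);
my $img = $tree->look_down(_tag => 'img');
print $img->attr('src'), "\n";
```

Keeping the closure table separate from the markup means the HTML only names which rewrite it wants, while the Perl side decides what that rewrite does.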

Tree-Building Methods: Unrolling an array via a single sample element

This is best described by example. Given this HTML:

 <strong>Here are the things I need from the store:</strong>
 <ul>
   <li class="store_items">Sample item</li>
 </ul>

We can unroll it like so:

  my $li = $tree->look_down(class => 'store_items');

  my @items = qw(bread butter vodka);

  $tree->iter($li => @items);

To produce this:

 <html>
  <head></head>
  <body>Here are the things I need from the store:
    <ul>
      <li class="store_items">bread</li>
      <li class="store_items">butter</li>
      <li class="store_items">vodka</li>
    </ul>
  </body>
 </html>
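The clone-per-item idea behind this unrolling can be sketched with stock HTML::Element methods. This is an illustration under assumptions (inline markup, no call to iter itself), not the module's implementation.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

my $tree = HTML::TreeBuilder->new_from_content(
    '<ul><li class="store_items">Sample item</li></ul>'
);

my $li    = $tree->look_down(class => 'store_items');
my @items = qw(bread butter vodka);

# Clone the sample element once per item and fill each clone's content.
my @clones;
for my $item (@items) {
    my $clone = $li->clone;
    $clone->delete_content;
    $clone->push_content($item);
    push @clones, $clone;
}

# Swap the single sample element for the filled-in clones.
$li->replace_with(@clones);

my @texts = map { $_->as_text } $tree->look_down(_tag => 'li');
print join(', ', @texts), "\n";
```

Cloning keeps the sample element's attributes (here the class) on every generated item, which is why a single prototype is enough.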

Tree-Building Methods: Unrolling an array via n sample elements

iter() was fine for a while, but some things (definition lists, for example) need a more general function to make them easy to do. Hence iter2(). This function will be explained by example of unrolling a simple definition list.

So here's our start HTML:

 Here are the type of people you meet at XYZ, inc:

    <dl>

      <dt>
        Artist
      </dt>
      <dd>
        A person who draws blood.
      </dd>

      <dt>
        Musician
      </dt>
      <dd>
        A clone of Iggy Pop.
      </dd>

      <dt>
        Poet
      </dt>
      <dd>
        A relative of Edgar Allan Poe.
      </dd>


    </dl>

And we want to unroll our data set, preserving the last two elements of the initial definition list (Poet and its definition). Here's how it's done:

 my @items = (
  [ Programmer => 'one who likes Perl and Seamstress', ],
  [ DBA        => 'one who does business as', ],
  [ Admin      => 'one who plays Tetris all day' ]
 );

 $tree->iter2(
  # default wrapper_ld ok. 
  # It defaults to ['_tag' => 'dl']
  wrapper_data => \@items,
  wrapper_proc => sub {
    my ($container) = @_;

    # only keep the last 2 dts and dds
    my @content_list = $container->content_list;
    $container->splice_content(0, @content_list - 2); 
  },

  # default item_ld is fine. It looks like this:
  # sub {
  #   my $tree = shift;
  #   [
  #     $tree->look_down('_tag' => 'dt'),
  #     $tree->look_down('_tag' => 'dd')
  #   ];
  # }

  # default item_data is fine. It looks like this:
  # sub {
  #   my ($wrapper_data) = @_;
  #   shift(@{$wrapper_data});
  # }

  # default item_proc is fine.
  # Note that this subroutine MUST return the new items. This is done
  # so that more items than were passed in can be returned. This is
  # useful when, for example, you must return 2 dts for an input data item.
  # And when would you do this? When a single term has multiple spellings,
  # for instance.
  # The default item_proc looks like this:
  # sub {
  #   my ($item_elems, $item_data, $row_count) = @_;
  #   $item_elems->[$_]->replace_content($item_data->[$_]) for (0,1);
  #   $item_elems;
  # }

  splice       => sub {
    my ($container, @item_elems) = @_;
    $container->unshift_content(@item_elems);
  },

  debug => 1,
 );

Tree-Building Methods: Select Unrolling

The unroll_select method has this API:

   $tree->unroll_select(
      select_label    => $id_label,
      option_value    => $closure, # how to get option value from data row
      option_content  => $closure, # how to get option content from data row
      option_selected => $closure, # boolean to decide if SELECTED
      data            => $data,    # the data to be put into the SELECT
      data_iter       => $closure  # the thing that will get a row of data
    );

Here's an example:

 $tree->unroll_select(
   select_label     => 'clan_list', 
   option_value     => sub { my $row = shift; $row->clan_id },
   option_content   => sub { my $row = shift; $row->clan_name },
   option_selected  => sub { my $row = shift; $row->selected },
   data             => \@query_results, 
   data_iter        => sub { my $data = shift; $data->next }
 );
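What such a call has to accomplish can be sketched by hand with stock HTML::Element methods. Everything specific here is an assumption for illustration: the rows are plain hashrefs rather than objects with accessor methods, the row values are invented, and unroll_select itself is not called.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::Element;

my $tree = HTML::TreeBuilder->new_from_content(
    '<form><select id="clan_list"><option>sample</option></select></form>'
);

# Hypothetical rows standing in for query results.
my @rows = (
    { clan_id => 1, clan_name => 'Alpha', selected => 0 },
    { clan_id => 2, clan_name => 'Beta',  selected => 1 },
);

my $select = $tree->look_down(id => 'clan_list');
$select->delete_content;                 # drop the sample option

while (my $row = shift @rows) {          # plays the role of data_iter
    my $option = HTML::Element->new('option', value => $row->{clan_id});
    $option->attr(selected => 'selected') if $row->{selected};
    $option->push_content($row->{clan_name});
    $select->push_content($option);
}

my @options = $select->look_down(_tag => 'option');
print $select->as_HTML, "\n";
```

The three option_* closures in the API correspond to the three places where this loop reads a field from the row.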

Tree-Building Methods: Table Generation

Matthew Sisk has a much more intuitive (imperative) way to generate tables via his module HTML::ElementTable. However, for those with callback fever, the following method is available. First, we look at a nuts and bolts way to build a table using only standard HTML::Tree API calls. Then the table method available here is discussed.

Sample Model

 package Simple::Class;
 
 use Set::Array;
 
 my @name   = qw(bob bill brian babette bobo bix);
 my @age    = qw(99  12   44    52      12   43);
 my @weight = qw(99  52   80   124     120  230);
 
 
 sub new {
     my $this = shift;
     bless {}, ref($this) || $this;
 }
 
 sub load_data {
     my @data;
 
     for (0 .. 5) {
        push @data, { 
            # rand @age can pick any index, including the last one
            age    => $age[rand @age] + int rand 20,
            name   => shift @name,
            weight => $weight[rand @weight] + int rand 40
            }
     }
 
     Set::Array->new(@data);
 }
 
 
 1;

Sample Usage:

       my $data = Simple::Class->load_data;
       ++$_->{age} for @$data;

Inline Code to Unroll a Table

HTML

 <html>
 
   <table id="load_data">
 
     <tr>  <th>name</th><th>age</th><th>weight</th> </tr>
 
     <tr id="iterate">
 
         <td id="name">   NATURE BOY RIC FLAIR  </td>
         <td id="age">    35                    </td>
         <td id="weight"> 220                   </td>
 
     </tr>
 
   </table>
 
 </html>

The manual way (*NOT* recommended)

 require 'simple-class.pl';
 use HTML::Seamstress;
 
 # load the view
 my $seamstress = HTML::Seamstress->new_from_file('simple.html');
 
 # load the model
 my $o = Simple::Class->new;
 my $data = $o->load_data;
 
 # find the <table> and <tr> 
 my $table_node = $seamstress->look_down('id', 'load_data');
 my $iter_node  = $table_node->look_down('id', 'iterate');
 my $table_parent = $table_node->parent;
 
 
 # drop the sample <table> and <tr> from the HTML
 # only add them in if there is data in the model
 # this is achieved via the $add_table flag
 
 $table_node->detach;
 $iter_node->detach;
 my $add_table;
 
 # Get a row of model data
 while (my $row = shift @$data) {
 
   # We got row data. Set the flag indicating ok to hook the table into the HTML
   ++$add_table;
 
   # clone the sample <tr>
   my $new_iter_node = $iter_node->clone;
 
   # find the tags labeled name age and weight and 
   # set their content to the row data
   $new_iter_node->content_handler($_ => $row->{$_}) 
     for qw(name age weight);
 
   $table_node->push_content($new_iter_node);
 
 }
 
 # reattach the table to the HTML tree if we loaded data into some table rows
 
 $table_parent->push_content($table_node) if $add_table;
 
 print $seamstress->as_HTML;
 

$tree->table() : API call to Unroll a Table

 require 'simple-class.pl';
 use HTML::Seamstress;
 
 # load the view
 my $seamstress = HTML::Seamstress->new_from_file('simple.html');
 # load the model
 my $o = Simple::Class->new;
 
 $seamstress->table
   (
    # tell seamstress where to find the table, via the method call
    # ->look_down('id', $gi_table). Seamstress detaches the table from the
    # HTML tree automatically if no table rows can be built
 
      gi_table    => 'load_data',
 
    # tell seamstress where to find the tr. This is a bit useless as
    # the <tr> usually can be found as the first child of the parent
 
      gi_tr       => 'iterate',
      
    # the model data to be pushed into the table
 
      table_data  => $o->load_data,
 
    # the way to take the model data and obtain one row
    # if the table data were a hashref, we would do:
    # my $key = (keys %$data)[0]; my $val = $data->{$key}; delete $data->{$key}
 
      tr_data     => sub { my ($self, $data) = @_;
                          shift(@{$data}) ;
                        },
 
    # the way to take a row of data and fill the <td> tags
 
      td_data     => sub { my ($tr_node, $tr_data) = @_;
                          $tr_node->content_handler($_ => $tr_data->{$_})
                            for qw(name age weight) }
 
   );
 
 
 print $seamstress->as_HTML;

Looping over Multiple Sample Rows

* HTML

 <html>
 
   <table id="load_data" CELLPADDING=8 BORDER=2>
 
     <tr>  <th>name</th><th>age</th><th>weight</th> </tr>
 
     <tr id="iterate1" BGCOLOR="white" >
 
         <td id="name">   NATURE BOY RIC FLAIR  </td>
         <td id="age">    35                    </td>
         <td id="weight"> 220                   </td>
 
     </tr>
     <tr id="iterate2" BGCOLOR="#CCCC99">
 
         <td id="name">   NATURE BOY RIC FLAIR  </td>
         <td id="age">    35                    </td>
         <td id="weight"> 220                   </td>
 
     </tr>
 
   </table>
 
 </html>

* Only one change to last API call.

This:

        gi_tr       => 'iterate',

becomes this:

        gi_tr       => ['iterate1', 'iterate2']

$tree->table2() : New API Call to Unroll a Table

After 2 or 3 years with table(), I began to develop production websites with it and decided it needed a cleaner interface, particularly for handling the fact that id attributes will be the same after cloning a table row.

First, I will give a dry listing of the function's argument parameters. This will not be educational most likely. A better way to understand how to use the function is to read through the incremental unrolling of the function's interface given in conversational style after the dry listing. But take your pick. It's the same information given in two different ways.

Dry/technical parameter documentation

$tree->table2(%param) takes the following arguments:

  • table_ld => $look_down : optional

    How to find the table element in $tree. If $look_down is an arrayref, then use look_down. If it is a CODE ref, then call it, passing it $tree.

    Defaults to ['_tag' => 'table'] if not passed in.

  • table_data => $tabular_data : required

    The data to fill the table with. Must be passed in.

  • table_proc => $code_ref : not implemented

    A subroutine to do something to the table once it is found. Not currently implemented. Not obviously necessary. Just created because there is a tr_proc and td_proc.

  • tr_ld => $look_down : optional

    Same as table_ld but for finding the table row elements. Please note that the tr_ld lookup is performed on the table node found via table_ld, not on the whole HTML tree. This makes sense. The trs that you want exist below the table that was just found.

    Defaults to ['_tag' => 'tr'] if not passed in.

  • tr_base_id => $id_name : optional

    Ok, think for a second. You've got a sample table row which is about to be unrolled several times. Each row needs a unique id. The value here (if passed in), can be useful in abstractly forming this unique tr id.

    The default tr_proc method will work perfectly if you pass in a tr_base_id for it to chew on.

    See t/table2.t in the test suite for an example of its use.

  • tr_data => $code_ref : optional

    How to take the table_data and return a row. Defaults to:

     sub { my ($self, $data) = @_;
          shift(@{$data}) ;
     }
                                    
  • tr_proc => $code_ref : optional

    Something to do to the table row we are about to add to the table we are making. Defaults to a routine which makes the id attribute unique:

     sub {
            my ($self, $tr, $tr_data, $tr_base_id, $row_count) = @_;
            $tr->attr(id => sprintf "%s_%d", $tr_base_id, $row_count);
     }

  • td_proc => $code_ref : required

    This coderef will take the row of data and operate on the td cells that are children of the tr. See t/table2.t for several usage examples.

    Here's a sample one:

     sub {
          my ($tr, $data) = @_;
          my @td = $tr->look_down('_tag' => 'td');
          for my $i (0..$#td) {
            $td[$i]->splice_content(0, 1, $data->[$i]);
          }
        }

Conversational parameter documentation

The first thing you need is a table. So we need a look down for that. If you don't give one, it defaults to

  ['_tag' => 'table']

A table is of no use without data to display, so you must supply a scalar representing your tabular data source. This scalar might be an array reference, a nextable iterator, or a DBI statement handle. Whatever it is, it can be iterated through to build up rows of table data. These two fields (the way to find the table and the data to display in the table) are table_ld and table_data respectively; only table_data is required. A little more on table_ld: if it happens to be a CODE ref, then execution of the code ref is presumed to return the HTML::Element representing the table in the HTML tree.

Next, we get the row or rows which serve as sample tr elements by doing a look_down from the table_elem. While normally one sample row is enough to unroll a table, consider when you have alternating table rows. This API call would need one of each row so that it can cycle through the sample rows as it loops through the data. Alternatively, you could always just use one row and make the necessary changes to the single tr row by mutating the element in tr_proc, discussed below. The default tr_ld is ['_tag' => 'tr'] but you can overwrite it. Note well, if you overwrite it with a subroutine, then it is expected that the subroutine will return the HTML::Element(s) which are tr element(s). The reason a subroutine might be preferred is in the case that the HTML designers gave you 8 sample tr rows but only one prototype row is needed. So you can write a subroutine, to splice out the 7 rows you don't need and leave the one sample row remaining so that this API call can clone it and supply it to the tr_proc and td_proc calls.

Now, as we move through the table rows with table data, we need to do two different things on each table row:

  • get one row of data from the table_data via tr_data

    The default procedure assumes the table_data is an array reference and shifts a row off of it:

       sub { my ($self, $data) = @_;
             shift(@{$data}) ;
           }

    Your function MUST return undef when there are no more rows to lay out.

  • take the tr element and mutate it via tr_proc

    The default procedure simply makes the id of the table row unique:

      sub { my ($self, $tr, $tr_data, $row_count, $root_id) = @_;
            $tr->attr(id => sprintf "%s_%d", $root_id, $row_count);
          }

Now that we have our row of data, we call td_proc so that it can take the data and the td cells in this tr and process them. This function must be supplied.
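The per-row flow described above (fetch a row, clone the sample tr, make its id unique, fill its cells) can be sketched with stock HTML::Element calls. This is an illustration under assumptions, not the module's implementation: the three closures use simplified argument lists that differ from the documented ones, and the markup and data are invented.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

my $tree = HTML::TreeBuilder->new_from_content(
    '<table><tr id="iterate"><td>name</td><td>age</td></tr></table>'
);
my @table_data = ( [ 'bob', 99 ], [ 'bill', 12 ] );

# Simplified stand-ins for tr_data, tr_proc and td_proc.
my $tr_data = sub { my ($data) = @_; shift @$data };    # undef when exhausted
my $tr_proc = sub {
    my ($tr, $base_id, $row_count) = @_;
    $tr->attr(id => sprintf '%s_%d', $base_id, $row_count);
};
my $td_proc = sub {
    my ($tr, $row) = @_;
    my @td = $tr->look_down(_tag => 'td');
    for my $i (0 .. $#td) {
        $td[$i]->splice_content(0, 1, $row->[$i]);
    }
};

my $table  = $tree->look_down(_tag => 'table');
my $sample = $table->look_down(_tag => 'tr');
my $parent = $sample->parent;
$sample->detach;                          # the prototype row is not output

my $row_count = 0;
while (defined(my $row = $tr_data->(\@table_data))) {
    my $tr = $sample->clone;
    $tr_proc->($tr, 'iterate', ++$row_count);
    $td_proc->($tr, $row);
    $parent->push_content($tr);
}

my @rows = $table->look_down(_tag => 'tr');
print $table->as_HTML, "\n";
```

The loop ends only because the row fetcher returns undef once the data runs out, which is why the tr_data contract is stated so firmly.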

Whither a Table with No Rows

Often when a table has no rows, we want to display a message indicating this to the view. Use conditional processing to decide what to display:

        <span id=no_data>
                <table><tr><td>No Data is Good Data</td></tr></table>
        </span>
        <span id=load_data>
 <html>
 
   <table id="load_data">
 
     <tr>  <th>name</th><th>age</th><th>weight</th> </tr>
 
     <tr id="iterate">
 
         <td id="name">   NATURE BOY RIC FLAIR  </td>
         <td id="age">    35                    </td>
         <td id="weight"> 220                   </td>
 
     </tr>
 
   </table>
 
 </html>

        </span>

SEE ALSO

AUTHOR

Terrence Brannon, <tbone@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2004 by Terrence Brannon

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
