HTML::Element::Library(3) HTML::Element convenience functions

SYNOPSIS


use HTML::Element::Library;
use HTML::TreeBuilder;

DESCRIPTION

HTML::Element::Library provides extra methods for HTML::Element.

METHODS

Aliases

These are short aliases for common operations:
$el->fid($id)
Finds an element given its id. Equivalent to "$el->look_down(id => $id)".
$el->fclass($class)
Finds one or more elements given one of their classes. Equivalent to "$el->look_down(class => qr/\b$class\b/s)".
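
As a rough illustration of what the "fclass" pattern matches, here is a plain-Perl sketch that applies the same regular expression to raw class attribute strings. The helper name "has_class" is hypothetical and is not part of the module; it only mirrors the regex quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Element::Library): returns 1 when
# $class appears as a whole word in a class attribute value, else 0.
sub has_class {
    my ($attr_value, $class) = @_;
    return $attr_value =~ /\b$class\b/s ? 1 : 0;
}

print has_class('nav active', 'active'), "\n";   # 1: one of several classes
print has_class('inactive',   'active'), "\n";   # 0: substring only, no word boundary
print has_class('is-active',  'active'), "\n";   # 1: a hyphen counts as a word boundary
```

Note the last case: because "\b" treats a hyphen as a word boundary, a class such as "is-active" also matches "active". The pattern also interpolates $class unescaped, so regex metacharacters in a class name are interpreted as such.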

Positional Querying Methods

$elem->siblings

Return a list of all nodes under the same parent.

$elem->sibdex

Returns the index of $elem in the array of siblings of which it is a part. HTML::ElementSuper calls this method "addr", but I don't think that is a descriptive name, and such naming is deceptively close to the "address" function of "HTML::Element". However, in the interest of backwards compatibility, both methods are available.

$elem->addr

Same as sibdex.

$elem->position()

Returns the coordinates of this element in the tree it inhabits. This is accomplished by successively calling addr() on ancestor elements until either a) an element that does not support these methods is found, or b) there are no more parents. The resulting list is the n-dimensional coordinates of the element in the tree.
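
The ancestor walk described above can be sketched in plain Perl on a toy tree. Everything here is an assumption made for illustration: the hash-based nodes, the helper names, and the root-first ordering of the coordinates. It is not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical minimal node: a hashref with a name, a parent link and children.
sub make_node {
    my ($name, @children) = @_;
    my $node = { name => $name, parent => undef, children => [@children] };
    $_->{parent} = $node for @children;
    return $node;
}

# Index of a node among its parent's children (the "sibdex" idea).
sub sibdex {
    my ($node) = @_;
    my $siblings = $node->{parent}{children};
    for my $i (0 .. $#$siblings) {
        return $i if $siblings->[$i] == $node;   # compare reference addresses
    }
    return;
}

# Collect one index per level while walking up to the root.
sub position {
    my ($node) = @_;
    my @coords;
    while ($node->{parent}) {
        unshift @coords, sibdex($node);   # assumption: root-first order
        $node = $node->{parent};
    }
    return @coords;
}

my $first  = make_node('p');
my $second = make_node('p');
my $body   = make_node('body', $first, $second);
my $root   = make_node('html', make_node('head'), $body);

print join(',', position($second)), "\n";   # 1,1
```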

Element Decoration Methods

HTML::Element::Library::super_literal($text)

In HTML::Element, Sean Burke discusses super-literals. They are text which does not get escaped. Great for including JavaScript in HTML. Also great for including foreign-language text in a document.

So, you basically toss "super_literal" your text and back comes your text wrapped in a "~literal" element.

One of these days, I'll get around to writing a nice "EXPORT" section.

Tree Rewriting Methods

``de-prepping'' HTML

Oftentimes, the HTML to be worked with will have multiple sample rows:

  <OL>
   <LI>bread
   <LI>butter
   <LI>beer
   <LI>bacon
  </OL>

But, before you begin to rewrite the HTML with your model data, you typically only want 1 or 2 sample rows.

Thus, you want to ``crunch'' the multiple sample rows to a specified amount. Hence the "crunch" method:

  $tree->crunch(look_down => [ '_tag' => 'li' ], leave => 2) ;

The "leave" argument defaults to 1 if not given. The call above would ``crunch'' the above 4 sample rows to:

  <OL>
   <LI>bread
   <LI>butter
  </OL>

Simplifying calls to HTML::FillInForm

Since HTML::FillInForm gets and returns strings, using HTML::Element instances becomes tedious:

   1. Seamstress has an HTML tree that it wants the form filled in on
   2. Seamstress converts this tree to a string
   3. FillInForm parses the string into an HTML tree and then fills in the form
   4. FillInForm converts the HTML tree to a string
   5. Seamstress re-parses the HTML for additional processing

I've filed a bug about this: <https://rt.cpan.org/Ticket/Display.html?id=44105>

This function, fillinform, allows you to pass a tree to fillinform (along with your data structure) and get back a tree:

  my $new_tree = $html_tree->fillinform($data_structure);

Mapping a hashref to HTML elements

It is very common to get a hashref of data from some external source - flat file, database, XML, etc. Therefore, it is important to have a convenient way of mapping this data to HTML.

As it turns out, there are 3 ways to do this in HTML::Element::Library. The most strict and structured way to do this is with "content_handler". Two other methods, "hashmap" and "defmap", require less manual mapping and may prove even easier to use in certain cases.

As is usual with Perl, a practical example is always best. So let's take some sample HTML:

  <h1>user data</h1>
  <span id="name">?</span> 
  <span id="email">?</span> 
  <span id="gender">?</span>

Now, let's say our data structure is this:

  $ref = { email => '[email protected]', gender => 'lots' } ;

And let's start with the most strict way to get what you want:

 $tree->content_handler(email => $ref->{email} , gender => $ref->{gender}) ;

In this case, you manually state the mapping between id tags and hashref keys and then "content_handler" retrieves the hashref data and pops it in the specified place.

Now let's look at the two (actually 2 and a half) other hash-mapping methods.

 $tree->hashmap(id => $ref);

Now, what this function does is super-destructive. It finds every element in the tree with an attribute named id (since 'id' is a parameter, it could find every element with some other attribute also) and replaces the content of those elements with the hashref value.

So, in the case above, the

   <span id="name">?</span>

would come out as

  <span id="name"></span>

(it would be blank) - because there is nothing in the hash under that key, so it substituted

  $ref->{name}

which was blank and emptied the contents.

Now, let's assume we want to protect name from being auto-assigned. Here is what you do:

 $tree->hashmap(id => $ref, ['name']);

That last array ref is an exclusion list.

But wouldn't it be nice if you could do a hashmap, but only assign things which are defined in the hashref? "defmap()" to the rescue:

 $tree->defmap(id => $ref);

does just that, so

   <span id="name">?</span>

would be left alone.
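
To make the difference between the two behaviours concrete, here is a plain-Perl sketch that models the page as a hash of id to content. The "_sketch" helpers, the %page hash and the sample data are all hypothetical stand-ins; the real methods operate on an HTML::Element tree.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the tree: element id => current content.
my %page = (name => '?', email => '?', gender => '?');
my $ref  = { email => 'someone@example.org', gender => 'unspecified' };

# hashmap idea: every id is overwritten, missing keys become empty,
# unless the id appears in the exclusion list.
sub hashmap_sketch {
    my ($page, $data, $excluded) = @_;
    my %skip = map { $_ => 1 } @{ $excluded || [] };
    my %out  = %$page;
    for my $id (keys %out) {
        next if $skip{$id};
        $out{$id} = defined $data->{$id} ? $data->{$id} : '';
    }
    return \%out;
}

# defmap idea: only ids with a defined value in the data are overwritten.
sub defmap_sketch {
    my ($page, $data) = @_;
    my %out = %$page;
    for my $id (keys %out) {
        $out{$id} = $data->{$id} if defined $data->{$id};
    }
    return \%out;
}

printf "hashmap:          name='%s'\n", hashmap_sketch(\%page, $ref)->{name};            # ''
printf "hashmap excluded: name='%s'\n", hashmap_sketch(\%page, $ref, ['name'])->{name};  # '?'
printf "defmap:           name='%s'\n", defmap_sketch(\%page, $ref)->{name};             # '?'
```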

$elem->hashmap($attr_name, \%hashref, \@excluded, $debug)

This method is designed to take a hashref and populate a series of elements. For example:

  <table>
    <tr sclass="tr" class="alt" align="left" valign="top">
      <td smap="people_id">1</td>
      <td smap="phone">(877) 255-3239</td>
      <td smap="password">*********</td>
    </tr>
  </table>

In the table above, there are several attributes named "smap". If we have a hashref whose keys are the same:

  my %data = (people_id => 888, phone => '444-4444', password => 'dont-you-dare-render');

Then a single API call allows us to populate the HTML while excluding the keys we don't want rendered:

  $tree->hashmap(smap => \%data, ['password']);

Note: the other way to prevent rendering some of the hash mapping is to not give that element the attr you plan to use for hash mapping.

Also note: the function "hashmap" has a simple easy-to-type API. Internally, it calls "hash_map" (which has a more verbose keyword calling API). Thus, the above call to "hashmap()" results in this call:

  $tree->hash_map(hash => \%data, to_attr => 'smap', excluding => ['password']);

$elem->defmap($attr_name, \%hashref, $debug)

"defmap" was described above.

$elem->replace_content(@new_elem)

Replaces all of $elem's content with @new_elem.

$elem->wrap_content($wrapper_element)

Wraps the existing content in the provided element. If the provided element happens to be a non-element, a push_content is performed instead.

$elem->set_child_content(@look_down, $content)

This method searches below $elem for a node matching the criteria in @look_down, using the HTML::Element look_down() method.

After finding the node, it detaches the node's content and pushes $content as the node's content.

$tree->content_handler(%id_content)

This is a convenience method. Because the look_down criteria will often simply be:

  id => 'fixme'

to find things like:

  <a id=fixme href=http://www.somesite.org>replace_content</a>

You can call this method to shorten your typing a bit. You can simply type

  $elem->content_handler( fixme => 'new text' )

Instead of typing:

  $elem->set_child_content(id => 'fixme', 'new text')

ALSO NOTE: you can pass a hash whose keys are "id"s and whose values are the content you want there and it will perform the replacement on each hash member:

  my %id_content = (name => "Terrence Brannon",
                    email => '[email protected]',
                    balance => 666,
                    content => $main_content);
  $tree->content_handler(%id_content);

$tree->highlander($subtree_span_id, $conditionals, @conditionals_args)

This allows for ``if-then-else'' style processing. Highlander was a movie in which only one would survive. Well, in terms of a tree when looking at a structure that you want to process in "if-then-else" style, only one child will survive. For example, given this HTML template:

 <span klass="highlander" id="age_dialog">
    <span id="under10">
       Hello, does your mother know you're
       using her AOL account?
    </span>
    <span id="under18">
       Sorry, you're not old enough to enter
       (and too dumb to lie about your age)
    </span>
    <span id="welcome">
       Welcome
    </span>
 </span>

We only want one child of the "span" tag with id "age_dialog" to remain based on the age of the person visiting the page.

So, let's set up a call that will prune the subtree as a function of age:

 sub process_page {
  my $age = shift;
  my $tree = HTML::TreeBuilder->new_from_file('t/html/highlander.html');
  $tree->highlander
    (age_dialog =>
     [
      under10 => sub { $_[0] < 10},
      under18 => sub { $_[0] < 18},
      welcome => sub { 1 }
     ],
     $age
    );
 }

And there we have it. If the age is less than 10, then the node with id "under10" remains. For age less than 18, the node with id "under18" remains. Otherwise our ``else'' condition fires and the child with id "welcome" remains.
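
The conditions are passed as an arrayref rather than a hash because their order matters. The following plain-Perl sketch shows the selection step only, assuming (as the description above suggests) that the tests are tried in order and the first true one wins. The helper name "surviving_id" is hypothetical.

```perl
use strict;
use warnings;

# Same shape as the conditionals passed to highlander: id => test pairs.
my $conditions = [
    under10 => sub { $_[0] < 10 },
    under18 => sub { $_[0] < 18 },
    welcome => sub { 1 },
];

# Hypothetical helper: return the id whose test is the first to succeed.
sub surviving_id {
    my ($conds, @args) = @_;
    my @pairs = @$conds;                       # copy, so the caller's list is untouched
    while (my ($id, $test) = splice @pairs, 0, 2) {
        return $id if $test->(@args);
    }
    return;                                    # no test succeeded
}

print surviving_id($conditions, $_), "\n" for (8, 15, 40);
# under10
# under18
# welcome
```

With a hash, key order would be unpredictable, and an 8-year-old could just as well match "under18" or "welcome" first.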

$tree->passover(@id_of_element)

In some cases, you know exactly which element(s) should survive. In this case, you can simply call "passover" to remove its (their) siblings. For the HTML above, you could delete "under10" and "welcome" by simply calling:

  $tree->passover('under18');

Because passover takes an array, you can specify several children to preserve.

$tree->highlander2($tree, $conditionals, @conditionals_args)

Right around the same time that "table2()" came into being, Seamstress began to tackle tougher and tougher processing problems. It became clear that a more powerful highlander was needed... one that not only snipped the tree of the nodes that should not survive, but one that allows for post-processing of the survivor node. And one that was more flexible with how to find the nodes to snip.

Thus (drum roll) "highlander2()".

So let's look at our HTML which requires post-selection processing:

 <span klass="highlander" id="age_dialog">
    <span id="under10">
       Hello, little <span id=age>AGE</span>-year old,
       does your mother know you're using her AOL account?
    </span>
    <span id="under18">
       Sorry, you're only <span id=age>AGE</span>
       (and too dumb to lie about your age)
    </span>
    <span id="welcome">
       Welcome, isn't it good to be <span id=age>AGE</span> years old?
    </span>
 </span>

In this case, a branch survives, but it has dummy data in it. We must take the surviving segment of HTML and rewrite the age "span" with the age. Here is how we use "highlander2()" to do so:

  sub replace_age {
    my $branch = shift;
    my $age = shift;
    $branch->look_down(id => 'age')->replace_content($age);
  }
  my $if_then = $tree->look_down(id => 'age_dialog');
  $if_then->highlander2(
    cond => [
      under10 => [
        sub { $_[0] < 10} ,
        \&replace_age
       ],
      under18 => [
        sub { $_[0] < 18} ,
        \&replace_age
       ],
      welcome => [
        sub { 1 },
        \&replace_age
       ]
     ],
    cond_arg => [ $age ]
  );

We pass it the tree ($if_then), an arrayref of conditions ("cond") and an arrayref of arguments which are passed to the "cond"s and to the replacement subs.

The "under10", "under18" and "welcome" are id attributes in the tree of the siblings of which only one will survive. However, should you need to do more complex look-downs to find the survivor, then supply an array ref instead of a simple scalar:

  $if_then->highlander2(
    cond => [
      [class => 'r12'] => [
        sub { $_[0] < 10} ,
        \&replace_age
       ],
      [class => 'z22'] => [
        sub { $_[0] < 18} ,
        \&replace_age
       ],
      [class => 'w88'] => [
        sub { 1 },
        \&replace_age
       ]
     ],
    cond_arg => [ $age ]
  );
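
The flow of a "highlander2()" call can be sketched in plain Perl as follows. This is a toy model under stated assumptions: branches are hashrefs rather than elements, the "find" callback and the function name "highlander2_sketch" are hypothetical, and a scalar key is assumed to be shorthand for an id look-down.

```perl
use strict;
use warnings;
use List::Util qw(first);

# Toy branches standing in for the child elements.
my @branches = (
    { class => 'r12', text => 'AGE-year old' },
    { class => 'z22', text => 'only AGE' },
    { class => 'w88', text => 'AGE years old' },
);

# Hypothetical lookup: find the branch whose attribute has the given value.
my $find = sub {
    my ($attr, $value) = @_;
    return first { defined($_->{$attr}) && $_->{$attr} eq $value } @branches;
};

my $replace_age = sub { my ($branch, $age) = @_; $branch->{text} =~ s/AGE/$age/; };

sub highlander2_sketch {
    my (%p) = @_;
    my @pairs = @{ $p{cond} };
    while (my ($key, $spec) = splice @pairs, 0, 2) {
        my ($test, $post) = @$spec;
        next unless $test->(@{ $p{cond_arg} });
        # Assumption: a plain scalar key means id => $key.
        my @criteria = ref $key eq 'ARRAY' ? @$key : (id => $key);
        my $branch   = $p{find}->(@criteria);
        $post->($branch, @{ $p{cond_arg} });   # post-process the survivor
        return $branch;
    }
    return;
}

my $survivor = highlander2_sketch(
    find => $find,
    cond => [
        [class => 'r12'] => [ sub { $_[0] < 10 }, $replace_age ],
        [class => 'z22'] => [ sub { $_[0] < 18 }, $replace_age ],
        [class => 'w88'] => [ sub { 1 },          $replace_age ],
    ],
    cond_arg => [ 15 ],
);
print $survivor->{text}, "\n";   # only 15
```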

$tree->overwrite_attr($mutation_attr => $mutating_closures)

This method is designed for taking a tree and reworking a set of nodes in a stereotyped fashion. For instance let's say you have 3 remote image archives, but you don't want to put long URLs in your img src tags for reasons of abstraction, re-use and brevity. So instead you do this:

  <img src="/img/smiley-face.jpg" fixup="src lnc">
  <img src="/img/hot-babe.jpg"    fixup="src playboy">
  <img src="/img/footer.jpg"      fixup="src foobar">

and then when the tree of HTML is being processed, you make this call:

  my %closures = (
     lnc     => sub { my ($tree, $mute_node, $attr_value)= @_; "http://lnc.usc.edu$attr_value" },
     playboy => sub { my ($tree, $mute_node, $attr_value)= @_; "http://playboy.com$attr_value" },
     foobar  => sub { my ($tree, $mute_node, $attr_value)= @_; "http://foobar.info$attr_value" }
  );
  $tree->overwrite_attr(fixup => \%closures) ;

and the tags come out modified like so:

  <img src="http://lnc.usc.edu/img/smiley-face.jpg" fixup="src lnc">
  <img src="http://playboy.com/img/hot-babe.jpg"    fixup="src playboy">
  <img src="http://foobar.info/img/footer.jpg"      fixup="src foobar">
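
Judging from the example, the "fixup" attribute value names the attribute to rewrite followed by the key of the closure to use. The plain-Perl sketch below models that on a hash of attributes. The helper "apply_fixup" is hypothetical and the parsing rule is inferred from the sample markup, not taken from the module source.

```perl
use strict;
use warnings;

my %closures = (
    lnc    => sub { my ($tree, $mute_node, $attr_value) = @_; "http://lnc.usc.edu$attr_value" },
    foobar => sub { my ($tree, $mute_node, $attr_value) = @_; "http://foobar.info$attr_value" },
);

# Hypothetical helper: $attrs is a hashref of attribute name => value.
sub apply_fixup {
    my ($attrs, $closures) = @_;
    my %updated = %$attrs;
    # Assumption: the fixup value is "<attribute name> <closure key>".
    my ($target_attr, $closure_name) = split ' ', $attrs->{fixup};
    my $closure = $closures->{$closure_name};
    return \%updated unless $closure;          # unknown key: leave attributes as they are
    $updated{$target_attr} = $closure->(undef, $attrs, $attrs->{$target_attr});
    return \%updated;
}

my $img = apply_fixup({ src => '/img/footer.jpg', fixup => 'src foobar' }, \%closures);
print $img->{src}, "\n";   # http://foobar.info/img/footer.jpg
```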

$tree->mute_elem($mutation_attr => $mutating_closures, [ $post_hook ] )

This is a generalization of "overwrite_attr". "overwrite_attr" assumes the return value of the closure is supposed to overwrite an attribute value and does it for you. "mute_elem" is a more general function which does nothing but hand the closure the element and let it mutate it as it jolly well pleases :)

In fact, here is the implementation of "overwrite_attr" to give you a taste of how "mute_elem" is used:

  sub overwrite_action {
    my ($mute_node, %X) = @_;
    $mute_node->attr($X{local_attr}{name} => $X{local_attr}{value}{new});
  }
  sub HTML::Element::overwrite_attr {
    my $tree = shift;
    $tree->mute_elem(@_, \&overwrite_action);
  }

Tree-Building Methods

Unrolling an array via a single sample element (<ul> container)

This is best described by example. Given this HTML:

 <strong>Here are the things I need from the store:</strong>
 <ul>
   <li class="store_items">Sample item</li>
 </ul>

We can unroll it like so:

  my $li = $tree->look_down(class => 'store_items');
  my @items = qw(bread butter vodka);
  $tree->iter($li => @items);

To produce this:

 <html>
  <head></head>
  <body>Here are the things I need from the store:
    <ul>
      <li class="store_items">bread</li>
      <li class="store_items">butter</li>
      <li class="store_items">vodka</li>
    </ul>
  </body>
 </html>

Now, you might be wondering why the API call is:

  $tree->iter($li => @items)

instead of:

  $li->iter(@items)

and there is no good answer. The latter would be more concise and it is what I should have done.

Unrolling an array via a single sample element and a callback (<ul> container)

This is a more advanced version of the previous method. Instead of cloning the sample element several times and calling "replace_content" on the clone with the array element, a custom callback is called with the clone and array element.

Here is the example from before.

 <strong>Here are the things I need from the store:</strong>
 <ul>
   <li class="store_items">Sample item</li>
 </ul>

Code:

  sub cb {
    my ($data, $li) = @_;
    $li->replace_content($data);
  }
  my $li = $tree->look_down(class => 'store_items');
  my @items = qw(bread butter vodka);
  $li->itercb(\@items, \&cb);

Output is as before:

 <html>
  <head></head>
  <body>Here are the things I need from the store:
    <ul>
      <li class="store_items">bread</li>
      <li class="store_items">butter</li>
      <li class="store_items">vodka</li>
    </ul>
  </body>
 </html>

Here is a more complex example (unrolling a table). HTML:

  <table><thead><th>First Name<th>Last Name<th>Option</thead>
  <tbody>
  <tr><td class="first">First<td class="last">Last<td class="option">1
  </tbody></table>

Code:

  sub tr_cb {
    my ($data, $tr) = @_;
    $tr->look_down(class => 'first')->replace_content($data->{first});
    $tr->look_down(class => 'last')->replace_content($data->{last});
    $tr->look_down(class => 'option')->replace_content($data->{option});
  }
  my @data = (
    {first => 'Foo', last => 'Bar', option => 2},
    {first => 'Bar', last => 'Bar', option => 3},
    {first => 'Baz', last => 'Bar', option => 4},
  );
  my $tr = $tree->find('table')->find('tbody')->find('tr');
  $tr->itercb(\@data, \&tr_cb);

Produces:

  <table><thead><th>First Name<th>Last Name<th>Option</thead>
  <tbody>
  <tr><td class="first">Foo<td class="last">Bar<td class="option">2
  <tr><td class="first">Bar<td class="last">Bar<td class="option">3
  <tr><td class="first">Baz<td class="last">Bar<td class="option">4
  </tbody></table>

Unrolling an array via n sample elements (<dl> container)

"iter()" was fine for a while, but some things (e.g. definition lists) need a more general function to make them easy to do. Hence "iter2()". This function will be explained by example of unrolling a simple definition list.

So here's our mock-up HTML from the designer:

  <dl class="dual_iter" id="service_plan">
    <dt>Artist</dt>
    <dd>A person who draws blood.</dd>
    <dt>Musician</dt>
    <dd>A clone of Iggy Pop.</dd>
    <dt>Poet</dt>
    <dd>A relative of Edgar Allan Poe.</dd>
    <dt class="adstyle">sample header</dt>
    <dd class="adstyle2">sample data</dd>
  </dl>

And we want to unroll our data set:

  my @items = (
    ['the pros'   => 'never have to worry about service again'],
    ['the cons'   => 'upfront extra charge on purchase'],
    ['our choice' => 'go with the extended service plan']
  );

Now, let's make this problem a bit harder to show off the power of "iter2()". Let's assume that we want only the last <dt> and its accompanying <dd> (the one with ``sample data'') to be used as the sample data for unrolling with our data set. Let's further assume that we want them to remain in the final output.

So now, the API to "iter2()" will be discussed and we will explain how our goal of getting our data into HTML fits into the API.

  • wrapper_ld

    This is how to look down and find the container of all the elements we will be unrolling. The <dl> tag is the container for the dt and dd tags we will be unrolling.

    If you pass an anonymous subroutine, then it is presumed that execution of this subroutine will return the HTML::Element representing the container tag. If you pass an array ref, then this will be dereferenced and passed to "HTML::Element::look_down()".

    default value: "['_tag' => 'dl']"

    Based on the mock HTML above, this default is fine for finding our container tag. So let's move on.

  • wrapper_data

    This is an array reference of data that we will be putting into the container. You must supply this. @items above is our "wrapper_data".

  • wrapper_proc

    After we find the container via "wrapper_ld", we may want to pre-process some aspect of this tree. In our case the first three sets of dt and dd need to be removed, leaving the last dt and dd. So, we supply a "wrapper_proc" which will do this.

    default: undef

  • item_ld

    This anonymous subroutine returns an array ref of "HTML::Element"s that will be cloned and populated with item data (item data is a ``row'' of "wrapper_data").

    default: returns an arrayref consisting of the dt and dd element inside the container.

  • item_data

    This is a subroutine that takes "wrapper_data" and retrieves one ``row'' to be ``pasted'' into the array ref of "HTML::Element"s found via "item_ld". I hope that makes sense.

    default: shifts "wrapper_data".

  • item_proc

    This is a subroutine that takes the "item_data" and the "HTML::Element"s found via "item_ld" and produces an arrayref of "HTML::Element"s which will eventually be spliced into the container.

    Note that this subroutine MUST return the new items. This is done so that more items than were passed in can be returned. This is useful when, for example, you must return 2 dts for an input data item. And when would you do this? When a single term has multiple spellings for instance.

    default: expects "item_data" to be an arrayref of two elements and "item_elems" to be an arrayref of two "HTML::Element"s. It replaces the content of the "HTML::Element"s with the "item_data".

  • splice

    After building up an array of @item_elems, the subroutine passed as "splice" will be given the parent container HTML::Element and the @item_elems. How the @item_elems end up in the container is up to this routine: it could put half of them in. It could unshift them or whatever.

    default: "$container->splice_content(0, 2, @item_elems)". In other words, replace the 2 sample elements with the newly generated @item_elems.

So now that we have documented the API, let's see the call we need:

 $tree->iter2(
  # default wrapper_ld ok.
  wrapper_data => \@items,
  wrapper_proc => sub {
    my ($container) = @_;
    # keep only the last dt and dd (the final two children)
    my @content_list = $container->content_list;
    $container->splice_content(0, @content_list - 2);
  },
  # default item_ld is fine.
  # default item_data is fine.
  # default item_proc is fine.
  splice       => sub {
    my ($container, @item_elems) = @_;
    $container->unshift_content(@item_elems);
  },
  debug => 1,
 );
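
One consequence of the default "item_data" is worth spelling out: since it shifts rows off "wrapper_data", the array you pass in is consumed. The sketch below models the data flow in plain Perl, with strings standing in for the cloned elements. The function name "unroll_definitions" is hypothetical, and this is an illustration of the documented defaults rather than the module's code.

```perl
use strict;
use warnings;

my @items = (
    ['the pros'   => 'never have to worry about service again'],
    ['the cons'   => 'upfront extra charge on purchase'],
    ['our choice' => 'go with the extended service plan'],
);

# Hypothetical model of the loop: take one row at a time and fill a dt/dd pair.
sub unroll_definitions {
    my ($wrapper_data) = @_;
    my @item_elems;
    while (my $row = shift @$wrapper_data) {   # default item_data idea: shift one row
        my ($term, $definition) = @$row;
        # stand-in for the default item_proc: one dt and one dd per row
        push @item_elems, "<dt>$term</dt>", "<dd>$definition</dd>";
    }
    return @item_elems;
}

my @copy  = @items;                            # shallow copy so @items survives the shifting
my @elems = unroll_definitions(\@copy);
print scalar(@elems), " elements from ", scalar(@items), " rows\n";   # 6 elements from 3 rows
```

If the data is needed again after the call, passing a copy as shown is a cheap precaution.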

Select Unrolling

The "unroll_select" method has this API:

   $tree->unroll_select(
      select_label    => $id_label,
      option_value    => $closure, # how to get option value from data row
      option_content  => $closure, # how to get option content from data row
      option_selected => $closure, # boolean to decide if SELECTED
      data            => $data,    # the data to be put into the SELECT
      data_iter       => $closure, # the thing that will get a row of data
      debug           => $boolean,
      append          => $boolean, # remove the sample <OPTION> data or append?
    );

Here's an example:

  $tree->unroll_select(
    select_label     => 'clan_list',
    option_value     => sub { my $row = shift; $row->clan_id },
    option_content   => sub { my $row = shift; $row->clan_name },
    option_selected  => sub { my $row = shift; $row->selected },
    data             => \@query_results,
    data_iter        => sub { my $data = shift; $data->next },
    append => 0,
    debug => 0
  );
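
The "data_iter" closure in the example calls "next" on the data, so the data is expected to be some kind of iterator object. If all you have is a plain list of rows, a small wrapper like the one sketched below would provide that interface. "ArrayIter" is a hypothetical helper written for this illustration, and the rows here are hashrefs rather than objects with accessor methods.

```perl
use strict;
use warnings;

package ArrayIter;   # hypothetical helper, not part of any CPAN distribution

sub new  { my ($class, @rows) = @_; return bless { rows => [@rows] }, $class }
sub next { my ($self) = @_; return shift @{ $self->{rows} } }   # undef when exhausted

package main;

my $data = ArrayIter->new(
    { clan_id => 1, clan_name => 'Campbell' },
    { clan_id => 2, clan_name => 'MacLeod'  },
);

# Same shape as the data_iter closure in the example above.
my $data_iter = sub { my $d = shift; $d->next };

my @labels;
while (my $row = $data_iter->($data)) {
    push @labels, "$row->{clan_id}:$row->{clan_name}";
}
print "@labels\n";   # 1:Campbell 2:MacLeod
```

With hashref rows, the option closures would read fields directly, for example "sub { my $row = shift; $row->{clan_id} }".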

Tree-Building Methods: Table Generation

Matthew Sisk has a much more intuitive (imperative) way to generate tables via his module HTML::ElementTable.

However, for those with callback fever, the following method is available. First, we look at a nuts and bolts way to build a table using only standard HTML::Tree API calls. Then the "table" method available here is discussed.

Sample Model

  package Simple::Class;
  use Set::Array;
  my @name   = qw(bob bill brian babette bobo bix);
  my @age    = qw(99  12   44    52      12   43);
  my @weight = qw(99  52   80   124     120  230);
  sub new {
    my $this = shift;
    bless {}, ref($this) || $this;
  }
  sub load_data {
    my @data;
    for (0 .. 5) {
      push @data, {
        age    => $age[rand @age] + int rand 20,
        name   => shift @name,
        weight => $weight[rand @weight] + int rand 40
      }
    }
    Set::Array->new(@data);
  }
  1;

Sample Usage:

  my $data = Simple::Class->load_data;
  ++$_->{age} for @$data;

Inline Code to Unroll a Table

HTML

  <html>
    <table id="load_data">
      <tr>  <th>name</th><th>age</th><th>weight</th> </tr>
      <tr id="iterate">
          <td id="name">   NATURE BOY RIC FLAIR  </td>
          <td id="age">    35                    </td>
          <td id="weight"> 220                   </td>
      </tr>
    </table>
  </html>

The manual way (*NOT* recommended)

  require 'simple-class.pl';
  use HTML::Seamstress;
  # load the view
  my $seamstress = HTML::Seamstress->new_from_file('simple.html');
  # load the model
  my $o = Simple::Class->new;
  my $data = $o->load_data;
  # find the <table> and <tr>
  my $table_node = $seamstress->look_down('id', 'load_data');
  my $iter_node  = $table_node->look_down('id', 'iterate');
  my $table_parent = $table_node->parent;
  # drop the sample <table> and <tr> from the HTML
  # only add them in if there is data in the model
  # this is achieved via the $add_table flag
  $table_node->detach;
  $iter_node->detach;
  my $add_table;
  # Get a row of model data
  while (my $row = shift @$data) {
    # We got row data. Set the flag indicating ok to hook the table into the HTML
    ++$add_table;
    # clone the sample <tr>
    my $new_iter_node = $iter_node->clone;
    # find the tags labeled name age and weight and
    # set their content to the row data
    $new_iter_node->content_handler($_ => $row->{$_})
     for qw(name age weight);
    $table_node->push_content($new_iter_node);
  }
  # reattach the table to the HTML tree if we loaded data into some table rows
  $table_parent->push_content($table_node) if $add_table;
  print $seamstress->as_HTML;

$tree->table() : API call to Unroll a Table

  require 'simple-class.pl';
  use HTML::Seamstress;
  # load the view
  my $seamstress = HTML::Seamstress->new_from_file('simple.html');
  # load the model
  my $o = Simple::Class->new;
  $seamstress->table
    (
     # tell seamstress where to find the table, via the method call
     # ->look_down('id', $gi_table). Seamstress detaches the table from the
     # HTML tree automatically if no table rows can be built
       gi_table    => 'load_data',
     # tell seamstress where to find the tr. This is a bit useless as
     # the <tr> usually can be found as the first child of the parent
       gi_tr       => 'iterate',
     # the model data to be pushed into the table
       table_data  => $o->load_data,
     # the way to take the model data and obtain one row
     # if the table data were a hashref, we would do:
     # my $key = (keys %$data)[0]; my $val = $data->{$key}; delete $data->{$key}
       tr_data     => sub {
         my ($self, $data) = @_;
         shift @{$data} ;
       },
     # the way to take a row of data and fill the <td> tags
       td_data     => sub {
         my ($tr_node, $tr_data) = @_;
         $tr_node->content_handler($_ => $tr_data->{$_})
           for qw(name age weight)
       }
    );
  print $seamstress->as_HTML;

Looping over Multiple Sample Rows

* HTML

  <html>
    <table id="load_data" CELLPADDING=8 BORDER=2>
    <tr>  <th>name</th><th>age</th><th>weight</th> </tr>
    <tr id="iterate1" BGCOLOR="white" >
      <td id="name">   NATURE BOY RIC FLAIR  </td>
      <td id="age">    35                    </td>
      <td id="weight"> 220                   </td>
    </tr>
    <tr id="iterate2" BGCOLOR="#CCCC99">
      <td id="name">   NATURE BOY RIC FLAIR  </td>
      <td id="age">    35                    </td>
      <td id="weight"> 220                   </td>
    </tr>
    </table>
  </html>

* Only one change to last API call.

This:

  gi_tr       => 'iterate',

becomes this:

  gi_tr       => ['iterate1', 'iterate2']
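
With several sample rows, each data row has to be paired with one of them. A natural way to do that, and the one assumed in this plain-Perl sketch, is to cycle through the samples by row index. Whether the module pairs rows in exactly this way is an assumption; the helper "sample_for_row" is hypothetical.

```perl
use strict;
use warnings;

my @sample_ids = ('iterate1', 'iterate2');

# Hypothetical helper: pick the sample row for the Nth data row (0-based).
sub sample_for_row {
    my ($row_index, @samples) = @_;
    return $samples[ $row_index % @samples ];
}

print join(' ', map { sample_for_row($_, @sample_ids) } 0 .. 4), "\n";
# iterate1 iterate2 iterate1 iterate2 iterate1
```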

$tree->table2() : New API Call to Unroll a Table

After 2 or 3 years with "table()", I began to develop production websites with it and decided it needed a cleaner interface, particularly in handling the fact that "id" attributes will be the same after cloning a table row.

First, I will give a dry listing of the function's argument parameters. This will not be educational most likely. A better way to understand how to use the function is to read through the incremental unrolling of the function's interface given in conversational style after the dry listing. But take your pick. It's the same information given in two different ways.

Dry/technical parameter documentation

"$tree->table2(%param)" takes the following arguments:

  • "table_ld => $look_down" : optional

    How to find the "table" element in $tree. If $look_down is an arrayref, then use "look_down". If it is a CODE ref, then call it, passing it $tree.

    Defaults to "['_tag' => 'table']" if not passed in.

  • "table_data => $tabular_data" : required

    The data to fill the table with. Must be passed in.

  • "table_proc => $code_ref" : not implemented

    A subroutine to do something to the table once it is found. Not currently implemented. Not obviously necessary. Just created because there is a "tr_proc" and "td_proc".

  • "tr_ld => $look_down" : optional

    Same as "table_ld" but for finding the table row elements. Please note that the "tr_ld" is done on the table node that was found instead of the whole HTML tree. This makes sense. The "tr"s that you want exist below the table that was just found.

    Defaults to "['_tag' => 'tr']" if not passed in.

  • "tr_data => $code_ref" : optional

    How to take the "table_data" and return a row. Defaults to:

      sub { my ($self, $data) = @_;
        shift(@{$data}) ;
      }
    
  • "tr_proc => $code_ref" : optional

    Something to do to the table row we are about to add to the table we are making. Defaults to a routine which makes the "id" attribute unique:

      sub {
        my ($self, $tr, $tr_data, $tr_base_id, $row_count) = @_;
        $tr->attr(id => sprintf "%s_%d", $tr_base_id, $row_count);
      }
    
  • "td_proc => $code_ref" : required

    This coderef will take the row of data and operate on the "td" cells that are children of the "tr". See "t/table2.t" for several usage examples.

    Here's a sample one:

      sub {
        my ($tr, $data) = @_;
        my @td = $tr->look_down('_tag' => 'td');
        for my $i (0..$#td) {
          $td[$i]->splice_content(0, 1, $data->[$i]);
        }
      }
    

Conversational parameter documentation

The first thing you need is a table. So we need a look down for that. If you don't give one, it defaults to

  ['_tag' => 'table']

What good is a table without data to display?! So you must supply a scalar representing your tabular data source. This scalar might be an array reference, a "next"able iterator, or a DBI statement handle. Whatever it is, it can be iterated through to build up rows of table data. These two fields (the way to find the table and the data to display in the table) are "table_ld" and "table_data" respectively; only "table_data" is required. A little more on "table_ld": if this happens to be a CODE ref, then execution of the code ref is presumed to return the "HTML::Element" representing the table in the HTML tree.

Next, we get the row or rows which serve as sample "tr" elements by doing a "look_down" from the table element. While normally one sample row is enough to unroll a table, consider when you have alternating table rows. This API call would need one of each row so that it can cycle through the sample rows as it loops through the data. Alternatively, you could always just use one row and make the necessary changes to the single "tr" row by mutating the element in "tr_proc", discussed below. The default "tr_ld" is "['_tag' => 'tr']" but you can override it. Note well, if you override it with a subroutine, then it is expected that the subroutine will return the "HTML::Element"(s) which are "tr" element(s). The reason a subroutine might be preferred is in the case that the HTML designers gave you 8 sample "tr" rows but only one prototype row is needed. So you can write a subroutine to splice out the 7 rows you don't need and leave the one sample row remaining so that this API call can clone it and supply it to the "tr_proc" and "td_proc" calls.

Now, as we move through the table rows with table data, we need to do two different things on each table row:

  • get one row of data from the "table_data" via "tr_data"

    The default procedure assumes the "table_data" is an array reference and shifts a row off of it:

      sub {
         my ($self, $data) = @_;
         shift @{$data};
      }
    

    Your function MUST return undef when there are no more rows to lay out.

  • take the "tr" element and mutate it via "tr_proc"

    The default procedure simply makes the id of the table row unique:

      sub {
        my ($self, $tr, $tr_data, $row_count, $root_id) = @_;
        $tr->attr(id => sprintf "%s_%d", $root_id, $row_count);
      }
    

Now that we have our row of data, we call "td_proc" so that it can take the data and the "td" cells in this "tr" and process them. This function must be supplied.
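
As a sketch of how a non-default "tr_data" might look, the following plain-Perl example reads rows from a closure-based iterator instead of an array reference, and builds row ids with the same sprintf format as the default "tr_proc". The helper names make_iterator and row_id are hypothetical, and starting the row counter at 1 is an assumption.

```perl
use strict;
use warnings;

# Hypothetical data source: a closure standing in for a "next"able
# iterator or a DBI statement handle.
sub make_iterator {
    my @rows = @_;
    return sub { return shift @rows };    # yields undef once exhausted
}

# A tr_data-style callback with the same ($self, $data) signature as the
# default, except that it calls the iterator rather than shifting an array.
my $tr_data = sub {
    my ($self, $data) = @_;
    return $data->();
};

# Builds a row id the way the default tr_proc formats it.
sub row_id {
    my ($root_id, $row_count) = @_;
    return sprintf "%s_%d", $root_id, $row_count;
}

my $iter = make_iterator(
    { name => 'bob',  age => 99 },
    { name => 'mary', age => 32 },
);

my @ids;
my $row_count = 0;
# Loop until tr_data signals the end of the data by returning undef.
while ( defined( my $row = $tr_data->( undef, $iter ) ) ) {
    push @ids, row_id( 'load_data', ++$row_count );
}
print join( ',', @ids ), "\n";
```

The loop depends on the undef contract stated above: a callback that never returns undef would never terminate.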

Whither a Table with No Rows

Often when a table has no rows, we want to display a message indicating this to the viewer. Use conditional processing to decide what to display:

  <span id=no_data>
    <table><tr><td>No Data is Good Data</td></tr></table>
  </span>
  <span id=load_data>
     <html>
       <table id="load_data">
         <tr>  <th>name</th><th>age</th><th>weight</th> </tr>
         <tr id="iterate">
           <td id="name">   NATURE BOY RIC FLAIR  </td>
           <td id="age">    35                    </td>
           <td id="weight"> 220                   </td>
         </tr>
       </table>
     </html>
  </span>

Tree-Killing Methods

$tree->prune

This removes any nodes from the tree which consist of nothing or nothing but whitespace. See also delete_ignorable_whitespace in HTML::Element.
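
The idea of dropping nodes that hold nothing but whitespace can be illustrated on a plain nested array. This is a minimal sketch under stated assumptions, not the library's implementation: prune_lol is a hypothetical helper, and the [ tag, children... ] structure is only a stand-in for a real HTML::Element tree.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a tree: [ tag, child, child, ... ], where each
# child is either a text string or another such array reference.
sub prune_lol {
    my ($node) = @_;
    my ( $tag, @kids ) = @$node;
    my @kept;
    for my $kid (@kids) {
        if ( ref $kid eq 'ARRAY' ) {
            my $pruned = prune_lol($kid);
            # Keep a child node only if something is left inside it.
            push @kept, $pruned if @$pruned > 1;
        }
        elsif ( $kid =~ /\S/ ) {
            # Keep text that has at least one non-whitespace character.
            push @kept, $kid;
        }
    }
    return [ $tag, @kept ];
}

my $pruned = prune_lol(
    [ 'div', "  \n", [ 'p', 'hello' ], [ 'span', ' ' ], ['b'] ]
);
```

Here only the "p" child survives; the whitespace text, the whitespace-only "span" and the empty "b" are dropped.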

Loltree Functions

A loltree is an arrayref consisting of arrayrefs which is used by "new_from_lol" in HTML::Element to produce HTML trees. The CPAN distro XML::Element::Tolol creates such XML trees by parsing XML files, analogous to XML::Toolkit. The purpose of the functions in this section is to allow you to manipulate a loltree programmatically.

These could not be methods because if you bless a loltree, then HTML::Tree will barf.

HTML::Element::newchild($lol, $parent_label, @newchild)

Given this initial loltree:

  my $initial_lol = [ note => [ shopping => [ item => 'sample' ] ] ];

This code:

  sub shopping_items {
    my @shopping_items = map { [ item => $_ ] } qw(bread butter beans);
    @shopping_items;
  }
  my $new_lol = HTML::Element::newchild($initial_lol, shopping => shopping_items());

will replace the single sample with a list of shopping items:

  [
    'note',
    [
      'shopping',
      [
        'item',
        'bread'
      ],
      [
        'item',
        'butter'
      ],
      [
        'item',
        'beans'
      ]
    ]
  ];
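
To make the rewriting step concrete, here is a dependency-free sketch of the same kind of replacement. It is not the library's implementation: replace_children is a hypothetical helper that returns a copy of the loltree in which every node carrying the given label gets the new children.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of HTML::Element::Library.
sub replace_children {
    my ( $lol, $parent_label, @newchild ) = @_;

    # Text nodes (and anything else that is not an arrayref) pass through.
    return $lol unless ref $lol eq 'ARRAY';

    my ( $label, @kids ) = @$lol;

    # Matching node: keep the label, swap in the new children.
    return [ $label, @newchild ] if $label eq $parent_label;

    # Otherwise rebuild this node and recurse into its children.
    return [ $label,
        map { replace_children( $_, $parent_label, @newchild ) } @kids ];
}

my $initial_lol = [ note => [ shopping => [ item => 'sample' ] ] ];
my @items       = map { [ item => $_ ] } qw(bread butter beans);
my $new_lol     = replace_children( $initial_lol, shopping => @items );
```

Because the helper builds new arrays instead of modifying them in place, $initial_lol is left untouched and nothing needs to be blessed.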

Thanks to kcott and the other Perlmonks in this thread: http://www.perlmonks.org/?node_id=912416

SEE ALSO

HTML::Tree

A perl package for creating and manipulating HTML trees.

HTML::ElementTable

An HTML::Tree-based module which allows for manipulation of HTML trees using Cartesian coordinates.

HTML::Seamstress

An HTML::Tree-based module inspired by XMLC (<http://xmlc.enhydra.org>), allowing for dynamic HTML generation via tree rewriting.

Push-style templating systems

A comprehensive cross-language list of push-style templating systems <http://perlmonks.org/?node_id=674225>.

TODO

  • highlander2

    Currently the API expects the subtrees that survive or are pruned to be identified by id:

      $if_then->highlander2([
        under10 => sub { $_[0] < 10} ,
        under18 => sub { $_[0] < 18} ,
        welcome => [
          sub { 1 },
          sub {
            my $branch = shift;
            $branch->look_down(id => 'age')->replace_content($age);
          }
         ]
       ], $age);
    

    But it should be more flexible. "under10" and "under18" are expected to be ids in the tree, but it would not be hard to check whether this field is an array reference and, if it is, to do a look_down instead:

      $if_then->highlander2([
        [class => 'under10'] => sub { $_[0] < 10} ,
        [class => 'under18'] => sub { $_[0] < 18} ,
        [class => 'welcome'] => [
          sub { 1 },
          sub {
            my $branch = shift;
            $branch->look_down(id => 'age')->replace_content($age);
          }
         ]
       ], $age);
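
    The array-reference check proposed above could be as small as the following sketch. The helper name look_down_args is hypothetical and nothing like it exists in the library yet. It only normalizes a branch key into a list of "look_down" criteria.

```perl
use strict;
use warnings;

# Hypothetical helper for the proposed behaviour: a plain string is treated
# as an id, while an array reference is passed through as look_down criteria.
sub look_down_args {
    my ($key) = @_;
    return ref $key eq 'ARRAY' ? @$key : ( id => $key );
}

my @by_id    = look_down_args('under10');                # (id => 'under10')
my @by_class = look_down_args( [ class => 'under10' ] ); # (class => 'under10')
```

    Existing callers that pass plain ids would keep working, since a string still maps to an id lookup.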
    

AUTHOR

Original author Terrence Brannon, <[email protected]>.

Adopted by Marius Gavrilescu "<[email protected]>".

I appreciate the feedback from M. David Moussa Leo Keita regarding some issues with the test suite, namely (1) CRLF leading to test breakage in t/crunch.t and (2) using the wrong module in t/prune.t thus not having the right functionality available.

Many thanks to BARBIE for his RT bug report.

Many thanks to perlmonk kcott for his work on array rewriting: <http://www.perlmonks.org/?node_id=912416>. It was crucial in the development of newchild.

COPYRIGHT AND LICENSE

Copyright (C) 2014-2016 by Marius Gavrilescu

Copyright (C) 2004-2012 by Terrence Brannon

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.