HTML::Dashboard(3) Spreadsheet-like formatting for HTML tables, with data-dependent coloring and highlighting: formatted reports

SYNOPSIS


  use HTML::Dashboard;

  my $dash = HTML::Dashboard->new();

  $dash->set_data_without_captions( [ [ 'A', 2, 'foo' ],
                                      [ 'B', 0, 'bar' ],
                                      [ 'C', 1, 'baz' ],
                                      [ 'D', 8, 'mog' ],
                                      [ 'E', 4, 'duh' ] ] );

  $dash->set_captions( qw( Code Number Name ) );

  $dash->set_cell_low( 1, sub { $_[0] < 1 }, 'lime' );
  $dash->set_cell_hi(  1, sub { $_[0] > 5 },
                       style => "background-color: red; font-weight: bold" );

  print $dash->as_HTML();

DESCRIPTION

This module tries to achieve spreadsheet-like formatting for HTML tables.

Rather than having to build up an HTML table from data, row by row and cell by cell, applying formatting rules at every step, this module allows the user to specify a set of simple rules with the desired formatting options. The module will evaluate the rules and apply the formatting options as necessary.

The following features are supported:

  • User-defined formatting of first, last, even, and odd rows or columns.
  • Conditional formatting, based on the contents of each cell.
  • Sorting (on any column or combination of columns, with user-defined sort order).
  • Pagination of the data set.
  • Definition of ``views'', i.e. restriction of the set of columns shown.
  • User-defined column captions.
  • On-the-fly formatting and collating of the data.

As an example, the code in the synopsis above yields an HTML table, which is displayed only in the HTML version of this documentation.

More examples can be found on the author's project page: http://www.beyondcode.org/projects/dashboard/gallery.html

Please read the Rationale section below to understand the purpose of this module.

PUBLIC MEMBER FUNCTIONS

Constructor

HTML::Dashboard->new()
Constructs a new dashboard object. By default, this generates an HTML table with "border='1'" and sets the background color of all even rows to light grey (#eeeeee). These defaults can be overridden (cf. below).

Setting Data

$dash->set_data_without_captions( $data )
$dash->set_data_with_captions( $data )
Takes a reference to an array of array references of rows (i.e. a two-dimensional array). All rows must contain the same number of columns.

Use "set_data_without_captions" if the array contains only data, without captions. Use "set_data_with_captions" if the array contains captions in the first row (as is common, e.g., for data returned from database queries). Captions can be specified or overridden using "set_captions" (cf. below).

The data set is only accessed by reference, i.e. it is not copied. This should be advantageous for large data sets, but will lead to strange results if the data set changes after having been set, but before any one of the output routines is called.
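
If the data may change between setting it and rendering, one defensive option is to hand the dashboard a private copy of the rows. The following is a minimal sketch in core Perl; the helper name "snapshot_rows" is hypothetical and not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Dashboard): copy a two-dimensional
# array one level deep, so that later changes to the original rows are not
# visible through the copy.
sub snapshot_rows {
    my ( $data ) = @_;
    return [ map { [ @$_ ] } @$data ];
}

my $data = [ [ 'A', 2, 'foo' ], [ 'B', 0, 'bar' ] ];
my $copy = snapshot_rows( $data );

# Changing the original afterwards leaves the copy untouched.
$data->[0][1] = 99;
push @$data, [ 'C', 1, 'baz' ];

# The copy could then be handed over with
# $dash->set_data_without_captions( $copy );
```

The copy costs memory proportional to the data set, so it gives up the advantage of access by reference; it is only worthwhile when the data cannot be kept stable until output.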

Output

$dash->as_text()
$dash->as_text( $page )
Returns the data as a tab-delimited text string, after content formatters (or collaters), sorting, views, and pagination have been applied. No other formatting directives (e.g. odd/even rows, or hi/med/low triggers) are applied. The string will include captions (if they have been set).

In the resulting text string, columns are separated by tabs (\t), rows are separated by single newlines (\n). Tabs, newlines, and backslashes in the data are escaped through a preceding backslash (\).
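
A consumer of this text format has to undo the escaping when splitting the string into cells. The sketch below shows one way to do that. It assumes that an escaped character is the literal character preceded by a backslash (one reading of the description above); the helper name "parse_dashboard_text" is hypothetical and not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Dashboard): split text in the format
# described above back into rows of cells. ASSUMPTION: an escaped character
# is the literal character preceded by a backslash.
sub parse_dashboard_text {
    my ( $text ) = @_;
    my @rows;
    my @cells   = ( '' );
    my $escaped = 0;
    for my $ch ( split //, $text ) {
        if ( $escaped ) {                 # previous character was a backslash
            $cells[-1] .= $ch;            # take this one literally
            $escaped = 0;
        }
        elsif ( $ch eq "\\" ) { $escaped = 1 }
        elsif ( $ch eq "\t" ) { push @cells, '' }      # column separator
        elsif ( $ch eq "\n" ) {                        # row separator
            push @rows, [ @cells ];
            @cells = ( '' );
        }
        else { $cells[-1] .= $ch }
    }
    # Keep a final row that is not terminated by a newline.
    push @rows, [ @cells ] if @cells > 1 || $cells[0] ne '';
    return \@rows;
}

# Second row contains an escaped tab, third row an escaped backslash.
my $rows = parse_dashboard_text( "Code\tName\nA\tfoo\\\tbar\nB\tc:\\\\tmp\n" );
```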

$dash->as_HTML()
$dash->as_HTML( $page )
Returns the data as a single HTML string. The string contains an HTML table, from the opening "<table>" to the closing "</table>" tag.

No HTML-escaping of data (i.e. of cell content) is performed. If required, specify an appropriate formatter for the data to perform any conversions.
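
A formatter that performs the escaping might look like the following sketch. The function name "escape_html" is hypothetical and not part of this module; it covers only the four characters most commonly escaped.

```perl
use strict;
use warnings;

# Hypothetical formatter (not part of HTML::Dashboard): escape the characters
# that are special in HTML. The ampersand must be replaced first, so that the
# entities produced by the later substitutions are not escaped again.
sub escape_html {
    my ( $text ) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

my $safe = escape_html( 'R&D <b>"team"</b>' );

# Registered as the formatter for column 2, for example:
# $dash->set_format( 2, \&escape_html );
```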

Both functions can be called with an optional integer argument. If no argument is supplied, all rows are returned. If an integer argument in the range

  0 <= $page < $dash->get_pagecount()

is supplied, only the rows in the specified page (plus captions, if any) are returned. If a page outside the legal range is specified, a warning is emitted and all rows are returned. (Do not forget to call "$dash->set_pagesize(...)" before using this feature. By default, the pagesize is set to infinity, i.e. all rows are returned.)
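
The page arithmetic can be illustrated without the module. The sketch below is a hypothetical model, not the module's own code: it assumes that pages are filled in order, that the last page may be shorter, and that the page count is the number of rows divided by the pagesize, rounded up.

```perl
use strict;
use warnings;
use POSIX qw( ceil );

# Hypothetical model (not the module's code) of the page arithmetic described
# above: pages are numbered from 0, and a page outside the legal range yields
# all rows.
sub rows_on_page {
    my ( $rows, $pagesize, $page ) = @_;
    my $pagecount = ceil( @$rows / $pagesize );
    return @$rows unless defined $page && $page >= 0 && $page < $pagecount;
    my $first = $page * $pagesize;
    my $last  = $first + $pagesize - 1;
    $last = $#$rows if $last > $#$rows;      # the last page may be short
    return @{$rows}[ $first .. $last ];
}

my @rows  = ( 'A' .. 'E' );
my @page0 = rows_on_page( \@rows, 2, 0 );    # first page
my @page2 = rows_on_page( \@rows, 2, 2 );    # last, short page
my @all   = rows_on_page( \@rows, 2, 5 );    # out of range: all rows

# With the module itself, a loop over all pages would be along these lines:
# $dash->set_pagesize( 2 );
# print $dash->as_HTML( $_ ) for 0 .. $dash->get_pagecount() - 1;
```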

Captions, Pagination, Views, Sorting

$dash->set_captions( @captions )
$array_ref = $dash->get_captions()
Sets captions for the columns. The captions will be rendered on every page (if pagination is used), using "<th>" tags. The number of captions provided must match the number of columns in the data. If captions have been set explicitly using this function, these captions will be used, even if the data itself contains captions in the first row (i.e. if the data has been set using "set_data_with_captions()").

$dash->set_pagesize( $rows_per_page )
$rows_per_page = $dash->get_pagesize()
$pages = $dash->get_pagecount()
Restricts the number of data rows per page (i.e. not counting captions). Setting the pagesize to anything but a positive integer turns pagination off, so that all rows will be returned.

$dash->set_view( @column_indices )
$array_ref = $dash->get_view()
The set of columns shown can be restricted using "set_view()". This function takes a list of column indices (0 .. $num_of_cols - 1) to be shown. Defaults to all columns.

$dash->set_sort( sub { ... } )
Sets a comparator routine which will be used to sort the rows before rendering them. The comparator routine will be given two rows (as array references) and must return ``an integer less than, equal to, or greater than 0'', depending on how the rows are to be ordered (cf. Camel, entry on "sort"). Entire rows are passed to the comparator, before views (if any) are applied.

Note that the comparator will be called as a regular routine! This implies in particular that the comparator must parse @_ itself - arguments will not be passed through the ``global'' variables $a and $b as for the "sort" built-in.

Example:

  $dash->set_sort( sub { my ( $x, $y ) = @_; $x->[0] <=> $y->[0] } )

This sorts the rows numerically on the contents of the first column.
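
Sorting on a combination of columns follows the same pattern. The sketch below defines a comparator in the calling convention described above and checks it with the "sort" built-in, so it runs without the module; the variable names and sample rows are made up for illustration.

```perl
use strict;
use warnings;

# Comparator in the calling convention described above: both rows arrive in
# @_ as array references. Sort by column 1 (numeric, descending), then by
# column 2 (string, ascending) to break ties.
my $by_number_then_name = sub {
    my ( $x, $y ) = @_;
    return $y->[1] <=> $x->[1]
        || $x->[2] cmp $y->[2];
};

my @rows = ( [ 'A', 2, 'foo' ], [ 'B', 8, 'bar' ], [ 'C', 2, 'baz' ] );

# Stand-alone check using the sort built-in; with the module one would call
# $dash->set_sort( $by_number_then_name ) instead.
my @sorted = sort { $by_number_then_name->( $a, $b ) } @rows;
my @codes  = map { $_->[0] } @sorted;
```

Because the comparator reads its arguments from @_, the same code reference works both inside a "sort" block wrapper, as here, and when passed to "set_sort()".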

Formatting Options

There are three groups of formatting options:
  • Options applied to plain HTML tags (i.e. the "<table>", "<tr>", "<th>", and "<td>" tags).
  • Options to generate ``striped reports'' (i.e. tables, where the formatting is dependent on the row- or column-index).
  • Options which are only applied when a data-dependent condition is fulfilled.

The last group is more complicated, because not only do the actual formatting options have to be set, but also the ``trigger'' and the range of table cells to which it is supposed to be applied.

Formatting options can be set in three different ways:

  1. Single argument: e.g. "$dash->set_table( "border='1'" )" or "$dash->set_first_row( 'red' )".
  2. As an explicit CSS style directive: e.g. "$dash->set_th( style => 'font-size: x-large' )" or "$dash->set_even_row( style => 'background-color: yellow' )".
  3. By naming a CSS class: e.g. "$dash->set_td( class => 'highlighted' )" or "$dash->set_even_col( class => 'evencol' )". (Obviously, the class set in this way should be defined in a stylesheet referenced by the HTML page containing the dashboard.)

When using the ``style'' and ``class'' methods, a ``style'' or ``class'' argument is added to the appropriate HTML tags and set to the supplied value. Note that repeated calls to these functions are additive, not exclusive. In other words, the two calls

  $dash->set_even_row( style => 'background-color: yellow' );
  $dash->set_even_row( style => 'font-size: x-large' );

are equivalent to the single call:

  $dash->set_even_row( style => 'background-color: yellow; font-size: x-large' );

(The module will supply semicolons between different style directives when merging the results from repeated calls.)

To erase previous style directives, assign "undef" explicitly: "$dash->set_even_row( style => undef )".

The single-argument version is intended as a short-cut and has a slightly different meaning, depending on the group of formatting option it is applied to. When applied to a direct HTML option (i.e. when used with "set_table()", "set_tr()", "set_th()", or "set_td()"), the argument is pasted unmodified into the corresponding HTML tag. When used with any other option, the argument is interpreted as the desired background color for the cell, row, or column. The specified background color will be applied as an explicit ``style'' argument, not as a ``bgcolor'' argument. In other words, the following calls are (almost) equivalent:

  $dash->set_first_row( 'cyan' );
  $dash->set_first_row( style => 'background-color: cyan' );

Setting conflicting formatting options is legal and does not prevent the generation of HTML output. However, no guarantees are made about the appearance of the dashboard in the browser in this case.

In the following, "[format]" always stands for formatting options in any one of the three legal syntax variants discussed above.

General HTML Options

$dash->set_table( "[format]" )
$dash->set_tr( "[format]" )
$dash->set_th( "[format]" )
$dash->set_td( "[format]" )
$hash_ref = $dash->get_table()
$hash_ref = $dash->get_tr()
$hash_ref = $dash->get_th()
$hash_ref = $dash->get_td()
If set, these options are always included in every tag of the corresponding type. This is mostly useful to style the entire table, or the cells in the header row.

Striped Reports

$dash->set_first_row( "[format]" )
$dash->set_odd_row( "[format]" )
$dash->set_even_row( "[format]" )
$dash->set_last_row( "[format]" )
$hash_ref = $dash->get_first_row()
$hash_ref = $dash->get_odd_row()
$hash_ref = $dash->get_even_row()
$hash_ref = $dash->get_last_row()
$dash->set_first_col( "[format]" )
$dash->set_odd_col( "[format]" )
$dash->set_even_col( "[format]" )
$dash->set_last_col( "[format]" )
$hash_ref = $dash->get_first_col()
$hash_ref = $dash->get_odd_col()
$hash_ref = $dash->get_even_col()
$hash_ref = $dash->get_last_col()
Options set with these functions are applied to rows or columns as appropriate. Note that first, last, even, and odd are understood with reference to the page or the view, not the total data set.

Options for first and last prevail over options for even and odd. Options for columns prevail over options for rows.

Conditional Formatting (Triggers)

Formatting options in this group are only applied if a ``trigger'' evaluates to true. Therefore, the functions below all take a function reference as argument, besides the actual formatting options.

All triggers have a ``priority'' from highest (hi), over intermediate (med) to lowest (low). If multiple triggers evaluate to true for a certain part of the dashboard (say, a cell), then only the formatting option with the highest priority is applied.

The intended application is to show whether a set of data is ``in the green'' or ``in the red''. Given the prioritization logic of the triggers, this can be easily achieved, without the need for exclusive bounds or conditions across the set of triggers, using code like this:

  $dash->set_row_low( sub{ ...; $x >= 0 }, 'green' );
  $dash->set_row_med( sub{ ...; $x >= 3 }, 'yellow' );
  $dash->set_row_hi(  sub{ ...; $x >= 7 }, 'red' );

$dash->set_row_hi( sub{ my ( $row_ref ) = @_; ... }, "[format]" )
$dash->set_row_med( sub{ my ( $row_ref ) = @_; ... }, "[format]" )
$dash->set_row_low( sub{ my ( $row_ref ) = @_; ... }, "[format]" )
If the trigger evaluates to true, the formatting option is applied to the entire row. The argument to the trigger is an array-ref to the current row. (Additional arguments: index of row in page, and index of row in data set.)

$dash->set_col_hi( $col, sub{ my ( $cell ) = @_; ... }, "[format]" )
$dash->set_col_med( $col, sub{ my ( $cell ) = @_; ... }, "[format]" )
$dash->set_col_low( $col, sub{ my ( $cell ) = @_; ... }, "[format]" )
The first argument to this function is the index of the column in the data set (not in the view!) to which the formatting should be applied. If the trigger evaluates to true, the formatting option is applied to all cells in the column. The argument to the trigger is the contents of the current cell in the specified column. (Additional arguments: the index in the view and in the data set.)

$dash->set_cell_hi( $col, sub{ my ( $cell ) = @_; ... }, "[format]" )
$dash->set_cell_med( $col, sub{ my ( $cell ) = @_; ... }, "[format]" )
$dash->set_cell_low( $col, sub{ my ( $cell ) = @_; ... }, "[format]" )
The first argument to this function is the index of the column in the data set (not in the view!) to which the formatting should be applied. If the trigger evaluates to true, the formatting option is applied to the current cell only. The argument to the trigger is the contents of the current cell in the specified column. (Additional arguments: the index in the view and in the data set.)

Options set with triggers are merged (do not clobber) with options set for first/last and even/odd. (This allows one to have a striped report, and use triggers to change the text color only.)

Options with high (hi) priority prevail over (clobber) options with intermediate (med) priority, which prevail over options with low priority. Options for cells prevail over options for columns, which prevail over options for rows.
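
The priority rule for triggers of the same kind can be illustrated without the module. In the sketch below, the three triggers use the documented calling convention for row triggers; the thresholds, the colors, and the helper "winning_color" are made up for illustration, and the helper is a hypothetical model, not the module's own code.

```perl
use strict;
use warnings;

# Row triggers in the documented calling convention: the first argument is an
# array reference to the current row. The thresholds are example values.
my %trigger = (
    low => sub { my ( $row ) = @_; $row->[1] >= 0 },
    med => sub { my ( $row ) = @_; $row->[1] >= 3 },
    hi  => sub { my ( $row ) = @_; $row->[1] >= 7 },
);
my %color = ( low => 'green', med => 'yellow', hi => 'red' );

# Hypothetical model (not the module's code) of the priority rule: of all
# triggers that fire, only the one with the highest priority takes effect.
sub winning_color {
    my ( $row ) = @_;
    for my $level ( qw( hi med low ) ) {
        return $color{$level} if $trigger{$level}->( $row );
    }
    return;    # no trigger fired
}

my @rows   = ( [ 'A', 2 ], [ 'B', 5 ], [ 'C', 8 ] );
my @colors = map { winning_color( $_ ) } @rows;

# With the module, the same triggers would be registered along these lines:
# $dash->set_row_low( $trigger{low}, 'green' );
# $dash->set_row_med( $trigger{med}, 'yellow' );
# $dash->set_row_hi(  $trigger{hi},  'red' );
```

Note that the conditions overlap on purpose: a row with the value 8 satisfies all three triggers, and the priority rule selects the highest one.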

Content Formatters

$dash->set_format( $column, sub { ... } )
$dash->set_collate( $column, sub { ... } )
If set, the registered function is called for each row. Its output is used as contents for the current row's cell in the column with index $column.

A formatter set with the first function is given the contents of the data in the current cell, while a collater set with the second function is given the entire row (as array).

Examples:

  $dash->set_format( 1, sub { my ( $x ) = @_; sprintf( "%.2f", $x ) } )
  $dash->set_collate( 1, sub { my @r = @_; $r[1] . ':' . $r[2] } )

RATIONALE

It was important to me to define a module that would be easy to use, with reasonable defaults and a reasonably small API.

In particular, I wanted a solution which would free the user entirely from having to deal with (i.e. explicitly loop over) individual rows and cells. Furthermore, the user should not have to specify information that is already present in the data (such as the number of rows and columns). Finally, I wanted to free the user from having to address individual cells (e.g. by their location) to provide formatting instructions.

All this required a rule-based system --- you specify the high-level rules, the module makes sure they are applied as necessary.

Below are some further questions that have been asked --- with answers:

Why not just use CSS? Answer: All of this is done through CSS. The difficulty is deciding to which cells to apply the CSS style directives (if this is to be done in a data dependent manner). This module does just that, by inserting the correct CSS ``class'' arguments into the appropriate cell tags (etc).

Why not go with a templating solution? Answer: Templates establish the layout of a table from the outset, which makes it hard to do cell-content-dependent formatting from within the template. And it is simply not convenient, and not in the spirit of the thing, to build templates with lots of conditional code in the template. (I know, having used e.g. "HTML::Template" quite extensively.) Given the data-dependent nature of the problem, the table must be built up row by row and cell by cell individually, applying triggers and formatters as we go along. This is what this module does --- and since we must touch each cell individually anyway, we might as well print its HTML as we go along. Using templates in the implementation would not help.

Why not use Excel, PDF, or what have you? Because I want to deliver my reports via the web, so I specifically want HTML output. (Duh!)

Why the name? Because I wanted something more specific and tangible than ``FormattedReport'' or some such. The name points to the source of the idea for this module: corporate metrics dashboards. What managers want to see are the key metrics of the business (sales, orders, what-have-you), with outliers highlighted to make it easy to see which metrics are ``in the green'' and which are ``in the red''. This module allows you to do just that. (And more.)

TO DO

Several ideas:
  • Instead of setting the actual data, it would be nice to set merely a query (and a DB handle) and let the dashboard pull its own data from the DB.
  • When consecutive rows have identical entries in some columns, it can be neat to suppress (leave blank) the repeated entries (e.g. "set_skip_repeats( @skip_cols )" and "get_skip_repeats()").
  • When setting data using an array-ref, it would be nice to specify an optional integer parameter $extend_by, which would extend the range of accessible columns. These new columns would be empty, but could be used with "set_collate()" to build new column values on the fly. (This is never necessary when using a DB query, since one can always include constants in the "SELECT" clause.)

AUTHOR

Philipp K. Janert, <janert at ieee dot org>, http://www.beyondcode.org

COPYRIGHT AND LICENSE

Copyright (C) 2007 by Philipp K. Janert

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.