HTML::FormatText::WithLinks::AndTables(3) Converts HTML to Text with tables intact


    use HTML::FormatText::WithLinks::AndTables;
    my $text = HTML::FormatText::WithLinks::AndTables->convert($html);

Or optionally...

    my $conf = { # same as HTML::FormatText, except for the options below
        cellpadding   => 2,  # defaults to 1
        no_rowspacing => 1,  # bool, suppress vertical space between table rows
    };
    my $text = HTML::FormatText::WithLinks::AndTables->convert($html, $conf);


This module was inspired by HTML::FormatText::WithLinks, which has proven to be a useful `lynx -dump` work-alike. One frustration, however, was that no other HTML converter I came across could deal effectively with HTML <TABLE>s. This module can do so in a rudimentary sense. The aim was to take a simple HTML-based email template and convert it to text with the <TABLE> structure intact, for inclusion as "multipart/alternative" content. It preserves the formatting specified by the <TD> tag's "align" attribute, and it preserves multiline text inside a <TD> element, provided the lines are broken using <BR/> tags.


Given the HTML below ...

    <TABLE>
        <TR>
            <TD ALIGN="right">Name:</TD>
            <TD>Mr. Foo Bar</TD>
        </TR>
        <TR>
            <TD ALIGN="right">Address:</TD>
            <TD>
                #1-276 Quux Lane,     <BR/>
                Schenectady, NY, USA, <BR/>
            </TD>
        </TR>
        <TR>
            <TD ALIGN="right">Email:</TD>
            <TD><a href="mailto:[email protected]">[email protected]</a></TD>
        </TR>
    </TABLE>

... the (default) return value of convert() will be as follows.

       Name:  Mr. Foo Bar
    Address:  #1-276 Quux Lane,
              Schenectady, NY, USA,
      Email:  [1][email protected]
              1. mailto:[email protected]
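
The layout shown above (right-aligned labels, with continuation lines of a multiline cell indented under the first) can be approximated in a few lines of plain Perl. This is only an illustrative sketch, not the module's implementation: render_rows() and its $gap parameter are hypothetical names, and it handles just two columns with <BR/>-separated text in the second.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of HTML::FormatText::WithLinks::AndTables.
# Lays out rows of [label, value] pairs as fixed-width text.
sub render_rows {
    my ($rows, $gap) = @_;
    $gap = 2 unless defined $gap;    # spaces between the columns (assumption)

    # The first column is as wide as its longest label.
    my $width = 0;
    for my $row (@$rows) {
        my $len = length $row->[0];
        $width = $len if $len > $width;
    }

    my $pad = ' ' x $gap;
    my @out;
    for my $row (@$rows) {
        my ($label, $value) = @$row;

        # Split the second cell on <BR> or <BR/> and trim each piece.
        my @lines = map { my $s = $_; $s =~ s/^\s+|\s+$//g; $s }
                    split m{<BR\s*/?>}i, $value;
        @lines = ('') unless @lines;

        # The first line carries the right-aligned label ...
        push @out, sprintf('%*s', $width, $label) . $pad . shift @lines;
        # ... and the remaining lines are indented to the second column.
        push @out, (' ' x $width) . $pad . $_ for @lines;
    }
    return join("\n", @out) . "\n";
}

print render_rows(
    [
        [ 'Name:',    'Mr. Foo Bar' ],
        [ 'Address:', '#1-276 Quux Lane, <BR/> Schenectady, NY, USA,' ],
    ],
    2,
);
```

Because the alignment is done purely with space padding, the result only lines up when displayed in a fixed-width font.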


    * <TH> elements are treated identically to <TD> elements.
    * It assumes a fixed-width font for display of the resulting text.
    * It doesn't work well on nested <TABLE>s or other nested blocks within <TABLE>s.


Shaun Fryer, "< at>" (author emeritus)

Dale Evans, "<daleevans at>" (current maintainer)


