XML::SAX::Manifold(3) Multipass processing of documents

VERSION

version 0.46

SYNOPSIS


    use XML::SAX::Machines qw( Manifold );
    my $m = Manifold(
        $channel0,
        $channel1,
        $channel2,
        {
            Handler => $h, ## optional
        }
    );

DESCRIPTION

XML::SAX::Manifold is a SAX machine that allows "multipass" processing of a document by sending the document through several channels of SAX processors, one channel at a time. A channel may be a single SAX processor or a pipeline (see XML::SAX::Pipeline).

The results of each channel are aggregated by a SAX filter that supports the "end_all" event; by default this is XML::Filter::Merger. See the section on writing an aggregator, and XML::Filter::Merger.

This differs from XML::Filter::SAXT in that the channels are prioritized and each channel receives all events for a document before the next channel receives any events. XML::SAX::Manifold buffers all events while feeding them to the highest priority channel ($channel0 in the synopsis), and then replays them for each lower priority channel, one at a time.
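
The difference in ordering can be pictured with a small, self-contained sketch. It does not use XML::SAX::Manifold or XML::Filter::SAXT at all: the channels are hypothetical code refs, the events are plain strings, and tee_dispatch and manifold_dispatch are names made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for channels: each one just logs what it sees.
my @log;
my @channels = map {
    my $id = $_;
    sub { my ($event) = @_; push @log, "channel$id:$event" };
} 0 .. 2;

my @events = qw( start_document start_element end_element end_document );

# Tee-style delivery: every channel sees an event before the next
# event is delivered, so the channels' views are interleaved.
sub tee_dispatch {
    my ( $events, $channels ) = @_;
    for my $event (@$events) {
        $_->($event) for @$channels;
    }
    return;
}

# Manifold-style delivery: the first (highest priority) channel gets
# the events as they arrive while they are also buffered; each later
# channel then gets a complete replay, one channel at a time.
sub manifold_dispatch {
    my ( $events, $channels ) = @_;
    my ( $first, @rest ) = @$channels;
    my @buffer;
    for my $event (@$events) {
        push @buffer, $event;
        $first->($event);
    }
    for my $channel (@rest) {
        $channel->($_) for @buffer;
    }
    return;
}

@log = ();
tee_dispatch( \@events, \@channels );
print "tee:      @log[ 0 .. 2 ] ...\n";

@log = ();
manifold_dispatch( \@events, \@channels );
print "manifold: @log[ 0 .. 2 ] ...\n";
```

In the tee-style run the first three log entries come from three different channels; in the manifold-style run they all come from channel 0.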

The event flow for the example in the SYNOPSIS would look like the following, with the numbers next to the connection arrows indicating when the document's events flow along each arrow.

   +--------------------------------------------------------+
   |         An XML::SAX::Manifold instance                 |
   |                                                        |
   |               +-----------+                            |
   |            +->| Channel_0 |-+                          |
   |          1/   +-----------+  \1                        |
   |  Intake  /                    \                        |
 1 |  +------+ 2   +-----------+  2 \    +--------+ Exhaust |   
 --+->| Dist |---->| Channel_1 |-----*-->| Merger |---------+--> $h
   |  +------+     +-----------+    /    +--------+         |
   |          \3                  3/                        |
   |           \   +-----------+  /                         |
   |            +->| Channel_2 |-+                          |
   |               +-----------+                            |
   +--------------------------------------------------------+

Here's the timing of the event flows:

   1: upstream -> Dist. -> Channel_0 -> Merger -> downstream
   2:             Dist. -> Channel_1 -> Merger -> downstream
   3:             Dist. -> Channel_2 -> Merger -> downstream

When the document arrives from upstream, all of its events arrive during time period 1. They are buffered and, at the same time, passed through Channel_0, whose output is sent to the Merger. After all events have been received (as indicated by an "end_document" event from upstream), the buffered events are played back through Channel_1 and then through Channel_2, both of which also output to the Merger.
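
The three time periods can be traced with a toy model in plain Perl. This is a sketch under stated assumptions, not the module's implementation: run_manifold is a hypothetical function, each channel is a code ref that merely tags the event it is given, and the "Merger" is an array that records which period each output arrived in.

```perl
use strict;
use warnings;

# Hypothetical channels: each one tags the events it passes along.
my @channels = map {
    my $name = "Channel_$_";
    sub { my ($event) = @_; return "${name}($event)" };
} 0 .. 2;

# What the Merger would receive, as [ period, event ] pairs.
my @merger_input;

sub run_manifold {
    my ( $upstream_events, $channels, $sink ) = @_;
    my @buffer;
    my $period = 1;

    # Period 1: events arrive from upstream, are buffered, and are
    # passed through Channel_0 right away.
    for my $event (@$upstream_events) {
        push @buffer, $event;
        push @$sink, [ $period, $channels->[0]->($event) ];
    }

    # Periods 2 and up: upstream has finished, so replay the buffer
    # through each remaining channel in turn.
    for my $i ( 1 .. $#{$channels} ) {
        $period++;
        push @$sink, [ $period, $channels->[$i]->($_) ] for @buffer;
    }
    return;
}

run_manifold(
    [qw( start_document start_element end_element end_document )],
    \@channels,
    \@merger_input,
);

# One line per event reaching the Merger, prefixed with its period.
printf "%d: %s\n", @$_ for @merger_input;
```

The Merger in this model sees three complete event streams, one per period, which is why the real aggregator has to know how to combine several documents into one.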

It's the merger's job to assemble the three documents it receives into one document; see XML::Filter::Merger for details.

NAME

XML::SAX::Manifold - Multipass processing of documents

METHODS

new
    my $d = XML::SAX::Manifold->new( @channels, \%options );

Longhand for calling the Manifold function exported by XML::SAX::Machines.

Writing an aggregator.

To be written. In short, an aggregator needs to provide "start_manifold_processing" and "end_manifold_processing". See XML::Filter::Merger and its source code for a starting point.

AUTHORS

COPYRIGHT AND LICENSE

This software is copyright (c) 2013 by Barry Slaymaker.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.