XML::Structured(3) - simple conversion API from XML to Perl structures and back

SYNOPSIS


    use XML::Structured;

    $dtd = [
        'element' =>
            'attribute1',
            'attribute2',
            [],
            'element1',
            [ 'element2' ],
            [ 'element3' =>
                ...
            ],
            [[ 'element4' =>
                ...
            ]],
    ];

    $hashref = XMLin($dtd, $xmlstring);
    $hashref = XMLinfile($dtd, $filename_or_glob);
    $xmlstring = XMLout($dtd, $hashref);

DESCRIPTION

The XML::Structured module converts XML data into a predefined Perl data structure and back to XML. Unlike modules such as XML::Simple, it treats XML data that does not match the provided skeleton (the "dtd") as an error. Another advantage is that the order of attributes and elements is taken from the dtd when converting back to XML.
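
As a minimal sketch of this strictness, the following assumes that a mismatch is reported with die() and can therefore be trapped with eval. The "shoesize" attribute is a made-up name that the dtd deliberately does not list.

```perl
use strict;
use warnings;
use XML::Structured;

# Illustrative dtd: a <user> element that only knows "login".
my $dtd = [ 'user' => 'login' ];

# "shoesize" is not listed in the dtd, so the input does not match it.
my $data  = eval { XMLin($dtd, '<user login="foo" shoesize="44" />') };
my $error = $@;

if (defined $data) {
    print "accepted\n";
} else {
    print "rejected: $error";
}
```

The exact wording of the error message is not specified here, so code that needs to react to a mismatch should test for failure rather than match the message text.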

XMLin()

The XMLin() function takes the dtd and a string as arguments and returns a hash reference containing the data.

XMLinfile()

This function works like XMLin(), but takes a filename or a file descriptor glob as its second argument.
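
A short sketch of the filename form, assuming a scratch file created with the core File::Temp module. The file name and its contents are illustrative only, and the glob form is not shown.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::Structured;

my $dtd = [ 'user' => 'login', 'password' ];

# Write a small document to a temporary file (removed at program exit).
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} qq{<user login="foo" password="bar" />\n};
close $fh or die "close: $!";

# Parse the file by name instead of passing the XML as a string.
my $user = XMLinfile($dtd, $filename);
print "login=$user->{login}\n";
```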

XMLout()

XMLout() performs the reverse of XMLin(): it takes a dtd and a hash reference as arguments and returns an XML string.

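The round trip between the two directions can be sketched as follows. This assumes that the hash holds the contents of the top-level element directly (keys "login" and "password"), without an extra level for "user".

```perl
use strict;
use warnings;
use XML::Structured;

my $dtd = [ 'user' => 'login', 'password' ];

# Perl data -> XML string
my $xml = XMLout($dtd, { login => 'foo', password => 'bar' });
print $xml;

# XML string -> Perl data again
my $user = XMLin($dtd, $xml);
print "login=$user->{login} password=$user->{password}\n";
```
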
The DTD

The dtd parameter specifies the structure of the allowed XML data. It consists of nested Perl arrays.

simple attributes and elements

A very simple example of a dtd is:

    $dtd = [ 'user' =>
                 'login',
                 'password',
           ];

This dtd will accept/create XML like:

    <user login="foo" password="bar" />

XMLin() does not care whether "login" and "password" are attributes or elements, so

    <user>
      <login>foo</login>
      <password>bar</password>
    </user>

is also valid input (but XMLout() will not re-create it in this form).
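
The following sketch feeds both spellings to XMLin() with the dtd above and then writes one result back out. Only the variable names are invented here.

```perl
use strict;
use warnings;
use XML::Structured;

my $dtd = [ 'user' => 'login', 'password' ];

# The same data, once as attributes and once as elements.
my $from_attrs = XMLin($dtd, '<user login="foo" password="bar" />');
my $from_elems = XMLin($dtd,
    '<user><login>foo</login><password>bar</password></user>');

for my $key (qw(login password)) {
    print "$key: $from_attrs->{$key} / $from_elems->{$key}\n";
}

# Writing the element-style input back out follows the dtd,
# which places both names in the attribute list.
my $xml = XMLout($dtd, $from_elems);
print $xml;
```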

multiple elements of the same name

If an element may appear multiple times, it must be declared as an array in the dtd:

    $dtd = [ 'user' =>
                 'login',
                 [ 'favorite_fruits' ],
           ];

In this case XMLin() creates an array reference as the value, even if the XML data contains only one such element. Valid XML looks like:

    <user login="foo">
      <favorite_fruits>apple</favorite_fruits>
      <favorite_fruits>peach</favorite_fruits>
    </user>

Because attributes may not appear multiple times, XMLout() creates elements in this case. Note also that all attributes must come before the first element, so the first array in the dtd ends the attribute list. As an example, the following dtd

    $dtd = [ 'user' =>
                 'login',
                 [ 'favorite_fruits' ],
                 'password',
           ];

will create XML like:

    <user login="foo">
      <favorite_fruits>apple</favorite_fruits>
      <favorite_fruits>peach</favorite_fruits>
      <password>bar</password>
    </user>

"login" is translated to an attribute and "password" to an element.
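
A small sketch of both points, using the dtd above: a single "favorite_fruits" element still arrives as an array reference, and "password" is written as an element because it follows the first array in the dtd.

```perl
use strict;
use warnings;
use XML::Structured;

my $dtd = [ 'user' => 'login', [ 'favorite_fruits' ], 'password' ];

# Only one <favorite_fruits> element in the input.
my $user = XMLin($dtd,
    '<user login="foo"><favorite_fruits>apple</favorite_fruits></user>');
print ref($user->{favorite_fruits}), "\n";

# Extend the data and convert it back to XML.
push @{ $user->{favorite_fruits} }, 'peach';
$user->{password} = 'bar';
my $xml = XMLout($dtd, $user);
print $xml;
```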

You can use an empty array reference to force the end of the attribute list. For example, the dtd

    $dtd = [ 'user' =>
                 [],
                 'login',
                 'password',
           ];

will translate to

    <user>
      <login>foo</login>
      <password>bar</password>
    </user>

instead of

    <user login="foo" password="bar" />
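
To compare the two forms side by side, this sketch writes the same hash with and without the empty array reference. The variable names are illustrative.

```perl
use strict;
use warnings;
use XML::Structured;

my %user = (login => 'foo', password => 'bar');

# Without the marker both names are attributes ...
my $attr_dtd = [ 'user' => 'login', 'password' ];
# ... with the leading [] the attribute list is empty.
my $elem_dtd = [ 'user' => [], 'login', 'password' ];

my $attr_xml = XMLout($attr_dtd, \%user);
my $elem_xml = XMLout($elem_dtd, \%user);

print $attr_xml;
print $elem_xml;
```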

sub-elements

Sub-elements are elements that themselves contain attributes or other elements. They are specified in the dtd as arrays with more than one element. Here is an example:

    $dtd = [ 'user' =>
                 'login',
                 [ 'address' =>
                     'street',
                     'city',
                 ],
           ];

Valid XML for this dtd looks like:

    <user login="foo">
      <address street="broadway 7" city="new york" />
    </user>

It is sometimes useful to specify such dtds in multiple steps:

    $addressdtd = [ 'address' =>
                         'street',
                         'city',
                  ];
    $dtd = [ 'user' =>
                 'login',
                 $addressdtd,
           ];
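
On the Perl side, a sketch of how the parsed data can be read, assuming that a single sub-element is stored as a nested hash reference under its name:

```perl
use strict;
use warnings;
use XML::Structured;

my $addressdtd = [ 'address' => 'street', 'city' ];
my $dtd        = [ 'user' => 'login', $addressdtd ];

my $user = XMLin($dtd,
    '<user login="foo"><address street="broadway 7" city="new york" /></user>');

# The sub-element is reached through its name, then its own keys.
print "$user->{login} lives in $user->{address}{city}\n";
```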

multiple sub-elements with the same name

As with simple elements, sub-elements can be allowed to occur multiple times. XMLin() creates an array of hash references in this case. In the dtd, this is written as an array reference that wraps the sub-element's array, for example:

    $dtd = [ 'user' =>
                 'login',
                 [[ 'address' =>
                     'street',
                     'city',
                 ]],
           ];

Or, with the $addressdtd definition used in the previous example:

    $dtd = [ 'user' =>
                 'login',
                 [ $addressdtd ],
           ];

Accepted XML is:

    <user login="foo">
      <address street="broadway 7" city="new york" />
      <address street="rural road 12" city="tempe" />
    </user>
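
A brief sketch that parses the XML above and walks the resulting array of hash references:

```perl
use strict;
use warnings;
use XML::Structured;

my $addressdtd = [ 'address' => 'street', 'city' ];
my $dtd        = [ 'user' => 'login', [ $addressdtd ] ];

my $user = XMLin($dtd, <<'XML');
<user login="foo">
  <address street="broadway 7" city="new york" />
  <address street="rural road 12" city="tempe" />
</user>
XML

# Each entry of the array is one <address> element.
for my $address (@{ $user->{address} }) {
    print "$address->{street}, $address->{city}\n";
}
```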

the _content pseudo-element

All of the non-whitespace parts between elements get collected into a single "_content" element. As an example,

    <user login="foo">
      <address street="broadway 7" city="new york"/>hello
      <address street="rural road 12" city="tempe"/>world
    </user>

would set the _content element to "hello world" (the dtd must allow a _content element, of course). If the dtd is

    $dtd = [ 'user' =>
                 'login',
                 [ $addressdtd ],
                 '_content',
           ];

the XML string created by XMLout() will be:

    <user login="foo">
      <address street="broadway 7" city="new york" />
      <address street="rural road 12" city="tempe" />
      hello world
    </user>

The exact input cannot be re-created, as the positions and the fragmentation of the content data are lost.
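
The behaviour described in this section can be sketched as follows. The input and dtd are the ones used above, and only the variable names are invented.

```perl
use strict;
use warnings;
use XML::Structured;

my $addressdtd = [ 'address' => 'street', 'city' ];
my $dtd = [ 'user' => 'login', [ $addressdtd ], '_content' ];

my $user = XMLin($dtd, <<'XML');
<user login="foo">
  <address street="broadway 7" city="new york"/>hello
  <address street="rural road 12" city="tempe"/>world
</user>
XML

# The two text fragments end up in a single string.
print "[$user->{_content}]\n";

# Writing the data back places the content after the addresses.
my $xml = XMLout($dtd, $user);
print $xml;
```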

COPYRIGHT

Copyright 2006 Michael Schroeder <[email protected]>

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.