Lire::DlfSchema(3) Interface to Lire DLF Schema XML specifications

SYNOPSIS

In DLF converters:


    use Lire::DlfSchema;
    my $schema = Lire::DlfSchema::load_schema( "email" );
    my $fields = $schema->fields();
    my $dlf_id = $schema->field( 'dlf_id' );
    my $dlf_src = $schema->field( 'dlf_source' );

DESCRIPTION

This module is the interface to the Lire DLF Schemas defined in XML files. A schema defines the order of the fields along with their names, descriptions and types.

Each DlfSchema has at least two predefined fields:

dlf_id
This is an integer which uniquely identifies a DLF record in its stream. It is used to link the record to the fields of its extended schemas and to the records of its derived schemas.
dlf_source
This is an identifier which can be used to track the record back to the ImportJob that created it.

ACCESSING A SCHEMA OBJECT

The way to access a schema for a superservice is through the load_schema() module function. You use it like this:

    my $schema = Lire::DlfSchema::load_schema( $superservice );

This function will return a schema object which can then be used to query information about the schema. This function will die() on error.
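Because load_schema() reports failure with die(), a caller that wants to continue after a failure has to trap the exception. The following is a minimal sketch of one way to do that; try_load_schema() is a hypothetical helper, not part of Lire, and the module is loaded inside the eval so that the sketch also runs where Lire is not installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lire): returns the schema object, or
# undef if Lire::DlfSchema cannot be loaded or load_schema() died.
sub try_load_schema {
    my ($superservice) = @_;
    my $schema = eval {
        require Lire::DlfSchema;
        Lire::DlfSchema::load_schema($superservice);
    };
    warn "cannot load schema '$superservice': $@" unless defined $schema;
    return $schema;
}

my $schema = try_load_schema('email');
print "loaded schema ", $schema->id(), "\n" if defined $schema;
```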

has_superservice( $superservice )

Returns true if there is a superservice named $superservice available. An error will be thrown if the name isn't valid for a superservice.

has_schema( $schema_name )

Returns true if there is a schema named $schema_name available. An error will be thrown if the schema name isn't valid.

superservices()

Returns the names of the available superservices in an array.

schemas()

Returns the names of the available schemas in an array.

SCHEMA OBJECT METHODS

id()

    my $id = $schema->id();

This method will return the id of the schema. For a superservice's main schema, this is the superservice's name. (There are other types of schemas, derived and extended schemas, for which the id will be different from the superservice's name.)

superservice()

    my $super = $schema->superservice();

This method will return the name of the superservice to which the schema belongs.

title( [$new_title] )

This method will return (or change) the human readable title of the schema. (This is the content of the title element in the XML specification.)

description( [$new_description] )

This method will return (or change) the description of the schema. (This is the content of the description element in the XML specification.) Be aware that this will most likely contain DocBook markup.

field_by_pos()

    my $field = $schema->field_by_pos( 0 );

This method takes an integer as parameter and returns the field at that position in the schema. Fields are indexed starting at 0. This method will die() if an invalid position is passed as parameter.

The method returns a Lire::Field(3pm) object.

add_field( $field )

Adds the Lire::Field $field to this schema.

has_field()

    if ( $schema->has_field( 'test' ) ) {
        print "schema has field 'test'\n";
    }

This method takes a string as parameter and returns a boolean value. That value will be true if there is a field in the schema with that name, and false otherwise.

field()

    my $field = $schema->field( 'from_email' );

This method takes a field's name as parameter and returns the Lire::Field(3pm) object describing that field in the schema. The method will die() if there is no field with that name in the schema.
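Since field() dies on an unknown name, code that handles optional fields can test with has_field() first. The sketch below shows that pattern using only the two methods documented here; field_or_undef() is a hypothetical helper, not part of Lire.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lire): returns the Lire::Field object
# for $name, or undef instead of dying when the schema has no such field.
sub field_or_undef {
    my ( $schema, $name ) = @_;
    return $schema->has_field($name) ? $schema->field($name) : undef;
}
```

With a loaded schema, field_or_undef( $schema, 'from_email' ) would return the field object when the field exists and undef otherwise.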

fields()

    my $fields = $schema->fields();
    my @fields = $schema->fields();

In array context, this method will return an array containing all the fields (as Lire::Field(3pm) objects) in the schema. The order of the fields in the array is the order of the fields in the schema.

In scalar context, it will return an array reference, which is more efficient than creating an array. DO NOT MODIFY THE RETURNED ARRAY.
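The difference between the two calling contexts can be illustrated with a toy class. This is only a sketch: ToySchema is a hypothetical stand-in, and the assumption that the scalar-context reference points at shared internal data (which would explain the warning above) is not confirmed by this page.

```perl
use strict;
use warnings;

# Toy stand-in for a schema object; NOT the Lire implementation.
package ToySchema;

sub new {
    my ( $class, @names ) = @_;
    return bless { fields => [@names] }, $class;
}

# List context: a copy of the field list.
# Scalar context: a reference to the internal array (assumed behaviour),
# which callers must treat as read-only.
sub fields {
    my ($self) = @_;
    return wantarray ? @{ $self->{fields} } : $self->{fields};
}

package main;

my $schema = ToySchema->new(qw( dlf_id dlf_source time ));

my @copy = $schema->fields();    # list context: a private copy
my $ref  = $schema->fields();    # scalar context: shared, do not modify

push @copy, 'extra';             # changes only the copy
print scalar( @{$ref} ), " fields in the schema\n";
```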

field_names()

Returns the names of the fields in this schema. The names are in the same order as the fields.

field_count()

    my $number_of_field = $schema->field_count;

This method returns the number of fields in the schema.

timestamp_field()

    my $time_field = $schema->timestamp_field;

This method will return the Lire::Field(3pm) object representing the timestamp field in the schema. The timestamp field is the one that defines the sort order of the DLF records.

is_schema_compatible()

    if ( $schema->is_schema_compatible( $other_schema ) ) {
    }

This method takes a Lire::DlfSchema(3pm) object as parameter and returns a boolean value. That value will be true if the schema passed as parameter is compatible with this one, and false otherwise.

For a superservice's schema, the only compatible schema is an object representing the same superservice's schema.

can_join_schema( $schema )

Returns true if $schema can be joined with this schema. For a DlfSchema, this will be true only when $schema is an ExtendedSchema of this schema.

SQL Related Methods

These methods are used to map DLF records into SQL tables.

sql_table()

Returns the SQL table used to hold the DLF records of this schema.

create_sql_schema( $dlf_store, [ $remove ] )

This will create the SQL schemas necessary to hold the DLF records for this schema in the Lire::DlfStore. If $remove is true, a DROP TABLE will be done before creating the schema.

needs_sql_schema_migration( $dlf_store )

This method will return true if the SQL schema isn't up-to-date in the DlfStore. The method migrate_sql_schema() can be used to bring the schema up to date.

migrate_sql_schema( $dlf_store )

Updates the SQL schemas to the current version.
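The two migration methods are naturally used together: check first, then migrate only when needed. A minimal sketch, using only the methods documented above; ensure_sql_schema_current() is a hypothetical helper, and $dlf_store is assumed to be an already opened Lire::DlfStore.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lire): migrates the SQL schema only
# when needed. Returns 1 if a migration was run, 0 if it was up to date.
sub ensure_sql_schema_current {
    my ( $schema, $dlf_store ) = @_;
    return 0 unless $schema->needs_sql_schema_migration($dlf_store);
    $schema->migrate_sql_schema($dlf_store);
    return 1;
}
```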

dlf_query( $sort_spec )

Returns a Lire::DlfQuery object which can be used to return all DLF records sorted according to $sort_spec. The sort spec is a whitespace-delimited list of sort field names, all of which must be present in the current schema. If a field's name is prefixed by '-', descending sort order will be used for that field.
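To make the sort spec syntax concrete, the sketch below splits a spec into field names and directions. It only illustrates the format described above and is not how Lire parses it; parse_sort_spec() is a hypothetical helper, and 'time' and 'size' are made-up field names.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lire): split a sort spec into
# [ field_name, direction ] pairs, where a leading '-' means descending.
sub parse_sort_spec {
    my ($sort_spec) = @_;
    my @keys;
    for my $token ( split ' ', $sort_spec ) {
        my $descending = ( $token =~ s/^-// );
        push @keys, [ $token, $descending ? 'DESC' : 'ASC' ];
    }
    return @keys;
}

# 'time' ascending, then 'size' descending.
for my $key ( parse_sort_spec('time -size') ) {
    print "$key->[0] $key->[1]\n";
}
```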

insert_sql_query()

Returns the INSERT SQL statement that should be used to insert DLF records in the stream. A DBI::st handle prepared with that query needs to be passed as parameter to execute_insert_query().

sql_clean_query( $with_time )

Returns the DELETE statement that can be used to delete the DLF records in this schema. If $with_time is true, the query can be used for selective cleaning: one timestamp bind parameter should be passed when the query is executed, and all records older than this timestamp will be deleted.
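The selective form might be used with DBI roughly as follows. This is a sketch under assumptions: delete_records_older_than() is a hypothetical helper, $dbh is assumed to be a DBI database handle connected to the database that holds the DLF records (how to obtain it is not covered on this page), and $cutoff is assumed to be in whatever timestamp representation the store uses.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lire): prepare the selective DELETE
# statement and execute it with the cutoff as its single bind parameter.
# Returns whatever DBI's execute() returns.
sub delete_records_older_than {
    my ( $dbh, $schema, $cutoff ) = @_;
    my $sth = $dbh->prepare( $schema->sql_clean_query(1) );
    return $sth->execute($cutoff);
}
```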

sql_clean_period_query()

Returns the DELETE statement that can be used to delete the DLF records in this schema. The query should be passed two bind parameters: the time boundaries between which records should be deleted from the schema.

AUTHOR

  Francis J. Lacoste <[email protected]>

VERSION

$Id: DlfSchema.pm,v 1.58 2006/07/23 13:16:28 vanbaal Exp $

COPYRIGHT

Copyright (C) 2001, 2002, 2004 Stichting LogReport Foundation [email protected]

This file is part of Lire.

Lire is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program (see COPYING); if not, check with http://www.gnu.org/copyleft/gpl.html.