NAME

Tie::Persistent - persistent data structures via tie made easy

VERSION

1.00

SYNOPSIS


 use Tie::Persistent;

 tie %DB, 'Tie::Persistent', 'file', 'rw'; # read data from 'file'

 (tied %DB)->autosync(1);       # turn on write back on every modify

 # now create/add/modify datastruct
 $DB{key} = "value";
 (tied %DB)->sync();            # can be called manually

 untie %DB;                     # stores data back into 'file'

 # read stored data, no modification of file data
 tie %ReadOnly, 'Tie::Persistent', 'file';
 foreach (keys %ReadOnly) {
   print "$_ => $ReadOnly{$_}\n";
 }
 untie %ReadOnly;               # modifications not stored back

DESCRIPTION

The Tie::Persistent package makes working with persistent data real easy by using the "tie" interface.

It works by storing the data contained in a variable into a file (not unlike a database). The primary advantage is speed, as the whole data structure is kept in memory (which is also a limitation). A further advantage is that you can use arbitrary data structures inside the variable (unlike DB_File).

Note that it is most useful if the data structure fits into memory. For larger data structures I recommend MLDBM.

If you want to make an arbitrary object persistent, just store its ref in a scalar tied to 'Tie::Persistent'.
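As a minimal sketch of the scalar case, the following ties a scalar, stores an object in it and reads it back. The Counter class and the filename are made up for illustration; only the tie/untie calls come from this documentation, and the class has to be loaded again before the stored object can be used.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Tie::Persistent;

# Hypothetical example class, not part of Tie::Persistent.
{
    package Counter;
    sub new  { return bless { count => 0 }, shift }
    sub bump { return ++$_[0]{count} }
}

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/counter.pd";            # hypothetical filename

# Tie a scalar read/write and store the object reference in it.
tie my $obj, 'Tie::Persistent', $file, 'rw';
$obj = Counter->new;
$obj->bump;
untie $obj;                              # writes the object to $file

# Tie again (read only) and look at what came back.
tie my $again, 'Tie::Persistent', $file;
my $class = ref $again;
my $count = $again->{count};
untie $again;

print "restored a $class with count $count\n";
```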

Beware: not every data structure or object can be made persistent. For example, it may not contain GLOB or CODE refs, as these are not really dumpable (yet?).
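If you are unsure whether a structure can be stored, you can try it against Storable (the default storage format, see CONFIGURATION VARIABLES) before tying anything. This is only a sketch: is_storable() is a hypothetical helper, not part of Tie::Persistent, and it assumes Storable's default settings.

```perl
use strict;
use warnings;
use Storable qw(freeze);

# Hypothetical helper: true if Storable can serialize the given reference.
# freeze() dies on items it cannot store, so the attempt is wrapped in eval.
sub is_storable {
    my ($ref) = @_;
    return eval { freeze($ref); 1 } ? 1 : 0;
}

my $plain     = { name => 'example', list => [ 1, 2, 3 ] };
my $with_code = { callback => sub { return 42 } };

for my $case ( [ plain => $plain ], [ with_code => $with_code ] ) {
    my ($label, $data) = @$case;
    printf "%-10s %s\n", $label,
        is_storable($data) ? 'can be stored' : 'cannot be stored';
}
```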

Also, it works only for variables; you cannot use it for file handles.

[A persistent file handle? Hmmm... Hmmm! I've got an idea: I could start a server and send the file descriptor to it via ioctl(FD_SEND) or sendmsg. Later, I could retrieve it back, so it's persistent as long as the server process keeps running. But the whole file handle may contain more than just the file descriptor. There may be an output routine associated with it that I'd somehow have to dump. Now let's see, there was some way to get the bytecode converted back into perl code... <wanders off into the darkness mumbling> ... ]

PARAMETERS

 tie %Hash,   'Tie::Persistent', file, mode, other...;
 tie @Array,  'Tie::Persistent', file, mode, other...;
 tie $Scalar, 'Tie::Persistent', file, mode, other...;

file
Filename to store the data in. No naming convention is enforced, but I personally use the suffix 'pd' for "Perl Data" (or "Persistent Data"?). No file locking is done; see the section on locking below.
mode (optional)
Same as mode for POSIX fopen() or IO::File::open. Basically a combination of 'r', 'w', 'a' and '+'. Semantics:

 'r' .... read only. Modifications in the data are not stored back
          into the file. A non-existing file gives an error. This is
          the default if no mode is given.
 'rw' ... read/write. Modifications are stored back; if the file does
          not exist, it is created.
 'w' .... write only. The file is not read, the variable starts out empty.
 'a', '+' ... append. Same as 'w', but creates numbered backup files.
 'ra', 'r+' ... Same as 'rw', but creates numbered backup files.

When some kind of write access is specified, a backup file of the old dataset is always created. [You'll thank me for that, believe me.] The reason is simple: when you tie a variable read-write (the contents get restored from the file), and your program isn't fully debugged yet, it may die in the middle of some modifications, but the data will still be written back to the file, possibly leaving it inconsistent. In that case you always have at least the previous version to restore from.

The default backup filenames follow the Emacs notation, i.e. a '~' is appended; for numbered backup files (specified as 'a' or '+'), an additional number and another '~' are appended.

For a file 'data.pd', the normal backup file would be 'data.pd~' and the numbered backup files would be 'data.pd~1~', 'data.pd~2~' and so on. The latest backup file is the one with the highest number. The backup filename format can be overridden; see CONFIGURATION VARIABLES below.
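The naming scheme can be sketched in a few lines. The two anonymous subs produce the default formats described here; next_numbered_backup() is a hypothetical helper that assumes the next backup simply takes the first unused number, which may differ from what the module does internally.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# The default backup name formats (Emacs style).
my $backup   = sub { "$_[0]~" };
my $numbered = sub { "$_[0]~$_[1]~" };

# Hypothetical helper: return the first numbered backup name that does
# not exist yet. The module's own numbering logic is an assumption here.
sub next_numbered_backup {
    my ($file) = @_;
    my $n = 1;
    $n++ while -e $numbered->($file, $n);
    return $numbered->($file, $n);
}

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/data.pd";

# Pretend two numbered backups already exist.
for my $n (1, 2) {
    open(my $fh, '>', $numbered->($file, $n)) or die "Cannot create backup: $!";
    close($fh);
}

my $simple = $backup->($file);
my $next   = next_numbered_backup($file);
print "simple backup:        $simple\n";
print "next numbered backup: $next\n";
```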

other (optional, experimental)
This can be a reference to another (possibly tied) variable or a name of another tieable package.

If a ref is given, it is used internally to store the variable data instead of an anonymous variable ref. This lets you make other tied data structures persistent, e.g. you could first tie a hash to Tie::IxHash to make it order-preserving and then give it to Tie::Persistent to make it persistent.

A plain name is used to create this tied variable internally. Trailing arguments are passed to the other tieable package.

Example:

 tie %h, 'Tie::Persistent', 'file', 'rw', 'Tie::IxHash';

or

 tie %ixh, 'Tie::IxHash';
 tie %ph,  'Tie::Persistent', 'file', 'w', \%ixh;
 # you can now use %ixh as an alias for %ph

NOTE: This is an experimental feature. It may or may not work with other Tie:: packages. I have only tested it with 'Tie::IxHash'. Please report success or failure.

LOCKING

The data file is not automatically locked. Locking has to be done outside of the package. I recommend using a module like 'LockFile::Simple' for that.

There are typically two scenarios for locking: you either lock just the 'tie' and/or 'untie' calls, but not the data manipulation, or you lock the whole 'tie' - modify data - 'untie' sequence.
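One way to lock the whole sequence, sketched here with core Perl's flock() on a separate lock file instead of a locking module. with_lock() and the lock filename are hypothetical; the callback is where the tie, the modifications and the untie would go.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

# Hypothetical helper: run $code while holding an exclusive lock on
# $lockfile. The lock is released when the handle is closed, even if
# the callback dies.
sub with_lock {
    my ($lockfile, $code) = @_;
    open(my $fh, '>>', $lockfile) or die "Cannot open $lockfile: $!";
    flock($fh, LOCK_EX)           or die "Cannot lock $lockfile: $!";
    my @result = eval { $code->() };
    my $error  = $@;
    close($fh);                   # releases the lock
    die $error if $error;
    return @result;
}

my $dir = tempdir(CLEANUP => 1);

my ($answer) = with_lock("$dir/data.pd.lock", sub {
    # tie ... modify data ... untie would happen here, all under the lock
    return 42;
});
print "callback returned $answer\n";
```

To lock only the 'tie' or 'untie' call, wrap just that call in the callback and do the data manipulation outside of it.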

KEEPING DATA SYNCHRONIZED

It often is useful to store snapshots of the tied data struct back to the file, e.g. to safeguard against program crashes. You have two possibilities to do that:
  • use sync() to do it manually or
  • set autosync() to do it on every modification.

Note that sync() and autosync() are methods of the tied object, so you have to call them like this:

 (tied %hash)->sync();

and

 (tied @array)->autosync(1);  # or '0' to turn off autosync

There is a global variable $Tie::Persistent::Autosync (see CONFIGURATION VARIABLES) that you can set to change the behaviour on a global level for all subsequent ties.

Enabling autosync of course means a quite hefty performance penalty, so think carefully if and how you need it. Maybe there are natural synchronisation points in your application where a manual sync is good enough. Alternatively use MLDBM (if your top-level struct is a hash).

Note: autosync only works if the top-level element of the data structure is modified. If you have more complex data structures and modify elements somewhere deep down, you have to synchronize manually. I therefore recommend the following approach, especially if the topmost structure is a hash:

  • fetch the top-level element into a temporary variable
  • modify the datastructure
  • store back the top-level element, thus triggering a sync.

E.g.

  my $ref = $Hash{$key};      # fetch substructure
  $ref->{$subkey} = $newval;  # modify somewhere down under
  $Hash{$key} = $ref;         # store back

This programming style has the added advantage that you can switch over to other database packages (for example the MLDBM package, in case your data structures outgrow your memory) quite easily by just changing the 'tie' line!
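To see why a modification deep down goes unnoticed, here is a small sketch using a hypothetical StoreCounter class (built on the core Tie::StdHash, not part of Tie::Persistent) that counts how often the top-level STORE method is called. A tied variable only gets to react to STORE, so anything that does not trigger it cannot trigger a sync either.

```perl
use strict;
use warnings;

# Hypothetical tie class that counts top-level stores.
{
    package StoreCounter;
    use Tie::Hash;
    our @ISA    = ('Tie::StdHash');
    our $stores = 0;

    sub STORE {
        my $self = shift;
        $stores++;
        return $self->SUPER::STORE(@_);
    }
}

tie my %hash, 'StoreCounter';

$hash{list} = [];                 # top-level store
my $after_create = $StoreCounter::stores;

push @{ $hash{list} }, 'deep';    # only fetches the ref, no STORE
my $after_deep = $StoreCounter::stores;

my $ref = $hash{list};            # fetch substructure
push @$ref, 'deeper';             # modify somewhere down under
$hash{list} = $ref;               # store back
my $after_store_back = $StoreCounter::stores;

print "STORE calls: $after_create, $after_deep, $after_store_back\n";
```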

CONFIGURATION VARIABLES

$Tie::Persistent::Readable controls which format is used to store the data inside the file. A false value selects 'Storable', which is faster (and the default); a true value selects 'Data::Dumper', which is slower but much more readable and thus meant for debugging. This only influences how the data structure is written; format detection on read is automatic.

$Tie::Persistent::Autosync gives the default for all tied vars, so modifying it affects all subsequent ties. It's set to 'false' by default.
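A minimal sketch of setting these two variables for a limited scope, assuming a throwaway file in a temporary directory (the filename and hash names are made up). Using 'local' restores the previous values at the end of the block, so ties made elsewhere keep the defaults.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Tie::Persistent;

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/settings.pd";              # hypothetical filename

{
    # 'local' confines the changed defaults to this block.
    local $Tie::Persistent::Readable = 1;   # write with Data::Dumper
    local $Tie::Persistent::Autosync = 1;   # ties made in here autosync

    tie my %settings, 'Tie::Persistent', $file, 'rw';
    $settings{colour} = 'blue';             # top-level store, triggers a sync
    untie %settings;
}

# Format detection on read is automatic, so nothing has to be set here.
tie my %check, 'Tie::Persistent', $file;
my $colour = $check{colour};
untie %check;

print "colour: $colour\n";
```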

$Tie::Persistent::BackupFile points to a sub that determines the backup filename format. It gets the filename as $_[0] and returns the backup filename. The default is

 sub { "$_[0]~"; }

which is the Emacs backup format. For NT, you might want to change this to

 sub { "$_[0].bak"; }

or something.

$Tie::Persistent::NumberedBackupFile points to a sub that determines the numbered backup filename format. It gets the filename and a number as $_[0] and $_[1] respectively and returns the backup filename. The default is

 sub { "$_[0]~$_[1]~"; }

which is the extended Emacs backup format.

NOTES

BUGS

Numbered backupfile creation might have problems if the filename (not the backup number) contains the first six digits of the speed of light (in m/s).

All other bugs, please tell me!

AUTHORS

Original version by Roland Giersig.

Benjamin Liberman added autosyncing and fixed splice.

COPYRIGHT

Copyright (c) 1999-2002 Roland Giersig. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.