DBD::SQLite::VirtualTable(3) SQLite virtual tables implemented in Perl

SYNOPSIS


# register the virtual table module within sqlite
$dbh->sqlite_create_module(mod_name => "DBD::SQLite::VirtualTable::Subclass");
# create a virtual table
$dbh->do("CREATE VIRTUAL TABLE vtbl USING mod_name(arg1, arg2, ...)")
# use it as any regular table
my $sth = $dbh->prepare("SELECT * FROM vtbl WHERE ...");

Note : VirtualTable subclasses or instances are not called directly from Perl code; everything happens indirectly through SQL statements within SQLite.

DESCRIPTION

This module is an abstract class for implementing SQLite virtual tables, written in Perl. Such tables look like regular tables, and are accessed through regular SQL instructions and regular DBI API; but the implementation is done through hidden calls to a Perl class. This is the same idea as Perl's tied variables, but at the SQLite level.

The current abstract class cannot be used directly, so the synopsis above is just to give a general idea. Concrete, usable classes bundled with the present distribution are :

  • DBD::SQLite::VirtualTable::FileContent : implements a virtual column that exposes file contents. This is especially useful in conjunction with a fulltext index; see DBD::SQLite::Fulltext_search.
  • DBD::SQLite::VirtualTable::PerlData : binds to a Perl array within the Perl program. This can be used for simple import/export operations, for debugging purposes, for joining data from different sources, etc.

Other Perl virtual tables may also be published separately on CPAN.

The following chapters document the structure of the abstract class and explain how to write new subclasses; this is meant for module authors, not for end users. If you just need to use a virtual table module, refer to that module's documentation.

ARCHITECTURE

Classes

A virtual table module for SQLite is implemented through a pair of classes :
  • the table class implements methods for creating or connecting a virtual table, for destroying it, for opening new searches, etc.
  • the cursor class implements methods for performing a specific SQL statement

Methods

Most methods in both classes are not called directly from Perl code : instead, they are callbacks, called from the sqlite kernel. Following common Perl conventions, such methods have names in uppercase.

TABLE METHODS

Class methods for registering the module

CREATE_MODULE

  $class->CREATE_MODULE($sqlite_module_name);

Called when the client code invokes

  $dbh->sqlite_create_module($sqlite_module_name => $class);

The default implementation is empty.

DESTROY_MODULE

  $class->DESTROY_MODULE();

Called automatically when the database handle is disconnected. The default implementation is empty.

Class methods for creating a vtable instance

CREATE

  $class->CREATE($dbh_ref, $module_name, $db_name, $vtab_name, @args);

Called when sqlite receives a statement

  CREATE VIRTUAL TABLE $db_name.$vtab_name USING $module_name(@args)

The default implementation just calls ``NEW''.

CONNECT

  $class->CONNECT($dbh_ref, $module_name, $db_name, $vtab_name, @args);

Called when attempting to access a virtual table that had been created during previous database connection. The creation arguments were stored within the sqlite database and are passed again to the CONNECT method.

The default implementation just calls ``NEW''.

_PREPARE_SELF

  $class->_PREPARE_SELF($dbh_ref, $module_name, $db_name, $vtab_name, @args);

Prepares the datastructure for a virtual table instance. @args is
 just the collection of strings (comma-separated) that were given
 within the "CREATE VIRTUAL TABLE" statement; each subclass should
 decide what to do with this information,

The method parses @args to differentiate between options (strings of shape $key=$value or $key="$value", stored in "$self->{options}"), and columns (other @args, stored in "$self->{columns}"). It creates a hashref with the following fields :

"dbh_ref"
a weak reference to the $dbh database handle (see Scalar::Util for an explanation of weak references).
"module_name"
name of the module as declared to sqlite (not to be confounded with the Perl class name).
"db_name"
name of the database (usuallly 'main' or 'temp'), but it may also be an attached database
"vtab_name"
name of the virtual table
"columns"
arrayref of column declarations
"options"
hashref of option declarations

This method should not be redefined, since it performs general work which is supposed to be useful for all subclasses. Instead, subclasses may override the ``NEW'' method.

NEW

  $class->NEW($dbh_ref, $module_name, $db_name, $vtab_name, @args);

Instantiates a virtual table.

Instance methods called from the sqlite kernel

DROP

Called whenever a virtual table is destroyed from the database through the "DROP TABLE" SQL instruction.

Just after the "DROP()" call, the Perl instance will be destroyed (and will therefore automatically call the "DESTROY()" method if such a method is present).

The default implementation for DROP is empty.

Note : this corresponds to the "xDestroy" method in the SQLite documentation; here it was not named "DESTROY", to avoid any confusion with the standard Perl method "DESTROY" for object destruction.

DISCONNECT

Called for every virtual table just before the database handle is disconnected.

Just after the "DISCONNECT()" call, the Perl instance will be destroyed (and will therefore automatically call the "DESTROY()" method if such a method is present).

The default implementation for DISCONNECT is empty.

VTAB_TO_DECLARE

This method is called automatically just after ``CREATE'' or ``CONNECT'', to register the columns of the virtual table within the sqlite kernel. The method should return a string containing a SQL "CREATE TABLE" statement; but only the column declaration parts will be considered. Columns may be declared with the special keyword ``HIDDEN'', which means that they are used internally for the the virtual table implementation, and are not visible to users --- see <http://sqlite.org/c3ref/declare_vtab.html> and <http://www.sqlite.org/vtab.html#hiddencol> for detailed explanations.

The default implementation returns:

  CREATE TABLE $self->{vtab_name}(@{$self->{columns}})

BEST_INDEX

  my $index_info = $vtab->BEST_INDEX($constraints, $order_by)

This is the most complex method to redefined in subclasses. This method will be called at the beginning of a new query on the virtual table; the job of the method is to assemble some information that will be used

a)
by the sqlite kernel to decide about the best search strategy
b)
by the cursor ``FILTER'' method to produce the desired subset of rows from the virtual table.

By calling this method, the SQLite core is saying to the virtual table that it needs to access some subset of the rows in the virtual table and it wants to know the most efficient way to do that access. The "BEST_INDEX" method replies with information that the SQLite core can then use to conduct an efficient search of the virtual table.

The method takes as input a list of $constraints and a list of $order_by instructions. It returns a hashref of indexing properties, described below; furthermore, the method also adds supplementary information within the input $constraints. Detailed explanations are given in <http://sqlite.org/vtab.html#xbestindex>.

Input constraints

Elements of the $constraints arrayref correspond to specific clauses of the "WHERE ..." part of the SQL query. Each constraint is a hashref with keys :

"col"
the integer index of the column on the left-hand side of the constraint
"op"
the comparison operator, expressed as string containing '=', '>', '>=', '<', '<=' or 'MATCH'.
"usable"
a boolean indicating if that constraint is usable; some constraints might not be usable because of the way tables are ordered in a join.

The $constraints arrayref is used both for input and for output. While iterating over the array, the method should add the following keys into usable constraints :

"argvIndex"
An index into the @values array that will be passed to the cursor's ``FILTER'' method. In other words, if the current constraint corresponds to the SQL fragment "WHERE ... AND foo < 123 ...", and the corresponding "argvIndex" takes value 5, this means that the "FILTER" method will receive 123 in $values[5].
"omit"
A boolean telling to the sqlite core that it can safely omit to double check that constraint before returning the resultset to the calling program; this means that the FILTER method has fulfilled the filtering job on that constraint and there is no need to do any further checking.

The "BEST_INDEX" method will not necessarily receive all constraints from the SQL "WHERE" clause : for example a constraint like "col1 < col2 + col3" cannot be handled at this level. Furthemore, the "BEST_INDEX" might decide to ignore some of the received constraints. This is why a second pass over the results will be performed by the sqlite core.

``order_by'' input information

The $order_by arrayref corresponds to the "ORDER BY" clauses in the SQL query. Each entry is a hashref with keys :

"col"
the integer index of the column being ordered
"desc"
a boolean telling of the ordering is DESCending or ascending

This information could be used by some subclasses for optimizing the query strategfy; but usually the sqlite core will perform another sorting pass once all results are gathered.

Hashref information returned by BEST_INDEX

The method should return a hashref with the following keys :

"idxNum"
An arbitrary integer associated with that index; this information will be passed back to ``FILTER''.
"idxStr"
An arbitrary str associated with that index; this information will be passed back to ``FILTER''.
"orderByConsumed"
A boolean telling the sqlite core if the $order_by information has been taken into account or not.
"estimatedCost"
A float that should be set to the estimated number of disk access operations required to execute this query against the virtual table. The SQLite core will often call BEST_INDEX multiple times with different constraints, obtain multiple cost estimates, then choose the query plan that gives the lowest estimate.
"estimatedRows"
An integer giving the estimated number of rows returned by that query.

OPEN

Called to instantiate a new cursor. The default implementation appends "::Cursor" to the current classname and calls "NEW()" within that cursor class.

_SQLITE_UPDATE

This is the dispatch method implementing the "xUpdate()" callback for virtual tables. The default implementation applies the algorithm described in <http://sqlite.org/vtab.html#xupdate> to decide to call ``INSERT'', ``DELETE'' or ``UPDATE''; so there is no reason to override this method in subclasses.

INSERT

  my $rowid = $vtab->INSERT($new_rowid, @values);

This method should be overridden in subclasses to implement insertion of a new row into the virtual table. The size of the @values array corresponds to the number of columns declared through ``VTAB_TO_DECLARE''. The $new_rowid may be explicitly given, or it may be "undef", in which case the method must compute a new id and return it as the result of the method call.

DELETE

  $vtab->INSERT($old_rowid);

This method should be overridden in subclasses to implement deletion of a row from the virtual table.

UPDATE

  $vtab->UPDATE($old_rowid, $new_rowid, @values);

This method should be overridden in subclasses to implement a row update within the virtual table. Usually $old_rowid is equal to $new_rowid, which is a regular update; however, the rowid could be changed from a SQL statement such as

  UPDATE table SET rowid=rowid+1 WHERE ...;

FIND_FUNCTION

  $vtab->FIND_FUNCTION($num_args, $func_name);

When a function uses a column from a virtual table as its first argument, this method is called to see if the virtual table would like to overload the function. Parameters are the number of arguments to the function, and the name of the function. If no overloading is desired, this method should return false. To overload the function, this method should return a coderef to the function implementation.

Each virtual table keeps a cache of results from FIND_FUNCTION calls, so the method will be called only once for each pair "($num_args, $func_name)".

BEGIN_TRANSACTION

Called to begin a transaction on the virtual table.

SYNC_TRANSACTION

Called to signal the start of a two-phase commit on the virtual table.

SYNC_TRANSACTION

Called to commit a virtual table transaction.

ROLLBACK_TRANSACTION

Called to rollback a virtual table transaction.

RENAME

  $vtab->RENAME($new_name)

Called to rename a virtual table.

SAVEPOINT

  $vtab->SAVEPOINT($savepoint)

Called to signal the virtual table to save its current state at savepoint $savepoint (an integer).

ROLLBACK_TO

  $vtab->ROLLBACK_TO($savepoint)

Called to signal the virtual table to return to the state $savepoint. This will invalidate all savepoints with values greater than $savepoint.

RELEASE

  $vtab->RELEASE($savepoint)

Called to invalidate all savepoints with values greater or equal to $savepoint.

Utility instance methods

Methods in this section are in lower case, because they are not called directly from the sqlite kernel; these are utility methods to be called from other methods described above.

dbh

This method returns the database handle ($dbh) associated with the current virtual table.

CURSOR METHODS

Class methods

NEW

  my $cursor = $cursor_class->NEW($vtable, @args)

Instantiates a new cursor. The default implementation just returns a blessed hashref with keys "vtable" and "args".

Instance methods

FILTER

  $cursor->FILTER($idxNum, $idxStr, @values);

This method begins a search of a virtual table.

The $idxNum and $idxStr arguments correspond to values returned by ``BEST_INDEX'' for the chosen index. The specific meanings of those values are unimportant to SQLite, as long as "BEST_INDEX" and "FILTER" agree on what that meaning is.

The "BEST_INDEX" method may have requested the values of certain expressions using the "argvIndex" values of the $constraints list. Those values are passed to "FILTER" through the @values array.

If the virtual table contains one or more rows that match the search criteria, then the cursor must be left point at the first row. Subsequent calls to ``EOF'' must return false. If there are no rows match, then the cursor must be left in a state that will cause ``EOF'' to return true. The SQLite engine will use the ``COLUMN'' and ``ROWID'' methods to access that row content. The ``NEXT'' method will be used to advance to the next row.

EOF

This method must return false if the cursor currently points to a valid row of data, or true otherwise. This method is called by the SQL engine immediately after each ``FILTER'' and ``NEXT'' invocation.

NEXT

This method advances the cursor to the next row of a result set initiated by ``FILTER''. If the cursor is already pointing at the last row when this method is called, then the cursor no longer points to valid data and a subsequent call to the ``EOF'' method must return true. If the cursor is successfully advanced to another row of content, then subsequent calls to ``EOF'' must return false.

COLUMN

  my $value = $cursor->COLUMN($idxCol);

The SQLite core invokes this method in order to find the value for the N-th column of the current row. N is zero-based so the first column is numbered 0.

ROWID

  my $value = $cursor->ROWID;

Returns the rowid of row that the cursor is currently pointing at.

AUTHOR

Laurent Dami <[email protected]>

COPYRIGHT AND LICENSE

Copyright Laurent Dami, 2014.

Parts of the code are borrowed from SQLite::VirtualTable, copyright (C) 2006, 2009 by Qindel Formacion y Servicios, S. L.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.