tchdb(3) the hash database API

DESCRIPTION

A hash database is a file containing a hash table. It is handled with the hash database API.

To use the hash database API, include `tcutil.h', `tchdb.h', and related standard header files. Typically, place the following lines near the top of a source file.


#include <tcutil.h>
#include <tchdb.h>
#include <stdlib.h>
#include <time.h>
#include <stdbool.h>
#include <stdint.h>

Objects whose type is pointer to `TCHDB' are used to handle hash databases. A hash database object is created with the function `tchdbnew' and is deleted with the function `tchdbdel'. To avoid memory leaks, it is important to delete every object when it is no longer in use.

Before records can be stored or retrieved, it is necessary to open a database file and connect the hash database object to it. The function `tchdbopen' is used to open a database file and the function `tchdbclose' is used to close the database file. To avoid data loss or corruption, it is important to close every database file when it is no longer in use. It is forbidden for multiple database objects in a process to open the same database at the same time.

API

The function `tchdberrmsg' is used in order to get the message string corresponding to an error code.


const char *tchdberrmsg(int ecode);
`ecode' specifies the error code.
The return value is the message string of the error code.

The function `tchdbnew' is used in order to create a hash database object.


TCHDB *tchdbnew(void);
The return value is the new hash database object.

The function `tchdbdel' is used in order to delete a hash database object.


void tchdbdel(TCHDB *hdb);
`hdb' specifies the hash database object.
If the database is not closed, it is closed implicitly. Note that the deleted object and its derivatives can not be used anymore.

The function `tchdbecode' is used in order to get the most recent error code of a hash database object.


int tchdbecode(TCHDB *hdb);
`hdb' specifies the hash database object.
The return value is the most recent error code.
The following error codes are defined: `TCESUCCESS' for success, `TCETHREAD' for threading error, `TCEINVALID' for invalid operation, `TCENOFILE' for file not found, `TCENOPERM' for no permission, `TCEMETA' for invalid meta data, `TCERHEAD' for invalid record header, `TCEOPEN' for open error, `TCECLOSE' for close error, `TCETRUNC' for trunc error, `TCESYNC' for sync error, `TCESTAT' for stat error, `TCESEEK' for seek error, `TCEREAD' for read error, `TCEWRITE' for write error, `TCEMMAP' for mmap error, `TCELOCK' for lock error, `TCEUNLINK' for unlink error, `TCERENAME' for rename error, `TCEMKDIR' for mkdir error, `TCERMDIR' for rmdir error, `TCEKEEP' for existing record, `TCENOREC' for no record found, and `TCEMISC' for miscellaneous error.

The function `tchdbsetmutex' is used in order to set mutual exclusion control of a hash database object for threading.


bool tchdbsetmutex(TCHDB *hdb);
`hdb' specifies the hash database object which is not opened.
If successful, the return value is true, else, it is false.
Note that the mutual exclusion control of the database should be set before the database is opened.

The function `tchdbtune' is used in order to set the tuning parameters of a hash database object.


bool tchdbtune(TCHDB *hdb, int64_t bnum, int8_t apow, int8_t fpow, uint8_t opts);
`hdb' specifies the hash database object which is not opened.
`bnum' specifies the number of elements of the bucket array. If it is not more than 0, the default value is specified. The default value is 16381. The suggested size of the bucket array is about 0.5 to 4 times the number of all records to be stored.
`apow' specifies the size of record alignment by power of 2. If it is negative, the default value is specified. The default value is 4 standing for 2^4=16.
`fpow' specifies the maximum number of elements of the free block pool by power of 2. If it is negative, the default value is specified. The default value is 10 standing for 2^10=1024.
`opts' specifies options by bitwise-or: `HDBTLARGE' specifies that the size of the database can be larger than 2GB by using 64-bit bucket array, `HDBTDEFLATE' specifies that each record is compressed with Deflate encoding, `HDBTBZIP' specifies that each record is compressed with BZIP2 encoding, `HDBTTCBS' specifies that each record is compressed with TCBS encoding.
If successful, the return value is true, else, it is false.
Note that the tuning parameters should be set before the database is opened.

The function `tchdbsetcache' is used in order to set the caching parameters of a hash database object.


bool tchdbsetcache(TCHDB *hdb, int32_t rcnum);
`hdb' specifies the hash database object which is not opened.
`rcnum' specifies the maximum number of records to be cached. If it is not more than 0, the record cache is disabled. It is disabled by default.
If successful, the return value is true, else, it is false.
Note that the caching parameters should be set before the database is opened.

The function `tchdbsetxmsiz' is used in order to set the size of the extra mapped memory of a hash database object.


bool tchdbsetxmsiz(TCHDB *hdb, int64_t xmsiz);
`hdb' specifies the hash database object which is not opened.
`xmsiz' specifies the size of the extra mapped memory. If it is not more than 0, the extra mapped memory is disabled. The default size is 67108864.
If successful, the return value is true, else, it is false.
Note that the mapping parameters should be set before the database is opened.

The function `tchdbsetdfunit' is used in order to set the unit step number of auto defragmentation of a hash database object.


bool tchdbsetdfunit(TCHDB *hdb, int32_t dfunit);
`hdb' specifies the hash database object which is not opened.
`dfunit' specifies the unit step number. If it is not more than 0, the auto defragmentation is disabled. It is disabled by default.
If successful, the return value is true, else, it is false.
Note that the defragmentation parameters should be set before the database is opened.

The function `tchdbopen' is used in order to open a database file and connect a hash database object.


bool tchdbopen(TCHDB *hdb, const char *path, int omode);
`hdb' specifies the hash database object which is not opened.
`path' specifies the path of the database file.
`omode' specifies the connection mode: `HDBOWRITER' as a writer, `HDBOREADER' as a reader. If the mode is `HDBOWRITER', the following may be added by bitwise-or: `HDBOCREAT', which creates a new database if it does not exist, `HDBOTRUNC', which creates a new database even if one already exists, and `HDBOTSYNC', which makes every transaction synchronize updated contents with the device. Either `HDBOREADER' or `HDBOWRITER' may also be combined by bitwise-or with `HDBONOLCK', which opens the database file without file locking, or `HDBOLCKNB', which performs locking without blocking.
If successful, the return value is true, else, it is false.

The function `tchdbclose' is used in order to close a hash database object.


bool tchdbclose(TCHDB *hdb);
`hdb' specifies the hash database object.
If successful, the return value is true, else, it is false.
Updates to a database are assured to be written when the database is closed. If a writer opens a database but does not close it appropriately, the database will be broken.

The function `tchdbput' is used in order to store a record into a hash database object.


bool tchdbput(TCHDB *hdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
`hdb' specifies the hash database object connected as a writer.
`kbuf' specifies the pointer to the region of the key.
`ksiz' specifies the size of the region of the key.
`vbuf' specifies the pointer to the region of the value.
`vsiz' specifies the size of the region of the value.
If successful, the return value is true, else, it is false.
If a record with the same key exists in the database, it is overwritten.

The function `tchdbput2' is used in order to store a string record into a hash database object.


bool tchdbput2(TCHDB *hdb, const char *kstr, const char *vstr);
`hdb' specifies the hash database object connected as a writer.
`kstr' specifies the string of the key.
`vstr' specifies the string of the value.
If successful, the return value is true, else, it is false.
If a record with the same key exists in the database, it is overwritten.

The function `tchdbputkeep' is used in order to store a new record into a hash database object.


bool tchdbputkeep(TCHDB *hdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
`hdb' specifies the hash database object connected as a writer.
`kbuf' specifies the pointer to the region of the key.
`ksiz' specifies the size of the region of the key.
`vbuf' specifies the pointer to the region of the value.
`vsiz' specifies the size of the region of the value.
If successful, the return value is true, else, it is false.
If a record with the same key exists in the database, this function has no effect.

The function `tchdbputkeep2' is used in order to store a new string record into a hash database object.


bool tchdbputkeep2(TCHDB *hdb, const char *kstr, const char *vstr);
`hdb' specifies the hash database object connected as a writer.
`kstr' specifies the string of the key.
`vstr' specifies the string of the value.
If successful, the return value is true, else, it is false.
If a record with the same key exists in the database, this function has no effect.

The function `tchdbputcat' is used in order to concatenate a value at the end of the existing record in a hash database object.


bool tchdbputcat(TCHDB *hdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
`hdb' specifies the hash database object connected as a writer.
`kbuf' specifies the pointer to the region of the key.
`ksiz' specifies the size of the region of the key.
`vbuf' specifies the pointer to the region of the value.
`vsiz' specifies the size of the region of the value.
If successful, the return value is true, else, it is false.
If there is no corresponding record, a new record is created.

The function `tchdbputcat2' is used in order to concatenate a string value at the end of the existing record in a hash database object.


bool tchdbputcat2(TCHDB *hdb, const char *kstr, const char *vstr);
`hdb' specifies the hash database object connected as a writer.
`kstr' specifies the string of the key.
`vstr' specifies the string of the value.
If successful, the return value is true, else, it is false.
If there is no corresponding record, a new record is created.

The function `tchdbputasync' is used in order to store a record into a hash database object in asynchronous fashion.


bool tchdbputasync(TCHDB *hdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
`hdb' specifies the hash database object connected as a writer.
`kbuf' specifies the pointer to the region of the key.
`ksiz' specifies the size of the region of the key.
`vbuf' specifies the pointer to the region of the value.
`vsiz' specifies the size of the region of the value.
If successful, the return value is true, else, it is false.
If a record with the same key exists in the database, it is overwritten. Records passed to this function are accumulated in an inner buffer and written to the file in a single batch.

The function `tchdbputasync2' is used in order to store a string record into a hash database object in asynchronous fashion.


bool tchdbputasync2(TCHDB *hdb, const char *kstr, const char *vstr);
`hdb' specifies the hash database object connected as a writer.
`kstr' specifies the string of the key.
`vstr' specifies the string of the value.
If successful, the return value is true, else, it is false.
If a record with the same key exists in the database, it is overwritten. Records passed to this function are accumulated in an inner buffer and written to the file in a single batch.

The function `tchdbout' is used in order to remove a record of a hash database object.


bool tchdbout(TCHDB *hdb, const void *kbuf, int ksiz);
`hdb' specifies the hash database object connected as a writer.
`kbuf' specifies the pointer to the region of the key.
`ksiz' specifies the size of the region of the key.
If successful, the return value is true, else, it is false.

The function `tchdbout2' is used in order to remove a string record of a hash database object.


bool tchdbout2(TCHDB *hdb, const char *kstr);
`hdb' specifies the hash database object connected as a writer.
`kstr' specifies the string of the key.
If successful, the return value is true, else, it is false.

The function `tchdbget' is used in order to retrieve a record in a hash database object.


void *tchdbget(TCHDB *hdb, const void *kbuf, int ksiz, int *sp);
`hdb' specifies the hash database object.
`kbuf' specifies the pointer to the region of the key.
`ksiz' specifies the size of the region of the key.
`sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
If successful, the return value is the pointer to the region of the value of the corresponding record. `NULL' is returned if no record corresponds.
Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.

The function `tchdbget2' is used in order to retrieve a string record in a hash database object.


char *tchdbget2(TCHDB *hdb, const char *kstr);
`hdb' specifies the hash database object.
`kstr' specifies the string of the key.
If successful, the return value is the string of the value of the corresponding record. `NULL' is returned if no record corresponds.
Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.

The function `tchdbget3' is used in order to retrieve a record in a hash database object and write the value into a buffer.


int tchdbget3(TCHDB *hdb, const void *kbuf, int ksiz, void *vbuf, int max);
`hdb' specifies the hash database object.
`kbuf' specifies the pointer to the region of the key.
`ksiz' specifies the size of the region of the key.
`vbuf' specifies the pointer to the buffer into which the value of the corresponding record is written.
`max' specifies the size of the buffer.
If successful, the return value is the size of the written data, else, it is -1. -1 is returned if no record corresponds to the specified key.
Note that an additional zero code is not appended at the end of the region of the writing buffer.

The function `tchdbvsiz' is used in order to get the size of the value of a record in a hash database object.


int tchdbvsiz(TCHDB *hdb, const void *kbuf, int ksiz);
`hdb' specifies the hash database object.
`kbuf' specifies the pointer to the region of the key.
`ksiz' specifies the size of the region of the key.
If successful, the return value is the size of the value of the corresponding record, else, it is -1.

The function `tchdbvsiz2' is used in order to get the size of the value of a string record in a hash database object.


int tchdbvsiz2(TCHDB *hdb, const char *kstr);
`hdb' specifies the hash database object.
`kstr' specifies the string of the key.
If successful, the return value is the size of the value of the corresponding record, else, it is -1.

The function `tchdbiterinit' is used in order to initialize the iterator of a hash database object.


bool tchdbiterinit(TCHDB *hdb);
`hdb' specifies the hash database object.
If successful, the return value is true, else, it is false.
The iterator is used in order to access the key of every record stored in a database.

The function `tchdbiternext' is used in order to get the next key of the iterator of a hash database object.


void *tchdbiternext(TCHDB *hdb, int *sp);
`hdb' specifies the hash database object.
`sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
If successful, the return value is the pointer to the region of the next key, else, it is `NULL'. `NULL' is returned when no more records remain in the iterator.
Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. Every record can be accessed by calling this function repeatedly. It is allowed to update or remove records whose keys are fetched during the iteration. However, the result of the iteration is not assured if the database is updated in any other way during the iteration. In addition, the traversal order is arbitrary, so it is not assured to match the order in which the records were stored.

The function `tchdbiternext2' is used in order to get the next key string of the iterator of a hash database object.


char *tchdbiternext2(TCHDB *hdb);
`hdb' specifies the hash database object.
If successful, the return value is the string of the next key, else, it is `NULL'. `NULL' is returned when no more records remain in the iterator.
Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. Every record can be accessed by calling this function repeatedly. However, the result of the iteration is not assured if the database is updated during the iteration. In addition, the traversal order is arbitrary, so it is not assured to match the order in which the records were stored.

The function `tchdbiternext3' is used in order to get the next extensible objects of the iterator of a hash database object.


bool tchdbiternext3(TCHDB *hdb, TCXSTR *kxstr, TCXSTR *vxstr);
`hdb' specifies the hash database object.
`kxstr' specifies the object into which the next key is written.
`vxstr' specifies the object into which the next value is written.
If successful, the return value is true, else, it is false. False is returned when no more records remain in the iterator.

The function `tchdbfwmkeys' is used in order to get forward matching keys in a hash database object.


TCLIST *tchdbfwmkeys(TCHDB *hdb, const void *pbuf, int psiz, int max);
`hdb' specifies the hash database object.
`pbuf' specifies the pointer to the region of the prefix.
`psiz' specifies the size of the region of the prefix.
`max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
The return value is a list object of the corresponding keys. This function never fails. It returns an empty list even if no key corresponds.
Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use. Note that this function may be very slow because every key in the database is scanned.

The function `tchdbfwmkeys2' is used in order to get forward matching string keys in a hash database object.


TCLIST *tchdbfwmkeys2(TCHDB *hdb, const char *pstr, int max);
`hdb' specifies the hash database object.
`pstr' specifies the string of the prefix.
`max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
The return value is a list object of the corresponding keys. This function never fails. It returns an empty list even if no key corresponds.
Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use. Note that this function may be very slow because every key in the database is scanned.

The function `tchdbaddint' is used in order to add an integer to a record in a hash database object.


int tchdbaddint(TCHDB *hdb, const void *kbuf, int ksiz, int num);
`hdb' specifies the hash database object connected as a writer.
`kbuf' specifies the pointer to the region of the key.
`ksiz' specifies the size of the region of the key.
`num' specifies the additional value.
If successful, the return value is the summation value, else, it is `INT_MIN'.
If the corresponding record exists, the value is treated as an integer and is added to. If no record corresponds, a new record of the additional value is stored.

The function `tchdbadddouble' is used in order to add a real number to a record in a hash database object.


double tchdbadddouble(TCHDB *hdb, const void *kbuf, int ksiz, double num);
`hdb' specifies the hash database object connected as a writer.
`kbuf' specifies the pointer to the region of the key.
`ksiz' specifies the size of the region of the key.
`num' specifies the additional value.
If successful, the return value is the summation value, else, it is Not-a-Number.
If the corresponding record exists, the value is treated as a real number and is added to. If no record corresponds, a new record of the additional value is stored.

The function `tchdbsync' is used in order to synchronize updated contents of a hash database object with the file and the device.


bool tchdbsync(TCHDB *hdb);
`hdb' specifies the hash database object connected as a writer.
If successful, the return value is true, else, it is false.
This function is useful when another process connects to the same database file.

The function `tchdboptimize' is used in order to optimize the file of a hash database object.


bool tchdboptimize(TCHDB *hdb, int64_t bnum, int8_t apow, int8_t fpow, uint8_t opts);
`hdb' specifies the hash database object connected as a writer.
`bnum' specifies the number of elements of the bucket array. If it is not more than 0, the default value is specified. The default value is twice the number of records.
`apow' specifies the size of record alignment by power of 2. If it is negative, the current setting is not changed.
`fpow' specifies the maximum number of elements of the free block pool by power of 2. If it is negative, the current setting is not changed.
`opts' specifies options by bitwise-or: `HDBTLARGE' specifies that the size of the database can be larger than 2GB by using 64-bit bucket array, `HDBTDEFLATE' specifies that each record is compressed with Deflate encoding, `HDBTBZIP' specifies that each record is compressed with BZIP2 encoding, `HDBTTCBS' specifies that each record is compressed with TCBS encoding. If it is `UINT8_MAX', the current setting is not changed.
If successful, the return value is true, else, it is false.
This function is useful to reduce the size of the database file with data fragmentation by successive updating.

The function `tchdbvanish' is used in order to remove all records of a hash database object.


bool tchdbvanish(TCHDB *hdb);
`hdb' specifies the hash database object connected as a writer.
If successful, the return value is true, else, it is false.

The function `tchdbcopy' is used in order to copy the database file of a hash database object.


bool tchdbcopy(TCHDB *hdb, const char *path);
`hdb' specifies the hash database object.
`path' specifies the path of the destination file. If it begins with `@', the trailing substring is executed as a command line.
If successful, the return value is true, else, it is false. False is returned if the executed command returns non-zero code.
The database file is assured to be kept synchronized and not modified while the copying or executing operation is in progress. So, this function is useful to create a backup file of the database file.

The function `tchdbtranbegin' is used in order to begin the transaction of a hash database object.


bool tchdbtranbegin(TCHDB *hdb);
`hdb' specifies the hash database object connected as a writer.
If successful, the return value is true, else, it is false.
The database is locked by the thread during the transaction so that only one transaction can be active on a database object at a time. Thus, the serializable isolation level is assumed if every database operation is performed in a transaction. All updated regions are tracked by write-ahead logging during the transaction. If the database is closed during a transaction, the transaction is aborted implicitly.

The function `tchdbtrancommit' is used in order to commit the transaction of a hash database object.


bool tchdbtrancommit(TCHDB *hdb);
`hdb' specifies the hash database object connected as a writer.
If successful, the return value is true, else, it is false.
Updates made in the transaction become permanent when it is committed successfully.

The function `tchdbtranabort' is used in order to abort the transaction of a hash database object.


bool tchdbtranabort(TCHDB *hdb);
`hdb' specifies the hash database object connected as a writer.
If successful, the return value is true, else, it is false.
Updates made in the transaction are discarded when it is aborted. The database is rolled back to its state before the transaction.

The function `tchdbpath' is used in order to get the file path of a hash database object.


const char *tchdbpath(TCHDB *hdb);
`hdb' specifies the hash database object.
The return value is the path of the database file or `NULL' if the object does not connect to any database file.

The function `tchdbrnum' is used in order to get the number of records of a hash database object.


uint64_t tchdbrnum(TCHDB *hdb);
`hdb' specifies the hash database object.
The return value is the number of records or 0 if the object does not connect to any database file.

The function `tchdbfsiz' is used in order to get the size of the database file of a hash database object.


uint64_t tchdbfsiz(TCHDB *hdb);
`hdb' specifies the hash database object.
The return value is the size of the database file or 0 if the object does not connect to any database file.