filedb.filetables module
This module defines writer and reader classes for a fast, immutable on-disk key-value database format. The current format is based heavily on D. J. Bernstein’s CDB format (http://cr.yp.to/cdb.html).
Hash file
- class whoosh.filedb.filetables.HashWriter(dbfile, magic=b'HSH3', hashtype=0)
Implements a fast on-disk key-value store. This hash uses a two-level hashing scheme: a key is hashed, and the low eight bits of the hash value are used to index into one of 256 hash tables. This is basically the CDB algorithm, but unlike CDB this object writes all data serially (it doesn’t seek backwards to overwrite information at the end).
Also unlike CDB, this format uses 64-bit file pointers, so the file length is essentially unlimited. However, each key and value must be less than 2 GB in length.
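The two-level lookup described above can be sketched in a few lines. This is a minimal in-memory illustration, not the library's implementation: the `table_and_slot` helper is hypothetical, CRC32 stands in for whichever hash is configured, and the slot formula (remaining hash bits modulo the table size) follows the CDB convention and is assumed here.

```python
import zlib

NUM_TABLES = 256  # one table per possible value of the low eight bits


def table_and_slot(key, table_size):
    """Return (table index, starting slot) for a bytes key.

    Hypothetical helper: CRC32 is used only as an example hash function.
    """
    h = zlib.crc32(key) & 0xFFFFFFFF
    table = h & 0xFF              # first level: low eight bits pick the table
    slot = (h >> 8) % table_size  # second level: remaining bits pick the slot
    return table, slot


table, slot = table_and_slot(b"alfa", table_size=16)
```

Splitting the hash this way keeps each of the 256 tables small, so a lookup only has to probe a short run of slots in one table.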
- Parameters:
dbfile – a StructFile object to write to.
magic – the format tag bytes to write at the start of the file.
hashtype – an integer indicating which hashing algorithm to use. Possible values are 0 (MD5), 1 (CRC32), or 2 (CDB hash).
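As a rough illustration of the three `hashtype` choices, the sketch below maps each integer to a 32-bit hash function built from the standard library. The function names, the `HASH_FUNCTIONS` tuple, and the way the MD5 digest is reduced to 32 bits are assumptions made for this example, not a statement of how the library computes its hashes. The CDB hash follows D. J. Bernstein's published definition (start at 5381, then `h = ((h << 5) + h) ^ c` per byte).

```python
import hashlib
import zlib

MASK32 = 0xFFFFFFFF


def md5_hash(key):
    # Assumed reduction: keep the low 32 bits of the 128-bit digest.
    return int(hashlib.md5(key).hexdigest(), 16) & MASK32


def crc_hash(key):
    return zlib.crc32(key) & MASK32


def cdb_hash(key):
    # CDB hash over the bytes of the key, truncated to 32 bits at each step.
    h = 5381
    for c in key:
        h = (((h << 5) + h) ^ c) & MASK32
    return h


# Indexed by hashtype: 0 (MD5), 1 (CRC32), 2 (CDB hash).
HASH_FUNCTIONS = (md5_hash, crc_hash, cdb_hash)


def hash_key(key, hashtype=0):
    """Hypothetical dispatcher: pick the hash function by its integer code."""
    return HASH_FUNCTIONS[hashtype](key)
```

Whatever function is chosen, a file has to be read back with the same one that wrote it, since the table and slot positions depend on the hash value.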
- add(key, value)
Adds a key/value pair to the file. Note that keys DO NOT need to be unique. You can store multiple values under the same key and retrieve them using HashReader.all().
- add_all(items)
Convenience method to add a sequence of (key, value) pairs. This is the same as calling HashWriter.add() on each pair in the sequence.
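The contract of `add()` and `add_all()` (non-unique keys, with `add_all()` equivalent to repeated `add()` calls) can be modelled with a small in-memory stand-in. `ToyHashWriter` is hypothetical and stores nothing on disk; its `all()` method exists only to show the multi-value behaviour, and returning values in insertion order is a property of this toy rather than a documented guarantee.

```python
from collections import defaultdict


class ToyHashWriter:
    """Hypothetical in-memory model of the add/add_all contract."""

    def __init__(self):
        self._items = defaultdict(list)

    def add(self, key, value):
        # Keys need not be unique: each value is appended, never overwritten.
        self._items[key].append(value)

    def add_all(self, items):
        # Equivalent to calling add() on each (key, value) pair in turn.
        for key, value in items:
            self.add(key, value)

    def all(self, key):
        # Every value stored under the key; an empty list if there are none.
        return list(self._items.get(key, ()))


w = ToyHashWriter()
w.add(b"color", b"red")
w.add_all([(b"color", b"blue"), (b"shape", b"round")])
values = w.all(b"color")
```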
- class whoosh.filedb.filetables.HashReader(dbfile, length=None, magic=b'HSH3', startoffset=0)
Reader for the fast on-disk key-value files created by HashWriter.
- Parameters:
dbfile – a StructFile object to read from.
length – the length of the file data. This is necessary since the hashing information is written at the end of the file.
magic – the format tag bytes to look for at the start of the file. If the file’s format tag does not match these bytes, the object raises a FileFormatError exception.
startoffset – the starting point of the file data.
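To see why a reader needs `length`, `magic`, and `startoffset`, consider a toy format that, like the one described here, writes its lookup information last. Everything in this sketch is hypothetical: the `TOY1` tag, the record layout, the 8-byte trailer, and the use of `ValueError` in place of the library's `FileFormatError`. It only demonstrates the general pattern of checking the tag at the start and locating the trailer from the data length.

```python
import io
import struct

MAGIC = b"TOY1"                 # hypothetical format tag
LENGTHS = struct.Struct("!ii")  # key and value lengths (assumed layout)
TRAILER = struct.Struct("!q")   # 64-bit offset, written after all records


def write_toy(f, records):
    """Write tag, records, then a trailer; return the data length."""
    start = f.tell()
    f.write(MAGIC)
    for key, value in records:
        f.write(LENGTHS.pack(len(key), len(value)) + key + value)
    end_of_records = f.tell() - start
    # Lookup information goes last, so nothing is rewritten after the fact.
    f.write(TRAILER.pack(end_of_records))
    return f.tell() - start


def read_trailer(f, length, magic=MAGIC, startoffset=0):
    """Check the tag, then use the data length to find the trailer."""
    f.seek(startoffset)
    if f.read(len(magic)) != magic:
        raise ValueError("format tag does not match")
    f.seek(startoffset + length - TRAILER.size)
    (end_of_records,) = TRAILER.unpack(f.read(TRAILER.size))
    return end_of_records


buf = io.BytesIO()
buf.write(b"xx")  # unrelated leading bytes, so the data starts at offset 2
length = write_toy(buf, [(b"alfa", b"bravo")])
end_of_records = read_trailer(buf, length, startoffset=2)
```

Without `length`, the reader would have no way to find the trailer short of scanning every record from the start.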
- classmethod open(storage, name)
Convenience method to open a hash file given a whoosh.filedb.filestore.Storage object and a name. This takes care of opening the file and passing its length to the initializer.
Ordered Hash file
- class whoosh.filedb.filetables.OrderedHashWriter(dbfile)
Implements an on-disk hash, but requires that keys be added in order. An OrderedHashReader can then look up “nearest keys” based on the ordering.
- Parameters:
dbfile – a StructFile object to write to.
magic – the format tag bytes to write at the start of the file.
hashtype – an integer indicating which hashing algorithm to use. Possible values are 0 (MD5), 1 (CRC32), or 2 (CDB hash).
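The benefit of ordered insertion is that a sorted key list supports binary search. The sketch below models that idea in memory. `ToyOrderedIndex` and its `nearest_key` method are hypothetical names, and two behaviours are assumptions of this toy: "nearest" is taken to mean the first stored key greater than or equal to the probe, and equal or out-of-order keys are rejected on insertion.

```python
from bisect import bisect_left


class ToyOrderedIndex:
    """Hypothetical in-memory model of ordered insertion and nearest-key lookup."""

    def __init__(self):
        self._keys = []

    def add(self, key):
        # Enforce the ordering requirement: each key must sort after the last.
        if self._keys and key <= self._keys[-1]:
            raise ValueError("keys must be added in increasing order")
        self._keys.append(key)

    def nearest_key(self, key):
        # First stored key >= the probe, or None if the probe sorts past the end.
        i = bisect_left(self._keys, key)
        return self._keys[i] if i < len(self._keys) else None


idx = ToyOrderedIndex()
for k in (b"alfa", b"charlie", b"echo"):
    idx.add(k)
nearest = idx.nearest_key(b"bravo")
```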
- class whoosh.filedb.filetables.OrderedHashReader(dbfile, length=None, magic=b'HSH3', startoffset=0)
- Parameters:
dbfile – a StructFile object to read from.
length – the length of the file data. This is necessary since the hashing information is written at the end of the file.
magic – the format tag bytes to look for at the start of the file. If the file’s format tag does not match these bytes, the object raises a FileFormatError exception.
startoffset – the starting point of the file data.