codec.base module

This module contains base classes/interfaces for “codec” objects.

Classes

class whoosh.codec.base.Codec
class whoosh.codec.base.PerDocumentWriter
class whoosh.codec.base.FieldWriter
class whoosh.codec.base.PostingsWriter
abstract written()

Returns True if this object has already written to disk.

class whoosh.codec.base.TermsReader
class whoosh.codec.base.PerDocumentReader
all_doc_ids()

Returns an iterator of all (undeleted) document IDs in the reader.
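
As a rough illustration of what "all (undeleted) document IDs" means, the sketch below uses a hypothetical in-memory reader. It is not Whoosh's implementation: the class name ToyPerDocumentReader and its constructor arguments are assumptions made for this example.

```python
class ToyPerDocumentReader:
    """Hypothetical stand-in for a per-document reader (not Whoosh code)."""

    def __init__(self, doc_count_all, deleted=()):
        # Total number of document slots, deleted or not (assumed layout).
        self._doc_count_all = doc_count_all
        # Document numbers that have been marked as deleted.
        self._deleted = set(deleted)

    def is_deleted(self, docnum):
        return docnum in self._deleted

    def all_doc_ids(self):
        # Lazily yield every document number that is not marked deleted.
        return (n for n in range(self._doc_count_all)
                if not self.is_deleted(n))


reader = ToyPerDocumentReader(doc_count_all=5, deleted={1, 3})
live_ids = list(reader.all_doc_ids())
```

Here live_ids holds the document numbers that remain once the deleted ones are skipped.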

class whoosh.codec.base.Segment(indexname)

Do not instantiate this object directly. It is used by the Index object to hold information about a segment. A list of objects of this class is pickled as part of the TOC file.

The TOC file stores a minimal amount of information, mostly a list of Segment objects. The segments are the actual inverted indexes. Having multiple segments allows quick incremental indexing: new documents are written to a new segment, and the index overlays that segment on the previous ones for reading and searching. “Optimizing” the index merges the contents of the existing segments into a single segment, dropping any deleted documents along the way.
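
The segment model described above can be sketched with a toy in-memory index. This is only an illustration of the idea (one new segment per batch, reads that span all segments, and a merge that drops deleted documents). The names ToySegment and ToyIndex are hypothetical and do not correspond to Whoosh classes.

```python
class ToySegment:
    """Hypothetical segment: a batch of documents plus deletion marks."""

    def __init__(self, docs):
        self.docs = list(docs)  # documents, addressed by local docnum
        self.deleted = set()    # local docnums marked as deleted

    def live_docs(self):
        return [d for i, d in enumerate(self.docs) if i not in self.deleted]


class ToyIndex:
    """Hypothetical index that only keeps a list of segments."""

    def __init__(self):
        self.segments = []

    def add_documents(self, docs):
        # Incremental indexing: new documents go into a fresh segment,
        # leaving the existing segments untouched.
        self.segments.append(ToySegment(docs))

    def all_live_docs(self):
        # Reading overlays the segments: visit each one in order and
        # skip documents that are marked as deleted.
        return [d for seg in self.segments for d in seg.live_docs()]

    def optimize(self):
        # Merge every segment into one, dropping deleted documents.
        self.segments = [ToySegment(self.all_live_docs())]


index = ToyIndex()
index.add_documents(["a", "b"])
index.add_documents(["c"])
index.segments[0].deleted.add(1)  # mark "b" as deleted
before = index.all_live_docs()
segments_before = len(index.segments)
index.optimize()
after = index.all_live_docs()
```

In this sketch the visible documents are the same before and after optimize(); what changes is that the two segments collapse into one and the deleted document is physically gone.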

create_file(storage, ext, **kwargs)

Convenience method to create a new file in the given storage named with this segment’s ID and the given extension. Any keyword arguments are passed to the storage’s create_file method.

abstract delete_document(docnum, delete=True)

Marks the given document number as deleted. The document is not actually removed until the index is optimized.

Parameters:
  • docnum – The document number to delete.

  • delete – If False, this undeletes a deleted document.

abstract deleted_count()

Returns the total number of deleted documents in this segment.

doc_count()

Returns the number of (undeleted) documents in this segment.

abstract doc_count_all()

Returns the total number of documents in this segment, counting both deleted and undeleted documents.

has_deletions()

Returns True if any documents in this segment are deleted.

abstract is_deleted(docnum)

Returns True if the given document number is deleted.
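
To show how the deletion-related methods relate to one another, here is a minimal sketch using Python's abc module. The class names SegmentLike and InMemorySegment are hypothetical, and deriving doc_count() and has_deletions() from the abstract methods is an assumption consistent with the descriptions above, not a copy of Whoosh's code.

```python
from abc import ABC, abstractmethod


class SegmentLike(ABC):
    """Hypothetical interface mirroring the methods documented above."""

    @abstractmethod
    def doc_count_all(self):
        """Total number of documents, deleted or not."""

    @abstractmethod
    def deleted_count(self):
        """Number of documents marked as deleted."""

    @abstractmethod
    def delete_document(self, docnum, delete=True):
        """Mark (or, with delete=False, unmark) a document as deleted."""

    @abstractmethod
    def is_deleted(self, docnum):
        """True if the document number is marked as deleted."""

    def doc_count(self):
        # Assumed relationship: live documents = all documents - deleted.
        return self.doc_count_all() - self.deleted_count()

    def has_deletions(self):
        return self.deleted_count() > 0


class InMemorySegment(SegmentLike):
    """Toy implementation that tracks deletions in a set."""

    def __init__(self, total):
        self._total = total
        self._deleted = set()

    def doc_count_all(self):
        return self._total

    def deleted_count(self):
        return len(self._deleted)

    def delete_document(self, docnum, delete=True):
        if delete:
            self._deleted.add(docnum)
        else:
            # delete=False undeletes a previously deleted document.
            self._deleted.discard(docnum)

    def is_deleted(self, docnum):
        return docnum in self._deleted


seg = InMemorySegment(total=4)
seg.delete_document(2)
counts_after_delete = (seg.doc_count_all(), seg.deleted_count(), seg.doc_count())
seg.delete_document(2, delete=False)
counts_after_undelete = (seg.doc_count_all(), seg.deleted_count(), seg.doc_count())
```

Note that doc_count_all() stays constant in this sketch: marking a document as deleted only changes the bookkeeping, which matches the statement that documents are physically removed only on optimization.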

open_file(storage, ext, **kwargs)

Convenience method to open a file in the given storage named with this segment’s ID and the given extension. Any keyword arguments are passed to the storage’s open_file method.
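
The two convenience methods, create_file and open_file, can be pictured as thin wrappers that build a file name from the segment ID and the extension and then delegate to the storage object. The sketch below is a self-contained toy: ToyStorage, ToySegment, the make_filename helper, and the naming scheme (ID followed directly by the extension) are all assumptions for illustration, not Whoosh's actual classes.

```python
import io


class _ToyFile(io.BytesIO):
    """In-memory file that saves its contents back to storage on close."""

    def __init__(self, storage, name):
        super().__init__()
        self._storage = storage
        self._name = name

    def close(self):
        if not self.closed:
            self._storage.files[self._name] = self.getvalue()
        super().close()


class ToyStorage:
    """Hypothetical in-memory storage (not a Whoosh Storage class)."""

    def __init__(self):
        self.files = {}
        self.last_kwargs = None

    def create_file(self, name, **kwargs):
        self.last_kwargs = kwargs  # recorded only to show the pass-through
        self.files[name] = b""
        return _ToyFile(self, name)

    def open_file(self, name, **kwargs):
        return io.BytesIO(self.files[name])


class ToySegment:
    """Hypothetical segment that names files after its own ID."""

    def __init__(self, segid):
        self._segid = segid

    def make_filename(self, ext):
        # Assumed scheme: segment ID immediately followed by the extension.
        return f"{self._segid}{ext}"

    def create_file(self, storage, ext, **kwargs):
        return storage.create_file(self.make_filename(ext), **kwargs)

    def open_file(self, storage, ext, **kwargs):
        return storage.open_file(self.make_filename(ext), **kwargs)


storage = ToyStorage()
seg = ToySegment("MAIN_abc123")
with seg.create_file(storage, ".dat", mode="example") as f:
    f.write(b"hello")
with seg.open_file(storage, ".dat") as f:
    data = f.read()
```

The caller never spells out the full file name; it passes only the extension, and any extra keyword arguments reach the storage method unchanged.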