matching module

Matchers

class whoosh.matching.Matcher[source]

Base class for all matchers.

all_ids()[source]

Returns a generator of all IDs in the matcher.

The result is undefined for a matcher that has already read some postings: it may yield only the remaining postings, or all postings from the beginning. Use this method only on fresh matchers.

all_items()[source]

Returns a generator of all (ID, encoded value) pairs in the matcher.

The result is undefined for a matcher that has already read some postings: it may yield only the remaining postings, or all postings from the beginning. Use this method only on fresh matchers.

block_quality()[source]

Returns a quality measurement of the current block of postings, according to the current weighting algorithm. Raises NoQualityAvailable if the matcher or weighting does not support quality measurements.

children()[source]

Returns a (possibly empty) list of the submatchers of this matcher.

abstract copy()[source]

Returns a copy of this matcher.

depth()[source]

Returns the depth of the tree under this matcher, or 0 if this matcher does not have any children.
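The relationship between children() and depth() can be sketched with a small stand-in tree. `SketchNode` below is a hypothetical class, not part of Whoosh, and the recursion assumes that depth counts the levels of sub-matchers below the current one.

```python
class SketchNode:
    """Hypothetical stand-in for a matcher that only models children()/depth()."""

    def __init__(self, *children):
        self._children = list(children)

    def children(self):
        # A leaf (for example a term matcher) has no sub-matchers.
        return self._children

    def depth(self):
        kids = self.children()
        if not kids:
            return 0
        # One level for the children themselves, plus the deepest subtree.
        return 1 + max(kid.depth() for kid in kids)


leaf = SketchNode()
tree = SketchNode(SketchNode(), SketchNode(SketchNode()))
print(leaf.depth(), tree.depth())  # prints: 0 2
```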

abstract id()[source]

Returns the ID of the current posting.

abstract is_active()[source]

Returns True if this matcher is still “active”, that is, it has not yet reached the end of the posting list.

items_as(astype)[source]

Returns a generator of all (ID, decoded value) pairs in the matcher.

The result is undefined for a matcher that has already read some postings: it may yield only the remaining postings, or all postings from the beginning. Use this method only on fresh matchers.

matching_terms(id=None)[source]

Returns an iterator of ("fieldname", "termtext") tuples for the currently matching term matchers in this tree.

max_quality()[source]

Returns the maximum possible quality measurement for this matcher, according to the current weighting algorithm. Raises NoQualityAvailable if the matcher or weighting does not support quality measurements.

abstract next()[source]

Moves this matcher to the next posting.
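Together, is_active(), id(), and next() form the basic iteration protocol. The loop below is a minimal sketch of that protocol. `SketchMatcher` is a hypothetical stand-in (not Whoosh code) so that the snippet runs without an index; a real matcher would normally be obtained from a query or reader.

```python
class SketchMatcher:
    """Hypothetical stand-in mimicking the documented id()/next()/is_active() protocol."""

    def __init__(self, ids):
        self._ids = sorted(ids)  # posting lists are ordered by ID
        self._pos = 0

    def is_active(self):
        # Active until the cursor moves past the last posting.
        return self._pos < len(self._ids)

    def id(self):
        return self._ids[self._pos]

    def next(self):
        self._pos += 1


def collect_ids(matcher):
    """Drain a matcher using the is_active()/id()/next() loop."""
    found = []
    while matcher.is_active():
        found.append(matcher.id())
        matcher.next()
    return found


print(collect_ids(SketchMatcher([7, 1, 4])))  # prints: [1, 4, 7]
```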

replace(minquality=0)[source]

Returns a possibly-simplified version of this matcher. For example, if one of the children of a UnionMatcher is no longer active, calling this method on the UnionMatcher will return the other child.
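Because replace() returns a matcher rather than modifying one in place, callers would normally rebind the result. The sketch below illustrates the UnionMatcher case described above with hypothetical stand-in classes (`SketchLeaf`, `SketchUnion`); the `minquality` argument is not modelled.

```python
class SketchLeaf:
    """Hypothetical leaf matcher over a fixed list of IDs."""

    def __init__(self, ids):
        self._ids = list(ids)
        self._pos = 0

    def is_active(self):
        return self._pos < len(self._ids)

    def replace(self):
        # Nothing simpler than a leaf, so it returns itself.
        return self


class SketchUnion:
    """Hypothetical union that only models is_active() and replace()."""

    def __init__(self, a, b):
        self.a, self.b = a, b

    def is_active(self):
        return self.a.is_active() or self.b.is_active()

    def replace(self):
        a_active, b_active = self.a.is_active(), self.b.is_active()
        if a_active and not b_active:
            return self.a.replace()  # only `a` can still produce postings
        if b_active and not a_active:
            return self.b.replace()  # only `b` can still produce postings
        return self


exhausted, live = SketchLeaf([]), SketchLeaf([3, 9])
matcher = SketchUnion(exhausted, live)
matcher = matcher.replace()  # rebind: the union collapses to the live child
print(matcher is live)  # prints: True
```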

abstract reset()[source]

Returns to the start of the posting list.

Note that reset() may not do what you expect after you call Matcher.replace(), since this can mean calling reset() not on the original matcher, but on an optimized replacement.

abstract score()[source]

Returns the score of the current posting.

skip_to(id)[source]

Moves this matcher to the first posting with an ID equal to or greater than the given ID.
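A minimal sketch of the skip_to() contract, using a hypothetical stand-in class (`SkippableMatcher`, not Whoosh code) backed by a sorted list. It assumes that skipping never moves the matcher backwards, so a target below the current ID leaves the position unchanged.

```python
from bisect import bisect_left


class SkippableMatcher:
    """Hypothetical stand-in that models id(), is_active() and skip_to()."""

    def __init__(self, ids):
        self._ids = sorted(ids)
        self._pos = 0

    def is_active(self):
        return self._pos < len(self._ids)

    def id(self):
        return self._ids[self._pos]

    def skip_to(self, target):
        # Search only from the current position onward, so the cursor
        # lands on the first ID >= target and never moves backwards.
        self._pos = bisect_left(self._ids, target, self._pos)


m = SkippableMatcher([2, 5, 9, 14])
m.skip_to(6)
print(m.id())  # prints: 9  (first ID >= 6)
m.skip_to(100)
print(m.is_active())  # prints: False  (no ID >= 100)
```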

skip_to_quality(minquality)[source]

Moves this matcher to the next block whose quality is greater than the given minimum quality value.

spans()[source]

Returns a list of Span objects for the matches in this document. Raises an exception if the field being searched does not store positions.

abstract supports(astype)[source]

Returns True if the field’s format supports the named data type, for example ‘frequency’ or ‘characters’.

supports_block_quality()[source]

Returns True if this matcher supports the use of max_quality() and block_quality().

term()[source]

Returns a ("fieldname", "termtext") tuple for the term this matcher matches, or None if this matcher is not a term matcher.

term_matchers()[source]

Returns an iterator of term matchers in this tree.

abstract value()[source]

Returns the encoded value of the current posting.

abstract value_as(astype)[source]

Returns the value(s) of the current posting as the given type.

weight()[source]

Returns the weight of the current posting.

whoosh.matching.NullMatcher

alias of <NullMatcher>

class whoosh.matching.ListMatcher(ids, weights=None, values=None, format=None, scorer=None, position=0, all_weights=None, term=None, terminfo=None)[source]

Synthetic matcher backed by a list of IDs.

Parameters:
  • ids – a list of doc IDs.

  • weights – a list of weights corresponding to the list of IDs. If this argument is not supplied, a list of 1.0 values is used.

  • values – a list of encoded values corresponding to the list of IDs.

  • format – a whoosh.formats.Format object representing the format of the field.

  • scorer – a whoosh.scoring.BaseScorer object for scoring the postings.

  • term – a ("fieldname", "text") tuple, or None if this is not a term matcher.

class whoosh.matching.WrappingMatcher(child, boost=1.0)[source]

Base class for matchers that wrap sub-matchers.

class whoosh.matching.MultiMatcher(matchers, idoffsets, scorer=None, current=0)[source]

Serializes the results of a list of sub-matchers.

Parameters:
  • matchers – a list of Matcher objects.

  • idoffsets – a list of offsets corresponding to items in the matchers list.
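The role of idoffsets can be sketched as follows. This is an illustration rather than Whoosh's implementation: it assumes that each sub-matcher reports IDs local to its own segment and that the corresponding offset is added to produce a global ID. The function name `serialize` is hypothetical.

```python
def serialize(posting_lists, idoffsets):
    """Yield global IDs from several per-segment ID lists, one list after another."""
    for local_ids, offset in zip(posting_lists, idoffsets):
        for local_id in local_ids:
            # Assumption: global ID = local ID + the segment's offset.
            yield local_id + offset


segments = [[0, 2], [1, 3]]  # IDs local to each sub-matcher
offsets = [0, 10]            # the second segment starts at global ID 10
print(list(serialize(segments, offsets)))  # prints: [0, 2, 11, 13]
```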

class whoosh.matching.FilterMatcher(child, ids, exclude=False, boost=1.0)[source]

Filters the postings from the wrapped matcher based on whether the IDs are present in or absent from a set.

Parameters:
  • child – the child matcher.

  • ids – a set of IDs to filter by.

  • exclude – by default, only IDs from the wrapped matcher that are in the set are used. If this argument is True, only IDs from the wrapped matcher that are not in the set are used.
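The effect of the exclude flag can be sketched with a plain generator. `filter_ids` is a hypothetical helper operating on lists of IDs, not the class itself.

```python
def filter_ids(child_ids, ids, exclude=False):
    """Yield IDs from child_ids that are in `ids` (or not in it when exclude=True)."""
    for docid in child_ids:
        # Keep the ID when membership and the exclude flag disagree:
        # in the set with exclude=False, or out of the set with exclude=True.
        if (docid in ids) != exclude:
            yield docid


child = [1, 2, 3, 4]
allowed = {2, 4}
print(list(filter_ids(child, allowed)))                # prints: [2, 4]
print(list(filter_ids(child, allowed, exclude=True)))  # prints: [1, 3]
```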

class whoosh.matching.BiMatcher(a, b)[source]

Base class for matchers that combine the results of two sub-matchers in some way.

class whoosh.matching.AdditiveBiMatcher(a, b)[source]

Base class for binary matchers where the scores of the sub-matchers are added together.

class whoosh.matching.UnionMatcher(a, b)[source]

Matches the union (OR) of the postings in the two sub-matchers.
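A union over two ID-ordered posting lists amounts to a sorted merge. The sketch below works on plain lists of (ID, score) pairs rather than matcher objects, and assumes that scores are added when both sides match the same ID, as described for AdditiveBiMatcher. The function name `union` is hypothetical.

```python
def union(a, b):
    """Merge two ID-sorted lists of (doc_id, score) pairs, adding scores on shared IDs."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        (a_id, a_score), (b_id, b_score) = a[i], b[j]
        if a_id == b_id:
            out.append((a_id, a_score + b_score))
            i += 1
            j += 1
        elif a_id < b_id:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    # One side is exhausted; the rest of the other passes through unchanged.
    out.extend(a[i:])
    out.extend(b[j:])
    return out


print(union([(1, 1.0), (3, 2.0)], [(3, 0.5), (4, 1.0)]))
# prints: [(1, 1.0), (3, 2.5), (4, 1.0)]
```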

class whoosh.matching.DisjunctionMaxMatcher(a, b, tiebreak=0.0)[source]

Matches the union (OR) of two sub-matchers. Where both sub-matchers match the same posting, returns the weight/score of the higher-scoring posting.
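The scoring rule can be sketched with dictionaries mapping document IDs to scores. `dismax` is a hypothetical helper, and the `tiebreak` parameter is deliberately not modelled because its effect is not described here.

```python
def dismax(a, b):
    """Union of two {doc_id: score} mappings, keeping the higher score on shared IDs."""
    merged = {}
    for postings in (a, b):
        for docid, score in postings.items():
            # Keep the larger of the existing and the new score.
            merged[docid] = max(score, merged.get(docid, score))
    return sorted(merged.items())


print(dismax({1: 1.0, 3: 2.0}, {3: 0.5, 4: 1.0}))
# prints: [(1, 1.0), (3, 2.0), (4, 1.0)]
```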

class whoosh.matching.IntersectionMatcher(a, b)[source]

Matches the intersection (AND) of the postings in the two sub-matchers.
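Intersections are where skip_to() pays off: whichever side is behind can jump straight to the other side's current ID. The sketch below shows that "leapfrog" idea on two sorted lists of IDs, using `bisect_left` as a stand-in for skip_to(). It illustrates the technique and is not Whoosh's implementation; the function name `intersect` is hypothetical.

```python
from bisect import bisect_left


def intersect(a_ids, b_ids):
    """Yield IDs present in both sorted lists, leapfrogging the side that is behind."""
    i = j = 0
    while i < len(a_ids) and j < len(b_ids):
        if a_ids[i] == b_ids[j]:
            yield a_ids[i]
            i += 1
            j += 1
        elif a_ids[i] < b_ids[j]:
            # Stand-in for a.skip_to(b.id()): first position in a with ID >= b's ID.
            i = bisect_left(a_ids, b_ids[j], i)
        else:
            # Stand-in for b.skip_to(a.id()).
            j = bisect_left(b_ids, a_ids[i], j)


print(list(intersect([1, 3, 5, 8, 13], [2, 3, 8, 9, 13, 21])))  # prints: [3, 8, 13]
```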

class whoosh.matching.AndNotMatcher(a, b)[source]

Matches the postings in the first sub-matcher that are NOT present in the second sub-matcher.
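In terms of plain ID lists, the result is a set difference that preserves the order of the first list. `and_not` is a hypothetical helper used only for illustration.

```python
def and_not(a_ids, b_ids):
    """Return the IDs from a_ids that do not appear in b_ids, in their original order."""
    excluded = set(b_ids)
    return [docid for docid in a_ids if docid not in excluded]


print(and_not([1, 2, 3, 5, 8], [2, 8, 9]))  # prints: [1, 3, 5]
```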

class whoosh.matching.InverseMatcher(child, limit, missing=None, weight=1.0, id=0)[source]

Synthetic matcher that generates postings that are NOT present in the wrapped matcher.
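A rough sketch of the idea, assuming that `limit` is an exclusive upper bound on the document IDs to generate (for example, the total number of documents). The `missing`, `weight`, and `id` parameters are not modelled, and `inverse` is a hypothetical helper.

```python
def inverse(child_ids, limit):
    """Return every ID in range(limit) that the wrapped matcher does not contain."""
    present = set(child_ids)
    # Assumption: valid document IDs run from 0 up to, but not including, limit.
    return [docid for docid in range(limit) if docid not in present]


print(inverse([1, 3], 5))  # prints: [0, 2, 4]
```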

class whoosh.matching.RequireMatcher(a, b)[source]

Matches postings that are in both sub-matchers, but only uses scores from the first.

class whoosh.matching.AndMaybeMatcher(a, b)[source]

Matches postings in the first sub-matcher, and if the same posting is in the second sub-matcher, adds their scores.
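RequireMatcher and AndMaybeMatcher differ in which side decides whether a document matches and which side contributes to the score. The sketch below contrasts the two using dictionaries that map document IDs to scores; `require` and `and_maybe` are hypothetical helpers, not Whoosh functions.

```python
def require(a, b):
    """IDs must appear in both mappings; the score comes from `a` only."""
    return [(docid, a[docid]) for docid in sorted(a) if docid in b]


def and_maybe(a, b):
    """Every ID in `a` matches; `b` only adds to the score where it also matches."""
    return [(docid, a[docid] + b.get(docid, 0.0)) for docid in sorted(a)]


a = {1: 1.0, 2: 2.0}
b = {2: 0.5, 3: 4.0}
print(require(a, b))    # prints: [(2, 2.0)]
print(and_maybe(a, b))  # prints: [(1, 1.0), (2, 2.5)]
```

In both cases document 3 is absent from the result, because it appears only in the second mapping.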

class whoosh.matching.ConstantScoreMatcher(score=1.0)[source]

Exceptions

exception whoosh.matching.ReadTooFar[source]

Raised when next() or skip_to() is called on an inactive matcher.

exception whoosh.matching.NoQualityAvailable[source]

Raised when quality methods are called on a matcher that does not support block quality optimizations.