public abstract class BlendedTermQuery
BlendedTermQuery can be used to unify term statistics across
one or more fields in the index. A common problem with structured
documents is that a term that is significant in one field might not be
significant in other fields, for example when documents represent
users with a "first_name" and a "second_name". When someone searches
for "simon", they will very likely get "paul simon" first, since "simon" is
an uncommon last name, i.e. it has a low document frequency in that field. This query
"lies" about the global statistics, such as document frequency and
total term frequency, so that ranking is based on the estimated statistics.
Aggregating the total term frequency is trivial since it
can simply be summed across fields, but not every Similarity
makes use of this statistic. The document frequency, by contrast,
can only be estimated as a lower bound since it is a document-based statistic. For
the document frequency, the maximum across all fields is used for each term,
which is the minimum number of documents the term occurs in.
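
The aggregation rule described above can be illustrated with a minimal sketch. It is not the actual implementation of this class: the names BlendedStatsSketch, FieldTermStats, BlendedStats and blend are hypothetical, the numbers are invented for illustration, and any further adjustments the real query may apply are left out. It assumes Java 16 or later for records.

```java
import java.util.List;

public final class BlendedStatsSketch {

    // Hypothetical holder for the statistics of one term in one field.
    record FieldTermStats(String field, long docFreq, long totalTermFreq) {}

    // Hypothetical holder for the statistics reported for the term in every field.
    record BlendedStats(long docFreq, long totalTermFreq) {}

    static BlendedStats blend(List<FieldTermStats> perField) {
        long maxDocFreq = 0;
        long sumTotalTermFreq = 0;
        for (FieldTermStats stats : perField) {
            // The set of documents containing the term in any field is at least
            // as large as the largest per-field set, so the maximum is a lower bound.
            maxDocFreq = Math.max(maxDocFreq, stats.docFreq());
            // Occurrences are counted per field, so they can simply be added up.
            sumTotalTermFreq += stats.totalTermFreq();
        }
        return new BlendedStats(maxDocFreq, sumTotalTermFreq);
    }

    public static void main(String[] args) {
        // Invented numbers: "simon" is common as a first name, rare as a last name.
        BlendedStats blended = blend(List.of(
                new FieldTermStats("first_name", 900, 950),
                new FieldTermStats("second_name", 10, 10)));
        System.out.println("docFreq=" + blended.docFreq()
                + " totalTermFreq=" + blended.totalTermFreq());
    }
}
```

With blended statistics, "simon" is treated as equally common in both fields, so a match in "second_name" no longer outranks a match in "first_name" purely because the term is rare there.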