public final class CustomPostingsHighlighter
extends org.apache.lucene.search.postingshighlight.PostingsHighlighter
Subclass of PostingsHighlighter that works for a single field in a single document.
Uses a custom PassageFormatter. Accepts the field content as a constructor argument, since loading
is custom and can be done by reading from the _source field. Supports using different BreakIterator
implementations to break the text into fragments. Treats every distinct field value as a discrete
passage for highlighting (unless the whole content needs to be highlighted). Supports returning either
empty snippets or non-highlighted snippets when no highlighting can be performed.
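To illustrate how a BreakIterator breaks text into fragments, here is a minimal sketch that uses only the JDK's java.text.BreakIterator. The class name BreakIteratorSketch, the helper fragments and the sample text are assumptions made for this illustration; they are not part of CustomPostingsHighlighter or of Lucene.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class BreakIteratorSketch {

    // Hypothetical helper: splits text at the boundaries reported by the given BreakIterator.
    static List<String> fragments(String text, BreakIterator bi) {
        List<String> out = new ArrayList<>();
        bi.setText(text);
        int start = bi.first();
        // Walk boundary to boundary until the iterator reports DONE.
        for (int end = bi.next(); end != BreakIterator.DONE; start = end, end = bi.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        String value = "Lucene is a search library. Highlighting marks matched terms.";
        // A sentence iterator yields one fragment per sentence.
        for (String fragment : fragments(value, BreakIterator.getSentenceInstance(Locale.ENGLISH))) {
            System.out.println("[" + fragment + "]");
        }
    }
}
```

Passing a different instance, for example BreakIterator.getWordInstance(Locale.ENGLISH), changes the fragment granularity without changing the loop.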
The use that we make of the postings highlighter is not optimal. It would be much better to highlight
multiple docs in a single call, as we actually lose its sequential IO. That would require
refactoring the Elasticsearch highlight API, which currently works per hit.
Constructor and Description |
---|
CustomPostingsHighlighter(org.apache.lucene.analysis.Analyzer analyzer, CustomPassageFormatter passageFormatter, java.text.BreakIterator breakIterator, java.lang.String fieldValue, boolean returnNonHighlightedSnippets) Creates a new instance of CustomPostingsHighlighter |
CustomPostingsHighlighter(org.apache.lucene.analysis.Analyzer analyzer, CustomPassageFormatter passageFormatter, java.lang.String fieldValue, boolean returnNonHighlightedSnippets) Creates a new instance of CustomPostingsHighlighter |
Modifier and Type | Method and Description |
---|---|
protected java.text.BreakIterator | getBreakIterator(java.lang.String field) |
protected org.apache.lucene.search.postingshighlight.Passage[] | getEmptyHighlight(java.lang.String fieldName, java.text.BreakIterator bi, int maxPassages) |
protected org.apache.lucene.search.postingshighlight.PassageFormatter | getFormatter(java.lang.String field) |
protected org.apache.lucene.analysis.Analyzer | getIndexAnalyzer(java.lang.String field) |
Snippet[] | highlightField(java.lang.String field, org.apache.lucene.search.Query query, org.apache.lucene.search.IndexSearcher searcher, int docId, int maxPassages) Highlights terms extracted from the provided query within the content of the provided field name |
protected java.lang.String[][] | loadFieldValues(org.apache.lucene.search.IndexSearcher searcher, java.lang.String[] fields, int[] docids, int maxLength) |
public CustomPostingsHighlighter(org.apache.lucene.analysis.Analyzer analyzer, CustomPassageFormatter passageFormatter, java.lang.String fieldValue, boolean returnNonHighlightedSnippets)
Creates a new instance of CustomPostingsHighlighter
Parameters:
analyzer - the analyzer used for the field at index time, used internally for multi-term queries
passageFormatter - our own PassageFormatter, which generates snippets in the form of Snippet objects
fieldValue - the original field values, passed as a constructor argument and loaded from the _source field or the relevant stored field
returnNonHighlightedSnippets - whether non-highlighted snippets should be returned rather than empty snippets when no highlighting can be performed
public CustomPostingsHighlighter(org.apache.lucene.analysis.Analyzer analyzer, CustomPassageFormatter passageFormatter, java.text.BreakIterator breakIterator, java.lang.String fieldValue, boolean returnNonHighlightedSnippets)
Creates a new instance of CustomPostingsHighlighter
Parameters:
analyzer - the analyzer used for the field at index time, used internally for multi-term queries
passageFormatter - our own PassageFormatter, which generates snippets in the form of Snippet objects
breakIterator - a BreakIterator instance, selected depending on the highlighting options
fieldValue - the original field values, passed as a constructor argument and loaded from the _source field or the relevant stored field
returnNonHighlightedSnippets - whether non-highlighted snippets should be returned rather than empty snippets when no highlighting can be performed
public Snippet[] highlightField(java.lang.String field, org.apache.lucene.search.Query query, org.apache.lucene.search.IndexSearcher searcher, int docId, int maxPassages) throws java.io.IOException
Throws: java.io.IOException
protected org.apache.lucene.search.postingshighlight.PassageFormatter getFormatter(java.lang.String field)
Overrides: getFormatter in class org.apache.lucene.search.postingshighlight.PostingsHighlighter
protected java.text.BreakIterator getBreakIterator(java.lang.String field)
Overrides: getBreakIterator in class org.apache.lucene.search.postingshighlight.PostingsHighlighter
protected org.apache.lucene.search.postingshighlight.Passage[] getEmptyHighlight(java.lang.String fieldName, java.text.BreakIterator bi, int maxPassages)
Overrides: getEmptyHighlight in class org.apache.lucene.search.postingshighlight.PostingsHighlighter
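The choice between empty snippets and non-highlighted snippets can be pictured with a plain-Java sketch. It is not the Lucene or Elasticsearch implementation. It assumes the fallback takes the leading fragments of the field value, up to maxPassages, and it uses strings instead of Passage objects. The class EmptyHighlightSketch and its method emptyHighlight are hypothetical names introduced only for this illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class EmptyHighlightSketch {

    // Hypothetical stand-in for the fallback used when no term matches.
    // Assumption: non-highlighted snippets are the first fragments of the field value.
    static List<String> emptyHighlight(String fieldValue, BreakIterator bi,
                                       int maxPassages, boolean returnNonHighlightedSnippets) {
        if (!returnNonHighlightedSnippets) {
            // Caller asked for empty snippets when nothing can be highlighted.
            return Collections.emptyList();
        }
        List<String> passages = new ArrayList<>();
        bi.setText(fieldValue);
        int start = bi.first();
        int end = bi.next();
        // Collect leading fragments until the limit or the end of the text is reached.
        while (end != BreakIterator.DONE && passages.size() < maxPassages) {
            passages.add(fieldValue.substring(start, end));
            start = end;
            end = bi.next();
        }
        return passages;
    }

    public static void main(String[] args) {
        String value = "First sentence. Second sentence.";
        BreakIterator bi = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        System.out.println(emptyHighlight(value, bi, 1, false));
        System.out.println(emptyHighlight(value, bi, 1, true));
    }
}
```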
protected org.apache.lucene.analysis.Analyzer getIndexAnalyzer(java.lang.String field)
Overrides: getIndexAnalyzer in class org.apache.lucene.search.postingshighlight.PostingsHighlighter
protected java.lang.String[][] loadFieldValues(org.apache.lucene.search.IndexSearcher searcher, java.lang.String[] fields, int[] docids, int maxLength) throws java.io.IOException
Overrides: loadFieldValues in class org.apache.lucene.search.postingshighlight.PostingsHighlighter
Throws: java.io.IOException