Class CustomUnifiedHighlighter

java.lang.Object
org.apache.lucene.search.uhighlight.UnifiedHighlighter
org.apache.lucene.search.uhighlight.CustomUnifiedHighlighter

public class CustomUnifiedHighlighter
extends org.apache.lucene.search.uhighlight.UnifiedHighlighter
Subclass of the UnifiedHighlighter that works for a single field in a single document. Uses a custom PassageFormatter. Accepts the field content from the caller (highlightField takes a supplier for the field value) rather than loading it itself, since the value can be read from the _source field. Supports different BreakIterator implementations for breaking the text into fragments. Treats every distinct field value as a discrete passage for highlighting (unless the whole content needs to be highlighted). Can return either empty snippets or non-highlighted snippets when no highlighting can be performed.
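The idea of treating each field value as its own passage can be illustrated with a minimal sketch that uses only the JDK. It assumes the values of a multi-valued field are joined with a separator character and that passage boundaries are recovered from that separator. The class name, the helper methods, and the separator value are hypothetical; the actual value of MULTIVAL_SEP_CHAR is not stated on this page, and the real class may work differently.

```java
import java.util.ArrayList;
import java.util.List;

public class MultiValuePassages {
    // Hypothetical separator for this sketch only; not the documented
    // value of MULTIVAL_SEP_CHAR.
    static final char SEP = '\u001F';

    // Join the values of a multi-valued field into one string to highlight.
    public static String join(List<String> values) {
        return String.join(String.valueOf(SEP), values);
    }

    // Return one [start, end) pair per field value, so that each value
    // can be handled as a discrete passage.
    public static List<int[]> passageBounds(String joined) {
        List<int[]> bounds = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= joined.length(); i++) {
            if (i == joined.length() || joined.charAt(i) == SEP) {
                bounds.add(new int[] {start, i});
                start = i + 1;
            }
        }
        return bounds;
    }

    public static void main(String[] args) {
        String joined = join(List.of("first value", "second value"));
        for (int[] b : passageBounds(joined)) {
            System.out.println(b[0] + "-" + b[1] + ": " + joined.substring(b[0], b[1]));
        }
    }
}
```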
  • Nested Class Summary

    Nested classes/interfaces inherited from class org.apache.lucene.search.uhighlight.UnifiedHighlighter

    org.apache.lucene.search.uhighlight.UnifiedHighlighter.HighlightFlag, org.apache.lucene.search.uhighlight.UnifiedHighlighter.LimitedStoredFieldVisitor, org.apache.lucene.search.uhighlight.UnifiedHighlighter.OffsetSource
  • Field Summary

    Fields
    Modifier and Type Field Description
    static char MULTIVAL_SEP_CHAR  

    Fields inherited from class org.apache.lucene.search.uhighlight.UnifiedHighlighter

    DEFAULT_CACHE_CHARS_THRESHOLD, DEFAULT_MAX_LENGTH, fieldInfos, indexAnalyzer, searcher, ZERO_LEN_AUTOMATA_ARRAY
  • Constructor Summary

    Constructors
    Constructor Description
    CustomUnifiedHighlighter​(org.apache.lucene.search.IndexSearcher searcher, org.apache.lucene.analysis.Analyzer analyzer, org.apache.lucene.search.uhighlight.UnifiedHighlighter.OffsetSource offsetSource, org.apache.lucene.search.uhighlight.PassageFormatter passageFormatter, java.util.Locale breakIteratorLocale, java.text.BreakIterator breakIterator, java.lang.String index, java.lang.String field, org.apache.lucene.search.Query query, int noMatchSize, int maxPassages, java.util.function.Predicate<java.lang.String> fieldMatcher, int maxAnalyzedOffset, java.lang.Integer queryMaxAnalyzedOffset)
    Creates a new instance of CustomUnifiedHighlighter
  • Method Summary

    Modifier and Type Method Description
    protected java.text.BreakIterator getBreakIterator​(java.lang.String field)  
    protected org.apache.lucene.search.uhighlight.FieldHighlighter getFieldHighlighter​(java.lang.String field, org.apache.lucene.search.Query query, java.util.Set<org.apache.lucene.index.Term> allTerms, int maxPassages)  
    org.apache.lucene.search.uhighlight.PassageFormatter getFormatter()  
    protected org.apache.lucene.search.uhighlight.PassageFormatter getFormatter​(java.lang.String field)  
    protected org.apache.lucene.search.uhighlight.UnifiedHighlighter.OffsetSource getOffsetSource​(java.lang.String field)
    Forces the offset source for this highlighter
    Snippet[] highlightField​(org.apache.lucene.index.LeafReader reader, int docId, CheckedSupplier<java.lang.String,​java.io.IOException> loadFieldValue)
    Highlights the field value.
    protected java.util.Collection<org.apache.lucene.search.Query> preSpanQueryRewrite​(org.apache.lucene.search.Query query)  

    Methods inherited from class org.apache.lucene.search.uhighlight.UnifiedHighlighter

    extractTerms, filterExtractedTerms, getAutomata, getCacheFieldValCharsThreshold, getFieldInfo, getFieldMatcher, getFlags, getHighlightComponents, getIndexAnalyzer, getIndexSearcher, getMaxLength, getMaxNoHighlightPassages, getOffsetStrategy, getOptimizedOffsetSource, getPhraseHelper, getScorer, hasUnrecognizedQuery, highlight, highlight, highlightFields, highlightFields, highlightFields, highlightFieldsAsObjects, highlightWithoutSearcher, loadFieldValues, newLimitedStoredFieldsVisitor, requiresRewrite, setBreakIterator, setCacheFieldValCharsThreshold, setFieldMatcher, setFormatter, setHandleMultiTermQuery, setHighlightPhrasesStrictly, setMaxLength, setMaxNoHighlightPassages, setScorer, shouldHandleMultiTermQuery, shouldHighlightPhrasesStrictly, shouldPreferPassageRelevancyOverSpeed

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

  • Constructor Details

    • CustomUnifiedHighlighter

      public CustomUnifiedHighlighter​(org.apache.lucene.search.IndexSearcher searcher, org.apache.lucene.analysis.Analyzer analyzer, org.apache.lucene.search.uhighlight.UnifiedHighlighter.OffsetSource offsetSource, org.apache.lucene.search.uhighlight.PassageFormatter passageFormatter, @Nullable java.util.Locale breakIteratorLocale, @Nullable java.text.BreakIterator breakIterator, java.lang.String index, java.lang.String field, org.apache.lucene.search.Query query, int noMatchSize, int maxPassages, java.util.function.Predicate<java.lang.String> fieldMatcher, int maxAnalyzedOffset, java.lang.Integer queryMaxAnalyzedOffset) throws java.io.IOException
      Creates a new instance of CustomUnifiedHighlighter
      Parameters:
      analyzer - the analyzer used for the field at index time, used for multi term queries internally.
      offsetSource - the UnifiedHighlighter.OffsetSource to use for offset retrieval.
      passageFormatter - our own CustomPassageFormatter, which generates snippets in the form of Snippet objects.
      breakIteratorLocale - the Locale to use for dividing text into passages. If null, Locale.ROOT is used.
      breakIterator - the BreakIterator to use for dividing text into passages. If null, BreakIterator.getSentenceInstance(Locale) is used.
      index - the index we're highlighting, mostly used for error messages
      field - the name of the field we're highlighting
      query - the query we're highlighting
      noMatchSize - the size of the text that should be returned when no highlighting can be performed.
      maxPassages - the maximum number of passages to highlight
      fieldMatcher - decides which terms should be highlighted
      maxAnalyzedOffset - if the field is longer than this, the ANALYZED offset source will not be used for it because re-analyzing the text would be too slow
      Throws:
      java.io.IOException
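The null fallbacks described for breakIteratorLocale and breakIterator can be sketched with the JDK alone. The class and method names below are hypothetical and only mirror the parameter descriptions above; how the constructor resolves these values internally is not shown on this page.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class BreakIteratorDefaults {
    // Mirrors the documented fallbacks: a null locale becomes Locale.ROOT,
    // a null iterator becomes BreakIterator.getSentenceInstance(locale).
    public static BreakIterator resolve(Locale locale, BreakIterator breakIterator) {
        Locale effective = locale != null ? locale : Locale.ROOT;
        return breakIterator != null
            ? breakIterator
            : BreakIterator.getSentenceInstance(effective);
    }

    // Split text into passages at the boundaries the iterator reports.
    public static List<String> passages(String text, BreakIterator it) {
        List<String> out = new ArrayList<>();
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        BreakIterator it = resolve(null, null); // both defaults apply
        for (String p : passages("The quick fox jumps. It lands safely.", it)) {
            System.out.println("[" + p + "]");
        }
    }
}
```

Passing a different iterator, for example BreakIterator.getWordInstance(), changes the passage granularity without touching the rest of the sketch.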
  • Method Details

    • highlightField

      public Snippet[] highlightField​(org.apache.lucene.index.LeafReader reader, int docId, CheckedSupplier<java.lang.String,​java.io.IOException> loadFieldValue) throws java.io.IOException
      Highlights the field value.
      Throws:
      java.io.IOException
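The loadFieldValue parameter is a supplier that may throw IOException, so the caller decides how the field value is obtained and the value is only loaded when it is actually requested. The sketch below shows that calling pattern in isolation. The CheckedSupplier interface here is a local stand-in with the same shape as the parameter type, and lengthIfNeeded is a hypothetical consumer; neither is part of the documented API.

```java
import java.io.IOException;

public class LazyFieldValue {
    // Local stand-in declared only so this sketch compiles on its own;
    // it has the same shape as CheckedSupplier<String, IOException>.
    @FunctionalInterface
    public interface CheckedSupplier<T, E extends Exception> {
        T get() throws E;
    }

    // Hypothetical consumer: invokes the supplier only when the value is
    // needed and lets any IOException propagate to the caller.
    public static int lengthIfNeeded(boolean needsValue,
                                     CheckedSupplier<String, IOException> loadFieldValue)
            throws IOException {
        if (!needsValue) {
            return 0; // the supplier is never invoked on this path
        }
        String value = loadFieldValue.get();
        return value == null ? 0 : value.length();
    }

    public static void main(String[] args) throws IOException {
        // The lambda stands in for reading the value from _source.
        System.out.println(lengthIfNeeded(true, () -> "some field content"));
    }
}
```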
    • getBreakIterator

      protected java.text.BreakIterator getBreakIterator​(java.lang.String field)
      Overrides:
      getBreakIterator in class org.apache.lucene.search.uhighlight.UnifiedHighlighter
    • getFormatter

      public org.apache.lucene.search.uhighlight.PassageFormatter getFormatter()
    • getFormatter

      protected org.apache.lucene.search.uhighlight.PassageFormatter getFormatter​(java.lang.String field)
      Overrides:
      getFormatter in class org.apache.lucene.search.uhighlight.UnifiedHighlighter
    • getFieldHighlighter

      protected org.apache.lucene.search.uhighlight.FieldHighlighter getFieldHighlighter​(java.lang.String field, org.apache.lucene.search.Query query, java.util.Set<org.apache.lucene.index.Term> allTerms, int maxPassages)
      Overrides:
      getFieldHighlighter in class org.apache.lucene.search.uhighlight.UnifiedHighlighter
    • preSpanQueryRewrite

      protected java.util.Collection<org.apache.lucene.search.Query> preSpanQueryRewrite​(org.apache.lucene.search.Query query)
      Overrides:
      preSpanQueryRewrite in class org.apache.lucene.search.uhighlight.UnifiedHighlighter
    • getOffsetSource

      protected org.apache.lucene.search.uhighlight.UnifiedHighlighter.OffsetSource getOffsetSource​(java.lang.String field)
      Forces the offset source for this highlighter
      Overrides:
      getOffsetSource in class org.apache.lucene.search.uhighlight.UnifiedHighlighter