Class CustomUnifiedHighlighter


  • public class CustomUnifiedHighlighter
    extends org.apache.lucene.search.uhighlight.UnifiedHighlighter
    Subclass of the UnifiedHighlighter that works for a single field in a single document. Uses a custom PassageFormatter. Accepts the field content as a constructor argument, since the field value can be loaded by reading from the _source field. Supports using different BreakIterator implementations to break the text into fragments. Considers every distinct field value as a discrete passage for highlighting (unless the whole content needs to be highlighted). Supports returning either empty snippets or non-highlighted snippets when no highlighting can be performed.
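The handling of multi-valued fields described above can be illustrated with a small, standalone sketch. It does not use the highlighter itself; it only shows how several field values might be joined into one string with a separator character and later recovered as discrete values. The class name MultiValueSketch, its methods, and the separator value '\u0000' are assumptions made for this illustration; the actual value of MULTIVAL_SEP_CHAR is not stated on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of CustomUnifiedHighlighter.
final class MultiValueSketch {
    // Assumption: placeholder for CustomUnifiedHighlighter.MULTIVAL_SEP_CHAR,
    // whose real value is not documented on this page.
    static final char SEP = '\u0000';

    /** Joins the values of a multi-valued field into a single delimited string. */
    static String join(List<String> values) {
        return String.join(String.valueOf(SEP), values);
    }

    /** Splits a delimited string back into its discrete values. */
    static List<String> split(String joined) {
        List<String> out = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < joined.length(); i++) {
            if (joined.charAt(i) == SEP) {
                out.add(joined.substring(start, i));
                start = i + 1;
            }
        }
        out.add(joined.substring(start)); // last (or only) value
        return out;
    }

    public static void main(String[] args) {
        String joined = join(Arrays.asList("first value", "second value"));
        System.out.println(split(joined));
    }
}
```

Using a character that is unlikely to occur in real text as the separator keeps value boundaries unambiguous, which is presumably why a dedicated constant is exposed.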
    • Nested Class Summary

      • Nested classes/interfaces inherited from class org.apache.lucene.search.uhighlight.UnifiedHighlighter

        org.apache.lucene.search.uhighlight.UnifiedHighlighter.HighlightFlag, org.apache.lucene.search.uhighlight.UnifiedHighlighter.LimitedStoredFieldVisitor, org.apache.lucene.search.uhighlight.UnifiedHighlighter.OffsetSource
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static char MULTIVAL_SEP_CHAR  
      • Fields inherited from class org.apache.lucene.search.uhighlight.UnifiedHighlighter

        DEFAULT_CACHE_CHARS_THRESHOLD, DEFAULT_MAX_LENGTH, fieldInfos, indexAnalyzer, searcher, ZERO_LEN_AUTOMATA_ARRAY
    • Constructor Summary

      Constructors 
      Constructor Description
      CustomUnifiedHighlighter​(org.apache.lucene.search.IndexSearcher searcher, org.apache.lucene.analysis.Analyzer analyzer, org.apache.lucene.search.uhighlight.UnifiedHighlighter.OffsetSource offsetSource, org.apache.lucene.search.uhighlight.PassageFormatter passageFormatter, java.util.Locale breakIteratorLocale, java.text.BreakIterator breakIterator, java.lang.String fieldValue, int noMatchSize)
      Creates a new instance of CustomUnifiedHighlighter
    • Method Summary

      Modifier and Type Method Description
      protected java.text.BreakIterator getBreakIterator​(java.lang.String field)  
      protected org.apache.lucene.search.uhighlight.FieldHighlighter getFieldHighlighter​(java.lang.String field, org.apache.lucene.search.Query query, java.util.Set<org.apache.lucene.index.Term> allTerms, int maxPassages)  
      protected org.apache.lucene.search.uhighlight.PassageFormatter getFormatter​(java.lang.String field)  
      protected org.apache.lucene.search.uhighlight.UnifiedHighlighter.OffsetSource getOffsetSource​(java.lang.String field)
      Forces the offset source for this highlighter.
      Snippet[] highlightField​(java.lang.String field, org.apache.lucene.search.Query query, int docId, int maxPassages)
      Highlights terms extracted from the provided query within the content of the provided field.
      protected java.util.List<java.lang.CharSequence[]> loadFieldValues​(java.lang.String[] fields, org.apache.lucene.search.DocIdSetIterator docIter, int cacheCharsThreshold)  
      protected java.util.Collection<org.apache.lucene.search.Query> preMultiTermQueryRewrite​(org.apache.lucene.search.Query query)  
      protected java.util.Collection<org.apache.lucene.search.Query> preSpanQueryRewrite​(org.apache.lucene.search.Query query)  
      • Methods inherited from class org.apache.lucene.search.uhighlight.UnifiedHighlighter

        extractTerms, filterExtractedTerms, getAutomata, getCacheFieldValCharsThreshold, getFieldInfo, getFieldMatcher, getFlags, getIndexAnalyzer, getIndexSearcher, getMaxLength, getMaxNoHighlightPassages, getOffsetStrategy, getOptimizedOffsetSource, getPhraseHelper, getScorer, highlight, highlight, highlightFields, highlightFields, highlightFields, highlightFieldsAsObjects, highlightWithoutSearcher, newLimitedStoredFieldsVisitor, requiresRewrite, setBreakIterator, setCacheFieldValCharsThreshold, setFieldMatcher, setFormatter, setHandleMultiTermQuery, setHighlightPhrasesStrictly, setMaxLength, setMaxNoHighlightPassages, setScorer, shouldHandleMultiTermQuery, shouldHighlightPhrasesStrictly, shouldPreferPassageRelevancyOverSpeed
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • CustomUnifiedHighlighter

        public CustomUnifiedHighlighter​(org.apache.lucene.search.IndexSearcher searcher,
                                        org.apache.lucene.analysis.Analyzer analyzer,
                                        org.apache.lucene.search.uhighlight.UnifiedHighlighter.OffsetSource offsetSource,
                                        org.apache.lucene.search.uhighlight.PassageFormatter passageFormatter,
                                        @Nullable
                                        java.util.Locale breakIteratorLocale,
                                        @Nullable
                                        java.text.BreakIterator breakIterator,
                                        java.lang.String fieldValue,
                                        int noMatchSize)
        Creates a new instance of CustomUnifiedHighlighter
        Parameters:
        analyzer - the analyzer used for the field at index time; used internally for multi-term queries.
        passageFormatter - our own CustomPassageFormatter, which generates snippets in the form of Snippet objects.
        offsetSource - the UnifiedHighlighter.OffsetSource to use for offset retrieval.
        breakIteratorLocale - the Locale to use for dividing text into passages. If null, Locale.ROOT is used.
        breakIterator - the BreakIterator to use for dividing text into passages. If null, BreakIterator.getSentenceInstance(Locale) is used.
        fieldValue - the original field values, delimited by MULTIVAL_SEP_CHAR.
        noMatchSize - the size of the text that should be returned when no highlighting can be performed.
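The null-handling described for breakIteratorLocale and breakIterator can be sketched with the JDK alone. The following is a minimal illustration of that fallback logic, not the class's actual implementation; the class name BreakIteratorDefaultsSketch and the methods resolve and passages are hypothetical names introduced here.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of CustomUnifiedHighlighter.
final class BreakIteratorDefaultsSketch {

    /**
     * Mirrors the documented defaults: a null locale falls back to Locale.ROOT,
     * and a null break iterator falls back to a sentence iterator for that locale.
     */
    static BreakIterator resolve(Locale locale, BreakIterator breakIterator) {
        Locale effective = (locale == null) ? Locale.ROOT : locale;
        return (breakIterator == null)
                ? BreakIterator.getSentenceInstance(effective)
                : breakIterator;
    }

    /** Divides the text into consecutive passages at the iterator's boundaries. */
    static List<String> passages(String text, BreakIterator it) {
        List<String> out = new ArrayList<>();
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        BreakIterator it = resolve(null, null); // both arguments omitted
        for (String passage : passages("Hello world. How are you?", it)) {
            System.out.println("[" + passage + "]");
        }
    }
}
```

Passing an explicit iterator, for example BreakIterator.getWordInstance(Locale.ROOT), bypasses the sentence default entirely, which is how a caller would choose a different fragmentation strategy.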
    • Method Detail

      • highlightField

        public Snippet[] highlightField​(java.lang.String field,
                                        org.apache.lucene.search.Query query,
                                        int docId,
                                        int maxPassages)
                                 throws java.io.IOException
        Highlights terms extracted from the provided query within the content of the provided field.
        Throws:
        java.io.IOException
      • loadFieldValues

        protected java.util.List<java.lang.CharSequence[]> loadFieldValues​(java.lang.String[] fields,
                                                                           org.apache.lucene.search.DocIdSetIterator docIter,
                                                                           int cacheCharsThreshold)
                                                                    throws java.io.IOException
        Overrides:
        loadFieldValues in class org.apache.lucene.search.uhighlight.UnifiedHighlighter
        Throws:
        java.io.IOException
      • getBreakIterator

        protected java.text.BreakIterator getBreakIterator​(java.lang.String field)
        Overrides:
        getBreakIterator in class org.apache.lucene.search.uhighlight.UnifiedHighlighter
      • getFormatter

        protected org.apache.lucene.search.uhighlight.PassageFormatter getFormatter​(java.lang.String field)
        Overrides:
        getFormatter in class org.apache.lucene.search.uhighlight.UnifiedHighlighter
      • getFieldHighlighter

        protected org.apache.lucene.search.uhighlight.FieldHighlighter getFieldHighlighter​(java.lang.String field,
                                                                                           org.apache.lucene.search.Query query,
                                                                                           java.util.Set<org.apache.lucene.index.Term> allTerms,
                                                                                           int maxPassages)
        Overrides:
        getFieldHighlighter in class org.apache.lucene.search.uhighlight.UnifiedHighlighter
      • preMultiTermQueryRewrite

        protected java.util.Collection<org.apache.lucene.search.Query> preMultiTermQueryRewrite​(org.apache.lucene.search.Query query)
        Overrides:
        preMultiTermQueryRewrite in class org.apache.lucene.search.uhighlight.UnifiedHighlighter
      • preSpanQueryRewrite

        protected java.util.Collection<org.apache.lucene.search.Query> preSpanQueryRewrite​(org.apache.lucene.search.Query query)
        Overrides:
        preSpanQueryRewrite in class org.apache.lucene.search.uhighlight.UnifiedHighlighter
      • getOffsetSource

        protected org.apache.lucene.search.uhighlight.UnifiedHighlighter.OffsetSource getOffsetSource​(java.lang.String field)
        Forces the offset source for this highlighter.
        Overrides:
        getOffsetSource in class org.apache.lucene.search.uhighlight.UnifiedHighlighter