Interface TokenFilterFactory

All Known Subinterfaces:
NormalizingTokenFilterFactory
All Known Implementing Classes:
AbstractTokenFilterFactory, HunspellTokenFilterFactory, ShingleTokenFilterFactory, ShingleTokenFilterFactory.Factory, StopTokenFilterFactory

public interface TokenFilterFactory
  • Field Summary

    Fields
    Modifier and Type Field Description
    static TokenFilterFactory IDENTITY_FILTER
    A TokenFilterFactory that applies no filtering to its TokenStream.
  • Method Summary

    Modifier and Type Method Description
    default boolean breaksFastVectorHighlighter()
    Does this analyzer mess up the OffsetAttributes in such a way as to break the FastVectorHighlighter? If this is true, then the FastVectorHighlighter will attempt to work around the broken offsets.
    org.apache.lucene.analysis.TokenStream create(org.apache.lucene.analysis.TokenStream tokenStream)
    default AnalysisMode getAnalysisMode()
    Get the AnalysisMode this filter is allowed to be used in.
    default TokenFilterFactory getChainAwareTokenFilterFactory(TokenizerFactory tokenizer, java.util.List<CharFilterFactory> charFilters, java.util.List<TokenFilterFactory> previousTokenFilters, java.util.function.Function<java.lang.String,TokenFilterFactory> allFilters)
    Rewrite the TokenFilterFactory to take into account the preceding analysis chain, or refer to other TokenFilterFactories.
    default TokenFilterFactory getSynonymFilter()
    Return a version of this TokenFilterFactory appropriate for synonym parsing. Filters that should not be applied to synonyms (for example, those that produce multiple tokens) should throw an exception.
    java.lang.String name()  
    default org.apache.lucene.analysis.TokenStream normalize(org.apache.lucene.analysis.TokenStream tokenStream)
    Normalize a tokenStream for use in multi-term queries. The default implementation is a no-op.
  • Field Details

    • IDENTITY_FILTER

      static final TokenFilterFactory IDENTITY_FILTER
      A TokenFilterFactory that applies no filtering to its TokenStream.
  • Method Details

    • name

      java.lang.String name()
    • create

      org.apache.lucene.analysis.TokenStream create(org.apache.lucene.analysis.TokenStream tokenStream)
    • normalize

      default org.apache.lucene.analysis.TokenStream normalize(org.apache.lucene.analysis.TokenStream tokenStream)
      Normalize a tokenStream for use in multi-term queries. The default implementation is a no-op.
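      The split between create and normalize can be illustrated with a dependency-free sketch. Everything in it is a stand-in: NormalizeSketchFactory is a hypothetical model of this interface, and tokens are modeled as a List<String> rather than a Lucene TokenStream. It only shows the idea that a character-level filter may reuse its create logic for normalization, while a filter that removes tokens keeps the no-op default.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in for TokenFilterFactory. Tokens are modeled as a
// List<String> instead of a Lucene TokenStream so the sketch has no dependencies.
interface NormalizeSketchFactory {
    String name();

    List<String> create(List<String> tokens);

    // Mirrors the documented default: normalization is a no-op.
    default List<String> normalize(List<String> tokens) {
        return tokens;
    }
}

// Character-level filter: safe on partial terms, so normalize() reuses create().
class LowercaseNormalizeSketch implements NormalizeSketchFactory {
    @Override
    public String name() {
        return "lowercase_sketch";
    }

    @Override
    public List<String> create(List<String> tokens) {
        return tokens.stream()
                .map(t -> t.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    @Override
    public List<String> normalize(List<String> tokens) {
        return create(tokens);
    }
}

// Token-removing filter: keeps the no-op default so query terms are never dropped.
class StopNormalizeSketch implements NormalizeSketchFactory {
    private static final Set<String> STOP_WORDS = Set.of("a", "an", "the");

    @Override
    public String name() {
        return "stop_sketch";
    }

    @Override
    public List<String> create(List<String> tokens) {
        return tokens.stream()
                .filter(t -> !STOP_WORDS.contains(t))
                .collect(Collectors.toList());
    }
}

class NormalizeSketchDemo {
    public static void main(String[] args) {
        NormalizeSketchFactory lower = new LowercaseNormalizeSketch();
        NormalizeSketchFactory stop = new StopNormalizeSketch();
        // A wildcard-style term is lowercased during normalization...
        System.out.println(lower.normalize(List.of("The*")));
        // ...but the stop filter leaves it untouched.
        System.out.println(stop.normalize(List.of("the")));
    }
}
```

      In this model, a term such as "The*" is lowercased during normalization but is not removed by the stop filter, which is the behavior one would usually want for a multi-term query.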
    • breaksFastVectorHighlighter

      default boolean breaksFastVectorHighlighter()
      Does this analyzer mess up the OffsetAttributes in such a way as to break the FastVectorHighlighter? If this is true, then the FastVectorHighlighter will attempt to work around the broken offsets.
    • getChainAwareTokenFilterFactory

      default TokenFilterFactory getChainAwareTokenFilterFactory(TokenizerFactory tokenizer, java.util.List<CharFilterFactory> charFilters, java.util.List<TokenFilterFactory> previousTokenFilters, java.util.function.Function<java.lang.String,TokenFilterFactory> allFilters)
      Rewrite the TokenFilterFactory to take into account the preceding analysis chain, or refer to other TokenFilterFactories.
      Parameters:
      tokenizer - the TokenizerFactory for the preceding chain
      charFilters - any CharFilterFactories for the preceding chain
      previousTokenFilters - a list of TokenFilterFactories in the preceding chain
      allFilters - access to previously defined TokenFilterFactories
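      As a rough illustration of the "refer to other TokenFilterFactories" case, the sketch below models a filter that resolves another filter by name through allFilters. All types are hypothetical stand-ins (ChainFilterSketch, ChainTokenizerSketch, ChainCharFilterSketch), tokens are plain List<String> values, and the default of returning this is an assumption made for the sketch rather than something stated above.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical marker types standing in for TokenizerFactory and CharFilterFactory.
interface ChainTokenizerSketch {}
interface ChainCharFilterSketch {}

// Hypothetical stand-in for TokenFilterFactory, with tokens as List<String>.
interface ChainFilterSketch {
    String name();

    List<String> create(List<String> tokens);

    // Assumed default for this sketch: a filter that ignores the chain returns itself.
    default ChainFilterSketch getChainAwareTokenFilterFactory(
            ChainTokenizerSketch tokenizer,
            List<ChainCharFilterSketch> charFilters,
            List<ChainFilterSketch> previousTokenFilters,
            Function<String, ChainFilterSketch> allFilters) {
        return this;
    }
}

class UppercaseChainSketch implements ChainFilterSketch {
    @Override
    public String name() {
        return "upper_sketch";
    }

    @Override
    public List<String> create(List<String> tokens) {
        return tokens.stream()
                .map(t -> t.toUpperCase(Locale.ROOT))
                .collect(Collectors.toList());
    }
}

// A filter that is only usable after it has been rewritten against the chain.
class ReferringChainSketch implements ChainFilterSketch {
    private final String referencedName;

    ReferringChainSketch(String referencedName) {
        this.referencedName = referencedName;
    }

    @Override
    public String name() {
        return "referring_sketch";
    }

    @Override
    public List<String> create(List<String> tokens) {
        throw new IllegalStateException("Call getChainAwareTokenFilterFactory() first");
    }

    @Override
    public ChainFilterSketch getChainAwareTokenFilterFactory(
            ChainTokenizerSketch tokenizer,
            List<ChainCharFilterSketch> charFilters,
            List<ChainFilterSketch> previousTokenFilters,
            Function<String, ChainFilterSketch> allFilters) {
        // Resolve the referenced filter among the previously defined ones.
        ChainFilterSketch target = allFilters.apply(referencedName);
        if (target == null) {
            throw new IllegalArgumentException("Unknown filter [" + referencedName + "]");
        }
        String ownName = name();
        // Return a rewritten factory that delegates to the resolved filter.
        return new ChainFilterSketch() {
            @Override
            public String name() {
                return ownName;
            }

            @Override
            public List<String> create(List<String> tokens) {
                return target.create(tokens);
            }
        };
    }
}

class ChainSketchDemo {
    public static void main(String[] args) {
        Map<String, ChainFilterSketch> registry = Map.of("upper_sketch", new UppercaseChainSketch());
        ChainFilterSketch resolved = new ReferringChainSketch("upper_sketch")
                .getChainAwareTokenFilterFactory(null, List.of(), List.of(), registry::get);
        System.out.println(resolved.create(List.of("a", "b")));
    }
}
```

      The rewritten factory keeps its own name but takes its behavior from the filter it looked up. A filter that instead depends on what came earlier in the chain would inspect previousTokenFilters in the same method.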
    • getSynonymFilter

      default TokenFilterFactory getSynonymFilter()
      Return a version of this TokenFilterFactory appropriate for synonym parsing. Filters that should not be applied to synonyms (for example, those that produce multiple tokens) should throw an exception.
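      The contract described here can be sketched without any dependencies. The types below are hypothetical stand-ins: SynonymSketchFactory models this interface with tokens as List<String>, the default that returns this is an assumption, and IllegalArgumentException is only one plausible choice of exception type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for TokenFilterFactory, with tokens as List<String>.
interface SynonymSketchFactory {
    String name();

    List<String> create(List<String> tokens);

    // Assumed default for this sketch: the filter is safe for synonym parsing as-is.
    default SynonymSketchFactory getSynonymFilter() {
        return this;
    }
}

// One token in, one token out: keeps the default.
class LowercaseSynonymSketch implements SynonymSketchFactory {
    @Override
    public String name() {
        return "lowercase_sketch";
    }

    @Override
    public List<String> create(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(token.toLowerCase(Locale.ROOT));
        }
        return out;
    }
}

// Emits extra tokens (adjacent pairs), so it refuses to take part in synonym parsing.
class BigramSynonymSketch implements SynonymSketchFactory {
    @Override
    public String name() {
        return "bigram_sketch";
    }

    @Override
    public List<String> create(List<String> tokens) {
        List<String> out = new ArrayList<>(tokens);
        for (int i = 0; i + 1 < tokens.size(); i++) {
            out.add(tokens.get(i) + " " + tokens.get(i + 1));
        }
        return out;
    }

    @Override
    public SynonymSketchFactory getSynonymFilter() {
        throw new IllegalArgumentException(
                "Token filter [" + name() + "] cannot be used to parse synonyms");
    }
}
```

      A caller that builds the analysis chain for synonym rules would call getSynonymFilter() on each preceding filter and let the exception surface as a configuration error.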
    • getAnalysisMode

      default AnalysisMode getAnalysisMode()
      Get the AnalysisMode this filter is allowed to be used in. The default is AnalysisMode.ALL. Implementations should override this method to define their own restrictions.
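      A minimal sketch of how such a restriction might be declared and checked follows. SketchAnalysisMode is a hypothetical stand-in for AnalysisMode; its INDEX_TIME and SEARCH_TIME constants are assumptions, since only ALL is named above, and the allowedAt helper is invented for illustration.

```java
// Hypothetical stand-in for AnalysisMode. Only ALL is named in the text above;
// INDEX_TIME and SEARCH_TIME are assumed for the purpose of this sketch.
enum SketchAnalysisMode {
    INDEX_TIME,
    SEARCH_TIME,
    ALL
}

// Hypothetical stand-in for TokenFilterFactory, reduced to the relevant methods.
interface ModeSketchFactory {
    String name();

    // Mirrors the documented default: usable in every mode.
    default SketchAnalysisMode getAnalysisMode() {
        return SketchAnalysisMode.ALL;
    }
}

// Keeps the default, so it can be used at index time and at search time.
class UnrestrictedModeSketch implements ModeSketchFactory {
    @Override
    public String name() {
        return "unrestricted_sketch";
    }
}

// Overrides the default to restrict itself to search-time analysis.
class SearchOnlyModeSketch implements ModeSketchFactory {
    @Override
    public String name() {
        return "search_only_sketch";
    }

    @Override
    public SketchAnalysisMode getAnalysisMode() {
        return SketchAnalysisMode.SEARCH_TIME;
    }
}

class ModeSketchChecks {
    // Illustrative helper: a filter is allowed if it is unrestricted
    // or its mode matches the context in which the analyzer is used.
    static boolean allowedAt(ModeSketchFactory factory, SketchAnalysisMode context) {
        SketchAnalysisMode mode = factory.getAnalysisMode();
        return mode == SketchAnalysisMode.ALL || mode == context;
    }
}
```

      Under these assumptions, a chain containing SearchOnlyModeSketch would be rejected for an index-time analyzer but accepted for a search-time one.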