All Known Subinterfaces: NormalizingTokenFilterFactory

All Known Implementing Classes: AbstractTokenFilterFactory, HunspellTokenFilterFactory, ShingleTokenFilterFactory, ShingleTokenFilterFactory.Factory, StopTokenFilterFactory

public interface TokenFilterFactory

Field Summary

Fields

Modifier and Type

Field

Description

static final TokenFilterFactory

IDENTITY_FILTER

A TokenFilterFactory that does no filtering to its TokenStream
Method Summary

Modifier and Type

Method

Description

default boolean

breaksFastVectorHighlighter()

Does this analyzer mess up the OffsetAttributes in such a way as to break the FastVectorHighlighter? If this is true, the FastVectorHighlighter will attempt to work around the broken offsets.

org.apache.lucene.analysis.TokenStream

create(org.apache.lucene.analysis.TokenStream tokenStream)

default AnalysisMode

getAnalysisMode()

Get the AnalysisMode this filter is allowed to be used in.

default TokenFilterFactory

getChainAwareTokenFilterFactory(IndexService.IndexCreationContext context, TokenizerFactory tokenizer, List<CharFilterFactory> charFilters, List<TokenFilterFactory> previousTokenFilters, Function<String,TokenFilterFactory> allFilters)

Rewrite the TokenFilterFactory to take into account the preceding analysis chain, or refer to other TokenFilterFactories. If the token filter is part of the definition of a ReloadableCustomAnalyzer, this function is called twice: once at index creation with IndexService.IndexCreationContext.CREATE_INDEX, and again on shard recovery with IndexService.IndexCreationContext.RELOAD_ANALYZERS.

default String

getResourceName()

Get the name of the resource that this filter is based on.

default TokenFilterFactory

getSynonymFilter()

Return a version of this TokenFilterFactory appropriate for synonym parsing. Filters that should not be applied to synonyms (for example, those that produce multiple tokens) should throw an exception.

String

name()

default org.apache.lucene.analysis.TokenStream

normalize(org.apache.lucene.analysis.TokenStream tokenStream)

Normalize a tokenStream for use in multi-term queries. The default implementation is a no-op.

Field Details
- IDENTITY_FILTER
  
  static final TokenFilterFactory IDENTITY_FILTER
  
  A TokenFilterFactory that does no filtering to its TokenStream
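  
  As a rough illustration of what an identity factory amounts to, the sketch below models a token stream as a plain `List<String>` rather than Lucene's `TokenStream`. `SimpleTokenFilterFactory`, the name `"identity"`, and `IdentityFilterSketch` are hypothetical stand-ins chosen for this example, not the real Elasticsearch types.

  ```java
  import java.util.List;

  // Hypothetical stand-in for TokenFilterFactory; a token stream is modeled as List<String>.
  interface SimpleTokenFilterFactory {
      String name();

      List<String> create(List<String> tokenStream);

      // Analogue of IDENTITY_FILTER: hands back the stream it was given, unchanged.
      // Interface fields are implicitly public, static and final.
      SimpleTokenFilterFactory IDENTITY_FILTER = new SimpleTokenFilterFactory() {
          @Override
          public String name() {
              return "identity"; // assumed name, for illustration only
          }

          @Override
          public List<String> create(List<String> tokenStream) {
              return tokenStream;
          }
      };
  }

  class IdentityFilterSketch {
      public static void main(String[] args) {
          List<String> tokens = List.of("Quick", "Brown", "Fox");
          List<String> out = SimpleTokenFilterFactory.IDENTITY_FILTER.create(tokens);
          // The very same list instance comes back: nothing was filtered or wrapped.
          System.out.println(out == tokens);
      }
  }
  ```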
Method Details
- name
  
  String name()
- create
  
  org.apache.lucene.analysis.TokenStream create(org.apache.lucene.analysis.TokenStream tokenStream)
- normalize
  
  default org.apache.lucene.analysis.TokenStream normalize(org.apache.lucene.analysis.TokenStream tokenStream)
  
  Normalize a tokenStream for use in multi-term queries. The default implementation is a no-op.
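  
  The difference between `create` and `normalize` can be sketched as follows. This is a simplified model under stated assumptions: token streams are plain `List<String>` values, and `SimpleTokenFilterFactory`, `LowercaseSketchFactory`, and `StopSketchFactory` are hypothetical names, not Elasticsearch classes. A filter that only rewrites characters reuses its logic for normalization, while a filter that removes tokens keeps the no-op default.

  ```java
  import java.util.List;
  import java.util.Locale;
  import java.util.stream.Collectors;

  // Hypothetical stand-in; a token stream is modeled as List<String>.
  interface SimpleTokenFilterFactory {
      List<String> create(List<String> tokenStream);

      // Mirrors the documented default: normalization is a no-op.
      default List<String> normalize(List<String> tokenStream) {
          return tokenStream;
      }
  }

  // Lowercasing only rewrites characters, so it is also applied when normalizing.
  class LowercaseSketchFactory implements SimpleTokenFilterFactory {
      @Override
      public List<String> create(List<String> tokenStream) {
          return tokenStream.stream()
              .map(t -> t.toLowerCase(Locale.ROOT))
              .collect(Collectors.toList());
      }

      @Override
      public List<String> normalize(List<String> tokenStream) {
          return create(tokenStream);
      }
  }

  // Stop-word removal drops tokens, so this sketch keeps the default no-op normalize.
  class StopSketchFactory implements SimpleTokenFilterFactory {
      private static final List<String> STOP_WORDS = List.of("the", "a");

      @Override
      public List<String> create(List<String> tokenStream) {
          return tokenStream.stream()
              .filter(t -> !STOP_WORDS.contains(t))
              .collect(Collectors.toList());
      }
  }
  ```

  In this model, normalizing a prefix term such as `Foo*` through `LowercaseSketchFactory` lowercases it, whereas `StopSketchFactory` leaves a term like `the` untouched during normalization even though `create` would remove it.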
- breaksFastVectorHighlighter
  
  default boolean breaksFastVectorHighlighter()
  
  Does this analyzer mess up the OffsetAttributes in such a way as to break the FastVectorHighlighter? If this is true, the FastVectorHighlighter will attempt to work around the broken offsets.
- getChainAwareTokenFilterFactory
  
  default TokenFilterFactory getChainAwareTokenFilterFactory(IndexService.IndexCreationContext context, TokenizerFactory tokenizer, List<CharFilterFactory> charFilters, List<TokenFilterFactory> previousTokenFilters, Function<String,TokenFilterFactory> allFilters)
  
  Rewrite the TokenFilterFactory to take into account the preceding analysis chain, or refer to other TokenFilterFactories. If the token filter is part of the definition of a ReloadableCustomAnalyzer, this function is called twice: once at index creation with IndexService.IndexCreationContext.CREATE_INDEX, and again on shard recovery with IndexService.IndexCreationContext.RELOAD_ANALYZERS. The IndexService.IndexCreationContext.RELOAD_ANALYZERS context should be used to load expensive resources on a generic thread pool. See SynonymGraphFilterFactory for an example of how this context is used.
  
  Parameters:
  
  context - the IndexCreationContext for the underlying index
  
  tokenizer - the TokenizerFactory for the preceding chain
  
  charFilters - any CharFilterFactories for the preceding chain
  
  previousTokenFilters - a list of TokenFilterFactories in the preceding chain
  
  allFilters - access to previously defined TokenFilterFactories
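  
  A minimal sketch of a chain-aware rewrite, under several assumptions: `CreationContextSketch` lists only the two context values named above, the `tokenizer` and `charFilters` parameters are omitted for brevity, the default is assumed to return `this`, and all type names are hypothetical stand-ins. The rewritten factory here merely records the context and the preceding filters, where a real implementation would use them to build its internal state.

  ```java
  import java.util.List;
  import java.util.function.Function;
  import java.util.stream.Collectors;

  // Hypothetical, reduced stand-in for IndexService.IndexCreationContext.
  enum CreationContextSketch { CREATE_INDEX, RELOAD_ANALYZERS }

  // Hypothetical stand-in for TokenFilterFactory.
  interface SimpleTokenFilterFactory {
      String name();

      // Simplified signature: the tokenizer and charFilters parameters are omitted.
      default SimpleTokenFilterFactory getChainAwareTokenFilterFactory(
              CreationContextSketch context,
              List<SimpleTokenFilterFactory> previousTokenFilters,
              Function<String, SimpleTokenFilterFactory> allFilters) {
          return this; // assumed default for this sketch: no rewrite needed
      }
  }

  // A factory whose rewritten form depends on the context and on the filters before it.
  class ChainDescribingFactory implements SimpleTokenFilterFactory {
      private final String description;

      ChainDescribingFactory(String description) {
          this.description = description;
      }

      @Override
      public String name() {
          return description;
      }

      @Override
      public SimpleTokenFilterFactory getChainAwareTokenFilterFactory(
              CreationContextSketch context,
              List<SimpleTokenFilterFactory> previousTokenFilters,
              Function<String, SimpleTokenFilterFactory> allFilters) {
          // Collect the names of the filters that run before this one.
          String chain = previousTokenFilters.stream()
              .map(SimpleTokenFilterFactory::name)
              .collect(Collectors.joining(","));
          // Return a new factory instead of mutating this one.
          return new ChainDescribingFactory("rules[" + context + "; after=" + chain + "]");
      }
  }
  ```

  Returning a fresh factory keeps the original definition reusable, which matters if the method is invoked a second time with a different context.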
- getSynonymFilter
  
  default TokenFilterFactory getSynonymFilter()
  
  Return a version of this TokenFilterFactory appropriate for synonym parsing. Filters that should not be applied to synonyms (for example, those that produce multiple tokens) should throw an exception.
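  
  The contract above might look like the following in practice. This is a sketch only: `SimpleTokenFilterFactory` and `MultiTokenSketchFactory` are hypothetical names, the default is assumed to return `this`, and the exception type and message are illustrative choices, since the text only says that an exception should be thrown.

  ```java
  // Hypothetical stand-in for TokenFilterFactory.
  interface SimpleTokenFilterFactory {
      String name();

      // Assumed default for this sketch: the filter is safe to use as-is for synonym parsing.
      default SimpleTokenFilterFactory getSynonymFilter() {
          return this;
      }
  }

  // A filter that emits several tokens per input token refuses to take part in synonym parsing.
  class MultiTokenSketchFactory implements SimpleTokenFilterFactory {
      @Override
      public String name() {
          return "multi_token_sketch";
      }

      @Override
      public SimpleTokenFilterFactory getSynonymFilter() {
          // Exception type and wording are assumptions made for this example.
          throw new IllegalArgumentException(
              "Token filter [" + name() + "] cannot be used to parse synonyms");
      }
  }
  ```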
- getAnalysisMode
  
  default AnalysisMode getAnalysisMode()
  
  Get the AnalysisMode this filter is allowed to be used in. The default is AnalysisMode.ALL. Instances need to override this method to define their own restrictions.
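  
  Overriding the mode could look roughly like this. Only `ALL` is named in the text above; the `INDEX_TIME` and `SEARCH_TIME` values, the `AnalysisModeSketch` enum, and the `checkAllowed` helper are assumptions introduced for this sketch and are not taken from the Elasticsearch API.

  ```java
  // Hypothetical stand-in for AnalysisMode. Only ALL appears in the documentation above;
  // INDEX_TIME and SEARCH_TIME are assumed values for this sketch.
  enum AnalysisModeSketch { INDEX_TIME, SEARCH_TIME, ALL }

  // Hypothetical stand-in for TokenFilterFactory.
  interface SimpleTokenFilterFactory {
      String name();

      // Mirrors the documented default: the filter may be used in every mode.
      default AnalysisModeSketch getAnalysisMode() {
          return AnalysisModeSketch.ALL;
      }
  }

  // A filter that restricts itself to search-time analysis.
  class SearchOnlySketchFactory implements SimpleTokenFilterFactory {
      @Override
      public String name() {
          return "search_only_sketch";
      }

      @Override
      public AnalysisModeSketch getAnalysisMode() {
          return AnalysisModeSketch.SEARCH_TIME;
      }
  }

  class AnalysisModeCheck {
      // Rejects a filter whose declared mode does not cover the requested use.
      static void checkAllowed(SimpleTokenFilterFactory filter, AnalysisModeSketch use) {
          AnalysisModeSketch mode = filter.getAnalysisMode();
          if (mode != AnalysisModeSketch.ALL && mode != use) {
              throw new IllegalArgumentException(
                  "Token filter [" + filter.name() + "] is not allowed in mode " + use);
          }
      }
  }
  ```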
- getResourceName
  
  default String getResourceName()
  
  Get the name of the resource that this filter is based on. Used to reload analyzers when this resource changes. For an example, see SynonymGraphTokenFilterFactory#getResourceName().
  
  Returns:
  
  the name of the resource that this filter was loaded from if any
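  
  One way to picture how a resource name could drive reloading is sketched below. Everything here is hypothetical: the `null` default for filters without a resource, the `ResourceBackedSketchFactory` class, and the `filtersToReload` helper are assumptions for illustration, not the actual reload mechanism.

  ```java
  import java.util.List;
  import java.util.stream.Collectors;

  // Hypothetical stand-in for TokenFilterFactory.
  interface SimpleTokenFilterFactory {
      String name();

      // Assumed default for this sketch: null means the filter is not backed by a resource.
      default String getResourceName() {
          return null;
      }
  }

  // A filter loaded from a named resource, such as a synonyms file.
  class ResourceBackedSketchFactory implements SimpleTokenFilterFactory {
      private final String name;
      private final String resourceName;

      ResourceBackedSketchFactory(String name, String resourceName) {
          this.name = name;
          this.resourceName = resourceName;
      }

      @Override
      public String name() {
          return name;
      }

      @Override
      public String getResourceName() {
          return resourceName;
      }
  }

  class ReloadSketch {
      // Returns the names of the filters that are based on the changed resource.
      static List<String> filtersToReload(List<SimpleTokenFilterFactory> filters, String changedResource) {
          return filters.stream()
              .filter(f -> changedResource.equals(f.getResourceName()))
              .map(SimpleTokenFilterFactory::name)
              .collect(Collectors.toList());
      }
  }
  ```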
