Module org.elasticsearch.server
Package org.elasticsearch.index.analysis
Interface TokenFilterFactory
- All Known Subinterfaces:
NormalizingTokenFilterFactory
- All Known Implementing Classes:
AbstractTokenFilterFactory, HunspellTokenFilterFactory, ShingleTokenFilterFactory, ShingleTokenFilterFactory.Factory, StopTokenFilterFactory
public interface TokenFilterFactory
-
Field Summary
Fields
Modifier and Type | Field | Description
static final TokenFilterFactory | IDENTITY_FILTER | A TokenFilterFactory that does no filtering to its TokenStream
-
Method Summary
Modifier and Type | Method | Description
default boolean | breaksFastVectorHighlighter() | Does this analyzer mess up the OffsetAttributes in such a way as to break the FastVectorHighlighter? If this is true, then the FastVectorHighlighter will attempt to work around the broken offsets.
org.apache.lucene.analysis.TokenStream | create(org.apache.lucene.analysis.TokenStream tokenStream) |
default AnalysisMode | getAnalysisMode() | Get the AnalysisMode this filter is allowed to be used in.
default TokenFilterFactory | getChainAwareTokenFilterFactory(IndexService.IndexCreationContext context, TokenizerFactory tokenizer, List<CharFilterFactory> charFilters, List<TokenFilterFactory> previousTokenFilters, Function<String, TokenFilterFactory> allFilters) | Rewrite the TokenFilterFactory to take into account the preceding analysis chain, or refer to other TokenFilterFactories. If the token filter is part of the definition of a ReloadableCustomAnalyzer, this function is called twice: once at index creation with IndexService.IndexCreationContext.CREATE_INDEX, and then later with IndexService.IndexCreationContext.RELOAD_ANALYZERS on shard recovery.
default String | getResourceName() | Get the name of the resource that this filter is based on.
default TokenFilterFactory | getSynonymFilter() | Return a version of this TokenFilterFactory appropriate for synonym parsing. Filters that should not be applied to synonyms (for example, those that produce multiple tokens) should throw an exception.
String | name() |
default org.apache.lucene.analysis.TokenStream | normalize(org.apache.lucene.analysis.TokenStream tokenStream) | Normalize a tokenStream for use in multi-term queries. The default implementation is a no-op.
-
Field Details
IDENTITY_FILTER
static final TokenFilterFactory IDENTITY_FILTER
A TokenFilterFactory that does no filtering to its TokenStream
-
Method Details
name
String name()
create
org.apache.lucene.analysis.TokenStream create(org.apache.lucene.analysis.TokenStream tokenStream)
normalize
default org.apache.lucene.analysis.TokenStream normalize(org.apache.lucene.analysis.TokenStream tokenStream)
Normalize a tokenStream for use in multi-term queries. The default implementation is a no-op.
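The relationship between create and the default normalize can be sketched as follows. This is a minimal model and not the Elasticsearch API: NormalizeSketchFactory, StopwordSketch and LowercaseSketch are hypothetical stand-ins, and a List<String> of terms replaces the Lucene TokenStream so that the example compiles without Lucene on the classpath. That a lowercasing filter reuses create for normalization is an assumption made for illustration.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical stand-in for TokenFilterFactory; a List<String> of terms
// replaces org.apache.lucene.analysis.TokenStream to keep the sketch self-contained.
interface NormalizeSketchFactory {
    List<String> create(List<String> tokens);

    // Mirrors the documented default: normalization is a no-op.
    default List<String> normalize(List<String> tokens) {
        return tokens;
    }
}

// Drops one stop word at analysis time and inherits the no-op normalize.
final class StopwordSketch implements NormalizeSketchFactory {
    @Override
    public List<String> create(List<String> tokens) {
        return tokens.stream().filter(t -> !t.equals("the")).collect(Collectors.toList());
    }
}

// Lowercases terms; assumed to apply the same transformation when normalizing.
final class LowercaseSketch implements NormalizeSketchFactory {
    @Override
    public List<String> create(List<String> tokens) {
        return tokens.stream().map(t -> t.toLowerCase(Locale.ROOT)).collect(Collectors.toList());
    }

    @Override
    public List<String> normalize(List<String> tokens) {
        return create(tokens);
    }
}

final class NormalizeSketch {
    public static void main(String[] args) {
        // Terms as they might appear in a multi-term (for example wildcard) query.
        List<String> queryTerms = List.of("The", "Fo*");
        System.out.println(new StopwordSketch().normalize(queryTerms));
        System.out.println(new LowercaseSketch().normalize(queryTerms));
    }
}
```

In this model the stop word sketch leaves query terms untouched during normalization, while the lowercase sketch rewrites them, which is the kind of per-term change a multi-term query can tolerate.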
breaksFastVectorHighlighter
default boolean breaksFastVectorHighlighter()
Does this analyzer mess up the OffsetAttributes in such a way as to break the FastVectorHighlighter? If this is true, then the FastVectorHighlighter will attempt to work around the broken offsets.
getChainAwareTokenFilterFactory
default TokenFilterFactory getChainAwareTokenFilterFactory(IndexService.IndexCreationContext context, TokenizerFactory tokenizer, List<CharFilterFactory> charFilters, List<TokenFilterFactory> previousTokenFilters, Function<String, TokenFilterFactory> allFilters)
Rewrite the TokenFilterFactory to take into account the preceding analysis chain, or refer to other TokenFilterFactories. If the token filter is part of the definition of a ReloadableCustomAnalyzer, this function is called twice: once at index creation with IndexService.IndexCreationContext.CREATE_INDEX, and then later with IndexService.IndexCreationContext.RELOAD_ANALYZERS on shard recovery. The IndexService.IndexCreationContext.RELOAD_ANALYZERS context should be used to load expensive resources on a generic thread pool. See SynonymGraphFilterFactory for an example of how this context is used.
- Parameters:
context - the IndexCreationContext for the underlying index
tokenizer - the TokenizerFactory for the preceding chain
charFilters - any CharFilterFactories for the preceding chain
previousTokenFilters - a list of TokenFilterFactories in the preceding chain
allFilters - access to previously defined TokenFilterFactories
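The two-phase call pattern described for getChainAwareTokenFilterFactory can be illustrated with a small model. Everything here is a hypothetical stand-in (SketchContext, ChainSketchFactory, SynonymSketchFactory, loadRules), the tokenizer and charFilters parameters are omitted for brevity, and no thread pool is involved. The sketch only shows a filter that records the preceding chain on every call and defers its expensive load to the RELOAD_ANALYZERS call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for IndexService.IndexCreationContext (only the two values named above).
enum SketchContext { CREATE_INDEX, RELOAD_ANALYZERS }

// Hypothetical stand-in for TokenFilterFactory, reduced to what the sketch needs.
interface ChainSketchFactory {
    String name();

    // Assumption: a filter with no chain dependencies simply returns itself.
    default ChainSketchFactory getChainAware(SketchContext context,
                                             List<ChainSketchFactory> previousTokenFilters,
                                             Function<String, ChainSketchFactory> allFilters) {
        return this;
    }
}

// A synonym-like filter that depends on the preceding chain and on an expensive resource.
final class SynonymSketchFactory implements ChainSketchFactory {
    final List<String> precedingChain; // names of the filters that run before this one
    final List<String> rules;          // stays empty until the resource is loaded

    SynonymSketchFactory(List<String> precedingChain, List<String> rules) {
        this.precedingChain = precedingChain;
        this.rules = rules;
    }

    @Override
    public String name() {
        return "synonym_sketch";
    }

    @Override
    public ChainSketchFactory getChainAware(SketchContext context,
                                            List<ChainSketchFactory> previousTokenFilters,
                                            Function<String, ChainSketchFactory> allFilters) {
        List<String> chain = new ArrayList<>();
        for (ChainSketchFactory f : previousTokenFilters) {
            chain.add(f.name());
        }
        // Only pay for the expensive load in the RELOAD_ANALYZERS call.
        List<String> loaded = context == SketchContext.RELOAD_ANALYZERS ? loadRules() : List.of();
        return new SynonymSketchFactory(chain, loaded);
    }

    // Placeholder for reading a large synonym resource.
    static List<String> loadRules() {
        return List.of("fast => quick");
    }
}

final class ChainSketch {
    // Rewrites the synonym sketch as the second filter of a [lowercase, synonym] chain.
    static SynonymSketchFactory rewrite(SketchContext context) {
        ChainSketchFactory lowercase = () -> "lowercase_sketch";
        ChainSketchFactory synonym = new SynonymSketchFactory(List.of(), List.of());
        return (SynonymSketchFactory) synonym.getChainAware(context, List.of(lowercase), n -> null);
    }

    public static void main(String[] args) {
        for (SketchContext context : SketchContext.values()) {
            SynonymSketchFactory f = rewrite(context);
            System.out.println(context + ": chain=" + f.precedingChain + " rules=" + f.rules);
        }
    }
}
```

Returning a new factory instead of mutating the existing one keeps the original definition reusable for the second call, which is one plausible reason for the method to return a TokenFilterFactory.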
getSynonymFilter
default TokenFilterFactory getSynonymFilter()
Return a version of this TokenFilterFactory appropriate for synonym parsing. Filters that should not be applied to synonyms (for example, those that produce multiple tokens) should throw an exception.
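The contract above, return a usable filter or throw, might look like this in a simplified model. SynonymAwareSketch, ShingleLikeSketch and SynonymParsingSketch are hypothetical names. The default of returning the filter itself and the choice of IllegalArgumentException are assumptions, since the description only says that an exception should be thrown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for TokenFilterFactory, reduced to the synonym-related part.
interface SynonymAwareSketch {
    String name();

    // Assumption: filters that are safe for synonym parsing return themselves.
    default SynonymAwareSketch getSynonymFilter() {
        return this;
    }
}

// A filter that emits several tokens per input position and therefore opts out.
final class ShingleLikeSketch implements SynonymAwareSketch {
    @Override
    public String name() {
        return "shingle_sketch";
    }

    @Override
    public SynonymAwareSketch getSynonymFilter() {
        throw new IllegalArgumentException("Token filter [" + name() + "] cannot be used to parse synonyms");
    }
}

final class SynonymParsingSketch {
    // Collects the synonym-parsing versions of the given filters;
    // an unsuitable filter aborts the whole chain with its exception.
    static List<String> synonymChain(List<SynonymAwareSketch> filters) {
        List<String> names = new ArrayList<>();
        for (SynonymAwareSketch f : filters) {
            names.add(f.getSynonymFilter().name());
        }
        return names;
    }

    // Reports whether a filter refuses to take part in synonym parsing.
    static boolean rejects(SynonymAwareSketch filter) {
        try {
            filter.getSynonymFilter();
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        SynonymAwareSketch lowercase = () -> "lowercase_sketch";
        System.out.println(synonymChain(List.of(lowercase)));
        System.out.println(rejects(new ShingleLikeSketch()));
    }
}
```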
getAnalysisMode
default AnalysisMode getAnalysisMode()
Get the AnalysisMode this filter is allowed to be used in. The default is AnalysisMode.ALL. Instances need to override this method to define their own restrictions.
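As a rough illustration of overriding the default mode, the following sketch uses a stand-in enum and interface, not the real AnalysisMode or TokenFilterFactory. Only ALL is named in the description above. The INDEX_TIME and SEARCH_TIME values, the search-only filter, and the index-time check are assumptions made for the example.

```java
// Hypothetical stand-in for AnalysisMode; only ALL is confirmed by the description above.
enum SketchAnalysisMode { ALL, INDEX_TIME, SEARCH_TIME }

// Hypothetical stand-in for TokenFilterFactory, reduced to the mode-related part.
interface ModeSketchFactory {
    String name();

    // Mirrors the documented default: no restriction.
    default SketchAnalysisMode getAnalysisMode() {
        return SketchAnalysisMode.ALL;
    }
}

// A filter that restricts itself to search time, for example because its rules can change after indexing.
final class SearchOnlySketch implements ModeSketchFactory {
    @Override
    public String name() {
        return "search_only_sketch";
    }

    @Override
    public SketchAnalysisMode getAnalysisMode() {
        return SketchAnalysisMode.SEARCH_TIME;
    }
}

final class AnalysisModeSketch {
    // A filter may run at index time unless it restricts itself to search time.
    static boolean allowedAtIndexTime(ModeSketchFactory filter) {
        return filter.getAnalysisMode() != SketchAnalysisMode.SEARCH_TIME;
    }

    public static void main(String[] args) {
        ModeSketchFactory unrestricted = () -> "lowercase_sketch";
        System.out.println(unrestricted.name() + " at index time: " + allowedAtIndexTime(unrestricted));
        ModeSketchFactory searchOnly = new SearchOnlySketch();
        System.out.println(searchOnly.name() + " at index time: " + allowedAtIndexTime(searchOnly));
    }
}
```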
getResourceName
default String getResourceName()
Get the name of the resource that this filter is based on. Used to reload analyzers when this resource changes. For an example, see SynonymGraphTokenFilterFactory#getResourceName().
- Returns:
the name of the resource that this filter was loaded from, if any
-