Class CategorizeTextAggregation.Builder
java.lang.Object
co.elastic.clients.util.ObjectBuilderBase
co.elastic.clients.util.WithJsonObjectBuilderBase<CategorizeTextAggregation.Builder>
co.elastic.clients.elasticsearch._types.aggregations.AggregationBase.AbstractBuilder<CategorizeTextAggregation.Builder>
co.elastic.clients.elasticsearch._types.aggregations.CategorizeTextAggregation.Builder
- All Implemented Interfaces:
WithJson<CategorizeTextAggregation.Builder>, ObjectBuilder<CategorizeTextAggregation>
- Enclosing class:
- CategorizeTextAggregation
public static class CategorizeTextAggregation.Builder
extends AggregationBase.AbstractBuilder<CategorizeTextAggregation.Builder>
implements ObjectBuilder<CategorizeTextAggregation>
Builder for CategorizeTextAggregation.
-
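A minimal usage sketch of this builder. The field name `message` and all parameter values are illustrative, not taken from this page; the `Aggregation.of(...)` wrapping follows the client's usual variant-container pattern:

```java
import co.elastic.clients.elasticsearch._types.aggregations.Aggregation;
import co.elastic.clients.elasticsearch._types.aggregations.CategorizeTextAggregation;

public class CategorizeTextExample {
    public static void main(String[] args) {
        // Build the aggregation directly; only field() is required.
        CategorizeTextAggregation agg = new CategorizeTextAggregation.Builder()
                .field("message")            // hypothetical log-message field
                .similarityThreshold(70)     // 70% of tokens must match to join a bucket
                .minDocCount(2)              // drop single-document categories
                .build();

        // In practice the builder is usually reached through the Aggregation container:
        Aggregation wrapped = Aggregation.of(a -> a
                .categorizeText(c -> c.field("message")));

        System.out.println(agg.field());
        System.out.println(wrapped.isCategorizeText());
    }
}
```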
Constructor Summary
-
Method Summary
build()
    Builds a CategorizeTextAggregation.
categorizationAnalyzer(CategorizeTextAnalyzer value)
    The categorization analyzer specifies how the text is analyzed and tokenized before being categorized.
categorizationAnalyzer(Function<CategorizeTextAnalyzer.Builder, ObjectBuilder<CategorizeTextAnalyzer>> fn)
    The categorization analyzer specifies how the text is analyzed and tokenized before being categorized.
categorizationFilters(String value, String... values)
    This property expects an array of regular expressions.
categorizationFilters(List<String> list)
    This property expects an array of regular expressions.
field(String value)
    Required - The semi-structured text field to categorize.
maxMatchedTokens(Integer value)
    The maximum number of token positions to match on before attempting to merge categories.
maxUniqueTokens(Integer value)
    The maximum number of unique tokens at any position up to max_matched_tokens.
minDocCount(Integer value)
    The minimum number of documents for a bucket to be returned to the results.
protected CategorizeTextAggregation.Builder self()
shardMinDocCount(Integer value)
    The minimum number of documents for a bucket to be returned from the shard before merging.
shardSize(Integer value)
    The number of categorization buckets to return from each shard before merging all the results.
similarityThreshold(Integer value)
    The minimum percentage of tokens that must match for text to be added to the category bucket.
size(Integer value)
    The number of buckets to return.
Methods inherited from class co.elastic.clients.elasticsearch._types.aggregations.AggregationBase.AbstractBuilder
meta, meta, name
Methods inherited from class co.elastic.clients.util.WithJsonObjectBuilderBase
withJson
Methods inherited from class co.elastic.clients.util.ObjectBuilderBase
_checkSingleUse, _listAdd, _listAddAll, _mapPut, _mapPutAll
-
Constructor Details
-
Builder
public Builder()
-
-
Method Details
-
field
Required - The semi-structured text field to categorize.
API name: field
-
maxUniqueTokens
The maximum number of unique tokens at any position up to max_matched_tokens. Must be larger than 1. Smaller values use less memory and create fewer categories. Larger values will use more memory and create narrower categories. Max allowed value is 100.
API name: max_unique_tokens
-
maxMatchedTokens
The maximum number of token positions to match on before attempting to merge categories. Larger values will use more memory and create narrower categories. Max allowed value is 100.
API name: max_matched_tokens
-
similarityThreshold
The minimum percentage of tokens that must match for text to be added to the category bucket. Must be between 1 and 100. Larger values increase memory usage and create narrower categories.
API name: similarity_threshold
-
categorizationFilters
This property expects an array of regular expressions. The expressions are used to filter out matching sequences from the categorization field values. You can use this functionality to fine-tune the categorization by excluding sequences from consideration when categories are defined. For example, you can exclude SQL statements that appear in your log files. This property cannot be used at the same time as categorization_analyzer. If you only want to define simple regular expression filters that are applied prior to tokenization, setting this property is the easiest method. If you also want to customize the tokenizer or post-tokenization filtering, use the categorization_analyzer property instead and include the filters as pattern_replace character filters.
API name: categorization_filters
Adds all elements of list to categorizationFilters.
-
categorizationFilters
public final CategorizeTextAggregation.Builder categorizationFilters(String value, String... values)
This property expects an array of regular expressions. The expressions are used to filter out matching sequences from the categorization field values. You can use this functionality to fine-tune the categorization by excluding sequences from consideration when categories are defined. For example, you can exclude SQL statements that appear in your log files. This property cannot be used at the same time as categorization_analyzer. If you only want to define simple regular expression filters that are applied prior to tokenization, setting this property is the easiest method. If you also want to customize the tokenizer or post-tokenization filtering, use the categorization_analyzer property instead and include the filters as pattern_replace character filters.
API name: categorization_filters
Adds one or more values to categorizationFilters.
-
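The two overloads are equivalent ways to supply the same filter list. In the sketch below, the `message` field and the two patterns (an ISO date and a bracketed thread name) are illustrative assumptions:

```java
import co.elastic.clients.elasticsearch._types.aggregations.CategorizeTextAggregation;
import java.util.List;

public class FiltersExample {
    public static void main(String[] args) {
        // Varargs overload: strip dates and bracketed prefixes before tokenization.
        CategorizeTextAggregation agg = new CategorizeTextAggregation.Builder()
                .field("message")
                .categorizationFilters("\\d{4}-\\d{2}-\\d{2}", "\\[.*?\\]")
                .build();

        // List overload producing the same filters:
        CategorizeTextAggregation sameAgg = new CategorizeTextAggregation.Builder()
                .field("message")
                .categorizationFilters(List.of("\\d{4}-\\d{2}-\\d{2}", "\\[.*?\\]"))
                .build();

        System.out.println(agg.categorizationFilters());
        System.out.println(sameAgg.categorizationFilters());
    }
}
```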
categorizationAnalyzer
public final CategorizeTextAggregation.Builder categorizationAnalyzer(@Nullable CategorizeTextAnalyzer value)
The categorization analyzer specifies how the text is analyzed and tokenized before being categorized. The syntax is very similar to that used to define the analyzer in the Analyze endpoint. This property cannot be used at the same time as categorization_filters.
API name: categorization_analyzer
-
categorizationAnalyzer
public final CategorizeTextAggregation.Builder categorizationAnalyzer(Function<CategorizeTextAnalyzer.Builder, ObjectBuilder<CategorizeTextAnalyzer>> fn)
The categorization analyzer specifies how the text is analyzed and tokenized before being categorized. The syntax is very similar to that used to define the analyzer in the Analyze endpoint. This property cannot be used at the same time as categorization_filters.
API name: categorization_analyzer
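A sketch of the Function overload. It assumes the `CategorizeTextAnalyzer` union exposes a `custom` variant mirroring the REST API's custom analyzer object, and the tokenizer and filter names ("ml_standard", "lowercase") are illustrative, not mandated by this page:

```java
import co.elastic.clients.elasticsearch._types.aggregations.CategorizeTextAggregation;

public class AnalyzerExample {
    public static void main(String[] args) {
        CategorizeTextAggregation agg = new CategorizeTextAggregation.Builder()
                .field("message")                       // hypothetical field
                .categorizationAnalyzer(a -> a
                        .custom(c -> c
                                .tokenizer("ml_standard")   // assumed ML-oriented tokenizer
                                .filter("lowercase")))      // post-tokenization token filter
                .build();

        // Note: categorizationFilters(...) must NOT be set on the same builder.
        System.out.println(agg.categorizationAnalyzer() != null);
    }
}
```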
-
shardSize
The number of categorization buckets to return from each shard before merging all the results.
API name: shard_size
-
size
The number of buckets to return.
API name: size
-
minDocCount
The minimum number of documents for a bucket to be returned to the results.
API name: min_doc_count
-
shardMinDocCount
The minimum number of documents for a bucket to be returned from the shard before merging.
API name: shard_min_doc_count
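The four bucket-control parameters (size, shard_size, min_doc_count, shard_min_doc_count) are typically tuned together in one chain. All values below are illustrative:

```java
import co.elastic.clients.elasticsearch._types.aggregations.CategorizeTextAggregation;

public class BucketTuningExample {
    public static void main(String[] args) {
        CategorizeTextAggregation agg = new CategorizeTextAggregation.Builder()
                .field("message")         // hypothetical field
                .size(10)                 // return at most 10 category buckets overall
                .shardSize(100)           // keep 100 candidate buckets per shard before merging
                .minDocCount(5)           // final buckets need at least 5 documents
                .shardMinDocCount(2)      // per-shard buckets need at least 2 documents
                .build();

        System.out.println(agg.size() + "/" + agg.shardSize());
    }
}
```

Keeping shard_size larger than size (and shard_min_doc_count lower than min_doc_count) reduces the chance that a category is pruned on a shard before it could accumulate enough matches globally.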
-
self
- Specified by:
self
self in class AggregationBase.AbstractBuilder<CategorizeTextAggregation.Builder>
-
build
Builds a CategorizeTextAggregation.
- Specified by:
build in interface ObjectBuilder<CategorizeTextAggregation>
- Throws:
NullPointerException
- if some of the required fields are null.
-