org.elasticsearch.search.aggregations (server 7.17.3 API)

package org.elasticsearch.search.aggregations

Aggregations

Builds analytic information over all hits in a search request. Aggregations are essentially a tool for summarizing data, and that summary is often used to generate a visualization.

Types of aggregations

There are three main types of aggregations, each in its own sub-package:

Bucket aggregations - which group documents (e.g. a histogram)
Metric aggregations - which compute a summary value from several documents (e.g. a sum)
Pipeline aggregations - which run as a separate step and compute values across buckets
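
The three types can be illustrated without any Elasticsearch classes. The following is a minimal plain-Java sketch, not the package's implementation: the class AggregationTypesSketch and the methods bucketByInterval, sumPerBucket and cumulativeSum are hypothetical names chosen for illustration, and the fixed-width histogram and running total are just one possible example of each type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: none of these names exist in Elasticsearch.
final class AggregationTypesSketch {

    // Bucket aggregation: group values into fixed-width histogram buckets,
    // keyed by the lower bound of each bucket.
    static TreeMap<Long, List<Long>> bucketByInterval(List<Long> values, long interval) {
        TreeMap<Long, List<Long>> buckets = new TreeMap<>();
        for (long v : values) {
            long key = Math.floorDiv(v, interval) * interval;
            buckets.computeIfAbsent(key, k -> new ArrayList<>()).add(v);
        }
        return buckets;
    }

    // Metric aggregation: compute one summary value (a sum) per bucket.
    static TreeMap<Long, Long> sumPerBucket(TreeMap<Long, List<Long>> buckets) {
        TreeMap<Long, Long> sums = new TreeMap<>();
        for (Map.Entry<Long, List<Long>> e : buckets.entrySet()) {
            long sum = 0;
            for (long v : e.getValue()) {
                sum += v;
            }
            sums.put(e.getKey(), sum);
        }
        return sums;
    }

    // Pipeline aggregation: a separate step that works on bucket outputs,
    // here a running total across buckets in key order.
    static TreeMap<Long, Long> cumulativeSum(TreeMap<Long, Long> sums) {
        TreeMap<Long, Long> out = new TreeMap<>();
        long running = 0;
        for (Map.Entry<Long, Long> e : sums.entrySet()) {
            running += e.getValue();
            out.put(e.getKey(), running);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Long> values = List.of(1L, 4L, 12L, 15L, 27L);
        TreeMap<Long, List<Long>> buckets = bucketByInterval(values, 10);
        TreeMap<Long, Long> sums = sumPerBucket(buckets);
        System.out.println(buckets);
        System.out.println(sums);
        System.out.println(cumulativeSum(sums));
    }
}
```

With the sample values and an interval of 10, the documents fall into the buckets 0, 10 and 20, the per-bucket sums are 5, 27 and 27, and the running totals are 5, 32 and 59.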

Additionally, there is a support sub-package, which primarily contains the type checking and resolution logic.

How Aggregations Work

TODO: Info about search phases goes here

In general, aggregations operate as MapReduce jobs. The coordinating node for the query dispatches the aggregation to each data node. The data nodes all instantiate an AggregationBuilder of the appropriate type, which in turn builds the Aggregator for that node. The Aggregator collects the data from that shard, roughly via BucketCollector.getLeafCollector(org.apache.lucene.index.LeafReaderContext). These values are shipped back to the coordinating node, which performs the reduction on them (partial reductions in place on the data nodes are also possible).
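
The collect-then-reduce flow described above can be sketched in plain Java. This is a simplified illustration under stated assumptions, not how Elasticsearch implements it: MapReduceSketch, collectShard and reduce are hypothetical names, each shard is modeled as a list of terms, and the aggregation is a per-term document count.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: these names are illustrative and not Elasticsearch classes.
final class MapReduceSketch {

    // "Map" step: runs once per shard and produces a partial result
    // (document count per term) from that shard's documents only.
    static Map<String, Long> collectShard(List<String> shardTerms) {
        Map<String, Long> partial = new HashMap<>();
        for (String term : shardTerms) {
            partial.merge(term, 1L, Long::sum);
        }
        return partial;
    }

    // "Reduce" step: runs on the coordinating node and merges the
    // partial results that the shards shipped back.
    static Map<String, Long> reduce(List<Map<String, Long>> partials) {
        Map<String, Long> merged = new HashMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((term, count) -> merged.merge(term, count, Long::sum));
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Long> shard0 = collectShard(List.of("a", "b", "a"));
        Map<String, Long> shard1 = collectShard(List.of("b", "c"));
        System.out.println(reduce(List.of(shard0, shard1)));
    }
}
```

Because reduce only needs the partial maps, the same method could also be applied to a subset of partial results first, which loosely mirrors the partial reductions mentioned above.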

Three modes of operation

There are, in general, three ways aggregations collect values. Which one we choose depends on limitations in the query and how the data was ingested (e.g. if it is searchable).

The easiest to understand is the Compatible (i.e. usable in all situations) mode, which can be thought of as iterating over each query hit and collecting a value from it. This is the least performant way to evaluate aggregations because it requires looking at every hit.

The fastest way to run an aggregation is by looking at the index structures directly. For example, Lucene just stores the minimum and maximum values of fields per segment, so a min aggregation matching all documents in a segment can just look up its result. Generally speaking, this mode can be engaged when there are no queries or sub-aggregations, and is gated by ValuesSourceConfig.getPointReaderOrNull().
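
The difference between collecting every hit and reading a precomputed value can be sketched as follows. This is a toy model, assuming a hypothetical Segment class that keeps a stored minimum next to its values; it does not use Lucene or ValuesSourceConfig, and MinModesSketch and its methods are illustrative names only.

```java
import java.util.List;
import java.util.function.LongPredicate;

// Toy model only: Segment is a stand-in, not a Lucene or Elasticsearch class.
final class MinModesSketch {

    // A segment's values plus the per-segment minimum that the index
    // is assumed to keep alongside them.
    static final class Segment {
        final long[] values;
        final long storedMin;

        Segment(long... values) {
            this.values = values;
            long min = Long.MAX_VALUE;
            for (long v : values) {
                min = Math.min(min, v);
            }
            this.storedMin = min;
        }
    }

    // Compatible mode: visit every value, keep those matching the query.
    // Returns Long.MAX_VALUE when nothing matches.
    static long minByCollecting(List<Segment> segments, LongPredicate query) {
        long min = Long.MAX_VALUE;
        for (Segment s : segments) {
            for (long v : s.values) {
                if (query.test(v)) {
                    min = Math.min(min, v);
                }
            }
        }
        return min;
    }

    // Index-structure mode: only valid when every document in each segment
    // matches, so the stored minimum can be used without visiting any hit.
    static long minFromMetadata(List<Segment> segments) {
        long min = Long.MAX_VALUE;
        for (Segment s : segments) {
            min = Math.min(min, s.storedMin);
        }
        return min;
    }

    // Use the fast path when it applies, otherwise fall back to collecting.
    static long min(List<Segment> segments, LongPredicate query, boolean matchesAll) {
        return matchesAll ? minFromMetadata(segments) : minByCollecting(segments, query);
    }

    public static void main(String[] args) {
        List<Segment> segments = List.of(new Segment(5L, 3L, 9L), new Segment(7L, 2L));
        System.out.println(min(segments, v -> true, true));
        System.out.println(min(segments, v -> v > 4, false));
    }
}
```

The fast path does work proportional to the number of segments, while the compatible path does work proportional to the number of values, which is why the fast path is preferred whenever the match-all condition holds.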

Finally, we can rewrite an aggregation into faster aggregations, or ideally into just a query. Generally, the goal here is to get to filter by filters (which is an optimization on the filters aggregation which runs it as a set of filter queries). Often this process will look like rewriting a DateHistogram into a DateRange, and then rewriting the DateRange into Filters. If you see AdaptingAggregator, that's a good clue that the rewrite mode is being used. In general, when we rewrite aggregations, we are able to detect if the rewritten agg can run in a "fast" mode, and decline the rewrite if it can't.
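
The rewrite chain described above (histogram to ranges to filters) can be sketched in plain Java. This is an illustration under simplifying assumptions, not the AdaptingAggregator logic: RewriteSketch and its methods are hypothetical names, buckets use a fixed numeric interval rather than calendar-aware date rounding, and each filter is modeled as a predicate that is evaluated by scanning the values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.LongPredicate;

// Toy model only: none of these names exist in Elasticsearch.
final class RewriteSketch {

    // Step 1: rewrite a fixed-interval histogram over [min, max] into
    // explicit half-open ranges [from, to).
    static List<long[]> histogramToRanges(long min, long max, long interval) {
        List<long[]> ranges = new ArrayList<>();
        long first = Math.floorDiv(min, interval) * interval;
        for (long from = first; from <= max; from += interval) {
            ranges.add(new long[] {from, from + interval});
        }
        return ranges;
    }

    // Step 2: rewrite each range into a filter (a predicate standing in
    // for a range query).
    static List<LongPredicate> rangesToFilters(List<long[]> ranges) {
        List<LongPredicate> filters = new ArrayList<>();
        for (long[] r : ranges) {
            filters.add(v -> v >= r[0] && v < r[1]);
        }
        return filters;
    }

    // Step 3: run each filter as a count. This sketch simply scans the
    // values; it does not model how a search engine would answer the count.
    static long[] countPerFilter(List<LongPredicate> filters, long[] timestamps) {
        long[] counts = new long[filters.size()];
        for (int i = 0; i < counts.length; i++) {
            for (long t : timestamps) {
                if (filters.get(i).test(t)) {
                    counts[i]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] timestamps = {0L, 5L, 10L, 25L};
        List<long[]> ranges = histogramToRanges(0L, 25L, 10L);
        long[] counts = countPerFilter(rangesToFilters(ranges), timestamps);
        System.out.println(Arrays.toString(counts));
    }
}
```

For the sample timestamps and an interval of 10, the histogram becomes the ranges [0, 10), [10, 20) and [20, 30), and the per-filter counts are 2, 1 and 1, which are the same bucket counts the original histogram would produce.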

In general, aggs will try to use one of the fast modes, and if that's not possible, fall back to running in compatible mode.

Related Packages

Package - Description

org.elasticsearch.search
org.elasticsearch.search.aggregations.bucket
org.elasticsearch.search.aggregations.metrics - Aggregations module
org.elasticsearch.search.aggregations.pipeline
org.elasticsearch.search.aggregations.support

Class

Description

AbstractAggregationBuilder<AB extends AbstractAggregationBuilder<AB>>

Base implementation of an AggregationBuilder.

AdaptingAggregator

An Aggregator that delegates collection to another Aggregator and then translates its results into the results you'd expect from another aggregation.

Aggregation

An aggregation.

Aggregation.CommonFields

Common xcontent fields that are shared among aggregations

AggregationBuilder

A factory that knows how to create an Aggregator of a specific type.

AggregationBuilder.BucketCardinality

A rough count of the number of buckets that Aggregators built by this builder will contain per parent bucket used to validate sorts and pipeline aggregations.

AggregationBuilder.CommonFields

Common xcontent fields shared among aggregator builders

AggregationBuilders

Utility class to create aggregations.

AggregationExecutionException

Thrown when failing to execute an aggregation

AggregationInitializationException

Thrown when failing to initialize an aggregation

AggregationPhase

Aggregation phase of a search request, used to collect aggregations

Aggregations

Represents a set of Aggregations

Aggregator

An Aggregator.

Aggregator.BucketComparator

Compare two buckets by their ordinal.

Aggregator.Parser

Parses the aggregation request and creates the appropriate aggregator factory for it.

Aggregator.SubAggCollectionMode

Aggregation mode for sub aggregations.

AggregatorBase

Base implementation for concrete aggregators.

AggregatorFactories

An immutable collection of AggregatorFactories.

AggregatorFactories.Builder

A mutable collection of AggregationBuilders and PipelineAggregationBuilders.

AggregatorFactory

BaseAggregationBuilder

Interface shared by AggregationBuilder and PipelineAggregationBuilder so they can conveniently share the same namespace for XContentParser.namedObject(Class, String, Object).

BucketCollector

A Collector that can collect data in separate buckets.

BucketOrder

MultiBucketsAggregation.Bucket ordering strategy.

CardinalityUpperBound

Upper bound of how many owningBucketOrds that an Aggregator will have to collect into.

DelayedBucket<B extends InternalMultiBucketAggregation.InternalBucket>

A wrapper around reducing buckets with the same key that can delay that reduction as long as possible.

HasAggregations

InternalAggregation

An internal implementation of Aggregation.

InternalAggregation.ReduceContext

InternalAggregation.ReduceContextBuilder

Builds InternalAggregation.ReduceContext.

InternalAggregations

An internal implementation of Aggregations.

InternalMultiBucketAggregation<A extends InternalMultiBucketAggregation,B extends InternalMultiBucketAggregation.InternalBucket>

InternalMultiBucketAggregation.InternalBucket

InternalOrder

Implementations for MultiBucketsAggregation.Bucket ordering strategies.

InternalOrder.Aggregation

MultiBucketsAggregation.Bucket ordering strategy to sort by a sub-aggregation.

InternalOrder.CompoundOrder

MultiBucketsAggregation.Bucket ordering strategy to sort by multiple criteria.

InternalOrder.Parser

Contains logic for parsing a BucketOrder from an XContentParser.

InternalOrder.Streams

Contains logic for reading/writing BucketOrder from/to streams.

InvalidAggregationPathException

KeyComparable<T extends MultiBucketsAggregation.Bucket & KeyComparable<T>>

Defines behavior for comparing bucket keys to impose a total ordering of buckets of the same type.

LeafBucketCollector

Collects results for a particular segment.

LeafBucketCollectorBase

A LeafBucketCollector that delegates all calls to the sub leaf aggregator and sets the scorer on its source of values if it implements ScorerAware.

MultiBucketCollector

A BucketCollector which allows running a bucket collection with several BucketCollectors.

MultiBucketConsumerService

An aggregation service that creates instances of MultiBucketConsumerService.MultiBucketConsumer.

MultiBucketConsumerService.MultiBucketConsumer

An IntConsumer that throws a MultiBucketConsumerService.TooManyBucketsException when the sum of the provided values is above the limit (`search.max_buckets`).

MultiBucketConsumerService.TooManyBucketsException

NonCollectingAggregator

An aggregator that is not collected. This can typically be used when running an aggregation over a field that doesn't have a mapping.

ParsedAggregation

An implementation of Aggregation that is parsed from a REST response.

ParsedMultiBucketAggregation<B extends MultiBucketsAggregation.Bucket>

ParsedMultiBucketAggregation.ParsedBucket

PipelineAggregationBuilder

A factory that knows how to create a PipelineAggregator of a specific type.

PipelineAggregationBuilder.ValidationContext

PipelineAggregatorBuilders

SearchContextAggregations

The aggregation context that is part of the search context.

TopBucketBuilder<B extends InternalMultiBucketAggregation.InternalBucket>

Merges many buckets into the "top" buckets as sorted by BucketOrder.
