STATS ... BY

Syntax
STATS [column1 =] expression1[, ..., [columnN =] expressionN] [BY grouping_column1[, ..., grouping_columnN]]
Parameters
columnX
The name by which the aggregated value is returned. If omitted, the name is
equal to the corresponding expression (expressionX).
expressionX
An expression that computes an aggregated value.
grouping_columnX
The column containing the values to group by.
DescriptionThe STATS ... BY processing command groups rows according to a common value
and calculate one or more aggregated values over the grouped rows. If BY is
omitted, the output table contains exactly one row with the aggregations applied
over the entire dataset.The following aggregation functions are supported:
AVG
COUNT
COUNT_DISTINCT
MAX
MEDIAN
MEDIAN_ABSOLUTE_DEVIATION
MIN
PERCENTILE
SUM
STATS without any groups is much much faster than adding a group.
Grouping on a single column is currently much more optimized than grouping
      on many columns. In some tests we have seen grouping on a single keyword
      column to be five times faster than grouping on two keyword columns. Do
      not try to work around this by combining the two columns together with
      something like CONCAT and then grouping - that is not going to be
      faster.
ExamplesCalculating a statistic and grouping by the values of another column:
```esql
FROM employees
| STATS count = COUNT(emp_no) BY languages
| SORT languages
```

Omitting BY returns one row with the aggregations applied over the entire
dataset:
```esql
FROM employees
| STATS avg_lang = AVG(languages)
```

It’s possible to calculate multiple values:
```esql
FROM employees
| STATS avg_lang = AVG(languages), max_lang = MAX(languages)
```

It’s also possible to group by multiple values (only supported for long and
keyword family fields):
```esql
FROM employees
| EVAL hired = DATE_FORMAT("YYYY", hire_date)
| STATS avg_salary = AVG(salary) BY hired, languages.long
| EVAL avg_salary = ROUND(avg_salary)
| SORT hired, languages.long
```
