# Summarize Operator

## Overview

Aggregate findings with `summarize`.

```kusto
| summarize [[<dest>=]aggregation[, ...]] by [<dest>=]<expression>[, ...] 
```

Many aggregations are supported, including:

* [`agg.avg()`](https://docs.panther.com/functions/aggregation#agg.avg)
* [`agg.count()`](https://docs.panther.com/functions/aggregation#agg.count)
* [`agg.count_distinct()`](https://docs.panther.com/functions/aggregation#agg.count_distinct)
* [`agg.make_set()`](https://docs.panther.com/functions/aggregation#agg.make_set)
* [`agg.max()`](https://docs.panther.com/functions/aggregation#agg.max)
* [`agg.min()`](https://docs.panther.com/functions/aggregation#agg.min)
* [`agg.percentile_cont()`](https://docs.panther.com/functions/aggregation#agg.percentile_cont)
* [`agg.stddev()`](https://docs.panther.com/functions/aggregation#agg.stddev)
* [`agg.sum()`](https://docs.panther.com/functions/aggregation#agg.sum)
* [`agg.take_any()`](https://docs.panther.com/functions/aggregation#agg.take_any)

See a full list of available aggregations on [Aggregation Functions](https://docs.panther.com/pantherflow/functions/aggregation).

## Examples

### Count

The following query displays the count of all events in the last day stored in field `num_connections`:

```kusto
panther_logs.public.aws_alb
| where p_event_time > time.ago(1d)
| summarize num_connections=agg.count()
```

| num\_connections |
| ---------------- |
| 993              |

### Group count by single field

The following query counts the number of connections for each `clientIp` and orders results with the largest number of connections first:

```kusto
panther_logs.public.aws_alb
| where p_event_time > time.ago(1d)
| summarize num_connections=agg.count() by clientIp
| sort num_connections
```

| clientIp     | num\_connections |
| ------------ | ---------------- |
| 192.167.7.55 | 979              |
| 10.145.4.26  | 130              |
| 10.99.231.15 | 31               |
| ...          |                  |

### Group count by multiple fields

The following query displays `num_connections` grouped by both the `clientIp` and `clientPort` fields:

```kusto
panther_logs.public.aws_alb
| where p_event_time > time.ago(1d)
| summarize num_connections=agg.count() by clientIp, clientPort
| sort num_connections
```

| clientIp     | clientPort | num\_connections |
| ------------ | ---------- | ---------------- |
| 192.167.7.55 | 50160      | 7                |
| 10.145.4.26  | 63335      | 5                |
| 10.145.4.26  | 60845      | 4                |
| 192.167.7.55 | 52138      | 4                |
| 10.99.231.15 | 58704      | 3                |
| …            |            |                  |

### Group count by arbitrary expressions

```kusto
panther_logs.public.aws_alb
| where p_event_time > time.ago(1d)
| summarize num_connections=agg.count() by isOK = elbStatusCode == 200
| sort num_connections
```

| isOK  | num\_connections |
| ----- | ---------------- |
| true  | 1114             |
| false | 324              |

### Count distinct

The query below stores the distinct number of `clientIp`s in `num_distinct_clients`:

```kusto
panther_logs.public.aws_alb
| where p_event_time > time.ago(1d)
| summarize num_distinct_clients=agg.count_distinct(clientIp)
```

| num\_distinct\_clients |
| ---------------------- |
| 121                    |

### Refer to an aggregation later

You can refer to an aggregation in a subsequent query expression:

```kusto
panther_logs.public.aws_alb
| where p_event_time > time.ago(1d)
| summarize num_connections=agg.count() by clientIp, clientPort
| where num_connections >= 5
```

| clientIp     | clientPort | num\_connections |
| ------------ | ---------- | ---------------- |
| 192.167.7.55 | 50160      | 7                |
| 10.145.4.26  | 63335      | 5                |
| ...          |            |                  |

### Summarize by field (without aggregation)

The query below displays each unique `userAgent`:

```kusto
panther_logs.public.aws_alb
| where p_event_time > time.ago(1d)
| summarize by userAgent
```

| userAgent                                                                                                                              |
| -------------------------------------------------------------------------------------------------------------------------------------- |
| Mozilla/5.0 (X11; Linux x86\_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36                              |
| curl/8.1.2                                                                                                                             |
| Mozilla/5.0 (Linux; Android 7.0; LG-H918 Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Mobile Safari/537.36 |
| …                                                                                                                                      |


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.panther.com/pantherflow/operators/summarize.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
