# PantherFlow Statements

## Overview

{% hint style="info" %}
PantherFlow is in open beta starting with Panther version 1.110, and is available to all customers. Please share any bug reports and feature requests with your Panther support team.
{% endhint %}

A PantherFlow query is composed of one or more statements. There are two types of statements: [tabular expression statements](#tabular-expression-statements) and [`let` statements](#let-statements).

Multiple statements in the same query must be separated by semicolons (`;`). The final statement does not need a semicolon.

## Tabular expression statements

A tabular expression statement is usually what comes to mind when you hear "piped query"—it's composed of a data source and typically one or more operators, separated by the pipe character (`|`). Each operator takes in data, performs its operation, then passes the transformed data on to the next operator.

{% hint style="info" %}
In [Search](https://docs.panther.com/search/search-tool#using-pantherflow-in-search), you can begin your query with a database and table name (as shown below) or the `union` operator. If neither of these are provided, Search will use the values in the [database and table dropdown fields](https://docs.panther.com/search/search-tool#using-database-table-and-date-range-filters).
{% endhint %}

```kusto
panther_logs.public.aws_cloudtrail
| where accountId != '1234567'
| summarize Count=agg.count() by eventName
| extend tooHigh = Count > 100
| where tooHigh
| sort Count
| limit 10
```

### Data sources

Each PantherFlow query must specify a data source. It's possible to use any of the following as the data source:

* Tabular expression statement
* [Table variable](#table-variables)
* [`datatable`](https://docs.panther.com/pantherflow/operators/datatable)
* [`union`](https://docs.panther.com/pantherflow/operators/union)
* [`range`](https://docs.panther.com/pantherflow/operators/range)

## `let` statements

A `let` statement assigns a value to a variable that can be used in subsequent statements. `let` statements can define two types of variables:

* **Table variable**: Represents a tabular expression, and can be later used like a table
* **Scalar variable**: Represents a scalar value, or an expression that evaluates to a scalar value

Variables must be defined in a `let` statement before they are referenced. All `let` statements must end in a semicolon (`;`) when followed by another statement.

### Table variables

A table variable is created when you assign a tabular expression statement to a variable using `let`. It can then be referred to like you would a tabular expression statement (i.e., it is one of the possible [data sources](#data-sources)).

The tabular expression statement assigned to a table variable is not executed until the variable is referred to, or "called," in a subsequent statement.

Naming your query with a table variable can be useful when you:

* Don't want to write out the same query more than once
* Want to make it easier for others to understand what your query is doing

**Examples**

The following example declares a table variable, `elbOK`. Below, `elbOK` is executed exactly as it is defined in the `let` statement:

```kusto
let elbOK = panther_logs.public.aws_alb
| where elbStatusCode == 200;

elbOK
```

Here, an additional operator is applied to `elbOK` within the tabular expression statement:

```kusto
let elbOK = panther_logs.public.aws_alb
| where elbStatusCode == 200;

elbOK 
| where p_event_time > time.ago(1h)
```

You can use table variables wherever you would refer to a dataset, including with the [`union` operator](https://docs.panther.com/pantherflow/operators/union):

```kusto
let ec2Events = panther_logs.public.aws_cloudtrail
| where p_event_time > time.ago(1h)
| where eventSource == "ec2.amazonaws.com";

let s3Events = panther_logs.public.aws_cloudtrail
| where p_event_time > time.ago(1h)
| where eventSource == "s3.amazonaws.com";

union ec2Events, s3Events
| summarize count=agg.count() by eventName, eventSource
| sort count desc
```

### Scalar variables

A scalar variable is created when you assign a non-tabular expression to a variable. Scalar variables can then be referenced throughout subsequent queries.

Declaring scalar variables can be useful when you want to:

* Make your queries more readable and maintainable, especially when using the same value multiple times
* Name values that would otherwise be misunderstood

Take note of the [scalar variable limitations, below](#scalar-variable-limitations).

#### **Examples**

The following example declares a scalar variable, `threshold`, then references it in a [`where`](https://docs.panther.com/pantherflow/operators/where) clause:

```kusto
let threshold = 100;

panther_logs.public.aws_cloudtrail
| where p_event_time > time.ago(1d)
| summarize count=agg.count() by eventName
| where count > threshold
```

The value of an scalar variable can use a [function](https://docs.panther.com/pantherflow/functions), and a scalar variable can be used as a parameter to a function:

```kusto
let domain = "example.com";
let searchSuffix = strings.cat("@", domain);

panther_logs.public.aws_cloudtrail
| where strings.ends_with(userIdentity.principalId, searchSuffix)
| summarize count=agg.count() by eventName
```

You can also use arithmetic expressions with scalar variables:

```kusto
let hourInSeconds = 60 * 60;
let dayInSeconds = hourInSeconds * 24;
let weekInSeconds = dayInSeconds * 7;

panther_logs.public.aws_cloudtrail
| extend ageInSeconds = time.diff("s", p_event_time, time.now())
| extend ageCategory = case(
    ageInSeconds < hourInSeconds, "Less than an hour",
    ageInSeconds < dayInSeconds, "Less than a day",
    ageInSeconds < weekInSeconds, "Less than a week",
    "More than a week"
)
| summarize count=agg.count() by ageCategory
```

You can combine scalar and table variables for powerful queries:

```kusto
// Scalar variables
let minSeverity = 3;
let timeRange = 7d;
let criticalServices = ["ec2.amazonaws.com", "iam.amazonaws.com", "s3.amazonaws.com"];

// Table variable
let baseQuery = panther_logs.public.aws_cloudtrail
| where p_event_time > time.ago(timeRange)
| where eventSource in criticalServices;

// Table variable
let failedActions = baseQuery
| where errorCode != ""
| extend severity = case(
    errorCode == "AccessDenied", 4,
    errorCode == "UnauthorizedOperation", 3,
    strings.starts_with(errorCode, "Client"), 2,
    1
);

// Tabular expression statement
failedActions
| where severity >= minSeverity
| summarize count=agg.count() by eventSource, errorCode, severity
| sort severity desc, count desc
```

### Variable naming rules

Variable names must adhere to the following rules:

* The first character must be a letter, underscore (`_`), or dollar sign (`$`).
* Characters after the first character must be letters, numbers, or underscores.
* Existing table names cannot be used as variable names.
  * For example, if a table already exists named `aws_cloudtrail`, you cannot use `aws_cloudtrail` as a variable name.
* Once a variable name has been used, it cannot be used again in the same PantherFlow search. That is, variables cannot be redefined.

**Examples**

<table><thead><tr><th width="158.21954345703125">Valid variable names</th><th width="360.1328125">Invalid variable names</th></tr></thead><tbody><tr><td><code>myVar123</code></td><td><code>123myVar</code> (starts with number)</td></tr><tr><td><code>my_var</code></td><td><code>my-var</code> (invalid character)</td></tr><tr><td><code>_my_var</code></td><td><code>my.var</code> (invalid character)</td></tr><tr><td><code>$my_var</code></td><td><code>my_var$</code> (<code>$</code> only allowed as first character)</td></tr></tbody></table>
