# PantherFlow Best Practices

## Overview

{% hint style="info" %}
PantherFlow is in open beta starting with Panther version 1.110, and is available to all customers. Please share any bug reports and feature requests with your Panther support team.
{% endhint %}

To return PantherFlow query results as quickly as possible (and to minimize the Snowflake costs arising from the search), follow the best practices below.

If, after implementing them, your query is still running slowly:

* Reduce the time range in your query.
* If the query completes, check the number of returned rows to understand how much data you're querying. If it's a large amount, a long run time is likely expected.
* Reach out to your Panther Support team for additional help.

## General PantherFlow best practices

### **Use the `limit` operator**

Use the [`limit` operator](https://docs.panther.com/pantherflow/operators/limit) to specify the maximum number of records your query will return.

Example: `panther_logs.public.aws_alb | limit 100`

### **Use a time range filter**

Use the [`where` operator](https://docs.panther.com/pantherflow/operators/where) to filter by a time range (typically against `p_event_time`). A query with a time range filter accesses fewer [micro-partitions](https://docs.snowflake.com/en/user-guide/tables-clustering-micropartitions), so it returns results faster.

Example: `panther_logs.public.aws_alb | where p_event_time > time.ago(1d)`

Learn more about [available time functions here](https://docs.panther.com/functions#date-time).
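
A time range can also be bounded on both sides, which narrows the scan further when you already know roughly when an event occurred. The sketch below chains two `where` filters and assumes that `<` compares timestamps the same way `>` does in the example above; the 7-day and 1-day offsets are placeholders.

```kusto
// Only events between 7 days ago and 1 day ago
panther_logs.public.aws_alb
| where p_event_time > time.ago(7d)
| where p_event_time < time.ago(1d)
```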

### **Use `p_any` fields**

During log ingestion, Panther extracts common security indicators into `p_any` fields, which standardize attribute names across all data sources. The `p_any` fields are stored in optimized columns. When searching multiple log types, query `p_any` fields instead of each log type's differently named fields.

Learn more about [Standard Fields](https://docs.panther.com/search/panther-fields).

Example: `panther_logs.public.aws_alb | where '10.0.0.0' in p_any_ip_addresses`

### **Use the `project` operator**

A query without a [`project` operator](https://docs.panther.com/pantherflow/operators/project) retrieves all columns, which can slow down queries. When possible, use `project` to query only the fields you need to investigate.

Example: `panther_logs.public.aws_alb | project targetIp, targetPort`

### **Summarize results**

Summaries execute faster than queries fetching full log records. Using a summary is especially helpful when you're investigating logs over a long period of time, or when you don't know how much data volume exists for the time range you're investigating.

Instead of querying the full data set, use the [`summarize` operator](https://docs.panther.com/pantherflow/operators/summarize), which will execute faster and help you determine a narrower timeframe to query next.

Example: `panther_logs.public.aws_alb | summarize count=agg.count() by targetIp`

Learn more about [available aggregation functions here](https://docs.panther.com/functions#aggregations).
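
As a rough sketch of this summarize-then-narrow workflow, the two queries below first count records per `targetIp`, then fetch a limited set of records for a single address. The address `10.0.0.0` is a placeholder, and the use of `==` for equality is an assumption, so check the operator reference before relying on it.

```kusto
// Step 1: get a cheap overview of the last day
panther_logs.public.aws_alb
| where p_event_time > time.ago(1d)
| summarize count=agg.count() by targetIp

// Step 2: fetch records only for the address of interest
// ('10.0.0.0' is a placeholder; `==` is assumed to be the equality operator)
panther_logs.public.aws_alb
| where p_event_time > time.ago(1d)
| where targetIp == '10.0.0.0'
| project p_event_time, targetIp, targetPort
| limit 100
```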

### **Filter data early with `where`**

Filter data with a [`where`](https://docs.panther.com/pantherflow/operators/where) clause before performing expensive operations, such as [`summarize`](https://docs.panther.com/pantherflow/operators/summarize) or [`join`](https://docs.panther.com/pantherflow/operators/join), rather than after.

Example:

```kusto
// Instead of:
panther_logs.public.aws_alb
| summarize agg.count() by actor
| where actor != null

// Use:
panther_logs.public.aws_alb
| where actor != null
| summarize agg.count() by actor
```

### **Avoid the `search` operator**

The [`search` operator](https://docs.panther.com/pantherflow/operators/search) can be slow and should be avoided unless necessary. If you know which column (or columns) might contain the text you're looking for, use [`where`](https://docs.panther.com/pantherflow/operators/where) with [`strings.contains()`](https://docs.panther.com/functions/string#strings.contains) instead of searching across *all* columns in the specified database/table with `search`.

Example:

* Instead of: `| search 'alice'`
* Use: `| where strings.contains(name, 'alice')`
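
In a full query, the column-scoped filter might look like the sketch below, combined with the time range and `limit` practices described earlier. The `name` column is carried over from the fragment above and is illustrative only; substitute a column that exists in the table you are querying.

```kusto
panther_logs.public.aws_alb
| where p_event_time > time.ago(1d)
// `name` is an illustrative column, not a confirmed aws_alb field
| where strings.contains(name, 'alice')
| limit 100
```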

## How to best search across all logs in PantherFlow

To search across all logs in PantherFlow, use the [`union` operator](https://docs.panther.com/pantherflow/operators/union):

```kusto
union panther_logs.public.*
```

{% hint style="info" %}
Further optimizations for `union` are planned.
{% endhint %}

If you know which column(s) the value you're searching for should appear in, you can optimize the `union panther_logs.public.*` search by adding the [`project`](https://docs.panther.com/pantherflow/operators/project) and [`where`](https://docs.panther.com/pantherflow/operators/where) operators to look for the value only in the relevant column(s). For example, in an indicator search, you search for an Indicator of Compromise \[IoC] in a [`p_any` field](https://docs.panther.com/search/panther-fields#indicator-fields):

```kusto
union panther_logs.public.*
| project p_event_time, p_any_ip_addresses
| where p_event_time > time.ago(1d)
| where p_any_ip_addresses != null
| where 'ip1' in p_any_ip_addresses or 'ip2' in p_any_ip_addresses
```
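
Before pulling matching records, it can help to see which log types contain the indicator at all. The sketch below swaps `project` for `summarize` and groups by `p_log_type`; it assumes that standard field is present on every table in the union, and `'ip1'` remains a placeholder value.

```kusto
union panther_logs.public.*
| where p_event_time > time.ago(1d)
| where p_any_ip_addresses != null
| where 'ip1' in p_any_ip_addresses
// Count matches per log type to decide which tables to query in detail
| summarize count=agg.count() by p_log_type
```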

You can also perform an indicator search in Panther using:

* (Recommended) The [Search](https://docs.panther.com/search/search-tool) tool: see [Searching Indicators of Compromise](https://docs.panther.com/search/search-tool#searching-indicators-of-compromise)
  * Search has built-in optimizations that make searching across all logs efficient.
* The `executeIndicatorSearchQuery` [GraphQL API](https://docs.panther.com/panther-developer-workflows/api/graphql) endpoint: see an example [here](https://docs.panther.com/panther-developer-workflows/api/graphql/data-lake-queries#execute-a-search-query)
* [Panther AI](https://docs.panther.com/ai): Panther AI automatically chooses the correct [data search and analysis tool](https://docs.panther.com/ai#data-search-and-analysis)

{% hint style="warning" %}
Do not query all logs with the `panther_views` database, as it is [planned for deprecation](https://docs.panther.com/search/backend#panther-views).
{% endhint %}
