PantherFlow Best Practices

Optimize your PantherFlow queries

Overview

PantherFlow is in open beta starting with Panther version 1.110, and is available to all customers. Please share any bug reports and feature requests with your Panther support team.

To return PantherFlow query results as quickly as possible (and to minimize the Snowflake costs incurred by the search), follow the best practices below.

If, after implementing them, your query is still running slowly:

- Reduce the time range in your query.
- Check the number of returned rows to see how much data you're querying. If it's a large amount, a longer run time is likely expected.
- Reach out to your Panther Support team for additional help.

General PantherFlow best practices

Use the `limit` operator

Use the limit operator to specify the maximum number of records your query will return.

Example: panther_logs.public.aws_alb | limit 100

Use a time range filter

Use the where operator to filter by a time range, for example against p_event_time. A query with a time range filter accesses fewer micro-partitions, so it returns results faster.

Example: panther_logs.public.aws_alb | where p_event_time > time.ago(1d)

Learn more about available time functions here.

Use `p_any` fields

During log ingestion, Panther extracts common security indicators into p_any fields, which standardize attribute names across all data sources. The p_any fields are stored in optimized columns. When querying multiple log types, query the p_any fields instead of each source's differently named fields.

Learn more in Standard Fields.

Example: panther_logs.public.aws_alb | where '10.0.0.0' in p_any_ip_addresses

Use the `project` operator

A query without a project operator retrieves all columns, which can slow down queries. When possible, use project to query only the fields you need to investigate.

Example: panther_logs.public.aws_alb | project targetIp, targetPort

Summarize results

Summaries execute faster than queries that fetch full log records. A summary is especially helpful when you're investigating logs over a long period of time, or when you don't know how much data exists in the time range you're investigating.

Instead of querying the full data set, start with the summarize operator, then use its output to determine a narrower timeframe to query next.

Example: panther_logs.public.aws_alb | summarize count=agg.count() by targetIp

Learn more about available aggregation functions here.
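As a sketch of that workflow, the query below bounds the time range first and then counts records per targetIp. It reuses only the table, field, and functions from the examples above; the seven-day window is an arbitrary value chosen for illustration.

```pantherflow
// Bound the time range first so fewer micro-partitions are accessed
panther_logs.public.aws_alb
| where p_event_time > time.ago(7d)
// Then count records per target IP instead of fetching full log records
| summarize count=agg.count() by targetIp
```

The counts can then guide a follow-up query over a shorter time range or a smaller set of IP addresses.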

Filter data early with `where`

Filter data with a where clause before performing expensive operations, such as summarize or join, rather than after.

Example:

// Instead of:
panther_logs.public.aws_alb
| summarize agg.count() by actor
| where actor != null

// Use:
panther_logs.public.aws_alb
| where actor != null
| summarize agg.count() by actor
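Filtering early also combines with the other practices in this section. The sketch below applies the time range and p_any filters first, then trims the columns, then caps the row count. The table and field names are taken from the earlier examples, and the specific values (1d, '10.0.0.0', 100) are illustrative only.

```pantherflow
panther_logs.public.aws_alb
// Filter first: time range, then the indicator in the p_any field
| where p_event_time > time.ago(1d)
| where '10.0.0.0' in p_any_ip_addresses
// Keep only the columns needed for the investigation
| project p_event_time, targetIp, targetPort
// Cap the number of returned records
| limit 100
```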

Avoid the `search` operator

The search operator can be slow and should be avoided unless necessary. If you know which column (or columns) might contain the text you're looking for, use where with strings.contains() on those columns instead of using search to scan every column in the specified database or table.

Example:

Instead of: | search 'alice'
Use: | where strings.contains(name, 'alice')
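When the text might appear in more than one column, the same approach can be extended by joining several strings.contains() checks with or. This is a minimal sketch: name comes from the example above, while email is a hypothetical second column used only for illustration.

```pantherflow
// Check two known columns instead of scanning every column with search.
// email is a hypothetical column name; substitute the columns in your table.
| where strings.contains(name, 'alice') or strings.contains(email, 'alice')
```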

How to best search across all logs in PantherFlow

To search across all logs in PantherFlow, use the union operator:

union panther_logs.public.*

Further optimizations for union are planned.

If you know which column(s) the value should appear in (for example, when performing an indicator search for an Indicator of Compromise [IoC] in a p_any field), you can optimize the union panther_logs.public.* search by adding project and where filters, so that the IoC is matched only in the relevant column(s):

union panther_logs.public.*
| project p_event_time, p_any_ip_addresses
| where p_event_time > time.ago(1d)
| where p_any_ip_addresses != null
| where 'ip1' in p_any_ip_addresses or 'ip2' in p_any_ip_addresses

You can also perform an indicator search in Panther using:

- (Recommended) The Search tool: see Searching Indicators of Compromise
  - Search has built-in optimizations that make searching across all logs efficient.
- The executeIndicatorSearchQuery GraphQL API endpoint: see an example here
- Panther AI: the correct data search and analysis tool will automatically be chosen

Do not query all logs with the panther_views database, as it is planned for deprecation.
