Snowflake Configuration for Optimal Search Performance

Learn how Panther leverages Snowflake settings to optimize search performance

Overview

Panther has determined how to configure Snowflake to yield optimal search performance. These configuration settings include warehouse size, query acceleration, and search optimization.

These guidelines are used in determining configurations for Panther-managed Snowflake instances, and can serve as a reference for customer-configured Snowflake instances.

Warehouse size

Warehouse size determines the amount of compute resources used when performing an operation in Snowflake (e.g., searching Snowflake tables).

Search performance and cost expectations

In general, the larger a warehouse is, the faster a query runs. An increase in warehouse size means an across-the-board increase in query speed. However, it also means an across-the-board increase in compute spend. For example, a SMALL warehouse will typically run queries twice as fast as an X-SMALL warehouse, but costs twice as much to run.
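The doubling relationship above can be sketched in a few lines. This is a minimal illustration, assuming each size step exactly doubles both query speed and hourly cost, which is an idealization: real queries often scale less cleanly. The function names are hypothetical and not part of Panther or Snowflake.

```python
# Warehouse sizes in ascending order, using the labels from this page.
SIZES = ["X-SMALL", "SMALL", "MEDIUM", "LARGE", "X-LARGE", "2X-LARGE", "3X-LARGE"]


def size_multiplier(size: str) -> int:
    """Compute multiple relative to X-SMALL, assuming each step doubles it."""
    return 2 ** SIZES.index(size)


def estimated_runtime_minutes(xsmall_minutes: float, size: str) -> float:
    """Idealized runtime: speed-up equals the size multiplier (an assumption)."""
    return xsmall_minutes / size_multiplier(size)


if __name__ == "__main__":
    for size in SIZES[:3]:
        runtime = estimated_runtime_minutes(60, size)
        print(f"{size}: ~{runtime:g} min, {size_multiplier(size)}x the X-SMALL hourly cost")
```

Under this model, a query that takes 60 minutes on X-SMALL takes about 30 minutes on SMALL, while the warehouse costs twice as much per hour it runs.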

Below is a high-level warehouse size recommendation based on your ingest volume in Panther. Each ingest volume/warehouse size pairing should yield similar search performance.

Ingest volume in TB/month (uncompressed)    Recommended warehouse size
< 8                                         X-SMALL
8-15                                        SMALL
16-31                                       MEDIUM
32-63                                       LARGE
64-127                                      X-LARGE
128-255                                     2X-LARGE
256-512                                     3X-LARGE
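The recommendation table can be expressed as a simple lookup. This is a sketch only: the function name is hypothetical, and it assumes that fractional volumes falling between two rows (for example, 15.5 TB) belong to the smaller size and that volumes above 512 TB are outside the table.

```python
from bisect import bisect_right

# Exclusive upper bounds (TB/month, uncompressed) for every size but the last.
_UPPER_BOUNDS_TB = [8, 16, 32, 64, 128, 256]
_SIZES = ["X-SMALL", "SMALL", "MEDIUM", "LARGE", "X-LARGE", "2X-LARGE", "3X-LARGE"]
_MAX_TB = 512  # the table stops here


def recommend_warehouse_size(ingest_tb_per_month: float) -> str:
    """Return the recommended size label for a monthly ingest volume."""
    if ingest_tb_per_month < 0:
        raise ValueError("ingest volume cannot be negative")
    if ingest_tb_per_month > _MAX_TB:
        raise ValueError("ingest volume is outside the range covered by the table")
    # bisect_right counts how many bounds are <= the volume, which is
    # exactly the index of the matching row.
    return _SIZES[bisect_right(_UPPER_BOUNDS_TB, ingest_tb_per_month)]


if __name__ == "__main__":
    for volume in (5, 8, 40, 300):
        print(volume, "TB/month ->", recommend_warehouse_size(volume))
```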

Learn more about warehouse size in Snowflake's Overview of Warehouses documentation.

Query acceleration

Query acceleration is only available in Snowflake's Enterprise Edition (or higher).

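Query acceleration is a Snowflake service that speeds up slow-running queries by offloading parts of their processing to additional compute resources provided by Snowflake, outside of the warehouse running the query.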

The scale factor

Query acceleration is configured with a scale factor, which is a cost control mechanism that sets an upper limit on the amount of compute resources that can be borrowed for query acceleration. Panther determines the scale factor for Panther-managed Snowflake instances.

For example, a SMALL warehouse with a scale factor of 10 means that up to 10 more SMALL warehouses may be allocated for a particular query.
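For customer-configured instances, the scale factor is set on the warehouse itself. The sketch below builds the corresponding statement as a string. `ENABLE_QUERY_ACCELERATION` and `QUERY_ACCELERATION_MAX_SCALE_FACTOR` are Snowflake warehouse parameters; the helper function, the warehouse name, and the assumed valid range of 0 to 100 are illustrative and should be checked against Snowflake's documentation.

```python
import re

# Conservative check for an unquoted Snowflake identifier.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def query_acceleration_sql(warehouse: str, max_scale_factor: int) -> str:
    """Build an ALTER WAREHOUSE statement that enables query acceleration."""
    if not _IDENTIFIER.fullmatch(warehouse):
        raise ValueError(f"unexpected warehouse name: {warehouse!r}")
    # Assumption: valid scale factors are 0-100, where 0 removes the cap.
    if not 0 <= max_scale_factor <= 100:
        raise ValueError("scale factor must be between 0 and 100")
    return (
        f"ALTER WAREHOUSE {warehouse} SET "
        "ENABLE_QUERY_ACCELERATION = TRUE "
        f"QUERY_ACCELERATION_MAX_SCALE_FACTOR = {max_scale_factor};"
    )


if __name__ == "__main__":
    # PANTHER_WH is a hypothetical warehouse name.
    print(query_acceleration_sql("PANTHER_WH", 10))
```

Validating the identifier before interpolating it keeps the helper from producing malformed or unsafe SQL if the name comes from user input.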

Learn more about the scale factor in Snowflake's Adjusting the Scale Factor documentation.

Search performance and cost expectations

  • Query acceleration does not kick in until a running query is deemed "slow." Generally, queries running for longer than one minute are candidates to be sped up.

  • The cost of running a query with acceleration is roughly the same as if the query were run without acceleration, because the cost of the additional warehouses is balanced by the reduced compute time. The theoretical maximum cost, however, is determined by the scale factor: with a scale factor of 10, acceleration could, at worst, add up to ten times the bare warehouse cost.
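The worst-case bound described above comes down to simple arithmetic. This sketch assumes the scale factor caps the additional compute at that multiple of the base warehouse, and it takes the hourly credit rate as an input rather than asserting one; the function name and the rate used in the example are hypothetical.

```python
def worst_case_hourly_credits(base_credits_per_hour: float, scale_factor: int) -> dict:
    """Upper bound on hourly credits while acceleration is fully used.

    Assumption: the scale factor limits the extra compute to
    scale_factor times the base warehouse's own rate.
    """
    if base_credits_per_hour <= 0:
        raise ValueError("base rate must be positive")
    if scale_factor < 0:
        raise ValueError("scale factor cannot be negative")
    additional = base_credits_per_hour * scale_factor
    return {
        "base": base_credits_per_hour,
        "additional_max": additional,
        "total_max": base_credits_per_hour + additional,
    }


if __name__ == "__main__":
    # Illustrative rate of 2 credits/hour with a scale factor of 10.
    print(worst_case_hourly_credits(2, 10))
```

This is only a ceiling. In the typical case, the extra compute is offset by shorter run time, so actual spend should sit much closer to the base figure.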

Learn more about query acceleration in Snowflake's Using the Query Acceleration Service documentation.

Search optimization

Search optimization is only available in Snowflake's Enterprise Edition (or higher).

Search optimization is a Snowflake service that indexes ingested data to dramatically improve speed when performing “needle in a haystack” searches (i.e., searches for one-in-a-million events).

Search performance and cost expectations

  • With search optimization enabled, queries can execute 10-100x faster.

  • The more unique a value is, the greater impact search optimization has.

    • For example, if you are searching for isHuman = True and 50% of events are True, search optimization will not improve performance at all; however, if only 0.0001% of events are True, search optimization will have a significant impact.

  • The cost of search optimization can vary based on the following:

    • Both during search optimization initialization and afterward, as data is ingested, the cost depends on the number and size of the indexed tables and fields. Indexing more tables and fields, or larger ones, costs more.

      • When search optimization is initialized, all existing data must be indexed, meaning there is an upfront cost proportional to the amount of historical data being indexed.

      • Following initialization, the cost is calculated at data ingest based on the compute required to generate the search optimization indexes.

    • When an index is used to speed up a search, there is no extra cost. This means searches may become cheaper, since they run more quickly.
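To make the "uniqueness" point concrete, the sketch below computes how selective a value is and, for selective fields, builds a statement that enables search optimization for equality lookups. `ALTER TABLE ... ADD SEARCH OPTIMIZATION ON EQUALITY(...)` is Snowflake syntax; the helper names, the 0.1% threshold, and the table and column names are assumptions chosen for illustration, not Panther's actual criteria.

```python
import re

_NAME = r"[A-Za-z_][A-Za-z0-9_$]*"
_TABLE = re.compile(rf"{_NAME}(\.{_NAME}){{0,2}}")  # optional db.schema prefix
_COLUMN = re.compile(_NAME)


def selectivity(matching_rows: int, total_rows: int) -> float:
    """Fraction of rows that match the searched value."""
    if total_rows <= 0 or not 0 <= matching_rows <= total_rows:
        raise ValueError("row counts are inconsistent")
    return matching_rows / total_rows


def is_good_candidate(matching_rows: int, total_rows: int, threshold: float = 0.001) -> bool:
    """True when the value is rare enough to benefit (threshold is hypothetical)."""
    return selectivity(matching_rows, total_rows) <= threshold


def add_search_optimization_sql(table: str, column: str) -> str:
    """Build the statement enabling equality search optimization on one column."""
    if not _TABLE.fullmatch(table) or not _COLUMN.fullmatch(column):
        raise ValueError("unexpected table or column name")
    return f"ALTER TABLE {table} ADD SEARCH OPTIMIZATION ON EQUALITY({column});"


if __name__ == "__main__":
    total = 1_000_000_000
    print(is_good_candidate(total // 2, total))  # 50% of events match
    print(is_good_candidate(1_000, total))       # 0.0001% of events match
    print(add_search_optimization_sql("logs.public.events", "isHuman"))
```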

Learn more about search optimization in Snowflake's Using the Search Optimization Service documentation.
