Correlation Rules (Beta)
Correlation rules establish correlations across logs, identify anomalies, and model complex attack behavior, then generate alerts
Overview
Correlation rules are in open beta starting with Panther version 1.108, and are available to all customers. Please share any bug reports and feature requests with your Panther support team.
Using correlation rules in Panther, you can track multiple actions across log types. In a correlation rule, you specify a group or specific sequence of signals that must occur within a certain window of time to be considered a match. A match generates a signal and, optionally, an alert.
You can also include the absence of a signal in your correlation rule criteria. Because matches on correlation rules are determined by signals, not by rule matches or alerts, it's possible to include rules, scheduled rules, and correlation rules that have alerting disabled.
Correlation rules may be particularly useful if you want to generate an alert when, for example:
A certain Okta user logs in successfully after at least one hundred unsuccessful login attempts, then logs in to AWS as a root user (see full example below)
Advanced security settings were disabled for a GitHub repository, which then was not archived (see full example below)
Learn how to create custom correlation rules below, and more about the YAML keys that make up correlation rules on Correlation Rule Reference. You can also leverage Panther-managed correlation rules.
Using correlation rules may increase your Snowflake compute costs. To learn how to make your correlation rules more cost efficient, see the guidelines in Making correlation rules more efficient, below.
How correlation rules work
Correlation rules are written in YAML and reference previously created rules, scheduled rules, and/or correlation rules. Each correlation rule runs on a schedule, and defines a "lookback window," i.e., the amount of time in the past the rule should look to find signals (or absences of signals).
You can apply additional criteria to your correlation rule, such as:
Requiring a minimum number of signals to be found for a certain rule, or a maximum
Requiring certain event values to match from one rule to another (e.g., requiring signals for all individual rules to contain the same IP address)
Requiring subsequent steps in a sequence to have occurred within a certain amount of time
What happens when there is a match on a correlation rule
Matches on correlation rules generate signals. When a correlation rule has alerting enabled, rule matches are generated, which can create alerts according to the correlation rule's deduplication configuration. Learn more about the difference between signals, rule matches, and alerts here.
When an alert is generated for a correlation rule, the individual rules, scheduled rules, and correlation rules referenced in the correlation rule may or may not also generate their own alerts. This depends on:
Whether each individual rule, scheduled rule, and correlation rule has alerting enabled
The Threshold value on individual rules or scheduled rules, and the MinMatchCount value on the correlation rule. An edge case in which the correlation rule could generate an alert but the rules that make it up do not: the event threshold for an individual rule (set with the Threshold key) is higher than the MinMatchCount on the correlation rule, and the number of actual rule matches falls between the two values.
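The edge case above can be illustrated with a short sketch. It is a simplified model, not Panther's implementation: the function name and parameters are hypothetical, and it assumes both Threshold and MinMatchCount act as inclusive minimums and that alerting is enabled everywhere.

```python
def alerts_generated(match_count: int, rule_threshold: int,
                     correlation_min_match_count: int) -> dict:
    """Simplified model of which detections have enough matches to alert.

    Assumption: both values are inclusive minimums. All names here are
    hypothetical and exist only for illustration.
    """
    return {
        "individual_rule": match_count >= rule_threshold,
        "correlation_rule": match_count >= correlation_min_match_count,
    }


# Threshold of 10 on the individual rule, MinMatchCount of 5 on the
# correlation rule, and 7 actual matches (between the two values).
result = alerts_generated(match_count=7, rule_threshold=10,
                          correlation_min_match_count=5)
print(result)
```

With these illustrative numbers, only the correlation rule has enough matches to alert.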
Group vs. sequence
There are two types of correlation rules: groups and sequences. Both group and sequence correlation rules define a collection of rules for which signals must be found (or not found, by defining Absence: true).
Group correlation rules will generate an alert if all rules have produced signals (or absences), no matter the order in which signals were found. Sequence rules, however, define a particular order in which signals (or absences) must be found in order to generate an alert.
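The difference between the two types can be sketched in a few lines. This is a conceptual model only, under stated assumptions: signals are represented as hypothetical (rule ID, event time) pairs, and absences, MinMatchCount, and field matching are ignored.

```python
from datetime import datetime

# A signal is modeled as (rule ID, event time). This shape is an assumption
# made for illustration, not Panther's internal representation.
Signal = tuple[str, datetime]


def group_matches(rule_ids: list[str], signals: list[Signal]) -> bool:
    """A group matches when every rule has at least one signal, in any order."""
    seen = {rule_id for rule_id, _ in signals}
    return all(rule_id in seen for rule_id in rule_ids)


def sequence_matches(rule_ids: list[str], signals: list[Signal]) -> bool:
    """A sequence matches when signals for the rules occur in the listed order."""
    step = 0
    for rule_id, _ in sorted(signals, key=lambda s: s[1]):
        if step < len(rule_ids) and rule_id == rule_ids[step]:
            step += 1  # advance to the next expected rule
    return step == len(rule_ids)


rules = ["Okta.Login.Success", "AWS.Console.RootLogin"]
# The root login happens BEFORE the Okta login here.
out_of_order = [
    ("AWS.Console.RootLogin", datetime(2024, 1, 1, 9, 0)),
    ("Okta.Login.Success", datetime(2024, 1, 1, 9, 5)),
]
print(group_matches(rules, out_of_order))     # True: order is irrelevant
print(sequence_matches(rules, out_of_order))  # False: wrong order
```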
Setting the schedule and lookback window
Each correlation rule defines both of the following:
A schedule: Defined by the Schedule field (which uses either RateMinutes or CronExpression), the schedule indicates how frequently the correlation rule should run.
A lookback window: Defined by the LookbackWindowMinutes field, the lookback window specifies the number of minutes in the past the correlation rule should look to find signals (or absences of signals) for the rules, scheduled rules, or correlation rules included in its group or sequence.
It's important to understand how the configurations for these fields work together. See Setting Schedule and Setting LookbackWindowMinutes, below.
Setting Schedule
When determining how to configure Schedule, it's common to evaluate how important it is to be alerted in a timely manner for matches for this correlation rule. For example, it might suffice to run lower priority correlation rules only every 24 hours; alternatively, you might want to run higher priority correlation rules every 15 minutes.
When setting the Schedule, it's also advised to consider the associated Snowflake compute cost, as more frequent runs will generate more expense. See Making correlation rules more efficient for more cost guidelines.
Setting LookbackWindowMinutes
A common formula used to calculate the lookback window is: LookbackWindowMinutes = the interval on which the correlation rule is run + log ingest latency, where:
The interval on which the correlation rule is run is defined in minutes, and is equivalent to RateMinutes
Log ingest latency is the maximum latency expected from the data sources evaluated by the rules associated to the correlation rule
Because signals are fetched within the lookback window based on the time their associated event occurred (p_event_time), not the time the event was ingested into Panther (p_parse_time), it's necessary to take ingest latency into account when determining LookbackWindowMinutes to ensure you are processing all "new" data since the last time the correlation rule ran.
For example, if you have configured your correlation rule to run every hour (e.g., by setting RateMinutes to 60) and the source of the logs processed by the rules associated to the correlation rule states that it may delay log forwarding to Panther by up to three hours, you might set LookbackWindowMinutes to 60 + 3*60, or 240. To further illustrate this, consider that because of the three-hour ingestion delay, data received by Panther at 9:01am can have a p_event_time of as early as 6:01am. If the correlation rule runs every hour, on the hour, at 10:00am, it would need to look back to at least 6:01am.
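The arithmetic in this example can be checked with a short sketch. The function name and the calendar date are hypothetical and chosen only for illustration; the formula itself is the one given above.

```python
from datetime import datetime, timedelta


def lookback_window_minutes(rate_minutes: int,
                            max_ingest_latency_minutes: int) -> int:
    """Run interval plus maximum expected ingest latency, in minutes."""
    return rate_minutes + max_ingest_latency_minutes


# Hourly schedule, up to three hours of log forwarding delay.
window = lookback_window_minutes(rate_minutes=60,
                                 max_ingest_latency_minutes=3 * 60)

# Arbitrary date; only the time of day matters for the illustration.
run_time = datetime(2024, 1, 1, 10, 0)
window_start = run_time - timedelta(minutes=window)

print(window)               # 240
print(window_start.time())  # 06:00:00
```

A run at 10:00am with a 240-minute window reaches back to 6:00am, which covers the 6:01am event from the example.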
The LookbackWindowMinutes value can also have an impact on Snowflake compute costs. See Making correlation rules more efficient for more information.
Deduplication of events
The deduplication period in correlation rules is the value of the LookbackWindowMinutes field. This means overlapping correlation rules within the same LookbackWindowMinutes time frame will only contain the unique events that caused that correlation rule to match.
Deduplication set on individual rules and scheduled rules referenced in a correlation rule (with dedup(), DedupPeriodMinutes, Threshold, or set in the Console) is not applicable to the correlation rule.
Correlation rule errors
While working with correlation rules, you may receive one of the following errors:
Simple Detections error code: You have used incorrect syntax to construct a correlation rule
Detection error: Execution of your correlation rule has failed
System error: Your correlation rule has timed out (likely due to a too-large LookbackWindowMinutes value)
Group correlation rules
A group correlation rule defines a collection of rules for which signals (or a lack of signals) must occur in a given lookback window. The signals can occur in any order.
If you would like the collection of events to occur in a specific order, use a sequence correlation rule instead.
MatchCriteria
In a group correlation rule, the MatchCriteria key defines fields, per rule, scheduled rule, and correlation rule, that must have matching values in order for the correlation rule to pass.
For rules associated to multiple log types, scheduled rules, or correlation rules, only p_ fields can be matched on. (For rules associated to only one log type, any field may be matched on.)
If match criteria is not defined, because there is no requirement for certain event field values to match, the correlation rule is less specific.
Learn more about MatchCriteria on Correlation Rule Reference.
MinMatchCount
In a group correlation rule, MinMatchCount is an optional field that specifies the minimum number of individual rules, scheduled rules, or correlation rules (defined in Group) that must match in order for this correlation rule to match.
For example, if you list five rules within Group and add MinMatchCount: 2, the correlation rule will match if any two of the five rules generate a signal.
MinMatchCount is also an available field on individual rules defined in Group. See both fields in use together in the Group with MinMatchCount example, below.
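How the two MinMatchCount levels interact can be sketched as follows. This is a simplified model under stated assumptions: the function, its parameters, and the rule IDs (Rule.A through Rule.E) are hypothetical, and each per-rule minimum is treated as an inclusive count of signals.

```python
def group_min_match(signal_counts: dict[str, int],
                    per_rule_min: dict[str, int],
                    group_min_match_count: int) -> bool:
    """Simplified model of a group with MinMatchCount at both levels.

    A rule counts as matched when it has at least its own minimum number of
    signals; the group matches when enough rules are matched.
    """
    matched_rules = sum(
        1
        for rule_id, minimum in per_rule_min.items()
        if signal_counts.get(rule_id, 0) >= minimum
    )
    return matched_rules >= group_min_match_count


# Five hypothetical rules, each needing at least one signal.
per_rule_min = {"Rule.A": 1, "Rule.B": 1, "Rule.C": 1, "Rule.D": 1, "Rule.E": 1}
# Only two of the five rules produced signals in the lookback window.
signal_counts = {"Rule.B": 1, "Rule.D": 4}

print(group_min_match(signal_counts, per_rule_min, group_min_match_count=2))

# Raising the per-rule minimum for Rule.B to 3 leaves only Rule.D matched.
stricter = {**per_rule_min, "Rule.B": 3}
print(group_min_match(signal_counts, stricter, group_min_match_count=2))
```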
Group examples
The examples below refer to these JSON events:
In this example, MatchCriteria specifies that the IP field in all four rules must contain the same value.
Sequence correlation rules
A sequence correlation rule defines a collection of rules for which signals (or a lack of signals) must occur in a specific order within a given lookback window.
The order of the sequence is defined by the order of rules defined within the Sequence key.
If you would like the correlation rule to match merely if all rules have signals (or absences), without requiring a specific order, use a group correlation rule instead.
Transitions
Within a sequence, you can optionally define transitions. Transitions define additional criteria for how one step in a sequence can traverse to the next, including how much time can occur between steps, as well as which event fields must have matching values. Using Transitions with WithinTimeFrameMinutes and/or Match increases the specificity of your correlation rule.
If transitions are defined, there must be one fewer transition than the number of rules included in Sequence. Additionally, the items within Transitions must be in the same order as the Sequence list.
Currently, there can only be one type of field matched on per correlation rule (e.g., all IP address fields or all email address fields). For rules associated to multiple log types, scheduled rules, or correlation rules, only p_ fields can be matched on. (For rules associated to only one log type, any field may be matched on.)
Learn more about transitions on Correlation Rule Reference.
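The two structural requirements on transitions (one fewer than the number of rules, and the same order as the sequence) and the time frame check can be sketched as below. This is an illustrative model, not Panther's validator: the Transition class, its field names, and the 15-minute values are hypothetical, and it assumes each transition connects one step to the step immediately after it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Transition:
    # Hypothetical shape for illustration; not Panther's YAML schema.
    from_step: str
    to_step: str
    within_minutes: int


def validate_transitions(sequence: list[str],
                         transitions: list[Transition]) -> list[str]:
    """Return a list of problems; an empty list means the structure is valid."""
    errors = []
    expected = len(sequence) - 1
    if len(transitions) != expected:
        errors.append(f"expected {expected} transitions, got {len(transitions)}")
    # zip stops at the shorter input, so a wrong count cannot cause an IndexError.
    for i, (t, pair) in enumerate(zip(transitions, zip(sequence, sequence[1:]))):
        if (t.from_step, t.to_step) != pair:
            errors.append(f"transition {i} should connect {pair[0]} to {pair[1]}")
    return errors


def within_time_frame(earlier: datetime, later: datetime, t: Transition) -> bool:
    """True when the later step happened no more than within_minutes after the earlier."""
    return timedelta(0) <= later - earlier <= timedelta(minutes=t.within_minutes)


sequence = ["Standard.BruteForceByIP", "Okta.Login.Success", "AWS.Console.RootLogin"]
transitions = [
    Transition("Standard.BruteForceByIP", "Okta.Login.Success", within_minutes=15),
    Transition("Okta.Login.Success", "AWS.Console.RootLogin", within_minutes=15),
]
print(validate_transitions(sequence, transitions))  # []
```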
Sequence examples
The examples below refer to these JSON events:
Using Transitions with Match within Sequence is useful if you'd like to define event fields whose values must match.
Testing correlation rules
You can add unit tests to a correlation rule to evaluate whether, given certain conditions, a match on the correlation rule would be generated (potentially creating an alert—see What happens when there is a match on a correlation rule).
Correlation rule tests exist to test the correlation logic only. To test the rule logic of the individual rules making up the correlation rule, use the unit tests on the individual rules themselves.
Unit tests are defined on a correlation rule within the Unit Tests tab (in the Panther Console) or the top-level Tests field (in the CLI workflow), and are structured similarly to unit tests for rules or policies.
Each unit test on a correlation rule must include a Name, ExpectedResult, and RuleOutputs field. Learn more about the YAML structure for unit tests, including how to construct a RuleOutputs value, on Correlation Rule Reference.
After writing tests for correlation rules, you can run them using the Panther Analysis Tool test command. Running pat test for correlation rules requires an API token; see Authenticating with an API token for more information.
Unit test examples
Using the following correlation rule:
We can write the following tests:
If the below tests were included in the rule's YAML file (as is required when managing your detections in the CLI workflow), they would be positioned under a Tests key.
If there are ten login attempts followed by a successful login, then a root login, we expect the correlation rule to return true. This test uses absolute timestamps; learn more about how to use absolute timestamps on Correlation Rule Reference.
Limitations of correlation rules
If a sequence uses transitions:
The number of transitions allowed is one fewer than the number of rules included in the sequence collection.
The order of items in the Transitions list must correspond to the order of rules in the Sequence field.
If you use event field matching:
The value of the previous Match.On/Match.From must match the next Match.On/Match.To value.
For rules associated to more than one log type, scheduled rules, or correlation rules, values for Match.On/Match.From/Match.To or MatchCriteria.Match must be one of the p_ fields listed on Standard Fields.
Only one type of field may be matched on throughout the correlation rule. For example, only IP addresses can be matched on, or only email addresses.
How to create a correlation rule
You can write correlation rules in the Panther Console or locally. For explanations of the YAML keys used to construct a correlation rule, see Correlation Rule Reference.
In addition to creating custom correlation rules, you can also leverage Panther-managed correlation rules, available in the correlation_rules directory of the panther-analysis repository.
Using the flow chart visualizer in the Console
You can use the flow chart visualizer in the Panther Console while working with Correlation Rules. This renders the correlation rules visually while you construct them in YAML, and the UI provides immediate validation feedback on your rules.
Creating a correlation rule in YAML in the Console
To create a correlation rule in the Panther Console, you can either select detections from the list view to generate YAML, or construct the YAML yourself.
In the left-hand navigation bar of your Panther Console, click Detections.
In the list of detections, click the checkbox on the left-hand side of each detection you'd like to include in your correlation rule.
The order in which you click the detections will be the order of the generated sequence correlation rule.
On the detection creation page, finish configuring your correlation rule:
Name: Enter a descriptive name for the correlation rule.
ID (optional): Click the pen icon and enter a unique ID for your correlation rule.
In the upper-right corner, the Enabled toggle will be set to ON by default. If you'd like to disable the rule, flip the toggle to OFF.
Under the YAML Editor tab:
If desired, make modifications to the generated correlation rule. The rule, by default:
Is a Sequence correlation rule.
Sets MinMatchCount to 1 for all rules and scheduled rules included in the correlation rule.
Within Transitions, sets Match.On for each transition to p_any_usernames.
You can update the Match.On values, or instead use Match.To and Match.From.
Sets Schedule.RateMinutes to 15 and Schedule.TimeoutMinutes to 2.
Sets LookbackWindowMinutes to 60.
More information about correlation detection YAML syntax can be found on Correlation Rule Reference, including a full list of required and optional fields.
Under the Alert Settings tab:
Within the Basics tab, configure the following fields:
(Only applicable if Create Alert is set to ON) Severity: Select a severity level for the alerts triggered by this detection.
(Only applicable if Create Alert is set to ON) Destination Overrides: Optionally choose destinations to receive alerts for this detection, regardless of severity. Note that destinations can also be set dynamically, in the rule function. See Routing Order Precedence to learn more about routing precedence.
(Only applicable if Create Alert is set to ON) Within the Context sub-tab, optionally provide values for the following fields:
Description: Enter additional context about the rule.
Runbook: Enter the procedures and operations relating to this rule.
To see examples of runbooks for built-in rules, see Alert Runbooks.
Reference: Enter an external link to more information relating to this rule.
Summary Attributes: Enter the attributes you want to showcase in the alerts that are triggered by this detection.
To use a nested field as a summary attribute, use the Snowflake dot notation in the Summary Attribute field to traverse a path in a JSON object:
<column>:<level1_element>.<level2_element>.<level3_element>
The alert summary will then be generated for the referenced object in the alert. Learn more about traversing semi-structured data in Snowflake here.
For more information on Alert Summaries, see Assigning and Managing Alerts.
Tags: Enter custom tags to help you understand the rule at a glance (e.g., HIPAA).
In the Framework Mapping section:
Click Add New to enter a report.
Provide values for the following fields:
Report Key: Enter a key relevant to your report.
Report Values: Enter values for that report.
Under the Unit Tests tab, optionally add tests:
Click + Add new unit test.
Boilerplate code for a test for this correlation rule will be populated.
Make any necessary adjustments to the populated text, and fill in the remainder of the test, including the contents of Matches.
In the upper-right corner, click Deploy.
Switching between sequence and group in the Console
You can switch your correlation rule from a sequence to a group, and vice versa, using the Organize Rules as Group/Sequence button at the top right of the YAML Editor panel.
If your correlation rule is already a sequence and you click Organize Rules as Group:
The Sequence key will be changed to Group.
The Transitions section will be removed.
A MatchCriteria section will be added.
If you have previously edited a MatchCriteria section, that version will be added. Otherwise, a default MatchCriteria section will be provided.
If your correlation rule is already a group and you click Organize Rules as Sequence:
The Group key will be changed to Sequence.
The MatchCriteria section will be removed.
A Transitions section will be added.
If you have previously edited a Transitions section, that version will be added. Otherwise, a default Transitions section will be provided.
Creating a correlation rule in YAML in the CLI workflow
Correlation rule full examples
Discovering exfiltrated GitHub credentials
This Discovering.Exfiltrated.Credentials correlation rule checks every 10 minutes to see if there has been a signal for the AWS.CloudTrail.IaaS rule (defined in the second tab below) not followed by a signal for the GitHub.CICD rule (defined in the third tab below) in the last 10 minutes.
Brute force Okta login to AWS root login
This Brute.Force.Login correlation rule checks every 30 minutes to see if there has been a signal for the Standard.BruteForceByIP rule followed by a signal for the Okta.Login.Success rule (defined in the second tab below) followed by a signal for the AWS.Console.RootLogin rule, with additional time frame and event IP value matching requirements.
GitHub repository security policy disabled without subsequent archival
This Github.Repo.Security.Policy.Disabled.Without.Archival correlation rule checks every 10 minutes to see if there has been a signal for the GitHub.Advanced.Security.Change rule not followed by a signal for the GitHub.Repo.Archived rule (defined in the second tab below) where there is also a matching value for the p_alert_context.repo event field within the last 10 minutes.
Making correlation rules more efficient
Correlation rules use complex pattern recognition, which means they have the potential to be computationally expensive. To reduce Snowflake costs associated with correlation rules, keep the following guidelines in mind.
Run the correlation rule as infrequently as possible
How often a correlation rule runs (determined by its Schedule value) can have a large impact on its cost. It's therefore recommended to ensure your correlation rule is not running more often than it needs to, while still meeting your detection needs. See Setting Schedule, above, for considerations when configuring this field.
A general guideline when thinking about the relationship between a correlation rule's run frequency and cost is: Each time you double the interval on which the correlation rule is run (e.g., by doubling RateMinutes), you halve the cost it generates.
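The guideline can be made concrete with a back-of-the-envelope sketch. It assumes, purely for illustration, that cost is proportional to the number of runs and that every run costs the same; in practice the lookback window and data volume also affect cost. The function names are hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def runs_per_day(rate_minutes: int) -> float:
    """Number of scheduled runs in a day for a given RateMinutes value."""
    return MINUTES_PER_DAY / rate_minutes


def relative_cost(rate_minutes: int, baseline_rate_minutes: int) -> float:
    """Cost relative to a baseline schedule, assuming cost scales with run count."""
    return runs_per_day(rate_minutes) / runs_per_day(baseline_rate_minutes)


print(runs_per_day(15))       # 96.0 runs per day
print(relative_cost(30, 15))  # 0.5: doubling the interval halves the cost
print(relative_cost(60, 15))  # 0.25: doubling again halves it once more
```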
Set LookbackWindowMinutes as low as possible
The amount of data a correlation rule runs over is largely defined by its LookbackWindowMinutes value, and this data quantity is a leading factor in the correlation rule's processing time and resulting cost. It's recommended to reduce the amount of data your correlation rule processes by setting its LookbackWindowMinutes value as low as possible (while still meeting your detection needs; see Setting LookbackWindowMinutes, above, for considerations when configuring this field).
As an example, say you'd like to identify when two rules each generate a signal within 10 minutes of one another (using WithinTimeFrameMinutes). While setting LookbackWindowMinutes to exactly 10 is not advisable, you can safely use a value as low as 15.
Choose match fields with the lowest cardinality possible
Match field cardinality is positively correlated with correlation rule cost. If your correlation rule uses event value matching, when choosing which fields to match on, it's recommended to use fields that have lower cardinality.
Cardinality of a match field can be influenced by a few factors, including:
The number of possible values the field can have: the more possible values, the higher the cardinality.
For example, if field_a can have one of three possible values (e.g., "yellow", "red", or "blue"), but field_b will only ever have one of two values (e.g., "purple" or "green"), field_b has lower cardinality than field_a.
The field's data type: typically, fields with a non-scalar data type (i.e., that are an array or object) have higher cardinality than fields with a scalar data type (i.e., string, boolean, or number).
For example, if your log schema designates both an email and username field as a username indicator, meaning your p_any_usernames field will include them both in an array (e.g., p_any_usernames: ["Bob Smith", "[email protected]"]), that p_any_usernames field will have a higher cardinality than the email field, which has a string type with a single value.
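To compare candidate match fields, you could estimate cardinality from a sample of events. The sketch below is one rough way to do so, under stated assumptions: the function name and the sample events are hypothetical, events are plain dictionaries, and each element of an array value is counted as its own distinct value.

```python
def field_cardinality(events: list[dict], field: str) -> int:
    """Count the distinct values a field takes across a sample of events.

    Assumption: array values contribute each of their elements separately.
    """
    distinct = set()
    for event in events:
        value = event.get(field)
        if value is None:
            continue  # field missing from this event
        if isinstance(value, list):
            distinct.update(value)
        else:
            distinct.add(value)
    return len(distinct)


# Hypothetical sample reusing the field_a / field_b example above.
sample_events = [
    {"field_a": "yellow", "field_b": "purple"},
    {"field_a": "red", "field_b": "green"},
    {"field_a": "blue", "field_b": "purple"},
    {"field_a": "yellow", "field_b": "green"},
]

print(field_cardinality(sample_events, "field_a"))  # 3
print(field_cardinality(sample_events, "field_b"))  # 2
```

In this sample, field_b is the lower-cardinality candidate.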