Correlation Rules (Beta)
Correlation rules establish correlations across logs, identify anomalies, and model complex attack behavior, then generate alerts
Using correlation rules in Panther, you can track multiple actions across log types. In a correlation rule, you specify a group or specific sequence of signals that must occur within a certain window of time to be considered a match. A match generates a signal and, optionally, an alert.
You can also include the absence of a signal in your correlation rule criteria. Because matches on correlation rules are determined by signals, not by rule matches or alerts, it's possible to include rules, scheduled rules, and correlation rules that do not themselves create alerts.
Correlation rules may be particularly useful if you want to generate an alert when, for example:
A certain Okta user logs in successfully after at least one hundred unsuccessful login attempts, then logs in to AWS as a root user
Advanced security settings were disabled for a GitHub repository, which then was not archived
, and more about the YAML keys that make up correlation rules on . You can also leverage .
Using correlation rules may increase your Snowflake compute costs. To learn how to make your correlation rules more cost efficient, see the cost-reduction guidelines below.
Correlation rules are written in YAML and reference previously created rules, scheduled rules, and/or correlation rules. Each correlation rule defines a schedule and a lookback window, i.e., the amount of time in the past the rule should look to find signals (or absences of signals).
You can apply additional criteria to your correlation rule, such as:
Requiring a minimum number of signals to be found for a certain rule, or a maximum
Requiring certain event values to match from one rule to another (e.g., requiring signals for all individual rules to contain the same IP address)
Requiring subsequent steps in a sequence to have occurred within a certain amount of time
When an alert is generated for a correlation rule, the individual rules, scheduled rules, and correlation rules referenced in the correlation rule may or may not also generate their own alerts. This depends on:
Whether each individual rule, scheduled rule, and correlation rule has alerting enabled
The Threshold value on individual rules or scheduled rules, and the MinMatchCount value on the correlation rule. An edge case in which the correlation rule could generate an alert while the rules that make it up do not: the event threshold for an individual rule (set with the Threshold key) is higher than the MinMatchCount on the correlation rule, and the number of actual rule matches falls between the two values.
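The edge case above can be sketched in a few lines of Python. This is only a simplified model, not Panther's implementation: it assumes each rule match produces one signal, that a rule alerts once its match count reaches Threshold, and that a correlation rule step is satisfied once the signal count reaches MinMatchCount. The function names and numbers are hypothetical.

```python
def rule_creates_alert(match_count: int, threshold: int) -> bool:
    """Assumed model: a rule alerts once its matches reach its Threshold."""
    return match_count >= threshold


def correlation_step_satisfied(signal_count: int, min_match_count: int) -> bool:
    """Assumed model: a step is satisfied once enough signals were found."""
    return signal_count >= min_match_count


# Hypothetical numbers: 7 matches, a Threshold of 10, a MinMatchCount of 5.
matches = 7
rule_alerted = rule_creates_alert(matches, threshold=10)
step_satisfied = correlation_step_satisfied(matches, min_match_count=5)
print(rule_alerted, step_satisfied)  # False True
```

With 7 matches sitting between the two values, the correlation rule's requirement is met while the individual rule stays below its own alerting threshold.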
Group correlation rules will generate an alert if all rules have produced signals (or absences), no matter the order in which signals were found. Sequence rules, however, define a particular order in which signals (or absences) must be found in order to generate an alert.
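To make the difference between the two types concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not how Panther evaluates rules: signals are reduced to (rule ID, minute offset) pairs, and absences, match criteria, and minimum match counts are ignored. The rule IDs are hypothetical.

```python
Signal = tuple[str, int]  # (rule ID, minute offset at which the signal occurred)


def group_matches(signals: list[Signal], rule_ids: list[str]) -> bool:
    """True if every listed rule produced at least one signal, in any order."""
    seen = {rule_id for rule_id, _ in signals}
    return all(rule_id in seen for rule_id in rule_ids)


def sequence_matches(signals: list[Signal], rule_ids: list[str]) -> bool:
    """True if signals for the listed rules can be found in the listed order."""
    last_time = float("-inf")
    for wanted in rule_ids:
        later = [t for rule_id, t in signals if rule_id == wanted and t > last_time]
        if not later:
            return False
        # Taking the earliest usable signal leaves the most room for later steps.
        last_time = min(later)
    return True


signals = [("rule.b", 1), ("rule.a", 5)]
print(group_matches(signals, ["rule.a", "rule.b"]))     # True
print(sequence_matches(signals, ["rule.a", "rule.b"]))  # False
print(sequence_matches(signals, ["rule.b", "rule.a"]))  # True
```

The same two signals satisfy the group regardless of order, but satisfy the sequence only when the steps are listed in the order the signals actually occurred.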
Each correlation rule defines both of the following:
When setting the values of Schedule and LookbackWindowMinutes, it's recommended to take a few factors into account:
How important it is for you to be alerted in a timely manner for matches on this correlation rule
For example, it might suffice to run lower priority correlation rules only every 24 hours; alternatively, you might want to run higher priority correlation rules every 15 minutes.
The maximum time that can separate the first and last signals the correlation rule is looking for while they are still considered part of the same occurrence that should generate a match
The maximum latency expected from the data source(s) evaluated by the rules associated with the correlation rule
It's recommended to set your lookback window such that you won't miss occurrences of the thing you're searching for because the required signals were split across the lookback windows of multiple correlation rule runs.
To ensure the signals required for your correlation rule to match are not split across different runs (i.e., they are included in the same lookback window), first determine how far apart in time the first and last signal can be in order to be considered part of the same occurrence that should be alerted on. For convenience, let's call this value "maximum signal timespan minutes."
It's recommended to configure your LookbackWindowMinutes value so that every possible window of length "maximum signal timespan minutes" is covered within at least one run of the correlation rule. It's generally possible to do this with the following formula:
If RateMinutes <= "maximum signal timespan minutes":
LookbackWindowMinutes = 2 x "maximum signal timespan minutes" + log ingest latency
If RateMinutes > "maximum signal timespan minutes":
LookbackWindowMinutes = "maximum signal timespan minutes" + RateMinutes + log ingest latency
LookbackWindowMinutes
Log ingest latency is the maximum latency expected from the data sources evaluated by the rules associated to the correlation rule.
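The formula above can be written as a small helper for experimenting with different values. This is a sketch of the arithmetic only; the function and parameter names are hypothetical and are not part of Panther.

```python
def recommended_lookback_minutes(
    rate_minutes: int,
    max_signal_timespan_minutes: int,
    ingest_latency_minutes: int = 0,
) -> int:
    """Apply the two-branch LookbackWindowMinutes formula described above."""
    if rate_minutes <= max_signal_timespan_minutes:
        return 2 * max_signal_timespan_minutes + ingest_latency_minutes
    return max_signal_timespan_minutes + rate_minutes + ingest_latency_minutes


# Rule runs every 10 minutes, signals may be up to 15 minutes apart, 5 minutes of latency.
print(recommended_lookback_minutes(10, 15, 5))  # 35
# Rule runs every 60 minutes with the same timespan and latency.
print(recommended_lookback_minutes(60, 15, 5))  # 80
```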
Because signals are fetched within the lookback window based on the time their associated event occurred (p_event_time), not the time the event was ingested into Panther (p_parse_time), it's necessary to take ingest latency into account when determining LookbackWindowMinutes to ensure you are processing all "new" data since the last time the correlation rule ran.
For example, if you have configured your correlation rule to run every hour (e.g., by setting RateMinutes to 60) and the source of the logs processed by the rules associated with the correlation rule states that it may delay log forwarding to Panther by up to three hours, you might set LookbackWindowMinutes to 60 + 3*60, or 240. To further illustrate this, consider that because of the three-hour ingestion delay, data received by Panther at 9:01am can have a p_event_time as early as 6:01am. If the correlation rule runs every hour, on the hour, then at 10:00am it would need to look back to at least 6:01am.
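The timing in this example can be checked with Python's standard datetime module. The calendar date below is arbitrary and chosen only for illustration; the variable names are hypothetical.

```python
from datetime import datetime, timedelta

rate_minutes = 60
ingest_latency_minutes = 3 * 60
lookback_window_minutes = rate_minutes + ingest_latency_minutes  # 240

# Arbitrary date; only the times of day matter here.
run_time = datetime(2024, 1, 1, 10, 0)
window_start = run_time - timedelta(minutes=lookback_window_minutes)

# Data received at 9:01am may describe an event from up to three hours earlier.
received_at = datetime(2024, 1, 1, 9, 1)
earliest_event_time = received_at - timedelta(minutes=ingest_latency_minutes)

print(window_start.strftime("%H:%M"))         # 06:00
print(earliest_event_time.strftime("%H:%M"))  # 06:01
print(window_start <= earliest_event_time)    # True
```

A 240-minute window starting from the 10:00am run reaches back to 6:00am, which covers the earliest possible event time of 6:01am.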
Deduplication set on individual rules and scheduled rules referenced in a correlation rule (with dedup(), DedupPeriodMinutes, Threshold, or set in the Console) is not applicable to the correlation rule.
While working with correlation rules, you may receive one of the following errors:
In a group correlation rule, the MatchCriteria key defines fields, per rule, scheduled rule, and correlation rule, that must have matching values in order for the correlation rule to pass.
If match criteria are not defined, there is no requirement for certain event field values to match, so the correlation rule is less specific.
In a group correlation rule, MinMatchCount is an optional field that specifies the minimum number of individual rules, scheduled rules, or correlation rules (defined in Group) that must match in order for this correlation rule to match.
For example, if you list five rules within Group and add MinMatchCount: 2, the correlation rule will match if any two of the five rules generate a signal.
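The "any two of five" behaviour can be illustrated with a short Python sketch. It is a simplified model under the assumption that a rule counts toward the group's MinMatchCount as soon as it has produced at least one signal. The rule IDs and function name are hypothetical.

```python
def group_meets_min_match_count(signal_counts: dict[str, int], min_match_count: int) -> bool:
    """True if at least min_match_count rules in the group produced a signal."""
    rules_with_signals = sum(1 for count in signal_counts.values() if count > 0)
    return rules_with_signals >= min_match_count


# Five hypothetical rules, two of which produced signals in the lookback window.
signal_counts = {
    "rule.one": 3,
    "rule.two": 0,
    "rule.three": 1,
    "rule.four": 0,
    "rule.five": 0,
}
print(group_meets_min_match_count(signal_counts, min_match_count=2))  # True
print(group_meets_min_match_count(signal_counts, min_match_count=3))  # False
```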
The examples below refer to these JSON events:
In this example, MatchCriteria specifies that the IP field in all four rules must contain the same value.
The order of the sequence is defined by the order of rules defined within the Sequence key.
Within a sequence, you can optionally define transitions. Transitions define additional criteria for how one step in a sequence can traverse to the next, including how much time can elapse between steps, as well as which event fields must have matching values. Using Transitions with WithinTimeFrameMinutes and/or Match increases the specificity of your correlation rule.
If transitions are defined, there must be one fewer transition than the number of rules included in Sequence. Additionally, the items within Transitions must be in the same order as the Sequence list.
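As a rough illustration of what a transition adds, the Python sketch below checks one step-to-step hop against an optional time limit and an optional field that must match. The signal shape (a dict with a minute offset and a fields mapping), the function name, and the sample values are all assumptions made for this example, not Panther's data model.

```python
from typing import Optional


def transition_satisfied(
    prev_signal: dict,
    next_signal: dict,
    within_minutes: Optional[int] = None,
    match_on: Optional[str] = None,
) -> bool:
    """Check one hop between two sequence steps (simplified model)."""
    elapsed = next_signal["minute"] - prev_signal["minute"]
    if elapsed < 0:
        return False  # the next step must not precede the previous one
    if within_minutes is not None and elapsed > within_minutes:
        return False  # too much time passed between the steps
    if match_on is not None:
        prev_value = prev_signal["fields"].get(match_on)
        if prev_value is None or prev_value != next_signal["fields"].get(match_on):
            return False  # the chosen field is missing or differs
    return True


failure = {"minute": 0, "fields": {"ip": "203.0.113.7"}}
success = {"minute": 12, "fields": {"ip": "203.0.113.7"}}
print(transition_satisfied(failure, success, within_minutes=15, match_on="ip"))  # True
print(transition_satisfied(failure, success, within_minutes=10, match_on="ip"))  # False
```

Each extra criterion narrows the set of signal pairs that count, which is what makes the correlation rule more specific.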
The examples below refer to these JSON events:
Using Transitions with Match within Sequence is useful if you'd like to define event fields whose values must match.
Using the following correlation rule:
We can write the following tests:
If a sequence uses transitions:
The number of transitions allowed is one fewer than the number of rules included in the sequence collection.
The order of items in the Transitions list must correspond to the order of rules in the Sequence field.
If you use event field matching:
The value of the previous Match.On/Match.From must match the next Match.On/Match.To value.
Only one type of field may be matched on throughout the correlation rule. For example, only IP addresses can be matched on, or only email addresses.
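The first two constraints above (count and order) are easy to check mechanically. The sketch below models a sequence as a list of step names and each transition as a (from step, to step) pair. That representation is an assumption made for illustration and is not Panther's YAML schema; the step names are hypothetical.

```python
def validate_transitions(sequence: list[str], transitions: list[tuple[str, str]]) -> list[str]:
    """Return a list of problems; an empty list means count and order look right."""
    errors = []
    expected_pairs = list(zip(sequence, sequence[1:]))
    if len(transitions) != len(expected_pairs):
        errors.append(
            f"expected {len(expected_pairs)} transitions, found {len(transitions)}"
        )
    for index, (actual, expected) in enumerate(zip(transitions, expected_pairs)):
        if actual != expected:
            errors.append(f"transition {index} is {actual}, expected {expected}")
    return errors


steps = ["failed.logins", "login.success", "root.login"]
good = [("failed.logins", "login.success"), ("login.success", "root.login")]
print(validate_transitions(steps, good))           # []
print(len(validate_transitions(steps, good[:1])))  # 1
```

Three steps allow exactly two transitions, and each transition has to connect adjacent steps in the order they are listed.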
You can use the flow chart visualizer in the Panther Console while working with correlation rules. It renders the correlation rules visually while you construct them in YAML, and the UI provides immediate validation feedback on your rules.
To create a correlation rule in the Panther Console, you can either select detections from the list view to generate YAML, or construct the YAML yourself.
In the left-hand navigation bar of your Panther Console, click Detections.
In the list of detections, click the checkbox on the left-hand side of each detection you'd like to include in your correlation rule.
On the detection creation page, finish configuring your correlation rule:
Name: Enter a descriptive name for the correlation rule.
ID (optional): Click the pen icon and enter a unique ID for your correlation rule.
In the upper-right corner, the Enabled toggle will be set to ON by default. If you'd like to disable the rule, flip the toggle to OFF.
Under the YAML Editor tab:
If desired, make modifications to the generated correlation rule. The rule, by default:
Under the Alert Settings tab:
Within the Basics tab, configure the following fields:
(Only applicable if Create Alert is set to ON) Within the Context sub-tab, optionally provide values for the following fields:
Description: Enter additional context about the rule.
Runbook: Enter the procedures and operations relating to this rule.
Reference: Enter an external link to more information relating to this rule.
Summary Attributes: Enter the attributes you want to showcase in the alerts that are triggered by this detection.
To use a nested field as a summary attribute, use the Snowflake dot notation in the Summary Attribute field to traverse a path in a JSON object:
<column>:<level1_element>.<level2_element>.<level3_element>
Tags: Enter custom tags to help you understand the rule at a glance (e.g., HIPAA).
In the Framework Mapping section:
Click Add New to enter a report.
Provide values for the following fields:
Report Key: Enter a key relevant to your report.
Report Values: Enter values for that report.
Under the Unit Tests tab, optionally add tests:
Click + Add new unit test.
Boilerplate code for a test for this correlation rule will be populated.
Make any necessary adjustments to the populated text, and fill in the remainder of the test, including the contents of Matches.
In the upper-right corner, click Deploy.
You can switch your correlation rule from a sequence to a group, and vice versa, using the Organize Rules as Group/Sequence button at the top right of the YAML Editor panel.
If your correlation rule is already a sequence and you click Organize Rules as Group:
The Sequence key will be changed to Group.
The Transitions section will be removed.
A MatchCriteria section will be added.
If you have previously edited a MatchCriteria section, that version will be added. Otherwise, a default MatchCriteria section will be provided.
If your correlation rule is already a group and you click Organize Rules as Sequence:
The Group key will be changed to Sequence.
The MatchCriteria section will be removed.
A Transitions section will be added.
If you have previously edited a Transitions section, that version will be added. Otherwise, a default Transitions section will be provided.
Correlation rules use complex pattern recognition, which means they have the potential to be computationally expensive. To reduce Snowflake costs associated with correlation rules, keep the following guidelines in mind.
A general guideline when thinking about the relationship between a correlation rule's run frequency and cost is: each time you double the interval on which the correlation rule is run (e.g., by doubling RateMinutes), you halve the cost it generates.
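The guideline follows from simple arithmetic on the number of runs per day. The sketch below assumes, as the guideline does, that each run costs roughly the same; in practice a changed lookback window would also affect per-run cost. The function names are hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def runs_per_day(rate_minutes: int) -> float:
    """How many times a rule runs per day at the given interval."""
    return MINUTES_PER_DAY / rate_minutes


def relative_cost(rate_minutes: int, baseline_rate_minutes: int) -> float:
    """Cost relative to a baseline interval, assuming equal cost per run."""
    return runs_per_day(rate_minutes) / runs_per_day(baseline_rate_minutes)


print(runs_per_day(15))                             # 96.0
print(runs_per_day(30))                             # 48.0
print(relative_cost(30, baseline_rate_minutes=15))  # 0.5
```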
LookbackWindowMinutes as low as possible
As an example, say you'd like to identify when two rules each generate a signal within 10 minutes of one another (using WithinTimeFrameMinutes). While setting LookbackWindowMinutes to exactly 10 is not advisable, you can safely use a value as low as 15.
Match field cardinality is positively correlated with correlation rule cost. If your correlation rule uses event value matching, when choosing which fields to match on, it's recommended to use fields that have lower cardinality.
Cardinality of a match field can be influenced by a few factors, including:
The number of possible values the field can have—the more possible values, the higher the cardinality.
For example, if field_a can have one of three possible values (e.g., "yellow", "red", or "blue"), but field_b will only ever have one of two values (e.g., "purple" or "green"), field_b has lower cardinality than field_a.
The field's data type: typically, fields with a non-scalar data type (i.e., array or object) have higher cardinality than fields with a scalar data type (i.e., string, boolean, or number).
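One rough way to compare candidate match fields is to count the distinct values observed in a sample of events. The Python sketch below does this for scalar and array-valued fields only (object-valued fields are not handled). It is an illustrative estimate, not how Panther or Snowflake measures cardinality, and the sample values are made up.

```python
def observed_cardinality(values: list) -> int:
    """Count distinct values in a sample, flattening array-valued entries."""
    distinct = set()
    for value in values:
        if isinstance(value, (list, tuple)):
            distinct.update(value)  # every array element is a separate value
        else:
            distinct.add(value)
    return len(distinct)


field_a = ["yellow", "red", "blue", "red"]
field_b = ["purple", "green", "green"]
print(observed_cardinality(field_a))  # 3
print(observed_cardinality(field_b))  # 2
```

An array-valued field accumulates distinct values faster than a single-valued one, which is why non-scalar fields tend to have higher cardinality.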
Matches on correlation rules generate . When a correlation rule has alerting enabled, rule matches are generated, which can create alerts according to the correlation rule's . .
There are two types of correlation rules: group and sequence. Both group and sequence correlation rules define a collection of rules for which signals must be found (or not found, by setting Absence: true).
A schedule: Defined by the Schedule field (which uses either RateMinutes or CronExpression), the schedule indicates how frequently the correlation rule should run.
A lookback window: Defined by the LookbackWindowMinutes field, the lookback window specifies the number of minutes in the past the correlation rule should look to find signals (or absences of signals) for the rules, scheduled rules, or correlation rules included in its group or sequence.
Schedule and LookbackWindowMinutes values can have an impact on Snowflake compute costs. See the cost guidelines on this page for more information.
(This should not be confused with WithinTimeFrameMinutes, which is the time frame within which two steps in a sequence must occur in order to pass. "Maximum signal timespan minutes" and WithinTimeFrameMinutes can be equal if the correlation rule is a sequence and specifies only two steps.)
To understand the inclusion of log ingest latency, see
The deduplication period in correlation rules is the value of the LookbackWindowMinutes field. This means overlapping correlation rules within the same LookbackWindowMinutes time frame will only contain the unique events that caused that correlation rule to match.
: You have used incorrect syntax to construct a correlation rule
: Execution of your correlation rule has failed
: Your correlation rule has timed out (likely due to a too-large LookbackWindowMinutes value)
A group correlation rule defines a collection of rules for which signals (or a lack of signals) must occur in a given lookback window. The signals can occur in any order.
If you would like the collection of events to occur in a specific order, use a sequence correlation rule instead.
For rules associated with multiple log types, scheduled rules, or correlation rules, only p_ fields can be matched on. (For rules associated with only one log type, any field may be matched on.)
Learn more about MatchCriteria
on .
MinMatchCount is also an available field on individual rules defined in Group. See both fields in use together in the example below.
A sequence correlation rule defines a collection of rules for which signals (or a lack of signals) must occur in a specific order within a given lookback window.
If you would like the correlation rule to match merely if all rules have signals (or absences), without requiring a specific order, use a group correlation rule instead.
Currently, there can only be one type of field matched on per correlation rule (e.g., all IP address fields or all email address fields). For rules associated with multiple log types, scheduled rules, or correlation rules, only p_ fields can be matched on. (For rules associated with only one log type, any field may be matched on.)
Learn more about transitions on .
You can add unit tests to a correlation rule to evaluate whether, given certain conditions, a match on the correlation rule would be generated (potentially creating an alert).
Unit tests are defined on a correlation rule within the Unit Tests tab (in the Panther Console) or the top-level Tests
field (in the CLI workflow), and are structured similarly to .
Each unit test on a correlation rule must include a Name, ExpectedResult, and RuleOutputs field. Learn more about the YAML structure for unit tests, including how to construct a RuleOutputs value .
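As a quick sanity check before running tests, the three required fields can be verified on a test that has been loaded into a Python dict. This sketch only checks that the keys are present. It does not model the structure of RuleOutputs, and the helper name and sample test are hypothetical.

```python
REQUIRED_TEST_FIELDS = ("Name", "ExpectedResult", "RuleOutputs")


def missing_test_fields(test: dict) -> list[str]:
    """Return the required unit test fields that are absent from the test."""
    return [field for field in REQUIRED_TEST_FIELDS if field not in test]


# Hypothetical draft test that has not yet been given a RuleOutputs value.
draft_test = {"Name": "Successful login after failures", "ExpectedResult": True}
print(missing_test_fields(draft_test))  # ['RuleOutputs']
```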
After writing tests for correlation rules, you can run them using the Panther Analysis Tool (PAT). Running pat test for correlation rules requires an API token.
If there are ten login attempts followed by a successful login, then a root login, we expect the correlation rule to return true. This test uses absolute timestamps.
If there are ten login attempts followed by a successful login, then a root login, we expect the correlation rule to return true. This test uses relative timestamps.
For rules associated with more than one log type, scheduled rules, or correlation rules, values for Match.On/Match.From/Match.To or MatchCriteria.Match must be one of the p_ fields listed on .
You can write correlation rules in the Panther Console or locally. For explanations of the YAML keys used to construct a correlation rule, see .
In addition to creating custom correlation rules, you can also leverage the correlation rules available in the correlation_rules directory of the panther-analysis repository.
The order in which you click the detections will be the order of the generated correlation rule.
Click Correlate.
Is a sequence correlation rule.
You can change this to a group by clicking Organize Rules as Group.
Sets to 1
for all rules and scheduled rules included in the correlation rule.
Within Transitions, sets Match.On for each transition to p_any_usernames.
You can update the Match.On values, or instead use Match.From and Match.To.
Sets to 15
and to 2
.
Sets to 60
.
More information about correlation detection YAML syntax can be found on , including a full list of required and optional fields.
Create Alert: This ON/OFF toggle indicates whether an alert should be created when there are matches, or only a signal.
(Only applicable if Create Alert is set to ON) Severity: Select a severity for the alerts triggered by this detection.
(Only applicable if Create Alert is set to ON) Destination Overrides: Optionally choose destinations to receive alerts for this detection, regardless of severity. Note that destinations can also be set dynamically, in the rule function.
To see examples of runbooks for built-in rules, see .
The alert summary will then be generated for the referenced object in the alert.
For more information on Alert Summaries, see .
Below the code editor, click Run Test to evaluate the test.
By default, the correlation rule is organized as a sequence. You can change this to a group by clicking Organize Rules as Group.
This Discovering.Exfiltrated.Credentials correlation rule checks every 10 minutes to see if there has been a signal for the AWS.CloudTrail.IaaS rule (defined in the second tab below) not followed by a signal for the GitHub.CICD rule (defined in the third tab below) in the last 10 minutes.
This Brute.Force.Login
correlation rule checks every 30 minutes to see if there has been a for the rule followed by a signal for the Okta.Login.Success
rule (defined in the second tab below) followed by a signal for the rule, with additional time frame and event IP value matching requirements.
This Github.Repo.Security.Policy.Disabled.Without.Archival
correlation rule checks every 10 minutes to see if there has been a for the rule not followed by a signal for the GitHub.Repo.Archived
rule (defined in the second tab below) where there is also a matching value for the p_alert_context.repo
event field within the last 10 minutes.
How often a correlation rule runs, as determined by its Schedule value, can have a large impact on its cost. It's therefore recommended to ensure your correlation rule is not running more often than it needs to, while still meeting your detection needs.
The amount of data a correlation rule runs over is largely defined by its LookbackWindowMinutes value, and this data quantity is a leading factor in the correlation rule's processing time and resulting cost. It's recommended to reduce the amount of data your correlation rule processes by setting its LookbackWindowMinutes value as low as possible, while still meeting your detection needs.
For example, if your log schema designates both an email and username field as a username, meaning your p_any_usernames field will include them both in an array (e.g., p_any_usernames: ["Bob Smith", "bob.smith@example.com"]), that p_any_usernames field will have a higher cardinality than the email field, which has a string type with a single value.