Behavioral Analytics and Anomaly Detection Template Macros (Beta)
Detect outliers with Panther-managed macros for behavioral analytics and anomaly detection
Overview
These template macros are in open beta starting with Panther version 1.112, and are available to all customers. Please share any bug reports and feature requests with your Panther support team.
Panther provides template macros for identifying anomalous and new values across your log data. Using the template macros, you can compare recent log events to historical data to identify activity deviating significantly from the established norm. These behavioral analytics and anomaly detection macros may be useful as part of your User and Entity Behavior Analytics (UEBA) strategy.
These macros work by comparing data in a recent time interval to data in a longer lookback window and determining the level of deviation between the two.
The macros Panther provides are:
statistical_anomaly
: Identifies outlier values for a numerical fieldExample: Find VPC hosts that have been sending an unusually high volume of traffic over the last hour, as compared to the last two weeks.
Learn more in statistical_anomaly.
statistical_anomaly_peer
: Identifies outlier values for a numerical field within a peer groupExample: Identify attempts by a user to access a resource that is unusual for members of the same team.
Learn more in statistical_anomaly_peer.
new_unique_values
: Identifies new values for a given entityExample: Find API tokens that have accessed a resource in the last day they have not accessed in the previous 30 days.
Learn more in new_unique_values.
new_unique_values_peer
: Identifies new values for a given entity within a peer groupExample: Detect if an EC2 instance has connected to an IP that is atypical for members of its VPC group.
Learn more in new_unique_values_peer.
Learn how to view the macros' source code below.
How to use behavioral analytics and anomaly detection macros in Panther
You can invoke the Panther-managed behavioral analytics and anomaly detection macros in Data Explorer by following the instructions below. This process is similar to the Calling template macros in other queries instructions, but is specific to using the behavioral analytics and anomaly detection template macros.
In the left-hand navigation bar of your Panther Console, click Investigate > Data Explorer.
At the top of the SQL editor, add a
-- pragma: template
statement.Optionally, you can also add
--pragma: show macro expanded
. This will expand the macro into its source code, and is useful for troubleshooting issues with your query. See Debugging template macros for more information.Import one of the available macros:
Define a subquery using a Common Table Expression (CTE). The subquery must:
SELECT
at least:p_event_time
An entity column. The entity is often an ID of some kind (such as an email address, user ID, application ID, or hostname), but can be any data type (such as an IP address). The column must be a top-level field. If the field is nested within an object or array, create an alias for the column with the
AS
keyword.An aggregation column. For
statistical_anomaly
queries, this column's contents will be aggregated and scanned for unusual values. Fornew_unique_values
queries, this column's contents will be scanned for new values. As with the entity column, this column must be a top-level field.(If you're using
statistical_anomaly_peer
ornew_unique_values_peer
) A peer group field.
Using a
WHERE
clause, define the lookback window (i.e., the longer period of time that denotes the baseline against which the shorter window is compared). The lookback window should end at the current time; for this reason it is recommended to usep_occurs_since()
. Learn more aboutp_occurs_since()
here.
Below the subquery, invoke the macro:
See the full list of input arguments for each macro in Behavioral analytics and anomaly detection macro reference, below.
Click Run Search.
Going beyond ad-hoc searches
While Panther's behavioral analytics and anomaly detection queries are useful for threat hunting, they're more powerful when used as a monitoring system. The queries can be saved, set to run on a schedule, and attached to Scheduled Rules. In this way, you can get alerted whenever anomalous activity is observed.
Full examples
See full examples invoking all macros below.
Viewing the macro source code
You can view the behavioral analytics and anomaly detection macro source code in either your Panther Console or in the panther-analysis repository.
To view the macro source code in your Panther Console:
In the left-hand navigation bar of your Panther Console, click Investigate > Saved Searches.
Peer group analysis
It's often useful to compare an entity's behavior specifically against the behavior of its peers. For example, has an engineer recently signed into an account that other engineers have not?
Panther provides peer versions of statistical_anomaly
and new_unique_values
to perform such analysis. In the peer versions, baseline statistics are calculated according to the peer group, then entity behavior is compared to the baseline.
These queries function similarly to the non-peer versions, with the addition of an extra parameter, group_field
, to define the peer group. This should be a column whose value is used to group entities together. Some common examples of the group_field
value are: user role, job department, VPC ID, and account ID.
Behavioral analytics and anomaly detection macro reference
Below, you can find reference information for how to use the macros provided by Panther. Unless otherwise specified, assume all input arguments must be provided in the order shown here.
statistical_anomaly
The statistical_anomaly
macro looks over a dataset for unusual data points within a recent period. It takes a CTE as the base data set and compares the baseline activity for an entity over that period to the most recent activity by the same entity, and calculates how unusual this behavior is.
You must provide the base data set, specify which column contains the entity name and which column to use for data comparison. You must also define the size of the recent period in which to look for anomalies.
Input arguments
Each of the following arguments must be provided to the macro, in the order shown below.
subquery
String
Name of the CTE defined previously, which provides data for the macro to analyze.
entity_field
String
Name of the column to use for grouping; usually a name, IP address, or ID.
agg_field
String
Name of the column to search for outliers.
agg_func
String
Which SQL function to use to aggregate the data in agg_field
within a time period. Common value are count
, sum
, and max
.
tmag
String
Number of units for the lookback window in which to look for anomalies. i.e.: the 1 in "1 day".
tunit
String
Unit of time for the lookback window in which to look for anomalies. i.e.: the day in "1 day". Must be singular (no "s" at the end).
zscore
Number
Outlier threshold; results will not be returned unless their calculated zscore
value is higher than this.
Returns
The table returned after executing the macro will have the following columns:
N
Number
The value of the data in agg_field
, as aggregated by agg_func
, for the given entity over the lookback period.
t1
Timestamp
Start of the lookback period.
t2
Timestamp
End of the lookback period.
<entity_field>
Any
Value of the chosen entity_field
.
p_zscore
Number
Calculated zscore
of the entity's activities over the lookback period. Higher zscore
value means more anomalous.
p_mean
Number
Average value of the agg_field
column for this entity over the data in subquery
, excluding during the lookback period.
p_stddev
Number
Standard deviation of the agg_field
column for this entity over the data in subquery
, excluding during the lookback period. Larger p_stddev
means the entity's activity was less consistent overall.
new_unique_values
The new_unique_values
macro scans a data set and returns a set of new values from a chosen column present in a recent lookback period for a given entity.
Input arguments
The following arguments must be provided to the macro, in the order shown below.
subquery
String
Name of the CTE defined previously, which contains the base data to use for finding anomalies.
entity_field
String
Name of the column to use for grouping; usually a user name, IP address, or ID.
agg_field
String
Name of the column in which to search for new values.
interval
String
Returns
The table returned after executing the macro will have the following columns:
<entity_field>
Any
Value of the defined entity_field
column.
<agg_field>
Any
Any new values discovered in the agg_field
column during the lookback period
statistical_anomaly_peer
Use this macro to determine unusual numerical behavior from an entity compared to its peer group. For example, checking an EC2 instance's traffic volume compared to others with the same tag, or access requests to an S3 object compared to other objects in the same bucket.
Input arguments
The following arguments must be provided to the macro, in the order shown below.
subquery
String
Name of the CTE defined previously, which provides data for the macro to analyze.
entity_field
String
Name of the column to use for identifying an entity; usually a name, IP address, or ID.
group_field
String
Name of the column to use to group entities; for example, a role name or an Account ID.
agg_field
String
Name of the column to search for outliers.
agg_func
String
Which SQL function to use to aggregate the data in agg_field
within a time period. Common value are count
, sum
, and max
.
tmag
String
Number of units for the lookback window in which to look for anomalies. i.e.: the 1 in "1 day".
tunit
String
Unit of time for the lookback window in which to look for anomalies. i.e.: the day in "1 day". Must be singular (no "s" at the end).
zscore
Number
Outlier threshold; results will not be returned unless their calculated zscore
value is higher than this.
Returns
The table returned after executing the macro will have the following columns:
N
Number
The value of the data in agg_field
, as aggregated by agg_func
, for the given entity over the lookback period.
t1
Timestamp
Start of the lookback period.
t2
Timestamp
End of the lookback period.
<entity_field>
Any
Value of the chosen entity_field
.
<group_field>
Any
Value of the chosen group_field
.
p_zscore
Number
Calculated zscore
of the entity's activities over the lookback period. Higher zscore
value means more anomalous.
p_mean
Number
Average value of the agg_field
column for this entity over the data in subquery
, excluding during the lookback period.
p_stddev
Number
Standard deviation of the agg_field
column for this entity over the data in subquery
, excluding during the lookback period. Larger p_stddev
means the entity's activity was less consistent overall.
new_unique_values_peer
Use this macro to identify when an entity has done something that hasn't previously been observed by a member of its peer group.
Input arguments
The following arguments must be provided to the macro, in the order shown below.
subquery
String
Name of the CTE defined previously, which contains the base data to use for finding anomalies.
entity_field
String
Name of the column to use for identifying an entity; usually a name, IP address, or ID.
group_field
String
Name of the column to use to group entities; for example, a role name or an Account ID.
agg_field
String
Name of the column in which to search for new values.
interval
String
Returns
The table returned after executing the macro will have the following columns:
<entity_field>
Any
Value of the defined entity_field
column.
<group_field>
Any
Value of the defined group_field
column.
<agg_field>
Any
Any new values discovered in the agg_field
column during the lookback period
Last updated
Was this helpful?