Behavioral Analytics and Anomaly Detection Template Macros (Beta)
Detect outliers with Panther-managed macros for behavioral analytics and anomaly detection
Overview
Panther provides template macros for identifying anomalous and new values across your log data. Using the template macros, you can compare recent log events to historical data to identify activity deviating significantly from the established norm. These behavioral analytics and anomaly detection macros may be useful as part of your User and Entity Behavior Analytics (UEBA) strategy.
These macros work by comparing data in a recent time interval to data in a longer lookback window and determining the level of deviation between the two.
The macros Panther provides are:
statistical_anomaly: Identifies outlier values for a numerical field. Example: Find VPC hosts that have been sending an unusually high volume of traffic over the last hour, as compared to the last two weeks.
Learn more in statistical_anomaly.
statistical_anomaly_peer: Identifies outlier values for a numerical field within a peer group. Example: Identify attempts by a user to access a resource that is unusual for members of the same team.
Learn more in statistical_anomaly_peer.
new_unique_values: Identifies new values for a given entity. Example: Find API tokens that have, in the last day, accessed a resource they have not accessed in the previous 30 days.
Learn more in new_unique_values.
new_unique_values_peer: Identifies new values for a given entity within a peer group. Example: Detect if an EC2 instance has connected to an IP that is atypical for members of its VPC group.
Learn more in new_unique_values_peer.
Learn how to view the macros' source code below.
Enabling the macros
Before invoking the macros, you need to make them available in your Panther instance. The macros are provided in a Panther-managed Saved Search, which can be obtained:
In the Console workflow: via a PantherManaged.AnomaliesDetection Pack
In the CLI workflow: retrieved from panther-analysis and uploaded with the Panther Analysis Tool (PAT)
To enable the macros in the Panther Console, enable the PantherManaged.Anomalies Pack:
In the left-hand navigation bar of your Panther Console, click Detections, then Packs.
In the Filter Pack by text field, enter "Anomaly."
On the right side of the Panther Anomaly Detection Pack tile, set the Enabled toggle to ON.

The macros were added to the panther-analysis repository, via an anomalies.yml file, in v3.75.1. To get this file into your own Panther content repository, do one of the following:
Sync your repository with any panther-analysis release equal to or greater than v3.75.1.
Download the anomalies.yml file manually from GitHub and save it in your repository. This path may appeal to you if you'd like to avoid performing a full sync with panther-analysis.
Once anomalies.yml is present in your repository, you can upload it to Panther using PAT.
How to use behavioral analytics and anomaly detection macros in Panther
You can invoke the Panther-managed behavioral analytics and anomaly detection macros in Data Explorer by following the instructions below. This process is similar to the Calling template macros in other queries instructions, but is specific to using the behavioral analytics and anomaly detection template macros.
In the left-hand navigation bar of your Panther Console, click Investigate > Data Explorer.
At the top of the SQL editor, add a -- pragma: template statement:
-- pragma: template
-- pragma: show macro expanded # Optional
Import one of the available macros:
{% import 'anomalies' <statistical_anomaly, new_unique_values, statistical_anomaly_peer, OR new_unique_values_peer> %} -- Specify only one macro, not all four
Define a subquery using a Common Table Expression (CTE). The subquery must:
SELECT at least:
p_event_time
An entity column. The entity is often an ID of some kind (such as an email address, user ID, application ID, or hostname), but can be any data type (such as an IP address). The column must be a top-level field. If the field is nested within an object or array, create an alias for the column with the AS keyword.
An aggregation column. For statistical_anomaly queries, this column's contents will be aggregated and scanned for unusual values. For new_unique_values queries, this column's contents will be scanned for new values. As with the entity column, this column must be a top-level field.
(If you're using statistical_anomaly_peer or new_unique_values_peer) A peer group field.
Using a WHERE clause, define the lookback window (i.e., the longer period of time that denotes the baseline against which the shorter window is compared). The lookback window should end at the current time; for this reason it is recommended to use p_occurs_since(). Learn more about p_occurs_since() here.
with subquery as (
select user:email as email, -- entity column
event_type, -- aggregation column
p_event_time
from mytable
where p_occurs_since(30d)
),
Below the subquery, invoke the macro:
{{ <statistical_anomaly, new_unique_values, statistical_anomaly_peer, OR new_unique_values_peer> (<subquery>, '<entity_col>', ...) }}
See the full list of input arguments for each macro in Behavioral analytics and anomaly detection macro reference, below.
Click Run Search.
Going beyond ad-hoc searches
While Panther's behavioral analytics and anomaly detection queries are useful for threat hunting, they're more powerful when used as a monitoring system. The queries can be saved, set to run on a schedule, and attached to Scheduled Rules. In this way, you can get alerted whenever anomalous activity is observed.
Full examples
See full examples invoking all macros below.
Full example using statistical_anomaly
This query compares VPC traffic observed in the past hour to a baseline set by the past seven days, and returns any addresses that have sent an unusually high volume of outbound traffic, potentially indicating a data exfiltration action.
-- pragma: template
{% import 'anomalies' statistical_anomaly %}
WITH subquery AS (
-- Look for outbound requests:
SELECT
p_event_time as p_timeline,
concat(srcAddr,' -> ',dstAddr,':',dstPort) as traffic,
*
FROM
panther_logs.public.aws_vpcflow
WHERE
p_occurs_since('7 day')
AND dstAddr not like '10.%'
AND dstPort < 1024
AND flowDirection = 'egress'
AND pktDstAwsService is null
),
{{statistical_anomaly('subquery', 'traffic', 'bytes', 'sum', '1', 'hour', 3)}}
Full example using new_unique_values
This query analyzes login events in the last hour and returns cases where an email address was used to log in from an IP address that hasn't been seen in the previous 30 days.
This query can be useful for identifying suspicious logins.
--pragma: template
--pragma: show macro expanded
{% import 'anomalies' new_unique_values %}
with subquery as (
select email, src_ip, p_event_time
from mytable where p_occurs_since(30d)
),
{{ new_unique_values('subquery', 'email', 'src_ip', '1hr') }}
Full example using statistical_anomaly_peer
The example below uses peer grouping to compare the number of GitHub pull requests created by one user to the number of pull requests created by other members of their team.
The query uses where filter clauses to only fetch pull request creation events, as well as to exclude events generated during weekends. It also pulls username/development team associations from a custom lookup table.
-- pragma: template
{% import 'anomalies' statistical_anomaly_peer %}
with logs as (
select
p_event_time,
actor,
action as action_name
from panther_logs.public.github_audit
where p_occurs_since('90d') -- last 90 days of data
and action_name = 'pull_request.create' -- only pull request creation events
and DAYNAME(p_event_time) not in ('Sat', 'Sun') -- ignore weekends
), subquery as (
select
logs.p_event_time,
logs.actor as actor,
logs.action_name as event_type,
lut.team as team
from
logs
join
panther_lookups.public.custom_github_teams as lut
on logs.actor = lut.actor
),
{{ statistical_anomaly_peer('subquery', 'actor', 'team', 'n', 'count', '1', 'day', 0.1) }}
Full example using new_unique_values_peer
The query in this example seeks to discover whether any users have created a GitHub pull request in a repository that has not been used previously by any user on their team.
The query uses a where filter clause to only fetch pull request creation events. The repo column is extracted in order to analyze for new values. The query also pulls username/development team associations from a custom lookup table.
-- pragma: template
{% import 'anomalies' new_unique_values_peer %}
with logs as (
select
p_event_time,
actor,
action as action_name,
repo
from panther_logs.public.github_audit
where p_occurs_since('90d') -- last 90 days of data
and action_name = 'pull_request.create' -- only pull request creation events
), subquery as (
select
logs.p_event_time,
logs.actor as actor,
logs.action_name as event_type,
lut.team as team
from
logs
join
panther_lookups.public.custom_github_teams as lut
on logs.actor = lut.actor
),
{{ new_unique_values_peer('subquery', 'actor', 'team', 'repo', '1d') }}
Viewing the macro source code
After enabling the macros, you can view the behavioral analytics and anomaly detection macro source code in either your Panther Console or in the panther-analysis repository.
To view the macro source code in your Panther Console:
In the left-hand navigation bar of your Panther Console, click Investigate > Saved Searches.
Search for the Saved Search named anomalies, and click its name.
You will be directed to Data Explorer, where you can view the source code:

To view the macro source code in panther-analysis:
In your local Panther code repository or in Panther's upstream panther-analysis repository, view the queries/macros/anomalies.yml file.
Peer group analysis
It's often useful to compare an entity's behavior specifically against the behavior of its peers. For example, has an engineer recently signed into an account that other engineers have not?
Panther provides peer versions of statistical_anomaly and new_unique_values to perform such analysis. In the peer versions, baseline statistics are calculated according to the peer group, then entity behavior is compared to the baseline.
These queries function similarly to the non-peer versions, with the addition of an extra parameter, group_field, to define the peer group. This should be a column whose value is used to group entities together. Some common examples of the group_field value are: user role, job department, VPC ID, and account ID.
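To illustrate what the peer versions expect, the sketch below shows a subquery that carries a grouping column alongside the entity and aggregation columns, then invokes new_unique_values_peer. It is a minimal, hypothetical example: the okta_signin table and the email, app_name, and department columns are assumed names, not Panther-managed ones.
-- pragma: template
{% import 'anomalies' new_unique_values_peer %}
with subquery as (
select
actor:email as email, -- entity column
app_name, -- column scanned for new values
department, -- peer group column (the group_field)
p_event_time
from okta_signin
where p_occurs_since('30d')
),
{{ new_unique_values_peer('subquery', 'email', 'department', 'app_name', '1d') }}
Under these assumptions, this would surface, for each user, applications accessed in the last day that no one in the same department had accessed during the prior 30 days.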
Behavioral analytics and anomaly detection macro reference
Below, you can find reference information for how to use the macros provided by Panther. Unless otherwise specified, assume all input arguments must be provided in the order shown here.
statistical_anomaly
The statistical_anomaly macro looks over a data set for unusual data points within a recent period. It takes a CTE as the base data set, compares each entity's baseline activity across that data to the same entity's most recent activity, and calculates how unusual the recent behavior is.
You must provide the base data set and specify which column contains the entity name and which column to use for data comparison. You must also define the size of the recent period in which to look for anomalies.
Input arguments
Each of the following arguments must be provided to the macro, in the order shown below.
subquery
String
Name of the CTE defined previously, which provides data for the macro to analyze.
entity_field
String
Name of the column to use for grouping; usually a name, IP address, or ID.
agg_field
String
Name of the column to search for outliers.
agg_func
String
Which SQL function to use to aggregate the data in agg_field within a time period. Common values are count, sum, and max.
tmag
String
Number of units for the lookback window in which to look for anomalies; i.e., the 1 in "1 day".
tunit
String
Unit of time for the lookback window in which to look for anomalies; i.e., the day in "1 day". Must be singular (no "s" at the end).
zscore
Number
Outlier threshold; results will not be returned unless their calculated zscore value is higher than this.
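For reference, the positional order of these arguments in an invocation, using the same call as the full example above:
-- arguments: subquery, entity_field, agg_field, agg_func, tmag, tunit, zscore
{{ statistical_anomaly('subquery', 'traffic', 'bytes', 'sum', '1', 'hour', 3) }}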
Returns
The table returned after executing the macro will have the following columns:
N
Number
The value of the data in agg_field, as aggregated by agg_func, for the given entity over the lookback period.
t1
Timestamp
Start of the lookback period.
t2
Timestamp
End of the lookback period.
<entity_field>
Any
Value of the chosen entity_field.
p_zscore
Number
Calculated zscore of the entity's activities over the lookback period. Higher zscore value means more anomalous.
p_mean
Number
Average value of the agg_field column for this entity over the data in subquery, excluding the lookback period.
p_stddev
Number
Standard deviation of the agg_field column for this entity over the data in subquery, excluding the lookback period. A larger p_stddev means the entity's activity was less consistent overall.
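Assuming the conventional z-score definition (an interpretation of these output columns, not something stated in the macro source here), the returned values relate approximately as p_zscore = (N - p_mean) / p_stddev. For example, an entity whose recent aggregate N sits three standard deviations above its own historical mean would report a p_zscore of roughly 3.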
new_unique_values
The new_unique_values macro scans a data set and, for each entity, returns values from a chosen column that appear in the recent lookback period but not earlier in the data.
Input arguments
The following arguments must be provided to the macro, in the order shown below.
subquery
String
Name of the CTE defined previously, which contains the base data to use for finding anomalies.
entity_field
String
Name of the column to use for grouping; usually a user name, IP address, or ID.
agg_field
String
Name of the column in which to search for new values.
interval
String
Size of the period in which to look for new values. Uses the same syntax as p_occurs_since.
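For reference, the positional order of these arguments in an invocation, using the same call as the full example above:
-- arguments: subquery, entity_field, agg_field, interval
{{ new_unique_values('subquery', 'email', 'src_ip', '1hr') }}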
Returns
The table returned after executing the macro will have the following columns:
<entity_field>
Any
Value of the defined entity_field column.
<agg_field>
Any
Any new values discovered in the agg_field column during the lookback period.
statistical_anomaly_peer
Use this macro to determine unusual numerical behavior from an entity compared to its peer group. For example, checking an EC2 instance's traffic volume compared to others with the same tag, or access requests to an S3 object compared to other objects in the same bucket.
Input arguments
The following arguments must be provided to the macro, in the order shown below.
subquery
String
Name of the CTE defined previously, which provides data for the macro to analyze.
entity_field
String
Name of the column to use for identifying an entity; usually a name, IP address, or ID.
group_field
String
Name of the column to use to group entities; for example, a role name or an Account ID.
agg_field
String
Name of the column to search for outliers.
agg_func
String
Which SQL function to use to aggregate the data in agg_field within a time period. Common values are count, sum, and max.
tmag
String
Number of units for the lookback window in which to look for anomalies; i.e., the 1 in "1 day".
tunit
String
Unit of time for the lookback window in which to look for anomalies; i.e., the day in "1 day". Must be singular (no "s" at the end).
zscore
Number
Outlier threshold; results will not be returned unless their calculated zscore value is higher than this.
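For reference, the positional order of these arguments in an invocation, using the same call as the full example above:
-- arguments: subquery, entity_field, group_field, agg_field, agg_func, tmag, tunit, zscore
{{ statistical_anomaly_peer('subquery', 'actor', 'team', 'n', 'count', '1', 'day', 0.1) }}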
Returns
The table returned after executing the macro will have the following columns:
N
Number
The value of the data in agg_field, as aggregated by agg_func, for the given entity over the lookback period.
t1
Timestamp
Start of the lookback period.
t2
Timestamp
End of the lookback period.
<entity_field>
Any
Value of the chosen entity_field.
<group_field>
Any
Value of the chosen group_field.
p_zscore
Number
Calculated zscore of the entity's activities over the lookback period. Higher zscore value means more anomalous.
p_mean
Number
Average value of the agg_field column for this entity over the data in subquery, excluding the lookback period.
p_stddev
Number
Standard deviation of the agg_field column for this entity over the data in subquery, excluding the lookback period. A larger p_stddev means the entity's activity was less consistent overall.
new_unique_values_peer
Use this macro to identify when an entity has done something that has not previously been observed from any member of its peer group.
Input arguments
The following arguments must be provided to the macro, in the order shown below.
subquery
String
Name of the CTE defined previously, which contains the base data to use for finding anomalies.
entity_field
String
Name of the column to use for identifying an entity; usually a name, IP address, or ID.
group_field
String
Name of the column to use to group entities; for example, a role name or an Account ID.
agg_field
String
Name of the column in which to search for new values.
interval
String
Size of the lookback period in which to look for new values. Uses the same syntax as p_occurs_since.
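For reference, the positional order of these arguments in an invocation, using the same call as the full example above:
-- arguments: subquery, entity_field, group_field, agg_field, interval
{{ new_unique_values_peer('subquery', 'actor', 'team', 'repo', '1d') }}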
Returns
The table returned after executing the macro will have the following columns:
<entity_field>
Any
Value of the defined entity_field column.
<group_field>
Any
Value of the defined group_field column.
<agg_field>
Any
Any new values discovered in the agg_field column during the lookback period.