Alert Runbooks

Recommended steps to investigate and remediate an alert

Overview

An alert runbook is a set of instructions for investigating and remediating the issue triggering an alert. It's recommended to provide a descriptive runbook, as Panther AI alert triage will take it into consideration. Learn how to write an effective runbook below.

Runbooks are defined on detections, either as a YAML Runbook field value or as a Python runbook function.

Panther provides alert runbooks for a number of Panther-managed detections. You can view them in the panther-analysis repository.

Runbook examples

def runbook(event):
    user_arn = event.deep_get("userIdentity", "arn", default="this user")
    source_ip = event.deep_get("sourceIPAddress", default="this IP address")
    
    return f"""
    1. Find all API calls by {user_arn} in the 24 hours before the alert
    2. Check if the source IP {source_ip} is associated with known cloud provider IP ranges or VPN endpoints
    3. Look for other alerts from {user_arn} or {source_ip} in the past 7 days
    """

Additional runbook examples

The runbook examples in this section are written as YAML Runbook field values, but could be converted to Python runbook functions.

User behavior analysis runbook:

Runbook: |
  1. Find all API calls by the user ARN in the 24 hours before the alert to establish normal behavior
  2. Identify if this action has been performed by this user in the past 90 days
  3. Check for other alerts or suspicious activity from this user in the past 7 days

Resource access patterns runbook:

Runbook: |
  1. Query for all access attempts to this S3 bucket in the 1 hour window around the alert
  2. Determine if the accessing principal has legitimate access to this bucket based on past 30 days of activity
  3. Check if the accessed object keys match sensitive data patterns
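
The YAML runbook above could be converted to a Python runbook function along the following lines. This is a sketch under assumptions, not an official example: it assumes a CloudTrail S3 data event with the bucket name at requestParameters.bucketName and the principal at userIdentity.arn, and `StubEvent` is a hypothetical stand-in for Panther's event object so the snippet can run locally.

```python
class StubEvent(dict):
    """Hypothetical stand-in for Panther's event object, for local testing only."""

    def deep_get(self, *keys, default=None):
        # Walk nested dictionaries, returning `default` if any key is missing.
        value = self
        for key in keys:
            if not isinstance(value, dict) or key not in value:
                return default
            value = value[key]
        return value if value is not None else default


def runbook(event):
    # Assumed field locations for a CloudTrail S3 data event.
    bucket = event.deep_get("requestParameters", "bucketName", default="this S3 bucket")
    principal = event.deep_get("userIdentity", "arn", default="the accessing principal")

    return f"""
    1. Query for all access attempts to {bucket} in the 1 hour window around the alert
    2. Determine if {principal} has legitimate access to {bucket} based on past 30 days of activity
    3. Check if the accessed object keys match sensitive data patterns
    """


# Example values are made up for illustration.
sample = StubEvent(
    {
        "requestParameters": {"bucketName": "example-bucket"},
        "userIdentity": {"arn": "arn:aws:iam::123456789012:user/alice"},
    }
)
print(runbook(sample))
```

The default values keep each step readable even when a field is missing from the event.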

Network/IP analysis runbook:

Runbook: |
  1. Find all API calls from the source IP in the 6 hours before and after the alert
  2. Check if the IP is associated with known cloud providers, VPNs, or corporate network ranges
  3. Look for geographic inconsistencies in login locations for this user

Privilege escalation investigation runbook:

Runbook: |
  1. Query IAM policy changes for this user or role in the 48 hours before the alert
  2. Find all API calls using the newly granted permissions in the 24 hours after the policy change
  3. Check if the user has performed similar privilege escalation actions in the past 90 days

Data exfiltration investigation runbook:

Runbook: |
  1. Calculate total data transferred by this user to external destinations in the 24 hours before and after the alert
  2. Compare data transfer volume to the user's 30-day average baseline
  3. Identify if the destination IPs or domains are associated with cloud storage or file sharing services

Failed authentication analysis runbook:

Runbook: |
  1. Count failed authentication attempts for this user in the 1 hour before the alert
  2. Check if successful authentication occurred after the failed attempts from a different IP
  3. Look for password reset or MFA changes for this user in the 24 hours around the alert

Tips for writing an effective runbook

To write an effective runbook that Panther AI or a human analyst can follow:

  • Provide 2-3 focused investigation steps that build on each other.

    • Ensure steps are specific, concrete, and actionable. For example:

      • Good (specific, actionable): "Query AWS CloudTrail for all API calls by {user_arn} in the 6 hours before and after this alert to identify what actions were performed."

      • Bad (vague, without a clear outcome): "Search for related user activity."

    • To help gather context and build a narrative, consider steps that answer these questions:

      1. What happened? (immediate context)

      2. Is this normal? (baseline comparison)

      3. What else is suspicious? (correlation)

  • Reference specific alert fields by name to avoid ambiguity.

    • For example, use sourceIPAddress instead of "the IP" and userIdentity:arn instead of "the user."

  • Indicate time windows for data searches (e.g., "24 hours before" or "30 minutes around"). When deciding how much time to search, use the following guidelines:

    • Looking for recent suspicious activity: 1-6 hours before/after

    • Establishing behavioral baselines: 30-90 days of history

    • Executing correlation searches: 24 hours to 7 days

    • Searching for long-term patterns: 90 days
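
Some of the tips above can be approximated with a rough automated check before a runbook is saved. The sketch below is illustrative only: `lint_runbook` is a hypothetical helper, not a Panther feature, and it checks just two things, the 2-3 step count and whether at least one step states a time window.

```python
import re

# Matches phrases such as "24 hours", "1-6 hours", "7 days", or "30-day".
TIME_WINDOW = re.compile(
    r"\b\d+(?:-\d+)?[\s-]*(?:minute|hour|day|week)s?\b", re.IGNORECASE
)
# Matches a numbered step at the start of a line, e.g. "1. Find all ...".
STEP = re.compile(r"^\s*\d+\.\s+(.*\S)\s*$", re.MULTILINE)


def lint_runbook(text):
    """Return a list of warnings for a runbook string (hypothetical helper)."""
    warnings = []
    steps = STEP.findall(text)
    if not 2 <= len(steps) <= 3:
        warnings.append(f"expected 2-3 steps, found {len(steps)}")
    # Not every step is a data search, so only require one stated time window.
    if not any(TIME_WINDOW.search(step) for step in steps):
        warnings.append("no step states a time window")
    return warnings


good = (
    "1. Find all API calls by the user in the 24 hours before the alert\n"
    "2. Look for other alerts from this user in the past 7 days"
)
bad = "1. Search for related user activity."

print(lint_runbook(good))
print(lint_runbook(bad))
```

A check like this cannot judge whether a step is specific or actionable; it only catches the structural omissions.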

Runbook templates

The runbook templates in this section are written as YAML Runbook field values, but could be converted to Python runbook functions.

Generic runbook template:

Runbook: |
  1. [Action] by [specific field] in [timeframe]
  2. [Check/Compare/Verify] [specific condition]
  3. [Search/Look for] [related activity] in [timeframe]
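
One way to see how the bracketed placeholders become a concrete runbook is to fill them in order of appearance. The sketch below is illustrative only: `fill_template` and the example values are hypothetical and are not part of Panther.

```python
import re

GENERIC_TEMPLATE = """\
1. [Action] by [specific field] in [timeframe]
2. [Check/Compare/Verify] [specific condition]
3. [Search/Look for] [related activity] in [timeframe]
"""


def fill_template(template, values):
    """Replace bracketed placeholders with `values`, in order of appearance.

    Placeholders left over once `values` runs out are kept unchanged.
    """
    remaining = iter(values)
    return re.sub(r"\[[^\]]+\]", lambda m: next(remaining, m.group(0)), template)


# Example values chosen for illustration.
EXAMPLE_VALUES = [
    "Find all API calls",
    "userIdentity:arn",
    "the 24 hours before the alert",
    "Check if",
    "sourceIPAddress belongs to a known VPN range",
    "Look for",
    "other alerts for this user",
    "the past 7 days",
]

print(fill_template(GENERIC_TEMPLATE, EXAMPLE_VALUES))
```

Filling by position rather than by name lets the two [timeframe] placeholders take different values, which matches the guidance to use different windows for recent activity and for correlation.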

AWS CloudTrail runbook template:

Runbook: |
  1. Query CloudTrail for all API calls by [userIdentity:arn] in the [timeframe]
  2. Check if [sourceIPAddress] matches known [condition]
  3. Find other alerts for this [user/resource/action] in the [timeframe]
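
As a sketch, the CloudTrail template above might look like this as a Python runbook function. The time windows and the wording of step 2 are illustrative choices. The function uses dict-style `get` so it can also be run locally against a plain dictionary; whether Panther's event object behaves identically for nested values is an assumption here (the example earlier on this page uses `event.deep_get` instead).

```python
# Illustrative time windows; adjust them to the detection.
ACTIVITY_WINDOW = "6 hours before and after the alert"
CORRELATION_WINDOW = "past 7 days"


def runbook(event):
    # Dict-style access so the sketch also runs against a plain dictionary.
    user_arn = (event.get("userIdentity") or {}).get("arn", "this user")
    source_ip = event.get("sourceIPAddress", "this IP address")

    steps = [
        f"Query CloudTrail for all API calls by {user_arn} in the {ACTIVITY_WINDOW}",
        f"Check if {source_ip} matches known cloud provider, VPN, or corporate network ranges",
        f"Find other alerts for {user_arn} in the {CORRELATION_WINDOW}",
    ]
    # Number the steps so they read as an ordered investigation.
    return "\n".join(f"{number}. {step}" for number, step in enumerate(steps, start=1))


# Example values are made up for illustration.
print(
    runbook(
        {
            "userIdentity": {"arn": "arn:aws:iam::123456789012:user/alice"},
            "sourceIPAddress": "203.0.113.7",
        }
    )
)
```

Keeping the time windows in named constants makes it easy to adjust them per detection without rewriting each step.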

Authentication runbook template:

Runbook: |
  1. Count [auth events] for [user] in [timeframe before alert]
  2. Check if [condition about IP/location/device]
  3. Look for [successful auth/password changes/account modifications] in [timeframe]

How Panther AI uses an alert runbook

When Panther AI triages an alert, it reads the alert runbook and autonomously executes it. Writing a runbook according to the tips above can help Panther AI perform the strongest alert triage possible.

See a demo of a detection runbook affecting AI alert triage here.

Panther AI runbook-directed alert triage capabilities

When triaging an alert and executing a detection's runbook, Panther AI has access to a number of capabilities through its tools. You don't need to name a tool in a runbook step; Panther AI can decide which one to use.

| Capability/tool | What it does | Example runbook step |
| --- | --- | --- |
| Log search | Search events for any log type using filters | "Find all AWS CloudTrail events by the user ARN in the 24 hours before the alert" |
| Structured queries | Query data lake tables with SQL-like syntax | "Query S3 server access logs for all GetObject operations on this bucket in the past hour" |
| Detection details | Get detection rule source code and metadata | "Review the detection rule logic to understand what threshold triggered this alert" |
| Related alerts | Find alerts by rule, user, IP, or other fields | "Find all other alerts from this rule for the same user in the past 30 days" |
| Alert details | Get complete alert context and events | "Retrieve the full alert details including all events and context fields" |
| Historical AI analysis | Search past AI triage responses | "Check if similar privilege escalation patterns have been analyzed before" |
| Schema information | Get log type field definitions | "Review the Okta SystemLog schema to understand available fields for correlation" |
| Indicator enrichment | Check IP, domain, hash reputation | "Check if the source IP is associated with known threat actors or proxy services" |
| Data profiling | Analyze column value distributions | "Summarize the most common event names for this user in the past 7 days" |
