# Alert Runbooks

## Overview

An alert runbook is a set of instructions for investigating and remediating the issue that triggered an alert. Provide a descriptive runbook, because [Panther AI alert triage](https://docs.panther.com/alerts/..#panther-ai-alert-triage) takes it into consideration. Learn how to write an effective runbook [below](#tips-for-writing-an-effective-runbook).

Runbooks are defined on detections:

* In Python detections, in the dynamic [`runbook()` function](https://docs.panther.com/detections/rules/python#runbook) or the static [`Runbook` field](https://docs.panther.com/detections/rules/python#python-rule-specification-reference).
* In Simple Detections, in the static [`Runbook` field](https://docs.panther.com/detections/rules/writing-simple-detections#simple-detection-rule-specification-reference).
* In correlation rules, in the static [`Runbook` field](https://docs.panther.com/detections/correlation-rules/correlation-rule-reference#correlation-rule-top-level-fields).

Panther provides alert runbooks for a number of [Panther-managed detections](https://docs.panther.com/detections/panther-managed). View them in the [`panther-analysis` repository](https://github.com/panther-labs/panther-analysis).

### Runbook examples

```python
def runbook(event):
    # Fall back to generic wording if a field is missing from the event
    user_arn = event.deep_get("userIdentity", "arn", default="this user")
    source_ip = event.deep_get("sourceIPAddress", default="this IP address")

    # Interpolate the alert's own values so each step is specific
    return f"""
    1. Find all API calls by {user_arn} in the 24 hours before the alert
    2. Check if the source IP {source_ip} is associated with known cloud provider IP ranges or VPN endpoints
    3. Look for other alerts from {user_arn} or {source_ip} in the past 7 days
    """
```

<details>

<summary>Additional runbook examples</summary>

{% hint style="info" %}
The runbook examples in this section are written as YAML `Runbook` field values, but could be converted to Python `runbook` functions.
{% endhint %}

User behavior analysis runbook:

```yaml
Runbook: |
  1. Find all API calls by the user ARN in the 24 hours before the alert to establish normal behavior
  2. Identify if this action has been performed by this user in the past 90 days
  3. Check for other alerts or suspicious activity from this user in the past 7 days
```

Resource access patterns runbook:

```yaml
Runbook: |
  1. Query for all access attempts to this S3 bucket in the 1-hour window around the alert
  2. Determine if the accessing principal has legitimate access to this bucket based on past 30 days of activity
  3. Check if the accessed object keys match sensitive data patterns
```

Network/IP analysis runbook:

```yaml
Runbook: |
  1. Find all API calls from the source IP in the 6 hours before and after the alert
  2. Check if the IP is associated with known cloud providers, VPNs, or corporate network ranges
  3. Look for geographic inconsistencies in login locations for this user
```

Privilege escalation investigation runbook:

```yaml
Runbook: |
  1. Query IAM policy changes for this user or role in the 48 hours before the alert
  2. Find all API calls using the newly granted permissions in the 24 hours after the policy change
  3. Check if the user has performed similar privilege escalation actions in the past 90 days
```

Data exfiltration investigation runbook:

```yaml
Runbook: |
  1. Calculate total data transferred by this user to external destinations in the 24 hours before and after the alert
  2. Compare data transfer volume to the user's 30-day average baseline
  3. Identify if the destination IPs or domains are associated with cloud storage or file sharing services
```

Failed authentication analysis runbook:

```yaml
Runbook: |
  1. Count failed authentication attempts for this user in the 1 hour before the alert
  2. Check if a successful authentication from a different IP occurred after the failed attempts
  3. Look for password reset or MFA changes for this user in the 24 hours around the alert
```

</details>

## Tips for writing an effective runbook

To write an effective runbook that Panther AI or a human analyst can follow:

* Provide 2-3 focused investigation steps that build on each other.
  * Ensure steps are specific, concrete, and actionable. For example:
    * **Good (specific, actionable)**: "Query AWS CloudTrail for all API calls by {user\_arn} in the 6 hours before and after this alert to identify what actions were performed."
    * **Bad (vague, without a clear outcome)**: "Search for related user activity."
  * To gather context and build a narrative, include steps that answer:
    1. **What happened?** (immediate context)
    2. **Is this normal?** (baseline comparison)
    3. **What else is suspicious?** (correlation)
* Reference specific alert fields by name to avoid ambiguity.
  * For example, use `sourceIPAddress` instead of "the IP" and `userIdentity:arn` instead of "the user."
* Indicate time windows for data searches (e.g., "24 hours before" or "30 minutes around"). When deciding how much time to search, use the following guidelines:
  * Looking for recent suspicious activity: 1-6 hours before/after
  * Establishing behavioral baselines: 30-90 days of history
  * Executing correlation searches: 24 hours to 7 days
  * Searching for long-term patterns: 90 days
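
The tips above can be combined in a single Python `runbook()` function. The following is a minimal sketch, not a Panther-provided implementation: the window constants and the `FakeEvent` class are hypothetical names introduced here for illustration, and `FakeEvent` exists only so the function can be tried outside Panther. It assumes an AWS CloudTrail event with `userIdentity.arn` and `sourceIPAddress` fields.

```python
# Hypothetical time windows, chosen from the guidelines above.
RECENT_WINDOW = "6 hours"
BASELINE_WINDOW = "30 days"
CORRELATION_WINDOW = "7 days"


def runbook(event):
    # Name the exact alert fields so each step is unambiguous, and fall
    # back to the field name itself if the value is missing.
    user_arn = event.deep_get("userIdentity", "arn", default="the userIdentity:arn value")
    source_ip = event.get("sourceIPAddress", "the sourceIPAddress value")

    steps = [
        # What happened? (immediate context)
        f"Query AWS CloudTrail for all API calls by {user_arn} in the "
        f"{RECENT_WINDOW} before and after this alert.",
        # Is this normal? (baseline comparison)
        f"Check whether {user_arn} has made API calls from {source_ip} "
        f"in the past {BASELINE_WINDOW}.",
        # What else is suspicious? (correlation)
        f"Look for other alerts involving {user_arn} or {source_ip} "
        f"in the past {CORRELATION_WINDOW}.",
    ]
    # Number the steps and return them as one string.
    return "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))


class FakeEvent(dict):
    """Simplified stand-in for a Panther event, for local experiments only."""

    def deep_get(self, *keys, default=None):
        value = self
        for key in keys:
            if not isinstance(value, dict) or key not in value:
                return default
            value = value[key]
        return value
```

Keeping the time windows in named constants makes it easy to see, and adjust, how far each step searches. Calling `runbook(FakeEvent({...}))` with a sample CloudTrail record returns the three numbered steps as plain text.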

<details>

<summary>Runbook templates</summary>

{% hint style="info" %}
The runbook templates in this section are written as YAML `Runbook` field values, but could be converted to [Python `runbook` functions](https://docs.panther.com/detections/rules/python#runbook).
{% endhint %}

Generic runbook template:

```yaml
Runbook: |
  1. [Action] by [specific field] in [timeframe]
  2. [Check/Compare/Verify] [specific condition]
  3. [Search/Look for] [related activity] in [timeframe]
```

AWS CloudTrail runbook template:

```yaml
Runbook: |
  1. Query CloudTrail for all API calls by [userIdentity:arn] in the [timeframe]
  2. Check if [sourceIPAddress] matches known [condition]
  3. Find other alerts for this [user/resource/action] in the [timeframe]
```

Authentication runbook template:

```yaml
Runbook: |
  1. Count [auth events] for [user] in [timeframe before alert]
  2. Check if [condition about IP/location/device]
  3. Look for [successful auth/password changes/account modifications] in [timeframe]
```

</details>
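
As an illustration of converting a template to Python, the sketch below fills in the authentication template. It rests on several assumptions: `authentication_runbook` is a hypothetical helper, the time windows are example choices, and the field paths `actor.alternateId` and `client.ipAddress` assume an Okta-style log. Replace them with the fields your log type actually uses.

```python
def authentication_runbook(user, ip_address):
    """Fill the authentication template with concrete values (hypothetical helper)."""
    steps = [
        f"1. Count failed authentication events for {user} in the 1 hour before the alert",
        f"2. Check if {ip_address} has been used by {user} in the past 30 days",
        f"3. Look for successful authentications, password changes, or MFA changes "
        f"for {user} in the 24 hours around the alert",
    ]
    return "\n".join(steps)


def runbook(event):
    # Assumed Okta-style field paths; fall back to generic wording if missing.
    user = event.deep_get("actor", "alternateId", default="this user")
    ip_address = event.deep_get("client", "ipAddress", default="the client IP address")
    return authentication_runbook(user, ip_address)
```

Keeping the wording in a plain function that takes strings means the text can be checked outside Panther, for example by calling `authentication_runbook("alice@example.com", "198.51.100.4")` and reading the result.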

## How Panther AI uses an alert runbook

When [Panther AI triages an alert](https://docs.panther.com/alerts/..#panther-ai-alert-triage), it reads the alert runbook and autonomously executes it. Writing a runbook according to [the tips above](#tips-for-writing-an-effective-runbook) can help Panther AI perform the strongest alert triage possible.

[See a demo of a detection `runbook` affecting AI alert triage here](https://docs.panther.com/ai/examples#using-a-detection-runbook-to-direct-ai-alert-triage).

<details>

<summary>Panther AI runbook-directed alert triage capabilities</summary>

When [triaging an alert](https://docs.panther.com/alerts/..#panther-ai-alert-triage) and executing a detection's `runbook`, Panther AI has access to a number of capabilities through its [tools](https://docs.panther.com/ai#tools). You don't need to specify a tool name; you can let Panther AI decide which one to use:

<table><thead><tr><th width="155.81253051757812">Capability/tool</th><th>What it does</th><th>Example runbook step</th></tr></thead><tbody><tr><td>Log search</td><td>Search events for any log type using filters</td><td>"Find all AWS CloudTrail events by the user ARN in the 24 hours before the alert"</td></tr><tr><td>Structured queries</td><td>Query data lake tables with SQL-like syntax</td><td>"Query S3 server access logs for all GetObject operations on this bucket in the past hour"</td></tr><tr><td>Detection details</td><td>Get detection rule source code and metadata</td><td>"Review the detection rule logic to understand what threshold triggered this alert"</td></tr><tr><td>Related alerts</td><td>Find alerts by rule, user, IP, or other fields</td><td>"Find all other alerts from this rule for the same user in the past 30 days"</td></tr><tr><td>Alert details</td><td>Get complete alert context and events</td><td>"Retrieve the full alert details including all events and context fields"</td></tr><tr><td>Historical AI analysis</td><td>Search past AI triage responses</td><td>"Check if similar privilege escalation patterns have been analyzed before"</td></tr><tr><td>Schema information</td><td>Get log type field definitions</td><td>"Review the Okta SystemLog schema to understand available fields for correlation"</td></tr><tr><td>Indicator enrichment</td><td>Check IP, domain, hash reputation</td><td>"Check if the source IP is associated with known threat actors or proxy services"</td></tr><tr><td>Data profiling</td><td>Analyze column value distributions</td><td>"Summarize the most common event names for this user in the past 7 days"</td></tr></tbody></table>

</details>
