# Risk Scoring and Classification Framework

## Overview

When you [triage alerts with Panther AI](https://docs.panther.com/alerts#using-panther-ai-with-alerts), you will see a **Risk Classification** score.

Panther's risk scoring system uses a pseudo-Bayesian log-odds ratio approach to evaluate security events, combining both risky and benign indicators into a normalized score that ranges from -1 (benign) to +1 (risky).

<figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2F8rt8pdaWwpAR9sykB2vM%2FScreenshot%202026-01-16%20at%203.08.49%E2%80%AFPM.png?alt=media&#x26;token=337484f9-6471-4261-bab2-cb02bd15804a" alt=""><figcaption></figcaption></figure>

## Core methodology

The system requires explicit enumeration of both risky and benign indicators, rather than making binary judgments. Each indicator receives a base score, then three multiplicative weights are applied to reflect confidence, context, and temporal relevance.

### Indicator scoring scale

**Risky Indicators** receive positive scores on a 1-10 scale:

* **Critical (8-10)**: Active exploitation, successful compromise, data exfiltration
* **High (6-7)**: Known vulnerabilities targeted, privilege escalation attempts, lateral movement
* **Medium (4-5)**: Reconnaissance, scanning, suspicious patterns, policy violations
* **Low (1-3)**: Minor anomalies, configuration drift, informational findings

**Benign Indicators** receive negative scores on a -1 to -10 scale:

* **Strong Mitigation (-8 to -10)**: Complete blocking, successful detection, verified false positive
* **Moderate Mitigation (-4 to -7)**: Partial blocking, expected behavior, authorized activity
* **Weak Mitigation (-1 to -3)**: Limited controls, uncertain legitimacy, incomplete data
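The bands above can be expressed as a small lookup. The following is a minimal sketch, not Panther's implementation: the function name `indicator_band` is hypothetical, and it assumes base scores are whole numbers, which the scale implies but does not state.

```python
def indicator_band(score: int) -> str:
    """Return the band name for a base indicator score.

    Assumption: scores are integers in [-10, -1] or [1, 10]; 0 is not a valid
    score because every indicator is either risky or benign.
    """
    if 8 <= score <= 10:
        return "Critical"
    if 6 <= score <= 7:
        return "High"
    if 4 <= score <= 5:
        return "Medium"
    if 1 <= score <= 3:
        return "Low"
    if -3 <= score <= -1:
        return "Weak Mitigation"
    if -7 <= score <= -4:
        return "Moderate Mitigation"
    if -10 <= score <= -8:
        return "Strong Mitigation"
    raise ValueError(f"score {score} is outside the documented ranges")


print(indicator_band(9))   # Critical
print(indicator_band(-5))  # Moderate Mitigation
```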

### Weighting factors

Each indicator's base score is adjusted by three multiplicative factors (all ranging from 0.0 to 1.0):

1. **Evidence Confidence**: Reliability of the data source and depth of analysis
2. **Context Weighting**: Asset criticality, environment type (production vs. sandbox), and business impact
3. **Temporal Relevance**: How recent the activity is
   * Last 24 hours: 1.0
   * 1-7 days: 0.8
   * 7-30 days: 0.6
   * Over 30 days: 0.3
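As an illustration of how the three factors combine, the sketch below computes one indicator's weighted score. The function names are hypothetical, and the handling of the band boundaries is an assumption: the list above places day 7 in two bands, so this sketch treats each upper bound as inclusive.

```python
def temporal_weight(age_days: float) -> float:
    """Map the age of the activity (in days) to the temporal relevance factor.

    Assumption: upper bounds are inclusive, so exactly 7 days maps to 0.8
    and exactly 30 days maps to 0.6.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days <= 1:
        return 1.0
    if age_days <= 7:
        return 0.8
    if age_days <= 30:
        return 0.6
    return 0.3


def weighted_score(base_score: float, confidence: float, context: float,
                   age_days: float) -> float:
    """Return Score x Confidence x Context x Temporal for a single indicator."""
    for name, value in (("confidence", confidence), ("context", context)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    return base_score * confidence * context * temporal_weight(age_days)


# A Critical indicator (8) seen 3 days ago on a production asset,
# from a reliable source: 8 x 0.9 x 1.0 x 0.8, roughly 5.76.
print(weighted_score(8, confidence=0.9, context=1.0, age_days=3))
```

Because every factor is at most 1.0, weighting can only shrink an indicator's magnitude, never amplify it beyond its base score.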

### Mathematical formula

The system aggregates weighted scores into two totals, one summed over the benign indicators and one summed over the risky indicators:

> *ABIS (Aggregate Benign Indicators Score) = Σ(Score × Confidence × Context × Temporal)*\
> *ARIS (Aggregate Risk Indicators Score) = Σ(Score × Confidence × Context × Temporal)*

These are combined into a final score:

> *CRS (Composite Risk Score) = (ARIS + ABIS) / (ARIS - ABIS)*

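Because benign scores are negative, ABIS is at most 0. The numerator is therefore the net evidence, and the denominator is the total magnitude of all evidence, which bounds the result between -1 and +1:

* **CRS = 0**: Perfectly balanced (risky and benign evidence equal)
* **CRS > 0**: Leans risky (risky evidence dominates)
* **CRS < 0**: Leans benign (benign evidence dominates)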
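A short worked example may make the normalization concrete. This is a sketch under stated assumptions, not Panther's code: the function name is hypothetical, and returning 0 when there is no evidence at all (ARIS and ABIS both 0, where the formula is undefined) is a choice made here for illustration.

```python
def composite_risk_score(aris: float, abis: float) -> float:
    """Compute CRS = (ARIS + ABIS) / (ARIS - ABIS).

    ARIS is a sum of positive weighted scores, so it must be >= 0.
    ABIS is a sum of negative weighted scores, so it must be <= 0.
    """
    if aris < 0:
        raise ValueError("ARIS must be non-negative")
    if abis > 0:
        raise ValueError("ABIS must be non-positive")
    denominator = aris - abis
    if denominator == 0:
        # Assumption: with no evidence either way, treat the event as balanced.
        return 0.0
    return (aris + abis) / denominator


# Risky evidence totals 5.76 and benign evidence totals -2.4:
# (5.76 - 2.4) / (5.76 + 2.4) = 3.36 / 8.16
print(round(composite_risk_score(aris=5.76, abis=-2.4), 3))  # 0.412
```

With only risky evidence the score is exactly +1, and with only benign evidence it is exactly -1, regardless of how large the totals are.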

#### Classification thresholds

Based on the composite risk score, events are classified as:

* **Risky**: CRS exceeds the positive threshold
* **Benign**: CRS falls below the negative threshold
* **Inconclusive**: CRS falls between thresholds
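The classification step can be sketched as a simple comparison. The threshold values are not stated in this document, so the defaults of `0.3` and `-0.3` below are purely hypothetical placeholders, as are the function and parameter names.

```python
def classify(crs: float, risky_threshold: float = 0.3,
             benign_threshold: float = -0.3) -> str:
    """Map a composite risk score to Risky, Benign, or Inconclusive.

    Assumption: the default thresholds are illustrative only; the actual
    values used by Panther are not documented here.
    """
    if not -1.0 <= crs <= 1.0:
        raise ValueError("CRS must be between -1 and 1")
    if crs > risky_threshold:
        return "Risky"
    if crs < benign_threshold:
        return "Benign"
    return "Inconclusive"


print(classify(0.41))   # Risky
print(classify(0.05))   # Inconclusive
print(classify(-0.75))  # Benign
```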

### Theoretical foundation

This methodology draws from established risk assessment frameworks:

* **CVSS v3.1**: Multiplicative weighting of base, temporal, and environmental metrics
* **FAIR (Factor Analysis of Information Risk)**: Combines threat frequency, vulnerability, and loss magnitude factors
* **Bayesian Risk Scoring**: Uses log-odds ratios for binary classification
* **OCTAVE Approach**: Evidence-based risk management

### Key advantages

The approach avoids binary "good/bad" judgments by requiring analysts to:

* Explicitly document both supporting and contradicting evidence
* Assign confidence levels based on data quality
* Account for environmental context (asset importance, business criticality)
* Weight recent activity more heavily than historical patterns

This enforces rigorous, evidence-based reasoning and prevents over-reliance on single indicators or assumptions. Temporal decay ensures that stale indicators don't artificially inflate risk scores, while context weighting differentiates attacks on critical production systems from activity in development sandboxes.

## Using risk scores in your workflow

Risk classification scores help prioritize your alert queue and focus analyst time where it matters most:

* **Risky (CRS significantly above 0)**: Warrants immediate investigation. Review the risky indicators and verify the AI's findings against the cited evidence.
* **Inconclusive (CRS near 0)**: Requires additional context. Consider running follow-up prompts to gather more data, or manually review the alert's associated events.
* **Benign (CRS significantly below 0)**: Likely safe to deprioritize, but review the benign indicators to confirm they align with your understanding of the environment.

The risk score is a starting point for prioritization, not a final verdict. Always review the individual indicators (both risky and benign) that contributed to the score, and use the [citations](https://docs.panther.com/ai/..#citations) provided by Panther AI to verify the underlying data.
