Data Models

Data Models provide a way to configure a set of unified fields across all log types.

Overview

Use Data Models to configure a set of unified fields across all log types, by creating mappings between event fields for various log types and unified Data Model names. You can leverage Panther-managed Data Models, and create custom ones.

Data Models use case

Suppose you have a detection that checks for a particular source IP address in network traffic logs, and you'd like to use it for multiple log types. These log types might not only span different categories (e.g., DNS, Zeek, Apache), but also different vendors. Without a common logging standard, each of these log types may represent the source IP using a different field name, such as ipAddress, srcIP, or ipaddr. The more log types you'd like to monitor, the more complex and cumbersome the logic of this check becomes. For example, it might look something like:

(event.get('ipAddress') == '127.0.0.1' or 
event.get('srcIP') == '127.0.0.1' or 
event.get('ipaddr') == '127.0.0.1')

If we instead define a Data Model for each of these log types, we can translate the event's field name to the Data Model name, meaning the detection can simply reference the Data Model version. The above logic then simplifies to:

event.udm('source_ip') == '127.0.0.1'
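
To make the idea concrete, here is a minimal, self-contained sketch of the lookup a Data Model performs conceptually. It is an illustration only, not Panther's implementation: the `MAPPINGS` table, the `Example.*` log type names, and the `udm` and `is_loopback_source` functions are all hypothetical and exist only for this example.

```python
# Hypothetical per-log-type mappings: unified name -> original field name.
# The log type names and field names are illustrative only.
MAPPINGS = {
    "Example.DNS": {"source_ip": "ipAddress"},
    "Example.Zeek": {"source_ip": "srcIP"},
    "Example.Apache": {"source_ip": "ipaddr"},
}


def udm(event: dict, name: str):
    """Resolve a unified field name using the mapping for the event's log type."""
    mapping = MAPPINGS.get(event.get("p_log_type"), {})
    field = mapping.get(name)
    return event.get(field) if field is not None else None


def is_loopback_source(event: dict) -> bool:
    # One check covers every log type that has a mapping for source_ip.
    return udm(event, "source_ip") == "127.0.0.1"


events = [
    {"p_log_type": "Example.DNS", "ipAddress": "127.0.0.1"},
    {"p_log_type": "Example.Zeek", "srcIP": "10.0.0.5"},
    {"p_log_type": "Example.Apache", "ipaddr": "127.0.0.1"},
]
results = [is_loopback_source(e) for e in events]
print(results)  # [True, False, True]
```

The detection logic never mentions ipAddress, srcIP, or ipaddr; supporting another log type only requires adding one more mapping.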

Panther-managed Data Models

By default, Panther comes with built-in Data Models for several log types, such as AWS.S3ServerAccess, AWS.VPCFlow, and Okta.SystemLog. All currently supported Data Models can be found in the panther-analysis repository.

The names of the supported Data Model mappings are listed in the Panther-managed Data Model mapping names table, below.

How to create custom Data Models

Custom Data Models can be created in a few ways: in the Panther Console, using the Panther Analysis Tool (PAT), or in the Panther API. See the tabs below for creation instructions for each method.

Your custom Data Model mappings can use the names referenced in Panther-managed Data Models, or your own custom names. Each mapping Name can map to an event field (with Path or Field Path) or a method you define (with Field Method or Method). If you map to a method, you must define the method either in a separate Python file (if working in the CLI workflow), which is referenced in the YAML file using Filename, or in the Python Module field in the Console.
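
As an illustration of mapping to a method, the sketch below shows what such a method might look like. It assumes the method receives the event as its only argument and that the log type carries `action` and `outcome` fields; both the field names and the returned strings are hypothetical and chosen only for this example.

```python
def get_event_type(event):
    """Hypothetical mapping method: derive a unified event_type value.

    Assumes the event exposes a dict-style get() and has 'action' and
    'outcome' fields; these names are illustrative, not from a real schema.
    """
    if event.get("action") == "login":
        if event.get("outcome") == "failure":
            return "failed_login"
        if event.get("outcome") == "success":
            return "successful_login"
    # No unified event type applies to this event.
    return None


# Plain dictionaries are enough to exercise the method locally.
print(get_event_type({"action": "login", "outcome": "failure"}))  # failed_login
print(get_event_type({"action": "logout"}))  # None
```

A mapping would then point at this function by name, for example with Method: get_event_type.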

Each log type can only have one enabled Data Model specified (however, a single Data Model can contain multiple mappings). If you want to change or update an existing Data Model, disable the existing one, then create a new, enabled one.

To create a new Data Model in the Panther Console:

  1. In the upper right corner, click Create New.

  2. Under Settings, fill in the form fields.

    • Display Name: Enter a user-friendly display name for this Data Model.

    • ID: Enter a unique ID for this Data Model.

    • Log Type: Select a log type this Data Model should apply to. Only one log type per Data Model is permitted.

  3. Under Data Model Mappings, create Name/Field Path or Name/Field Method pairs.

  4. If you used the Field Method field, define the method(s) in the Python Module (optional) section.

  5. In the upper right corner, click Save.

Evaluating whether a field exists in Path

Within a Path value, you can include logic that checks whether a certain event field exists. If it does, the mapping will be applied; if it doesn't, the mapping doesn't take effect. For example, take the following Path value from the Panther-managed gsuite_data_model.yml:

  - Name: assigned_admin_role
    Path: $.events[*].parameters[?(@.name == 'ROLE_NAME')].value
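
To show what this filter expression selects, the following sketch reproduces the same selection in plain Python. It approximates the meaning of the Path and is not how Panther evaluates it; the `assigned_admin_roles` function and the sample event values are hypothetical.

```python
def assigned_admin_roles(event: dict) -> list:
    """Collect 'value' from every parameter named ROLE_NAME, across all events."""
    return [
        param["value"]
        for entry in event.get("events", [])
        for param in entry.get("parameters", [])
        if param.get("name") == "ROLE_NAME" and "value" in param
    ]


# Sample event with a ROLE_NAME parameter (values are made up).
with_role = {
    "events": [
        {
            "parameters": [
                {"name": "ROLE_NAME", "value": "_SEED_ADMIN_ROLE"},
                {"name": "USER_EMAIL", "value": "user@example.com"},
            ]
        }
    ]
}
# Sample event without a ROLE_NAME parameter.
without_role = {
    "events": [{"parameters": [{"name": "USER_EMAIL", "value": "user@example.com"}]}]
}

print(assigned_admin_roles(with_role))     # ['_SEED_ADMIN_ROLE']
print(assigned_admin_roles(without_role))  # []
```

When no parameter named ROLE_NAME is present, nothing is selected, which corresponds to the case where the mapping does not take effect.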

Using Data Models

Referencing Data Models in a rule

To reference a Data Model field in a rule:

  1. In a rule's YAML file, ensure the LogTypes field contains the LogType of the Data Model you'd like applied:

    AnalysisType: rule
    DedupPeriodMinutes: 60
    DisplayName: DataModel Example Rule
    Enabled: true
    Filename: my_new_rule.py
    RuleID: DataModel.Example.Rule
    Severity: High
    LogTypes:
      # Add LogTypes where this rule is applicable
      # and a Data Model exists for that LogType
      - AWS.CloudTrail
    Tags:
      - Tags
    Description: >
      This rule exists to validate the CLI workflows of the Panther CLI
    Runbook: >
      First, find out who wrote the spec format, then notify them with feedback.
    Tests:
      - Name: test rule
        ExpectedResult: true
        # Add the LogType to the test specification in the 'p_log_type' field
        Log: {
          "p_log_type": "AWS.CloudTrail"
        }

  2. Add the LogType to all of the rule's test cases, in the p_log_type field.

  3. Leverage the event.udm() method in the rule's Python logic:

    def rule(event):    
        # filter events on unified data model field
        return event.udm('event_type') == 'failed_login'
    
    
    def title(event):
        # use unified data model field in title
        return '{}: User [{}] from IP [{}] has exceeded the failed logins threshold'.format(
            event.get('p_log_type'), event.udm('actor_user'),
            event.udm('source_ip'))

See examples of Data Models in Panther's GitHub repository.

Using Data Models with Enrichment

Panther provides a built-in method on the event object called event.udm_path. Given a Data Model mapping name, it returns the original event field path that the Data Model maps to that name.

AWS.VPCFlow logs example

Using event.udm_path('destination_ip') will return 'dstAddr', since this is the path defined in the Data Model for that log type. The following example uses event.udm_path:

from panther_base_helpers import deep_get

def rule(event):
    return True

def title(event):
    return event.udm_path('destination_ip')

def alert_context(event):
    enriched_data = deep_get(
        event, 'p_enrichment', 'lookup_table_name', event.udm_path('destination_ip')
    )
    return {'enriched_data': enriched_data}

This test case was used:

{
  "p_log_type": "AWS.VPCFlow",
  "dstAddr": "1.1.1.1",
  "p_enrichment": {
    "lookup_table_name": {
      "dstAddr": {
        "datakey": "datavalue"
      }
    }
  }
}

The test case returns an alert that includes Alert Context with the datakey and datavalue.
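
To trace how alert_context arrives at that result, the sketch below replays the test case with plain dictionaries. The `deep_get` and `udm_path` functions here are simplified stand-ins written for this example, not Panther's implementations, and the hypothetical `PATHS` table encodes only the single destination_ip mapping described above.

```python
# Hypothetical stand-in: unified name -> original path, per log type.
PATHS = {"AWS.VPCFlow": {"destination_ip": "dstAddr"}}


def deep_get(dictionary, *keys, default=None):
    """Simplified nested lookup: walk keys, return default on any miss."""
    current = dictionary
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def udm_path(event: dict, name: str):
    """Stand-in for event.udm_path: return the original path for a unified name."""
    return PATHS.get(event.get("p_log_type"), {}).get(name)


def alert_context(event: dict) -> dict:
    # The unified name resolves to 'dstAddr', which is also the key under
    # which the enrichment data is stored for this event.
    enriched_data = deep_get(
        event, "p_enrichment", "lookup_table_name", udm_path(event, "destination_ip")
    )
    return {"enriched_data": enriched_data}


event = {
    "p_log_type": "AWS.VPCFlow",
    "dstAddr": "1.1.1.1",
    "p_enrichment": {"lookup_table_name": {"dstAddr": {"datakey": "datavalue"}}},
}
print(alert_context(event))  # {'enriched_data': {'datakey': 'datavalue'}}
```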

Testing Data Models

To test a Data Model, write unit tests for a detection that references a Data Model mapping using event.udm() in its rule() logic.
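
For quick local experiments outside of Panther's own test tooling, a rule that calls event.udm() can also be exercised against a small stand-in event object. The sketch below is one way to do that; `FakeEvent` and the `MAPPINGS` dictionary are hypothetical helpers written for this example and are not part of Panther.

```python
class FakeEvent(dict):
    """Hypothetical stand-in for the event object; not Panther's class."""

    def __init__(self, data, mappings):
        super().__init__(data)
        self._mappings = mappings  # unified name -> original field name

    def udm(self, name):
        field = self._mappings.get(name)
        return self.get(field) if field else None


def rule(event):
    # Same logic as the rule shown earlier on this page.
    return event.udm("event_type") == "failed_login"


# Illustrative mapping for an imaginary log type.
MAPPINGS = {"event_type": "outcome", "actor_user": "user"}


def test_rule_matches_failed_login():
    event = FakeEvent({"outcome": "failed_login", "user": "alice"}, MAPPINGS)
    assert rule(event) is True


def test_rule_ignores_other_events():
    event = FakeEvent({"outcome": "successful_login", "user": "alice"}, MAPPINGS)
    assert rule(event) is False


test_rule_matches_failed_login()
test_rule_ignores_other_events()
```

This only checks the rule logic against a hand-written mapping; it does not validate the Data Model specification itself.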

DataModel specification reference

A complete list of DataModel specification fields:

| Field name | Required | Description | Expected value |
| --- | --- | --- | --- |
| AnalysisType | Yes | Indicates whether this specification is defining a rule, policy, data model, or global. | datamodel |
| DataModelID | Yes | The unique identifier of the data model. | String |
| DisplayName | No | What name to display in the UI and alerts. The DataModelID will be displayed if this field is not set. | String |
| Enabled | Yes | Whether this Data Model is enabled. | Boolean |
| Filename | No | The path (with file extension) to the Python DataModel body. | String |
| LogTypes | Yes | Which log type this Data Model will apply to. | Singleton list of strings. Note: Although LogTypes accepts a list of strings, you can only specify one log type per Data Model. |
| Mappings | Yes | Mapping from source field name or method to unified data model field name. | |

DataModel Mappings

Mappings translate LogType fields to unified Data Model fields. Each mapping entry must define a Name and either a Path or a Method. The Path can be a simple field name or a JSON Path. The method must be implemented in the file listed in the Data Model specification's Filename field.

Mappings:
  - Name: source_ip
    Path: srcIp
  - Name: user
    Path: $.events[*].parameters[?(@.name == 'USER_EMAIL')].value
  - Name: event_type
    Method: get_event_type

The example above depicts logic within the user mapping's Path value to check if the USER_EMAIL event field exists. Learn more in Evaluating whether a field exists in Path.
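
The rule that each entry needs a Name plus either a Path or a Method can be checked mechanically. The sketch below is a hypothetical helper, not the validation performed by Panther or PAT, and it assumes that "either" means exactly one of the two keys per entry.

```python
def validate_mappings(mappings: list) -> list:
    """Return a list of problems found; an empty list means no problems."""
    problems = []
    for index, entry in enumerate(mappings):
        if not entry.get("Name"):
            problems.append(f"entry {index}: missing Name")
        # Assumption: exactly one of Path or Method must be present.
        if ("Path" in entry) == ("Method" in entry):
            problems.append(f"entry {index}: needs exactly one of Path or Method")
    return problems


# The mappings from the example above, expressed as Python data.
mappings = [
    {"Name": "source_ip", "Path": "srcIp"},
    {"Name": "user", "Path": "$.events[*].parameters[?(@.name == 'USER_EMAIL')].value"},
    {"Name": "event_type", "Method": "get_event_type"},
]
print(validate_mappings(mappings))  # []

# An entry with a Name but neither Path nor Method is reported.
print(validate_mappings([{"Name": "source_ip"}]))
```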

For more information about jsonpath-ng, see its documentation on pypi.org.

Panther-managed Data Model mapping names

The Panther-supported Data Model mapping names are described below. When creating your own Data Model mappings, you may use the names below, in addition to custom ones.

| Data Model mapping name | Description |
| --- | --- |
| actor_user | ID or username of the user whose action triggered the event. |
| assigned_admin_role | Admin role ID or name assigned to a user in the event. |
| destination_ip | Destination IP for the traffic. |
| destination_port | Destination port for the traffic. |
| event_type | Custom description for the type of event. Out-of-the-box support for event types can be found in the global helper panther_event_type_helpers.py. |
| http_status | Numeric HTTP status code for the traffic. |
| source_ip | Source IP for the traffic. |
| source_port | Source port for the traffic. |
| user_agent | User agent associated with the client in the event. |
| user | ID or username of the user that was acted upon to trigger the event. |
