Data Models
Data Models provide a way to configure a set of unified fields across all log types
Overview
Use Data Models to configure a set of unified fields across all log types, by creating mappings between event fields for various log types and unified Data Model names. You can leverage Panther-managed Data Models, and create custom ones.
Data Models use case
Suppose you have a detection that checks for a particular source IP address in network traffic logs, and you'd like to use it for multiple log types. These log types might not only span different categories (e.g., DNS, Zeek, Apache), but also different vendors. Without a common logging standard, each of these log types may represent the source IP using a different field name, such as ipAddress, srcIP, or ipaddr. The more log types you'd like to monitor, the more complex and cumbersome the logic of this check becomes. For example, it might look something like:
(event.get('ipAddress') == '127.0.0.1' or
event.get('srcIP') == '127.0.0.1' or
 event.get('ipaddr') == '127.0.0.1')

If we instead define a Data Model for each of these log types, we can translate the event's field name to the Data Model name, meaning the detection can simply reference the Data Model version. The above logic then simplifies to:
event.udm('source_ip') == '127.0.0.1'

Panther-managed Data Models
By default, Panther comes with built-in Data Models for several log types, such as AWS.S3ServerAccess, AWS.VPCFlow, and Okta.SystemLog. All currently supported Data Models can be found in the panther-analysis repository.
The names of the supported Data Model mappings are listed in the Panther-managed Data Model mapping names table, below.
How to create custom Data Models
Custom Data Models can be created in a few ways: in the Panther Console, with the Panther Analysis Tool (PAT), or via the Panther API. See the tabs below for creation instructions for each method.
Your custom Data Model mappings can use the names referenced in Panther-managed Data Models, or your own custom names. Each mapping Name can map either to an event field (with Path or Field Path) or to a method you define (with Field Method or Method). If you map to a method, you must define that method either in a separate Python file referenced from the YAML specification's Filename field (in the CLI workflow), or in the Python Module field in the Console.
Each log type can have only one enabled Data Model (however, a single Data Model can contain multiple mappings). If you want to change or update an existing Data Model, disable the existing one, then create a new, enabled one.
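For example, expressed in the CLI workflow's YAML format (described below), a Data Model with one field mapping and one method mapping might look like the following sketch. The ID, log type, and field names here are hypothetical placeholders:

AnalysisType: datamodel
DataModelID: Custom.MyApp.DataModel  # hypothetical ID
DisplayName: My App Data Model
Enabled: true
LogTypes:
  - Custom.MyApp                     # hypothetical log type
Filename: my_app_data_model.py       # required because a Method is used below
Mappings:
  - Name: source_ip                  # Panther-managed mapping name
    Path: clientAddr                 # hypothetical field in this log's schema
  - Name: event_type
    Method: get_event_type           # defined in my_app_data_model.py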
To create a new Data Model in the Panther Console:
In the left-hand navigation bar of your Panther Console, click Detections.
Click the Data Models tab.

In the upper-right corner, click Create New.
Under Settings, fill in the form fields.
Display Name: Enter a user-friendly display name for this Data Model.
ID: Enter a unique ID for this Data Model.
Log Type: Select a log type this Data Model should apply to. Only one log type per Data Model is permitted.
Enabled: Select whether you'd like this Data Model enabled or disabled.

Under Data Model Mappings, create Name/Field Path or Name/Field Method pairs.
If you used the Field Method field, define the method(s) in the Python Module (optional) section.
In the upper right corner, click Save.
You can now reference this Data Model in your rules. Learn more in Referencing Data Models in a rule.
How to create a Data Model in the CLI workflow
Folder setup
All files related to your custom Data Models must be stored in a folder with a name containing data_models (this could be a top-level data_models directory, or sub-directories with names matching *data_models*).
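For example, a repository layout like the following would be picked up (the file names here are illustrative):

.
├── data_models/
│   ├── aws_cloudtrail_datamodel.yml
│   └── aws_cloudtrail_data_model.py
└── rules/
    └── my_new_rule.yml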
File setup
Create your Data Model specification YAML file (e.g., data_models/aws_cloudtrail_datamodel.yml):

AnalysisType: datamodel
LogTypes:
  - AWS.CloudTrail
DataModelID: AWS.CloudTrail
Filename: aws_cloudtrail_data_model.py
Enabled: true
Mappings:
  - Name: actor_user
    Path: $.userIdentity.userName
  - Name: event_type
    Method: get_event_type
  - Name: source_ip
    Path: sourceIPAddress
  - Name: user_agent
    Path: userAgent
Set AnalysisType to datamodel.
For LogTypes, provide the name of one of your log types. Despite this field taking a list, only one log type per Data Model is supported.
Provide a value for the DataModelID field.
Within Mappings, create Name/Path or Name/Method pairs. Learn more about Mappings syntax below, in DataModel Mappings.
See Data Model Specification Reference below for a complete list of required and optional fields.
If you included one or more Method fields within Mappings, create an associated Python file (data_models/aws_cloudtrail_data_model.py), and define any referenced methods. In this case, you must also add the Filename field to the Data Model YAML file. If no Method fields are present, no Python file or Filename field is required.

from panther_base_helpers import deep_get

def get_event_type(event):
    if event.get('eventName') == 'ConsoleLogin' and deep_get(event, 'userIdentity', 'type') == 'IAMUser':
        if event.get('responseElements', {}).get('ConsoleLogin') == 'Failure':
            return "failed_login"
        if event.get('responseElements', {}).get('ConsoleLogin') == 'Success':
            return "successful_login"
    return None
Upload your Data Model to your Panther instance using the PAT upload command.

You can now reference this Data Model in your rules. Learn more in Referencing Data Models in a rule.
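For example, from the root of your analysis repository (assuming your Data Model files live under the current directory):

panther_analysis_tool upload --path .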
How to create a Data Model using the Panther API
See the POST operation on Data Models.
Evaluating whether a field exists in Path
Within a Path value, you can include logic that checks whether a certain event field exists. If it does, the mapping will be applied; if it doesn't, the mapping doesn't take effect.
For example, take the following mapping from the Panther-managed gsuite_data_model.yml:
- Name: assigned_admin_role
  Path: $.events[*].parameters[?(@.name == 'ROLE_NAME')].value

Using Data Models
Referencing Data Models in a rule
To reference a Data Model field in a rule:
In a rule's YAML file, ensure the LogTypes field contains the log type of the Data Model you'd like applied:

AnalysisType: rule
DedupPeriodMinutes: 60
DisplayName: DataModel Example Rule
Enabled: true
Filename: my_new_rule.py
RuleID: DataModel.Example.Rule
Severity: High
LogTypes:
  # Add LogTypes where this rule is applicable
  # and a Data Model exists for that LogType
  - AWS.CloudTrail
Tags:
  - Tags
Description: >
  This rule exists to validate the CLI workflows of the Panther CLI
Runbook: >
  First, find out who wrote the spec format, then notify them with feedback.
Tests:
  - Name: test rule
    ExpectedResult: true
    # Add the LogType to the test specification in the 'p_log_type' field
    Log: {
      "p_log_type": "AWS.CloudTrail"
    }

Add the log type to all of the rule's Tests, in the p_log_type field.

Use the event.udm() method in the rule's Python logic:

def rule(event):
    # filter events on unified data model field
    return event.udm('event_type') == 'failed_login'

def title(event):
    # use unified data model field in title
    return '{}: User [{}] from IP [{}] has exceeded the failed logins threshold'.format(
        event.get('p_log_type'), event.udm('actor_user'), event.udm('source_ip'))
Using Data Models with Enrichment
Panther provides a built-in method on the event object called event.udm_path(). It returns the original path that was used for the Data Model.
AWS.VPCFlow logs example
In the example below, calling event.udm_path('destination_ip') will return 'dstAddr', since this is the path defined in the Data Model for that log type.
from panther_base_helpers import deep_get

def rule(event):
    return True

def title(event):
    return event.udm_path('destination_ip')

def alert_context(event):
    enriched_data = deep_get(event, 'p_enrichment', 'lookup_table_name', event.udm_path('destination_ip'))
    return {'enriched_data': enriched_data}

To test this, we can use this test case:
{
  "p_log_type": "AWS.VPCFlow",
  "dstAddr": "1.1.1.1",
  "p_enrichment": {
    "lookup_table_name": {
      "dstAddr": {
        "datakey": "datavalue"
      }
    }
  }
}

The test case returns the following alert, with Alert Context containing the value of dstAddr ({"datakey": "datavalue"}) as the value of enriched_data.

Testing Data Models
To test a Data Model, write unit tests for a detection that references a Data Model mapping using event.udm() in its rule() logic.
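For example, a unit test for the example rule above might look like the following sketch; the event fields assume the AWS.CloudTrail Data Model defined earlier on this page, whose get_event_type method returns "failed_login" for failed console logins:

Tests:
  - Name: Failed console login matches
    ExpectedResult: true
    Log: {
      "p_log_type": "AWS.CloudTrail",
      "eventName": "ConsoleLogin",
      "userIdentity": {"type": "IAMUser", "userName": "example-user"},
      "responseElements": {"ConsoleLogin": "Failure"},
      "sourceIPAddress": "192.0.2.1",
      "userAgent": "Mozilla/5.0"
    }

When the test runs, event.udm('event_type') resolves through the Data Model's get_event_type method, so rule() returns true and the test passes.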
DataModel specification reference
A complete list of DataModel specification fields:
| Field name | Required | Description | Expected value |
| --- | --- | --- | --- |
| AnalysisType | Yes | Indicates whether this specification is defining a rule, policy, data model, or global | datamodel |
| DataModelID | Yes | The unique identifier of the Data Model | String |
| DisplayName | No | What name to display in the UI and alerts. The DataModelID will be displayed if this field is not set. | String |
| Enabled | Yes | Whether this Data Model is enabled | Boolean |
| Filename | No | The path (with file extension) to the Python Data Model body | String |
| LogTypes | Yes | Which log type this Data Model will apply to | Singleton list of strings. Note: Although LogTypes accepts a list of strings, you can only specify one log type per Data Model. |
| Mappings | Yes | Mapping from source field name or method to unified Data Model field name | DataModel Mappings |
DataModel Mappings

Mappings translate log type fields to unified Data Model fields. Each Mappings entry must define:
- Name: How you will reference this Data Model field in detections.
And one of:
- Path: The path to the field in the original log type's schema. This value can be a simple field name or a JSON path. For more information about jsonpath-ng, see its documentation on pypi.org.
- Method: The name of the method. The method must be defined in the file listed in the Data Model specification's Filename field.
Example:
Mappings:
  - Name: source_ip
    Path: srcIp
  - Name: user
    Path: $.events[*].parameters[?(@.name == 'USER_EMAIL')].value
  - Name: event_type
    Method: get_event_type

The Path value of the user mapping has logic that checks if the USER_EMAIL event field exists. Learn more in Evaluating whether a field exists in Path.
Panther-managed Data Model mapping names
The Panther-managed Data Model mapping names are described below. When creating your own Data Model mappings, you may use the names below, in addition to custom ones.
| Data Model mapping name | Description |
| --- | --- |
| actor_user | ID or username of the user whose action triggered the event. |
| assigned_admin_role | Admin role ID or name assigned to a user in the event. |
| destination_ip | Destination IP for the traffic. |
| destination_port | Destination port for the traffic. |
| event_type | Custom description for the type of event. Out-of-the-box support for event types can be found in the global helper panther_event_type_helpers.py (see the sketch after this table). |
| http_status | Numeric HTTP status code for the traffic. |
| source_ip | Source IP for the traffic. |
| source_port | Source port for the traffic. |
| user_agent | User agent associated with the client in the event. |
| user | ID or username of the user that was acted upon to trigger the event. |
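To illustrate the event_type mapping, a Data Model method can return standardized values from the panther_event_type_helpers global rather than raw strings. A minimal sketch, assuming the CloudTrail example from earlier on this page and the FAILED_LOGIN/SUCCESSFUL_LOGIN constants defined in that helper:

import panther_event_type_helpers as event_type
from panther_base_helpers import deep_get

def get_event_type(event):
    # Return a standardized event type constant instead of a raw string
    if event.get('eventName') == 'ConsoleLogin' and deep_get(event, 'userIdentity', 'type') == 'IAMUser':
        if deep_get(event, 'responseElements', 'ConsoleLogin') == 'Failure':
            return event_type.FAILED_LOGIN
        if deep_get(event, 'responseElements', 'ConsoleLogin') == 'Success':
            return event_type.SUCCESSFUL_LOGIN
    return None

Detections that reference event.udm('event_type') can then compare against the same constants, keeping rules consistent across log types.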