Data Models
Data Models provide a way to configure a set of unified fields across all log types
Overview
Use Data Models to configure a set of unified fields across all log types, by creating mappings between event fields for various log types and unified Data Model names. You can leverage Panther-managed Data Models, and create custom ones.
Data Models for detections are different from the Panther Unified Data Model fields (also known as Core Fields). To learn more, see Core Fields vs. Data Models in Python detections.
Data Models use case
Suppose you have a detection that checks for a particular source IP address in network traffic logs, and you'd like to use it for multiple log types. These log types might not only span different categories (e.g., DNS, Zeek, Apache), but also different vendors. Without a common logging standard, each of these log types may represent the source IP using a different field name, such as ipAddress, srcIP, or ipaddr. The more log types you'd like to monitor, the more complex and cumbersome the logic of this check becomes. For example, it might look something like:
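A sketch of that per-log-type branching, written in Python. The log type names and field names below are illustrative, not tied to real schemas:

```python
def rule(event):
    # Without Data Models, each log type needs its own field lookup.
    log_type = event.get("p_log_type")
    if log_type == "Vendor.DNS":
        source_ip = event.get("ipAddress")
    elif log_type == "Vendor.Zeek":
        source_ip = event.get("srcIP")
    elif log_type == "Vendor.Apache":
        source_ip = event.get("ipaddr")
    else:
        return False
    return source_ip == "192.0.2.1"
```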
If we instead define a Data Model for each of these log types, we can translate the event's field name to the Data Model name, meaning the detection can simply reference the Data Model version. The above logic then simplifies to:
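With a mapping named source_ip (one of the Panther-managed mapping names listed later on this page) defined for each of those log types, the same check might reduce to something like:

```python
def rule(event):
    # event.udm() resolves "source_ip" to the correct field for each log type.
    return event.udm("source_ip") == "192.0.2.1"
```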
Panther-managed Data Models
By default, Panther comes with built-in Data Models for several log types, such as AWS.S3ServerAccess, AWS.VPCFlow, and Okta.SystemLog. All currently supported Data Models can be found in the panther-analysis repository, here.
The names of the supported Data Model mappings are listed in the Panther-managed Data Model mapping names table, below.
How to create custom Data Models
Custom Data Models can be created in a few ways: in the Panther Console, using the Panther Analysis Tool (PAT), or in the Panther API. See the tabs below for creation instructions for each method.
Your custom Data Model mappings can use the names referenced in Panther-managed Data Models, or your own custom names. Each mapping Name can map to an event field (with Path or Field Path) or a method you define (with Field Method or Method). If you map to a method, you must define the method either in a separate Python file (if working in the CLI workflow), which is referenced in the YAML file using Filename, or in the Python Module field in the Console.
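For example, a custom Data Model written in the CLI workflow might look something like the following sketch. The DataModelID, log type, field names, and method name are illustrative; adjust them to your own schema:

```yaml
AnalysisType: datamodel
DataModelID: Custom.MyApp.Audit        # illustrative ID
DisplayName: MyApp Audit Data Model
Enabled: true
LogTypes:
  - Custom.MyAppAudit                  # illustrative log type
Filename: my_app_audit_data_model.py
Mappings:
  - Name: source_ip                    # maps to an event field
    Path: ipAddress
  - Name: event_type                   # maps to a method in the Python file
    Method: get_event_type
```

And the referenced Python file, which returns one of the unified event types from the global panther_event_type_helpers.py:

```python
# my_app_audit_data_model.py (illustrative)
import panther_event_type_helpers as event_type


def get_event_type(event):
    # Translate this log source's action names to Panther's unified event types.
    if event.get("action") == "user.login_failed":
        return event_type.FAILED_LOGIN
    return None
```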
Each log type can only have one enabled Data Model specified (however, a single Data Model can contain multiple mappings). If you want to change or update an existing Data Model, disable the existing one, then create a new, enabled one.
To create a new Data Model in the Panther Console:
In the left-hand navigation bar of your Panther Console, click Detections.
In the upper-right corner, click Create New.
Under Settings, fill in the form fields.
Display Name: Enter a user-friendly display name for this Data Model.
ID: Enter a unique ID for this Data Model.
Log Type: Select a log type this Data Model should apply to. Only one log type per Data Model is permitted.
Under Data Model Mappings, create Name/Field Path or Name/Field Method pairs.
If you used the Field Method field, define the method(s) in the Python Module (optional) section.
In the upper right corner, click Save.
You can now reference this Data Model in your rules. Learn more in Using Data Models in rules.
Evaluating whether a field exists in Path
Within a Path value, you can include logic that checks whether a certain event field exists. If it does, the mapping will be applied; if it doesn't, the mapping doesn't take effect. For example, take the following Path values from the Panther-managed gsuite_data_model.yml:
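The mappings below are a simplified sketch based on that file; consult the panther-analysis repository for the exact, current values:

```yaml
Mappings:
  - Name: actor_user
    Path: $.actor.email
  - Name: user
    Path: $.events[*].parameters[?(@.name == 'USER_EMAIL')].value
```

Here the filter expression in the user mapping's Path only matches when an events[].parameters entry named USER_EMAIL is present in the event; if no such field exists, the mapping simply returns nothing.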
Using Data Models
Referencing Data Models in a rule
To reference a Data Model field in a rule:
In the rule's YAML file, ensure the LogTypes field contains the LogType of the Data Model you'd like applied.
Add the LogType to all of the Rule's Test cases, in the p_log_type field.
Leverage the event.udm() method in the rule's Python logic, as shown in the sketch below.
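A minimal sketch of the three pieces, assuming an AWS.VPCFlow rule that uses the Panther-managed source_ip mapping; the rule ID, severity, and IP value are illustrative:

```yaml
AnalysisType: rule
RuleID: Custom.VPCFlow.SuspiciousSourceIP   # illustrative
Enabled: true
Filename: suspicious_source_ip.py
LogTypes:
  - AWS.VPCFlow                             # the log type the Data Model applies to
Severity: Medium
Tests:
  - Name: Matching source IP
    ExpectedResult: true
    Log:
      srcAddr: 192.0.2.1                    # VPC Flow source address field
      p_log_type: AWS.VPCFlow               # include the log type in every test case
```

```python
def rule(event):
    # event.udm("source_ip") resolves to the right field for whichever
    # log type in the rule's LogTypes list produced the event.
    return event.udm("source_ip") == "192.0.2.1"
```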
See examples of Data Models in Panther's GitHub repository.
Using Data Models with Enrichment
Panther provides a built-in method on the event object called event.udm_path. It returns the original path that was used for the Data Model.
AWS.VPCFlow logs example
Using event.udm_path('destination_ip') will return 'dstAddr', since this is the path defined in the Data Model for that log type.
The following example uses event.udm_path:
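A sketch of a rule that records both the resolved value and the underlying path in its alert context; the rule logic and context keys are illustrative:

```python
def rule(event):
    # Fire on any event that has a destination IP mapped by the Data Model.
    return event.udm("destination_ip") is not None


def alert_context(event):
    # udm_path() returns the original field path behind the mapping,
    # e.g. 'dstAddr' for AWS.VPCFlow.
    return {
        "destination_ip_path": event.udm_path("destination_ip"),
        "destination_ip": event.udm("destination_ip"),
    }
```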
This test case was used:
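For instance, a minimal AWS.VPCFlow-style test event might look like this (only the relevant fields are shown):

```json
{
  "dstAddr": "203.0.113.10",
  "p_log_type": "AWS.VPCFlow"
}
```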
The test case returns an alert that includes Alert Context with the data key and data value:
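Continuing the illustrative rule above, the Alert Context for that test event would contain something like:

```json
{
  "destination_ip_path": "dstAddr",
  "destination_ip": "203.0.113.10"
}
```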
Testing Data Models
To test a Data Model, write unit tests for a detection that references a Data Model mapping using event.udm() in its rule() logic.
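For example, unit tests for the illustrative destination_ip rule sketched above might look like the following; a second case without the mapped field confirms the rule does not fire:

```yaml
Tests:
  - Name: Destination IP present
    ExpectedResult: true
    Log:
      dstAddr: 203.0.113.10
      p_log_type: AWS.VPCFlow
  - Name: Destination IP absent
    ExpectedResult: false
    Log:
      p_log_type: AWS.VPCFlow
```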
DataModel specification reference
A complete list of DataModel specification fields:
Field name | Required | Description | Expected value
AnalysisType | Yes | Indicates whether this specification is defining a rule, policy, data model, or global | datamodel
DataModelID | Yes | The unique identifier of the data model | String
DisplayName | No | What name to display in the UI and alerts. The DataModelID will be displayed if this field is not set. | String
Enabled | Yes | Whether this Data Model is enabled | Boolean
FileName | No | The path (with file extension) to the Python DataModel body | String
LogTypes | Yes | Which log type this Data Model will apply to. Note: Although LogTypes accepts a list of strings, you can only specify one log type per Data Model. | Singleton list of strings
Mappings | Yes | Mapping from source field name or method to unified data model field name | DataModel Mappings
Mappings
Mappings translate LogType fields to unified Data Model fields. Each mapping entry must define a Name and either a Path or a Method. The Path can be a simple field name or a JSON Path. The method must be implemented in the file listed in the data model specification's Filename field.
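For instance, a pared-down sketch of the Panther-managed GSuite Data Model spec (see the panther-analysis repository for the exact file) combines Path and Method mappings:

```yaml
AnalysisType: datamodel
DataModelID: Standard.GSuite.Reports
Enabled: true
LogTypes:
  - GSuite.Reports
Filename: gsuite_data_model.py
Mappings:
  - Name: actor_user
    Path: $.actor.email
  - Name: event_type
    Method: get_event_type        # implemented in gsuite_data_model.py
  - Name: user
    Path: $.events[*].parameters[?(@.name == 'USER_EMAIL')].value
```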
The example above depicts logic within the user mapping's Path value to check if the USER_EMAIL event field exists. Learn more in Evaluating whether a field exists in Path.
For more information about jsonpath-ng, see pypi.org's documentation here.
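If you want to see how such a filter expression behaves, you can try it locally with jsonpath-ng's extended parser, assuming that parser accepts this filter form; this snippet is only a local sanity check, not something Panther requires you to write:

```python
from jsonpath_ng.ext import parse

# The filter only matches parameters entries whose name is USER_EMAIL.
expression = parse("$.events[*].parameters[?(@.name == 'USER_EMAIL')].value")

event = {
    "events": [
        {"parameters": [{"name": "USER_EMAIL", "value": "alice@example.com"}]}
    ]
}

print([match.value for match in expression.find(event)])  # ['alice@example.com']
```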
Panther-managed Data Model mapping names
The Panther-supported Data Model mapping names are described below. When creating your own Data Model mappings, you may use the names below, in addition to custom ones.
Data Model mapping name | Description
actor_user | ID or username of the user whose action triggered the event.
assigned_admin_role | Admin role ID or name assigned to a user in the event.
destination_ip | Destination IP for the traffic
destination_port | Destination port for the traffic
event_type | Custom description for the type of event. Out-of-the-box support for event types can be found in the global panther_event_type_helpers.py.
http_status | Numeric HTTP status code for the traffic
source_ip | Source IP for the traffic
source_port | Source port for the traffic
user_agent | User agent associated with the client in the event.
user | ID or username of the user that was acted upon to trigger the event.