Transformations

Mutate data structure upon ingest


Overview

Transformations are functions you can use in custom log source schemas to modify the shape of your data upon ingest into Panther. The data will then be stored in the new format.

Transformations help align stored data to the needs of detection and query logic, removing the need for ad-hoc data manipulation and expediting detection writing and search.

The following transformations are available:

  • copy

  • rename

  • concat

  • split

  • mask

  • isEmbeddedJSON

You can further manipulate your data on ingest using the Script Log Parser.

Order of transformation execution

Transformations are performed in a specific order, which ensures they are applied one after another in a predictable manner. The order of execution is the sequence provided in the transformation list in the Overview, above.

Each transformation in the sequence operates on the data in the state left by the previous transformation, so keep this order in mind to maintain consistency and avoid unexpected results.
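For instance, here is a minimal sketch that pairs split with mask on a single field (an assumed pairing; the Combining transformations section below demonstrates the same idea with rename and mask). Because split precedes mask in the order above, the hash is computed on the value that split has already extracted from socket, not on the full socket string:

- name: socket
  type: string
- name: ip
  type: string
  split:
    from: socket
    separator: ":"
    index: 0
  mask:
    type: sha256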

Combining transformations

Individual transformations can be combined in pairs or sequences to achieve more complex data transformations. This gives you greater flexibility to meet specific data requirements and makes detection creation and search more efficient.

Suppose there is a field that contains personal identification numbers (PINs). For security purposes, you want to rename the field to something less revealing while also applying a mask to redact the PINs.

To achieve this, you can use the rename transformation to change the field's name to something abstract. For instance, you could rename the field to userId.

Next, apply the mask transformation to the userId field to replace the digits of the PIN with a predefined number of asterisks. This way, the PIN remains hidden, ensuring data privacy.

You can define a field schema with both a rename and a mask directive like this:

- name: userId
  type: string
  rename:
    from: PIN
  mask:
    type: redact
    to: "*****"

You will transform a payload like this:

{
  "PIN": "1234"
}

To this:

{
  "userId": "*****"
}

rename

The rename transformation changes the name of a field. This can be useful if you want to standardize field names across data sources, improve the clarity of your data's structure, or adjust field names containing invalid characters or reserved keywords.

By defining a field schema with a rename directive, such as:

- name: user
  type: string
  rename: 
    from: "@user"
- name: role
  type: object
  fields:
   - name: level
     type: string
     rename:
       from: type

You will transform a payload like this:

{
  "@user": "john"
  "role": {
    "type": "admin"
  }
}

To this:

{
  "user": "john"
  "role": {
    "level": "admin"
  }
}

copy

The copy transformation copies the value of a nested field into another top-level field. This can be useful if you'd like to flatten your data's JSON structure. If desired, you can then mark your newly defined field as an indicator.

By defining a field schema with a copy directive, such as:

- name: message
  type: string
  copy:
    from: attributes.message
- name: attributes
  type: json

You will transform a payload like this:

{
  "attributes": {
      "message": "hello there", 
      "user": "someone"
  }
}

To this:

{
  "message": "hello there",
  "attributes": {
      "message": "hello there", 
      "user": "someone"
  }
}
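As a further illustration, you could pair copy with the indicators field (used in the split array example later on this page) to flatten a nested IP address and mark the new top-level field as an indicator. This is a minimal sketch with hypothetical field names (client, ip):

- name: ip
  type: string
  indicators: [ip]
  copy:
    from: client.ip
- name: client
  type: json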

concat

The concat transformation allows you to concatenate multiple fields' values into the value of a new field. The resulting combined field can be used, for example, as a key for enrichment.

Fields whose type is timestamp cannot be used in concatenation operations.

To use concat, declare a string field to store the result of the concatenation. Within concat, define the paths, and optionally a separator. Within paths, you must use absolute paths to specify the existing schema fields you'd like to combine. The order of these fields determines the concatenation order. If separator is not defined, the default separator is an empty string ("").

By defining a field schema with a concat directive, such as:

- name: ip
  type: string
- name: ports
  type: object
  fields:
   - name: https
     type: int
- name: socket
  type: string
  concat:
   separator: ":"
   paths: 
    - ip
    - ports.https

You will transform a payload like this:

{
  "ip": "192.168.0.1"
  "ports": {
    "https": 443
  }
}

To this:

{
  "ip": "192.168.0.1"
  "ports": {
    "https": 443
  },
  "socket": "192.168.0.1:443"
}
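If separator were omitted from the concat directive above, the values would be joined with the default empty string (""), leaving the other fields unchanged but producing:

{
  "socket": "192.168.0.1443"
}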

split

The split transformation allows you to extract a specific value from a string field by splitting it based on a separator. The resulting split fields can be treated as individual schema fields, making it possible to designate them as indicators. The split transformation can also help normalize data into standardized fields, making it easier to handle unstructured data formats.

Only fields with a type of string can be split into other fields (i.e., the value of split:from: must be a field that contains type: string).

To use split, declare a field of any primitive type (i.e., excluding object, array, and JSON) to store the result. Within the split directive, include the following required fields:

  • from: Provide the absolute path of the field to be divided.

  • separator: Provide the character to split on.

  • index: Provide the position of the value within the resulting array produced by the split.

By defining a field schema with a split directive, such as:

- name: socket
  type: string
- name: ip
  type: string
  split:
   from: socket
   separator: ":"
   index: 0
- name: port
  type: int
  split:
   from: socket
   separator: ":"
   index: 1  

You will transform a payload like this:

{
  "socket": "192.168.0.1:443"
}

To this:

{
  "socket": "192.168.0.1:443",
  "ip": "192.168.0.1",
  "port": 443
}

You can also use split to split array elements. For example, using the following schema:

- name: traffic
  type: array
  element: 
    type: object
    fields:
      - name: socket
        type: string
      - name: ip
        type: string
        indicators: [ip]
        split: 
          from: traffic.socket
          separator: ":"
          index: 0
      - name: port
        type: int
        split: 
          from: traffic.socket
          separator: ":"
          index: 1
       

You will transform a payload like this:

{
  "traffic": [
   {
     "socket": "192.168.0.1:443"
   },
   {
     "socket": "192.168.0.2:80"
   } 
  ]
}

To this:

{
  "traffic": [
   {
     "socket": "192.168.0.1:443",
     "ip": "192.168.0.1",
     "port": 443
   },
   {
     "socket": "192.168.0.2:80",
     "ip": "192.168.0.2",
     "port": 80
   } 
  ]
}

mask

The mask transformation enables you to conceal sensitive information in your logs. Masking is useful if you need to protect the confidentiality of certain data.

There are two masking techniques:

  • Obfuscation (also known as hashing): This technique hashes data, using an optional salt value. With this technique, the value keeps its referential integrity.

  • Redaction: This technique replaces sensitive values with REDACTED, or some other string you provide. With this technique, the value loses its referential integrity.

Note that masking a certain field means you cannot later use Panther's search tools to query for its original value, but you can search for a hashed value.

Obfuscation (hashing)

Hashing incoming data enhances its security while retaining its usability in the future. To strengthen the protection hashing provides, you can include a salt.

To use obfuscation, on the target field in your schema, include mask. Under mask, include type, and optionally salt.

The value of type is the hashing algorithm you want to use. Supported values include:

  • sha256

  • md5

  • sha1

  • sha512

The value of the optional salt key is a string of your choice. This value is appended to the field's value before it is hashed.

When using mask, the value of the target field's type must always be set as string. The actual input data can be of any type, but type: string is required because, after the value has been masked, it will be stored as a string in the data lake.

By defining a field schema with a mask directive such as:

- name: username
  type: string # Must be set as string (input data can be of any type)
  mask:
    type: sha256 
    salt: random_salt # Optional

You will transform a payload like this:

{
  "username": "john"
}

To this:

{
  "username": "98b4ceb956e9ed4539b0721add25cab0bacce4307cf3140c4430c1513476a3e4"
}

Redaction

Redacting incoming data means replacing it with a predefined value. This technique is useful if you'd like to ensure the sensitive information is not accessible or recoverable.

To use redaction, on the target field in your schema, include mask. Under mask, include type: redact, and optionally to.

The optional to key takes a string value that will replace the actual event value. If to is not included, its default, REDACTED, is used.

When using mask, the value of the target field's type must always be set as string. The actual input data can be of any type, but type: string is required because, after the value has been masked, it will be stored as a string in the data lake.

By defining a field schema with a mask directive such as:

- name: username
  type: string # Must be set as string (input data can be of any type)
  mask:
    type: redact 
    to: "XXXX" # Optional, default: "REDACTED"

You will transform a payload like this:

{
  "username": "john"
}

To this:

{
  "username": "XXXX"
}
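If the optional to key were omitted from the mask directive above, the default replacement value would be used instead:

{
  "username": "REDACTED"
}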

isEmbeddedJSON

Sometimes JSON values are delivered embedded in a string.

To have Panther parse the escaped JSON inside the string, use an isEmbeddedJSON: true flag. This flag is valid for values of type object, array and json.

By defining a field schema with an isEmbeddedJSON directive such as:

- name: message
  type: object
  isEmbeddedJSON: true
  fields:
    - name: foo
      type: string

You will transform a payload like this:

{
  "timestamp": "2021-03-24T18:15:23Z",
  "message": "{\"foo\":\"bar\"}"
}

To this:

{
  "timestamp": "2021-03-24T18:15:23Z",
  "message": {
    "foo": "bar"
  }
}
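The flag works the same way for array and json values. Here is a minimal sketch for an array delivered as an escaped JSON string, using a hypothetical tags field and assuming the flag sits at the field level as in the object example above:

- name: tags
  type: array
  isEmbeddedJSON: true
  element:
    type: string

You would transform a payload like this:

{
  "tags": "[\"alpha\",\"beta\"]"
}

To this:

{
  "tags": [
    "alpha",
    "beta"
  ]
}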
