Custom Lookup Tables

Enrich events with your own stored data

Overview

Custom Lookup Tables (also referred to as simply "Lookup Tables") allow you to store and reference custom enrichment data in Panther. This means you can reference this added context in detections and alerts. It may be particularly useful to create Lookup Tables containing identity/asset information, vulnerability context, or network maps.

You can associate one or more log types with your Lookup Table, and then all logs of those types will contain enrichment data from your Lookup Table. See an example of writing a detection that references Lookup Table data below.

Note that there are also Panther-managed Lookup Tables, including Enrichment Providers like GreyNoise and IPinfo, as well as Identity Provider Profiles. Consider using Global helpers instead when extra information is only needed for a few specific detections and will not be frequently updated.

To increase the limit on the number of Lookup Tables or the size of Lookup Tables in your account, please contact your Panther support team.

How Lookup Tables work

Your configured Lookup Tables are associated with one or more log types, connected by foreign key fields called Selectors. Enrichment happens before log events reach the detections engine, so every incoming log event with a match in your Lookup Table is enriched. If a match is found, a p_enrichment field is appended to the event, which you can access within a detection using deep_get() or DeepKey. The p_enrichment field will contain:

One or more Lookup Table name(s) that matched the incoming log event
The name of the selector from the incoming log that matched the Lookup Table
The data from the Lookup Table that matched via the Lookup Table's primary key (including an injected p_match field containing the selector value that matched)

This is the structure of p_enrichment fields:

'p_enrichment': {
    <name of lookup table1>: {
        <name of selector>: {
            'p_match': <value of selector>,
            <lookup key>: <lookup value>,
            ...
        }
    }
}

How is data matched between logs and Lookup Tables?

When configuring a Lookup Table, you will define the following:

Primary key: a column in the Lookup Table
Selector(s): one or more fields within the log type(s) the Lookup Table is associated with

When Panther parses one of the associated logs, it compares the value of the selector(s) in the event with the values of the primary key in the Lookup Table. If a match is found, Panther adds the corresponding row from the Lookup Table to the log event's p_enrichment struct, and injects a p_match field containing the value that matched.
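
The matching step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Panther's implementation: the enrich function, its parameters, and the sample account_metadata row are hypothetical, and the sketch only handles scalar selector values.

```python
from typing import Any


def enrich(event: dict[str, Any], table_name: str, rows: list[dict[str, Any]],
           primary_key: str, selectors: list[str]) -> dict[str, Any]:
    """Return a copy of `event` with a p_enrichment struct for matching selectors."""
    # Index rows by primary key value. An array-valued primary key
    # indexes the same row once per element.
    index: dict[Any, dict[str, Any]] = {}
    for row in rows:
        key = row[primary_key]
        for value in key if isinstance(key, list) else [key]:
            index[value] = row

    # Compare each selector value in the event with the primary key values.
    matches: dict[str, Any] = {}
    for selector in selectors:
        value = event.get(selector)
        if isinstance(value, (str, int, float)) and value in index:
            # Attach the matching row plus the injected p_match field.
            matches[selector] = {**index[value], "p_match": value}

    enriched = dict(event)
    if matches:
        enriched["p_enrichment"] = {table_name: matches}
    return enriched


# Hypothetical table row and event, for illustration only.
rows = [{"accountID": "90123456", "isProduction": False}]
event = {"recipientAccountId": "90123456", "eventName": "ConsoleLogin"}
enriched = enrich(event, "account_metadata", rows,
                  primary_key="accountID",
                  selectors=["accountID", "recipientAccountId"])
```

Here only the recipientAccountId selector is present in the event, so the matching row is nested under that selector name together with p_match.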

Prerequisites for configuring a Lookup Table

A schema specifically for your Lookup Table data.
- This describes the shape of your Lookup Table data.
Selector(s) from your incoming logs.
- The values from these selectors will be used to search for matches in your Lookup Table data.
A primary key for your Lookup Table data.
- This primary key is one of the fields you defined in your Lookup Table's schema. The value of the primary key is what will be compared with the value of the selector(s) from your incoming logs.
- See the below Primary key data types section to learn more about primary key requirements.
For local development and CI/CD: ensure you have the necessary configuration files in your environment.
- We recommend you make a fork of the panther-analysis repo and install the panther_analysis_tool.

Primary key data types

Your Lookup Table's primary key column must have one of the following data types:

String
Number
Array (of strings or numbers)
- Using an array lets you associate one row in your Lookup Table with multiple string or number primary key values. This prevents you from having to duplicate a certain row of data for multiple primary keys.

Example: string array vs. string primary key type

Perhaps you'd like to store user data in a Lookup Table so that incoming log events associated with a certain user are enriched with additional personal information. You'd like to match on the user's email address, which means the email field will be the primary key in the Lookup Table and the selector in the log events.

You are deciding whether the primary key column in your Lookup Table should be of type string or string array. First, review the below two events you might expect to receive from your log source:

# Incoming log event one
{
    "actor_email": "[email protected]",
    "action": "LOGIN"
}

# Incoming log event two
{
    "actor_email": "[email protected]",
    "action": "EXPORT_FILE"
}

Note that the two email addresses ([email protected] and [email protected]) belong to the same user, Jane Doe.

When Panther receives these events, you would like to use a Lookup Table to enrich each of them with Jane's full name and role. After enrichment, these events would look like the following:

# Log event one after enrichment
{
    "actor_email": "[email protected]",
    "action": "LOGIN",
    "p_enrichment": {
        "<lookup_table_name>": {
            "actor_email": {
                "full_name": "Jane Doe",
                "p_match": "[email protected]",
                "role": "ADMIN"
            }
        }
    }
}

# Log event two after enrichment
{
    "actor_email": "[email protected]",
    "action": "EXPORT_FILE",
    "p_enrichment": {
        "<lookup_table_name>": {
            "actor_email": {
                "full_name": "Jane Doe",
                "p_match": "[email protected]",
                "role": "ADMIN"
            }
        }
    }
}

You can accomplish this enrichment by defining a Lookup Table with either:

(Recommended) A primary key column that is of type array of strings
A primary key column that is of type string

Using a Lookup Table with a primary key column that is of type array of strings, you can include Jane's multiple email addresses in one primary key entry, associated to one row of data. This might look like the following:

| email (primary_key) | full_name | role |
| --- | --- | --- |
| ["[email protected]", "[email protected]"] | "Jane Doe" | "ADMIN" |

Alternatively, you can define a Lookup Table with a primary key column that is of type string. However, because the match between the event and Lookup Table is made on the user's email address, and a user can have multiple email addresses (as is shown in Jane's case), you must duplicate the Lookup Table row for each email. This would look like the following:

| email (primary_key) | full_name | role |
| --- | --- | --- |
| "[email protected]" | "Jane Doe" | "ADMIN" |
| "[email protected]" | "Jane Doe" | "ADMIN" |

While both options yield the same result (i.e., incoming log events are enriched in the same way), defining a Lookup Table with an array of strings primary key is recommended for its convenience and reduced proneness to maintenance error.
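
To make the equivalence of the two table layouts concrete, here is a small Python sketch. The build_index helper is hypothetical, and the email addresses are placeholder values invented for this example (they are not the addresses from the events above). It expands each primary key value into its own index entry and compares the resulting lookups.

```python
def build_index(rows, primary_key):
    """Map every primary key value to the non-key columns of its row."""
    index = {}
    for row in rows:
        key = row[primary_key]
        # An array primary key contributes one entry per element.
        values = key if isinstance(key, list) else [key]
        data = {k: v for k, v in row.items() if k != primary_key}
        for value in values:
            index[value] = data
    return index


# Placeholder addresses, assumed for illustration only.
array_table = [
    {"email": ["jane@example.com", "jdoe@example.com"],
     "full_name": "Jane Doe", "role": "ADMIN"},
]
string_table = [
    {"email": "jane@example.com", "full_name": "Jane Doe", "role": "ADMIN"},
    {"email": "jdoe@example.com", "full_name": "Jane Doe", "role": "ADMIN"},
]

same_lookups = build_index(array_table, "email") == build_index(string_table, "email")
```

Both layouts resolve either address to the same data, but the array layout keeps Jane's details in a single row, so there is only one place to update.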

How to configure a Lookup Table

After fulfilling the prerequisites, Lookup Tables can be created and configured using either of the following methods:

Import via file upload
Sync via S3 source

After choosing one of these methods, you can opt to work within the Panther Console or with the Panther Analysis Tool (PAT).

The maximum size for a row in a Lookup Table is 65535 bytes.
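
Before uploading, it can be useful to flag rows that are likely to exceed this limit. The sketch below is a rough pre-check only: it assumes the size of a row can be approximated by the UTF-8 length of its compact JSON encoding, which may not be how Panther measures it, and the oversized_rows helper is hypothetical.

```python
import json

MAX_ROW_BYTES = 65535  # row size limit stated above


def oversized_rows(rows):
    """Return the indexes of rows whose JSON encoding exceeds MAX_ROW_BYTES."""
    too_big = []
    for i, row in enumerate(rows):
        # Assumption: compact JSON in UTF-8 approximates the stored row size.
        size = len(json.dumps(row, separators=(",", ":")).encode("utf-8"))
        if size > MAX_ROW_BYTES:
            too_big.append(i)
    return too_big


# Hypothetical rows: the second one carries an unusually large field.
rows = [
    {"accountID": "90123456", "notes": "ok"},
    {"accountID": "12345678", "notes": "x" * 70000},
]
flagged = oversized_rows(rows)
```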

Option 1: Import via file upload

This option is best for data that is relatively static, such as information about AWS accounts or corporate subnets. For example, you could upload metadata that distinguishes developer accounts from production accounts in your AWS CloudTrail logs.

You can import via file upload through the Panther Console or with PAT:

Panther Console import via file upload

Log in to the Panther Console.
From the left sidebar, click Configure > Lookup Tables.
In the upper right side of the page, click Create New to add a new Lookup Table.
Configure the Lookup Table Basic Information:
- Enter a descriptive Lookup Name.
  - For example, account_metadata.
- Enter a Description (optional) and a Reference (optional). Description is meant for content about the table, while Reference can be used to hyperlink to an internal resource.
- Next to Enabled? toggle the setting to Yes. Note: This is required to import your data later in this process.
Click Continue.
Configure the Associated Log Types:
- Select the Log Type from the dropdown.
- Type in the name of the Selectors, the foreign key fields from the log type you want enriched with your Lookup Table.
- You also can reference attributes in nested objects using JSON path syntax. For example, if you wanted to reference a field in a map you could do $.field.subfield.
- Click Add Log Type to add another if needed. For example, you might select AWS.CloudTrail logs and type in accountID and recipientAccountID to represent keys in the CloudTrail logs.
Click Continue.
Configure the Table Schema. Note: If you have not already created a new schema, please see our documentation on creating schemas. You can also use your Lookup Table data to infer a schema. Once you have created a schema, you will be able to choose it from the dropdown on the Table Schema page while configuring a Lookup Table. Note: CSV schemas require column headers to work with Lookup Tables.
- Select a Schema Name from the dropdown.
- Select a Primary Key Name from the dropdown. This should be a unique column on the table, such as accountID.
Click Continue.
Drag and drop a file or click Select File to choose the file of your Lookup Table data to import. The file must be in .csv or .jsonl format.
Click Finish Setup. A source setup success page will populate.
Optionally, next to Set an alarm in case this lookup table doesn't receive any data?, toggle the setting to YES to enable an alarm.
- Fill in the Number and Period fields to indicate how often Panther should send you this notification.
- The alert destinations for this alarm are displayed at the bottom of the page. To configure and customize where your notification is sent, see documentation on Panther Destinations.
Note: Notifications generated for a Lookup Table upload failing are accessible in the System Errors tab within the Alerts & Errors page in the Panther Console.

Once finished, you should be returned to the Lookup Table overview screen. Ensure that your new Lookup Table is listed.
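
The data file from the upload step can be produced with a short script. The sketch below writes the same rows as CSV (with the header row that CSV schemas require) and as JSON Lines. The to_csv and to_jsonl helpers and the environment column are hypothetical, and the column names must match whatever schema you selected.

```python
import csv
import io
import json


def to_csv(rows, columns):
    """Serialize rows as CSV text, starting with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_jsonl(rows):
    """Serialize rows as JSON Lines text: one JSON object per line."""
    return "".join(json.dumps(row) + "\n" for row in rows)


# Hypothetical account metadata rows.
rows = [
    {"accountID": "90123456", "environment": "dev"},
    {"accountID": "12345678", "environment": "prod"},
]
csv_text = to_csv(rows, ["accountID", "environment"])
jsonl_text = to_jsonl(rows)
```

Write csv_text to a .csv file or jsonl_text to a .jsonl file, then select that file in the import step.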

PAT import via file upload

Note: Uploading data via the Panther Analysis Tool is suitable only for small (< 1MB), mostly static lookup data. For larger or frequently changing files, we recommend using S3 to deliver them.

File setup

A Lookup Table requires the following files:

A YAML specification file containing the configuration for the table
A YAML file defining the schema to use when loading data into the table
A JSON or CSV file containing data to load into the table (optional, read further).

Folder setup

All files related to your Lookup Tables must be stored in a folder with a name containing lookup_tables. This could be a top-level lookup_tables directory, or sub-directories with names matching *lookup_tables*. You can use the Panther Analysis repo as a reference.

Writing the configuration files

It's usually prudent to begin writing the schema config first, because the table config will reference some of those values.

Create a YAML file for the schema, and save it with the rest of your custom schemas, outside the lookup_tables directory (for example, /schemas in the root of your panther analysis repo). This schema defines how to read the files you'll use to upload data to the table. If using a CSV file for data, then the schema should be able to parse CSV. The table schema is formatted the same as a log schema. For more information on writing schemas, read our documentation around managing Log Schemas.

Next, create a YAML file for the table configuration. For a Lookup Table with data stored in a local file, an example configuration would look like:

AnalysisType: lookup_table
LookupName: my_lookup_table # A unique display name
Schema: Custom.MyTableSchema # The schema defined in the previous step
Filename: ./my_lookup_table_data.csv # Relative path to data
Description: >
  A handy description of what information this table contains.
  For example, this table might convert IP addresses to hostnames
Reference: >
  A URL to some additional documentation around this table
Enabled: true # Set to false to stop using the table
LogTypeMap:
  PrimaryKey: ip                # The primary key of the table
  AssociatedLogTypes:           # A list of log types to match this table to
    - LogType: AWS.CloudTrail
      Selectors:
        - "sourceIPAddress"     # A field in CloudTrail logs
        - "p_any_ip_addresses"  # A panther-generated field works too
    - LogType: Okta.SystemLog
      Selectors:
        - "$.client.ipAddress"  # Paths to JSON values are allowed

Upload the schema file by running the update schemas command from the repository root: panther_analysis_tool update-custom-schemas ./schemas
Finally, from the root of the repo, upload the Lookup Table: panther_analysis_tool upload
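
The Selectors in the example configuration mix plain field names with a JSON path such as $.client.ipAddress. As a rough mental model, the sketch below resolves both forms against an event. It is an assumption-laden illustration: resolve_selector is a hypothetical helper that only understands simple `$.a.b` paths, and Panther's own path handling may support more syntax.

```python
def resolve_selector(event, selector):
    """Resolve a plain field name or a simple `$.a.b` path; None if absent."""
    if selector.startswith("$."):
        parts = selector[2:].split(".")
    else:
        parts = [selector]
    current = event
    for part in parts:
        # Stop as soon as the path leaves the nested dictionaries.
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


# Hypothetical, heavily trimmed events.
okta_event = {"client": {"ipAddress": "203.0.113.7"}}
cloudtrail_event = {"sourceIPAddress": "198.51.100.4"}

okta_ip = resolve_selector(okta_event, "$.client.ipAddress")
cloudtrail_ip = resolve_selector(cloudtrail_event, "sourceIPAddress")
```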

Update Lookup Tables via Panther Analysis Tool:

Locate the YAML configuration file for the Lookup Table in question.
Open the file, and look for the field Filename. You should see a file path which leads to the data file.
Update or replace the file indicated in Filename.
Push your changes to Panther with the following command:
```
panther_analysis_tool upload
```
Optionally, you can upload only Lookup Tables:
```
panther_analysis_tool upload --filter AnalysisType=lookup_table
```

Option 2: Sync via S3 source

In some cases, you may want to sync from an S3 source to set up a Lookup Table. For example, you might want to know what groups and permission levels are associated with the employees at your company. In this scenario, your company might have an AWS S3 source with an up-to-date copy of its Active Directory listing that includes groups and permissions information.

This option is best for a larger amount of data that updates more frequently from an S3 bucket. Any changes in the S3 bucket will sync to Panther.

You can sync via S3 through the Panther Console or with PAT:

Panther Console sync via S3

Log in to the Panther Console.
In the left sidebar, click Configure > Lookup Tables.
In the upper right side of the page, click Create New to add a new Lookup Table.
Configure the Lookup Table Basic Information:
- Enter a descriptive Lookup Name.
- Enter a Description (optional) and a Reference (optional).
  - Description is meant for content about the table, while Reference can be used to hyperlink to an internal resource.
- Make sure the Enabled? toggle is set to Yes.
  - Note: This is required to import your data later in this process.
Click Continue.
Configure the Associated Log Types:
- Select the Log Type from the dropdown.
- Type in the name of the Selectors, the foreign key fields from the log type you want enriched with your Lookup Table.
- Click Add Log Type to add another if needed. For example, you might select AWS.VPCFlow logs and type in account to represent keys in the VPC Flow logs.
Click Continue.
Configure the Table Schema. Note: If you have not already created a new schema, please see our documentation on creating schemas. Once you have created a schema, you will be able to select it from the dropdown on the Table Schema page while configuring a Lookup Table.
1. Select a Schema Name from the dropdown.
2. Select a Primary Key Name from the dropdown. This should be a unique column on the table, such as accountID.
Click Continue.
On the "Choose Import Method" page, click Set Up next to "Sync Data from an S3 Bucket."
Set up your S3 source.
- Enter the Account ID, the 12-digit AWS Account ID where the S3 bucket is located.
- Enter the S3 URI, the unique path that identifies the specific S3 bucket.
- Optionally, enter the KMS Key if your data is encrypted using SSE-KMS.
- Enter the Update Period, the cadence at which your S3 source is updated (defaults to 1 hour).
Click Continue.
Set up an IAM Role.
- Please see the next section, Creating an IAM Role, for instructions on the three options available to do this.
Click Finish Setup. A source setup success page will populate.
Optionally, next to Set an alarm in case this lookup table doesn't receive any data?, toggle the setting to YES to enable an alarm.
- Fill in the Number and Period fields to indicate how often Panther should send you this notification.
- The alert destinations for this alarm are displayed at the bottom of the page. To configure and customize where your notification is sent, see documentation on Panther Destinations.

Note: Notifications generated for a Lookup Table upload failing are accessible in the System Errors tab within the Alerts & Errors page in the Panther Console.

Creating an IAM Role

There are three options for creating an IAM Role to use with your Panther Lookup Table using an S3 source:

Create an IAM role using AWS Console UI

On the "Set Up an IAM role" page, during the process of creating a Lookup Table with an S3 source, locate the tile labeled "Using the AWS Console UI". On the right side of the tile, click Select.
Click Launch Console UI.
- You will be redirected to the AWS console in a new browser tab, with the template URL pre-filled.
- The CloudFormation stack will create an AWS IAM role with the minimum required permissions to read objects from your S3 bucket.
- Click the "Outputs" tab of the CloudFormation stack in AWS, and note the Role ARN.
Navigate back to your Panther account.
On the "Use AWS UI to set up your role" page, enter the Role ARN.
Click Finish Setup.

Create an IAM role using CloudFormation Template File

On the "Set Up an IAM role" page, during the process of creating a Lookup Table with an S3 source, locate the tile labeled "CloudFormation Template File". On the right side of the tile, click Select.
Click CloudFormation template, which downloads the template to apply it through your own pipeline.
Upload the template file in AWS:
1. Open your AWS console and navigate to the CloudFormation product.
2. Click Create stack.
3. Click Upload a template file and select the CloudFormation template you downloaded.
On the "CloudFormation Template" page in Panther, enter the Role ARN.
Click Finish Setup.

Create an IAM role manually

On the "Set Up an IAM role" page, during the process of creating a Lookup Table with an S3 source, click the link that says I want to set everything up on my own.

Create the required IAM role. You may create the required IAM role manually or through your own automation. The role must be named using the format PantherLUTsRole-${Suffix} (e.g., PantherLUTsRole-MyLookupTable).

The IAM role policy must include the statements defined below:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "s3:GetBucketLocation",
            "Resource": "arn:aws:s3:::<bucket-name>",
            "Effect": "Allow"
        },
        {
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::<bucket-name>/<input-file-path>",
            "Effect": "Allow"
        }
    ]
}

If your S3 bucket is configured with server-side encryption using AWS KMS, you must include an additional statement granting the Panther API access to the corresponding KMS key. In this case, the policy will look something like this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "s3:GetBucketLocation",
            "Resource": "arn:aws:s3:::<bucket-name>",
            "Effect": "Allow"
        },
        {
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::<bucket-name>/<input-file-path>",
            "Effect": "Allow"
        },
        {
            "Action": ["kms:Decrypt", "kms:DescribeKey"],
            "Resource": "arn:aws:kms:<region>:<your-account-id>:key/<kms-key-id>",
            "Effect": "Allow"
        }
    ]
}
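
If you create the role through your own automation, you may prefer to generate the policy document rather than edit JSON by hand. The following Python sketch builds the statements listed above and adds the KMS statement only when a key ARN is supplied. The build_policy helper, the bucket name, and the object path are hypothetical placeholders.

```python
import json


def build_policy(bucket, object_path, kms_key_arn=None):
    """Build the Lookup Table role policy as a Python dict."""
    statements = [
        {
            "Action": "s3:GetBucketLocation",
            "Resource": f"arn:aws:s3:::{bucket}",
            "Effect": "Allow",
        },
        {
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/{object_path}",
            "Effect": "Allow",
        },
    ]
    if kms_key_arn:
        # Only needed when the bucket uses server-side encryption with AWS KMS.
        statements.append({
            "Action": ["kms:Decrypt", "kms:DescribeKey"],
            "Resource": kms_key_arn,
            "Effect": "Allow",
        })
    return {"Version": "2012-10-17", "Statement": statements}


# Placeholder bucket and path, assumed for illustration.
policy = build_policy("my-lookup-bucket", "lookups/employees.csv")
policy_json = json.dumps(policy, indent=4)
```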

On the "Setting up role manually" page in Panther, enter the Role ARN.
- This can be found in the "Outputs" tab of the CloudFormation stack in your AWS account.
Click Finish Setup, and you will be redirected to the Lookup Tables list page with your new Employee Directory table listed.

PAT sync via S3

File setup

A Lookup Table requires the following files:

A YAML specification file containing the configuration for the table
A YAML file defining the schema to use when loading data into the table
A JSON or CSV file containing data to load into the table (optional, read further).

Folder setup

Writing the configuration files

It's usually prudent to begin writing the schema config first, because the table config will reference some of those values.

Create a YAML file for the schema, and save it in the lookup table directory (for example, lookup_tables/my_table/my_table_schema.yml). This schema defines how to read the files you'll use to upload data to the table. If using a CSV file for data, then the schema should be able to parse CSV. The table schema is formatted the same as a log schema. For more information on writing schemas, read our documentation around managing Log Schemas.

Next, create a YAML file for the table configuration. For a Lookup Table with data stored in a file in S3, an example configuration would look like this:

AnalysisType: lookup_table
LookupName: my_lookup_table # A unique display name
Schema: Custom.MyTableSchema # The schema defined in the previous step
Refresh:
  RoleArn: arn:aws:iam::123456789012:role/PantherLUTsRole-my_lookup_table # A role in your organization's AWS account
  ObjectPath: s3://path/to/my_lookup_table_data.csv
  PeriodMinutes: 120 # Sync from S3 every 2 hours
Description: >
  A handy description of what information this table contains.
  For example, this table might convert IP addresses to hostnames
Reference: >
  A URL to some additional documentation around this table
Enabled: true # Set to false to stop using the table
LogTypeMap:
  PrimaryKey: ip                # The primary key of the table
  AssociatedLogTypes:           # A list of log types to match this table to
    - LogType: AWS.CloudTrail
      Selectors:
        - "sourceIPAddress"     # A field in CloudTrail logs
        - "p_any_ip_addresses"  # A panther-generated field works too
    - LogType: Okta.SystemLog
      Selectors:
        - "$.client.ipAddress"  # Paths to JSON values are allowed

Upload the schema file by running the update schemas command from the repository root: panther_analysis_tool update-custom-schemas ./schemas
Finally, from the root of the repo, upload the Lookup Table: panther_analysis_tool upload

Prerequisites

Before you can configure your Lookup Table to sync with S3, you'll need to have the following ready:

The ARN of an IAM role in AWS, which Panther can use to access the S3 bucket. For more information on setting up an IAM role for Panther, see the section on Creating an IAM Role.
The path to the file you intend to store data in. The path should be of the following format: s3://bucket-name/path_to_file/file.csv
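
A quick way to catch a malformed path before uploading the configuration is to parse it. This sketch uses Python's standard urllib.parse and checks only the general s3://bucket/key shape. The parse_s3_path helper is hypothetical and does not verify that the bucket or object exists.

```python
from urllib.parse import urlparse


def parse_s3_path(path):
    """Split an s3:// path into (bucket, key), or raise ValueError."""
    parsed = urlparse(path)
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not parsed.netloc or not key:
        raise ValueError(f"expected s3://bucket-name/path/to/file, got {path!r}")
    return parsed.netloc, key


bucket, key = parse_s3_path("s3://bucket-name/path_to_file/file.csv")
```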

Configuring Lookup Table for S3

Navigate to the YAML specification file for this Lookup Table.
In the file, locate (or add) the Refresh field.
Specify the RoleArn, ObjectPath, and PeriodMinutes fields. For specs on the allowed values, see our Lookup Table Config File Specification.
Save the config file, then upload your changes with panther_analysis_tool.

Writing a detection using Lookup Table data

After you configure a Lookup Table, you can write detections based on the additional context from your Lookup Table.

For example, if you configured a Lookup Table to distinguish between developer and production accounts in AWS CloudTrail logs, you might want to receive an alert only if the following circumstances are both true:

A user logged in who did not have MFA enabled.
The AWS account is a production (not a developer) account.

See how to create a detection using Lookup Table data below:

In Python, you can use the deep_get helper function to retrieve the looked up field from p_enrichment using the foreign key field in the log. The pattern looks like this:

deep_get(event, 'p_enrichment', <Lookup Table name>, <foreign key in log>, <field in Lookup Table>)

The Lookup Table name, foreign key and field name are all optional parameters. If not specified, deep_get will return a hierarchical dictionary with all the enrichment data available. Specifying the parameters will ensure that only the data you care about is returned.

The rule would become:

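from panther_base_helpers import deep_get


def rule(event):
    # isProduction from the account_metadata row matched on recipientAccountId
    is_production = deep_get(event, 'p_enrichment', 'account_metadata',
                             'recipientAccountId', 'isProduction')
    # Alert only on logins without MFA in production accounts
    return not event.get('mfaEnabled') and is_production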
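
To try this logic outside Panther, where panther_base_helpers is not importable, you can substitute a small stand-in for deep_get. The version below is a simplified approximation written for this example, not the helper's actual source, and the sample events are hypothetical.

```python
def deep_get(dictionary, *keys, default=None):
    """Simplified stand-in for the Panther helper: walk nested dicts safely."""
    current = dictionary
    for key in keys:
        if not isinstance(current, dict):
            return default
        current = current.get(key)
    return default if current is None else current


def rule(event):
    is_production = deep_get(event, "p_enrichment", "account_metadata",
                             "recipientAccountId", "isProduction")
    # bool() so that an unenriched event yields False rather than None.
    return not event.get("mfaEnabled") and bool(is_production)


def make_event(is_production, mfa_enabled):
    """Build a minimal enriched event for local testing."""
    return {
        "mfaEnabled": mfa_enabled,
        "p_enrichment": {"account_metadata": {"recipientAccountId": {
            "accountID": "12345678", "isProduction": is_production,
            "p_match": "12345678"}}},
    }


prod_no_mfa = rule(make_event(is_production=True, mfa_enabled=False))
dev_no_mfa = rule(make_event(is_production=False, mfa_enabled=False))
not_enriched = rule({"mfaEnabled": False})
```

Only the production login without MFA triggers the rule. Events from developer accounts and events without enrichment do not.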

In YAML, you can create an Enrichment match expression.

Detection:
  - Enrichment:
      Table: account_metadata
      Selector: recipientAccountId
      FieldPath: isProduction
    Condition: Equals
    Value: true
  - KeyPath: mfaEnabled
    Condition: Equals
    Value: false

The Panther rules engine will take the looked up matches and append that data to the event using the key p_enrichment in the following JSON structure:

{
    "p_enrichment": {
        <name of lookup table>: {
            <key in log that matched>: <matching row looked up>,
            ...
            <key in log that matched>: <matching row looked up>,
        }
    }
}

Example:

{
    "p_enrichment": {
        "account_metadata": {
            "recipientAccountId": {
                "accountID": "90123456",
                "isProduction": false,
                "email": "[email protected]",
                "p_match": "90123456"
            }
        }
    }
}

If the value of the matching log key is an array (e.g., the value of p_any_aws_account_ids), then the lookup data is an array containing the matching records.

{
    "p_enrichment": {
        <name of lookup table>: {
            <key in log that matched that is an array>: [
                <matching row looked up>,
                <matching row looked up>,
                <matching row looked up>
            ]
        }
    }
}

Example:

{
    "p_enrichment": {
        "account_metadata": {
            "p_any_aws_account_ids": [
                {
                    "accountID": "90123456",
                    "isProduction": false,
                    "email": "[email protected]",
                    "p_match": "90123456"
                },
                {
                    "accountID": "12345678",
                    "isProduction": true,
                    "email": "[email protected]",
                    "p_match": "12345678"
                }
            ]
        }
    }
}
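
Because the enrichment value is a single row for scalar selectors and a list of rows for array selectors, a detection may need to handle both shapes. The sketch below normalizes the value to a list before inspecting it. The matched_rows helper is hypothetical, and the rule is one possible policy (alert if any matched account is production), not a prescribed one.

```python
def matched_rows(event, table, selector):
    """Return the enrichment rows for a selector as a list (possibly empty)."""
    enrichment = event.get("p_enrichment") or {}
    value = (enrichment.get(table) or {}).get(selector)
    if value is None:
        return []
    # Scalar selectors yield one row; array selectors yield a list of rows.
    return value if isinstance(value, list) else [value]


def rule(event):
    rows = matched_rows(event, "account_metadata", "p_any_aws_account_ids")
    return any(row.get("isProduction") for row in rows)


# Trimmed version of the array example above.
event = {
    "p_enrichment": {
        "account_metadata": {
            "p_any_aws_account_ids": [
                {"accountID": "90123456", "isProduction": False, "p_match": "90123456"},
                {"accountID": "12345678", "isProduction": True, "p_match": "12345678"},
            ]
        }
    }
}
fires = rule(event)
```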

Testing detections that use enrichment

For rules that use p_enrichment, click Enrich Test Data in the upper right side of the JSON code editor to populate it with your Lookup Table data. This allows you to test a Python function with an event that contains p_enrichment.

Using Data Explorer with Lookup Tables

Query via Data Explorer

p_enrichment is not stored in the Data Lake, but you can join the Lookup Table directly against any table in the Data Explorer with a query similar to the following:

with logs as (
    select * from my_logs
),
lookup as (
    select * from my_lookup_table
)
select logs.fieldA, lookup.fieldB
from logs
join lookup on logs.selector_field = lookup.key_field

For more information on using Data Explorer, please see the documentation: Data Explorer.

Lookup Tables generated in Data Explorer

Lookup Tables will generate a number of tables in the Data Explorer. There are 2 main types of tables generated:

The current Lookup Table version: example
- Contains the most up to date Lookup Table data
- Should be targeted in any saved queries - or anywhere you expect to see the most current data
- This table name will never change
- In the example above, the table is named example
The current History Table version: example_history
- Contains a version history of all data uploaded to the current Lookup Table
- The table schema is identical to the current Lookup Table (here named example) except for 2 additional fields:
  - p_valid_start
  - p_valid_end
- These fields can be used to view the state of the lookup table at any previous point in time
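
As an illustration of the point-in-time idea, the Python sketch below filters history rows by a timestamp. It rests on assumptions that are not stated in this document: that p_valid_start and p_valid_end bound the period during which a row version was current, that the interval is half-open, and that an empty p_valid_end marks the version that is still current. The rows_valid_at helper and the sample rows are hypothetical.

```python
from datetime import datetime


def rows_valid_at(history_rows, when):
    """Return the row versions that were current at `when`."""
    valid = []
    for row in history_rows:
        start = row["p_valid_start"]
        end = row.get("p_valid_end")
        # Assumed semantics: start <= when < end, with end=None meaning "still current".
        if start <= when and (end is None or when < end):
            valid.append(row)
    return valid


# Hypothetical history: the account was reclassified on 2024-06-01.
history = [
    {"accountID": "90123456", "isProduction": False,
     "p_valid_start": datetime(2024, 1, 1), "p_valid_end": datetime(2024, 6, 1)},
    {"accountID": "90123456", "isProduction": True,
     "p_valid_start": datetime(2024, 6, 1), "p_valid_end": None},
]
in_march = rows_valid_at(history, datetime(2024, 3, 15))
in_july = rows_valid_at(history, datetime(2024, 7, 1))
```

In Data Explorer, the same filter would be expressed as a condition on the two fields in a query against the history table.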

When a new schema is assigned to the Lookup Table, the past versions of the Lookup Table and the History Table are both preserved as well.

These past versions are preserved by the addition of a numeric suffix, _###, and will be present for both the Lookup Table and the History Table. This number will increment by 1 each time the schema associated with the Lookup Table is replaced, or each time the primary key of the Lookup Table changes.

The current Lookup Table and History Table are views that point at the highest numeric suffix table. This means when a new Lookup Table (called example below) is created, you will see 4 tables:

example
example_history
example_001
example_history_001

The current-version tables shown here (example and example_history) are views that are pointing at the respective underlying tables (suffixed with _001).

If a new schema is created, then _002 suffixed tables will be created, and the current-version tables will now point at those. The _001 tables will no longer be updated.
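
The naming scheme can be summarized with a short Python sketch. The data_explorer_tables helper is hypothetical, and the three-digit zero padding is inferred from the _### and _001 examples above rather than taken from a specification.

```python
def data_explorer_tables(lookup_name, version):
    """List the views and underlying tables expected for a schema version."""
    def versioned(v):
        suffix = f"_{v:03d}"  # assumed zero-padded to three digits
        return [f"{lookup_name}{suffix}", f"{lookup_name}_history{suffix}"]

    retained = []
    for v in range(1, version):
        retained.extend(versioned(v))
    return {
        # Views that always point at the highest-numbered tables.
        "views": [lookup_name, f"{lookup_name}_history"],
        "current_tables": versioned(version),
        # Earlier versions that are preserved but no longer updated.
        "retained_tables": retained,
    }


first = data_explorer_tables("example", 1)
second = data_explorer_tables("example", 2)
```

For version 1 this yields the four names listed above. For version 2, the _001 tables move to the retained list and the views point at the _002 tables.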


Last updated 1 year ago
