Field Discovery

Capture unexpected fields in evolving log data

Overview

Field discovery helps ensure that when a log source you've connected to Panther changes the structure of the events it generates upstream, you don't lose the data in new or changed fields.

Log source schemas define how incoming raw events are parsed and stored in Panther by outlining which fields, for a given log type, Panther should expect to find. When an incoming event includes a field not defined in the corresponding schema:

  • Without field discovery enabled, data from this unrecognized field is dropped.

  • With field discovery enabled, the field is identified and its data stored. This means you can subsequently query data from these fields, and write detections referencing them.
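As a rough illustration of the two behaviors above, the following Python sketch models a schema as a set of top-level field names. The function name parse_event, the schema_fields set, and the field_discovery flag are hypothetical and are not part of Panther's API. Nested fields and type inference are left out.

```python
from typing import Any, Dict, Set


def parse_event(
    event: Dict[str, Any],
    schema_fields: Set[str],
    field_discovery: bool,
) -> Dict[str, Any]:
    """Hypothetical model of parsing one event against a schema.

    schema_fields stands in for the fields a schema defines (top level only).
    """
    # Fields the schema already knows about are always kept.
    stored = {k: v for k, v in event.items() if k in schema_fields}

    if not field_discovery:
        # Unrecognized fields are dropped.
        return stored

    # With discovery on, unrecognized fields are kept and the schema
    # is extended so later events recognize them.
    for key, value in event.items():
        if key not in schema_fields:
            schema_fields.add(key)
            stored[key] = value
    return stored
```

For example, with a schema of {"actor", "action"} and an event that also carries a "new_field" key, the sketch returns only actor and action when field_discovery is False, and returns all three keys (adding "new_field" to the schema set) when it is True.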

Field discovery is available for custom log schemas and select Panther-managed schemas: GitHub.Webhook and GitHub.Audit.

Field discovery for the Panther-managed GitHub.Webhook and GitHub.Audit schemas is in open beta starting with Panther version 1.114, and is available to all customers. Please share any bug reports and feature requests with your Panther support team.

Enabling field discovery

Enabling field discovery for a custom schema

To enable field discovery for a custom schema:

  • Follow the instructions in Editing a custom schema, making sure to toggle Field Discovery ON in the Basic Info section.

In a Basic Info section, a Field Discovery toggle is set to ON.

Enabling field discovery for a Panther-managed schema

No action is required to enable field discovery for Panther-managed schemas—it is enabled by default for all schemas that support it. Currently, this includes the GitHub.Webhook and GitHub.Audit schemas.

How a schema is changed when a field is discovered

When a field is discovered:

  • For a custom schema: The schema is updated to include the discovered field.

  • For a Panther-managed schema: The schema in your Panther instance is updated to include the discovered field. If the field is discovered with regularity and stability across multiple Panther instances, it may be promoted permanently to the Panther-managed schema.

Handling of special characters in field names

If the name of a discovered field contains a special character, i.e., a character that is not alphanumeric, an underscore (_), or a dash (-), it will be transliterated using the mapping below:

  • @ to at_sign

  • , to comma

  • ` to backtick

  • ' to apostrophe

  • $ to dollar_sign

  • * to asterisk

  • & to ampersand

  • ! to exclamation

  • % to percent

  • + to plus

  • / to slash

  • \ to backslash

  • # to hash

  • ~ to tilde

  • = to eq

Additionally, if the first character of a field name is a dash (-) or a digit, it will be transliterated: a dash becomes dash, and a digit is spelled out (e.g., 7 becomes seven).

All other ASCII characters (including space) will be replaced with an underscore (_). Non-ASCII characters are transliterated to their closest ASCII equivalent.

This transliteration affects only field names; values are not modified.
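The rules in this section can be approximated with the Python sketch below. It is an illustration under stated assumptions, not Panther's implementation: the function names are hypothetical, replacement words are simply concatenated with their neighbors (the text does not say how they are joined), and "closest ASCII equivalent" is approximated with Unicode NFKD decomposition, falling back to an underscore when no ASCII form exists.

```python
import unicodedata

# Mapping of special characters to replacement words, per the list above.
SPECIAL = {
    "@": "at_sign", ",": "comma", "`": "backtick", "'": "apostrophe",
    "$": "dollar_sign", "*": "asterisk", "&": "ampersand",
    "!": "exclamation", "%": "percent", "+": "plus", "/": "slash",
    "\\": "backslash", "#": "hash", "~": "tilde", "=": "eq",
}

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def to_ascii(name: str) -> str:
    """Assumption: approximate the closest ASCII form via NFKD decomposition."""
    out = []
    for ch in name:
        if ch.isascii():
            out.append(ch)
            continue
        decomposed = unicodedata.normalize("NFKD", ch)
        ascii_part = "".join(c for c in decomposed if c.isascii())
        # Assumption: characters with no ASCII form become an underscore.
        out.append(ascii_part or "_")
    return "".join(out)


def transliterate_field_name(name: str) -> str:
    """Hypothetical helper applying the documented field-name rules."""
    parts = []
    for i, ch in enumerate(to_ascii(name)):
        if i == 0 and ch == "-":
            parts.append("dash")                 # leading dash
        elif i == 0 and ch.isdigit():
            parts.append(DIGIT_WORDS[int(ch)])   # leading digit spelled out
        elif ch.isalnum() or ch in "_-":
            parts.append(ch)                     # allowed characters pass through
        elif ch in SPECIAL:
            parts.append(SPECIAL[ch])            # mapped special character
        else:
            parts.append("_")                    # any other ASCII, including space
    return "".join(parts)
```

Under these assumptions, transliterate_field_name("7days") returns "sevendays", "-flag" becomes "dashflag", "my field" becomes "my_field", and "price$" becomes "pricedollar_sign". Only the field name is passed through the function; values are left untouched.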

Limitations of field discovery

Field discovery currently has the following limitations:

  • The maximum number of top-level fields that can be discovered is 2,000. Within each object field, a maximum of 1,000 fields can be discovered.

    • There is no limit on the overall number of fields discovered.

  • If your schema uses the csv parser and you are parsing CSV logs without a header, only fields included in the columns section of your schema will be discovered.

    • This does not apply if your schema uses the csv parser and you are parsing CSV logs with a header.

  • If your schema uses the fastmatch parser, only fields defined inside the match patterns will be discovered.

  • If your schema uses the regex parser, only fields defined inside the match patterns will be discovered.
