Custom Logs
Define, write, and manage custom schemas
Panther allows you to define your own custom log schemas. You can ingest custom logs into Panther via a Data Transport, and your custom schemas will then normalize and classify the data.
This page explains how to determine how many custom schemas you need; how to infer, write, and manage custom schemas; and how to upload schemas programmatically. For information on how to use the pantherlog CLI tool to work with custom schemas, please see the pantherlog documentation.
Custom schemas are identified by a Custom. prefix in their name and can be used wherever a natively supported log type is used:
Log ingestion: You can onboard custom logs through a Data Transport (e.g., HTTP webhook, S3, SQS, Google Cloud Storage, Azure Blob Storage).
Detections: You can write detections for custom schemas.
Investigations: You can query the data in Search and in Data Explorer. Panther will create a new table for the custom schema once you onboard a source that uses it.
There is no definitive rule for determining how many schemas you need to represent data coming from a custom source, as it depends on the intent of your various log events and the degree of field overlap between them.
In general, it's recommended to create the minimum number of schemas required for each log type's shape to be represented by its own schema (with room for some field variance between log types to be represented by the same schema). A rule of thumb is: if two different types of logs (e.g., application audit logs and security alerts) have less than 50% overlap in required fields, they should use different schemas.
In the table below, see example scenarios and their corresponding schema recommendations:

| Scenario | Recommendation |
| --- | --- |
| You have one type of log with fields A, B, and C, and a different type of log with fields X, Y, and Z. | Create two different schemas, one for each log type. While it's technically possible to create one schema with all fields (A, B, C, X, Y, Z) marked as optional (i.e., required: false), it's not recommended, as downstream operations like detection writing and searching will be made more difficult. |
| You have one type of log that always has fields A, B, and C, and a different type of log that always has fields A, B, and Z. | Create one schema, with fields A and B marked as required and fields C and Z marked as optional. |
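As an illustrative sketch of the second recommendation (the Custom.ExampleAudit name and the string types are assumptions, not part of the scenario):

```yaml
schema: Custom.ExampleAudit
version: 0
fields:
  - name: A
    type: string
    required: true   # present in both log types
  - name: B
    type: string
    required: true   # present in both log types
  - name: C
    type: string     # only in the first log type, so optional
  - name: Z
    type: string     # only in the second log type, so optional
```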
After you have determined how many schemas you need, you can define them.
There are multiple ways to define a custom schema. You can:
Infer one or more schemas from data.
Create a schema manually.
Instead of writing a schema manually, you can let the Panther Console or the pantherlog CLI tool infer a schema (or multiple schemas) from your data.
When Panther infers a schema, note that if your data sample has:
A field of type object with more than 200 fields, that field will be classified as type json.
A field with mixed data types (i.e., it is an array with multiple data types, or the field itself has varying data types), that field will be classified as type json.
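In the inferred schema, such a field simply carries the json type. A minimal sketch (the details field name is assumed for illustration):

```yaml
fields:
  - name: details
    # Classified as json because the sampled values had mixed types,
    # or the object contained more than 200 fields
    type: json
```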
There are multiple ways to infer a schema in Panther: in the Panther Console, or in the CLI workflow using the pantherlog tool.
To get started, follow these steps:
Log in to your Panther Console.
On the left sidebar, navigate to Configure > Schemas.
At the top right of the page next to the search bar, click Create New.
Enter a Schema ID, Description, and Reference URL.
The Description is meant for content about the table, while the Reference URL can be used to link to internal resources.
In the Schema section, in the Infer a schema from sample events tile, click Start.
In the Infer schema from sample logs modal, click one of the radio buttons: Upload Sample file or Paste sample event(s).
Upload Sample file: Drag a file from your system over the pop-up modal, or click Select file and choose the log file. Note that Panther does not support CSV without headers for inferring schemas.
After uploading a file, Panther will display the raw logs in the UI. You can expand the log lines to view the entire raw log. Note that if you add another sample set, it will override the previously uploaded sample.
Select the appropriate Stream Type (an example follows this list):
Lines: Events are separated by a new line character.
JSON: Events are in JSON format.
JSON Array: Events are inside an array of JSON objects.
CloudWatch Logs: Events came from CloudWatch Logs.
Auto: Panther will automatically detect the appropriate stream type.
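To illustrate the difference, here are the same two events as they would appear under the Lines and JSON Array stream types (the event contents are invented for illustration):

```
# Lines: events separated by newline characters
{"a": 1, "time": "2019-11-14T13:12:46Z"}
{"a": 2, "time": "2019-11-14T13:12:47Z"}

# JSON Array: the same events inside one JSON array
[{"a": 1, "time": "2019-11-14T13:12:46Z"},
 {"a": 2, "time": "2019-11-14T13:12:47Z"}]
```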
Click Infer Schema.
Panther will begin to infer a schema from the raw sample logs.
Panther will attempt to infer multiple timestamp formats.
To ensure the schema works properly against the sample logs you uploaded and against any changes you made to the schema, click Run Test.
This test will validate that the syntax of your schema is correct and that the log samples you have uploaded into Panther are successfully matching against the schema.
All successfully matched logs will appear under Matched; each log will display the column, field, and JSON view.
All unsuccessfully matched logs will appear under Unmatched; each log will display the error message and the raw log.
Click Save to publish the schema.
To create a custom schema manually:
In the Panther Console, navigate to Configure > Schemas.
Click Create New in the upper right corner.
Enter a Schema ID, Description, and Reference URL.
The Description is meant for content about the table, while the Reference URL can be used to link to internal resources.
In the Schema section, in the Create your schema from scratch tile, click Start.
In the Parser section, if your schema requires a parser other than the default (JSON) parser, select it. (The other parser options, such as csv, fastmatch, and regex, are covered on their own documentation pages.)
In the Fields & Indicators section, write or paste your YAML log schema fields.
(Optional) In the Universal Data Model section, define Core Field mappings for your schema.
At the bottom of the window, click Run Test to verify your schema contains no errors.
Note that syntax validation only checks the syntax of the Log Schema. It can still fail to save due to name conflicts.
Click Save.
You can now navigate to Configure > Log Sources and add a new source or modify an existing one to use the new Custom.SampleAPI Log Type. Once Panther receives events from this source, it will process the logs and store them in the custom_sampleapi table.
See the tabs below for instructions on writing schemas for JSON logs and for text logs.
To parse log files where each line is JSON, you must define a log schema that describes the structure of each log entry.
In the example below, a sample JSON log is shown first, followed by a Log Schema that parses it.
Minified JSON log example:
{"method":"GET","path":"/-/metrics","format":"html","controller":"MetricsController","action":"index","status":200,"params":[],"remote_ip":"1.1.1.1","user_id":null,"username":null,"ua":null,"queue_duration_s":null,"correlation_id":"c01ce2c1-d9e3-4e69-bfa3-b27e50af0268","cpu_s":0.05,"db_duration_s":0,"view_duration_s":0.00039,"duration_s":0.0459,"tag":"test","time":"2019-11-14T13:12:46.156Z"}
When creating or editing a custom schema, you can use field suggestions generated by Panther. To use this functionality:
In the Panther Console, click into the YAML schema editor.
To edit an existing schema, click Configure > Schemas > [name of schema you would like to edit] > Edit.
To create a new schema, click Configure > Schemas > Create New.
Press Command+I on macOS (or Control+I on PC).
The schema editor will display available properties and operations based on the position of the text cursor.
Panther allows custom schemas to be edited. Specifically, you can perform the following actions:
Add new fields.
Rename or delete existing fields.
Edit, add, or remove all properties of existing fields.
Modify the parser configuration to fix bugs or add new patterns.
To edit a custom schema:
Navigate to your custom schema's details page in the Panther Console.
Click Edit in the upper-right corner of the details page.
Modify the schema.
To more easily see your changes (or copy or revert deleted lines), click Single Editor, then Diff View.
Click Run Test to check the YAML for structural compliance. Note that some rules are only checked after you click Update; the update will be rejected if they are not followed.
In the upper-right corner, click Update.
Editing schema fields might require updates to related detections and saved queries. Click Related Detections in the alert banner displayed above the schema editor to view, update, and test the list of affected detections and saved queries.
Queries will work across changes to a Type provided the query does not use a function or operator which requires a field type that is not castable across Types.
Good example: The Type is edited from string to int where all existing values are numeric (e.g., "1"). A query using the function sum aggregates old and new values together.
Bad example: The Type is edited from string to int where some of the existing values are non-numeric (e.g., "apples"). A query using the function sum excludes values that are non-numeric.
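As a sketch, such an edit touches only the field's type property (the status field name is an assumed example):

```yaml
# Before the edit
- name: status
  type: string   # all stored values are numeric strings like "200"

# After the edit: old string values remain castable to int in queries
- name: status
  type: int
```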
This table shows which Types can be cast as each Type when running a query. Schema editing allows any Type to be changed to another Type.
| From \ To | boolean | string | int | bigint | float | timestamp |
| --- | --- | --- | --- | --- | --- | --- |
| boolean | same | yes | yes | yes | no | no |
| string | yes | same | numbers only | numbers only | numbers only | numbers only |
| int | yes | yes | same | yes | yes | numbers only |
| bigint | yes | yes | yes | same | yes | numbers only |
| float | yes | yes | yes | yes | same | numbers only |
| timestamp | no | yes | no | no | no | same |
You can archive and unarchive custom schemas in Panther. You might choose to archive a schema if it's no longer used to ingest data, and you do not want it to appear as an option in various dropdown selectors throughout Panther. In order to archive a schema, it must not be in use by any log sources. Schemas that have been archived still exist indefinitely; it is not possible to permanently delete a schema.
Attempting to create a new schema with the same name as an archived schema will result in a name conflict, and prompt you to instead unarchive and edit the existing schema.
To archive or unarchive a custom schema:
In the Panther Console, navigate to Configure > Schemas.
Locate the schema you'd like to archive or unarchive.
On the right-hand side of the schema's row, click the Archive or Unarchive icon.
On the confirmation modal, click Continue.
To validate that a custom schema will work against your logs, you can test it against sample logs:
In the left-hand navigation bar in your Panther Console, click Configure > Schemas.
Click on a custom schema's name.
In the upper-right corner of the schema details page, click Test Schema.
Log source schemas in Panther define the log event fields that will be stored in Panther. When field discovery is enabled, data from fields in incoming log events that are not defined in the corresponding schema will not be dropped—instead, the fields will be identified, and the data will be stored. This means you can subsequently query data from these fields, and write detections referencing them.
If the name of a discovered field contains a special character (i.e., a character that is not alphanumeric, an underscore (_), or a dash (-)), it will be transliterated using the mapping below:
| Character | Transliteration |
| --- | --- |
| @ | at_sign |
| , | comma |
| ` | backtick |
| ' | apostrophe |
| $ | dollar_sign |
| * | asterisk |
| & | ampersand |
| ! | exclamation |
| % | percent |
| + | plus |
| / | slash |
| \ | backslash |
| # | hash |
| ~ | tilde |
| = | eq |
All other ASCII characters (including space) will be replaced with an underscore (_). Non-ASCII characters are transliterated to their closest ASCII equivalent.
This transliteration affects only field names; values are not modified.
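For example (a hypothetical field name, shown only to illustrate the mapping), a discovered JSON key such as cost$usd would be stored under a column name like cost_dollar_sign_usd, while its values stay exactly as ingested.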
Field discovery currently has the following limitations:
The maximum number of top-level fields that can be discovered is 2,000. Within each object field, a maximum of 1,000 fields can be discovered.
There is no limitation on the number of overall fields discovered.
The uploader command receives a base path as an argument and then proceeds to recursively discover all files with the extensions .yml and .yaml.
It is recommended to keep schema files separate from other, unrelated files; otherwise, the uploader will attempt to process those files as schemas and report errors for them.
The uploader checks whether a schema with the same name already exists, and updates it if so, or creates a new one if no matching schema name is found.
If you have deduced that you need more than one schema and you'd like to use Panther's schema inference to generate them, it's recommended to do one of the following:
Use the sample data inference method multiple times, with samples from different log types.
Send differently structured data to separate folders in an S3 bucket, then use the S3 folder inference method.
If you instead infer from live data flowing into a single source (the S3 or HTTP live data methods), you risk Panther generating a single schema that represents all log types sent to the source.
To infer a schema from sample data you've uploaded, see the tab below.
To infer a schema from S3 data received in Panther, see the tab below.
To infer one or more schemas from historical S3 data, see the tab below.
To infer a schema from HTTP data received in Panther, see the tab below.
Use the pantherlog infer command.
You can generate a schema by uploading sample logs into the Panther Console. If you'd like to use the command line instead, follow the CLI workflow.
Optionally enable Field Discovery by clicking its toggle ON. Learn more in the field discovery section below.
Paste sample event(s): Directly paste or type sample events in the editor.
Select the appropriate Stream Type (see the stream type descriptions above).
Once the schema is generated, it will appear in the schema editor box.
To see the test results, click View Events.
You can generate and publish a schema for a custom log source from live data streaming from an S3 bucket into Panther. You will first create the S3 source in Panther, then infer a schema from the incoming data, then test and save the schema.
Follow the instructions to onboard your S3 bucket as a log source without having a schema in place.
While viewing your log source's Overview tab, scroll down to the Attach a schema to start classifying data section.
You will see an S3 Prefixes & Schemas popup modal:
On the page you are directed to, you can view the raw data Panther has received at the bottom of the screen:
Click Infer Schema to generate a schema.
If you don't need a specific prefix, you can leave this field empty to use the catch-all prefix, *.
Click Done.
Review the schema and its fields by clicking its name.
Since the schema is in Draft, you can change, remove, or add fields as needed.
In the Test Schemas section at the top of the screen, click Run Test.
On the Test Schemas modal that pops up, select the Time Period you would like to test your schema against, then click Start Test.
Depending on the time range and amount of data, the test may take a few minutes to complete.
If there are Unmatched Events, inspect the errors and the JSON to decipher what caused the failures.
In the upper right corner, click Save.
Follow the instructions to onboard your S3 bucket as a log source without having a schema in place.
If you have onboarded the S3 source using an IAM role, that role must have the ListBucket permission.
Alternatively, you can access the folder inspection of your S3 bucket via the success page after onboarding the source in Panther. From that page, click Attach or Infer Schemas.
Alternatively, if there is a folder or subfolder that you do not want Panther to process, select it and click Exclude.
If you have an existing schema that matches the data, click the Schema dropdown on the right side of the row, then select the schema:
Click Infer n Schemas.
You can generate and publish a schema for a custom log source from live data streaming from an HTTP (webhook) source into Panther. You will first create the HTTP source in Panther, then infer a schema from the incoming data, then test and save the schema.
After creating your HTTP source in Panther, you can view raw data coming into Panther and infer a schema from it:
Follow the instructions to create an HTTP log source in Panther.
While viewing your log source's Overview tab, scroll down to the Attach a schema to start classifying data section.
Once you see data populating within Raw Events, click Infer Schema.
Click Done.
Click the draft schema's name to review its inferred fields.
Since the schema is in Draft, you can add, remove, and otherwise change fields as needed.
In the Test Schemas section at the top of the screen, click Run Test.
In the Test Schemas pop-up modal, select the Time Period you would like to test your schema against, then click Start Test.
Depending on the time range and amount of data, the test may take a few minutes to complete.
If there are Unmatched Events, inspect the errors and the JSON to decipher what caused the failures.
In the upper right corner, click Save.
Optionally enable Automatic Field Discovery by clicking its toggle ON. Learn more in the field discovery section below.
The Schema section will default to using Separate Sections. If you'd like to write your entire schema in one editor window, click Single Editor.
You can use Panther-generated field suggestions.
You can also now write detections to match against these logs and query them using Search or Data Explorer.
Note that you can use the pantherlog CLI tool to generate your Log Schema.
You can edit the YAML specifications directly in the Panther Console, or they can be prepared offline and uploaded. For more information on the structure and fields in a Log Schema, see the Log Schema Reference.
It's also possible to use the with JSON logs to perform transformations outside of .
Archiving a schema does not affect any data already ingested using that schema and stored in the data lake; it is still queryable using Search and Data Explorer. By default, archived schemas are not shown in the schema list view (visible on Configure > Schemas), but can be shown by modifying Status, within Filters, in the upper right corner. In Data Explorer, tables of archived schemas are not shown under Tables.
If you are archiving a schema that is currently associated with one or more log sources, the confirmation modal will prompt you to first detach the schema. Once you have done so, click Refresh.
The "Test Schema against sample logs" feature found on the Schema Edit page in the Panther Console supports Lines, CSV (with or without headers), JSON, JSON Array, CloudWatch Logs, and Auto. See for examples.
Field discovery is currently only available for custom schemas, not Panther-managed ones. See .
If your schema uses the csv parser and you are parsing CSV logs without a header, only fields included in the columns section of your schema will be discovered. This does not apply if your schema uses the csv parser and you are parsing CSV logs with a header.
If your schema uses the fastmatch parser, only fields defined inside the match patterns will be discovered.
If your schema uses the regex parser, only fields defined inside the match patterns will be discovered.
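As a sketch of the headerless CSV case (the delimiter and column names are assumptions for illustration):

```yaml
parser:
  csv:
    delimiter: ","
    hasHeader: false
    # With no header row, only the fields named here can be discovered
    columns:
      - timestamp
      - user
      - action
```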
If you choose to maintain your log schemas outside of Panther, for example in order to keep them under version control and review changes before updating, you can upload the YAML files programmatically with the uploader command.
The schema field must always be defined in the YAML file and be consistent with the existing schema name for an update to succeed. For a list of all available CI/CD fields, see our CI/CD documentation.
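A minimal example of such a file (the file name and the non-schema fields are illustrative):

```yaml
# schemas/custom_sampleapi.yml
schema: Custom.SampleAPI   # must match the existing schema name for updates
version: 0
fields:
  - name: time
    type: timestamp
    timeFormat: rfc3339
    isEventTime: true
    required: true
  - name: action
    type: string
```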
Visit the Panther Knowledge Base to view articles that answer frequently asked questions and help you resolve common errors and issues.