# Custom Logs

## Overview

Panther allows you to define your own custom log schemas. You can ingest custom logs into Panther via a [Data Transport](https://docs.panther.com/data-onboarding/data-transports), and Panther will then use your custom schemas to normalize and classify the data.

This page explains how to determine how many custom schemas you need; how to infer, write, and manage custom schemas; and how to upload schemas with [Panther Analysis Tool (PAT)](https://docs.panther.com/panther-developer-workflows/detections-repo/pat). For information on how to use `pantherlog` to work with custom schemas, please see [`pantherlog` CLI tool](https://docs.panther.com/panther-developer-workflows/pantherlog).

Custom schemas are identified by a `Custom.` prefix in their name and can be used wherever a natively supported log type is used:

* Log ingestion
  * You can onboard custom logs through a [Data Transport](https://docs.panther.com/data-onboarding/data-transports) (e.g., HTTP webhook, S3, SQS, Google Cloud Storage, Azure Blob Storage)
* Detections
  * You can write [rules and scheduled rules](https://docs.panther.com/detections/rules) for custom schemas.
* Investigations
  * You can query the data in [Search](https://docs.panther.com/search/search-tool) and in [Data Explorer](https://docs.panther.com/search/data-explorer). Panther will create a new table for the custom schema once you onboard a source that uses it.

## Determine how many custom schemas you need

There is no definitive rule for determining how many schemas you need to represent data coming from a custom source, as it depends on the intent of your various log events and the degree of field overlap between them.

In general, it's recommended to create the minimum number of schemas needed for each log type's shape to be represented by its own schema, while allowing some field variance between log types represented by the same schema. A rule of thumb is: if two different types of logs (e.g., application audit logs and security alerts) have less than 50% overlap in required fields, they should use different schemas.
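As an illustration of this rule of thumb, you can compare the field sets of two log types directly (the field names below are hypothetical, not from any particular source):

```python
# Hypothetical field sets for two log types from the same custom source
audit_fields = {"actor", "action", "resource", "timestamp"}
alert_fields = {"severity", "rule_id", "resource", "timestamp"}

shared = audit_fields & alert_fields  # {"resource", "timestamp"}
union = audit_fields | alert_fields   # 6 distinct fields in total

overlap = len(shared) / len(union)    # 2 / 6, roughly 33%

# Less than 50% overlap: represent each log type with its own schema
assert overlap < 0.5
```

Here the two log types share only a third of their fields, so a single combined schema would leave most fields optional, which complicates detections and searches downstream.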

In the table below, see example scenarios and their corresponding schema recommendations:

<table data-full-width="false"><thead><tr><th width="343">Scenario</th><th>Schema recommendation</th></tr></thead><tbody><tr><td>You have one type of log with fields <code>A</code>, <code>B</code>, and <code>C</code>, and a different type of log with fields <code>X</code>, <code>Y</code>, and <code>Z</code>.</td><td><p>Create two different schemas, one for each log type.</p><p>While it's technically possible to create one schema with all fields (<code>A</code>, <code>B</code>, <code>C</code>, <code>X</code>, <code>Y</code>, <code>Z</code>) marked as optional (i.e., <code>required: false</code>), it's not recommended, as downstream operations like detection writing and searching will be made more difficult.</p></td></tr><tr><td>You have one type of log that always has fields <code>A</code>, <code>B</code>, and <code>C</code>, and a different type of log that always has fields <code>A</code>, <code>B</code>, and <code>Z</code>.</td><td>Create one schema, with fields <code>A</code> and <code>B</code> marked as required and fields <code>C</code> and <code>Z</code> marked as optional.</td></tr></tbody></table>

After you have determined how many schemas you need, you can define them.
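For the second scenario in the table above, the schema body might look like the following sketch (field names and types are hypothetical, and the timestamp handling shown is only one possibility; see Panther's schema reference for the full specification):

```yaml
version: 0
fields:
  - name: A
    required: true
    type: string
  - name: B            # shared required timestamp, used as the event time
    required: true
    type: timestamp
    timeFormats:
      - rfc3339
    isEventTime: true
  - name: C            # present only in the first log type
    required: false
    type: string
  - name: Z            # present only in the second log type
    required: false
    type: bigint
```

Fields `A` and `B` are marked `required: true` because both log types always carry them, while `C` and `Z` are optional because each appears in only one log type.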

{% hint style="info" %}
If you have deduced that you need more than one schema and you'd like to use Panther's [schema inference tools](#automatically-infer-the-schema-in-panther) to generate them, it's recommended to do one of the following:

* Use the [Inferring a custom schema from sample logs](#inferring-a-custom-schema-from-sample-logs) method multiple times with samples from different log types
* Send differently structured data to separate folders in an S3 bucket, then use the [Inferring custom schemas from historical S3 data](#inferring-custom-schemas-from-historical-s3-data) inference method

If you use either the [Inferring a custom schema from S3 data received in Panther](#inferring-a-custom-schema-from-s3-data-received-in-panther) or [Inferring a custom schema from HTTP data received in Panther](#inferring-a-custom-schema-from-http-data-received-in-panther) methods, you risk Panther generating a single schema that represents all log types sent to the source.
{% endhint %}

## How to define a custom schema

{% hint style="info" %}
For custom log types, Panther supports ingesting data sent in JSON, XML, or CSV (with or without headers) format. For [inferring schemas](#automatically-infer-the-schema-in-panther), Panther does not support CSV without headers unless [Panther AI](https://docs.panther.com/ai) is enabled.
{% endhint %}

There are multiple ways to define a custom schema. You can:

* Infer one or more schemas from data: see [Automatically infer the schema in Panther](#automatically-infer-the-schema-in-panther).
* Create a schema manually: see [Create the schema yourself](#create-the-schema-yourself).

## Automatically infer the schema in Panther

Instead of writing a schema manually, you can let the Panther Console or the `pantherlog` CLI tool infer a schema (or multiple schemas) from your data.

When Panther infers a schema, note that if your data sample has:

* A field of type `object` with more than 200 fields, that field will be classified as type `json`.
* A field with mixed data types (i.e., it is an array with multiple data types, or the field itself has varying data types), that field will be classified as type `json`.
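For example, a field whose value is an object in one event and a string in another cannot be assigned a single concrete column type, so inference falls back to `json`. The sketch below illustrates the idea (it is not Panther's actual inference code, and the sample events are hypothetical):

```python
# Two sample events in which the "payload" field has different types
samples = [
    {"payload": {"user": "alice", "ip": "10.0.0.1"}},  # object
    {"payload": "connection reset by peer"},           # string
]

observed_types = {type(event["payload"]).__name__ for event in samples}

# More than one observed type means no single concrete column type fits,
# so an inference tool would classify the field with the permissive `json` type
inferred = "json" if len(observed_types) > 1 else observed_types.pop()
assert inferred == "json"
```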

### How to infer a schema

There are multiple ways to infer a schema in Panther:

* In the Panther Console:
  * To infer a schema from sample data you've uploaded, see the [Inferring a custom schema from sample logs](#inferring-a-custom-schema-from-sample-logs) tab, below.
  * To infer a schema from S3 data received in Panther, see the [Inferring a custom schema from S3 data received in Panther](#inferring-a-custom-schema-from-s3-data-received-in-panther) tab, below.
  * To infer one or more schemas from historical S3 data, see the [Inferring custom schemas from historical S3 data](#inferring-custom-schemas-from-historical-s3-data) tab, below.
  * To infer a schema from HTTP data received in Panther, see the [Inferring a custom schema from HTTP data received in Panther](#inferring-a-custom-schema-from-http-data-received-in-panther) tab, below.
* In the CLI workflow:
  * Use the [`pantherlog infer`](https://docs.panther.com/panther-developer-workflows/pantherlog#infer-generate-a-schema-from-json-log-samples) command.

{% tabs %}
{% tab title="Sample logs" %}
**Inferring a custom schema from sample logs**

You can generate a schema by uploading sample logs into the Panther Console. If you'd like to use the command line instead, follow the [instructions on using the pantherlog CLI tool here](https://docs.panther.com/panther-developer-workflows/pantherlog#infer-generate-a-schema-from-json-log-samples).

To get started, follow these steps:

1. Log in to your Panther Console.
2. In the left-hand navigation bar, click **Configure > Schemas.**
3. At the top right of the page next to the search bar, click **Create New**.
4. Enter a **Schema ID**, **Description**, and **Reference URL**.
   * The Description is meant for content about the table, while the Reference URL can be used to link to internal resources.
5. Optionally enable **Field Discovery** by clicking its toggle `ON`. Learn more in [Field Discovery](https://docs.panther.com/data-onboarding/field-discovery).
6. In the **Schema** section, in the **Infer a schema from sample events** tile, click **Start**.
7. In the **Infer schema from sample logs** modal, click one of the radio buttons:
   * **Upload Sample file**: Drag a file from your system over the pop-up modal, or click **Select file** and choose the log file.
     * Panther does not support CSV without headers for inferring schemas unless [Panther AI](https://docs.panther.com/ai) is enabled.
   * **Paste sample event(s)**: Directly paste or type sample events in the editor.\
     ![In the Panther Console, there is a screen labeled "Infer Schema from Sample Logs." At the bottom of the screen shot, there is a section to Drag and drop a file or select a file to upload.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-5439a40bba6c016b9c3f8609ee26b12fb737462a%2Fimage.png?alt=media)
8. After uploading a file, Panther will display the raw logs in the UI. You can expand the log lines to view the entire raw log. Note that if you add another sample set, it will replace the previously uploaded sample.
9. Select the appropriate **Stream Type** ([view examples for each type here](https://docs.panther.com/data-onboarding/reference#stream-type)).
   * **Auto:** Panther will automatically detect the appropriate stream type.
   * **Lines:** Events are separated by a new line character.
   * **JSON:** Events are in JSON format.
   * **JSON Array:** Events are inside an array of JSON objects.
   * **CloudWatch Logs:** Events come from CloudWatch Logs.
   * **XML:** Events are in [XML format](https://docs.panther.com/data-onboarding/reference#xml-stream-type).
10. If you've uploaded JSON logs, click **Infer Schema**. (If you have uploaded non-JSON logs and have [Panther AI enabled](https://docs.panther.com/ai#enabling-panther-ai), click **Infer Schema with Panther AI**, then **Confirm**).
    * Panther will begin to infer a schema from the raw sample logs.
    * Panther will attempt to infer multiple timestamp formats.
    * Once the schema is generated, it will appear in the schema editor box.\
      ![](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-8a129f8bded6ee5faadbf83de819547070f6e916%2Fimage.png?alt=media)
11. To ensure the schema works properly against the sample logs you uploaded and against any changes you made to the schema, click **Run Test**.
    * This test will validate that the syntax of your schema is correct and that the log samples you have uploaded into Panther are successfully matching against the schema.
    * To see the test results, click **View Events**.\
      ![On the left is a "Test" button. To its right is the text "Schema test against 1 total raw events completed," then a "View Events" button.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-781851d4a70ae67e3461ae0bf07564199da91628%2Fimage.png?alt=media)
      * All successfully matched logs will appear under **Matched**; each log will display the column, field, and JSON view.
      * All unsuccessfully matched logs will appear under **Unmatched**; each log will display the error message and the raw log.
12. Click **Save** to publish the schema.
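The Lines, JSON, and JSON Array stream types above differ only in how events are delimited in the payload. A rough sketch of the distinction (illustrative only; this is not Panther's parser):

```python
import json

# The same events, framed three different ways
lines_payload = '{"action": "login"}\n{"action": "logout"}'    # Lines
json_payload = '{"action": "login"}'                           # JSON: one event
array_payload = '[{"action": "login"}, {"action": "logout"}]'  # JSON Array

def parse_lines(raw):
    # "Lines" stream type: one JSON event per newline-delimited line
    return [json.loads(line) for line in raw.splitlines() if line.strip()]

def parse_json_array(raw):
    # "JSON Array" stream type: events wrapped in a single top-level array
    return json.loads(raw)

# Both framings carry the same two events
assert parse_lines(lines_payload) == parse_json_array(array_payload)
assert json.loads(json_payload) == {"action": "login"}
```

Choosing the wrong stream type is a common cause of events failing to classify, so it is worth checking the delimiting of your raw data before inferring a schema.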

{% hint style="info" %}
Panther will infer from all logs uploaded, but will only display up to 100 logs to ensure fast response time when generating a schema.
{% endhint %}
{% endtab %}

{% tab title="S3 data received in Panther" %}
**Inferring a custom schema from S3 data received in Panther**

You can generate and publish a schema for a custom log source from live data streaming from an S3 bucket into Panther. You will first [view your S3 data](#view-raw-s3-data) in Panther, then [infer a schema](#infer-a-schema-from-raw-data), then [test the schema](#test-the-schema-with-raw-data).

**View raw S3 data**

After onboarding your S3 bucket into Panther, you can view raw data coming into Panther and infer a schema from it:

1. Follow the instructions to [onboard an S3 bucket onto Panther](https://docs.panther.com/data-onboarding/data-transports/aws/s3) without having a schema in place.
2. While viewing your log source's **Overview** tab, scroll down to the **Attach a schema to start classifying data** section.\
   ![The source overview page reads, "Attach a schema to start classifying data". Below, there are two options, each with their own Start button: "I want to add an existing schema" and "I want to generate a schema from raw events"](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-392612e4b519b4cd5e9cb7b4d283fa6793dc4a59%2FScreenshot%202023-04-11%20at%204.22.16%20PM.png?alt=media)
3. Choose from the following options:
   * **I want to add an existing schema:** Choose this option if you already created a schema and you know the S3 prefix you want Panther to read logs from. Click **Start** in the tile.
     * You will see an **S3 Prefixes & Schemas** popup modal:\
       ![On the S3 Prefixes & Filters screen, there is an area where you can enter a S3 prefix. There are additional buttons to "Add Exclusion Filters" and "Add schemas"](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-87325fcca5a528077c45039b137e3324c115135f%2FScreenshot%202023-04-11%20at%204.31.21%20PM.png?alt=media)
   * **I want to generate a schema from raw events:** Select this option to generate a schema from live data in this bucket and define which prefixes you want Panther to read logs from. Click **Start** in the tile.
     * Note that you may need to wait up to 15 minutes for data to start streaming into Panther.
     * On the page you are directed to, you can view the raw data Panther has received at the bottom of the screen:\
       ![The schema inference page is shown, with a Raw Events tile containing a number of raw JSON events in a table. In the leftmost column, each row has a "View JSON" button. The second column contains the raw events.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-07c25c75e0ad09f81c1f3a571e0bb146f628b90a%2FScreenshot%202023-04-11%20at%204.36.34%20PM.png?alt=media)
       * This data is displayed from `data-archiver`, a Panther-managed S3 bucket that retains raw logs for up to 15 days for every S3 log source.
       * Only raw log events that were placed in the S3 bucket *after* you configured the source in Panther will be visible, even if you've set the timespan to look further back.
       * If your raw events are JSON-formatted, you can view them as JSON by clicking **View JSON** in the left-hand column.

**Infer a schema from raw data**

If you chose **I want to generate a schema from raw events** in the previous section, you can now infer a schema.

1. Once you see data populating in **Raw Events**, you can filter the events you'd like to infer a schema from by using the Search, S3 Prefix, Excluded Prefix, and/or Time Period filters at the top of the **Raw Events** section.
2. Click **Infer Schema** to generate a schema.\
   ![The image shows a section in the Panther Console labeled "Raw Events." On the right, there is a blue button labeled "Infer Schema." At the top of Raw Events, there is a Search bar, fields for S3 Prefix and Excluded Prefix, and a dropdown menu labeled Time Period.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-7217dbaba2918124af2c2b23ab096a0a8ba39846%2FScreenshot%202023-04-11%20at%204.43.25%20PM.png?alt=media)
3. On the **Infer New Schema** modal that pops up, enter the following:
   * **New Schema Name:** The name of the schema that will map to the table in the data lake once the schema is published.
     * The name will always start with `Custom.` and must be followed by a capital letter.
   * **S3 Prefix:** Use an existing prefix that was set up prior to inferring the schema or a new prefix.
     * The prefix you choose will filter data from the corresponding prefix in the S3 bucket to the schema you've inferred.
     * If you don't need a specific prefix, you can leave this field empty to use the catch-all prefix, `*`.\
       ![The image shows a section in the Panther Console labeled "Infer New Schema." At the top, there is a header labeled "Fill in new Schema name" and a field labeled "New Schema Name." Below that, there is a header labeled "Select S3 prefix" and fields labeled "S3 Prefix". At the bottom, there is a blue button labeled "Infer Schema."](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-37b77c5544f5a6f245dcae9363372315178a475b%2FScreenshot%202023-04-12%20at%2011.09.44%20AM.png?alt=media)
4. Click **Infer Schema**.
   * At the top of the page, you will see **'\<schema name>' was successfully inferred**.
     * Click **Done**.\
       ![The source page says the schema was successfully inferred. There is a Done button.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-30a5d70b08207463d0936a63364bbc3bf8f11d87%2FScreenshot%202023-04-12%20at%2011.12.08%20AM.png?alt=media)
   * The schema will then be placed in **Draft** mode until you're ready to publish it to production after testing.
5. Review the schema and its fields by clicking its name.\
   ![In the Schemas section, the schema called Custom.CaraS3Countries is shown, with a "Draft" label. Below it is a section to Test Schemas, with a Run Test button.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-c9dcebbf9a1450510bd6247b99241b9e0b062b2d%2FScreenshot%202023-04-12%20at%2011.13.49%20AM.png?alt=media)
   * Since the schema is in **Draft**, you can change, remove, or add fields as needed.\
     ![The image shows an example schema from the Panther Console. There is a field labeled "SchemaID" and it contains the text "Custom.CaraS3Countries." The Reference URL field and Description field are not filled in. The schema is in a code block labeled "Event Schema." At the bottom, there is a blue button labeled "Validate Schema."](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-967f517e77d5a7a62c01beace809641c9ea5b2e1%2FScreenshot%202023-04-12%20at%2011.15.28%20AM.png?alt=media)

**Test the schema with raw data**

Once your schemas and prefixes are defined, you can proceed to testing the schema configuration against raw data.

1. In the **Test Schemas** section at the top of the screen, click **Run Test**.\
   ![The image shows a section in the Panther Console labeled "Test Schemas." On the right, there is a blue button labeled "Run Test."](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-6431421b8c18dfbc99cc1eeaf5ea141be101af7b%2FScreenshot%202023-04-12%20at%2011.26.38%20AM.png?alt=media)
2. On the **Test Schemas** modal that pops up, select the **Time Period** you would like to test your schema against, then click **Start Test**.\
   ![The image shows a section in the Panther Console labeled "Test Schemas." The center of the image contains the text "Test how your schemas perform during a selected time period." At the bottom, there is a drop-down menu labeled "Time Period" with the option "Last 14 days" selected. To the right of that, there is a blue button labeled "Start Test."](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-fe5af04449baefda4c7463104915cd5222dddc56%2FScreenshot%202023-04-12%20at%2011.30.19%20AM.png?alt=media)
   * Depending on the time range and amount of data, the test may take a few minutes to complete.\
     ![A section from the Panther Console labeled "Test finished - Elapsed Time 00min 00sec." The page shows Test Started Date, Events Date Start, Events Date End, Stream Type, Schemas Tested, Data Scanned, Matched Events, and Unmatched events.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-b4f2b9511260c8fb1fa374368906bc118ce188c5%2FScreenshot%202023-04-12%20at%2011.31.34%20AM.png?alt=media)
   * Once the test completes, the results appear with the number of matched and unmatched events.
     * **Matched Events** represent the number of events that would successfully classify against the schema configuration.
     * **Unmatched Events** represent the number of events that would not classify against the schema.
3. If there are **Unmatched Events**, inspect the errors and the JSON to decipher what caused the failures.\
   ![The "Test Finished" screen in the Panther Console shows a list of specific errors and raw data.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-2cbd9777aa1f6fba431642667fe5f0f35de541e5%2Ftest-unmatched-events.png?alt=media)
   * Click **Back to Schemas**, make changes as needed, and test the schema again.
4. Click **Back to Schemas**.
5. In the upper right corner, click **Save**.\
   ![On the source page, the schema name is shown. In the upper right corner is a Save button, which is circled.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-85a5ca5827ba1aab7545574d26759eabb352ac85%2FScreenshot%202023-04-12%20at%2011.40.58%20AM.png?alt=media)
   * The inferred schema is now attached to your log source.
{% endtab %}

{% tab title="Historical S3 data" %}
**Inferring custom schemas from historical S3 data**

You can infer and save one or multiple schemas for a custom S3 log source from historical data in your S3 bucket (i.e., data that was added to the bucket *before* it was onboarded as a log source in Panther).

**Prerequisite: Onboard your S3 bucket to Panther**

* Follow the instructions to [onboard an S3 bucket onto Panther](https://docs.panther.com/data-onboarding/data-transports/aws/s3) without having a schema in place.
  * If you have onboarded the S3 source [with a custom IAM role](https://docs.panther.com/data-onboarding/data-transports/aws/s3#i-want-to-set-everything-up-on-my-own), that role must have the `ListBucket` permission.
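For reference, the statement granting that permission in the role's policy might look like the following sketch (the bucket name and statement ID are hypothetical):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowPantherListBucket",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::your-log-bucket"
    }
  ]
}
```

Note that `s3:ListBucket` applies to the bucket ARN itself, not to `arn:aws:s3:::your-log-bucket/*`, which is the form used for object-level actions.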

**Step 1: View the S3 bucket structure in Panther**

After creating your S3 bucket source in Panther, you can view your S3 bucket's structure and data in the Panther Console:

1. In the left-hand navigation bar of your Panther Console, click **Configure > Log Sources**.
2. Click into your S3 log source.
3. In the log source's **Overview** tab, scroll down to the **Attach a schema to start classifying data** section.
4. On the right side of the **I want to generate a schema from bucket data** tile, click **Start**.

   ![In Panther, in a log source's Overview tab, there is a "Start" button next to a tile labeled "I want to generate a schema from bucket data."](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-ae3b7d90f3b9f88223f53d655fb1753b802e06b9%2FScreenshot%202023-04-25%20at%201.12.51%20PM.png?alt=media)

   * You will be redirected to a folder inspection of your S3 bucket. Here, you can view and navigate through all folders and objects in the S3 bucket.

     <figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-f11cbe163d4e94ad40b640ab84d86ee108a8c7ea%2FScreenshot%202023-04-27%20at%2010.27.46%20AM.png?alt=media" alt="The folder inspection view in the Panther Console" width="563"><figcaption></figcaption></figure>
   * Alternatively, you can access the folder inspection of your S3 bucket via the success page after [onboarding your S3 source](https://docs.panther.com/data-onboarding/data-transports/aws/s3) in Panther. From that page, click **Attach or Infer Schemas**.\
     ![On the success screen after onboarding an S3 source in Panther, there is a button labeled "Attach or infer schemas."](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-a78db2d97d5896ed1886070dbfd7fae44888505c%2FScreenshot%202023-04-25%20at%201.02.13%20PM.png?alt=media)

**Step 2: Navigate through your data**

* While viewing the folder inspection, click an object.
  * A slide-out panel will appear, displaying a preview of its events:

<figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-cca8b95257b473222b3bcb9736f6177a62d75039%2FScreenshot%202023-04-25%20at%201.24.49%20PM.png?alt=media" alt="In Panther, an S3 object is highlighted. A pop-over window is displaying a preview of its events." width="563"><figcaption></figcaption></figure>

If the events fail to render correctly (either generating an error or displaying events improperly), the wrong stream type may have been chosen for the S3 bucket source. If this is the case, click **Selected Logs Format is n** (where *n* is the currently selected stream type) to change it:

<figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-e8cf170f6fa706886ec7b548deb623a3db8fbe7e%2FScreenshot%202023-04-25%20at%201.28.29%20PM.png?alt=media" alt="On the source&#x27;s folder selection view in the Panther Console, the option to select a stream type appears at the top." width="563"><figcaption></figcaption></figure>

**Step 3: Indicate whether each folder has an existing schema or a new one should be inferred**

After reviewing what's included in your bucket, you can determine whether one or multiple schemas are necessary to represent all of the bucket's data. Next, you can select folders that include data with distinct structures and either infer a new schema or assign an existing one.

1. Determine whether one or more schemas will need to be inferred from the data in your S3 bucket.
   * If all data in the S3 bucket is of the same structure (and therefore can be represented by one schema), you can leave the default **Infer New Schema** option selected on the bucket level. This generates a single schema for all data in the bucket.

     <figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-f4d1f1d01d152a31e818a9f5131beb6540f05c6e%2Fimage.png?alt=media" alt="The &#x22;Infer 1 schema&#x22; button is in the upper right corner of the S3 folders page in the Panther Console." width="563"><figcaption></figcaption></figure>
   * If the S3 bucket includes data that needs to be classified under multiple schemas, follow the steps below for each folder in the bucket:
     1. Select a folder and click **Include**.
        * Alternatively, if there is a folder or subfolder that you do *not* want Panther to process, select it and click **Exclude**.\
          ![](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-5fcd73cc431f1180f7131be791d1234d20239cb9%2Fimage.png?alt=media)
     2. If you have an existing schema that matches the data, click the **Schema** dropdown on the right side of the row, then select the schema:\
        ![The schema dropdown is expanded next to the data object.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-bf9969f0ca6889b3796c8d8baf251de1f3e01f6b%2FScreenshot%202023-04-25%20at%201.43.17%20PM.png?alt=media)
        * By default, each newly included folder has the **Infer New Schema** option selected.
2. Click **Infer `n` Schemas**.\
   ![](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-22b782c3292729767228d552f4b53c553e887967%2Fimage.png?alt=media)

**Step 4: Wait for schemas to be inferred**

The schema inference process may take up to 15 minutes. You can leave this page while the process completes. You can also stop the process early and keep the schema(s) inferred up to that point.

<figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-4f9d0bed4bd9c9382dfc4772338eced52da35809%2FScreenshot%202023-04-25%20at%201.56.13%20PM.png?alt=media" alt="The source page in Panther shows the schema inference details, including an infer skipped and the number of events processed." width="563"><figcaption></figcaption></figure>

**Step 5: Review the results**

After the inference process is complete, you can view the resulting schemas and the number of events that were used during each schema's inference. You can also validate how each schema parses raw events.

1. Click the play icon on the right side of each row.

   <figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-8bcebaa11bebe2dada6069edca169c917717795b%2Fimage.png?alt=media" alt="" width="563"><figcaption></figcaption></figure>
2. Click the **Events** tab to see the raw and normalized events.

   <figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-9cacdcf6f276e3f28d6c7e6f8f1ae63fe6cc1a21%2Fimage.png?alt=media" alt="" width="563"><figcaption></figcaption></figure>
3. Click the **Schema** tab to see the generated schema.

   <figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-11c8b6339596224c890503371d9b026ad898f812%2Fimage.png?alt=media" alt="" width="563"><figcaption></figcaption></figure>

**Step 6: Name the schema(s) and save source**

Before saving the source, name each of the newly inferred schemas with a unique name by clicking **Add name**.

<div align="center"><figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-47a2c5613a288cd17c68fd2837686146c12b3365%2FScreenshot%202023-04-25%20at%202.21.16%20PM.png?alt=media" alt="" width="563"><figcaption></figcaption></figure></div>

After all new schemas have been named, you will be able to click **Save Source** in the upper right corner.
{% endtab %}

{% tab title="HTTP data received in Panther" %}
**Inferring a custom schema from HTTP data received in Panther**

You can generate and publish a schema for a custom log source from live data streaming from an HTTP (webhook) source into Panther. You will first [view your HTTP data](#view-raw-http-data) in Panther, then [infer a schema](#infer-a-schema-from-raw-data-1), then [test the schema](#test-the-schema-with-raw-data-1).

**View raw HTTP data**

After creating your [HTTP source](https://docs.panther.com/data-onboarding/data-transports/http) in Panther, you can view raw data coming into Panther and infer a schema from it:

1. Follow the [instructions to set up an HTTP log source](https://docs.panther.com/data-onboarding/data-transports/http#how-to-set-up-an-http-log-source-in-panther) in Panther.
   * Do not select a schema during HTTP source setup.
2. While viewing your log source's **Overview** tab, scroll down to the **Attach a schema to start classifying data** section.\
   ![The Overview tab of the detail page of an HTTP source called "HTTP Holding Tank" is shown. There is a Basic Info section with fields like Source ID, HTTP Ingest URL, etc. Below, there is a section titled "Attach a schema to start classifying data." Within it are two options: I want to add an existing schema, and I want to generate a schema.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-5447fb77d4aef7179e1a1675143cc358298d7183%2Fhttpholdingtank.webp?alt=media)
3. Choose from the following options:
   * **I want to add an existing schema:** Choose this option if you already created a schema. Click **Start** in the tile.
     * You will be navigated to the HTTP source edit page, where you can make a selection in the **Schemas - Optional** field:

       <figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-c03face0c363c3dba14fc47ff02f399aed9855e4%2FScreenshot%202023-09-12%20at%201.41.54%20PM.png?alt=media" alt="The edit page for an HTTP source is shown. In the Basic Information section, the &#x22;Schemas - Optional&#x22; dropdown field is open, but no selections have been made." width="375"><figcaption><p>HTTP source edit page</p></figcaption></figure>
   * **I want to generate a schema:** Select this option to generate a schema from live data. Click **Start** in the tile.
     * Note that you may need to wait a few minutes after `POST`ing the events to the HTTP endpoint for them to be visible in Panther.
     * On the page you are directed to, under **Raw Events**, you can view the raw data Panther has received within the last week:

       <figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-42ea113882672838c9cdb5ab36ef9d1197d248b6%2FScreenshot%202023-09-12%20at%201.44.03%20PM.png?alt=media" alt="An HTTP source schema attachment page is shown. There is an arrow pointing to the section at the bottom, called &#x22;Raw Events.&#x22; Various JSON events are included in this section. There is a blue &#x22;Infer Schema&#x22; button."><figcaption><p>HTTP Raw events</p></figcaption></figure>
     * This data is displayed from `data-archiver`, a Panther-managed S3 bucket that retains raw HTTP source logs for 15 days.
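To generate raw events to view, you can `POST` sample JSON to your source's ingest URL. Below is a minimal sketch; the URL and the shared-secret header are placeholders (substitute the values shown on your source's **Overview** tab, and note that the header name depends on the auth method you configured for the source):

```shell
# Hypothetical ingest URL and auth header -- replace both with the values
# from your HTTP source's Overview tab in the Panther Console.
curl -X POST "https://logs.example.runpanther.net/http/ffffffff-0000-0000-0000-ffffffffffff" \
  -H "Content-Type: application/json" \
  -H "x-api-key: <your-shared-secret>" \
  -d '{"method":"GET","path":"/-/metrics","status":200,"time":"2023-09-12T13:44:03Z"}'
```

Allow a few minutes after sending before the events appear under **Raw Events**.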

**Infer a schema from raw data**

If you chose **I want to generate a schema** in the previous section, you can now infer a schema from the raw events.

1. Once you see data populating within **Raw Events**, click **Infer Schema**.\
   ![An HTTP source schema attachment page is shown. There is a section at the bottom called "Raw Events." Various JSON events are included in this section. There is an arrow pointing to a blue "Infer Schema" button.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-5b3001286d7577292a5d00fe809732320700bc6d%2FScreenshot%202023-09-12%20at%201.47.58%20PM.png?alt=media)
2. On the **Infer New Schema** modal that pops up, enter a:
   * **New Schema Name:** Enter a descriptive name. It will always start with the `Custom.` prefix, which must be followed by a capital letter.
3. Click **Infer Schema**.
   * At the top of the page, you will see **'\<schema name>' was successfully inferred**.
4. Click **Done**.\
   ![Text reads "'Custom.HttpHoldingTank' was successfully inferred." Below, there is a Done button, which is circled.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-e0b07b0af68547d35e93520b39f4efe7dac434e9%2FScreenshot%202023-09-12%20at%201.50.20%20PM.png?alt=media)
   * The schema will be placed in **Draft** mode until you're ready to publish it, after testing.
5. Click the draft schema's name to review its inferred fields.\
   ![Under a "Schema(s)" header is "Custom.HttpHoldingTank" with a "Draft" label. It is circled.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-d523d0791496dfcf737d38864fa64de93f32dfe5%2FScreenshot%202023-09-12%20at%201.51.34%20PM.png?alt=media)
   * Since the schema is in **Draft**, you can add, remove, and otherwise change fields as needed.\
     ![The edit schema view is shown. There are fields for Schema ID, Reference URL, and Description. Below, is the schema itself, in a code editor.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-759f088c08a2afa41e6c83aca4c75e20bec41400%2FScreenshot%202023-09-12%20at%201.52.34%20PM.png?alt=media)
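The inferred draft schema is plain YAML that you can edit in place before publishing. As an illustration only (the fields below are hypothetical, not actual Panther output), a draft inferred from simple request events might look roughly like:

```yaml
# Illustrative draft -- your inferred fields will reflect your own events.
fields:
- name: time
  type: timestamp
  timeFormats:
   - rfc3339
  isEventTime: true
- name: method
  type: string
- name: status
  type: bigint
```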

**Test the schema with raw data**

Once your schema is defined, you can proceed to test the schema configuration against raw data.

1. In the **Test Schemas** section at the top of the screen, click **Run Test**.\
   ![Under a "Schema(s)" header is "Custom.HttpHoldingTank" with a "Draft" label. In the bottom right corner, under a "Test Schemas" header, is a "Run Test" button, which is circled.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-ad47964bee16d2ec2faa7f9c1d852574c243b3cc%2FScreenshot%202023-09-12%20at%201.53.28%20PM.png?alt=media)
2. In the **Test Schemas** pop-up modal, select the **Time Period** you would like to test your schema against, then click **Start Test**.\
   ![The "Test Schemas" modal has a "Time Period" dropdown selection and a "Start Test" button.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-29c59b9b6dfd08216c3300c9f767c1e7a5bf95c9%2FScreenshot%202023-09-12%20at%201.54.59%20PM.png?alt=media)
   * Depending on the time range and amount of data, the test may take a few minutes to complete.\
     ![The HTTP Source schema test page is shown. It shows "18 Matched Events" and "0 Unmatched Events." There is a blue "Back to Schemas" button.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-8ad44d2c53183b3b50a69918ed42d5e5330f7b9d%2FScreenshot%202023-09-12%20at%201.55.32%20PM.png?alt=media)
   * Once the test has run, the results show the number of matched and unmatched events.
     * **Matched Events** represent the number of events that would successfully classify against the schema configuration.
     * **Unmatched Events** represent the number of events that would not classify against the schema.
3. If there are **Unmatched Events**, inspect the errors and the JSON to decipher what caused the failures.\
   ![A list of JSON logs is shown under an "Unmatched Events" header. There are two columns, "Raw Events" and "Error"](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-362e0d802c608e8e667edc4f30dd8a98f0b72ccc%2FScreenshot%202023-09-12%20at%201.57.17%20PM.png?alt=media)
   * Click **Back to Schemas**, make changes as needed, and test the schema again.
4. Click **Back to Schemas**.
5. In the upper right corner, click **Save**.\
   ![The HTTP Source schema edit page is shown, and its "Save" button in the upper-right corner is circled.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-6c31023c148ef1caf981b7f62aa6dda76dbc4644%2FScreenshot%202023-09-12%20at%201.58.03%20PM.png?alt=media)
   * The inferred schema is now attached to your log source.
   * Log events sent to the HTTP source before a schema was attached (including those used to infer the schema) are then ingested into Panther.
{% endtab %}
{% endtabs %}

## Create the schema yourself

### How to create a custom schema manually

To create a custom schema manually:

1. In the left-hand navigation bar of your Panther Console, click **Configure** > **Schemas**.
2. In the upper right corner, click **Create New**.
3. Enter a **Schema ID**, **Description**, and **Reference URL**.
   * The Description is meant for content about the table, while the Reference URL can be used to link to internal resources.
4. Optionally enable **Automatic Field Discovery** by clicking its toggle `ON`. Learn more on [Field Discovery](https://docs.panther.com/data-onboarding/field-discovery).
5. In the **Schema** section, in the **Create your schema from scratch** tile, click **Start**.
   * The **Schema** section will default to using **Separate Sections**. If you'd like to write your entire schema in one editor window, click **Single Editor**.\
     ![To the right of a "Schema" header is a toggle with two values: Separate Sections and Single Editor.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-698807c490df7c03ac197a5a6504b98d7a64e554%2FScreenshot%202024-06-12%20at%209.48.23%20AM.png?alt=media)
6. In the **Parser** section, if your schema requires a parser other than the **Default (JSON/XML)** parser, select it. Learn more about the other parser options on the following pages:
   * [Script Log Parser](https://docs.panther.com/data-onboarding/custom-log-types/script-parser)
   * [Fastmatch Log Parser](https://docs.panther.com/data-onboarding/custom-log-types/fastmatch-parser)
   * [Regex Log Parser](https://docs.panther.com/data-onboarding/custom-log-types/regex-parser)
   * [CSV Log Parser](https://docs.panther.com/data-onboarding/custom-log-types/csv-parser)
7. In the **Fields & Indicators** section, write or paste your YAML log schema fields.
   * See [Writing schemas](#writing-schemas) to learn more about schema composition.
   * You can use Panther-generated [schema field suggestions](#schema-field-suggestions).
8. (Optional) In the **Universal Data Model** section, define Core Field mappings for your schema.
   * Learn more in [Mapping Core Fields in Custom Log Schemas](https://docs.panther.com/search/panther-fields#mapping-core-fields-in-custom-log-schemas).
9. At the bottom of the window, click **Run Test** to verify your schema contains no errors.
   * Note that syntax validation only checks the syntax of the Log Schema. It can still fail to save due to name conflicts.
10. Click **Save**.

You can now navigate to **Configure > Log Sources** and add a new source or modify an existing one to use the new `Custom.SampleAPI` Log Type. Once Panther receives events from this source, it will process the logs and store them in the `custom_sampleapi` table.

You can also now write [detections](https://docs.panther.com/detections) to match against these logs and query them using [Search](https://docs.panther.com/search/search-tool) or [Data Explorer](https://docs.panther.com/search/data-explorer).

### Writing schemas

See the tabs below to learn more about how to write a schema for JSON, XML, and text logs.

{% tabs %}
{% tab title="JSON logs" %}
**Writing a schema for JSON logs**

To parse log files where each line is JSON, you must define a log schema that describes the structure of each log entry.

You can edit the YAML specifications directly in the Panther Console or they can be [prepared offline in your editor/IDE of choice](https://docs.panther.com/data-onboarding/reference#using-json-schema-in-an-ide). For more information on the structure and fields in a *Log Schema*, see the [Log Schema Reference](https://docs.panther.com/data-onboarding/custom-log-types/reference).

It's also possible to use the [`starlark` parser](https://docs.panther.com/data-onboarding/custom-log-types/script-parser) with JSON logs to perform transformations outside of [those that are natively supported by Panther](https://docs.panther.com/data-onboarding/custom-log-types/transformations).

In the example schemas below, the first tab displays the JSON log structure and the second tab shows the Log Schema.

{% tabs %}
{% tab title="JSON log example" %}

```json
{
  "method": "GET",
  "path": "/-/metrics",
  "format": "html",
  "controller": "MetricsController",
  "action": "index",
  "status": 200,
  "params": [],
  "remote_ip": "1.1.1.1",
  "user_id": null,
  "username": null,
  "ua": null,
  "queue_duration_s": null,
  "correlation_id": "c01ce2c1-d9e3-4e69-bfa3-b27e50af0268",
  "cpu_s": 0.05,
  "db_duration_s": 0,
  "view_duration_s": 0.00039,
  "duration_s": 0.0459,
  "tag": "test",
  "time": "2019-11-14T13:12:46.156Z"
}
```

**Minified JSON log example**:

{% hint style="info" %}
Leverage this **Minified JSON Log Example** when using the `pantherlog` tool or generating a schema within the Panther Console.
{% endhint %}

`{"method":"GET","path":"/-/metrics","format":"html","controller":"MetricsController","action":"index","status":200,"params":[],"remote_ip":"1.1.1.1","user_id":null,"username":null,"ua":null,"queue_duration_s":null,"correlation_id":"c01ce2c1-d9e3-4e69-bfa3-b27e50af0268","cpu_s":0.05,"db_duration_s":0,"view_duration_s":0.00039,"duration_s":0.0459,"tag":"test","time":"2019-11-14T13:12:46.156Z"}`
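If you prefer to work locally, you can feed a minified sample like the one above to the `pantherlog` CLI to infer a schema. The invocation below is a sketch, assuming `pantherlog` is installed locally; see the [`pantherlog` CLI tool](https://docs.panther.com/panther-developer-workflows/pantherlog) documentation for the exact flags:

```shell
# Write a minified sample event to a file, then infer a schema from it.
echo '{"method":"GET","path":"/-/metrics","status":200,"time":"2019-11-14T13:12:46.156Z"}' > sample.jsonl
pantherlog infer sample.jsonl > custom_sampleapi.yml
```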
{% endtab %}

{% tab title="Log schema example" %}

```yaml
fields:
- name: time
  description: Event timestamp
  required: true
  type: timestamp
  timeFormats: 
   - rfc3339
  isEventTime: true
- name: method
  description: The HTTP method used for the request
  type: string
- name: path
  description: The path used for the request
  type: string
- name: remote_ip
  description: The remote IP address the request was made from
  type: string
  indicators: [ ip ] # the value will be appended to `p_any_ip_addresses` if it's a valid ip address
- name: duration_s
  description: The number of seconds the request took to complete
  type: float
- name: format
  description: Response format
  type: string
- name: user_id
  description: The id of the user that made the request
  type: string
- name: params
  type: array
  element:
    type: object
    fields:
    - name: key
      description: The name of a Query parameter
      type: string
    - name: value
      description: The value of a Query parameter
      type: string
- name: tag
  description: Tag for the request
  type: string
- name: ua
  description: UserAgent header
  type: string
```

{% endtab %}
{% endtabs %}
{% endtab %}

{% tab title="XML logs" %}
**Writing a schema for XML logs**

Panther intermediately parses XML logs into JSON, which means you can use all the tools available for JSON logs described in the JSON logs tab. Learn how Panther parses XML into JSON in [XML stream type](https://docs.panther.com/data-onboarding/reference#xml-stream-type), then create your schema accordingly.

Note that because XML does not support data types other than strings, all values in the corresponding JSON representation will be depicted as strings (e.g., `"ip": "192.168.1.100"`). When defining your schema, you can use the appropriate types for each field, as seen in the Log schema example below.

{% tabs %}
{% tab title="XML log example" %}
Raw XML log:

```xml
<log>
    <id>12345</id>
    <timestamp>2023-11-14T13:12:46.156Z</timestamp>
    <event type="security" priority="high">
        <message>Unauthorized access attempt detected</message>
        <source>
            <ip>192.168.1.100</ip>
            <user>admin</user>
        </source>
        <details>
            <action>login_failed</action>
            <reason>invalid_credentials</reason>
        </details>
    </event>
</log>
```

{% endtab %}

{% tab title="JSON equivalent example" %}
How the raw XML log is [converted into JSON](https://docs.panther.com/data-onboarding/reference#xml-stream-type):

```json
{
  "id": "12345",
  "timestamp": "2023-11-14T13:12:46.156Z",
  "event": {
    "type": "security",
    "priority": "high",
    "message": "Unauthorized access attempt detected",
    "source": {
      "ip": "192.168.1.100",
      "user": "admin"
    },
    "details": {
      "action": "login_failed",
      "reason": "invalid_credentials"
    }
  }
}
```

{% endtab %}

{% tab title="Log schema example" %}
How the log schema to parse this log would look:

```yaml
fields:
- name: id
  description: Unique log identifier
  type: string
  required: true
- name: timestamp
  description: Event timestamp
  type: timestamp
  timeFormats: 
   - rfc3339
  isEventTime: true
- name: event
  description: Event details
  type: object
  fields:
  - name: type
    description: Type of event
    type: string
  - name: priority
    description: Event priority level
    type: string
  - name: message
    description: Event message
    type: string
  - name: source
    description: Source information
    type: object
    fields:
    - name: ip
      description: Source IP address
      type: string
      indicators: [ ip ]
    - name: user
      description: Username
      type: string
  - name: details
    description: Additional event details
    type: object
    fields:
    - name: action
      description: Action performed
      type: string
    - name: reason
      description: Reason for action (if applicable)
      type: string
```

{% endtab %}
{% endtabs %}
{% endtab %}

{% tab title="Text logs" %}
**Writing a schema for text logs**

Panther handles logs that are not structured as JSON/XML by using a 'parser' that translates each log line into key/value pairs and feeds the result as JSON to the rest of the pipeline. You can define a text parser using the `parser` field of the *Log Schema*. Panther provides the following parsers for non-JSON/XML formatted logs:

<table data-header-hidden><thead><tr><th width="182.74652099609375">Name</th><th>Description</th></tr></thead><tbody><tr><td>Name</td><td>Description</td></tr><tr><td><a href="custom-log-types/fastmatch-parser">fastmatch</a></td><td>Match each line of text against one or more simple patterns</td></tr><tr><td><a href="custom-log-types/regex-parser">regex</a></td><td>Use regular expression patterns to handle more complex matching, such as conditional fields, case-insensitive matching, etc.</td></tr><tr><td><a href="custom-log-types/csv-parser">csv</a></td><td>Treat log files as CSV, mapping column names to field names</td></tr><tr><td><a href="custom-log-types/script-parser">starlark</a></td><td>Parse text logs, or perform transformations on JSON logs</td></tr></tbody></table>
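As an illustration, a minimal `fastmatch` configuration for a space-delimited text log might look like the sketch below. The pattern and field names are hypothetical; see the [Fastmatch Log Parser](https://docs.panther.com/data-onboarding/custom-log-types/fastmatch-parser) page for the full syntax and options:

```yaml
# Hypothetical example for lines like: "2023-11-14T13:12:46Z ERROR disk full"
parser:
  fastmatch:
    match:
      - '%{timestamp} %{level} %{message}'
fields:
- name: timestamp
  type: timestamp
  timeFormats:
   - rfc3339
  isEventTime: true
- name: level
  type: string
- name: message
  type: string
```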
{% endtab %}
{% endtabs %}

### Schema field suggestions

When creating or editing a custom schema, you can use field suggestions generated by Panther. To use this functionality:

1. In the Panther Console, click into the YAML schema editor.
   * To edit an existing schema, click **Configure** > **Schemas** > \[name of schema you would like to edit] > **Edit**.
   * To create a new schema, click **Configure** > **Schemas** > **Create New**.
2. Press `Command+I` on macOS (or `Control+I` on PC).
   * The schema editor will display available properties and operations based on the position of the text cursor.

     ![A YAML schema editor is shown. Below the cursor is a box with various field suggestions, including concat, copy, description, indicators, mask, etc.](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-5994172165705efb58eb28e3847dbe5e083bc26d%2FScreenshot%202023-11-27%20at%202.42.15%20PM.png?alt=media)

## Managing custom schemas

### Editing a custom schema

Panther allows custom schemas to be edited. Specifically, you can perform the following actions:

* Add new fields.
* Rename or delete existing fields.
* Edit, add, or remove all properties of existing fields.
* Modify the `parser` configuration to fix bugs or add new patterns.
* [Archive or unarchive the schema](#archiving-and-unarchiving-a-custom-schema).
* [Enable or disable field discovery](https://docs.panther.com/field-discovery#enabling-field-discovery-1).

{% hint style="info" %}
After editing a field's `type`, any newly ingested data will match the new type, while any previously ingested data will retain its old type.
{% endhint %}

To edit a custom schema:

1. Navigate to your custom schema's details page in the Panther Console.
2. In the upper-right corner of the details page, click **Edit**.

   <figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-e8605e4dc7ea1aed0cdce32019d3b54a4ed1b86c%2Fimage.png?alt=media" alt=""><figcaption></figcaption></figure>
3. Modify the schema as desired.
   * You can use Panther-generated [schema field suggestions](#schema-field-suggestions).
   * To more easily see your changes (or copy or revert deleted lines), click **Single Editor,** then **Diff View**.

     <figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-5799bf847f9a6051e07a5723c3eb5ed012e08805%2FdiffViewSchemaEditor.png?alt=media" alt="The Schema editor is shown, and the &#x22;Single Editor&#x22; and &#x22;Diff View&#x22; buttons are shown. One field has been changed, from event_time to new_name."><figcaption></figcaption></figure>
4. In the upper-right corner, click **Update**.

Click **Run Test** to check the YAML for structural compliance. Note that the full validation rules are only checked when you click **Update**, and the update will be rejected if they are not followed.

#### Update related detections and saved queries

Editing schema fields might require updates to related detections and saved queries. Click **Related Detections** in the alert banner displayed above the schema editor to view, update, and test the list of affected detections and saved queries.

<figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-0811c7a3b35787d7b51247115a2e93de7f768b63%2Fcustom.schema.png?alt=media" alt="A schema&#x27;s name is shown, &#x22;Custom.A&#x22;—to its right are three buttons: Upload Sample Logs, Cancel, and Update."><figcaption></figcaption></figure>

#### **Query implications**

Queries will continue to work across a change to a **Type**, provided the query does not use a function or operator that requires a field type that is not castable across the old and new **Types**.

* **Good example**: The **Type** is edited from `string` to `int`, and all existing values are numeric (e.g., `"1"`). A query using the function `sum` aggregates old and new values together.
* **Bad example**: The **Type** is edited from `string` to `int`, but some existing values are non-numeric (e.g., `"apples"`). A query using the function `sum` excludes the non-numeric values.

#### Query castability table

This table shows which **Types** can be cast as each **Type** when running a query. Schema editing allows any **Type** to be changed to another **Type**.

<table><thead><tr><th width="135">Type From -> To</th><th width="96">boolean</th><th width="86">string</th><th width="100">int</th><th width="102">bigint</th><th width="102">float</th><th>timestamp</th></tr></thead><tbody><tr><td>boolean</td><td>same</td><td>yes</td><td>yes</td><td>yes</td><td>no</td><td>no</td></tr><tr><td>string</td><td>yes</td><td>same</td><td>numbers only</td><td>numbers only</td><td>numbers only</td><td>numbers only</td></tr><tr><td>int</td><td>yes</td><td>yes</td><td>same</td><td>yes</td><td>yes</td><td>numbers only</td></tr><tr><td>bigint</td><td>yes</td><td>yes</td><td>yes</td><td>same</td><td>yes</td><td>numbers only</td></tr><tr><td>float</td><td>yes</td><td>yes</td><td>yes</td><td>yes</td><td>same</td><td>numbers only</td></tr><tr><td>timestamp</td><td>no</td><td>yes</td><td>no</td><td>no</td><td>no</td><td>same</td></tr></tbody></table>

### Archiving and unarchiving a custom schema

You can archive and unarchive custom schemas in Panther. You might choose to archive a schema if it's no longer used to ingest data, and you do not want it to appear as an option in various dropdown selectors throughout Panther. In order to archive a schema, it must not be in use by any log sources. Schemas that have been archived still exist indefinitely; it is not possible to permanently delete a schema.

Archiving a schema does not affect any data ingested using that schema already stored in the data lake—it is still queryable using [Data Explorer](https://docs.panther.com/search/data-explorer) and [Search](https://docs.panther.com/search/search-tool). By default, archived schemas are not shown in the schema list view (visible on **Configure** > **Schemas**), but can be shown by modifying **Status**, within **Filters**, in the upper right corner. In [Data Explorer](https://docs.panther.com/search/data-explorer), tables of archived schemas are not shown under **Tables**.

Attempting to create a new schema with the same name as an archived schema will result in a name conflict, and prompt you to instead unarchive and edit the existing schema.

To archive or unarchive a custom schema:

1. In the Panther Console, navigate to **Configure** > **Schemas**.
   * Locate the schema you'd like to archive or unarchive.
2. On the right-hand side of the schema's row, click the **Archive** or **Unarchive** icon.

   <figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-10776143f6386f18e873bfe23b4eb0822899aa45%2Farchiveschema.png?alt=media" alt="Two schema rows are shown, one that is currently archived and one that is currently unarchived. The archive/unarchive icons in each of their rows is circled."><figcaption></figcaption></figure>

   * If you are archiving a schema and it is currently associated to one or more log sources, the confirmation modal will prompt you to first detach the schema. Once you have done so, click **Refresh**.\
     ![An Archive Schema modal says, "Prior to archiving Custom.HarryPotterFake2, it must be detached from all associated Log Sources." A list of associated log sources is shown, with only one value: Carrie Tines Test](https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-7ba052b56409b8da6c1cef102045170403ed4bee%2FScreenshot%202023-03-07%20at%209.34.55%20AM.png?alt=media)
3. On the confirmation modal, click **Continue**.

### Testing a custom schema

{% hint style="info" %}
The "Test Schema against sample logs" feature found on the Schema Edit page in the Panther Console supports Lines, CSV (with or without headers), JSON, JSON Array, XML, CloudWatch Logs, and Auto. See [Stream Types](https://docs.panther.com/data-onboarding/reference#stream-type) for examples.

Additionally, the above log formats can be compressed using the following formats:

* gzip
* zstd (without dictionary)

Multi-line logs are supported for JSON and JSONArray formats.
{% endhint %}

To validate that a custom schema will work against your logs, you can test it against sample logs:

1. In the left-hand navigation bar in your Panther Console, click **Configure > Schemas**.
2. Click the name of the custom schema you'd like to test.
3. In the upper-right corner of the schema details page, click **Test Schema**.

   <figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-bfc9ab86accd41a32ada43dacc42bab377592e1f%2FtestSchema.png?alt=media" alt="A schema&#x27;s name is shown. To its right are two buttons: Test Schema and Clone."><figcaption></figcaption></figure>

## Uploading log schemas with the Panther Analysis Tool

If you choose to maintain your log schemas outside of the Panther Console, perhaps to keep them under version control and review changes before updating, you can upload the YAML files programmatically with the [Panther Analysis Tool](https://docs.panther.com/panther-developer-workflows/detections-repo/pat) (PAT).

The uploader command receives a base path as an argument and recursively discovers all files with `.yml` and `.yaml` extensions under it.

{% hint style="info" %}
It's recommended to store schema files separately from unrelated files; otherwise, you may receive errors during upload when PAT attempts to upload invalid schema files.
{% endhint %}

```
panther_analysis_tool update-custom-schemas --path ./schemas
```

The uploader checks whether a schema with a matching name already exists; if it does, that schema is updated, and otherwise a new one is created.

{% hint style="warning" %}
The `schema` field must always be defined in the YAML file and be consistent with the existing schema name for an update to succeed. For a list of all available CI/CD fields see our [Log Schema Reference](https://docs.panther.com/data-onboarding/custom-log-types/reference#ci-cd-schema-fields).
{% endhint %}
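For example, a schema file managed with PAT might look like the sketch below. The `schema` value, description, and fields are placeholders; see the [Log Schema Reference](https://docs.panther.com/data-onboarding/custom-log-types/reference#ci-cd-schema-fields) for all supported CI/CD fields:

```yaml
# schemas/custom_sampleapi.yml -- hypothetical example file
schema: Custom.SampleAPI
description: Sample API request logs
referenceURL: https://example.com/internal-docs
fields:
- name: time
  type: timestamp
  timeFormats:
   - rfc3339
  isEventTime: true
- name: method
  type: string
```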

Schemas uploaded via PAT are validated against the same criteria as updates made in the Panther Console.

## Troubleshooting custom logs

Visit the Panther Knowledge Base to [view articles about custom log sources](https://help.panther.com/Data_Sources/Custom_Logs) that answer frequently asked questions and help you resolve common errors and issues.
