Log Schema Reference
This guide describes the common fields used to build YAML-based schemas when onboarding Custom Log Types and Lookup Table schemas.
LogSchema fields
Each log schema contains the following fields:
- fields ([]FieldSchema): The fields in each Log Event.
- parser (ParserSpec): A parser that can convert non-JSON logs to JSON and/or perform custom transformations.
CI/CD schema fields
Additionally, schemas defined using a CI/CD workflow can contain the following fields:
- schema (string): The name of the schema.
- description (string): A short description that will appear in the UI.
- referenceURL (string): A link to an external document that specifies the log structure. Often, this is a link to a third party's documentation.
- fieldDiscoveryEnabled (boolean): Indicates whether field discovery is enabled for this schema.
Example
The example below contains the CI/CD fields mentioned above.
schema: Custom.MySchema
description: (Optional) A handy description so I know what the schema is for.
referenceURL: (Optional) A link to some documentation on the logs this schema is for.
fieldDiscoveryEnabled: true
parser:
  csv:
    delimiter: ','
    hasHeader: true
fields:
  - name: action
    type: string
    required: true
  - name: time
    type: timestamp
    timeFormats:
      - unix
ParserSpec
A ParserSpec specifies a parser to use to convert non-JSON input to JSON. Only one of the following fields can be specified:
- fastmatch (FastmatchParser{}): Use the fastmatch parser. Learn more on Fastmatch Log Parser.
- regex (RegexParser{}): Use the regex parser. Learn more on Regex Log Parser.
- csv (CSVParser{}): Use the csv parser. Note: The columns field is required when there are multiple CSV schemas in the same log source. Learn more on CSV Log Parser.
- script: Use the script parser. Learn more on Script Log Parser.
See the fields for fastmatch, regex, and csv in the tabs below.
Parser fastmatch fields
- match ([]string): One or more patterns to match log lines against. This field cannot be empty.
- emptyValues ([]string): Values to consider as null.
- expandFields (map[string]string): Additional fields to be injected by expanding text templates.
- trimSpace (bool): Trim space surrounding each value.
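As a sketch, the fastmatch fields can be combined into a parser block like the one below. The %{...} pattern names (ts, level, msg) are hypothetical placeholders, not fields defined by this guide; consult the Fastmatch Log Parser documentation for the exact pattern syntax.

```yaml
parser:
  fastmatch:
    # Each pattern is tried in order until one matches the log line.
    match:
      - '%{ts} %{level} %{msg}'
    # Treat a bare dash as a null value.
    emptyValues: ['-']
    # Strip whitespace surrounding each captured value.
    trimSpace: true
```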
FieldSchema
A FieldSchema defines a field and its value. The field is defined by:
- name (string): The name of the field.
- required (boolean): Whether the field is required or not.
- description (string): Some text documenting the field.
- copy (object): If present, the field's value will be copied from the referenced object.
- rename (object): If present, the field's name will be changed.
- concat (object): If present, the field's value will be the combination of the values of two or more other fields.
- split (object): If present, the field's value will be extracted from another string field by splitting it based on a separator.
- mask (object): If present, the field's value will be masked.
Its value is defined using the fields of a ValueSchema.
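As an illustration of how these keys combine, the fragment below assumes the copy object takes a from path; the exact sub-keys of copy are not specified in this guide, so treat them as an assumption and verify against the full FieldSchema reference.

```yaml
- name: method               # the new field's name
  type: string
  description: HTTP method copied from the nested request object
  copy:
    from: request.method     # assumed sub-key; copies the referenced value
```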
ValueSchema
A ValueSchema defines a value and how it should be processed. Each ValueSchema has a type field that can take one of the following values:
- string: A string value
- int: A 32-bit integer number in the range -2147483648 to 2147483647
- smallint: A 16-bit integer number in the range -32768 to 32767
- bigint: A 64-bit integer number in the range -9223372036854775808 to 9223372036854775807
- float: A 64-bit floating point number
- boolean: A boolean value (true / false)
- timestamp: A timestamp value
- array: A JSON array where each element is of the same type
- object: A JSON object of known keys
- json: Any valid JSON value (JSON object, array, number, string, boolean)
The fields of a ValueSchema depend on the value of the type:
- object: fields (required, []FieldSchema): An array of FieldSchema objects describing the fields of the object.
- timestamp: timeFormats (required, []String): An array specifying the formats to use for parsing the timestamp (see Timestamps).
- timestamp: isEventTime (Boolean): A flag to tell Panther to use this timestamp as the Log Event Timestamp.
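Putting the type-specific fields together, a hypothetical schema fragment with an object value and an array value might look like the following (field names are illustrative):

```yaml
- name: request
  type: object
  fields:              # required for object values
    - name: url
      type: string
- name: tags
  type: array
  element:             # describes the type of every array element
    type: string
```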
Timestamps
Timestamps are defined by setting the type field to timestamp and specifying the timestamp format using the timeFormats field.
Panther always stores timestamp values in Coordinated Universal Time (UTC). This means:
- If a timestamp field value indicates a timezone other than UTC (with a UTC offset), Panther converts it to UTC. For example, if an incoming timestamp field had a value of 2025-07-02T00:15:30-08:00 (where the -08:00 offset means it's in Pacific Standard Time [PST]), Panther will store it as 2025-07-02 08:15:30.000000000 (converted to UTC).
- If a timestamp field value does not indicate a timezone, Panther assumes it is in UTC and stores it as-is.
See the allowed timeFormats values below:
- rfc3339 (e.g. 2022-04-04T17:09:17Z): The most common timestamp format.
- unix_auto (e.g. 1649097448 seconds, 1649097491531 milliseconds, 1649097442000000 microseconds, 1649097442000000000 nanoseconds): Timestamp expressed as time passed since the UNIX epoch. It can handle seconds, milliseconds, microseconds, and nanoseconds.
- unix (e.g. 1649097448): Timestamp expressed in seconds since the UNIX epoch. It can handle fractions of seconds as a decimal part.
- unix_ms (e.g. 1649097491531): Timestamp expressed in milliseconds since the UNIX epoch.
- unix_us (e.g. 1649097442000000): Timestamp expressed in microseconds since the UNIX epoch.
- unix_ns (e.g. 1649097442000000000): Timestamp expressed in nanoseconds since the UNIX epoch. Scientific float notation is supported.
The timeFormats field was introduced in Panther v1.46 to support multiple timestamp formats in custom log schemas. While timeFormat is still supported for log sources set up before v1.46, use timeFormats for all new schemas.
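For comparison, here is a sketch of the legacy singular key next to the current plural one, assuming the pre-v1.46 timeFormat key takes a single scalar value (verify against your Panther version's documentation):

```yaml
# Legacy, pre-v1.46 (still supported for existing log sources):
- name: ts
  type: timestamp
  timeFormat: rfc3339
# Preferred for all new schemas:
- name: ts
  type: timestamp
  timeFormats:
    - rfc3339
```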
Defining a custom format
You can also define a custom format by using strftime notation. For example:
# The field is a timestamp using a custom timestamp format like "2020-09-14 14:29:21"
- name: ts
  type: timestamp
  timeFormats:
    - "%Y-%m-%d %H:%M:%S" # note the quotes required for proper YAML syntax
Panther's strftime format supports the %N code to parse nanoseconds. For example, %H:%M:%S.%N can be used to parse 11:12:13.123456789.
Using multiple time formats
When multiple time formats are defined, each of them will be tried sequentially until successful parsing is achieved:
- name: ts
  type: timestamp
  timeFormats:
    - rfc3339
    - unix
Timestamp values can be marked with isEventTime: true to tell Panther that it should use this timestamp as the p_event_time field. It is possible to set isEventTime on multiple fields. This may be useful in situations where logs have optional or mutually exclusive fields holding event time information. Since there can only be a single p_event_time for every log event, the priority is defined using the order of fields in the schema.
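For example, in the hypothetical fragment below both fields are marked with isEventTime: true. Because eventTime appears first in the schema, it takes priority; createdAt supplies p_event_time only when eventTime is absent from the log event.

```yaml
- name: eventTime
  type: timestamp
  timeFormats:
    - rfc3339
  isEventTime: true    # first in the schema, so highest priority
- name: createdAt
  type: timestamp
  timeFormats:
    - unix
  isEventTime: true    # used when eventTime is missing
```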
Working with timeFormats in schema tests
When writing schema tests to be run with the pantherlog test command:
- If your schema field has a single timeFormats value, for backwards compatibility, configurations will retain the same format.
- If your schema field has multiple timeFormats values, you must define the timestamp field value in the result payload formatted as YYYY-MM-DD HH:MM:SS.fffffffff.
Example with a single timeFormats value:
- name: singleFormatTimestamp
  type: timestamp
  timeFormats:
    - unix
input: >
  {
    "singleFormatTimestamp": "1666613239"
  }
result: >
  {
    "singleFormatTimestamp": "1666613239"
  }
Example with multiple timeFormats values:
- name: multipleFormatTimestamp
  type: timestamp
  timeFormats:
    - unix
    - rfc3339
input: >
  {
    "multipleFormatTimestamp": "1666613239"
  }
result: >
  {
    "multipleFormatTimestamp": "2022-10-24 12:07:19.000000000"
  }
Indicators
Values of string type can be used as "indicators." To mark a field as an indicator, set the indicators field to an array of indicator scanner names (more than one may be used). This will instruct Panther to store the value of this field in the relevant p_any_ field.
For a list of values that are valid to use in the indicators field, see Standard Fields.
For example:
# Will scan the value as IP address and store it to `p_any_ip_addresses`
- name: remote_ip
  type: string
  indicators: [ ip ]

# Will scan the value as a domain name and/or IP address.
# Will store the result in `p_any_domain_names` and/or `p_any_ip_addresses`
- name: target_url
  type: string
  indicators: [ url ]
Validate
Under the validate key, you can specify conditions for the field's value that must be met in order for an incoming log to match this schema.
It's also possible to use validate on the element key (where type: string) to perform validation on each element of an array value.
allow and deny validation
You can validate values of string type by declaring an allowlist or denylist. Only logs with field values that match (or do not match) the values in allow/deny will be parsed with this schema. This means you can have multiple log types that have common overlapping fields but differ on the values of those fields.
# Will only allow 'login' and 'logout' event_type values to match this log type
- name: event_type
  type: string
  validate:
    allow: [ "login", "logout" ]

# Will match if log has any event_type value other than 'login' and 'logout'
- name: event_type
  type: string
  validate:
    deny: [ "login", "logout" ]

# Will match logs with a severities field with value 'info' or 'low'
- name: severities
  type: array
  element:
    type: string
    validate:
      allow: [ "info", "low" ]
ip and cidr format validation
Values of string type can be restricted to match well-known formats. Currently, Panther supports the ip and cidr formats to require that a string value be a valid IP address or CIDR range.
ip and cidr validation can be combined with allow or deny rules, but doing so is somewhat redundant. For example, if you allow two IP addresses, then adding an ip validation will simply ensure that your validation does not include false positives if the IP addresses in your list are not valid.
# Will allow valid ipv4 IP addresses e.g. 100.100.100.100
- name: address
  type: string
  validate:
    ip: "ipv4"

# Will allow valid ipv6 CIDR ranges
# e.g. 2001:0db8:85a3:0000:0000:0000:0000:0000/64
- name: address
  type: string
  validate:
    cidr: "ipv6"

# Will allow any valid ipv4 or ipv6 address
- name: address
  type: string
  validate:
    ip: "any"

# All elements of the addresses array must be valid ipv4 IP addresses
- name: addresses
  type: array
  element:
    type: string
    validate:
      ip: "ipv4"
Using JSON schema in an IDE
If your code editor or integrated development environment (IDE) supports JSON Schema, you can configure it to use this schema file for Panther schemas and this schema-tests file for schema tests. Doing so will allow you to receive suggestions and error messages while developing Panther schemas and their tests.
JetBrains custom JSON schemas
See the JetBrains documentation for instructions on how to configure JetBrains IDEs to use custom JSON Schemas.
VSCode custom JSON schemas
See the VSCode documentation for instructions on how to configure VSCode to use JSON Schemas.
Stream type
While performing certain actions in the Panther Console, such as configuring an S3 bucket for Data Transport or inferring a custom schema from raw logs, you need to select a log stream type.
View example log events for each type below.
Auto
Panther will automatically detect the appropriate stream type.
n/a
Lines
Events are separated by a new line character.
"10.0.0.1","[email protected]","France"
"10.0.0.2","[email protected]","France"
"10.0.0.3","[email protected]","France"
JSON
Events are in JSON format.
{
  "ip": "10.0.0.1",
  "un": "[email protected]",
  "country": "France"
}
OR
{ "ip": "10.0.0.1", "un": "[email protected]", "country": "France" }{ "ip": "10.0.0.2", "un": "[email protected]", "country": "France" }{ "ip": "10.0.0.3", "un": "[email protected]", "country": "France" }
OR
{ "ip": "10.0.0.1", "un": "[email protected]", "country": "France" }
{ "ip": "10.0.0.2", "un": "[email protected]", "country": "France" }
{ "ip": "10.0.0.3", "un": "[email protected]", "country": "France"OR
JSON Array
Events are inside an array of JSON objects. Alternatively, events are inside an array of JSON objects that is the value of a key in a top-level object; this is known as an "enveloped array."
[
  { "ip": "10.0.0.1", "username": "[email protected]", "country": "France" },
  { "ip": "10.0.0.2", "username": "[email protected]", "country": "France" },
  { "ip": "10.0.0.3", "username": "[email protected]", "country": "France" }
]
OR
{ "events": [
{ "ip": "10.0.0.1", "username": "[email protected]", "country": "France" },
{ "ip": "10.0.0.2", "username": "[email protected]", "country": "France" },
{ "ip": "10.0.0.3", "username": "[email protected]", "country": "France" }
]
}
CloudWatch Logs
Events come from CloudWatch Logs.
{
  "owner": "111111111111",
  "logGroup": "services/foo/logs",
  "logStream": "111111111111_CloudTrail/logs_us-east-1",
  "messageType": "DATA_MESSAGE",
  "logEvents": [
    {
      "id": "31953106606966983378809025079804211143289615424298221568",
      "timestamp": 1432826855000,
      "message": "{\"ip\": \"10.0.0.1\", \"user\": \"[email protected]\", \"country\": \"France\"}"
    },
    {
      "id": "31953106606966983378809025079804211143289615424298221569",
      "timestamp": 1432826855000,
      "message": "{\"ip\": \"10.0.0.2\", \"user\": \"[email protected]\", \"country\": \"France\"}"
    },
    {
      "id": "31953106606966983378809025079804211143289615424298221570",
      "timestamp": 1432826855000,
      "message": "{\"ip\": \"10.0.0.3\", \"user\": \"[email protected]\", \"country\": \"France\"}"
    }
  ]
}