Script Log Parser
Parse incoming logs with a script written in the Starlark configuration language
Overview
script is one of the possible values of the parser key in a custom log schema. This parser lets you specify the transformations Panther should perform on each incoming log event using the Starlark configuration language, which shares many syntax similarities with Python. The script parser in Panther can handle both structured (JSON) and unstructured events.
You might benefit from using the script parser when you'd like to:
Perform transformations on the data for which the Panther-provided schema transformations are insufficient
Understanding the script parser
Defining a function
When using the script parser, you must implement a Starlark function. The function takes in a string and must return a non-empty dictionary. The returned dictionary defines the format of the output event.
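As a minimal sketch of this contract, consider a log line made of space-separated key=value pairs. Both that input format and the `raw` fallback field are assumptions made for illustration; they do not come from Panther's documentation.

```python
def parse(log):
    # Hypothetical input: "user=frank action=login"
    event = {}
    for pair in log.split(" "):
        # partition returns (before, separator, after); the separator is
        # empty when the token contains no "=".
        key, sep, value = pair.partition("=")
        if sep and key:
            event[key] = value
    # The returned dictionary must be non-empty, so fall back to the raw
    # line (hypothetical field name) when no pair was found.
    if not event:
        event["raw"] = log
    return event
```

With this sketch, `parse("user=frank action=login")` returns a dictionary with the keys `user` and `action`, which would then be matched against the fields declared in the schema.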
Available functions
The script parser can use any of the primitives described in the Starlark specification. Additionally, you can use the following functions:
json.decode
Decodes a JSON string to a dictionary
json.encode
Encodes a dictionary to a JSON string
base64.decode
Decodes a base64-encoded string
base64.encode
Performs base64 encoding on a string
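As a sketch of how the pre-loaded json module might be combined with Starlark's built-in dictionary and string methods, the function below decodes a JSON event and derives a field from it. The `actor_domain` field and the `email_domain` helper are hypothetical names chosen for illustration, and the event shape (an `actor` object with an `email` key) is an assumption.

```python
def email_domain(event):
    # dict.get with a default avoids a failed lookup when a key is missing.
    email = event.get("actor", {}).get("email", "")
    # rpartition splits on the last "@"; the separator is empty when the
    # address contains no "@".
    _, sep, domain = email.rpartition("@")
    if not sep:
        return ""
    return domain

def parse(log):
    # json is provided by the script parser, so no import is needed.
    event = json.decode(log)
    event["actor_domain"] = email_domain(event)
    return event
```

Keeping the lookup logic in a separate helper keeps `parse` short and makes the missing-key case explicit.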
Restrictions
The following restrictions apply to your script:
Raising exceptions is not allowed.
Imports are not allowed.
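Because a script cannot raise exceptions, one way to cope with unexpected input is to check a value before converting it. The following is a minimal sketch under stated assumptions: the `to_int` helper, the two-field input format, and the default of 0 are all hypothetical and only illustrate the pattern.

```python
def to_int(value, default):
    # isdigit is false for an empty string and for placeholders such as
    # "-", so int() is only called on plain digit strings.
    if value.isdigit():
        return int(value)
    return default

def parse(log):
    # Hypothetical input: "<status> <bytes>", for example "200 -"
    status, _, size = log.partition(" ")
    return {
        "status": to_int(status, 0),
        "bytes_sent": to_int(size, 0),
    }
```

Whether a default value or an omitted field is the better fallback depends on how the resulting events are queried.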
Handling JSON
While script is mainly intended to be used for text logs, it can also be used for JSON logs in cases where you want to perform transformations outside of the ones that are natively supported by Panther. For this reason, the script parser comes pre-loaded with a json module that allows you to convert JSON from type string to dictionary.
For example, the following configuration will create a new field called is_panther_employee that will be true if the actor email has the panther.com domain, and false otherwise.
parser:
  script:
    function: |
      def parse(log):
          event = json.decode(log)
          if event['actor']['email'].endswith('@panther.com'):
              event['is_panther_employee'] = True
          else:
              event['is_panther_employee'] = False
          return event
For ease of understanding, the above parse function is shown below with Python syntax highlighting:
def parse(log):
    event = json.decode(log)
    if event['actor']['email'].endswith('@panther.com'):
        event['is_panther_employee'] = True
    else:
        event['is_panther_employee'] = False
    return event
Example using script
Imagine the following log line, using the Apache Common Log format, is sent to Panther:
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
To parse this log type using script, we'll define the following function:
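def parse(log):
    fields = log.split(" ")
    return {
        'remote_ip': fields[0],
        'identity': fields[1],
        'user': fields[2],
        # The timestamp spans two fields: "[10/Oct/2000:13:55:36" and "-0700]"
        'timestamp': ' '.join(fields[3:5]).strip('[]'),
        # The quoted request line splits into method, URI, and protocol
        'method': fields[5].strip('"'),
        'request_uri': fields[6],
        'protocol': fields[7].strip('"'),
        'status': int(fields[8]),
        'bytes_sent': int(fields[9])
    }
And use the following schema fields: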
fields:
  - name: remote_ip
    type: string
    indicators:
      - ip
  - name: identity
    type: string
  - name: user
    type: string
  - name: timestamp
    type: timestamp
    isEventTime: true
    timeFormats:
      - '%d/%b/%Y:%H:%M:%S %z'
  - name: method
    type: string
  - name: request_uri
    type: string
  - name: protocol
    type: string
  - name: status
    type: int
  - name: bytes_sent
    type: bigint
After the log above is normalized with this parser, it becomes:
{
  "bytes_sent": 2326,
  "identity": "-",
  "method": "GET",
  "protocol": "HTTP/1.0",
  "remote_ip": "127.0.0.1",
  "request_uri": "/apache_pb.gif",
  "status": 200,
  "timestamp": "2000-10-10 20:55:36.000000000",
  "user": "frank"
}

