# Script Log Parser

## Overview

`script` is one of the possible values of the [`parser` key](/data-onboarding/custom-log-types/reference.md#parserspec) in a custom log schema. With this parser, you specify the transformations Panther should perform on each incoming log event using the [Starlark configuration language](https://bazel.build/rules/language), whose syntax closely resembles Python's. The `script` parser can handle both structured (JSON) and unstructured events.

You might benefit from using the `script` parser when you'd like to:

* Parse unstructured logs that the other parser options ([`csv`](/data-onboarding/custom-log-types/csv-parser.md), [`fastmatch`](/data-onboarding/custom-log-types/fastmatch-parser.md), [`regex`](/data-onboarding/custom-log-types/regex-parser.md)) cannot handle
* Perform transformations on the data that the Panther-provided [schema transformations](/data-onboarding/custom-log-types/transformations.md) do not cover

## Understanding the `script` parser

### Defining a `function`

When using the `script` parser, you must implement a Starlark `function`. The function takes in a [string](https://github.com/google/starlark-go/blob/master/doc/spec.md#strings) and must return a non-empty [dictionary](https://github.com/google/starlark-go/blob/master/doc/spec.md#dictionaries). The returned dictionary defines the format of the output event.
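
As a minimal sketch of this contract, the function below turns a space-separated `key=value` line into a dictionary. The log format and the `raw_message` fallback field are assumptions made for illustration, not Panther conventions; the function name `parse` follows the examples elsewhere on this page.

```python
def parse(log):
  # Input: one raw log event as a string, e.g. "user=alice action=login"
  event = {}
  for pair in log.split(" "):
    if "=" in pair:
      key, value = pair.split("=", 1)
      event[key] = value
  # The returned dictionary must not be empty, so keep the raw text
  # under a hypothetical field when no key=value pair was found.
  if len(event) == 0:
    event["raw_message"] = log
  return event
```

Each key in the returned dictionary would then need a matching entry under `fields` in the schema.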

### Available functions

The `script` parser can use any of the primitives described in the [Starlark specification](https://github.com/google/starlark-go/blob/master/doc/spec.md). Additionally, you can use the following functions:

<table><thead><tr><th width="177.61370849609375">Function name</th><th>Description</th></tr></thead><tbody><tr><td>json.decode</td><td>Decodes a JSON string to a dictionary</td></tr><tr><td>json.encode</td><td>Encodes a dictionary to a JSON string</td></tr><tr><td>base64.decode</td><td>Decodes a base64-encoded string</td></tr><tr><td>base64.encode</td><td>Performs base64 encoding on a string</td></tr></tbody></table>
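
To illustrate how these helpers might be combined, the sketch below decodes a JSON event and then base64-decodes one of its fields. The `payload` and `payload_decoded` field names are hypothetical. The first lines are plain-Python stand-ins for the preloaded `json` and `base64` modules so the example can be run locally; they encode an assumption about the helpers' behavior (in particular, that `base64.decode` returns a string) and are not Panther's implementation. Only the `parse` function would go in a schema.

```python
# Local stand-ins for the preloaded modules (assumption, for local runs only).
import base64 as _py_base64
import json as _py_json
from types import SimpleNamespace

json = SimpleNamespace(decode=_py_json.loads, encode=_py_json.dumps)
base64 = SimpleNamespace(
  decode=lambda s: _py_base64.b64decode(s).decode("utf-8"),
  encode=lambda s: _py_base64.b64encode(s.encode("utf-8")).decode("ascii"),
)

def parse(log):
  event = json.decode(log)
  # `payload` is a hypothetical base64-encoded field in the incoming event.
  if 'payload' in event:
    event['payload_decoded'] = base64.decode(event['payload'])
  return event
```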

### Restrictions

The following restrictions apply to your script:

* Raising exceptions is not allowed.
* Imports are not allowed.
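
Since a script cannot signal bad input by raising an exception, one option is to check the input's shape before indexing into it. The sketch below assumes a hypothetical space-separated format of `<host> <level> <message>`; the `raw_message` fallback field is an illustrative choice, not a Panther convention.

```python
def parse(log):
  # Split into at most three parts so the message keeps its spaces.
  parts = log.split(" ", 2)
  if len(parts) < 3:
    # Too few fields: return the raw text instead of indexing past the end.
    return {'raw_message': log}
  return {
    'host': parts[0],
    'level': parts[1],
    'message': parts[2],
  }
```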

### Handling JSON

While `script` is mainly intended for text logs, it can also be used for JSON logs when you want to perform transformations beyond [those natively supported by Panther](/data-onboarding/custom-log-types/transformations.md). For this reason, the `script` parser comes pre-loaded with a `json` module that lets you decode a JSON string into a dictionary.

For example, the following configuration creates a new field called `is_panther_employee`, which is `true` if the actor's email address ends with `@panther.com` and `false` otherwise.

```yaml
parser:
  script:
    function: |
      def parse(log):
        event = json.decode(log)
        if event['actor']['email'].endswith('@panther.com'):
          event['is_panther_employee'] = True
        else:
          event['is_panther_employee'] = False
        return event
```

For ease of understanding, the above `parse` function is shown below with Python syntax highlighting:

```python
def parse(log):
  event = json.decode(log)
  if event['actor']['email'].endswith('@panther.com'):
    event['is_panther_employee'] = True
  else:
    event['is_panther_employee'] = False
  return event
```
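
The example above assumes every event contains `actor.email`. As a sketch of one way to tolerate events without it, the hypothetical helper below uses `dict.get` with defaults; inside a `script` parser, `parse` would call it on the result of `json.decode(log)`. Treating a missing email as a non-employee is an assumption made for illustration.

```python
def add_employee_flag(event):
  # Fall back to an empty dict / empty string when a key is absent.
  actor = event.get('actor', {})
  email = actor.get('email', '')
  # endswith already returns a boolean, so no if/else is needed.
  event['is_panther_employee'] = email.endswith('@panther.com')
  return event
```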

## Example using `script`

Imagine the following log line, using the Apache Common Log format, is sent to Panther:

```
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
```

To parse this log type using `script`, we'll define the following function:

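```python
def parse(log):
  # Splitting on spaces yields ten fields for the sample line
  fields = log.split(" ")
  return {
    'remote_ip': fields[0],
    'identity': fields[1],
    'user': fields[2],
    # Rejoin the date and UTC offset, then drop the surrounding brackets
    'timestamp': ' '.join(fields[3:5]).strip('[]'),
    # The quoted request line holds the method, URI, and protocol
    'method': fields[5].strip('"'),
    'request_uri': fields[6],
    'protocol': fields[7].strip('"'),
    'status': int(fields[8]),
    'bytes_sent': int(fields[9])
  }
```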

And use the following schema fields:

```yaml
fields:
  - name: remote_ip
    type: string
    indicators:
      - ip
  - name: identity
    type: string
  - name: user
    type: string
  - name: timestamp
    type: timestamp
    isEventTime: true
    timeFormats:
      - '%d/%b/%Y:%H:%M:%S %z'
  - name: method
    type: string
  - name: request_uri
    type: string
  - name: protocol
    type: string
  - name: status
    type: int
  - name: bytes_sent
    type: bigint
```

After the log above is normalized with this parser, it becomes:

```json
{
    "bytes_sent": 2326,
    "identity": "-",
    "method": "GET",
    "protocol": "HTTP/1.0",
    "remote_ip": "127.0.0.1",
    "request_uri": "/apache_pb.gif",
    "status": 200,
    "timestamp": "2000-10-10 20:55:36.000000000",
    "user": "frank"
}
```
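
This example assumes the size field is always numeric. In Common Log Format the size is written as `-` when no bytes were returned, and `int('-')` would fail. The variant below is a sketch that leaves out `bytes_sent` in that case; omitting the field rather than setting it to `0` is an assumption, not documented Panther behavior.

```python
def parse(log):
  fields = log.split(" ")
  event = {
    'remote_ip': fields[0],
    'identity': fields[1],
    'user': fields[2],
    'timestamp': ' '.join(fields[3:5]).strip('[]'),
    'method': fields[5].strip('"'),
    'request_uri': fields[6],
    'protocol': fields[7].strip('"'),
    'status': int(fields[8]),
  }
  # Only convert the size when it is numeric; "-" means no bytes were sent.
  if fields[9].isdigit():
    event['bytes_sent'] = int(fields[9])
  return event
```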

