# Fastmatch Log Parser

## Overview

The `fastmatch` parser uses simple string patterns that specify the position of fields within a log line. As the name suggests, it is very fast and should be the preferred method for parsing text logs. It handles most structured text logs where the order of fields is known. It is fast enough that you can specify multiple patterns, which are tested in order, to cover logs whose line structure has a few variations.

### Example using fastmatch

We will use the following example log line, which is in Apache Common Log format:

```
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
```

Here is how we would define a log schema for this log type using `fastmatch`:

{% tabs %}
{% tab title="Console " %}
In the Panther Console, we would follow the [How to create a custom schema manually instructions](https://docs.panther.com/data-onboarding/custom-log-types/..#how-to-create-a-custom-schema-manually), selecting the **FastMatch** parser.

<figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-154ae17854fc2eb08a2bd19c5349e4c463939557%2Fimage.png?alt=media" alt="In a &#x22;Schema&#x22; section, &#x22;FastMatch&#x22; is selected for a Parser field. There are various form fields shown, such as Match Patterns, Empty Values, and Skip Prefix."><figcaption></figcaption></figure>

In the **Fields & Indicators** section (below the **Parser** section shown in the screenshot above), we would define the fields:

```yaml
fields:
  - name: remote_ip
    type: string
    indicators:
      - ip
  - name: identity
    type: string
  - name: user
    type: string
  - name: timestamp
    type: timestamp
    isEventTime: true
    timeFormats:
      - '%d/%b/%Y:%H:%M:%S %z'
  - name: method
    type: string
  - name: request_uri
    type: string
  - name: protocol
    type: string
  - name: status
    type: int
  - name: bytes_sent
    type: bigint
```

{% endtab %}

{% tab title="Full YAML representation" %}

```yaml
parser:
  fastmatch:
    # Define an array of patterns to match against.
    # In this example we only use one pattern because the log format is the same for all lines.
    # If we wanted to include the Apache Extended Log format, we could provide an additional pattern.
    match:
      - '%{remote_ip} %{identity} %{user} [%{timestamp}] "%{method} %{request_uri} %{protocol}" %{status} %{bytes_sent}'
    emptyValues: [ '-' ] # specify that `-` string values are considered null
fields:
  - name: remote_ip
    type: string
    indicators:
      - ip
  - name: identity
    type: string
  - name: user
    type: string
  - name: timestamp
    type: timestamp
    isEventTime: true
    timeFormats:
      - '%d/%b/%Y:%H:%M:%S %z'
  - name: method
    type: string
  - name: request_uri
    type: string
  - name: protocol
    type: string
  - name: status
    type: int
  - name: bytes_sent
    type: bigint
```

{% endtab %}
{% endtabs %}
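
When a source emits more than one line layout, `match` can list several patterns. The sketch below is an illustration rather than a tested configuration. It assumes the same server also writes lines in Apache's Combined Log Format, which appends a quoted referer and user agent, and that two extra string fields named `referer` and `user_agent` (hypothetical names) have been added to `fields`. The longer pattern is listed first so that a Combined line is not claimed by the shorter pattern, whose last field would otherwise capture the rest of the line.

```yaml
parser:
  fastmatch:
    match:
      # Tried first: Combined Log Format (assumes extra fields `referer` and `user_agent`)
      - '%{remote_ip} %{identity} %{user} [%{timestamp}] "%{method} %{request_uri} %{protocol}" %{status} %{bytes_sent} "%{referer}" "%{user_agent}"'
      # Tried second: Common Log Format, as in the example above
      - '%{remote_ip} %{identity} %{user} [%{timestamp}] "%{method} %{request_uri} %{protocol}" %{status} %{bytes_sent}'
    emptyValues: [ '-' ]
```

A Common Log line has no `"` after the byte count, so it should fail the first pattern and be handled by the second.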

## Understanding fastmatch patterns

Patterns use `%{field_name}` placeholders to mark where in the log line each field is expected. For example, to match this text:

```
2020-10-10T14:32:05 [FOO_SERVICE@127.0.0.1] [DEBUG] Something when wrong
```

We can use this pattern (surrounded by single quotes for clarity):

```yaml
'%{timestamp} [%{service}@%{ip}] [%{log_level}] %{message}'
```

If you are defining a schema in the Panther Console, you will input your patterns into the **Match Patterns** field:

<figure><img src="https://4011785613-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LgdiSWdyJcXPahGi9Rs-2910905616%2Fuploads%2Fgit-blob-59d206a2b4880541d062670cff1c067f18ea2968%2Fimage.png?alt=media" alt="A &#x22;Match Patterns&#x22; field is shown. Its description is, &#x22;Define patterns to match log entries and extract structured information from unstructured log lines.&#x22;"><figcaption></figcaption></figure>
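
Put into a complete schema, the pattern above might look like the following sketch. The field types, the `ip` indicator, and the time format are assumptions chosen to fit the sample line; they are not taken from a real log source.

```yaml
parser:
  fastmatch:
    match:
      - '%{timestamp} [%{service}@%{ip}] [%{log_level}] %{message}'
fields:
  - name: timestamp
    type: timestamp
    isEventTime: true
    timeFormats:
      - '%Y-%m-%dT%H:%M:%S' # assumed layout for values such as 2020-10-10T14:32:05
  - name: service
    type: string
  - name: ip
    type: string
    indicators:
      - ip
  - name: log_level
    type: string
  - name: message
    type: string
```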

### Delimiters

The text between two consecutive fields defines the 'delimiter' between them. Delimiters cannot be empty.

In the pattern in the example above, we cannot omit the `"@"` between `service` and `ip`.

The field *preceding* a delimiter cannot contain the delimiter text. In the example above:

* `timestamp` cannot contain `" ["` (a space followed by an opening bracket)
* `service` cannot contain `"@"`
* `ip` cannot contain `"] ["`
* `log_level` cannot contain `"] "`
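
Because a delimiter is the whole literal text between two placeholders, a longer delimiter lets the preceding field contain characters that a shorter one would split on. The following sketch uses a made-up log line and hypothetical field names, and assumes delimiters behave exactly as described above:

```yaml
# Hypothetical log line:
#   10:00:01 user=alice smith action=login
match:
  # The delimiter after `user` is ' action=' rather than a single space,
  # so under the rules above `user` may contain spaces ('alice smith').
  - '%{time} user=%{user} action=%{action}'
```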

### Anonymous fields

Field placeholders without names (`%{}`) are ignored.
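
As an illustration, the Apache Common Log pattern from the first example could drop the `identity` value by leaving its placeholder unnamed. This is a sketch; it assumes the `identity` entry is also removed from `fields`.

```yaml
parser:
  fastmatch:
    match:
      # `%{}` still occupies the second position in the line, but its value is not kept
      - '%{remote_ip} %{} %{user} [%{timestamp}] "%{method} %{request_uri} %{protocol}" %{status} %{bytes_sent}'
    emptyValues: [ '-' ]
```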

### Tail capture

If the last field in a pattern does not have any delimiter text *after* it, it will capture *everything* until the end of the text. In the example above, `message` will capture `"Something when wrong"`.

### Handling quotes

In some cases fields can be quoted within the text:

```
2020-10-10T14:32:05 "Some quoted text with \"escaped quotes\" inside"
```

To unescape such fields properly, surround the field placeholder with quotes in the pattern:

```
%{timestamp} "%{message}"
```

This works for both single and double quotes.
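
When such a pattern is written in a YAML schema, the quotes inside the pattern have to coexist with YAML's own string quoting. A minimal sketch, assuming `timestamp` and `message` are defined under `fields`:

```yaml
parser:
  fastmatch:
    match:
      # Double-quoted field: wrap the whole pattern in YAML single quotes
      - '%{timestamp} "%{message}"'
      # Single-quoted field: wrap the whole pattern in YAML double quotes
      - "%{timestamp} '%{message}'"
```

For the sample line above, the first pattern should yield a `message` value of `Some quoted text with "escaped quotes" inside`, with the surrounding quotes removed and the inner ones unescaped.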
