CSV Log Parser
Overview
The csv log parser converts each row of a CSV file into a simple JSON object that maps column names (keys) to values. To do this, each column must be given a name.
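As a rough illustration of that mapping, here is a minimal Python sketch. It is not Panther's implementation; the `row_to_object` helper and the sample column names are hypothetical and exist only to show how each value is paired with the name at the same position.

```python
import csv
import io
import json


def row_to_object(columns, row):
    """Pair each column name with the value at the same position in the row."""
    return dict(zip(columns, row))


# Hypothetical column names and a single sample row.
columns = ["action", "ip_address", "message"]
sample = 'SEND,192.168.1.3,"PING"\n'

for row in csv.reader(io.StringIO(sample)):
    # Each parsed row becomes one flat JSON object.
    print(json.dumps(row_to_object(columns, row)))
```

Without a name for every column there is no key to store a value under, which is why the parser needs column names from either the configuration or a header row.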
CSV logs without header
To parse CSV logs without a header row, Panther needs to know which names to assign to each column.
Let's assume our logs are CSV with 7 columns: year, month, day, time, action, ip_address, message. Some example rows of this file could be:
# Access logs for 20200901
2020,09,01,10:35:23, SEND ,192.168.1.3,"PING"
2020,09,01,10:35:25, RECV ,192.168.1.3,"PONG"
2020,09,01,10:35:25, RESTART ,-,"System restarts"
We would use the following LogSchema to define the log type.
In the Panther Console, we would follow the "How to create a custom schema manually" instructions and select the CSV parser.

In the Fields & Indicators section (below the Parser section shown in the screenshot above), we would define the fields:
fields:
  - name: timestamp
    type: timestamp
    timeFormats:
      - rfc3339
    isEventTime: true
    required: true
  - name: action
    type: string
    required: true
  - name: ip_address
    type: string
    indicators: [ip]
  - name: message
    type: string
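The sample rows have seven columns, while the schema declares four fields, including a single `timestamp` in rfc3339 format. The Python sketch below shows one plausible way the rows could be reduced to those fields. It is an illustration only, not Panther's parser or its configuration: skipping lines that start with `# `, trimming spaces around values, treating `-` as an empty value, and assuming the time is UTC are all assumptions made for this example, and `parse_headerless` is a hypothetical helper.

```python
import csv

COLUMNS = ["year", "month", "day", "time", "action", "ip_address", "message"]

SAMPLE = '''# Access logs for 20200901
2020,09,01,10:35:23, SEND ,192.168.1.3,"PING"
2020,09,01,10:35:25, RECV ,192.168.1.3,"PONG"
2020,09,01,10:35:25, RESTART ,-,"System restarts"
'''


def parse_headerless(text, columns, comment_prefix="# ", empty_values=("-",)):
    """Turn header-less CSV text into a list of event dicts (hypothetical helper)."""
    # Assumption: lines starting with the comment prefix are not data rows.
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith(comment_prefix)]
    events = []
    for row in csv.reader(lines):
        # Name each value by position and trim surrounding spaces (" SEND " -> "SEND").
        record = {name: value.strip() for name, value in zip(columns, row)}
        # Assumption: the date and time columns are UTC and join into an RFC 3339 string.
        timestamp = "{year}-{month}-{day}T{time}Z".format(**record)
        event = {"timestamp": timestamp}
        for name in ("action", "ip_address", "message"):
            # Assumption: placeholder values such as "-" mean "no value" and are left out.
            if record[name] not in empty_values:
                event[name] = record[name]
        events.append(event)
    return events


events = parse_headerless(SAMPLE, COLUMNS)
for event in events:
    print(event)
```

In this sketch the third row produces an event without `ip_address`, which is consistent with that field not being marked `required` in the schema.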
CSV logs with header
To parse CSV files that start with a header row, there are two options:
Use the names defined in the header as the names for the JSON fields
If your CSV files have a header and you do not explicitly define columns (letting them instead be defined by the header), do not combine this schema in the same log source (or single S3 prefix) with other schemas. Doing so could cause logs to be improperly classified.
Disregard the header and define column names explicitly the same way you would for header-less CSV files
To use the names in the header, the parser configuration should be:
parser:
  csv:
    delimiter: ","
    # Setting 'hasHeader' to true without specifying a 'columns' field,
    # tells Panther to set the column names from values in the header.
    hasHeader: true
    # In case you want to rename a column you can use the 'expandFields' directive
    expandFields:
      # Let's assume that the header contains '$cost' as column name and you want to 'normalize' it as 'cost_us_dollars'
      "cost_us_dollars": '%{$cost}'
To ignore the header and define your own names for the columns, use:
parser:
  csv:
    delimiter: ","
    # Setting 'hasHeader' to true while also specifying a 'columns' field,
    # tells Panther to ignore the header and use the names in the 'columns' array
    hasHeader: true
    columns:
      - foo
      - bar
      - baz
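The difference between the two header options can be sketched in Python as follows. This is an illustration under stated assumptions, not Panther's implementation: `parse_using_header`, `parse_ignoring_header`, and the sample data are hypothetical, and the sketch keeps the original `$cost` key next to the renamed one, which the configuration above does not confirm either way.

```python
import csv
import io

# Hypothetical CSV input whose first row is a header.
SAMPLE = "item,$cost,region\nwidget,9.99,us-east\n"


def parse_using_header(text, renames=None):
    """Option 1: take column names from the header row, then add renamed fields."""
    events = []
    for record in csv.DictReader(io.StringIO(text)):
        event = dict(record)
        # Copy selected columns under a normalized name, e.g. '$cost' -> 'cost_us_dollars'.
        for new_name, old_name in (renames or {}).items():
            event[new_name] = record[old_name]
        events.append(event)
    return events


def parse_ignoring_header(text, columns):
    """Option 2: discard the header row and name the columns explicitly."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the header row
    return [dict(zip(columns, row)) for row in reader]


from_header = parse_using_header(SAMPLE, renames={"cost_us_dollars": "$cost"})
explicit = parse_ignoring_header(SAMPLE, ["foo", "bar", "baz"])
print(from_header)
print(explicit)
```

In the first case the keys depend entirely on the file's header; in the second the header is read only so that it is not mistaken for a data row.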