Regex Log Parser
For text log types with more complex structure, you can use the regex parser.
The regex parser uses named groups in regular expressions to extract field values from each line of text. You can use grok syntax (i.e., %{PATTERN_NAME:field_name}) to build complex expressions, either taking advantage of the built-in patterns provided by Panther or defining your own.
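To make the grok syntax concrete, here is a minimal Python sketch of how a %{PATTERN_NAME:field_name} reference can be rewritten into a named capture group. It is an illustration only, not Panther's implementation: the `expand` helper, the three-entry `PATTERNS` dictionary, and the sample log line are all hypothetical.

```python
import re

# Hypothetical subset of pattern definitions, for illustration only.
PATTERNS = {
    "WORD": r"\b\w+\b",
    "INT": r"[+-]?(?:[0-9]+)",
    "NOTSPACE": r"\S+",
}

# Matches %{NAME} and %{NAME:field}.
GROK_REF = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def expand(expr: str) -> str:
    """Replace each grok reference in expr with the regex it stands for."""

    def repl(m: re.Match) -> str:
        name, field = m.group(1), m.group(2)
        body = PATTERNS[name]
        if field:
            # A field name turns the pattern into a named capture group.
            return f"(?P<{field}>{body})"
        # Without a field name the pattern only matches, it does not capture.
        return f"(?:{body})"

    return GROK_REF.sub(repl, expr)


raw = expand("%{WORD:user} logged in from port %{INT:port}")
match = re.match(raw, "alice logged in from port 2222")
fields = match.groupdict()
print(raw)
print(fields)
```

Each named group in the expanded expression becomes one extracted field, which is the behavior the paragraph above describes.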
Panther's log processor uses RE2 for regular expressions. RE2 does not support some operations common to other regular expression engines, such as lookbehind. Be sure to check any expressions or grok patterns you copy and paste from other systems.
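One way to check a copied expression before using it is to scan it for constructs that RE2 rejects. The sketch below is a heuristic written for this page, not a Panther tool: the function name `re2_incompatibilities` is hypothetical, and it only looks for lookahead, lookbehind, atomic groups, and backreferences. It does not replace compiling the pattern with an actual RE2 engine.

```python
def re2_incompatibilities(pattern: str) -> list[str]:
    """Return the names of constructs in pattern that RE2 does not support.

    Heuristic scan: it skips escaped characters and the contents of
    character classes, but it is not a full regex parser.
    """
    found = []
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            nxt = pattern[i + 1 : i + 2]
            # \1 .. \9 outside a character class is a backreference.
            if nxt.isdigit() and nxt != "0" and not in_class:
                found.append("backreference")
            i += 2  # skip the escaped character
            continue
        if in_class:
            in_class = ch != "]"
        elif ch == "[":
            in_class = True
        elif pattern.startswith(("(?=", "(?!"), i):
            found.append("lookahead")
        elif pattern.startswith(("(?<=", "(?<!"), i):
            found.append("lookbehind")
        elif pattern.startswith("(?>", i):
            found.append("atomic group")
        i += 1
    return found


print(re2_incompatibilities(r"(?<!\d)\d{3}"))
print(re2_incompatibilities(r"(?P<user>\w+)@(?:[a-z]+)"))
```

Named groups such as `(?P<user>...)` and non-capturing groups are not flagged, since RE2 accepts them.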
For example, to match the text
We can use this grok syntax with this pattern:
Which is the rough equivalent of this 'raw' regular expression:
Using the regex parser, we will define a log type for Juniper.Audit logs. Panther already supports these logs, but we use them here because they come in several conflicting forms that can only be handled with the regex parser.
The sample logs for Juniper.Audit are:
Here is how we would define a log schema for these logs using regex:
In the Fields & Indicators section (below the Parser section shown in the screenshot above), we would define the fields:
The following tables detail the built-in Panther regex patterns you can use.
DATA                .*?
GREEDYDATA          .*
NOTSPACE            \S+
SPACE               \s*
WORD                \b\w+\b
QUOTEDSTRING        "(?:\\.|[^\\"]+)+"|""|'(?:\\.|[^\\']+)+'|''
HEXDIGIT            [0-9a-fA-F]
UUID                %{HEXDIGIT}{8}-(?:%{HEXDIGIT}{4}-){3}%{HEXDIGIT}{12}
INT                 [+-]?(?:[0-9]+)
BASE10NUM           [+-]?(?:[0-9]+(?:\.[0-9]+)?)|\.[0-9]+
NUMBER              %{BASE10NUM}
BASE16NUM           (?:0[xX])?%{HEXDIGIT}+
POSINT              \b[1-9][0-9]*\b
NONNEGINT           \b[0-9]+\b
CISCOMAC            (?:[A-Fa-f0-9]{4}\.){2}[A-Fa-f0-9]{4}
WINDOWSMAC          (?:[A-Fa-f0-9]{2}-){5}[A-Fa-f0-9]{2}
COMMONMAC           (?:[A-Fa-f0-9]{2}:){5}[A-Fa-f0-9]{2}
MAC                 %{CISCOMAC}|%{WINDOWSMAC}|%{COMMONMAC}
IPV6                \b(?:(?:(?:%{HEXDIGIT}{1,4}:){7}(?:%{HEXDIGIT}{1,4}|:))|(?:(?:%{HEXDIGIT}{1,4}:){6}(?::%{HEXDIGIT}{1,4}|(?:(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(?:\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(?:(?:%{HEXDIGIT}{1,4}:){5}(?:(?:(?::%{HEXDIGIT}{1,4}){1,2})|:(?:(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(?:\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|((%{HEXDIGIT}{1,4}:){4}(((:%{HEXDIGIT}{1,4}){1,3})|((:%{HEXDIGIT}{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|((%{HEXDIGIT}{1,4}:){3}(((:%{HEXDIGIT}{1,4}){1,4})|((:%{HEXDIGIT}{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|((%{HEXDIGIT}{1,4}:){2}(((:%{HEXDIGIT}{1,4}){1,5})|((:%{HEXDIGIT}{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|((%{HEXDIGIT}{1,4}:){1}(((:%{HEXDIGIT}{1,4}){1,6})|((:%{HEXDIGIT}{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:%{HEXDIGIT}{1,4}){1,7})|((:%{HEXDIGIT}{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?\b
IPV4INT             25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]
IPV4                \b(?:(?:%{IPV4INT})\.){3}(?:%{IPV4INT})\b
IP                  %{IPV6}|%{IPV4}
HOSTNAME            \b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b)
IPORHOST            %{IP}|%{HOSTNAME}
HOSTPORT            %{IPORHOST}:%{POSINT}
USERNAME            [a-zA-Z0-9._-]+
UNIXPATH            (?:/[\w_%!$@:.,-]?/?)(\S+)?
WINPATH             (?:[A-Za-z]:|\\)(?:\\[^\\?*]*)+
PATH                (?:%{UNIXPATH}|%{WINPATH})
TTY                 (?:/dev/(pts|tty([pq])?)(\w+)?/?(?:[0-9]+))
URIPROTO            [A-Za-z]+(?:\+[A-Za-z+]+)?
URIHOST             %{IPORHOST}(?::%{POSINT})?
URIPATH             (?:/[A-Za-z0-9$.+!*'(){},~:;=@#%_-]*)+
URIPARAM            \?[A-Za-z0-9$.+!*'|(){},~@#%&/=:;_?\-\[\]<>]*
URIPATHPARAM        %{URIPATH}(?:%{URIPARAM})?
URI                 %{URIPROTO}://(?:%{USER}(?::[^@]*)?@)?(?:%{URIHOST})?(?:%{URIPATHPARAM})?
MONTH               \b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\b
MONTHNUM            0?[1-9]|1[0-2]
MONTHNUM2           0[1-9]|1[0-2]
MONTHDAY            (?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]
DAY                 \b(?:Mon(?:day)?|Tue(?:sday)?|Wed(?:nesday)?|Thu(?:rsday)?|Fri(?:day)?|Sat(?:urday)?|Sun(?:day)?)\b
YEAR                (?:\d\d){1,2}
HOUR                2[0123]|[01]?[0-9]
MINUTE              [0-5][0-9]
SECOND              (?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?
KITCHEN             %{HOUR}:%{MINUTE}
TIME                %{HOUR}:%{MINUTE}:%{SECOND}
DATE_US             %{MONTHNUM}[/-]%{MONTHDAY}[/-]%{YEAR}
DATE_EU             %{MONTHDAY}[./-]%{MONTHNUM}[./-]%{YEAR}
ISO8601_TIMEZONE    (?:Z|[+-]%{HOUR}(?::?%{MINUTE}))
ISO8601_SECOND      (?:%{SECOND}|60)
TIMESTAMP_ISO8601   %{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T ]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}?
DATE                %{DATE_US}|%{DATE_EU}
DATETIME            %{DATE}[- ]%{TIME}
TZ                  [A-Z]{3}
TZOFFSET            [+-]\d{4}
TIMESTAMP_RFC822    %{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} %{TZ}
TIMESTAMP_RFC2822   %{DAY}, %{MONTHDAY} %{MONTH} %{YEAR} %{TIME} %{ISO8601_TIMEZONE}
TIMESTAMP_OTHER     %{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{TZ} %{YEAR}
TIMESTAMP_EVENTLOG  %{YEAR}%{MONTHNUM2}%{MONTHDAY}%{HOUR}%{MINUTE}%{SECOND}
SYSLOGTIMESTAMP     %{MONTH} +%{MONTHDAY} %{TIME}
HTTPDATE            %{MONTHDAY}/%{MONTH}/%{YEAR}:%{TIME} %{TZOFFSET}
NS                  NOTSPACE
QS                  QUOTEDSTRING
HOST                HOSTNAME
PID                 POSINT
USER                USERNAME
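Many of the patterns listed above are built from other patterns (for example, SYSLOGTIMESTAMP refers to MONTH, MONTHDAY, and TIME), and the short names in the last table map to longer ones. The following Python sketch shows one way such nested references and aliases could be resolved. It is an assumption-laden illustration rather than Panther's implementation: the `expand` helper, the way aliases are stored, the depth limit, and the sample log line are hypothetical, while the pattern bodies are copied from the tables.

```python
import re

# Pattern bodies copied from the tables above (subset).
PATTERNS = {
    "MONTH": r"\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?"
             r"|July?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?"
             r"|Dec(?:ember)?)\b",
    "MONTHDAY": r"(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]",
    "HOUR": r"2[0123]|[01]?[0-9]",
    "MINUTE": r"[0-5][0-9]",
    "SECOND": r"(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?",
    "TIME": "%{HOUR}:%{MINUTE}:%{SECOND}",
    "SYSLOGTIMESTAMP": "%{MONTH} +%{MONTHDAY} %{TIME}",
    "NOTSPACE": r"\S+",
    "GREEDYDATA": r".*",
}

# Short names from the last table; storing them this way is an assumption.
ALIASES = {"NS": "NOTSPACE"}

GROK_REF = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def expand(expr: str, depth: int = 0) -> str:
    """Recursively replace %{NAME} and %{NAME:field} references."""
    if depth > 10:  # arbitrary guard against circular definitions
        raise ValueError("pattern nesting too deep")

    def repl(m: re.Match) -> str:
        name, field = m.group(1), m.group(2)
        name = ALIASES.get(name, name)
        body = expand(PATTERNS[name], depth + 1)
        # Always wrap the body in a group so that a top-level alternation
        # such as the one in MONTHDAY stays contained.
        return f"(?P<{field}>{body})" if field else f"(?:{body})"

    return GROK_REF.sub(repl, expr)


raw = expand("%{SYSLOGTIMESTAMP:timestamp} %{NS:host} %{GREEDYDATA:message}")
line = "Mar  5 14:02:33 fw01 login failed"  # hypothetical log line
fields = re.fullmatch(raw, line).groupdict()
print(fields)
```

Wrapping every expanded reference in a group matters because several built-in patterns contain a top-level `|`; without the group, the alternation would leak into the surrounding expression.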
In the Panther Console, we would follow the steps for creating a custom schema, selecting the Regex parser.