SPL and regular expressions
Regular expressions in the Splunk Search Processing Language (SPL) are Perl Compatible Regular Expressions (PCRE).
You can use regular expressions with the rex and regex commands. You can also use regular expressions with evaluation functions such as match and replace. See Evaluation functions in the Search Manual.
The following sections provide guidance on regular expressions in SPL searches.
Pipe characters
A pipe character ( | ) is used in regular expressions to specify an OR condition. For example, A | B means A or B.
Because pipe characters are used to separate commands in SPL, you must enclose a regular expression that uses the pipe character in quotation marks. The following search encloses the pipe character in quotation marks, so SPL passes it to the regex command as part of a regular expression that matches the text "expression" OR "with pipe".
...|regex "expression | with pipe"
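As a minimal, self-contained sketch of the same idea, the following search builds one test event with makeresults and keeps it only if the message field matches either alternative. The message field and its value are hypothetical and used only for illustration. Spaces inside the quotation marks are part of the pattern, so this sketch omits the spaces around the pipe character.

```spl
| makeresults
| eval message="a regular expression with pipe characters"
| regex message="expression|with pipe"
```

Because the regular expression is enclosed in quotation marks, the pipe character inside it is not treated as a command separator.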
Backslash characters in regular expressions
The backslash character ( \ ) is used in regular expressions to escape any special characters that have meaning in regular expressions, such as periods ( . ), double quotation marks ( " ), and backslashes themselves. For example, the period character is used in a regular expression to match any character, except a line break character. If you want to match a period character, you must escape the period character by specifying \. in your regular expression.
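The difference between an escaped and an unescaped period can be sketched with the match evaluation function. The field names and values here are hypothetical and chosen only for illustration.

```spl
| makeresults
| eval lookalike="reportXtxt"
| eval escaped=if(match(lookalike, "report\.txt"), "match", "no match")
| eval unescaped=if(match(lookalike, "report.txt"), "match", "no match")
```

In this sketch, the unescaped pattern should match because the period stands for any character, while the escaped pattern should not match because the value contains no literal period.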
In searches that include a regular expression that contains a double backslash, like the file path c:\\temp, the search interprets the first backslash as a regular expression escape character. The file path is interpreted as c:\temp, because one of the backslashes is removed. You must escape both backslash characters in the file path by specifying 4 consecutive backslashes for the root portion of the file path, such as c:\\\\temp. For a longer file path, such as c:\\temp\example, you can specify c:\\\\temp\\example in your regular expression in the search string.
One reason you might need extra escaping backslashes in your searches is that the Splunk platform parses text twice: once for SPL and then again for regular expressions. Each parse applies its own use of backslashes in layers and treats each backslash as a special character that needs an additional backslash to make it literal. As a result, \\ in SPL becomes \ before it is parsed as a regular expression, and \\\\ in SPL becomes \\ before it is parsed as a regular expression.
See Backslashes in the Search Manual.
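The two parsing layers can be traced with a small sketch. The path and folder field names are hypothetical. Assuming the eval string "c:\\temp" produces the value c:\temp, the 4 backslashes in the rex string become \\ after the SPL parse, which the regular expression then reads as one literal backslash.

```spl
| makeresults
| eval path="c:\\temp"
| rex field=path "c:\\\\(?<folder>.*)"
```

If the layers behave as described in this section, the folder field should contain temp.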
Avoid extra escaping backslash characters
To avoid using extra escaping backslashes in your searches, you can use the octal code \134 or the hexadecimal code \x5c in your regular expression. These codes are equivalent to the backslash character and remove the need to double-escape backslashes. For example, in the following search the eval string uses 2 backslashes to produce a single backslash in the example field, and the rex command extracts the characters ABC that follow it:
| makeresults
| eval example="xyz\\ABC"
| rex field=example max_match=3 ".*\\\(?<extract>.*)"
The search results look something like this:
_time | example | extract |
---|---|---|
2023-09-20 17:20:59 | xyz\ABC | ABC |
Instead of using 3 backslashes, you can get the same search results using \x5c in the regular expression, like this:
| makeresults
| eval example="xyz\\ABC"
| rex field=example max_match=3 ".*\x5c(?<extract>.*)"
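The octal code can be sketched the same way. This variant is an assumption based on the equivalence described in this section, and simply swaps \x5c for \134 in the same search:

```spl
| makeresults
| eval example="xyz\\ABC"
| rex field=example max_match=3 ".*\134(?<extract>.*)"
```

If the octal code behaves like the hexadecimal code, the extract field should again contain ABC.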
More about regular expressions
For more information:
- See Extract fields using regular expressions
- See About Splunk regular expressions in the Knowledge Manager Manual.
This documentation applies to the following versions of Splunk Cloud Platform™: 8.2.2112, 8.2.2201, 8.2.2202, 8.2.2203, 9.0.2205, 9.0.2208, 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403, 9.2.2406 (latest FedRAMP release), 9.3.2408