
rex
Description
Use this command to either extract fields using regular expression named groups, or replace or substitute characters in a field using sed expressions.
The rex command matches the value of the specified field against the unanchored regular expression and extracts the named groups into fields of the corresponding names. If a field is not specified, the regular expression is applied to the _raw field.
Note: Running rex against the _raw field might have a performance impact.
When mode=sed, the given sed expression used to replace or substitute characters is applied to the value of the chosen field. If a field is not specified, the sed expression is applied to _raw. This sed-syntax is also used to mask sensitive data at index-time.
Use the rex command for search-time field extraction or string replacement and character substitution.
Syntax
rex [field=<field>] ( <regex-expression> [max_match=<int>] [offset_field=<string>] ) | (mode=sed <sed-expression>)
Required arguments
- regex-expression
- Syntax: "<string>"
- Description: The PCRE regular expression that defines the information to match and extract from the specified field. Quotation marks are required.
- mode
- Syntax: mode=sed
- Description: Specify to indicate that you are using a sed (UNIX stream editor) expression.
- sed-expression
- Syntax: "<string>"
- Description: When mode=sed, specify whether to replace strings (s) or substitute characters (y) in the matching regular expression. No other sed commands are implemented. Quotation marks are required. Sed mode supports the following flags: global (g) and Nth occurrence (N), where N is a number that is the character location in the string.
Optional arguments
- field
- Syntax: field=<field>
- Description: The field that you want to extract information from.
- Default: _raw
- max_match
- Syntax: max_match=<int>
- Description: Controls the number of times the regex is matched. If greater than 1, the resulting fields are multivalued fields.
- Default: 1, use 0 to mean unlimited.
- offset_field
- Syntax: offset_field=<string>
- Description: If provided, a field is created with the name specified by <string>. The value of this field has the endpoints of the match in terms of zero-offset characters into the matched field. For example, if the rex expression is "(?<tenchars>.{10})", this matches the first ten characters of the field, and the offset_field contents is "0-9".
- Default: unset
Sed expression
When using the rex command in sed mode, you have two options: replace (s) or character substitution (y).
The syntax for using sed to replace (s) text in your data is: "s/<regex>/<replacement>/<flags>"
- <regex> is a PCRE regular expression, which can include capturing groups.
- <replacement> is a string to replace the regex match. Use \n for backreferences, where "n" is a single digit.
- <flags> can be either: g to replace all matches, or a number to replace a specified match.
The syntax for using sed to substitute characters is: "y/<string1>/<string2>/"
- This substitutes the characters that match <string1> with the characters in <string2>.
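The two sed forms might be used as in the following sketch. The field names ccnumber and hostname are placeholders assumed for illustration.

```spl
... | rex field=ccnumber mode=sed "s/(\d{4}-){3}(\d{4})/XXXX-XXXX-XXXX-\2/"
... | rex field=hostname mode=sed "y/abc/ABC/"
```

The first line masks the first three groups of digits and uses the backreference \2 to keep the last four digits captured by the second group. The second line replaces each a, b, or c in the field with A, B, or C respectively.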
Usage
Splunk Enterprise uses Perl-compatible regular expressions (PCRE).
When you use regular expressions in searches, you need to be aware of how characters such as pipe ( | ) and backslash ( \ ) are handled. See SPL and regular expressions in the Search Manual.
For general information about regular expressions, see Splunk Enterprise regular expressions in the Knowledge Manager Manual.
Examples
Example 1:
Extract "from" and "to" fields using regular expressions. If a raw event contains "From: Susan To: Bob", then from=Susan and to=Bob.
... | rex field=_raw "From: (?<from>.*) To: (?<to>.*)"
Example 2:
Extract "user", "app" and "SavedSearchName" from a field called "savedsearch_id" in scheduler.log events. If savedsearch_id=bob;search;my_saved_search then user=bob, app=search and SavedSearchName=my_saved_search.
... | rex field=savedsearch_id "(?<user>\w+);(?<app>\w+);(?<SavedSearchName>\w+)"
Example 3:
Use sed syntax to match the regex to a series of numbers and replace them with an anonymized string.
... | rex field=ccnumber mode=sed "s/(\d{4}-){3}/XXXX-XXXX-XXXX-/g"
Example 4:
Display IP address and ports of potential attackers.
sourcetype=linux_secure port "failed password" | rex "\s+(?<ports>port \d+)" | top src_ip ports showperc=0
This search uses rex to extract the ports field and its values. Then, it displays a table of the top source IP addresses (src_ip) and ports returned by the search for potential attackers.
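A possible variant, sketched here under the assumption that only the numeric port is wanted, moves the literal text "port" outside the named group. The field name port is chosen for illustration.

```spl
sourcetype=linux_secure port "failed password" | rex "\s+port (?<port>\d+)" | top src_ip port showperc=0
```

With this pattern the extracted values contain only the digits, rather than the word "port" followed by the number.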
This documentation applies to the following versions of Splunk® Enterprise: 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.0.15, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.1.14, 6.2.0, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.2.14, 6.2.15
Comments
The syntax for using sed to substitute characters is: "y//"
Is this a typo? Should it be "y///" ?
Seems pretty important to get right since it is demonstrating the syntax.
Tinhuty, sed-mode only supports g and N flags. I've updated the topic. Thank you!
How to do case insensitive in the sed mode? I tried flag "I" or "i", neither is working.
Thanks, Netmonkey. That is definitely worth pointing out!
The example shown for the mode=sed operates on the whole _raw field.
... | rex mode=sed "s/(\\d{4}-){3}/XXXX-XXXX-XXXX-/g"
It should be noted that the field= can and, in most cases, should be used with mode=sed. Without it, regex on the whole _raw field may have a performance impact (depending on the size of your logged events).
i.e.
... | rex field=ccnumber mode=sed "s/(\\d{4}-){4}/XXXX-XXXX-XXXX-XXXX/g"
Actually, quotes are required around all regular and sed expressions. (It might work without them, in some cases, but...) Sorry for the confusion.
Regular expressions containing commas (as plain text or as repeat counts) must be quoted.
Splaktar, thanks for catching that typo!