Splunk Cloud

Search Reference

rex

Description

Use this command to either extract fields using regular expression named groups, or replace or substitute characters in a field using sed expressions.

The rex command matches the value of the specified field against the unanchored regular expression and extracts the named groups into fields of the corresponding names.

When mode=sed, the sed expression is applied to the value of the specified field to replace strings or substitute characters. The same sed syntax is used to mask sensitive data at index time. Read about using sed to anonymize data in the Getting Data In Manual.

If a field is not specified, the regular expression or sed expression is applied to the _raw field. Running the rex command against the _raw field might have a performance impact.

Use the rex command for search-time field extraction or string replacement and character substitution.

Syntax

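The required syntax is rex followed by either <regex-expression> or mode=sed <sed-expression>. Arguments in square brackets are optional.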

rex [field=<field>]
( <regex-expression> [max_match=<int>] [offset_field=<string>] ) | (mode=sed <sed-expression>)

Required arguments

You must specify either <regex-expression> or mode=sed <sed-expression>.

regex-expression
Syntax: "<string>"
Description: The PCRE regular expression that defines the information to match and extract from the specified field. Quotation marks are required.
mode
Syntax: mode=sed
Description: Specify to indicate that you are using a sed (UNIX stream editor) expression.
sed-expression
Syntax: "<string>"
Description: When mode=sed, specifies whether to replace strings (s) or substitute characters (y). No other sed commands are implemented. Quotation marks are required. Sed mode supports the following flags: global (g) and Nth occurrence (N), where N is a number that identifies which match to replace.

Optional arguments

field
Syntax: field=<field>
Description: The field that you want to extract information from.
Default: _raw
max_match
Syntax: max_match=<int>
Description: Controls the number of times the regex is matched. If greater than 1, the resulting fields are multivalued fields. Use 0 to specify unlimited matches. Multiple matches come from repeated application of the whole pattern. If your regex contains a capture group that can match multiple times within a single application of the pattern, only the last value captured by that group is kept for that match.
Default: 1
offset_field
Syntax: offset_field=<string>
Description: If provided, a field is created with the name specified by <string>. The value of this field contains the endpoints of the match, expressed as zero-based character offsets into the matched field. For example, if the rex expression is "(?<tenchars>.{10})", this matches the first ten characters of the field, and the offset_field contains "0-9".
Default: No default
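
The following search is a minimal sketch of how max_match and offset_field might be combined. The field names text, digit, and match_offsets, and the sample value, are hypothetical and are used only for illustration. The first rex applies a single-digit pattern repeatedly, so digit should become a multivalued field with one value per digit. The second rex records where the first ten characters were matched.

```spl
| makeresults
| eval text="0123456789abcdef"
| rex field=text max_match=0 "(?<digit>\d)"
| rex field=text "(?<tenchars>.{10})" offset_field=match_offsets
| table text digit tenchars match_offsets
```

By contrast, a pattern such as "(?<digit>\d)+" repeats the capture group inside one application of the pattern, so only the last digit that the group captured would be kept.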

Usage

The rex command is a distributable streaming command. See Command types.

rex command or regex command?

Use the rex command to either extract fields using regular expression named groups, or replace or substitute characters in a field using sed expressions.

Use the regex command to remove results that do not match the specified regular expression.
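
As a rough illustration of the difference, the following sketch builds three sample results with makeresults. The field names n, msg, and status are hypothetical. The rex command adds a status field to every result without removing any of them, while the regex command afterwards keeps only the results whose msg value matches the pattern.

```spl
| makeresults count=3
| streamstats count AS n
| eval msg="status=".(n * 100 + 100)
| rex field=msg "status=(?<status>\d{3})"
| regex msg="status=[45]\d{2}"
| table msg status
```

With these sample values (status=200, status=300, status=400), only the status=400 result should remain after the regex command.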

Regular expressions

Splunk SPL uses Perl-compatible regular expressions (PCRE).

When you use regular expressions in searches, you need to be aware of how characters such as pipe ( | ) and backslash ( \ ) are handled. See SPL and regular expressions in the Search Manual.

For general information about regular expressions, see Splunk Enterprise regular expressions in the Knowledge Manager Manual.

Sed expressions

When using the rex command in sed mode, you have two options: replace (s) or character substitution (y).

The syntax for using sed to replace (s) text in your data is: "s/<regex>/<replacement>/<flags>"

  • <regex> is a PCRE regular expression, which can include capturing groups.
  • <replacement> is a string to replace the regex match. Use \n for back references, where "n" is a single digit.
  • <flags> can be either g to replace all matches, or a number to replace a specified match.

The syntax for using sed to substitute characters is: "y/<string1>/<string2>/"

  • This replaces each character in <string1> with the character at the same position in <string2>.
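
Both sed forms can be sketched with makeresults. The field names name and code and their values are made up for illustration. The s expression uses two capturing groups and the back references \1 and \2 to swap their order. The y expression maps a to x, b to y, and c to z.

```spl
| makeresults
| eval name="Dubois, Maria", code="abc-cab"
| rex field=name mode=sed "s/(\w+), (\w+)/\2 \1/"
| rex field=code mode=sed "y/abc/xyz/"
| table name code
```

Under these assumptions, name should become "Maria Dubois" and code should become "xyz-zxy".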

Examples

1. Extract email values using regular expressions

Extract email values from events to create from and to fields in your events. For example, you have events such as:

Mon Mar 19 20:16:27 2018 Info: Bounced: DCID 8413617 MID 19338947 From: <MariaDubois@example.com> To: <zecora@buttercupgames.com> RID 0 - 5.4.7 - Delivery expired (message too old) ('000', ['timeout']) 

Mon Mar 19 20:16:03 2018 Info: Delayed: DCID 8414309 MID 19410908 From: <WeiZhang@example.com> To: <mcintosh@buttercupgames.com> RID 0 - 4.3.2 - Not accepting messages at this time ('421', ['4.3.2 try again later']) 

Mon Mar 19 20:16:02 2018 Info: Bounced: DCID 0 MID 19408690 From: <Exit_Desk@sample.net> To: <lyra@buttercupgames.com> RID 0 - 5.1.2 - Bad destination host ('000', ['DNS Hard Error looking up mahidnrasatyambsg.com (MX):  NXDomain']) 

Mon Mar 19 20:15:53 2018 Info: Delayed: DCID 8414166 MID 19410657 From: <Manish_Das@example.com> To: <dash@buttercupgames.com> RID 0 - 4.3.2 - Not accepting messages at this time ('421', ['4.3.2 try again later']) 

When the events were indexed, the From and To values were not identified as fields. You can use the rex command to extract the field values and create from and to fields in your search results.

The sender and recipient values in the _raw events follow an identical pattern. Each sender is preceded by From: and each recipient is preceded by To:. The email addresses are enclosed in angle brackets. You can use this pattern to create a regular expression to extract the values and create the fields.

source="cisco_esa.txt" | rex field=_raw "From: <(?<from>.*)> To: <(?<to>.*)>"

You can remove duplicate values and return only the list of addresses by adding the dedup and table commands to the search.

source="cisco_esa.txt" | rex field=_raw "From: <(?<from>.*)> To: <(?<to>.*)>" | dedup from to | table from to

The results are a table with two columns, from and to, that display the email addresses.
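
The .* in the expression above is greedy, so it relies on each event containing only one To: marker and no later closing angle bracket. A somewhat tighter variant, sketched here on the assumption that an email address never contains a > character, uses a negated character class so that each capture stops at the first closing bracket.

```spl
source="cisco_esa.txt"
| rex field=_raw "From: <(?<from>[^>]+)> To: <(?<to>[^>]+)>"
| dedup from to
| table from to
```

For the sample events shown in this example, both expressions should extract the same from and to values.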

2. Extract from multi-valued fields using max_match

You can use the max_match argument to specify that the regular expression runs multiple times to extract multiple values from a field.

For example, use the makeresults command to create a field with multiple values:

| makeresults | eval test="a$1,b$2"


The results look something like this:

_time test
2019-12-05 11:15:28 a$1,b$2

To extract each of the values in the test field separately, you use the max_match argument with the rex command. For example:

...| rex field=test max_match=0 "((?<field>[^$]*)\$(?<value>[^,]*),?)"

The results look something like this:

_time                field  test     value
2019-12-05 11:36:57  a      a$1,b$2  1
                     b               2
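
Because field and value are extracted in the same order, the two multivalued fields can be paired up afterwards. The following sketch assumes the mvzip eval function is used for that purpose. The field name pair is hypothetical.

```spl
| makeresults
| eval test="a$1,b$2"
| rex field=test max_match=0 "((?<field>[^$]*)\$(?<value>[^,]*),?)"
| eval pair=mvzip(field, value, "=")
| table test field value pair
```

With this input, pair should contain the two values a=1 and b=2.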


3. Extract values from a field in scheduler.log events

Extract "user", "app", and "SavedSearchName" from a field called "savedsearch_id" in scheduler.log events. If savedsearch_id=bob;search;my_saved_search, then user=bob, app=search, and SavedSearchName=my_saved_search.

... | rex field=savedsearch_id "(?<user>\w+);(?<app>\w+);(?<SavedSearchName>\w+)"
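
To try the extraction without scheduler.log data, a sample value can be generated with makeresults, as in this sketch.

```spl
| makeresults
| eval savedsearch_id="bob;search;my_saved_search"
| rex field=savedsearch_id "(?<user>\w+);(?<app>\w+);(?<SavedSearchName>\w+)"
| table savedsearch_id user app SavedSearchName
```

Note that \w matches only letters, digits, and underscores, so with this expression a saved search name that contains spaces or hyphens would be captured only up to the first such character.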

4. Use a sed expression

Use sed syntax to match the regex to a series of numbers and replace them with an anonymized string.

... | rex field=ccnumber mode=sed "s/(\d{4}-){3}/XXXX-XXXX-XXXX-/g"
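
The same expression can be tried on a made-up value. The card number below is a hypothetical sample, not real data.

```spl
| makeresults
| eval ccnumber="1234-5678-9012-3456"
| rex field=ccnumber mode=sed "s/(\d{4}-){3}/XXXX-XXXX-XXXX-/g"
| table ccnumber
```

The expression matches the first three groups of four digits and their hyphens, so ccnumber should become XXXX-XXXX-XXXX-3456, leaving the last four digits visible.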

5. Display IP address and ports of potential attackers

Display IP address and ports of potential attackers.

sourcetype=linux_secure port "failed password" | rex "\s+(?<ports>port \d+)" | top src_ip ports showperc=0

This search uses rex to extract the ports field and its values. It then displays a table of the top source IP addresses (src_ip) and ports returned by the search for potential attackers.

See also

extract, kvform, multikv, regex, spath, xmlkv

Last modified on 06 December, 2019

This documentation applies to the following versions of Splunk Cloud: 8.0.2001, 7.1.3, 7.1.6, 7.2.4, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10

