Splunk® Data Stream Processor

Function Reference

On April 3, 2023, Splunk Data Stream Processor will reach its end of sale, and will reach its end of life on February 28, 2025. If you are an existing DSP customer, please reach out to your account team for more information.
This documentation does not apply to the most recent version of Splunk® Data Stream Processor. For documentation on the most recent version, go to the latest release.

Parse regex (rex)

Extract or rename fields using regular expression named capture groups, or edit fields using a sed expression.

The rex function matches the value of the specified field against the unanchored regular expression and extracts the named groups into fields of the corresponding names.
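
Because the pattern is a Java regular expression, the extraction described above can be approximated in plain Java. The following is a minimal sketch, not the DSP implementation: the class RexExtractSketch and the helper extractNamedGroups are hypothetical names, and the group names are passed in explicitly to keep the sketch short. It uses Matcher.find(), which searches anywhere in the input, to mirror unanchored matching.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RexExtractSketch {
    // Hypothetical helper: returns the named groups of the first match,
    // or an empty map if the pattern does not match anywhere in the value.
    public static Map<String, String> extractNamedGroups(String value, String regex, String... groupNames) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = Pattern.compile(regex).matcher(value);
        if (m.find()) { // find() is unanchored: the match may start at any position
            for (String name : groupNames) {
                fields.put(name, m.group(name));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String body = "id=7 From: alice@example.com To: bob@example.com";
        System.out.println(extractNamedGroups(body, "From: (?<from>\\S+) To: (?<to>\\S+)", "from", "to"));
    }
}
```

In this sketch the leading "id=7 " is skipped because the pattern is not anchored to the start of the value.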

When mode is set to sed, the function applies the given sed expression to the value of the specified field, either replacing text or substituting characters. You can also use sed syntax to mask sensitive data.

Function Input
collection<record<R>>
This function takes in collections of records with schema R.
Function Output
collection<record<S>>
This function outputs the same collection of records but with a different schema S.

Arguments

| Argument | Input | Description | UI example |
| --- | --- | --- | --- |
| field | string | The field that you want to extract information from. | body |
| pattern | string | The Java regular expression that defines the information to match and extract from the specified field. | /\[(?<timestamp>\d+)\].*/ |
| max_match | int | Optional. Controls the number of times the regular expression is matched. If greater than 1, the resulting fields are multivalued fields. Use 0 for unlimited matches. Defaults to 1. | 10 |
| offset_field | string | Optional. If provided, a field is created with the specified name. The value of this field contains the endpoints of the match as zero-based character offsets into the matched field. For example, if the regular expression is (?<tenchars>.{10}), it matches the first ten characters of the field, and the offset_field value is 0-9. | newofield |
| mode | string | Optional. Set to sed to indicate that the pattern is a sed (UNIX stream editor) expression. | sed |
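
The max_match and offset_field arguments can be illustrated with a short Java sketch. This is only an approximation under stated assumptions: the class RexMaxMatchSketch and both helpers are hypothetical, and the "start-end" offset format with an inclusive end is inferred from the 0-9 example in the table rather than taken from the DSP source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RexMaxMatchSketch {
    // Hypothetical helper: collects up to maxMatch values of one named group.
    // A maxMatch of 0 means "no limit", following the argument description.
    public static List<String> matchValues(String value, String regex, String group, int maxMatch) {
        List<String> values = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(value);
        while ((maxMatch == 0 || values.size() < maxMatch) && m.find()) {
            values.add(m.group(group));
        }
        return values;
    }

    // Hypothetical helper: formats the zero-based span of the first match of a
    // named group as "start-end", with an inclusive end (assumed format).
    public static String matchOffsets(String value, String regex, String group) {
        Matcher m = Pattern.compile(regex).matcher(value);
        if (!m.find()) {
            return null;
        }
        return m.start(group) + "-" + (m.end(group) - 1);
    }

    public static void main(String[] args) {
        System.out.println(matchValues("a=1 b=22 c=333", "(?<num>\\d+)", "num", 2));
        System.out.println(matchOffsets("abcdefghijklmnop", "(?<tenchars>.{10})", "tenchars"));
    }
}
```

With a limit of 2, only the first two numbers are collected; a multivalued field in DSP would play the role of the list here.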

Using a sed expression

When using the rex function in sed mode, you have two options: replace (s) or character substitution (y).

The syntax for using sed to replace (s) text in your data is: "s/<regex>/<replacement>/<flags>"

  • <regex> is a Java regular expression, which can include capturing groups.
  • <replacement> is a string to replace the regex match. Use \n for backreferences, where "n" is a single digit.
  • <flags> can be either g, to replace all matches, or a number, to replace only the match at that position.
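
The replace form can be sketched in plain Java to show how the flags differ. This is an illustration under assumptions, not the DSP implementation: the class SedReplaceSketch and its helpers are hypothetical, and the numeric flag is assumed to be 1-based, as in standard sed. Note that Java's own replacement syntax writes backreferences as $1 rather than \1.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SedReplaceSketch {
    // Hypothetical helper: replaces every match, like the g flag.
    public static String replaceAllMatches(String value, String regex, String replacement) {
        return Pattern.compile(regex).matcher(value).replaceAll(replacement);
    }

    // Hypothetical helper: replaces only the nth match (1-based), like a numeric flag.
    // If there are fewer than n matches, the value is returned unchanged.
    public static String replaceNthMatch(String value, String regex, String replacement, int n) {
        Matcher m = Pattern.compile(regex).matcher(value);
        StringBuffer out = new StringBuffer();
        int count = 0;
        while (m.find()) {
            count++;
            if (count == n) {
                // Copies everything before this match, then the replacement.
                m.appendReplacement(out, replacement);
                break;
            }
        }
        m.appendTail(out); // Copies the rest of the input unchanged.
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceAllMatches("2020-02-10", "(\\d+)-(\\d+)-(\\d+)", "$3/$2/$1"));
        System.out.println(replaceNthMatch("a-b-c-d", "-", "+", 2));
    }
}
```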

The syntax for using sed to substitute characters is: "y/<string1>/<string2>/"

  • This replaces each character that appears in <string1> with the character at the same position in <string2>.
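
Character substitution can be sketched as a per-character lookup. The Java below is a minimal illustration that assumes standard sed y semantics (position-wise mapping between two strings of equal length); the class SedTranslateSketch and the helper translate are hypothetical names.

```java
class SedTranslateSketch {
    // Hypothetical helper: maps each character found in `from` to the character
    // at the same index in `to`; all other characters pass through unchanged.
    public static String translate(String value, String from, String to) {
        if (from.length() != to.length()) {
            throw new IllegalArgumentException("from and to must have the same length");
        }
        StringBuilder out = new StringBuilder(value.length());
        for (char c : value.toCharArray()) {
            int i = from.indexOf(c);
            out.append(i >= 0 ? to.charAt(i) : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Roughly what "y/lo/01/" would do under the assumed semantics.
        System.out.println(translate("hello world", "lo", "01"));
    }
}
```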

DSL example

This example extracts email values using regular expressions:

rex(events, "messages", "From: (?<from>.*) To: (?<to>.*)", 10, "newofield", null);

This example uses a sed expression to mask the first twelve digits of a credit card number:

rex(events, "ccnumber", "s/(\d{4}-){3}/XXXX-XXXX-XXXX-/g", null, null, "sed");
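
One way to check the effect of this masking expression is to run the equivalent replacement in plain Java. This is a sketch only: the class MaskCardSketch and the helper mask are hypothetical, the sample card number is made up, and the digit class is written as \d (escaped as \\d inside a Java string literal).

```java
class MaskCardSketch {
    // Hypothetical helper: masks three leading "dddd-" groups, keeping the last four digits.
    public static String mask(String ccnumber) {
        return ccnumber.replaceAll("(\\d{4}-){3}", "XXXX-XXXX-XXXX-");
    }

    public static void main(String[] args) {
        System.out.println(mask("1234-5678-9012-3456"));
    }
}
```

Values that do not contain three consecutive four-digit groups followed by hyphens are left unchanged by this replacement.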
Last modified on 10 February, 2020

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.0.1

