Add a regular expression field
You can add a regular expression field to any dataset in your data model. Regular expression fields turn the named groups in regular expression strings into separate data model fields. You can arrange for the regular expression to extract fields from the _raw event text as well as from specific field values.
- In the Data Model Editor, open the dataset you'd like to add a regular expression field to.
- For an overview of the Data Model Editor, see Design data models.
- Click Add Field and select Regular Expression.
- This takes you to the Add Fields with a Regular Expression page.
- Under Extract From, select the field that you want to extract from.
- The Extract From list should include all of the fields currently found in your dataset, with the addition of _raw. If your regular expression is designed to extract one or more fields from values of a specific field, choose that field from the Extract From list. On the other hand, if your regular expression is designed to parse the entire event string, choose _raw from the Extract From list.
- Provide a Regular Expression.
- The regular expression must have at least one named group. Each named group in the regular expression is extracted as a separate field. Field names cannot include whitespace, single quotes, double quotes, curly braces, or asterisks.
- After you provide a regular expression, the named group(s) appear under Field(s).
- Note: Regular expression fields currently do not support sed mode or sed expressions.
- (Optional) Provide different Display Name values for the field(s).
- Field Display Name values cannot include asterisk characters.
- (Optional) Correct field Type values.
- Fields are given the String type by default.
- (Optional) Change field Flag values to whatever is appropriate for your needs.
- (Optional) Click Preview to get a look at how well the fields are represented in the dataset.
- For more information about previewing fields, see "Preview regular expression field representation," below.
- Click Save to save your changes.
- You will be returned to the Data Model Editor. The regular expression fields will be added to the list of calculated dataset fields.
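The way named groups become fields in the steps above can be sketched outside of Splunk. This is a minimal Python illustration, not the Data Model Editor's implementation. The sample event, the patterns, and the helper names `extract_fields` and `is_valid_field_name` are all hypothetical. Python writes named groups as `(?P<name>...)`; check which named-group syntax your Splunk deployment expects before reusing a pattern.

```python
import re

# Hypothetical raw event; not taken from the Splunk documentation.
raw_event = '10.2.1.44 - - [07/Mar/2024:10:15:32] "GET /numa/index.html HTTP/1.1" 200 1432'

# Extract From = _raw: the pattern parses the entire event string.
# Each named group (clientip, method, uri_path, status) becomes one field.
RAW_PATTERN = re.compile(
    r'^(?P<clientip>\S+) .*?"(?P<method>[A-Z]+) (?P<uri_path>\S+) [^"]*" (?P<status>\d{3})'
)

# Extract From = uri_path: the pattern parses only that field's value.
PATH_PATTERN = re.compile(r"^(?P<path>/[^/]+/)")

# Characters that the procedure says field names cannot include:
# whitespace, single quotes, double quotes, curly braces, asterisks.
FORBIDDEN_IN_FIELD_NAME = re.compile(r"""[\s'"{}*]""")


def extract_fields(pattern, value):
    """Return {field_name: value} for each named group, or {} if nothing matches."""
    match = pattern.search(value)
    return match.groupdict() if match else {}


def is_valid_field_name(name):
    """Check a candidate field name against the character restrictions above."""
    return bool(name) and FORBIDDEN_IN_FIELD_NAME.search(name) is None


raw_fields = extract_fields(RAW_PATTERN, raw_event)
path_fields = extract_fields(PATH_PATTERN, raw_fields["uri_path"])

print(raw_fields)
print(path_fields)
print(is_valid_field_name("uri_path"), is_valid_field_name("uri path"))
```

The second pattern runs against the value of `uri_path` rather than the whole event, which mirrors choosing a specific field instead of _raw from the Extract From list.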
For a primer on regular expression syntax and usage, see Regular-Expressions.info. You can test your regex by using it in a search with the rex search command. Splunk also maintains a list of useful third-party tools for writing and testing regular expressions.
Preview regular expression field representation
When you click Preview after defining one or more regular expression fields, Splunk software runs the regular expression against the events in your dataset that have the Extract From field you've selected (or against raw data if you're extracting from _raw) and shows you the results. The preview results appear underneath the setup fields, in a set of four or more tabbed pages. Each of these tabs shows you information taken from a sample of events in the dataset. You can control how this sample is chosen by selecting an option from the Sample list, such as First 1000 events or Last 24 hours. You can also set how many events appear per page (default is 20).
If the preview doesn't return any events, you might need to adjust the regular expression, or you might have selected the wrong Extract From field.
The All tab
The All tab gives you a quick sense of how prevalent events that match the regular expression are in the event data. You can see an example of the All tab in action in the screen capture near the top of this topic.
It shows you an unfiltered sample of the events that have the Extract From field in their data. For example, if the Extract From field you've selected is uri_path, this tab displays only events that have a uri_path value.
The first column indicates whether the event matched the regular expression or not. Events that match have green checkmarks. Non-matching events have red "x" marks.
The second column displays the value of the Extract From field in the event. If the Extract From field is _raw, the entire event string is displayed. The remaining columns display the field values extracted by the regular expression, if any.
The Match and Non-Match tabs
The Match and Non-Match tabs are similar to the All tab except that they are filtered to display either just events that match the regular expression or just events that do not match the regular expression. These tabs help you get a better sense of the field distribution in the sample, especially if the majority of events in the sample fall in either the matching or non-matching event set.
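The row layout of the All tab and the filtering done by the Match and Non-Match tabs can be approximated with a short Python sketch. It is only an illustration of the described behavior, assuming events are simple dictionaries. The sample events, the pattern, and the `preview_rows` helper are hypothetical and are not part of Splunk software.

```python
import re

# Hypothetical sample: each event is a dict of field name -> value.
sample_events = [
    {"_raw": "GET /numa/index.html 200", "uri_path": "/numa/index.html"},
    {"_raw": "GET /favicon.ico 404", "uri_path": "/favicon.ico"},
    {"_raw": "GET /cart/view 200", "uri_path": "/cart/view"},
    {"_raw": "session heartbeat"},  # no uri_path value, so it is never shown
]

# Hypothetical regular expression with one named group, path.
PATTERN = re.compile(r"^(?P<path>/[^/]+/)")


def preview_rows(events, extract_from, pattern):
    """Build (matched, extract_from_value, extracted_fields) rows for a sample."""
    rows = []
    for event in events:
        if extract_from not in event:
            continue  # only events that have the Extract From field are listed
        value = event[extract_from]
        match = pattern.search(value)
        rows.append((match is not None, value, match.groupdict() if match else {}))
    return rows


# All tab: every sampled event that has the Extract From field.
all_rows = preview_rows(sample_events, "uri_path", PATTERN)
# Match and Non-Match tabs: the same rows, filtered on the first column.
match_rows = [row for row in all_rows if row[0]]
non_match_rows = [row for row in all_rows if not row[0]]

for matched, value, fields in all_rows:
    print("match   " if matched else "no match", value, fields)
```

If `match_rows` comes back empty for a sample, that corresponds to the case where the regular expression or the Extract From selection needs another look.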
The field tab(s)
Each field named in the regular expression gets its own tab. A field tab provides a quick summary of the value distribution in the chosen sample of events. It's set up as a top values list, organized by Count and percentage. If you don't see the values you're expecting, or if the value distribution you are seeing seems off to you, this can be an indication that you need to fine-tune your regular expression.
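A top values list of this kind can be reproduced with a few lines of Python, which may help when checking whether a distribution looks plausible. This is a sketch under assumptions: the sample values and the `top_values` helper are hypothetical, and the percentage is computed against the events that produced a value, which may differ from how the field tab calculates it.

```python
from collections import Counter

# Hypothetical values extracted for the path field from a sample of events.
path_values = [
    "/numa/", "/cart/", "/numa/", "/numa/",
    "/cart/", "/oldlink/", "/numa/", "/numa/",
]


def top_values(values, limit=10):
    """Return (value, count, percent) rows ordered from most to least common."""
    counts = Counter(values)
    total = sum(counts.values())
    return [
        (value, count, round(100.0 * count / total, 1))
        for value, count in counts.most_common(limit)
    ]


table = top_values(path_values)
for value, count, percent in table:
    print(f"{value:<12}{count:>6}{percent:>8}%")
```

A value that dominates the list unexpectedly, or an expected value that is missing, is the kind of signal that suggests the regular expression needs tuning.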
You can also increase the sample size to find rare field values or values that appear further back in the past. In the example below, setting Sample to First 10,000 events uncovered a number of values for the path field that do not appear when only the first 1,000 events are sampled.
The top value tables in field tabs are drilldown-enabled. You can click on a row to see all of the events represented by that row. For example, if you are looking at the path field and you see that there are 6 events with the path /numa/, you can click on the /numa/ row to go to a list that displays the 6 events where path="/numa/".
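Conceptually, the drilldown is a filter on one field value. The Python sketch below mimics that filter on a hypothetical list of events. The data and the `drilldown` helper are illustrative assumptions and do not reflect how Splunk software runs the underlying search.

```python
# Hypothetical sample of events that already carry an extracted path field.
events = (
    [{"path": "/numa/", "status": "200"} for _ in range(6)]
    + [{"path": "/cart/", "status": "200"} for _ in range(3)]
    + [{"status": "404"}]  # event where the regular expression did not match
)


def drilldown(sample, field, value):
    """Return the events whose field equals the clicked value."""
    return [event for event in sample if event.get(field) == value]


numa_events = drilldown(events, "path", "/numa/")
print(len(numa_events))
```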
This documentation applies to the following versions of Splunk Cloud Platform™: 8.2.2112, 8.2.2201, 8.2.2202, 8.2.2203, 9.0.2205, 9.0.2208, 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403, 9.2.2406 (latest FedRAMP release), 9.3.2408