Extract fields in a search
When events are added to and indexed by Splunk, Splunk software processes and extracts fields from the events. This process is called field extraction.
Splunk software automatically extracts host, source, and sourcetype values, timestamps, and several other default fields when it indexes incoming events. Splunk software also extracts fields that appear in your event data as key=value pairs.
Not every piece of event data is extracted into fields.
You might want to create additional field extractions for a specific search. These custom field extractions are not permanent: the fields are extracted only for the search that the extraction is added to.
You must specify a regular expression, using the Regular Expression 2 (RE2) syntax, to identify the data to extract. The name you use for the capture group becomes the name of the extracted field. The matched values become the values of the extracted field.
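The relationship between the capture group name and the extracted field can be sketched outside of Splunk. The following minimal Python example is an illustration only, not how Splunk software performs the extraction. It assumes Python's `re` module, which is not RE2 but accepts the same `(?P<name>...)` named-group syntax for this simple pattern. The field name `port` is a hypothetical choice made for the illustration.

```python
import re

# One event taken from the sample data used later in this topic.
raw = ("Wed Feb 14 2023 23:16:57 mailsv1 sshd[4590]: "
       "Failed password for apache from 78.111.167.117 port 3801 ssh2")

# The capture group name ("port", a hypothetical field name) becomes the
# field name. The text matched inside the group becomes the field value.
pattern = re.compile(r"port (?P<port>[0-9]+)")

match = pattern.search(raw)
fields = match.groupdict() if match else {}
print(fields)  # {'port': '3801'}
```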
While you can use the rex command to perform field extractions, the method described in this topic provides an interface where you can see the impact of your regular expression on a set of events.
The following sections include examples of how to extract a field from the _raw field and extract a field from another field.
Extract a field from _raw
Consider the following set of events:
Wed Feb 14 2023 23:16:57 mailsv1 sshd[4590]: Failed password for apache from 78.111.167.117 port 3801 ssh2
Wed Feb 14 2023 15:51:38 mailsv1 sshd[1991]: Failed password for grumpy from 76.169.7.252 port 1244 ssh2
Mon Feb 12 2023 09:31:03 mailsv1 sshd[5800]: Failed password for invalid user guest from 66.69.195.226 port 2903 ssh2
Sun Feb 11 2023 14:12:56 mailsv1 sshd[1565]: Failed password for invalid user noone from 187.231.45.62 port 1092 ssh2
Sun Feb 11 2023 07:09:29 mailsv1 sshd[3560]: Failed password for games from 187.231.45.62 port 3752 ssh2
Sat Feb 10 2023 03:25:43 mailsv1 sshd[2442]: Failed password for invalid user admin from 211.166.11.101 port 1797 ssh2
Fri Feb 09 2023 21:45:20 mailsv1 sshd[1689]: Failed password for invalid user guest from 222.41.213.238 port 2658 ssh2
Fri Feb 09 2023 06:27:34 mailsv1 sshd[2226]: Failed password for invalid user noone from 199.15.234.66 port 3366 ssh2
Fri Feb 09 2023 18:32:51 mailsv1 sshd[5710]: Failed password for agarcia from 209.160.24.63 port 1775 ssh2
Thu Feb 08 2023 08:42:11 mailsv1 sshd[3202]: Failed password for invalid user noone from 175.44.1.172 port 2394 ssh2
You want to extract the user name for only the events that include the phrase "invalid user", and place the user name into a field called "failedUser".
- Run a search on the dataset that contains the information that you want to extract.
- In the results pane, on the _raw field, select the Options menu.
- In the Options menu, select Extract fields from _raw. The Extract fields dialog box appears.
- In the Regular expression field, specify a named capture group using RE2 syntax. You can select a regular expression from the Insert from library list or enter the regular expression directly in the field.
- For this set of events, because the user name appears after the phrase "invalid user", enter the phrase "invalid user" in the Regular expression field.
- From the Insert from library list, select username. This regular expression captures the word immediately after the phrase "invalid user". That word can consist of one or more letters, digits, periods, underscores, or hyphens.
- "Username" in the regular expression specifies the name of the field to place the extracted data in. Change "Username" to "failedUser".
- The regular expression should look like this:
  invalid user (?P<failedUser>[a-zA-Z0-9._-]+)
  The name failedUser appears in the Fields to extract list.
- Use the Events preview to validate that your regular expression is highlighting the information that you want to extract.
- Click Apply to perform the field extraction. A rex command with the regular expression is added to your SPL2 search statement. The new field failedUser appears in the Fields list on the Data tab.
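To check the regular expression from this procedure outside of the Extract fields dialog box, you can run it against a few of the sample events. The following is a minimal sketch in Python, assuming that Python's `re` module treats this pattern the same way RE2 does. It is not the rex command and does not reproduce what the search does internally. The helper name `extract_failed_user` is hypothetical.

```python
import re

# The regular expression built in the procedure above.
FAILED_USER = re.compile(r"invalid user (?P<failedUser>[a-zA-Z0-9._-]+)")

# The first four sample events from this topic.
events = [
    "Wed Feb 14 2023 23:16:57 mailsv1 sshd[4590]: Failed password for apache from 78.111.167.117 port 3801 ssh2",
    "Wed Feb 14 2023 15:51:38 mailsv1 sshd[1991]: Failed password for grumpy from 76.169.7.252 port 1244 ssh2",
    "Mon Feb 12 2023 09:31:03 mailsv1 sshd[5800]: Failed password for invalid user guest from 66.69.195.226 port 2903 ssh2",
    "Sun Feb 11 2023 14:12:56 mailsv1 sshd[1565]: Failed password for invalid user noone from 187.231.45.62 port 1092 ssh2",
]


def extract_failed_user(raw):
    """Return the failedUser value for one event, or None if there is no match."""
    match = FAILED_USER.search(raw)
    return match.group("failedUser") if match else None


failed_users = [extract_failed_user(event) for event in events]
print(failed_users)  # [None, None, 'guest', 'noone']
```

Events that do not contain the phrase "invalid user" produce no value, which mirrors the goal of extracting the user name only from those events.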
Extract a field from another field
Consider the following fields from the Search Experience sample data:
_time | method | path |
---|---|---|
8:54:26 AM 12 June 2022 | POST | /category.screen |
8:54:25 AM 12 June 2022 | GET | /oldlink |
8:54:24 AM 12 June 2022 | POST | /cart.do |
11:37:51 PM 11 June 2022 | GET | /product.screen |
11:37:51 PM 11 June 2022 | GET | /cart/error.do |
You want to extract the extension from the path.
- Run a search on the sample dataset.
- Select the path field from the Fields list to display the field in the results pane.
- In the results pane, select the Options menu for the path field.
- In the Options menu, select Extract fields from path. The Extract fields dialog box appears.
- In the Regular expression field, enter a named capture group using RE2 syntax. Because you only want to extract the extension, the regular expression should look like this:
  \.(?P<extension>(.*$))
- Use the Events preview to validate that your regular expression is highlighting the information that you want to extract.
- Click Apply to perform the field extraction. A rex command with the regular expression is added to your SPL2 search statement. The new field extension appears in the Fields list on the Data tab.
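The effect of this regular expression on the sample path values can be sketched the same way. The following Python example is an approximation for illustration, assuming that Python's `re` module matches this pattern as RE2 does. The helper name `extract_extension` is hypothetical.

```python
import re

# The regular expression from the procedure above.
EXTENSION = re.compile(r"\.(?P<extension>(.*$))")

# The path values from the sample data table in this topic.
paths = [
    "/category.screen",
    "/oldlink",
    "/cart.do",
    "/product.screen",
    "/cart/error.do",
]


def extract_extension(path):
    """Return the text after the first period in the path, or None if there is no period."""
    match = EXTENSION.search(path)
    return match.group("extension") if match else None


extensions = {path: extract_extension(path) for path in paths}
for path, extension in extensions.items():
    print(path, "->", extension)
```

In this sketch, /oldlink yields no value because it contains no period. The match starts at the first period in the value, so a hypothetical path with more than one period, such as /archive.tar.gz, would yield tar.gz rather than gz.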
This documentation applies to the following versions of Splunk Cloud Platform™: search2preview