Extract fields

Use the Extract Fields functionality to parse the data in your source types and create field extractions.

Parse data

To extract fields from your data, you must parse the data for each of the source types in your add-on. The Field Extractor supports parsing for the following data formats:

Unstructured Data. Typically used for log files.
Table. Data in tabular formats, such as comma-separated values (CSV) and tab-separated values (TSV).
Key Value. Data that contains key-value pairs.
JSON. Data in the JavaScript Object Notation (JSON) format.
XML. Data in the Extensible Markup Language (XML) format.

To parse data for a source type and extract fields

On your add-on homepage, click Extract Fields on the Add-on Builder navigation bar.
On the Extract Fields page, from Sourcetype, select a source type to parse.
From Format, select the format of the data. If the Add-on Builder detects a format type, that type is selected automatically, and you can change it as needed. If you aren't sure which format type you need and none has been selected automatically, use "Unstructured Data" as the format type.
Click Parse.

Extract fields

After parsing the data, the Add-on Builder displays the results on a summary page.

If you are satisfied with the results, click Save.
If you want to try parsing again using a different format, click Cancel to return to the previous page.

After data for a source type has been parsed, the source type is added to the table on the Extract Fields page.

To retrieve parsed field extractions, click Load Results for the source type.

Unstructured Data

The Add-on Builder's field extractor displays a selection of events in groups, along with the extracted fields. Use this display to:

Select one or more groups to represent the data.
Display the regular expression that the field extractor used, and modify it to improve the field extraction.
Click individual field names to include or exclude those fields from extraction.
Click the Edit icon next to a field name to edit the field name.
Click the Trash icon next to a field name to remove its capture group from the regular expression.
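
As a rough illustration of how capture groups map to fields, the following Python sketch applies a regular expression with named capture groups to a single log line. The sample event, the pattern, and the field names are hypothetical and are not generated by the Add-on Builder. Python's regular expression syntax may also differ in detail from the syntax the field extractor uses.

```python
import re

# Hypothetical log event and pattern, for illustration only.
EVENT = "2024-01-15 10:32:07 ERROR db-01 Connection timed out"

# Each named capture group corresponds to one extracted field.
PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<host>\S+) "
    r"(?P<message>.*)"
)


def extract_fields(event: str) -> dict:
    """Return the named groups of PATTERN as a field dictionary."""
    match = PATTERN.match(event)
    return match.groupdict() if match else {}


fields = extract_fields(EVENT)
print(fields)
```

Deleting a group such as `(?P<host>\S+)` from the pattern and replacing it with a non-capturing equivalent removes that field from the result, which is comparable to what the Trash icon does to the regular expression.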

Table

The Table format is used with tabular data and lets you:

Change how data is parsed by selecting the delimiter character that is used to separate fields. To specify a different character, click Other and enter the character.
Change the field names after you have selected the correct delimiter. Note that each time you change delimiters, the number of columns might change and cause you to lose changes to field names.
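
The effect of the delimiter choice on the number of columns can be sketched in Python with the standard `csv` module. The pipe-delimited sample data and the `parse_table` helper are hypothetical; this only illustrates the idea and is not how the Add-on Builder itself parses tables.

```python
import csv
import io

# Hypothetical sample data; the first row is treated as the table header.
SAMPLE = (
    "time|host|status\n"
    "2024-01-15 10:32:07|db-01|500\n"
    "2024-01-15 10:32:09|web-02|200\n"
)


def parse_table(text: str, delimiter: str) -> list:
    """Split each line on the delimiter and map values to the header names."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


rows_pipe = parse_table(SAMPLE, "|")    # correct delimiter: three columns
rows_comma = parse_table(SAMPLE, ",")   # wrong delimiter: one column per line

print(list(rows_pipe[0].keys()))
print(list(rows_comma[0].keys()))
```

With the wrong delimiter, every line collapses into a single column, which is why field names assigned under one delimiter may not carry over after you switch to another.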

Key Value

The Key Value format is used with data containing key-value pairs and lets you do the following:

Change how data is parsed. For Extraction Methods, you can select:

Auto to let the Add-on Builder parse data automatically.
Delimiters to use delimiters.
Regex to use regular expressions.

For Delimiters, select the delimiters for the key-value pairs:

Specify the pair delimiter character, which is used to separate key-value pairs.
Using the example key_a=value_a, key_b=value_b, the correct character is a comma.
Specify the key-value delimiter character, which is used to separate keys and values.
Using the example key_a=value_a, key_b=value_b, the correct character is an equals sign.
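
The two delimiters can be illustrated with a minimal Python sketch that parses the example string above. The `parse_key_values` function and its parameter names are hypothetical and only approximate what a delimiter-based extraction does.

```python
def parse_key_values(text: str, pair_delim: str = ",", kv_delim: str = "=") -> dict:
    """Split text into pairs, then split each pair into a key and a value."""
    fields = {}
    for pair in text.split(pair_delim):
        key, sep, value = pair.strip().partition(kv_delim)
        # Skip fragments that have no key-value delimiter or an empty key.
        if sep and key:
            fields[key.strip()] = value.strip()
    return fields


fields = parse_key_values("key_a=value_a, key_b=value_b")
print(fields)  # {'key_a': 'value_a', 'key_b': 'value_b'}
```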

For Regex: select the regular expression to use, or create your own.

JSON

The JSON format is used with JSON data. There are no additional parsing options.

XML

The XML format is used with XML data. There are no additional parsing options.

Troubleshooting

What if I need to upload different sample data?

If you need to upload a different sample data file for a source type, for example because you want to clean the data first, go to Add Sample Data, delete the existing sample data, then upload the new data files.

A regular expression had too many capture groups, what do I do?

This error is displayed when you attempt to parse a file and the regular expression created by the Field Extractor contains more than 100 capture groups (fields).

This error might indicate a problem with the Event Break setting for the source type:

Go to Add Sample Data.
Edit the source type and select a different option for Event Break.
Upload the sample events again. Because the Event Break option is applied when indexing the data, changing this value does not affect events that have already been indexed.
Parse the data again.

The sample data might contain an event that is too long:

Edit the sample data file by splitting the long lines to clean up the data.
Go back to Add Sample Data.
Upload the sample events again.
Parse the data again.
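
If you edit a regular expression by hand, you can approximate the capture group count outside the Add-on Builder. The following Python sketch uses the `groups` attribute of a compiled pattern. The `MAX_GROUPS` constant reflects the limit of 100 described above, while the helper name and sample patterns are hypothetical. Python's regular expression syntax may differ from the syntax the Field Extractor uses, so treat the result as an estimate.

```python
import re

MAX_GROUPS = 100  # limit described in this topic


def count_capture_groups(pattern: str) -> int:
    """Return the number of capturing groups in a regular expression."""
    return re.compile(pattern).groups


# Hypothetical patterns: one short, one built from 101 whitespace-separated tokens.
short_pattern = r"(?P<level>[A-Z]+) (?P<host>\S+) (.*)"
long_pattern = " ".join([r"(\S+)"] * 101)

for name, pattern in [("short", short_pattern), ("long", long_pattern)]:
    count = count_capture_groups(pattern)
    status = "too many" if count > MAX_GROUPS else "ok"
    print(f"{name}: {count} capture groups ({status})")
```

Non-capturing groups written as `(?:...)` are not counted, so converting groups you do not need as fields is one way to reduce the total.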

Why are the field names not detected in my tabular data?

The Add-on Builder uses the first 1000 events for field extraction. If your data contains more than 1000 events, the parser cannot automatically detect the field names.

The parser assumes that all entries except the table header contain a timestamp. If entries in your tabular data do not contain a timestamp, the parser will not correctly detect which entry is the table header.

Learn more

For more information, see the following Splunk Enterprise documentation:

About fields in the Knowledge Manager Manual
Build field extractions with the field extractor in the Knowledge Manager Manual
Field Extractor: Select Fields step in the Knowledge Manager Manual
