Fields appear in event data as searchable name/value pairings such as
ip_address=192.168.1.1. They are the building blocks of searches, reports, and data models in Splunk software. When you run a search on your event data, Splunk software looks for fields in that data.
Note: Field names are often referred to as keys. The acronym kv is short for key/value.
Look at the following example search.
status=404
This search finds events with
status fields that have a value of
404. When you run this search, Splunk software does not look for events with any other
status value. It also does not look for events containing other fields that share
404 as a value. As a result, this search returns a set of results that is more focused than the results you get if you use only
404 in the search string.
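The difference between a fielded search and a bare-term search can be sketched outside of Splunk software. The following Python snippet is an illustration only. The sample events and the `fielded_search` and `term_search` helpers are hypothetical, and they do not represent how Splunk software implements search.

```python
# Hypothetical events: each has raw text plus fields that were already extracted.
events = [
    {"_raw": "status=404 bytes=512 uri=/missing", "status": "404", "bytes": "512"},
    {"_raw": "status=200 bytes=404 uri=/index", "status": "200", "bytes": "404"},
    {"_raw": "status=500 bytes=128 uri=/error", "status": "500", "bytes": "128"},
]


def fielded_search(events, field, value):
    """Keep only events where the named field has exactly this value."""
    return [e for e in events if e.get(field) == value]


def term_search(events, term):
    """Keep events whose raw text contains the term anywhere."""
    return [e for e in events if term in e["_raw"]]


focused = fielded_search(events, "status", "404")
broad = term_search(events, "404")
print(len(focused), len(broad))  # 1 2
```

The bare-term match also returns the event where 404 is a bytes value, which is the kind of unfocused result that the fielded search avoids.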
Fields often appear in events as
key=value pairs such as
user_name=Fred. But in many events, field values appear in fixed, delimited positions without identifying keys. For example, you might have events where the
user_name value always appears by itself after the timestamp and the
Nov 15 09:32:22 00224 johnz
Nov 15 09:39:12 01671 dmehta
Nov 15 09:45:23 00043 sting
Nov 15 10:02:54 00676 lscott
Splunk software can identify these fields using a custom field extraction.
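As a rough sketch of what such an extraction does, the following Python snippet applies one regular expression with named groups to the sample events. It is not Splunk configuration. The pattern, the `extract_fields` helper, and the name `user_id` for the numeric column are assumptions made for illustration.

```python
import re

raw_events = [
    "Nov 15 09:32:22 00224 johnz",
    "Nov 15 09:39:12 01671 dmehta",
    "Nov 15 09:45:23 00043 sting",
    "Nov 15 10:02:54 00676 lscott",
]

# Each named group becomes a field name. The values are found by position only.
PATTERN = re.compile(
    r"^(?P<timestamp>\w{3} \d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<user_id>\d+) "          # assumed name for the numeric column
    r"(?P<user_name>\w+)$"
)


def extract_fields(raw):
    """Return the extracted fields, or an empty dict if the event does not match."""
    match = PATTERN.match(raw)
    return match.groupdict() if match else {}


extracted = [extract_fields(raw) for raw in raw_events]
print(extracted[0]["user_name"])  # johnz
```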
About field extraction
As Splunk software processes events, it extracts fields from them. This process is called field extraction.
Splunk software automatically extracts some fields
Splunk software extracts some fields from your events without assistance. It automatically extracts
sourcetype values, timestamps, and several other default fields when it indexes incoming events.
It also extracts fields that appear in your event data as
key=value pairs. This process of recognizing and extracting k/v pairs is called field discovery. You can disable field discovery to improve search performance.
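A simplified way to picture field discovery is a scan for anything shaped like key=value. The Python sketch below is an approximation under that assumption. The `discover_fields` helper and its pattern are hypothetical, and the real discovery logic in Splunk software handles more cases, such as quoted values.

```python
import re

# A key is a run of word characters. A value is everything up to the next space.
KV_PATTERN = re.compile(r"(\w+)=(\S+)")


def discover_fields(raw):
    """Collect every key=value pairing in the raw event text."""
    return dict(KV_PATTERN.findall(raw))


fields = discover_fields("ip_address=192.168.1.1 user_name=Fred status=404")
print(fields["user_name"])  # Fred
```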
When fields appear in events without their keys, Splunk software uses pattern-matching rules called regular expressions to extract those fields as complete k/v pairs. With a properly configured regular expression, Splunk software can extract
user_name=johnz from the previous sample event. Splunk software comes with several field extraction configurations that use regular expressions to identify and extract fields from event data.
For more information about field discovery and an example of automatic field extraction, see "When Splunk software extracts fields," in this manual.
For more information on how Splunk software uses regular expressions to extract fields, see "About Splunk software regular expressions," in this manual.
To get all of the fields in your data, create custom field extractions
To use the power of Splunk software search, create additional field extractions. Custom field extractions allow you to capture and track information that is important to your needs, but which is not automatically discovered and extracted by Splunk software. Any field extraction configuration you provide must include a regular expression that tells Splunk software how to find the field that you want to extract.
All field extractions, including custom field extractions, are tied to a specific
source, source type, or host value. For example, if you create an
ip field extraction, you might tie the extraction configuration for
Custom field extractions should take place at search time, but in certain rare circumstances you can arrange for some custom field extractions to take place at index time. See "When Splunk software extracts fields" in this manual.
Before you create custom field extractions, get to know your data
Before you begin to create field extractions, ensure that you are familiar with the formats and patterns of the event data associated with the
source, source type, or host that you are working with. One way to do this is to investigate the predominant event patterns in your data with the Patterns tab. See "Identify event patterns with the Patterns tab" in the Search Manual.
Here are two events from the same source type, an apache server web access log.
22.214.171.124 - - [03/Jun/2014:20:49:53 -0700] "GET /wp-content/themes/aurora/style.css HTTP/1.1" 200 7464 "http://www.splunk.com/download" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0; Trident/5.0)"
10.1.10.14 - - [03/Jun/2014:20:49:33 -0700] "GET / HTTP/1.1" 200 75017 "-" "Mozilla/5.0 (compatible; Nmap Scripting Engine; http://nmap.org/book/nse.html)"
While these events contain different strings and characters, they are formatted in a consistent manner. They both present values for fields such as
method, URI, status, and bytes in a reliable order.
Reliable means that the
method value is always followed by the
URI value, the
URI value is always followed by the
status value, the
status value is always followed by the
bytes value, and so on. When your events have consistent and reliable formats, you can create a field extraction that accurately captures multiple field values from them.
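Because the order is reliable, a single pattern can capture several fields at once. The Python sketch below shows the idea on the two access log events. It is only an illustration: the pattern, the `parse_access_event` helper, and the `clientip` and `timestamp` group names are assumptions, not a shipped Splunk extraction.

```python
import re

raw_events = [
    '22.214.171.124 - - [03/Jun/2014:20:49:53 -0700] '
    '"GET /wp-content/themes/aurora/style.css HTTP/1.1" 200 7464 '
    '"http://www.splunk.com/download" '
    '"Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0; Trident/5.0)"',
    '10.1.10.14 - - [03/Jun/2014:20:49:33 -0700] "GET / HTTP/1.1" 200 75017 "-" '
    '"Mozilla/5.0 (compatible; Nmap Scripting Engine; http://nmap.org/book/nse.html)"',
]

# The groups follow the fixed order: method, then URI, then status, then bytes.
ACCESS_PATTERN = re.compile(
    r'^(?P<clientip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)


def parse_access_event(raw):
    """Extract the leading fields from one access log event."""
    match = ACCESS_PATTERN.match(raw)
    return match.groupdict() if match else {}


parsed = [parse_access_event(raw) for raw in raw_events]
print(parsed[1]["uri"], parsed[1]["bytes"])  # / 75017
```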
For contrast, look at this set of Cisco ASA firewall log events:
While these events contain field values that are always space-delimited, they do not share a reliable format like the preceding two events. In order, these events represent:
1. A group policy change
2. An IGMP request
3. A TCP connection
4. A firewall access denial for a request from a specific IP
Because these events differ so widely, it is difficult to create a single field extraction that can apply to each of these event patterns and extract relevant field values.
In situations like this, where a specific host, source type, or source contains multiple event patterns, you may want to define field extractions that match each pattern, rather than designing a single extraction that can apply to all of the patterns. Inspect the events to identify text that is common and reliable for each pattern.
Using required text in field extractions
In the last four events, the strings of numbers that follow
%ASA-#- have specific meanings. You can find their definitions in the Cisco documentation. When you have unique event identifiers like these in your data, specify them as required text in your field extraction. Required text strings limit the events that can match the regular expression in your field extraction.
Specifying required text is optional, but it offers multiple benefits. Because required text reduces the set of events that the field extraction scans, it improves field extraction efficiency and decreases the number of false-positive field extractions.
The field extractor utility enables you to highlight text in a sample event and specify that it is required text.
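The effect of required text can be sketched as a cheap substring check that runs before the regular expression. In the Python snippet below, everything is hypothetical: the event lines are made up and only imitate the %ASA-#- prefix, the message numbers are placeholders rather than real Cisco identifiers, and the `extract_with_required_text` helper is not a Splunk API.

```python
import re

# Made-up lines that only imitate the %ASA-#- prefix. They are not real Cisco output.
raw_events = [
    "host1 %ASA-6-111111: connection from 10.0.0.5 to 10.0.0.9",
    "host1 %ASA-4-222222: denied request from 10.0.0.7",
    "host1 %ASA-6-111111: connection from 10.0.0.8 to 10.0.0.9",
]

REQUIRED_TEXT = "%ASA-6-111111"  # placeholder identifier, assumed for this sketch
PATTERN = re.compile(r"from (?P<src_ip>\d+\.\d+\.\d+\.\d+)")


def extract_with_required_text(raw):
    """Run the regular expression only on events that contain the required text."""
    if REQUIRED_TEXT not in raw:
        return {}
    match = PATTERN.search(raw)
    return match.groupdict() if match else {}


results = [extract_with_required_text(raw) for raw in raw_events]
print(results)
```

On its own, the pattern would also match the second event, so the required text is what prevents that false-positive extraction.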
Methods of custom field extraction in Splunk software
As a knowledge manager, you oversee the set of custom field extractions created by users of your Splunk deployment, and you might define specialized groups of custom field extractions yourself. The ways that you can do this include:
- The field extractor utility, which generates regular expressions for your field extractions.
- Adding field extractions through pages in Settings. You must provide a regular expression.
- Manual addition of field extraction configurations at the
.conf file level. This method provides the most flexibility for field extraction.
The field extraction methods that are available to Splunk software users are described in the following sections. All of these methods enable you to create search-time field extractions. To create an index-time field extraction, choose the third option: Configure field extractions directly in configuration files.
Let the field extractor build extractions for you
The field extractor utility leads you step-by-step through the field extraction design process. It provides two methods of field extraction: regular expressions and delimiter-based field extraction. The regular expression method is useful for extracting fields from unstructured event data, where events may follow a variety of different event patterns. It is also helpful if you are unfamiliar with regular expression syntax and usage, because it generates regular expressions and lets you validate them.
The delimiter-based field extraction method is suited to structured event data. Structured event data comes from sources like SQL databases and CSV files, and produces events where all fields are separated by a common delimiter, such as commas, spaces, or pipe characters. Regular expressions usually are not necessary for structured data events from a common source.
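The delimiter-based approach amounts to splitting each event on the delimiter and pairing the pieces with field names. The Python sketch below assumes a pipe-delimited event and a list of field names, both of which are hypothetical. The `extract_delimited` helper is for illustration and is not part of Splunk software.

```python
import csv
import io


def extract_delimited(raw, field_names, delimiter=","):
    """Split one event on the delimiter and pair the values with field names."""
    reader = csv.reader(io.StringIO(raw), delimiter=delimiter)
    values = next(reader)
    return dict(zip(field_names, values))


fields = extract_delimited(
    "2014-06-03|johnz|200|7464",                   # hypothetical structured event
    ["date", "user_name", "status", "bytes"],      # hypothetical field names
    delimiter="|",
)
print(fields["user_name"])  # johnz
```

No regular expression is involved, which is why this method suits events where every field is separated by the same delimiter.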
With the regular expression method of the field extractor you can:
- Set up a field extraction by selecting a sample event and highlighting fields to extract from that event.
- Create individual extractions that capture multiple fields.
- Improve extraction accuracy by detecting and removing false positive matches.
- Validate extraction results by using search filters to ensure specific values are being extracted.
- Specify that fields only be extracted from events that have a specific string of required text.
- Review stats tables of the field values discovered by your extraction.
- Manually configure the regular expression for the field extraction yourself.
With the delimiter method of the field extractor you can:
- Identify a delimiter to extract all of the fields in an event.
- Rename specific fields as appropriate.
- Validate extraction results.
The field extractor can build only search-time field extractions that are associated with specific sources or source types in your data. It cannot build extractions that are associated with hosts.
For more information about using the field extractor, see "Build field extractions with the field extractor" in this manual.
Define field extractions with the Field extractions and Field transformations pages
You can use the Field extractions and Field transformations pages in Settings to define and maintain complex extracted fields in Splunk Web.
This method of field extraction creation lets you create a wider range of field extractions than you can generate with the field extractor utility. It requires that you have the following knowledge.
- Understand how to design regular expressions.
- Have a basic understanding of how field extractions are configured in
If you create a custom field extraction that extracts its fields from
_raw and does not require a field transform, use the field extractor utility. The field extractor can generate regular expressions, and it can give you feedback about the accuracy of your field extractions as you define them.
Use the Field Extractions page to create basic field extractions, or use it in conjunction with the Field Transformations page to define field extraction configurations that can do the following things.
- Reuse the same regular expression across multiple sources, source types, or hosts.
- Apply multiple regular expressions to the same source, source type, or host.
- Use a regular expression to extract fields from the values of another field.
The Field extractions and Field transformations pages define only search-time field extractions.
See the following topics in this manual:
Configure field extractions directly in configuration files
To get complete control over your field extractions, add the configurations directly into
transforms.conf. This method lets you create field extractions with capabilities that extend beyond what you can create with Splunk Web methods such as the field extractor utility or the Settings pages. For example, with the configuration files, you can set up:
- Delimiter-based field extractions.
- Extractions for multivalue fields.
- Extractions of fields with names that begin with numbers or underscores. This action is typically not allowed unless key cleaning is disabled.
- Formatting of extracted fields.
See "Create and maintain search-time extractions through configuration files," in this manual.
You can create index-time field extractions only by configuring them in
transforms.conf. Adding to the default set of indexed fields can result in search performance and indexing problems. But if you must create additional index-time field extractions, see "Create custom fields at index time" in the Getting Data In manual.
Create custom calculated fields and multivalue fields
Two kinds of custom fields can be persistently configured with the help of
.conf files: calculated fields and multivalue fields.
Multivalue fields can appear multiple times in a single event, each time with a different value. To configure custom multivalue fields, make changes to
fields.conf as well as to
props.conf. See "Configure multivalue fields" in this manual.
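To picture what a multivalue field holds, the Python sketch below collects every value of one field from a single event into a list. It does not show the configuration file settings themselves. The event text and the `extract_multivalue` helper are hypothetical.

```python
import re


def extract_multivalue(raw, field):
    """Return every value that the named field takes within one event."""
    pattern = re.compile(rf"\b{re.escape(field)}=(\S+)")
    return pattern.findall(raw)


# Hypothetical event in which the "to" field appears twice.
raw = "to=fred@example.com to=ann@example.com from=bob@example.com"
values = extract_multivalue(raw, "to")
print(values)  # ['fred@example.com', 'ann@example.com']
```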
Calculated fields provide values that are calculated from the values of other fields present in the event, with the help of
eval expressions. Configure them in
props.conf. See "Define calculated fields" in this manual.
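A calculated field can be thought of as a named expression that is evaluated against the other fields in each event. The Python sketch below illustrates that idea only. The `kilobytes` field, the sample event, and the `add_calculated_fields` helper are assumptions, and the snippet does not use eval syntax.

```python
def add_calculated_fields(event, calculations):
    """Return a copy of the event with each calculated field added."""
    enriched = dict(event)
    for name, expression in calculations.items():
        enriched[name] = expression(enriched)
    return enriched


event = {"status": "200", "bytes": "7464"}  # hypothetical extracted fields
calculations = {
    # Hypothetical calculated field derived from the existing bytes field.
    "kilobytes": lambda e: int(e["bytes"]) / 1024,
}

result = add_calculated_fields(event, calculations)
print(result["kilobytes"])
```

The original event is left unchanged, so the calculated value exists only in the enriched copy.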
Build field extractions into search strings
The following search commands facilitate the search-time extraction of fields in different ways:
See "Extract fields with search commands" in the Search Manual. Alternatively, you can look up each of these commands in the Search Reference.
Field extractions facilitated by search commands apply only to the results returned by the searches in which you use these commands. You cannot use these search commands to create reusable extractions that persist after the search is completed. For that, use the field extractor utility, configure extractions with the Settings pages, or set up configurations directly in the
This documentation applies to the following versions of Splunk® Enterprise: 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.3.14, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.4.9, 6.4.10, 6.4.11