Configure multivalue fields
Multivalue fields are fields that can appear multiple times in an event and have a different value for each appearance. A common example is the email address field, which typically appears two or three times in a single sendmail event: once for the sender, once for the list of recipients, and possibly a third time for the list of Cc addresses, if one exists. If all of these fields are labeled identically (as "AddressList," for example), they lose the meaning they would have if they were identified separately as "From", "To", and "Cc".
Splunk parses multivalue fields at search time, and enables you to process the values in the search pipeline. Search commands that work with multivalue fields include makemv, mvcombine, mvexpand, and nomv. For more information on these and other commands, see the topic on multivalue fields in the User manual and the Search Reference manual.
Use the TOKENIZER key to configure multivalue fields in fields.conf. TOKENIZER uses a regular expression to tell Splunk how to recognize and extract multiple field values for a recurring field in an event. Edit fields.conf in $SPLUNK_HOME/etc/system/local/, or your own custom application directory in $SPLUNK_HOME/etc/apps/.
For more information on configuration files in general, see "About configuration files" in the Admin manual.
For a primer on regular expression syntax and usage, see Regular-Expressions.info. You can test regexes by using them in searches with the rex search command. Splunk also maintains a list of useful third-party tools for writing and testing regular expressions.
Configure a multivalue field via fields.conf
Define a multivalue field by adding a stanza for it in fields.conf. Then add lines that use the TOKENIZER key and an identical regular expression to locate the multivalue field and assign different values to each occurrence of it in the event.
Note: If you have other attributes to set for a multivalue field, set them in the same stanza underneath the TOKENIZER line. See the fields.conf topic in the Admin manual for more information.
[<field value 1>]
TOKENIZER = $REGEX

[<field value 2>]
TOKENIZER = $REGEX
- Multivalue fields should only be extracted at search time.
- Set <field value> to the name of a field you've defined in props.conf.
- For TOKENIZER, define a regular expression that identifies the multivalue field for Splunk.
- Repeat the above for each value of the field. Note that the regex stays the same for each field value.
- Splunk will give the first occurrence of the field the first value, the second occurrence of the field the second value, and so on.
Note: It's possible to use TOKENIZER on an indexed field, but if you do so, that field will become unsearchable by Splunk. To avoid this, only use TOKENIZER on fields that are extracted at search time.
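Because each stanza repeats the same regular expression, the stanza text can be generated rather than typed by hand. The following is a minimal Python sketch under stated assumptions: render_stanzas is a hypothetical helper, not part of Splunk, and it only builds a string. It does not write to fields.conf or validate the regular expression.

```python
def render_stanzas(field_names, regex):
    """Build fields.conf text: one stanza per field name, all sharing `regex`.

    Hypothetical helper for illustration only; it returns a string and
    never touches any file on disk.
    """
    blocks = [f"[{name}]\nTOKENIZER = {regex}" for name in field_names]
    return "\n\n".join(blocks) + "\n"


# The email pattern used in the example in this topic.
EMAIL_REGEX = r"(\w[\w\.\-]*@[\w\.\-]*\w)"

print(render_stanzas(["To", "From", "Cc"], EMAIL_REGEX))
```

Review the generated text before pasting it into fields.conf in your local or application directory.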
Example
The following example from $SPLUNK_HOME/etc/system/README/fields.conf.example breaks the email fields To, From, and Cc into multiple values. It extracts the list of addresses in each field into the multiple values for that field. This allows you to search for an email address in any of the fields, for example: sourcetype=imap cc=jrodman@fflanda.com
[To]
TOKENIZER = (\w[\w\.\-]*@[\w\.\-]*\w)

[From]
TOKENIZER = (\w[\w\.\-]*@[\w\.\-]*\w)

[Cc]
TOKENIZER = (\w[\w\.\-]*@[\w\.\-]*\w)
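To see what this regular expression picks out of a field, you can try it outside Splunk. The following Python sketch is an approximation, not Splunk's implementation: it assumes that the tokenizer takes the first capture group of each match as one value, and it uses Python's re module, whose syntax can differ from Splunk's regex engine for more complex patterns. The tokenize function and the sample To value are hypothetical.

```python
import re

# Same pattern as the TOKENIZER value in the stanzas above.
TOKENIZER = re.compile(r"(\w[\w\.\-]*@[\w\.\-]*\w)")


def tokenize(field_value):
    """Return the first capture group of every match, in order of appearance."""
    return [m.group(1) for m in TOKENIZER.finditer(field_value)]


# Hypothetical raw value of a To field (not taken from real data).
raw_to = "alice@example.com, Bob Smith <bob.smith@mail.example.org>; carol-j@example.net"

values = tokenize(raw_to)
print(values)

# A search on a single address corresponds to a membership test on the values.
print("carol-j@example.net" in values)
```

Display names, angle brackets, and separators are dropped because they fall outside the capture group, so each address becomes its own value.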
This documentation applies to the following versions of Splunk: 4.0, 4.0.1, 4.0.2, 4.0.3, 4.0.4, 4.0.5, 4.0.6, 4.0.7, 4.0.8, 4.0.9, 4.0.10, 4.0.11