Splunk® Enterprise

Knowledge Manager Manual

Example transform field extraction configurations

These examples present transform field extraction use cases that require you to configure one or more field transform stanzas in transforms.conf and then reference them in a props.conf field extraction stanza.

Configure a field extraction that uses multiple field transforms

You can create transforms that pull field name/value pairs from events, and you can create a field extraction that references two or more field transforms.

Scenario

You have logs that contain multiple field name/field value pairs. While the fields vary from event to event, the pairs always appear in one of two formats.

The logs often come in this format:

[fieldName1=fieldValue1] [fieldName2=fieldValue2]

However, sometimes they are more complicated, logging multiple name/value pairs as a list where the format looks like:

[headerName=fieldName1] [headerValue=fieldValue1], [headerName=fieldName2] [headerValue=fieldValue2]

The list items are separated by commas, and each fieldName is matched with a corresponding fieldValue. In this scenario, you want to pull out the field names and values so that the search results are:

fieldName1=fieldValue1
fieldName2=fieldValue2

Here's an example of an HTTP request event that combines both of the above formats.

[method=GET] [IP=10.1.1.1] [headerName=Host] [headerValue=www.example.com], [headerName=User-Agent] [headerValue=Mozilla], [headerName=Connection] [headerValue=close] [byteCount=255]

You want to develop a single field extraction that would pull the following field/value pairs from that event.

method=GET
IP=10.1.1.1
Host=www.example.com
User-Agent=Mozilla
Connection=close
byteCount=255

Solution

Design two regular expressions, one optimized for each format. The first regular expression matches name/value pairs in the first format and pulls out every matching field/value pair. The second regular expression matches pairs in the headerName/headerValue list format and pulls out those field/value pairs.

Create two unique transforms in transforms.conf, one for each regular expression, and then reference both of them in the corresponding field extraction stanza in props.conf.

Steps

  1. The first transform you add to transforms.conf catches the fairly conventional [fieldName1=fieldValue1] [fieldName2=fieldValue2] case.
    [myplaintransform]
    REGEX=\[(?!(?:headerName|headerValue))([^\s\=]+)\=([^\]]+)\]
    FORMAT=$1::$2
    
  2. The second transform added to transforms.conf catches the slightly more complex [headerName=fieldName1] [headerValue=fieldValue1], [headerName=fieldName2] [headerValue=fieldValue2] case:
    [mytransform]
    REGEX= \[headerName\=([^\s\]]+)\]\s\[headerValue\=([^\]]+)\]
    FORMAT= $1::$2
    
    Both transforms use the <fieldName>::<fieldValue> FORMAT to match each field name in the event with its corresponding value. This setting in FORMAT enables Splunk Enterprise to keep matching the regular expression against a matching event until every matching field/value combination is extracted.
  3. This field extraction stanza, created in props.conf, references both of the field transforms:
    [mysourcetype]
    KV_MODE=none
    REPORT-a = mytransform, myplaintransform
    

Besides using multiple field transforms, the field extraction stanza also sets KV_MODE=none. This disables automatic key-value field extraction for the identified source type while letting your manually defined extractions continue. This ensures that these new regular expressions are not overridden by automatic field extraction, and it also helps increase your search performance.
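
To check the two regular expressions before you deploy them, you can run them against the sample event outside of Splunk Enterprise. The following is a minimal Python sketch, not Splunk's implementation. It assumes that each headerName bracket is separated from its headerValue bracket by a single whitespace character, as in the sample event, and it allows hyphens in header names so that User-Agent matches. The function name extract and the constant names are hypothetical.

```python
import re

# Sample HTTP request event from the scenario above.
EVENT = (
    "[method=GET] [IP=10.1.1.1] [headerName=Host] [headerValue=www.example.com], "
    "[headerName=User-Agent] [headerValue=Mozilla], "
    "[headerName=Connection] [headerValue=close] [byteCount=255]"
)

# Plain [name=value] pairs, skipping headerName/headerValue brackets.
PLAIN = re.compile(r"\[(?!(?:headerName|headerValue))([^\s\=]+)\=([^\]]+)\]")

# [headerName=X] [headerValue=Y] pairs (assumed whitespace between brackets).
HEADER = re.compile(r"\[headerName\=([^\s\]]+)\]\s\[headerValue\=([^\]]+)\]")


def extract(event):
    """Apply both patterns repeatedly and collect name -> value pairs."""
    fields = {}
    for pattern in (HEADER, PLAIN):
        # findall keeps matching until no more pairs are found,
        # similar in spirit to FORMAT = $1::$2.
        for name, value in pattern.findall(event):
            fields.setdefault(name, value)
    return fields


fields = extract(EVENT)
for name, value in fields.items():
    print(f"{name}={value}")
```

The sketch models only the regular expression matching. It does not model other search-time behavior in Splunk Enterprise, such as how extracted field names are cleaned.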

For more information on automatic key-value field extraction, see Automatic key-value field extraction for search-time data.

Configure delimiter-based field extractions

You can use the DELIMS attribute in field transforms to configure field extractions for events where field values or field/value pairs are separated by delimiters such as commas, colons, tabs, and more.

You have a recurring multiline event where each field/value pair sits on its own line, and each field name is separated from its value by a colon followed by a tab. Here's a sample event:

ComponentId:     Application Server
ProcessId:   5316
ThreadId:    00000000
ThreadName:  P=901265:O=0:CT
SourceId:    com.ibm.ws.runtime.WsServerImpl
ClassName:   
MethodName:  
Manufacturer:    IBM
Product:     WebSphere
Version:     Platform 7.0.0.7 [BASE 7.0.0.7 cf070942.55]
ServerName:  sfeserv36Node01Cell\sfeserv36Node01\server1
TimeStamp:   2010-04-27 09:15:57.671000000
UnitOfWork:  
Severity:    3
Category:    AUDIT
PrimaryMessage:  WSVR0001I: Server server1 open for e-business
ExtendedMessage: 

Steps

  1. Configure the following stanza in transforms.conf:
    [activity_report]
    DELIMS = "\n", ":\t"
    
    This states that the field/value pairs in the event are on separate lines ("\n"), and then specifies that the field name and field value on each line are separated by a colon and tab (":\t").
  2. Configure the following stanza in props.conf:
    [activitylog]
    LINE_BREAKER = [-]{8,}([\r\n]+)
    SHOULD_LINEMERGE = false
    REPORT-activity = activity_report
    

These two brief configurations extract every field/value pair in the event. Compared to writing a separate regular expression for each field, they leave less room for error and are more flexible.
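
The effect of the two delimiters can be approximated outside of Splunk Enterprise. The following Python sketch splits an event into lines and then splits each line at the first colon-plus-tab sequence. It is an illustration of the idea described in the steps, not Splunk's DELIMS algorithm, whose handling of edge cases such as empty values may differ. The function name parse_delimited and the shortened sample are hypothetical.

```python
# A shortened version of the sample event, with a literal tab after each colon.
SAMPLE = (
    "ComponentId:\tApplication Server\n"
    "ProcessId:\t5316\n"
    "ThreadName:\tP=901265:O=0:CT\n"
    "ClassName:\t\n"
    "Severity:\t3\n"
)


def parse_delimited(event, pair_delim="\n", kv_delim=":\t"):
    """Split an event into pairs, then split each pair into name and value."""
    fields = {}
    for line in event.split(pair_delim):
        # partition splits at the first occurrence only, so colons
        # inside the value (as in ThreadName) are left untouched.
        name, sep, value = line.partition(kv_delim)
        if not sep:
            continue  # no name/value delimiter on this line
        fields[name.strip()] = value.strip()
    return fields


fields = parse_delimited(SAMPLE)
for name, value in fields.items():
    print(f"{name} -> {value!r}")
```

In this sketch a line with no value, such as ClassName, yields an empty string.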

Handling events with multivalue fields

You can use the MV_ADD attribute to extract fields in situations where the same field is used more than once in an event, but has a different value each time. Ordinarily, Splunk Enterprise only extracts the first occurrence of a field in an event; every subsequent occurrence is discarded. But when MV_ADD is set to true in transforms.conf, Splunk Enterprise treats the field like a multivalue field and extracts each unique field/value pair in the event.

Example

You have a set of events.

event1.epochtime=1282182111 type=type1 value=value1 type=type3 value=value3
event2.epochtime=1282182111 type=type2 value=value4 type=type3 value=value5 type=type4 value=value6

The type and value fields are repeated several times in each event. In order to have search type=type3 return both events or to run a count(type) report on the two events that returns 5, create a custom multivalue extraction of the type field for these events.

Steps

Set up your transforms.conf and props.conf files to configure multivalue extraction.

  1. In transforms.conf, add the following.
    [mv-type]
    REGEX = type=(?<type>\S+)
    MV_ADD = true
    
  2. In props.conf for your sourcetype or source, set the following.
    REPORT-type = mv-type
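
The difference between keeping only the first occurrence of a field and keeping every occurrence can be illustrated outside of Splunk Enterprise. The following Python sketch is an approximation under stated assumptions: it matches non-whitespace values with \S+, uses Python's (?P<name>...) syntax for the named group, and the function name extract_type and its mv_add parameter are hypothetical stand-ins for the MV_ADD setting.

```python
import re

# The two sample events from the example above.
EVENTS = [
    "event1.epochtime=1282182111 type=type1 value=value1 type=type3 value=value3",
    "event2.epochtime=1282182111 type=type2 value=value4 type=type3 value=value5 "
    "type=type4 value=value6",
]

# Named group syntax is (?P<name>...) in Python.
TYPE_RE = re.compile(r"type=(?P<type>\S+)")


def extract_type(event, mv_add):
    """Return every type value when mv_add is true, else only the first."""
    values = [m.group("type") for m in TYPE_RE.finditer(event)]
    return values if mv_add else values[:1]


single = [extract_type(e, mv_add=False) for e in EVENTS]
multi = [extract_type(e, mv_add=True) for e in EVENTS]

# Number of events in which some type value equals "type3".
matches_type3 = sum("type3" in values for values in multi)
# Total number of type values across both events.
total = sum(len(values) for values in multi)

print(single)
print(multi)
print(matches_type3, total)
```

With only the first occurrence kept, neither event carries type3, so a search for it would find nothing. With every occurrence kept, both events match and the total count of type values is 5.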

This documentation applies to the following versions of Splunk® Enterprise: 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.2.0, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.5.0, 6.5.1, 6.5.1612 (Splunk Cloud only), 6.5.2, 6.5.3, 6.5.4, 6.6.0


Comments

In the multivalue example, shouldn't the REGEX use \S+ instead of \s+? it would seem the expression would only match and return one or more whitespace characters.

DalJeanis
January 23, 2017
