Example transform field extraction configurations

These examples present transform field extraction use cases that require you to configure one or more field transform stanzas in transforms.conf and then reference them in a props.conf field extraction stanza.

Configure a field extraction that uses multiple field transforms

You can create transforms that pull field name/value pairs from events, and you can create a field extraction that references two or more field transforms.

Scenario

You have logs that contain multiple field name/field value pairs. While the fields vary from event to event, the pairs always appear in one of two formats.

The logs often come in this format:

[fieldName1=fieldValue1] [fieldName2=fieldValue2]

However, sometimes they are more complicated, logging multiple name/value pairs as a list where the format looks like:

[headerName=fieldName1] [headerValue=fieldValue1], [headerName=fieldName2] [headerValue=fieldValue2]

The list items are separated by commas, and each fieldName is matched with a corresponding fieldValue. In this scenario, you want to pull out the field names and values so that the search results are:

fieldName1=fieldValue1
fieldName2=fieldValue2

Here's an example of an HTTP request event that combines both of the above formats.

[method=GET] [IP=10.1.1.1] [headerName=Host] [headerValue=www.example.com], [headerName=User-Agent] [headerValue=Mozilla], [headerName=Connection] [headerValue=close] [byteCount=255]

You want to develop a single field extraction that would pull the following field/value pairs from that event.

method=GET
IP=10.1.1.1
Host=www.example.com
User-Agent=Mozilla
Connection=close
byteCount=255

Solution

You want to design two different regular expressions that are optimized for each format. One regular expression will identify events with the first format and pull out all of the matching field/value pairs. The other regular expression will identify events with the other format and pull out those field/value pairs.

Create two unique transforms in transforms.conf--one for each regex--and then connect them in the corresponding field extraction stanza in props.conf.

Steps

The first transform you add to transforms.conf catches the fairly conventional [fieldName1=fieldValue1] [fieldName2=fieldValue2] case.
```
[myplaintransform]
REGEX=\[(?!(?:headerName|headerValue))([^\s\=]+)\=([^\]]+)\]
FORMAT=$1::$2
```
The second transform added to transforms.conf catches the slightly more complex [headerName=fieldName1] [headerValue=fieldValue1], [headerName=fieldName2] [headerValue=fieldValue2] case:
```
[mytransform]
REGEX=\[headerName\=([^\]]+)\]\s\[headerValue=([^\]]+)\]
FORMAT=$1::$2
```
Both transforms use the <fieldName>::<fieldValue> FORMAT to match each field name in the event with its corresponding value. This setting in FORMAT enables Splunk Enterprise to keep matching the regular expression against a matching event until every matching field/value combination is extracted.
This field extraction stanza, created in props.conf, references both of the field transforms:
```
[mysourcetype]
KV_MODE=none
REPORT-a=mytransform, myplaintransform
```

Besides using multiple field transforms, the field extraction stanza also sets KV_MODE=none. This disables automatic key-value field extraction for the identified source type while letting your manually defined extractions continue. This ensures that these new regular expressions are not overridden by automatic field extraction, and it also helps increase your search performance.

For more information on automatic key-value field extraction, see Automatic key-value field extraction for search-time data.
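
To see what the two regular expressions capture, you can try them outside of Splunk Enterprise. The following is a minimal Python sketch, not a Splunk feature: it copies the two REGEX values from transforms.conf verbatim, applies each one repeatedly to the sample event, and treats the first capture group as the field name and the second as the value, mirroring FORMAT=$1::$2. The function name `extract` is hypothetical, and Python's `re` module is assumed to behave like Splunk's regex engine for these particular patterns.

```python
import re

# Sample HTTP request event from the scenario.
event = (
    "[method=GET] [IP=10.1.1.1] [headerName=Host] [headerValue=www.example.com], "
    "[headerName=User-Agent] [headerValue=Mozilla], "
    "[headerName=Connection] [headerValue=close] [byteCount=255]"
)

# REGEX values copied verbatim from the two transforms.conf stanzas.
MYPLAINTRANSFORM = re.compile(r"\[(?!(?:headerName|headerValue))([^\s\=]+)\=([^\]]+)\]")
MYTRANSFORM = re.compile(r"\[headerName\=([^\]]+)\]\s\[headerValue=([^\]]+)\]")


def extract(raw_event):
    """Hypothetical helper: apply both patterns and collect name/value pairs."""
    fields = {}
    # Same order as REPORT-a=mytransform, myplaintransform.
    for pattern in (MYTRANSFORM, MYPLAINTRANSFORM):
        # findall returns every (group 1, group 2) match, like $1::$2.
        for name, value in pattern.findall(raw_event):
            fields[name] = value
    return fields


fields = extract(event)
for name, value in fields.items():
    print(f"{name}={value}")
```

The negative lookahead in the first pattern is what keeps the plain transform from also emitting headerName and headerValue as field names.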

Configure delimiter-based field extractions

You can use the DELIMS attribute in field transforms to configure field extractions for events where field values or field/value pairs are separated by delimiters such as commas, colons, tab spaces, and more.

You have a recurring multiline event where each field/value pair sits on its own line, and each field name is separated from its value by a colon followed by a tab space. Here's a sample event:

ComponentId:     Application Server
ProcessId:   5316
ThreadId:    00000000
ThreadName:  P=901265:O=0:CT
SourceId:    com.ibm.ws.runtime.WsServerImpl
ClassName:   
MethodName:  
Manufacturer:    IBM
Product:     WebSphere
Version:     Platform 7.0.0.7 [BASE 7.0.0.7 cf070942.55]
ServerName:  sfeserv36Node01Cell\sfeserv36Node01\server1
TimeStamp:   2010-04-27 09:15:57.671000000
UnitOfWork:  
Severity:    3
Category:    AUDIT
PrimaryMessage:  WSVR0001I: Server server1 open for e-business
ExtendedMessage:

Steps

Configure the following stanza in transforms.conf:
```
[activity_report]
DELIMS="\n", ":\t"
```
This states that the field/value pairs in the event are on separate lines ("\n"), and then specifies that the field name and field value on each line are separated by a colon and tab space (":\t").

Then reference the transform in the props.conf stanza for the source type:

[activitylog]
LINE_BREAKER=[-]{8,}([\r\n]+)
SHOULD_LINEMERGE=false
REPORT-activity=activity_report

These two brief configurations extract every field/value pair in the event. Compared with a field extraction based on a long regular expression, they leave less room for error and are more flexible.
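
The splitting described above can be illustrated with a short Python sketch. It is only an approximation under stated assumptions: it uses a shortened copy of the sample event with a literal tab after each colon, and it treats ":\t" as a single two-character separator, as the description above does. Splunk Enterprise's own delimiter handling may differ in detail, and the function name `split_pairs` is hypothetical.

```python
# Shortened copy of the sample event; each line is "name:<TAB>value".
event = "\n".join([
    "ComponentId:\tApplication Server",
    "ProcessId:\t5316",
    "ThreadName:\tP=901265:O=0:CT",
    "ClassName:\t",
    "Severity:\t3",
])


def split_pairs(raw_event, pair_delim="\n", kv_delim=":\t"):
    """Hypothetical helper: split pairs on pair_delim, then name/value on kv_delim."""
    fields = {}
    for line in raw_event.split(pair_delim):
        # partition splits on the first occurrence only, so colons that
        # appear inside a value (as in ThreadName) are left untouched.
        name, sep, value = line.partition(kv_delim)
        if sep:  # ignore lines that do not contain the separator
            fields[name] = value.strip()
    return fields


fields = split_pairs(event)
for name, value in fields.items():
    print(f"{name}={value!r}")
```

In this sketch, fields such as ClassName that have no value in the event come back as empty strings.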

Handling events with multivalue fields

You can use the MV_ADD attribute to extract fields in situations where the same field is used more than once in an event, but has a different value each time. Ordinarily, Splunk Enterprise only extracts the first occurrence of a field in an event; every subsequent occurrence is discarded. But when MV_ADD is set to true in transforms.conf, Splunk Enterprise treats the field like a multivalue field and extracts each unique field/value pair in the event.

Example

You have the following set of events.

event1.epochtime=1282182111 type=type1 value=value1 type=type3 value=value3
event2.epochtime=1282182111 type=type2 value=value4 type=type3 value=value5 type=type4 value=value6

The type and value fields are repeated several times in each event. To have a search for type=type3 return both events, or to have a count(type) report on the two events return 5, create a custom multivalue extraction of the type field for these events.

Steps

Set up your transforms.conf and props.conf files to configure multivalue extraction.

In transforms.conf, add the following.

[mv-type]
REGEX=type=(?<type>\S+)
MV_ADD=true

In props.conf for your sourcetype or source, set the following.
REPORT-type=mv-type
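
The difference between first-occurrence and multivalue extraction can be sketched in Python. This is an illustration rather than Splunk behavior: it assumes the value of type is a run of non-whitespace characters (`\S+`), uses Python's `(?P<type>...)` spelling for the named group, and the helper names `first_only` and `multivalue` are hypothetical.

```python
import re

# The two sample events from the example.
events = [
    "event1.epochtime=1282182111 type=type1 value=value1 type=type3 value=value3",
    "event2.epochtime=1282182111 type=type2 value=value4 type=type3 value=value5 "
    "type=type4 value=value6",
]

# Python spells named groups (?P<name>...); transforms.conf uses (?<name>...).
TYPE_RE = re.compile(r"type=(?P<type>\S+)")


def first_only(event):
    """Keep only the first occurrence, like the default behavior described above."""
    match = TYPE_RE.search(event)
    return [match.group("type")] if match else []


def multivalue(event):
    """Keep every occurrence, like an extraction with MV_ADD=true."""
    return [m.group("type") for m in TYPE_RE.finditer(event)]


single = [first_only(e) for e in events]
multi = [multivalue(e) for e in events]

# How many events would a search for type=type3 return in each mode?
hits_single = sum("type3" in values for values in single)
hits_multi = sum("type3" in values for values in multi)
# Total number of type values across both events in multivalue mode.
total = sum(len(values) for values in multi)

print(hits_single, hits_multi, total)
```

With only the first occurrence kept, neither event carries type3; with every occurrence kept, both do, and the two events hold five type values in total.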
