Example transform field extraction configurations
These examples present transform field extraction use cases that require you to configure one or more field transform stanzas in transforms.conf and then reference them in a props.conf field extraction stanza.
Configure a field extraction that uses multiple field transforms
You can create transforms that pull field name/value pairs from events, and you can create a field extraction that references two or more field transforms.
Scenario
You have logs that contain multiple field name/field value pairs. While the fields vary from event to event, the pairs always appear in one of two formats.
The logs often come in this format:
[fieldName1=fieldValue1] [fieldName2=fieldValue2]
However, sometimes they are more complicated, logging multiple name/value pairs as a list where the format looks like:
[headerName=fieldName1] [headerValue=fieldValue1], [headerName=fieldName2] [headerValue=fieldValue2]
The list items are separated by commas, and each fieldName is matched with a corresponding fieldValue. In this scenario, you want to pull out the field names and values so that the search results are:
fieldName1=fieldValue1 fieldName2=fieldValue2
Here's an example of an HTTP request event that combines both of the above formats.
[method=GET] [IP=10.1.1.1] [headerName=Host] [headerValue=www.example.com], [headerName=User-Agent] [headerValue=Mozilla], [headerName=Connection] [headerValue=close] [byteCount=255]
You want to develop a single field extraction that would pull the following field/value pairs from that event.
method=GET IP=10.1.1.1 Host=www.example.com User-Agent=Mozilla Connection=close byteCount=255
Solution
Design two regular expressions, one optimized for each format. The first regular expression identifies events with the first format and pulls out all of the matching field/value pairs. The second regular expression identifies events with the list format and pulls out those field/value pairs.
Create two unique transforms in transforms.conf, one for each regex, and then connect them in the corresponding field extraction stanza in props.conf.
Steps
- The first transform you add to transforms.conf catches the fairly conventional [fieldName1=fieldValue1] [fieldName2=fieldValue2] case.
  [myplaintransform]
  REGEX=\[(?!(?:headerName|headerValue))([^\s\=]+)\=([^\]]+)\]
  FORMAT=$1::$2
- The second transform added to transforms.conf catches the slightly more complex [headerName=fieldName1] [headerValue=fieldValue1], [headerName=fieldName2] [headerValue=fieldValue2] case:
  [mytransform]
  REGEX=\[headerName\=([^\]]+)\]\s\[headerValue=([^\]]+)\]
  FORMAT=$1::$2
  Both transforms use the <fieldName>::<fieldValue> FORMAT to match each field name in the event with its corresponding value. This setting in FORMAT enables Splunk Enterprise to keep matching the regular expression against a matching event until every matching field/value combination is extracted.
- This field extraction stanza, created in props.conf, references both of the field transforms:
  [mysourcetype]
  KV_MODE=none
  REPORT-a=mytransform, myplaintransform
Besides using multiple field transforms, the field extraction stanza also sets KV_MODE=none. This disables automatic key-value field extraction for the identified source type while letting your manually defined extractions continue. This ensures that these new regular expressions are not overridden by automatic field extraction, and it also helps increase your search performance.
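The behavior of the two transforms can be checked outside Splunk with a short Python sketch. It assumes that Python's re module treats these two patterns the same way Splunk's regex engine does, and it stands in for the repeated matching that FORMAT=$1::$2 enables by collecting every match. The names PLAIN_RE, HEADER_RE, and extract_fields are hypothetical and exist only for this illustration.

```python
import re

# Python equivalents of the two REGEX settings, copied from the stanzas above.
PLAIN_RE = re.compile(r"\[(?!(?:headerName|headerValue))([^\s\=]+)\=([^\]]+)\]")
HEADER_RE = re.compile(r"\[headerName\=([^\]]+)\]\s\[headerValue=([^\]]+)\]")


def extract_fields(event):
    """Collect every match, pairing group 1 (name) with group 2 (value)."""
    fields = {}
    # Same order as REPORT-a=mytransform, myplaintransform.
    for pattern in (HEADER_RE, PLAIN_RE):
        for name, value in pattern.findall(event):
            fields[name] = value
    return fields


event = (
    "[method=GET] [IP=10.1.1.1] [headerName=Host] [headerValue=www.example.com], "
    "[headerName=User-Agent] [headerValue=Mozilla], "
    "[headerName=Connection] [headerValue=close] [byteCount=255]"
)

for name, value in extract_fields(event).items():
    print(f"{name}={value}")
```

The negative lookahead in the first pattern is what keeps headerName and headerValue from being extracted as field names in their own right; only the second pattern handles those brackets.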
For more information on automatic key-value field extraction, see Automatic key-value field extraction for search-time data.
Configure delimiter-based field extractions
You can use the DELIMS attribute in field transforms to configure field extractions for events where field values or field/value pairs are separated by delimiters such as commas, colons, tab spaces, and more.
You have a recurring multiline event where a different field/value pair sits on a separate line, and each pair is separated by a colon followed by a tab space. Here's a sample event:
ComponentId: Application Server
ProcessId: 5316
ThreadId: 00000000
ThreadName: P=901265:O=0:CT
SourceId: com.ibm.ws.runtime.WsServerImpl
ClassName:
MethodName:
Manufacturer: IBM
Product: WebSphere
Version: Platform 7.0.0.7 [BASE 7.0.0.7 cf070942.55]
ServerName: sfeserv36Node01Cell\sfeserv36Node01\server1
TimeStamp: 2010-04-27 09:15:57.671000000
UnitOfWork:
Severity: 3
Category: AUDIT
PrimaryMessage: WSVR0001I: Server server1 open for e-business
ExtendedMessage:
Steps
- Configure the following stanza in transforms.conf:
  [activity_report]
  DELIMS="\n", ":\t"
  This setting states that the field/value pairs in the event are on separate lines ("\n"), and then specifies that the field name and field value on each line are separated by a colon and tab space (":\t").
- Rewrite the props.conf stanza above as:
  [activitylog]
  LINE_BREAKER=[-]{8,}([\r\n]+)
  SHOULD_LINEMERGE=false
  REPORT-activity=activity_report
These two brief configurations will extract the same set of fields as before, but they leave less room for error and are more flexible.
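The two-level split that this section describes can be sketched in Python. This is a simplified illustration, not a reproduction of how Splunk evaluates DELIMS: it assumes that ":\t" acts as one literal separator (a colon followed by a tab), as the description above states, and the function name extract_delimited and the shortened sample event are hypothetical.

```python
def extract_delimited(event, pair_delim="\n", kv_delim=":\t"):
    """Split an event into pairs, then split each pair into name and value."""
    fields = {}
    for line in event.split(pair_delim):
        # partition() splits on the first occurrence only, so a value that
        # itself contains a colon (such as PrimaryMessage) stays intact.
        name, sep, value = line.partition(kv_delim)
        if sep and name.strip():
            fields[name.strip()] = value.strip()
    return fields


# Shortened version of the sample event, with explicit tab characters.
sample = (
    "ComponentId:\tApplication Server\n"
    "ProcessId:\t5316\n"
    "Severity:\t3\n"
    "PrimaryMessage:\tWSVR0001I: Server server1 open for e-business"
)

for name, value in extract_delimited(sample).items():
    print(f"{name}={value}")
```

Lines that do not contain the name/value separator are skipped in this sketch, which is a design choice made here for simplicity.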
Handling events with multivalue fields
You can use the MV_ADD attribute to extract fields in situations where the same field is used more than once in an event, but has a different value each time. Ordinarily, Splunk Enterprise only extracts the first occurrence of a field in an event; every subsequent occurrence is discarded. But when MV_ADD is set to true in transforms.conf, Splunk Enterprise treats the field like a multivalue field and extracts each unique field/value pair in the event.
Example
You have a set of events.
event1.epochtime=1282182111 type=type1 value=value1 type=type3 value=value3
event2.epochtime=1282182111 type=type2 value=value4 type=type3 value=value5 type=type4 value=value6
The type and value fields are repeated several times in each event. In order to have search type=type3 return both events or to run a count(type) report on the two events that returns 5, create a custom multivalue extraction of the type field for these events.
Steps
Set up your transforms.conf and props.conf files to configure multivalue extraction.
- In transforms.conf, add the following.
  [mv-type]
  REGEX=type=(?<type>\S+)
  MV_ADD=true
- In props.conf for your sourcetype or source, set the following.
  REPORT-type=mv-type
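The difference that MV_ADD makes can be illustrated with a small Python sketch over the two sample events. It is only a model of the behavior described in this section: the mv_add flag, the extract_type function, and the list-based results are hypothetical stand-ins for Splunk's multivalue fields. The capture group uses \S+ so that it matches the non-whitespace value that follows type=.

```python
import re

# Python spells named groups (?P<name>...); the transform uses (?<type>...).
TYPE_RE = re.compile(r"type=(?P<type>\S+)")


def extract_type(event, mv_add):
    """Return the values of the type field extracted from one event."""
    values = [m.group("type") for m in TYPE_RE.finditer(event)]
    # Without MV_ADD, only the first occurrence of the field is kept.
    return values if mv_add else values[:1]


events = [
    "event1.epochtime=1282182111 type=type1 value=value1 type=type3 value=value3",
    "event2.epochtime=1282182111 type=type2 value=value4 type=type3 value=value5 "
    "type=type4 value=value6",
]

multivalue = [extract_type(e, mv_add=True) for e in events]
single = [extract_type(e, mv_add=False) for e in events]

# Total number of type values, and whether type3 is found in every event.
print(sum(len(v) for v in multivalue), all("type3" in v for v in multivalue))
print(sum(len(v) for v in single), any("type3" in v for v in single))
```

With multivalue extraction the sketch counts five type values and finds type3 in both events; when only the first occurrence is kept, it counts two values and type3 is not found in either event.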
This documentation applies to the following versions of Splunk Cloud Platform™: 8.2.2112, 8.2.2201, 8.2.2202, 8.2.2203, 9.0.2205, 9.0.2208, 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403, 9.2.2406 (latest FedRAMP release), 9.3.2408