Add fields
It would be nice if you could tie your transaction to the API logs; however, they don't have an accountNumber. If you look closely, though, you can see that the J2EE logs and the API logs do share another value: the subscriberID. You want to use this value to build transactions that span both logs. But you have a problem: the values are present in the API logs, but Splunk does not recognize them as a field. This section shows how to extract the subscriberID field from your API logs so that you can construct more transactions.
Understand fields in Splunk
Splunk does more than pull distributed or disparate data into a single location. It also helps you to make sense of the data by organizing it in various ways. One of the most useful organizing tools is fields, key/value pairs that identify patterns in your data.
Raw input data, such as log data, is a series of lines or text strings. Within each line, different substrings can have meaning. For example, a line in weblog data may contain an IP address, a timestamp, a GET or PUT statement, a status code, and so on. Fields identify patterns in the data and allow you to search on them by field name. Splunk automatically recognizes many fields in your data:
- Splunk always extracts the source, sourcetype, and host fields.
- Splunk recognizes many well-known input formats, such as Apache common log format, and automatically extracts the correct fields when that sourcetype is selected. See List of pretrained source types in the Admin manual for a list of the source types for which Splunk automatically creates fields and extracts their values.
- When your log entries already contain terms like messageStatus = INIT, Splunk interprets this syntax as a key/value pair and extracts a field, messageStatus, and a value, INIT.
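As a rough illustration of the key/value idea in the last bullet, the following Python sketch pulls key = value pairs out of a line of text. It is not Splunk's implementation: the pattern, the function name extract_key_values, and every pair in the sample line other than messageStatus = INIT are assumptions made for this example.

```python
import re

# Hypothetical pattern: a word-like key, "=", then either a double-quoted
# value or a run of non-whitespace characters.
KV_PATTERN = re.compile(r'(\w+)\s*=\s*(?:"([^"]*)"|(\S+))')


def extract_key_values(line):
    """Return a dict of the key/value pairs found in one line of text."""
    fields = {}
    for key, quoted, bare in KV_PATTERN.findall(line):
        # Exactly one of the two value groups is non-empty for each match
        # (both are empty only for an empty quoted value).
        fields[key] = quoted if quoted else bare
    return fields


if __name__ == "__main__":
    # Only "messageStatus = INIT" comes from the text above; the other
    # pairs are made up to show quoted and unspaced forms.
    sample = 'messageStatus = INIT retries=3 comment="queued for delivery"'
    print(extract_key_values(sample))
```

Once a pair has been pulled out this way, the key can serve as a field name to search on, which is the role fields play in the rest of this section.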
You can also create field extractions to pull in key/value pairs from your data when Splunk doesn't automatically recognize them. Fields are important for transactions, because you need them to tie your events together. You can modify fields at any time - fields do not have to be baked into the index and can be added, deleted, or changed as you need them. The cool thing is you don't have to set up all your field extractions at the same time -- you can focus on just the fields you need and extract those. You can do this interactively using the interactive field extractor (IFX) or you can create regular expressions that identify your fields.
This section shows how to extract fields in two different ways:
- Extract the subscriberID field from the API logs using the interactive field extractor (IFX) to "train" Splunk to recognize a field.
- Explicitly extract the connectionResult field from the API logs by creating a regex in the configuration files.
Create a field extraction using IFX
The API logs are in a custom format, so you need to do some work to get Splunk to recognize the subscriberID and encode it as a Splunk field. To extract the field:
1. In Splunk Web, run a search that contains API log results:
index="test" sourcetype=apilog
2. Click the arrow to the left of the timestamp for any event in the events window and select Extract Fields.
This opens the interactive field extractor.
3. Choose sourcetype="apilog" from the Restrict field extraction to menu.
4. Copy and paste some subscriberID values from your results into the Sample Values box.
5. When you think you have enough information, click Generate. Splunk generates a pattern for your field based on the information you have given it and the general format of your events.
6. Look at the Sample Extractions and Sample Events to see if it looks like the expression is working. Look for two things: any values that you don't want, and any values that you do want but that aren't displayed for some reason. If you see unwanted values, click the x next to the value to remove it. If you notice a value that is left out, add it to the list of Sample Values.
7. Click Test to go to the Search window and test your extraction.
8. In this case, it looks pretty good. Go back to the IFX window and click Save.
9. Enter subscriberID for the field name and click Save.
10. Return to the Search window.
For another example using the interactive field extractor, see Extract fields interactively in Splunk Web in the User manual.
Create a field extraction using configuration files
If you are comfortable with regular expressions and want the convenience and control of doing everything with a text editor, you can create field extractions directly in the configuration files. See the regular-expressions.info website for a regular expressions reference.
A common pattern used in the Splunk configuration files is to set up a regular expression in transforms.conf, and then apply it to a host, source, or source type in props.conf. This is the pattern used in this example.
This example shows how to extract the connectionResult value in the logs into a result field.
#### 2010-04-03 15:24:04,398
nameSpace: content.static.API
subscriberID: 100521193
callerID: TTCOV100521193-8529990
driver: content.jdbc.ContentDriver
callerAction: MAR10545LA
host: 10.25.50.110
connectionResult: SUCCESS
Details: Successfully updated contentDB
1. Start by constructing a regular expression that finds the field values. The values you want to extract are placed inside parentheses ():
^\s+connectionResult:\s+([A-Za-z]+)
2. From the command line, go to $SPLUNK_HOME/etc/system/local/ and open transforms.conf with a text editor.
3. Add the following stanza to the file:
[api_result-extractor]
REGEX = ^\s+connectionResult:\s+([A-Za-z]+)
FORMAT = result::$1
Things to know:
- [api_result-extractor] defines a name for this stanza that you will use in props.conf.
- REGEX = declares that the rest of the line is a regex.
- ([A-Za-z]+): enclosing part of the regex in parentheses captures that part as a group that you can use to specify the field value. (You can actually extract more than one field with the same regex by using multiple captured groups.) For more about captured groups, see Use Round Brackets for Grouping on the regular-expressions.info website.
- FORMAT = result::$1 assigns the name result to the first captured group.
4. Save and close transforms.conf.
5. In the same directory, open props.conf with a text editor. This is where you assign your regex to the apilog format.
6. Check to see if there is already an [apilog] stanza. (It should contain, for example, the field you extracted with the IFX, as well as the regular expression for line breaks.) If there is not, create it.
7. Add a line to the [apilog] stanza:
REPORT-fields = api_result-extractor
Things to know:
- [apilog] specifies the source type to which this extraction applies. You can also use [host::hostname] or [source::sourcename].
- REPORT-fields = specifies that the named stanza in transforms.conf is a field extraction.
8. Save and close props.conf.
9. Restart Splunk. To do this from the command line, use $SPLUNK_HOME/bin/splunk restart. Then navigate to http://splunkhost:8000 (or whatever host and port you used to install) and log back in. (Use admin and changeme as the user name and password if you haven't changed your login credentials yet.)
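Before restarting, it can help to try the regular expression from the transforms.conf stanza against the sample event outside Splunk. The Python sketch below does that under two assumptions that the text above does not confirm: that the field lines are indented in the raw log (the pattern requires leading whitespace, and the indentation may simply not be shown in the sample), and that the pattern is applied line by line, which the sketch imitates with re.MULTILINE. The function name extract_result is hypothetical.

```python
import re

# Pattern copied from the transforms.conf stanza. re.MULTILINE lets "^" match
# at the start of every line; treating the event this way is an assumption.
RESULT_REGEX = re.compile(r"^\s+connectionResult:\s+([A-Za-z]+)", re.MULTILINE)

# Sample event from this section. The four-space indentation of the field
# lines is assumed, because the pattern needs leading whitespace to match.
SAMPLE_EVENT = (
    "#### 2010-04-03 15:24:04,398\n"
    "    nameSpace: content.static.API\n"
    "    subscriberID: 100521193\n"
    "    callerID: TTCOV100521193-8529990\n"
    "    driver: content.jdbc.ContentDriver\n"
    "    callerAction: MAR10545LA\n"
    "    host: 10.25.50.110\n"
    "    connectionResult: SUCCESS\n"
    "    Details: Successfully updated contentDB\n"
)


def extract_result(event):
    """Return the first captured group (the result value), or None."""
    match = RESULT_REGEX.search(event)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_result(SAMPLE_EVENT))                  # SUCCESS
    print(extract_result("connectionResult: SUCCESS"))   # None (no leading whitespace)
```

If the second case matters for your data, that is, if the field lines are not indented, the pattern would need \s* in place of the first \s+; check this against your real events rather than relying on the sketch.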
Use the field in a transaction
Here is another transaction that goes across two tiers, in this case, the J2EE logs and the API logs. This transaction uses the subscriberID field to stitch together the events:
eventtype="CONTENT_EVENTS" | transaction subscriberID maxspan=1m
A sample event from this transaction looks like:
3/24/10 11:23:23.822 AM
<TRANSACTION date="2010-03-24 11:23:23,822" activityCode="1010" sequenceNumber="102849281" accountNumber="COT8048415232" subscriberID="8048415232" callerID="MAR10665LA" transactionStatus="COMPLETE" result="SUCCESS" host="10.34.51.93" comment="Invocation of Content API for sequenceNumber 102849281 Successful" />
#### 2010-03-24 11:23:23,866
nameSpace: content.static.API
subscriberID: 102849281
callerID: TTCOV102849281-3807279
driver: content.jdbc.ContentDriver
callerAction: MAR10665LA
host: 10.34.50.151
connectionResult: SUCCESS
Details: Successfully updated contentDB
* host=apiserver22
* host=j2eeserver3
* sourcetype=apilog
* sourcetype=j2eelog
* source=/var/log/apilog/apiserver22/API.log
* source=/var/log/j2eelog/j2eeserver3/J2EE.log
* duration=0.044
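To make the idea behind transaction subscriberID maxspan=1m more concrete, here is a conceptual Python sketch that groups events sharing a field value and starts a new group once an event falls more than maxspan after the first event of the open group. This is an illustration of the grouping idea only, not Splunk's algorithm; the function names, the _time key, and the event dictionaries used to exercise it are assumptions.

```python
from datetime import datetime, timedelta


def build_transactions(events, field, maxspan):
    """Group events by a shared field value, limited to maxspan per group.

    events: iterable of dicts, each with a "_time" datetime plus other fields.
    Returns a list of groups; each group is a time-ordered list of events.
    """
    open_groups = {}  # field value -> events in the currently open group
    finished = []
    for event in sorted(events, key=lambda e: e["_time"]):
        key = event.get(field)
        if key is None:
            continue  # an event without the field cannot be tied to a group
        group = open_groups.get(key)
        if group and event["_time"] - group[0]["_time"] > maxspan:
            finished.append(group)  # too far from the first event: close it
            group = None
        if group is None:
            group = []
            open_groups[key] = group
        group.append(event)
    finished.extend(open_groups.values())
    return finished


def duration_seconds(group):
    """Seconds between the first and last event of a group."""
    return (group[-1]["_time"] - group[0]["_time"]).total_seconds()


if __name__ == "__main__":
    # Timestamps follow the sample above; the shared subscriberID value and
    # the third event are invented for illustration.
    events = [
        {"_time": datetime(2010, 3, 24, 11, 23, 23, 822000),
         "sourcetype": "j2eelog", "subscriberID": "102849281"},
        {"_time": datetime(2010, 3, 24, 11, 23, 23, 866000),
         "sourcetype": "apilog", "subscriberID": "102849281"},
        {"_time": datetime(2010, 3, 24, 11, 30, 0),
         "sourcetype": "apilog", "subscriberID": "102849281"},
    ]
    for group in build_transactions(events, "subscriberID", timedelta(minutes=1)):
        print(len(group), "event(s), duration", duration_seconds(group))
```

The first two events land in one group because they share a subscriberID and fall within one minute of each other; the third shares the value but arrives too late, so it forms its own group.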
This documentation applies to the following versions of Splunk: 4.1, 4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5, 4.1.6, 4.1.7, 4.1.8





