
Configure field lookups
Use the dynamic fields lookup feature to add fields to your events with information from an external source, such as a static table (CSV file) or an external (Python) command. You can also add fields based on matching time information.
Note: CSV files with Pre-OS X (OS9 or earlier) Macintosh-style line endings (aka carriage return only, "\r") are not supported.
For example, if you are monitoring logins with Splunk Enterprise and have IP addresses and timestamps for those logins in your Splunk Enterprise index, you can use a dynamic field lookup to map the IP address and timestamp to the MAC address and username information for the matching IP and timestamp data that you have in your DHCP logs.
You can set up a lookup using the Lookups page in Splunk Web or by configuring stanzas in props.conf and transforms.conf. For more information about using the Lookups page, see the fields lookup tutorial in the Splunk Search Tutorial.
This topic shows you how to use props.conf and transforms.conf to set up your lookups.
To set up a lookup using the configuration files:
Important: Do not edit conf files in $SPLUNK_HOME/etc/system/default. Instead, you should edit the file in $SPLUNK_HOME/etc/system/local/ or $SPLUNK_HOME/etc/apps/<app_name>/local/. If the file doesn't exist, create it.
1. Edit transforms.conf to define your lookup table.
Currently you can define two kinds of lookup tables: static lookups (which use CSV files) and external lookups (which use Python scripts). The arguments you use in your transforms stanza indicate the type of lookup table you want to define. Use filename for static lookups and external_cmd for external lookups.
Note: A lookup table must have at least two columns. Each column may have multiple instances of the same value (multi-valued fields). The table must not contain non-UTF-8 characters (plain ASCII text is fine, as is any character set that is also valid UTF-8).
2. Edit props.conf to apply your lookup table automatically.
This step is the same for both static and external lookups. In this configuration file, you specify the fields to match and output (or outputnew, if you don't want to overwrite the output field) from the lookup table that you defined in transforms.conf.
You can have more than one field lookup defined in a single source stanza. Each lookup should have its own unique lookup name; for example, if you have multiple tables, you can name them LOOKUP-table1, LOOKUP-table2, and so on, or something more descriptive.
When you add a lookup to props.conf, the lookup is run automatically and applied to any events from your search that have the matching sourcetype. If your automatic lookup is very slow, it will also impact the speed of your searches.
3. Restart Splunk Enterprise to implement the changes you made to the configuration files.
After restart, you should see the output fields from your lookup table listed in the fields sidebar. From there, you can select the fields to display in each of the matching search results.
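Before you point a static lookup at a CSV file, it can help to check the file against the constraints noted above (valid UTF-8, no carriage-return-only line endings, at least two columns). The following is a minimal sketch of such a check, not a Splunk tool; the function name check_lookup_csv is hypothetical.

```python
import csv
import io


def check_lookup_csv(raw: bytes) -> list:
    """Return a list of problems found in a candidate lookup table (empty if none)."""
    problems = []

    # The table must decode as UTF-8 (plain ASCII is a subset of UTF-8).
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return ["file is not valid UTF-8"]

    # Carriage-return-only line endings: a "\r" that is not part of "\r\n".
    if "\r" in text.replace("\r\n", ""):
        problems.append("carriage-return-only line endings found")

    # The header row must name at least two columns.
    header = next(csv.reader(io.StringIO(text, newline="")), [])
    if len(header) < 2:
        problems.append("table has fewer than two columns")

    return problems
```

You could call it as check_lookup_csv(open(path, "rb").read()) and refuse to deploy the file if the returned list is not empty.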
Set up a fields lookup based on a static file
The simplest fields lookup is based on a static table, specifically a CSV file. The CSV file needs to be located in one of two places:
$SPLUNK_HOME/etc/system/lookups/
$SPLUNK_HOME/etc/apps/<app_name>/lookups/
Create the lookups directory if it does not exist.
1. Edit transforms.conf to define your lookup table.
- In transforms.conf, add a stanza to define your lookup table. The name of the stanza is also the name of your lookup table. You will use this transform in props.conf.
- In this stanza, reference the CSV file's name:
[myLookup]
filename = <filename>
max_matches = <integer>
- Optionally, you can specify the number of matching entries to apply to an event; max_matches indicates that the first (in file order) <integer> number of entries are used. By default, max_matches is 100 for lookups that are not based on a timestamp field.
2. Edit props.conf to apply your lookup table.
- In props.conf, add a stanza with the lookup key. This stanza specifies the lookup table that you defined in transforms.conf and indicates how Splunk Enterprise should apply it to your events:
[<stanza name>]
LOOKUP-<class> = $TRANSFORM <match_field_in_table> OUTPUT|OUTPUTNEW <output_field_in_table>
- stanza name is the sourcetype, host, or source to which this lookup applies, as specified in props.conf. stanza name can't use regex-type syntax.
- $TRANSFORM references the stanza in transforms.conf where you defined your lookup table.
- match_field_in_table is the column in the lookup table that you use to match values.
- output_field_in_table is the column in the lookup table that you add to your events. Use OUTPUTNEW if you don't want to overwrite existing values in your output field.
- You can have multiple columns on either side of the lookup. For example, you could have $TRANSFORM <match_field1>, <match_field2> OUTPUT|OUTPUTNEW <output_field1>, <output_field2>. You can also have one field return two fields, three fields return one field, and so on.
- Use the AS clause if the field names in the lookup table and your events do not match or if you want to rename the field in your event:
[<stanza name>]
LOOKUP-<class> = $TRANSFORM <match_field_in_table> AS <match_field_in_event> OUTPUT|OUTPUTNEW <output_field_in_table> AS <output_field_in_event>
- You can have more than one field after the OUTPUT|OUTPUTNEW clause. If you don't use OUTPUT|OUTPUTNEW, Splunk Enterprise adds all the field names and values from the lookup table to your events.
3. Restart Splunk Enterprise.
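The difference between OUTPUT and OUTPUTNEW described above can be illustrated with a small model. This is only a sketch of the documented behavior, not how Splunk Enterprise implements lookups: it assumes a single match field and uses only the first matching row, and the function and parameter names are hypothetical.

```python
def apply_lookup(event, table, match_field, output_fields, overwrite=True):
    """Return a copy of event enriched from the first row whose match_field agrees.

    overwrite=True models OUTPUT; overwrite=False models OUTPUTNEW.
    """
    enriched = dict(event)
    if match_field not in event:
        return enriched  # nothing to match on
    for row in table:
        if row.get(match_field) == event[match_field]:
            for field in output_fields:
                # OUTPUTNEW keeps a value that the event already has.
                if overwrite or field not in enriched:
                    enriched[field] = row[field]
            break  # use only the first matching row, in file order
    return enriched
```

With an event that already carries a status_type value, overwrite=True replaces it with the table's value, while overwrite=False leaves the existing value in place.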
Example of static fields lookup
Here's an example of setting up lookups for HTTP status codes in an access_combined log. In this example, you want to match the status field in your lookup table (http_status.csv) with the field in your events. Then, you add the status description and status type fields into your events.
The following is the http_status.csv file. You can put this into $SPLUNK_HOME/etc/apps/<app_name>/lookups/. If you're using this in the Search App, put the file into $SPLUNK_HOME/etc/apps/search/lookups/:
status,status_description,status_type
100,Continue,Informational
101,Switching Protocols,Informational
200,OK,Successful
201,Created,Successful
202,Accepted,Successful
203,Non-Authoritative Information,Successful
204,No Content,Successful
205,Reset Content,Successful
206,Partial Content,Successful
300,Multiple Choices,Redirection
301,Moved Permanently,Redirection
302,Found,Redirection
303,See Other,Redirection
304,Not Modified,Redirection
305,Use Proxy,Redirection
307,Temporary Redirect,Redirection
400,Bad Request,Client Error
401,Unauthorized,Client Error
402,Payment Required,Client Error
403,Forbidden,Client Error
404,Not Found,Client Error
405,Method Not Allowed,Client Error
406,Not Acceptable,Client Error
407,Proxy Authentication Required,Client Error
408,Request Timeout,Client Error
409,Conflict,Client Error
410,Gone,Client Error
411,Length Required,Client Error
412,Precondition Failed,Client Error
413,Request Entity Too Large,Client Error
414,Request-URI Too Long,Client Error
415,Unsupported Media Type,Client Error
416,Requested Range Not Satisfiable,Client Error
417,Expectation Failed,Client Error
500,Internal Server Error,Server Error
501,Not Implemented,Server Error
502,Bad Gateway,Server Error
503,Service Unavailable,Server Error
504,Gateway Timeout,Server Error
505,HTTP Version Not Supported,Server Error
1. In a transforms.conf file located in either $SPLUNK_HOME/etc/system/local/ or $SPLUNK_HOME/etc/apps/<app_name>/local/, put:
[http_status]
filename = http_status.csv
2. In a props.conf file, located in either $SPLUNK_HOME/etc/system/local/ or $SPLUNK_HOME/etc/apps/<app_name>/local/, put:
[access_combined]
LOOKUP-http = http_status status OUTPUT status_description, status_type
3. Restart Splunk Enterprise.
Now, when you run a search that returns Web access information, you will see the fields status_description and status_type listed in your fields sidebar menu.
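To see what this lookup adds to an event, the following sketch reproduces the enrichment outside of Splunk Enterprise using a three-row excerpt of http_status.csv. It is an illustration only; the helper names and the sample clientip value are hypothetical.

```python
import csv
import io

# Excerpt of http_status.csv from the example above.
HTTP_STATUS_CSV = """status,status_description,status_type
200,OK,Successful
404,Not Found,Client Error
503,Service Unavailable,Server Error
"""


def load_table(csv_text):
    """Index the lookup rows by their status value."""
    return {row["status"]: row for row in csv.DictReader(io.StringIO(csv_text))}


def add_status_fields(event, table):
    """Copy status_description and status_type onto an event whose status matches."""
    row = table.get(event.get("status"))
    if row is None:
        return dict(event)  # no match: the event is left unchanged
    return {**event,
            "status_description": row["status_description"],
            "status_type": row["status_type"]}


table = load_table(HTTP_STATUS_CSV)
print(add_status_fields({"clientip": "10.1.1.1", "status": "404"}, table))
```

An event whose status value is not in the table comes back without the two extra fields, which mirrors how only matching events gain status_description and status_type.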
Use search results to populate a lookup table
You can edit a local or app-specific copy of savedsearches.conf to use the results of a report to populate a lookup table.
In a report stanza, where the search returns a results table:
1. Add the following line to enable the lookup population action.
action.populate_lookup = 1
This tells Splunk Enterprise to save your results table into a CSV file.
2. Add the following line to tell Splunk Enterprise where to copy your lookup table.
action.populate_lookup.dest = <string>
The action.populate_lookup.dest value is a lookup name from transforms.conf or a path to a CSV file where Splunk Enterprise should copy the search results. If it is a path to a CSV file, the path should be relative to $SPLUNK_HOME.
For example, if you want to save the results to a global lookup table, you might include:
action.populate_lookup.dest = etc/system/lookups/myTable.csv
The destination directory, $SPLUNK_HOME/etc/system/lookups or $SPLUNK_HOME/etc/apps/<app_name>/lookups, should already exist.
3. Add the following line if you want this search to run when Splunk Enterprise starts up.
run_on_startup = true
If it does not run on startup, it will run at the next scheduled time. Generally, we recommend that you set this to true for scheduled searches that populate lookup tables.
Because Splunk Enterprise copies the results of the report to a CSV file, you can set up your fields lookup the same way you set up a static lookup.
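When action.populate_lookup.dest is a file path, two conditions from the steps above matter: the path is interpreted relative to $SPLUNK_HOME, and the destination directory must already exist. The following sketch checks both before you deploy the report. The function name is hypothetical, and the sketch does not cover the case where dest is a lookup name from transforms.conf.

```python
import os


def resolve_populate_dest(splunk_home, dest):
    """Resolve a CSV destination path relative to splunk_home and check its directory."""
    path = os.path.normpath(os.path.join(splunk_home, dest))
    directory = os.path.dirname(path)
    if not os.path.isdir(directory):
        raise FileNotFoundError("destination directory does not exist: " + directory)
    return path
```

For example, resolve_populate_dest(os.environ["SPLUNK_HOME"], "etc/system/lookups/myTable.csv") returns the absolute path of the CSV file, or raises an error if the lookups directory has not been created yet.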
Set up a fields lookup based on an external command or script
For dynamic or external lookups, your transforms.conf stanza references the command or script and arguments to invoke. This is also called a scripted or external lookup.
You can also specify the type of command or script to invoke:
[myLookup]
external_cmd = <string>
external_type = python
fields_list = <string>
max_matches = <integer>
Use fields_list to list all the fields supported by the external command, delimited by a comma and space.
Note: Currently, Splunk Enterprise only supports Python scripts for external lookups. Python scripts used for these lookups must be located in one of two places:
$SPLUNK_HOME/etc/apps/<app_name>/bin
$SPLUNK_HOME/etc/searchscripts
Note: When writing your Python script, if you refer to any external resources (such as a file), the reference must be relative to the directory where the script is located.
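One way to honor this in a script is to build every resource path from the script's own location instead of the current working directory. This is a minimal sketch; the helper name resource_path and the file name in the usage note are hypothetical.

```python
import os


def resource_path(script_file, name):
    """Return the path of a resource that sits next to the given script file."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, name)
```

Inside a lookup script you would call it as resource_path(__file__, "hosts.csv"), so the file is found regardless of the directory from which the script is started.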
External fields lookup example
Here's an example of how you might set up an external fields lookup to match with information from a DNS server. Splunk Enterprise ships with a script located in $SPLUNK_HOME/etc/system/bin/ called external_lookup.py, which is a DNS lookup script that:
- if given a host, returns the IP address.
- if given an IP address, returns the host name.
Splunk also ships with a configuration for this script in $SPLUNK_HOME/etc/system/default/transforms.conf:
[dnslookup]
external_cmd = external_lookup.py clienthost clientip
fields_list = clienthost,clientip
You can run a lookup search that uses this lookup configuration. This one looks at every host value in your events and returns a corresponding clientip value:
sourcetype=access_combined | eval clienthost = host | lookup dnslookup clienthost | stats count by clientip
This search uses an eval statement to say that the clienthost field in the script is equivalent to the host field in your data. It also returns a table that provides a count for each of the clientip values that the script returned.
A search that provides a reverse lookup (returns a host value for each IP address received) would be:
sourcetype=access_combined | lookup dnslookup clientip | stats count by clienthost
Note that this reverse lookup does not include an eval statement. This is because Splunk automatically extracts IP addresses as clientip.
More about the external lookup script
Your external lookup script must take in a partially empty CSV file and output a filled-in CSV file. The arguments that you pass to the script are the headers for these input and output files.
In the DNS lookup example above, the CSV file contains 2 fields, "clienthost" and "clientip". The fields that you pass to this script are the ones you specify in transforms.conf:
external_cmd = external_lookup.py clienthost clientip
Note: If you don't pass these arguments, the script returns an error.
When you run the search command:
... | lookup dnslookup clienthost
You're telling Splunk Enterprise to use the lookup table that you defined in transforms.conf as [dnslookup] and pass into the external command script the values for the "clienthost" field as a CSV file, which may look like this:
clienthost,clientip
work.com
home.net
Basically, this is a CSV file with the headers "clienthost" and "clientip", but missing values for clientip. The two headers are included because they are the fields you specified in the fields_list parameter of transforms.conf.
The script then outputs the following CSV file and returns it to Splunk Enterprise, which populates the clientip field in your results:
clienthost,clientip
work.com,127.0.0.1
home.net,127.0.0.2
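The exchange described above can be sketched as a small script. This is not the shipped external_lookup.py; it is a minimal illustration that assumes the partially empty CSV arrives on standard input and the filled-in CSV is written to standard output. The function names are hypothetical, and the resolver functions are passed in as parameters so the logic can be exercised without a DNS server.

```python
import csv
import socket
import sys


def host_to_ip(host):
    """Resolve a host name to an IP address; return "" on failure."""
    try:
        return socket.gethostbyname(host)
    except OSError:
        return ""


def ip_to_host(ip):
    """Resolve an IP address to a host name; return "" on failure."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return ""


def fill(infile, outfile, hostfield, ipfield,
         resolve_host=host_to_ip, resolve_ip=ip_to_host):
    """Read the partially empty CSV and write it back with the gaps filled in."""
    reader = csv.DictReader(infile)
    fieldnames = reader.fieldnames or [hostfield, ipfield]
    writer = csv.DictWriter(outfile, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        if row.get(hostfield) and not row.get(ipfield):
            row[ipfield] = resolve_host(row[hostfield])
        elif row.get(ipfield) and not row.get(hostfield):
            row[hostfield] = resolve_ip(row[ipfield])
        writer.writerow(row)


# Assumed invocation: external_cmd = <script> clienthost clientip
# Unlike the shipped script, this sketch does nothing when the arguments are missing.
if __name__ == "__main__" and len(sys.argv) == 3:
    fill(sys.stdin, sys.stdout, sys.argv[1], sys.argv[2])
```

The header names are taken from the incoming file, so the output always carries the same columns that Splunk Enterprise sent.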
Set up a time-based fields lookup
If your static or external lookup table has a field value that represents time, you can use this time field to set up your fields lookup. For time-based (or temporal) lookups, add the following lines to your lookup stanza in transforms.conf:
time_field = <field_name>
time_format = <string>
If time_field is present, by default max_matches is 1. Also, the first matching entry in descending time order is applied.
Use the time_format key to specify the strptime() format of your time_field. By default, time_format is %s.%Q, that is, seconds from the Unix epoch in UTC with optional milliseconds.
Note: Splunk Enterprise enables you to use some nonstandard date-time strptime() formats. For example, when you define ISO 8601 timestamps, you may use time_format = '%s.%Q', where %s represents seconds and %Q represents milliseconds. For more information about these additional settings, see the subtopic "Enhanced strptime() support" in "Configure timestamp recognition," in the Getting Data In Manual.
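To make the default %s.%Q format concrete, the following sketch converts a value of that shape (seconds since the Unix epoch, optionally followed by a period and milliseconds) into a UTC datetime. Python's datetime.strptime() does not accept these directives, so the sketch splits the string by hand; the function name is hypothetical.

```python
from datetime import datetime, timezone


def parse_epoch_with_millis(value):
    """Convert a '<seconds>.<milliseconds>' string, counted from the Unix epoch, to UTC."""
    seconds, _, millis = value.partition(".")
    # Pad or truncate the fractional part to exactly three digits (milliseconds).
    micros = int(millis.ljust(3, "0")[:3]) * 1000 if millis else 0
    moment = datetime.fromtimestamp(int(seconds), tz=timezone.utc)
    return moment.replace(microsecond=micros)
```

A lookup table that stores its time column as ISO 8601 text would need a matching time_format instead of this default.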
For a match to occur with time-based lookups, you can also specify offsets for the minimum and maximum amounts of time that an event may be later than a lookup entry. To do this, add the following lines to your stanza:
max_offset_secs = <integer>
min_offset_secs = <integer>
By default, there is no maximum offset and the minimum offset is 0.
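The matching rule for time-based lookups (newest entry first, subject to the minimum and maximum offsets) can be modeled as follows. This is a simplified sketch of the behavior described above, not Splunk's implementation: it assumes the non-time match fields already agree, that times are epoch seconds, and that only one match is returned, as with the default max_matches of 1. The function name is hypothetical.

```python
def pick_entry(entries, event_time, time_field="timestamp",
               min_offset_secs=0, max_offset_secs=None):
    """Return the newest entry whose offset from the event is within the limits.

    entries is a list of dicts whose time_field holds epoch seconds.
    """
    for entry in sorted(entries, key=lambda e: e[time_field], reverse=True):
        offset = event_time - entry[time_field]  # how much later the event is
        if offset < min_offset_secs:
            continue  # the event is not late enough relative to this entry
        if max_offset_secs is not None and offset > max_offset_secs:
            continue  # the entry is too old for this event
        return entry
    return None
```

With the default offsets, an event therefore picks up the most recent lookup entry that is not later than the event itself.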
Example of time-based fields lookup
Here's an example of how you might use DHCP logs to identify users on your network based on their IP address and the timestamp. Let's say the DHCP logs are in a file, dhcp.csv, which contains the timestamp, IP address, and the user's name and MAC address.
1. In a transforms.conf file, put:
[dhcpLookup]
filename = dhcp.csv
time_field = timestamp
time_format = %d/%m/%y %H:%M:%S
2. In a props.conf file, put:
[dhcp]
LOOKUP-table = dhcpLookup ip mac OUTPUT user
3. Restart Splunk Enterprise.
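The time_format in the [dhcpLookup] stanza above determines how the timestamp column of dhcp.csv must be written. As a rough check, assuming these directives behave the same way in Python's strptime(), you can parse a sample value; the timestamp used in the usage note is made up for illustration.

```python
from datetime import datetime

TIME_FORMAT = "%d/%m/%y %H:%M:%S"  # time_format from the [dhcpLookup] stanza


def parse_dhcp_time(value):
    """Parse a dhcp.csv timestamp such as '25/12/13 08:30:00' (day/month/2-digit year)."""
    return datetime.strptime(value, TIME_FORMAT)
```

For example, parse_dhcp_time("25/12/13 08:30:00") yields 25 December 2013 at 08:30:00, while a value in another layout, such as "2013-12-25 08:30:00", raises ValueError.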
This documentation applies to the following versions of Splunk® Enterprise: 6.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.0.6, 6.0.7, 6.0.8, 6.0.9, 6.0.10, 6.0.11, 6.0.12, 6.0.13, 6.0.14, 6.0.15, 6.1, 6.1.1, 6.1.2, 6.1.3, 6.1.4, 6.1.5, 6.1.6, 6.1.7, 6.1.8, 6.1.9, 6.1.10, 6.1.11, 6.1.12, 6.1.13, 6.1.14