Look up fields from external data sources
Use the dynamic fields lookup feature to add fields to your events with information from an external source, such as a static table (CSV file) or an external (Python) command. You can also base more sophisticated lookups on time information.
For example, if you are monitoring logins with Splunk and have IP addresses and timestamps for accesses in your Splunk index, you can use dynamic fields lookup to map each IP address and timestamp to the MAC address and username recorded for that IP address and time in your DHCP logs.
To set up a lookup:
1. Edit transforms.conf to define your lookup table.
Currently you can define two kinds of lookup tables: static lookups (which utilize CSV files) and external lookups (which utilize Python scripts). The arguments you use in your transforms stanza indicate the type of lookup table you want to define. Use filename for static lookups and external_cmd for external lookups.
Note: A lookup table must have at least two columns. Each column may have multiple instances of the same value (multi-valued fields).
2. Edit props.conf to apply your lookup table.
This step is the same for both static and external lookups. In this configuration file, you specify the fields to match and output from the lookup table that you defined in transforms.conf.
3. Restart Splunk to implement the changes you made to the configuration files.
When your lookup is configured properly, you see the output fields from your lookup table in each of the matching events.
Set up a fields lookup based on a static file
The simplest fields lookup is based on a static table CSV file. The CSV file needs to be located in one of two places:
- $SPLUNK_HOME/etc/system/lookups
- $SPLUNK_HOME/etc/apps/<app_name>/lookups
1. Edit transforms.conf to define your lookup table.
In transforms.conf, add a stanza to define your lookup table. The name of the stanza is also the name of your lookup table. You will use this transform in props.conf.
In this stanza, reference the CSV file's name:
[myLookup]
filename = <filename>
max_matches = <integer>
Optionally, you can specify the number of matching entries to apply to an event. max_matches indicates that the first (in file order) <integer> entries are used. By default, max_matches is 1000 for lookups that are not time-based.
2. Edit props.conf to apply your lookup table.
In props.conf, add a stanza with the lookup key. This stanza specifies the lookup table that you defined in transforms.conf and indicates how Splunk should apply it to your events:
[<stanza name>]
lookup_<class> = $TRANSFORM <match_field_in_table> OUTPUT <output_field_in_table>
- $TRANSFORM references the stanza in transforms.conf where you defined your lookup table.
- match_field_in_table is the column in the lookup table that you use to match values.
- output_field_in_table is the column in the lookup table that you add to your events.
- You can have multiple columns on either side of the lookup. For example, you could have $TRANSFORM <match_field1>, <match_field2> OUTPUT <output_field1>, <output_field2>. You can also have one field return two fields, three fields return one field, and so on.
Use the AS clause if the field names in the lookup table and your events do not match or if you want to rename the field in your event:
[<stanza name>]
lookup_<class> = $TRANSFORM <match_field_in_table> AS <match_field_in_event> OUTPUT <output_field_in_table> AS <output_field_in_event>
You can have more than one field after the OUTPUT clause. If you omit the OUTPUT clause, Splunk adds all the field names and values from the lookup table to your events.
3. Restart Splunk.
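The matching rules described above can be pictured with a small Python sketch. This is not Splunk's implementation: the function name apply_lookup, the sample table, and the choice to store output values as lists and to overwrite existing event fields are all assumptions made for illustration. It only shows how the match side, the OUTPUT side, the AS renaming, and max_matches relate to each other.

```python
import csv
import io


def apply_lookup(event, table_rows, match, output=None, max_matches=1000):
    """Return a copy of event enriched from the matching lookup rows.

    match  maps table column -> event field (AS on the match side).
    output maps table column -> event field (AS on the output side).
           None stands for "no OUTPUT clause": add every non-match column.
    """
    hits = [row for row in table_rows
            if all(row[col] == event.get(field) for col, field in match.items())]
    hits = hits[:max_matches]  # only the first matches, in file order

    added = {}
    for row in hits:
        cols = output if output is not None else {c: c for c in row if c not in match}
        for col, field in cols.items():
            added.setdefault(field, []).append(row[col])

    enriched = dict(event)
    enriched.update(added)  # assumption: lookup output overwrites existing fields
    return enriched


# Hypothetical lookup table whose column names differ from the event fields.
TABLE = "code,label,kind\n200,OK,Successful\n404,Not Found,Client Error\n"
rows = list(csv.DictReader(io.StringIO(TABLE)))
event = {"status": "404"}

# Like: code AS status OUTPUT label AS status_label
renamed = apply_lookup(event, rows, match={"code": "status"},
                       output={"label": "status_label"})

# Like: code AS status (no OUTPUT clause, so all other columns are added)
everything = apply_lookup(event, rows, match={"code": "status"})
```

Here renamed gains a status_label field, while everything gains both label and kind.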
Example of static fields lookup
Here's an example of setting up lookups for HTTP status codes in an access_combined log.
In this example, you match the status field in your lookup table (http_status.csv) with the status field in your events, and then add the status description and status type fields to your events.
1. In a transforms.conf file, put:
[http_status]
filename = http_status.csv
2. In a props.conf file, put:
[access_combined]
lookup_table = http_status status OUTPUT status_description, status_type
3. Restart Splunk.
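The example assumes that http_status.csv already exists. One way to produce it is a short Python script such as the sketch below. The column names come from the props.conf line above; the rows themselves and the function name write_http_status_csv are illustrative assumptions, so replace them with the status codes and wording you need.

```python
import csv

# Illustrative sample rows only; extend with the status codes you care about.
ROWS = [
    ("200", "OK", "Successful"),
    ("301", "Moved Permanently", "Redirection"),
    ("404", "Not Found", "Client Error"),
    ("500", "Internal Server Error", "Server Error"),
]


def write_http_status_csv(path):
    """Write a header row plus ROWS to path in CSV form."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        # The header must use the column names referenced in props.conf.
        writer.writerow(["status", "status_description", "status_type"])
        writer.writerows(ROWS)
```

Call write_http_status_csv with a path inside one of the lookups directories listed earlier, for example the lookups directory of your app.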
Use search results to populate a lookup table
You can use the results of a saved search to populate a lookup table. In a local or app-specific copy of savedsearches.conf:
1. Define a lookup. Optionally, test it with a search that uses the lookup search command to make sure it is correct.
2. Enable the lookup population action.
3. Tell Splunk where to copy your lookup table.
To complete steps 2 and 3, add these two lines to the stanza for your saved search:
action.populate_lookup = 1
action.populate_lookup.dest = <string>
The action.populate_lookup.dest value is the path to a CSV file where Splunk copies the search results. For this action to work, the destination directories must already exist; they are either:
- $SPLUNK_HOME/etc/system/lookups
- $SPLUNK_HOME/etc/apps/<app_name>/lookups
Because Splunk copies the results of the saved search to a CSV file, you can set up your fields lookup the same way you set up a static lookup.
Set up a fields lookup based on an external command
For an external lookup, your transforms.conf stanza references the command or script and arguments to invoke. You can also specify the type of command or script to invoke:
[myLookup]
external_cmd = <string>
external_type = python
fields_list = <string>
max_matches = <integer>
Note: Currently, Splunk only supports Python scripts for external-command-based field lookups. Python scripts used for these lookups must be located in one of two places:
- $SPLUNK_HOME/etc/apps/<app_name>/bin
- $SPLUNK_HOME/etc/searchscripts
Use fields_list to list all the fields supported by the external command, delimited by a comma and space.
Example of external fields lookup
Here's an example of how you might use external lookups to match with information from a DNS server. In this example, dnslookup.py is a script that:
- if given a host, returns the IP address.
- if given an IP address, returns the host name.
1. In a transforms.conf file, put:
[dnsLookup]
external_cmd = dnslookup.py host ip
fields_list = host, ip
2. In a props.conf file, put:
[dns]
lookup_ip = dnsLookup host OUTPUT ip
For a reverse DNS lookup, your props.conf stanza would be:
[reversedns]
lookup_host = dnsLookup ip OUTPUT host
3. Restart Splunk.
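The contents of dnslookup.py are not shown here. The sketch below is one possible shape for such a script, written under assumptions that this topic does not confirm: that Splunk hands the script a CSV table with a host,ip header on standard input and expects the same table, with the blanks filled in, on standard output; and that a modern Python 3 interpreter is available. The function names fill_row and run are hypothetical.

```python
import csv
import socket


def fill_row(row, resolve=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Fill in whichever of host/ip is missing; leave the row alone on failure."""
    try:
        if row.get("host") and not row.get("ip"):
            row["ip"] = resolve(row["host"])        # forward lookup
        elif row.get("ip") and not row.get("host"):
            row["host"] = reverse(row["ip"])[0]     # reverse lookup, primary name
    except OSError:
        pass  # resolution errors (gaierror, herror) leave the field empty
    return row


def run(infile, outfile):
    """Copy CSV rows from infile to outfile, completing host/ip pairs."""
    reader = csv.DictReader(infile)
    if reader.fieldnames is None:
        return  # empty input: nothing to do
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        writer.writerow(fill_row(row))
```

To use it as a script, end the file with a call to run(sys.stdin, sys.stdout) after importing sys. The resolve and reverse parameters exist so that the logic can be exercised without a live DNS server.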
Set up a time-based fields lookup
If your static or external lookup table has a field value that represents time, you can use this time field to set up your fields lookup. For time-based lookups, add the following lines to your lookup stanza in transforms.conf:
time_field = <field_name>
time_format = <string>
If time_field is present, max_matches defaults to 1, and the first matching entry in descending time order is applied.
Use the time_format key to specify the strptime format of your time_field. By default, time_format is UTC.
For a match to occur with time-based lookups, you can also specify offsets for the minimum and maximum amounts of time that an event may be later than a lookup entry. To do this, add the following lines to your stanza:
max_offset_secs = <integer>
min_offset_secs = <integer>
By default, there is no maximum offset and the minimum offset is 0.
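The offset rules can be read as a filter followed by a sort. The following Python sketch expresses that reading; it is an interpretation of the description above, not Splunk's code. It assumes that times are plain epoch seconds, that "descending order" means most recent entry first, and that the offset bounds are inclusive. The function name time_based_matches is hypothetical.

```python
def time_based_matches(event_time, entries, min_offset_secs=0,
                       max_offset_secs=None, max_matches=1):
    """Pick lookup rows whose time precedes the event within the offset window.

    entries is a list of (entry_time, row) pairs; times are epoch seconds.
    """
    candidates = []
    for entry_time, row in entries:
        lag = event_time - entry_time  # how much later the event is than the entry
        if lag < min_offset_secs:
            continue  # entry is too recent (or in the future)
        if max_offset_secs is not None and lag > max_offset_secs:
            continue  # entry is too old
        candidates.append((entry_time, row))
    # Most recent entry first, i.e. descending time order.
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in candidates[:max_matches]]


# Hypothetical entries at t=100, 200 and 300 seconds.
entries = [(100, "a"), (200, "b"), (300, "c")]
latest = time_based_matches(250, entries)
```

With the defaults, an event at t=250 picks up only the entry from t=200: the entry at t=300 is later than the event, and max_matches of 1 drops the older entry at t=100.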
Example of time-based fields lookup
Here's an example of how you might use DHCP logs to identify users on your network based on their IP address and the timestamp. Let's say the DHCP logs are in a file, dhcp.csv, which contains the timestamp, IP address, and the user's name and MAC address.
1. In a transforms.conf file, put:
[dhcpLookup]
filename = dhcp.csv
time_field = timestamp
time_format = %d/%m/%y %H:%M:%S
2. In a props.conf file, put:
[dhcp]
lookup_table = dhcpLookup ip mac OUTPUT user
3. Restart Splunk.
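Before relying on a time_format string, it can help to check that it parses your timestamps the way you expect. The sketch below uses Python's datetime.strptime, which accepts the same directives used in this example; treating it as equivalent to Splunk's own parser is an assumption. The sample timestamp and the function name parse_dhcp_time are illustrative.

```python
from datetime import datetime

# Same value as time_format in the [dhcpLookup] stanza.
TIME_FORMAT = "%d/%m/%y %H:%M:%S"


def parse_dhcp_time(text):
    """Parse a dhcp.csv timestamp such as '03/07/09 14:25:00'."""
    return datetime.strptime(text, TIME_FORMAT)


# Day comes first in this format, so this is 3 July 2009, not 7 March.
sample = parse_dhcp_time("03/07/09 14:25:00")
```

A timestamp that does not fit the format raises ValueError, which is a quick signal that time_format and the contents of dhcp.csv disagree.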
This documentation applies to the following versions of Splunk: 3.4.10, 3.4.11, 3.4.12, 3.4.13, 3.4.14.