Splunk® Enterprise

Knowledge Manager Manual

Splunk Enterprise version 5.0 reached its End of Life on December 1, 2017. Please see the migration information.
This documentation does not apply to the most recent version of Splunk.

Configure field lookups

Use the dynamic fields lookup feature to add fields to your events with information from an external source, such as a static table (CSV file) or an external (Python) command. You can also add fields based on matching time information.

Note: CSV files with Pre-OS X (OS9 or earlier) Macintosh-style line endings (aka carriage return only, "\r") are not supported.

For example, if you are monitoring logins with Splunk and have IP addresses and timestamps for those logins in your Splunk index, you can use a dynamic field lookup to map the IP address and timestamp to the MAC address and username information for the matching IP and timestamp data that you have in your DHCP logs.

You can set up a lookup using the Lookups Manager page in Splunk Web or by configuring stanzas in props.conf and transforms.conf. For more information about using the Lookups Manager, see the fields lookup tutorial in the Splunk Tutorial. This topic shows you how to use props.conf and transforms.conf to set up your lookups.

To set up a lookup using the configuration files:

Important: Do not edit conf files in $SPLUNK_HOME/etc/system/default. Instead, you should edit the file in $SPLUNK_HOME/etc/system/local/ or $SPLUNK_HOME/etc/apps/<app_name>/local/. If the file doesn't exist, create it.

1. Edit transforms.conf to define your lookup table.

Currently you can define two kinds of lookup tables: static lookups (which utilize CSV files) and external lookups (which utilize Python scripts). The arguments you use in your transforms stanza indicate the type of lookup table you want to define. Use filename for static lookups and external_cmd for external lookups.

Note: A lookup table must have at least two columns. Each column may have multiple instances of the same value (multi-valued fields). The table may not contain non-UTF-8 characters (plain ASCII text is OK, as is any character set that is also valid UTF-8).
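
The file requirements stated in this topic (valid UTF-8, no carriage-return-only line endings, at least two columns) can be checked before you deploy a CSV file. The following is a minimal sketch, not a Splunk tool; the function name check_lookup_csv is hypothetical, and it treats any bare "\r" as a line ending, so a quoted field that legitimately contains one would be flagged as well.

```python
import csv
import io


def check_lookup_csv(raw: bytes) -> list:
    """Return a list of problems found in the raw bytes of a lookup CSV."""
    problems = []
    # The table must be valid UTF-8 (plain ASCII qualifies).
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return ["file is not valid UTF-8"]
    # Carriage-return-only line endings (pre-OS X Macintosh) are not supported.
    if "\r" in text.replace("\r\n", ""):
        problems.append("file contains bare carriage-return line endings")
    # newline="" leaves line endings untouched so the csv module can parse them.
    rows = list(csv.reader(io.StringIO(text, newline="")))
    if not rows or len(rows[0]) < 2:
        problems.append("lookup table needs at least two columns")
    return problems
```

An empty list means the file passed these three checks; it does not guarantee that Splunk will accept the file.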

2. Edit props.conf to apply your lookup table.

This step is the same for both static and external lookups. In this configuration file, you specify the fields to match and output (or outputnew, if you don't want to overwrite the output field) from the lookup table that you defined in transforms.conf.

You can have more than one field lookup defined in a single source stanza. Each lookup should have its own unique lookup name; for example, if you have multiple tables, you can name them LOOKUP-table1, LOOKUP-table2, and so on, or something more descriptive.

When you add a lookup to props.conf, the lookup is run automatically. If your automatic lookup is very slow, it will also impact the speed of your searches.

3. Restart Splunk to implement the changes you made to the configuration files.

After restart, you should see the output fields from your lookup table listed in the fields sidebar. From there, you can select the fields to display in each of the matching search results.

Set up a fields lookup based on a static file

The simplest fields lookup is based on a static table, specifically a CSV file. The CSV file needs to be located in one of two places:

  • $SPLUNK_HOME/etc/system/lookups/
  • $SPLUNK_HOME/etc/apps/<app_name>/lookups/

Create the lookups directory if it does not exist.

1. Edit transforms.conf to define your lookup table.

In transforms.conf, add a stanza to define your lookup table. The name of the stanza is also the name of your lookup table. You will use this transform in props.conf.

In this stanza, reference the CSV file's name:

[myLookup]
filename = <filename>
max_matches = <integer>

Optionally, you can specify the number of matching entries to apply to an event; max_matches indicates that the first (in file order) <integer> number of entries are used. By default, max_matches is 100 for lookups that are not based on a timestamp field.

2. Edit props.conf to apply your lookup table.

In props.conf, add a stanza with the lookup key. This stanza specifies the lookup table that you defined in transforms.conf and indicates how Splunk should apply it to your events:

[<stanza name>]
LOOKUP-<class> = $TRANSFORM <match_field_in_table> OUTPUT|OUTPUTNEW <output_field_in_table>
  • stanza name is the sourcetype, host, or source to which this lookup applies, as specified in props.conf.
  • stanza name can't use regex-type syntax.
  • $TRANSFORM references the stanza in transforms.conf where you defined your lookup table.
  • match_field_in_table is the column in the lookup table that you use to match values.
  • output_field_in_table is the column in the lookup table that you add to your events. Use OUTPUTNEW if you don't want to overwrite existing values in your output field.
  • You can have multiple columns on either side of the lookup. For example, you could have $TRANSFORM <match_field1>, <match_field2> OUTPUT|OUTPUTNEW <output_field1>, <output_field2>. You can also have one field return two fields, three fields return one field, and so on.

Use the AS clause if the field names in the lookup table and your events do not match or if you want to rename the field in your event:

[<stanza name>]
LOOKUP-<class> = $TRANSFORM <match_field_in_table> AS <match_field_in_event> OUTPUT|OUTPUTNEW <output_field_in_table> AS <output_field_in_event>

You can have more than one field after the OUTPUT|OUTPUTNEW clause. If you don't use OUTPUT|OUTPUTNEW, Splunk adds all the field names and values from the lookup table to your events.
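
The difference between OUTPUT and OUTPUTNEW, and the effect of the AS clause, may be easier to see in a small model. The sketch below is a simplified illustration in Python, not Splunk's implementation: the function apply_lookup and its parameters are hypothetical, events and table rows are plain dictionaries, and only the first matching row (in file order) is used.

```python
def apply_lookup(event, table, match, outputs, overwrite=True):
    """Model a lookup that applies the first matching table row to an event.

    match   maps lookup-table column -> event field (the AS clause on the left).
    outputs maps lookup-table column -> event field (the AS clause on the right).
    overwrite=True models OUTPUT; overwrite=False models OUTPUTNEW.
    """
    result = dict(event)
    for row in table:
        if all(event.get(ev_field) == row.get(col) for col, ev_field in match.items()):
            for col, ev_field in outputs.items():
                # OUTPUTNEW leaves a field alone if the event already has it.
                if overwrite or ev_field not in result:
                    result[ev_field] = row[col]
            break  # use only the first matching row
    return result
```

For example, calling apply_lookup({"code": "404"}, table, {"status": "code"}, {"status_description": "description"}) models "status AS code OUTPUT status_description AS description".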

3. Restart Splunk.

Example of static fields lookup

Here's an example of setting up lookups for HTTP status codes in an access_combined log. In this example, you want to match the status field in your lookup table (http_status.csv) with the field in your events. Then, you add the status description and status type fields into your events.

The following is the http_status.csv file. You can put this into $SPLUNK_HOME/etc/apps/<app_name>/lookups/. If you're using this in the Search App, put the file into $SPLUNK_HOME/etc/apps/search/lookups/:

status,status_description,status_type
100,Continue,Informational
101,Switching Protocols,Informational
200,OK,Successful
201,Created,Successful
202,Accepted,Successful
203,Non-Authoritative Information,Successful
204,No Content,Successful
205,Reset Content,Successful
206,Partial Content,Successful
300,Multiple Choices,Redirection
301,Moved Permanently,Redirection
302,Found,Redirection
303,See Other,Redirection
304,Not Modified,Redirection
305,Use Proxy,Redirection
307,Temporary Redirect,Redirection
400,Bad Request,Client Error
401,Unauthorized,Client Error
402,Payment Required,Client Error
403,Forbidden,Client Error
404,Not Found,Client Error
405,Method Not Allowed,Client Error
406,Not Acceptable,Client Error
407,Proxy Authentication Required,Client Error
408,Request Timeout,Client Error
409,Conflict,Client Error
410,Gone,Client Error
411,Length Required,Client Error
412,Precondition Failed,Client Error
413,Request Entity Too Large,Client Error
414,Request-URI Too Long,Client Error
415,Unsupported Media Type,Client Error
416,Requested Range Not Satisfiable,Client Error
417,Expectation Failed,Client Error
500,Internal Server Error,Server Error
501,Not Implemented,Server Error
502,Bad Gateway,Server Error
503,Service Unavailable,Server Error
504,Gateway Timeout,Server Error
505,HTTP Version Not Supported,Server Error

1. In a transforms.conf file located in either $SPLUNK_HOME/etc/system/local/ or $SPLUNK_HOME/etc/apps/<app_name>/local, put:

[http_status]
filename = http_status.csv

2. In a props.conf file, located in either $SPLUNK_HOME/etc/system/local/ or $SPLUNK_HOME/etc/apps/<app_name>/local/, put:

[access_combined]
LOOKUP-http = http_status status OUTPUT status_description, status_type

3. Restart Splunk.

Now, when you run a search that returns Web access information, you will see the fields status_description and status_type listed in your fields sidebar menu.

Use search results to populate a lookup table

You can edit a local or app-specific copy of savedsearches.conf to use the results of a saved search to populate a lookup table.

In a saved search stanza, where the search returns a results table:

1. Add the following line to enable the lookup population action.

action.populate_lookup = 1

This tells Splunk to save your results table into a CSV file.

2. Add the following line to tell Splunk where to copy your lookup table.

action.populate_lookup.dest = <string>

The action.populate_lookup.dest value is a lookup name from transforms.conf or a path to a CSV file where Splunk should copy the search results. If it is a path to a CSV file, the path should be relative to $SPLUNK_HOME.

For example, if you want to save the results to a global lookup table, you might include:

action.populate_lookup.dest = etc/system/lookups/myTable.csv

The destination directory, $SPLUNK_HOME/etc/system/lookups or $SPLUNK_HOME/etc/apps/<app_name>/lookups, should already exist.

3. Add the following line if you want this search to run when Splunk starts up.

run_on_startup = true

If it does not run on startup, it will run at the next scheduled time. Generally, we recommend that you set this to true for scheduled searches that populate lookup tables.

Because Splunk copies the results of the saved search to a CSV file, you can set up your fields lookup the same way you set up a static lookup.
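
To see how the three settings sit together in one saved search stanza, the following sketch assembles the stanza text. It is only an illustration: the helper populate_lookup_stanza, the stanza name, and the search string in the usage note are hypothetical, and scheduling settings are left out.

```python
def populate_lookup_stanza(name, search, dest, run_on_startup=True):
    """Build a savedsearches.conf stanza that copies its results to a lookup."""
    lines = [
        "[%s]" % name,
        "search = %s" % search,
        # Enable the lookup population action.
        "action.populate_lookup = 1",
        # Lookup name from transforms.conf, or a CSV path relative to $SPLUNK_HOME.
        "action.populate_lookup.dest = %s" % dest,
        "run_on_startup = %s" % ("true" if run_on_startup else "false"),
    ]
    return "\n".join(lines) + "\n"
```

For instance, populate_lookup_stanza("Populate myTable", "<your search>", "etc/system/lookups/myTable.csv") returns text that you could paste into a local copy of savedsearches.conf.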

Set up a fields lookup based on an external command or script

For dynamic or external lookups, your transforms.conf stanza references the command or script and arguments to invoke. This is also called a scripted or external lookup.

You can also specify the type of command or script to invoke:

[myLookup]
external_cmd = <string>
external_type = python
fields_list = <string>
max_matches = <integer>

Use fields_list to list all the fields supported by the external command, delimited by a comma and space.

Note: Currently, Splunk only supports Python scripts for external lookups. Python scripts used for these lookups must be located in one of two places:

  • $SPLUNK_HOME/etc/apps/<app_name>/bin
  • $SPLUNK_HOME/etc/searchscripts

Note: When writing your Python script, if you refer to any external resources (such as a file), the reference must be relative to the directory where the script is located.

Example of external fields lookup

Here's an example of how you might use external lookups to match with information from a DNS server. Splunk ships with a script located in $SPLUNK_HOME/etc/system/bin/ called external_lookup.py, which is a DNS lookup script that:

  • if given a host, returns the IP address.
  • if given an IP address, returns the host name.

1. In a transforms.conf file, put:

[dnsLookup]
external_cmd = external_lookup.py host ip
fields_list = host, ip

2. In a props.conf file, put:

[access_combined]
LOOKUP-dns = dnsLookup host OUTPUT ip AS clientip

The field in the lookup table is named ip, but Splunk automatically extracts the IP addresses from Web access logs into a field named clientip. So, "OUTPUT ip AS clientip" indicates that you want Splunk to add the values of ip from the lookup table into the clientip field in the events. Since the host field has the same name in the lookup table and the events, you don't need to rename the field.

For a reverse DNS lookup, your props.conf stanza would be:

[access_combined]
LOOKUP-rdns = dnsLookup ip AS clientip OUTPUTNEW host AS hostname

For this example, instead of overwriting the host field value, you want Splunk to return the host value in a new field called hostname.

3. Restart Splunk.

More about the external lookup script

When designing your external lookup script, keep in mind that it needs to take in a partially empty CSV file and output a filled-in CSV file. The arguments that you pass to the script are the headers for these input and output files.

In the DNS lookup example above, the CSV file contains two fields, "host" and "ip". The fields that you pass to this script are the ones you specify in transforms.conf:

external_cmd = external_lookup.py host ip

Note: If you don't pass these arguments, the script will return an error.

When you run the search command:

... | lookup dnsLookup host

You're telling Splunk to use the lookup table that you defined in transforms.conf as [dnsLookup] and pass into the external command script the values for the "host" field as a CSV file, which may look like this:

host,ip
work.com
home.net

Basically, this is a CSV file with the headers "host" and "ip", but missing values for ip. The two headers are included because they are the fields you specified in the fields_list parameter of transforms.conf.

The script then outputs the following CSV file and returns it to Splunk, which populates the ip field in your results:

host,ip
work.com,127.0.0.1
home.net,127.0.0.2
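
The fill-in behavior described above can be sketched in Python as follows. This is a minimal illustration, not the shipped external_lookup.py: the function fill_lookup and its forward and reverse parameters are hypothetical names, and the resolvers are passed in so that the logic can be tried without a DNS server.

```python
import csv


def fill_lookup(infile, outfile, host_field, ip_field, forward, reverse):
    """Read a partially empty CSV, fill in the blanks, and write the result.

    forward(host) returns an IP address, or "" if it cannot resolve the host.
    reverse(ip) returns a host name, or "" if it cannot resolve the address.
    """
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Columns missing from a short input row come back as None.
        host = row.get(host_field) or ""
        ip = row.get(ip_field) or ""
        if host and not ip:
            row[ip_field] = forward(host) or ""
        elif ip and not host:
            row[host_field] = reverse(ip) or ""
        writer.writerow(row)
```

A script built on this sketch would, as an assumption about how the CSV is exchanged, call fill_lookup(sys.stdin, sys.stdout, sys.argv[1], sys.argv[2], ...) with resolvers based on socket.gethostbyname and socket.gethostbyaddr, and exit with an error when the two field-name arguments are missing.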

Set up a time-based fields lookup

If your static or external lookup table has a field value that represents time, you can use this time field to set up your fields lookup. For time-based (or temporal) lookups, add the following lines to your lookup stanza in transforms.conf:

time_field = <field_name>
time_format = <string>

If time_field is present, max_matches defaults to 1, and Splunk applies the first matching entry in descending time order.

Use the time_format key to specify the strptime format of your time_field. By default, time_format is %s.%Q (seconds since the Unix epoch, in UTC, with optional milliseconds).

For a match to occur with time-based lookups, you can also specify offsets for the minimum and maximum amounts of time that an event may be later than a lookup entry. To do this, add the following lines to your stanza:

max_offset_secs = <integer>
min_offset_secs = <integer>

By default, there is no maximum offset and the minimum offset is 0.
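
The interaction of the offsets, the descending time order, and max_matches can be modeled as below. This is a sketch of the behavior as described in this topic, not Splunk's implementation: the function temporal_matches is hypothetical, timestamps are assumed to be epoch seconds, and entries are assumed to have already matched on the non-time fields.

```python
def temporal_matches(event_time, entries, min_offset_secs=0,
                     max_offset_secs=None, max_matches=1):
    """Pick the lookup entries whose timestamp fits an event's timestamp.

    An entry qualifies when the event is at least min_offset_secs and at most
    max_offset_secs later than the entry (None means no maximum).
    """
    candidates = []
    for entry in entries:
        offset = event_time - entry["time"]
        if offset < min_offset_secs:
            continue
        if max_offset_secs is not None and offset > max_offset_secs:
            continue
        candidates.append(entry)
    # Most recent entries first, then keep at most max_matches of them.
    candidates.sort(key=lambda e: e["time"], reverse=True)
    return candidates[:max_matches]
```

With the defaults, an event picks up the single most recent entry that is not later than the event itself.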

Example of time-based fields lookup

Here's an example of how you might use DHCP logs to identify users on your network based on their IP address and the timestamp. Let's say the DHCP logs are in a file, dhcp.csv, which contains the timestamp, IP address, and the user's name and MAC address.

1. In a transforms.conf file, put:

[dhcpLookup]
filename = dhcp.csv
time_field = timestamp
time_format = %d/%m/%y %H:%M:%S

2. In a props.conf file, put:

[dhcp]
LOOKUP-table = dhcpLookup ip OUTPUT user, mac

3. Restart Splunk.
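
The time_format string in this example puts the day before the month, which is easy to get wrong when you write dhcp.csv. The following sketch uses Python's strptime to show how a timestamp in that format is read; the sample values in the usage note are hypothetical, and Python's directives are assumed to behave like Splunk's for this particular format string.

```python
from datetime import datetime

# Same format string as the time_format setting in the dhcpLookup stanza.
TIME_FORMAT = "%d/%m/%y %H:%M:%S"


def parse_dhcp_timestamp(value):
    """Parse a timestamp column value written in TIME_FORMAT."""
    return datetime.strptime(value, TIME_FORMAT)
```

Here parse_dhcp_timestamp("05/03/13 14:30:00") is read as 5 March 2013, not 3 May 2013, and a month-first value such as "03/25/13 14:30:00" raises ValueError.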

Troubleshooting lookups - Using identical names in lookup stanzas

Lookup table definitions are indicated with the attribute LOOKUP-<class>. In general, it's best to give all of your lookups different names to reduce the chance of collisions. When two or more lookups share a name, you can run into trouble unless the overlap is intentional:

  • If two or more lookups with the same name share the same stanza (the same host, source, or sourcetype), the first lookup with that stanza in props.conf overrides the others. All lookups with the same host, source, or sourcetype should have different names.
  • If you have lookups with different stanzas (different hosts, sources, or sourcetypes) that share the same name, you can end up with a situation where only one of them seems to work at any given point in time. You may set this up on purpose, but in most cases it's probably not very convenient.

For example, say you have the following two lookups that share the name "table":

[host::machine_name]
LOOKUP-table = logs_per_day host OUTPUTNEW average_logs AS logs_per_day
[sendmail]
LOOKUP-table = location host OUTPUTNEW building AS location

Any events that overlap between these two lookups will only be affected by one of them. In other words:

  • events that match the host will get the host lookup.
  • events that match the sourcetype will get the sourcetype lookup.
  • events that match both will only get the host lookup.
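
One way to picture this collision is to treat each stanza as a dictionary of settings that are merged per event, with the host stanza applied after the sourcetype stanza. The sketch below is a simplified model, not Splunk's configuration layering code; the function effective_lookups is hypothetical and it handles only sourcetype and host:: stanzas.

```python
def effective_lookups(event, props):
    """Merge the settings of every props.conf stanza that matches an event.

    props maps stanza name -> {setting name: value}. The sourcetype stanza is
    applied first and the host:: stanza second, so a host setting replaces a
    sourcetype setting that has the same name.
    """
    merged = {}
    for stanza in (event["sourcetype"], "host::" + event["host"]):
        merged.update(props.get(stanza, {}))
    return merged
```

When both stanzas define LOOKUP-table, the merged result holds a single LOOKUP-table entry (the host one); when the settings have different names, both survive the merge.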

When you name your lookup LOOKUP-table, you're saying this is the lookup that achieves some purpose or action described by "table". In this example, these lookups are intended to achieve different goals--one determines something about logs per day, and the other has something to do with location. You might instead rename them:

[host::machine_name]
LOOKUP-table = logs_per_day host OUTPUTNEW average_logs AS logs_per_day
[sendmail]
LOOKUP-location = location host OUTPUTNEW building AS location

Now you have two different settings that won't collide.

This documentation applies to the following versions of Splunk® Enterprise: 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 5.0.7, 5.0.8, 5.0.9, 5.0.10, 5.0.11, 5.0.12, 5.0.13, 5.0.14, 5.0.15, 5.0.16, 5.0.17, 5.0.18