Splunk Cloud Platform

Knowledge Manager Manual

Configure CSV lookups

CSV lookups match field values from your events to field values in the static table represented by a CSV file. Then they output corresponding field values from that table to your events. They are also referred to as "static lookups". CSV inline lookup table files and inline lookup definitions that use CSV files are both dataset types. See Dataset types and usage.

Each column in a CSV table is interpreted as a potential value of a field.

About the CSV files

There are a few restrictions on the kinds of CSV files that can be used for CSV lookups:

  • The table represented by the CSV file must have at least two columns. One of those columns should represent a field with a set of values that includes those belonging to a field in your events. The column does not have to have the same name as the event field. Any column can have multiple instances of the same value, as this represents a multivalued field.
  • The CSV file cannot contain non-UTF-8 characters. Plain ASCII text is supported, as is any character set that is also valid UTF-8.
  • The following are unsupported:
    • CSV files with pre-OS X (OS 9 or earlier) Macintosh-style line endings (carriage return ("\r") only)
    • CSV files with header rows that exceed 4096 characters.

Create a CSV lookup

Prerequisites

Steps

  1. Add the CSV file for the lookup to your Splunk deployment. The CSV file must be located in one of the following places:
    $SPLUNK_HOME/etc/system/lookups
    $SPLUNK_HOME/etc/apps/<app_name>/lookups
    Create the lookups directory if it does not exist.
  2. Add a CSV lookup stanza to transforms.conf.
    If you want the lookup to be available globally, add its lookup stanza to the version of transforms.conf in $SPLUNK_HOME/etc/system/local/. If you want the lookup to be specific to a particular app, add its stanza to the version of transforms.conf in $SPLUNK_HOME/etc/apps/<app_name>/local/.
    Caution: Do not edit configuration files in $SPLUNK_HOME/etc/system/default.
    The CSV lookup stanza names the lookup table and provides the name of the CSV file that the lookup uses. It uses these required fields.
    • [<lookup_name>]: The name of the lookup table.
    • filename = <string>: The name of the CSV file that the lookup references. The CSV file is saved in $SPLUNK_HOME/etc/system/lookups/, or in $SPLUNK_HOME/etc/apps/<app_name>/lookups/ if the lookup belongs to a specific app. Only file names are supported. If you specify a path, the Splunk software strips the path and uses the value after the final path separator.
  3. (Optional) Use the check_permission field in transforms.conf and outputlookup_check_permission in limits.conf to restrict write access to users with the appropriate permissions when using the outputlookup command.

    Both check_permission and outputlookup_check_permission default to false. Set both to true to have the Splunk software verify users' permission settings for lookups.
    You can change lookup table file permissions in the .meta file for each lookup file, or in Settings > Lookups > Lookup table files. By default, only users who have the admin or power role can write to a shared CSV lookup file.
  4. (Optional) Use the filter field to prefilter large CSV lookup tables.
    If a CSV lookup table is very large, you can use the filter field to restrict the set of records that the lookup scans. See Prefilter CSV lookup tables.
  5. (Optional) Set up field/value matching rules for the CSV lookup.
  6. (Optional) If the CSV lookup table contains time fields, make the CSV lookup time-bounded.
  7. (Optional) Make the CSV lookup automatic by adding a configuration to props.conf.
    If you want the automatic lookup to be available globally, add its lookup stanza to the version of props.conf in $SPLUNK_HOME/etc/system/local/. If you want the lookup to be specific to a particular app, add its stanza to the version of props.conf in $SPLUNK_HOME/etc/apps/<app_name>/local/.
    Caution: Do not edit configuration files in $SPLUNK_HOME/etc/system/default.
  8. Restart Splunk Enterprise to implement your changes.
    If you have set up an automatic lookup, after restart you should see the output fields from your lookup table listed in the fields sidebar. From there, you can select the fields to display in each of the matching search results.
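
Putting the required and optional settings from the steps above together, a transforms.conf stanza for a CSV lookup might look like the following sketch. The stanza name and file name are illustrative, and the filter value reuses the hypothetical CustID and CustName fields discussed later in this topic.

    [my_csv_lookup]
    filename = my_csv_lookup.csv
    # Optional: verify user permissions before outputlookup writes to this table
    # (pair with outputlookup_check_permission = true in limits.conf)
    check_permission = true
    # Optional: prefilter the table so only matching records are scanned
    filter = (CustID>500) AND (CustName="P*")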

Handle large CSV lookup tables

Lookup tables are created and modified on a search head. The search head replicates a new or modified lookup table to other search heads, or to indexers to perform certain tasks.

  • Knowledge bundle replication. When a search head distributes searches to indexers, it also distributes a related knowledge bundle to the indexers. The knowledge bundle contains knowledge objects, such as lookup tables, that the indexers need to perform their searches. See What search heads send to search peers in Distributed Search.
  • Configuration replication (search head clusters). In search head clusters, runtime changes made on one search head are automatically replicated to all other search heads in the cluster. If a user creates or updates a lookup table on a search head in a cluster, that search head then replicates the updated table to the other search heads. See Configuration updates that the cluster replicates in Distributed Search.

When a lookup table changes, the search head must replicate the updated version of the lookup table to the other search heads or the indexers, or both, depending on the situation. By default, the search head sends the entire table each time any part of the table changes.

Replicating lookup tables only on the search heads

There are situations in which you might not want to replicate lookup tables to the indexers. For example, the outputcsv and inputcsv commands always run on the search head. If you want to replicate the lookup table only to the search heads in a search head cluster, set replicate = false in the lookup's stanza in the transforms.conf file.
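
For example, a minimal transforms.conf stanza for a lookup table that should be replicated only among search heads might look like this sketch (the stanza and file names are illustrative):

    [my_csv_lookup]
    filename = my_csv_lookup.csv
    # Keep this table out of the knowledge bundle sent to the indexers
    replicate = false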

Enable custom indexing

You can also improve lookup performance by configuring which fields are indexed by using the index_fields_list setting in the transforms.conf file. The index_fields_list is a list of all fields that need to be indexed for your static CSV lookup file.

Prerequisites

  • Access to the transforms.conf file, located in $SPLUNK_HOME/etc/apps/<app_name>/local/
  • A CSV lookup file


Steps

  1. In the transforms.conf file, add the index_fields_list setting to your lookup table.
    The index_fields_list setting is a comma and space-delimited list of all of the fields that need to be indexed for your static CSV lookup file.

The default for the index_fields_list setting is all of the fields that are defined in the lookup table file header. Restricting the fields will improve lookup performance.
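
For example, using the http_status.csv lookup described later in this topic, you might index only the status field, since it is the only field that event values are matched against (this stanza is a sketch, not a required configuration):

    [http_status]
    filename = http_status.csv
    # Index only the match field instead of every column in the file header
    index_fields_list = status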

Prefilter CSV lookup tables

If you know that you only need results from a subset of records in the lookup table, improve search performance by using the filter field to filter out all of the records that do not need to be looked at. The filter field requires a string containing a search query with comparison operators (==, !=, >, <, <=, >=) and Boolean operators (AND, OR, and NOT). This query runs whenever you run a search that invokes this lookup.

For example, if your lookup configuration has filter = (CustID>500) AND (CustName="P*"), it will try to retrieve values only from records that have a CustID value greater than 500 and a CustName value beginning with the letter P.

You can also filter records from CSV tables by using the WHERE clause with the inputlookup and inputcsv commands when you use those commands to search CSV files.
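
For example, assuming the http_status lookup defined later in this topic, a search like the following retrieves only the client error records from the table:

    | inputlookup http_status where status>=400 AND status<500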

CSV lookup example

This example explains how you can set up a lookup for HTTP status codes in an access_combined log. In this example, you design a lookup that matches the status field in your events with the status column in a lookup table named http_status.csv. Then you have the lookup output the corresponding status_description and status_type fields to your events.

The following is the text for the http_status.csv file.

status,status_description,status_type
100,Continue,Informational
101,Switching Protocols,Informational
200,OK,Successful
201,Created,Successful
202,Accepted,Successful
203,Non-Authoritative Information,Successful
204,No Content,Successful
205,Reset Content,Successful
206,Partial Content,Successful
300,Multiple Choices,Redirection
301,Moved Permanently,Redirection
302,Found,Redirection
303,See Other,Redirection
304,Not Modified,Redirection
305,Use Proxy,Redirection
307,Temporary Redirect,Redirection
400,Bad Request,Client Error
401,Unauthorized,Client Error
402,Payment Required,Client Error
403,Forbidden,Client Error
404,Not Found,Client Error
405,Method Not Allowed,Client Error
406,Not Acceptable,Client Error
407,Proxy Authentication Required,Client Error
408,Request Timeout,Client Error
409,Conflict,Client Error
410,Gone,Client Error
411,Length Required,Client Error
412,Precondition Failed,Client Error
413,Request Entity Too Large,Client Error
414,Request-URI Too Long,Client Error
415,Unsupported Media Type,Client Error
416,Requested Range Not Satisfiable,Client Error
417,Expectation Failed,Client Error
500,Internal Server Error,Server Error
501,Not Implemented,Server Error
502,Bad Gateway,Server Error
503,Service Unavailable,Server Error
504,Gateway Timeout,Server Error
505,HTTP Version Not Supported,Server Error
  1. Put the http_status.csv file in $SPLUNK_HOME/etc/apps/search/lookups/. This indicates that the lookup is specific to the Search App.
  2. In the transforms.conf file located in $SPLUNK_HOME/etc/apps/search/local, put:
    [http_status]
    filename = http_status.csv
    
  3. Restart Splunk Enterprise to implement your changes.

Now you can invoke this lookup in search strings with the following commands:

  • lookup: Use to add fields to the events in the results of the search.
  • inputlookup: Use to search the contents of a lookup table.
  • outputlookup: Use to write fields in search results to a CSV file that you specify.

See the topics on these commands in the Search Reference for more information about how to do this.

For example, you could run this search to add status_description and status_type fields to events that contain status values that match status values in the CSV table.

... | lookup http_status status OUTPUT status_description, status_type
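
You could also run the other two commands against the same table. The first search below lists the table's contents; the second is a sketch of writing search results back to the file (the fields written depend on your search results):

    | inputlookup http_status

    ... | outputlookup http_status.csv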

Use search results to populate a CSV lookup table

You can edit a local or app-specific copy of savedsearches.conf to use the results of a report to populate a lookup table.

In a report stanza, where the search returns a results table:

  1. Add the following line to enable the lookup population action.
    action.populate_lookup = 1
    
    This tells Splunk software to save your results table into a CSV file.
  2. Add the following line to specify where to copy your lookup table.
    action.populate_lookup.dest = <string>
    

    The action.populate_lookup.dest value is a lookup name from transforms.conf or a path to a CSV file where the search results are to be copied. If it is a path to a CSV file, the path should be relative to $SPLUNK_HOME.

    For example, if you want to save the results to a global lookup table, you might include:

    action.populate_lookup.dest = etc/system/lookups/myTable.csv
    
    The destination directory, $SPLUNK_HOME/etc/system/lookups or $SPLUNK_HOME/etc/apps/<app_name>/lookups, should already exist.
  3. Add the following line if you want this search to run when Splunk Enterprise starts up.
    run_on_startup = true
    

    If it does not run on startup, it will run at the next scheduled time. We recommend that you set run_on_startup = true for scheduled searches that populate lookup tables.

    Because the results of the report are copied to a CSV file, you can set up this lookup the same way that you set up a CSV lookup.
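
Assembled into a single report stanza, these settings might look like the following sketch in savedsearches.conf. The stanza name, search string, and schedule are illustrative, and myTable.csv reuses the destination path from the earlier example:

    [populate_my_table]
    search = index=web sourcetype=access_combined | stats count by status
    enableSched = 1
    cron_schedule = 0 6 * * *
    run_on_startup = true
    action.populate_lookup = 1
    action.populate_lookup.dest = etc/system/lookups/myTable.csv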
Last modified on 13 August, 2021

This documentation applies to the following versions of Splunk Cloud Platform: 9.2.2406 (latest FedRAMP release), 8.2.2202, 9.0.2205, 8.2.2112, 8.2.2201, 8.2.2203, 9.0.2208, 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403

