Splunk® Enterprise

Splunk Analytics for Hadoop

Set up a virtual index in the configuration file

Splunk Analytics for Hadoop reaches End of Life on January 31, 2025.

Use the following procedure to set up a virtual index in the configuration file. For information about adding a virtual index in Splunk Web, see Add or edit a virtual index in this manual.

1. In indexes.conf, define one or more virtual indexes for each provider. Here you specify how the data is organized into directories, which files are part of the index, and hints about the time range covered by the content of the files.

[<virtual_index_name>]
vix.provider = MyHadoopProvider
vix.input.1.path = /home/myindex/data/${date_date}/${date_hour}/${server}/...
vix.input.1.accept = \.gz$
vix.input.1.et.regex = /home/myindex/data/(\d+)/(\d+)/
vix.input.1.et.format = yyyyMMddHH
vix.input.1.et.offset = 0
vix.input.1.lt.regex = /home/myindex/data/(\d+)/(\d+)/
vix.input.1.lt.format = yyyyMMddHH
vix.input.1.lt.offset = 3600

  • For vix.input.1.path, provide a fully qualified path to the data that belongs in this index, including any fields you want to extract from the path.

    For example, the path in the configuration above contains the fields ${date_date}, ${date_hour}, and ${server}. Items enclosed in ${} are extracted as fields and added to each search result from that path. The search ignores directories that do not match the search string, which significantly improves performance.

  • For vix.input.1.accept, provide a regular expression that matches the files to include.
  • For vix.input.1.ignore, provide a regular expression that matches the files to ignore. Note that ignore takes precedence over accept.
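
The path and filter settings above can be illustrated with a small Python sketch. It is not Splunk's implementation: the function names template_to_regex, extract_fields, and is_accepted are hypothetical, the trailing /... is assumed to mean "everything below this directory", and the accept and ignore expressions are assumed to be applied with search (unanchored) semantics.

```python
import re

def template_to_regex(template):
    """Turn a path template with ${field} items into a regex with named groups."""
    # Assumption: a trailing "/..." means "everything below this directory".
    prefix = template[:-4] if template.endswith("/...") else template
    # re.split with one capturing group alternates literal text and field names.
    parts = re.split(r"\$\{(\w+)\}", prefix)
    pattern = "".join(
        re.escape(part) if i % 2 == 0 else "(?P<%s>[^/]+)" % part
        for i, part in enumerate(parts)
    )
    return re.compile(pattern + r"(?:/|$)")

def extract_fields(template, path):
    """Return the ${field} values found in path, or None if the path does not match."""
    match = template_to_regex(template).match(path)
    return match.groupdict() if match else None

def is_accepted(path, accept=None, ignore=None):
    """Apply the accept and ignore expressions; ignore takes precedence over accept."""
    if ignore is not None and re.search(ignore, path):
        return False
    return accept is None or re.search(accept, path) is not None

TEMPLATE = "/home/myindex/data/${date_date}/${date_hour}/${server}/..."
# Hypothetical file path, used only for illustration.
SAMPLE = "/home/myindex/data/20231030/14/web01/access.log.gz"

fields = extract_fields(TEMPLATE, SAMPLE)
accepted = is_accepted(SAMPLE, accept=r"\.gz$")
```

In this sketch, a file that matches both expressions is rejected, which mirrors the precedence rule stated above.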

2. Use the regex, format, and offset values to extract a time range for the data contained in a particular path. The time range has two parts: the earliest time (vix.input.1.et) and the latest time (vix.input.1.lt). You can use the following settings:

  • For vix.input.1.et/lt.regex, provide a regular expression that matches the portion of the directory path containing the date and time, so that the time can be interpreted from the path.
    Use capturing groups to extract the parts that make up the timestamp. The values of the capturing groups are concatenated and interpreted according to the specified format. Extracting a time range from the path significantly speeds up searches over particular time windows, because directories that fall outside the search's time range are ignored.
  • For vix.input.1.et/lt.format, provide a date/time format string that specifies how to interpret the data extracted by the regex above. The format string syntax is described in the Java SimpleDateFormat documentation. You can also set this value to epoch to interpret the extracted value as epoch time in seconds.
  • For vix.input.1.et/lt.value, you can specify mtime to use the modification time of the file rather than the data extracted by the regex.
  • For vix.input.1.et/lt.offset, optionally provide an offset, in seconds, to account for time zone differences or safety boundaries.
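
The time range extraction described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than Splunk's implementation: extract_time and overlaps are hypothetical names, the Python format %Y%m%d%H stands in for the SimpleDateFormat pattern yyyyMMddHH, timestamps are assumed to be UTC, and offsets are taken as seconds, as in the sample configuration.

```python
import re
from datetime import datetime, timedelta, timezone

def extract_time(path, regex, fmt, offset_seconds=0):
    """Concatenate the regex capture groups, parse them with fmt, and add the offset."""
    match = re.search(regex, path)
    if match is None:
        return None
    stamp = "".join(match.groups())  # e.g. "20231030" + "14"
    # Assumption: times in the path are UTC.
    parsed = datetime.strptime(stamp, fmt).replace(tzinfo=timezone.utc)
    return parsed + timedelta(seconds=offset_seconds)

def overlaps(path_et, path_lt, search_et, search_lt):
    """True if the path's [et, lt) range intersects the search's [et, lt) range."""
    return path_et < search_lt and search_et < path_lt

REGEX = r"/home/myindex/data/(\d+)/(\d+)/"
FMT = "%Y%m%d%H"  # Python counterpart of yyyyMMddHH
# Hypothetical file path, used only for illustration.
SAMPLE = "/home/myindex/data/20231030/14/web01/access.log.gz"

path_et = extract_time(SAMPLE, REGEX, FMT, offset_seconds=0)     # et.offset = 0
path_lt = extract_time(SAMPLE, REGEX, FMT, offset_seconds=3600)  # lt.offset = 3600
```

With the sample settings, the hour directory 20231030/14 covers 14:00 up to 15:00, so a search whose time range lies entirely outside that window could skip the directory.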
Last modified on 30 October, 2023

This documentation applies to the following versions of Splunk® Enterprise: 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.0.13, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.1.9, 7.1.10, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10, 8.1.0, 8.1.1, 8.1.2, 8.1.3, 8.1.4, 8.1.5, 8.1.6, 8.1.7, 8.1.8, 8.1.9, 8.1.10, 8.1.11, 8.1.12, 8.1.13, 8.1.14, 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 8.2.11, 8.2.12, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5, 9.0.6, 9.0.7, 9.1.0, 9.1.1, 9.1.2
