Splunk® Enterprise

Splunk Analytics for Hadoop


Provider Configuration Variables

Splunk Analytics for Hadoop reaches End of Life on January 31, 2025.

When you configure an HDFS provider, Splunk Analytics for Hadoop automatically sets a number of configuration variables. You can use the preset variables, or you can modify them as needed by editing the provider.

| Setting | Description |
| --- | --- |
| vix.splunk.setup.onsearch | Determines whether to perform setup (install and bundle replication) on search. |
| vix.splunk.setup.package | Location of the Splunk .tgz package that Splunk can install and use on data nodes (in vix.splunk.home.datanode). A value of current uses the current install. |
| vix.splunk.home.datanode | The SPLUNK_HOME path on the DataNode and/or TaskTracker. |
| vix.splunk.home.hdfs | The location of scratch space on HDFS for this Splunk instance. |
| vix.splunk.search.debug | Determines whether search is run in debug mode. |
| vix.splunk.search.recordreader | Provides a comma-separated list of data preprocessing classes. Each class must extend BaseSplunkRecordReader and return data to be consumed by Splunk as the value. |
| vix.splunk.search.recordreader.avro.regex | Specifies a regular expression that files must match to be considered Avro files. Defaults to \.avro$. |
| vix.splunk.search.mr.threads | Determines the number of threads to use when reading map results from HDFS. |
| vix.splunk.search.mr.maxsplits | Determines the maximum number of splits in a MapReduce job. |
| vix.splunk.search.mr.poll | Determines the polling period for job status, in milliseconds. |
| vix.splunk.search.mixedmode | Determines whether mixed mode execution is enabled. |
| vix.splunk.search.mixedmode.maxstream | Determines the maximum number of bytes to stream during mixed mode. The default value is 10GB. A value of 0 indicates that there is no stream limit. Streaming stops after the first split that takes the total over the limit. |
| vix.splunk.jars | Provides a comma-delimited list of directories or JAR files to use on the search head (SH) and in MapReduce (MR) jobs. |
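
As a rough illustration of overriding some of these presets, the following sketch shows a provider stanza in indexes.conf. The provider name MyHadoopProvider, the vix.family line, and all paths and values are assumptions chosen for the example, not recommended settings; adjust them to match your own provider.

```ini
# indexes.conf (illustrative sketch; the provider name, paths, and values are placeholders)
[provider:MyHadoopProvider]
# Assumed provider family line; keep whatever your existing provider already defines
vix.family = hadoop

# Perform setup (install and bundle replication) at search time, using the current install
vix.splunk.setup.onsearch = true
vix.splunk.setup.package = current

# SPLUNK_HOME on each DataNode/TaskTracker, and scratch space on HDFS (placeholder paths)
vix.splunk.home.datanode = /tmp/splunk-datanode
vix.splunk.home.hdfs = /user/splunk/scratch

# Poll job status every 1000 milliseconds
vix.splunk.search.mr.poll = 1000

# Enable mixed mode and stop streaming after the first split that exceeds 5 GiB (value in bytes)
vix.splunk.search.mixedmode = true
vix.splunk.search.mixedmode.maxstream = 5368709120
```

Setting vix.splunk.search.mixedmode.maxstream to 0 instead would remove the stream limit entirely.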

High-Availability configurations


| Setting | Description | Example value |
| --- | --- | --- |
| vix.fs.default.name | The URI of the default file system. | hdfs://sveserv51-ha |
| vix.dfs.nameservices | A comma-separated list of nameservices. | sveserv51-ha |
| vix.dfs.ha.namenodes.sveserv51-ha | A comma-separated list of NameNodes for a given nameservice (here, sveserv51-ha). | nn1,nn2 |
| vix.dfs.namenode.rpc-address.sveserv51-ha.nn1 | The RPC server address and port for a given NameNode (here, nn1) of a given nameservice (here, sveserv51-ha). | sveserv51-vm6.sv.splunk.com:8020 |
| vix.dfs.namenode.rpc-address.sveserv51-ha.nn2 | The RPC server address and port for a given NameNode (here, nn2) of a given nameservice (here, sveserv51-ha). | sveserv51-vm5.sv.splunk.com:8020 |
| vix.dfs.client.failover.proxy.provider.sveserv51-ha | The FailoverProxyProvider implementation for a given nameservice (here, sveserv51-ha). | org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider |
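
Put together, the HDFS high-availability settings above might appear in a provider stanza roughly as follows. This is a sketch that reuses the example nameservice (sveserv51-ha), NameNode IDs, and host names from this topic; the provider name MyHadoopProvider is a hypothetical placeholder, and your own nameservice and hosts will differ.

```ini
# indexes.conf (illustrative sketch; MyHadoopProvider is a placeholder name)
[provider:MyHadoopProvider]
# Point the default file system at the logical nameservice, not a single NameNode
vix.fs.default.name = hdfs://sveserv51-ha
vix.dfs.nameservices = sveserv51-ha

# NameNode IDs that belong to the nameservice
vix.dfs.ha.namenodes.sveserv51-ha = nn1,nn2

# RPC address and port of each NameNode
vix.dfs.namenode.rpc-address.sveserv51-ha.nn1 = sveserv51-vm6.sv.splunk.com:8020
vix.dfs.namenode.rpc-address.sveserv51-ha.nn2 = sveserv51-vm5.sv.splunk.com:8020

# Class the client uses to fail over between the NameNodes
vix.dfs.client.failover.proxy.provider.sveserv51-ha = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
```

Note that the nameservice name is part of several setting names, so renaming the nameservice means renaming those keys as well.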


| Setting | Description | Example value |
| --- | --- | --- |
| vix.mapred.job.tracker | The logical name for a list of JobTrackers. | sveserv51-ha-jt |
| vix.mapred.jobtrackers.sveserv51-ha-jt | A comma-separated list of JobTrackers for a given logical JobTracker name (here, sveserv51-ha-jt). | jt1,jt2 |
| vix.mapred.jobtracker.rpc-address.sveserv51-ha-jt.jt1 | The RPC server address and port for a given JobTracker (here, jt1) of a given logical JobTracker name (here, sveserv51-ha-jt). | sveserv51-vm6.sv.splunk.com:8021 |
| vix.mapred.jobtracker.rpc-address.sveserv51-ha-jt.jt2 | The RPC server address and port for a given JobTracker (here, jt2) of a given logical JobTracker name (here, sveserv51-ha-jt). | sveserv51-vm5.sv.splunk.com:8021 |
| vix.mapred.client.failover.proxy.provider.sveserv51-ha-jt | The FailoverProxyProvider implementation for a given logical JobTracker name (here, sveserv51-ha-jt). | org.apache.hadoop.mapred.ConfiguredFailoverProxyProvider |
| vix.mapred.map.max.attempts | The maximum number of times a map task is attempted before it is marked as failed. | 4 |
| vix.mapred.max.map.failures.percent | The percentage of task failures tolerated across the whole MapReduce job before the job itself fails. | 5 |
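
The JobTracker high-availability settings above could be combined in a provider stanza along these lines. As before, this is only a sketch: it reuses the example logical name (sveserv51-ha-jt), JobTracker IDs, and host names from this topic, and the provider name MyHadoopProvider is a hypothetical placeholder.

```ini
# indexes.conf (illustrative sketch; MyHadoopProvider is a placeholder name)
[provider:MyHadoopProvider]
# Logical JobTracker name and the JobTracker IDs behind it
vix.mapred.job.tracker = sveserv51-ha-jt
vix.mapred.jobtrackers.sveserv51-ha-jt = jt1,jt2

# RPC address and port of each JobTracker
vix.mapred.jobtracker.rpc-address.sveserv51-ha-jt.jt1 = sveserv51-vm6.sv.splunk.com:8021
vix.mapred.jobtracker.rpc-address.sveserv51-ha-jt.jt2 = sveserv51-vm5.sv.splunk.com:8021

# Class the client uses to fail over between the JobTrackers
vix.mapred.client.failover.proxy.provider.sveserv51-ha-jt = org.apache.hadoop.mapred.ConfiguredFailoverProxyProvider

# Attempt each map task up to 4 times, and tolerate up to 5 percent failed tasks per job
vix.mapred.map.max.attempts = 4
vix.mapred.max.map.failures.percent = 5
```
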
Last modified on 30 October, 2023

This documentation applies to the following versions of Splunk® Enterprise: 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.0.13, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.1.9, 7.1.10, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10, 8.1.0, 8.1.1, 8.1.2, 8.1.3, 8.1.4, 8.1.5, 8.1.6, 8.1.7, 8.1.8, 8.1.9, 8.1.10, 8.1.11, 8.1.12, 8.1.13, 8.1.14, 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 8.2.11, 8.2.12, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5, 9.0.6, 9.0.7, 9.0.8, 9.0.9, 9.0.10, 9.1.0, 9.1.1, 9.1.2, 9.1.3, 9.1.4, 9.1.5, 9.2.0, 9.2.1, 9.2.2
