Provider Configuration Variables

Splunk Analytics for Hadoop reaches End of Life on January 31, 2025.

When you configure an HDFS provider, Splunk Analytics for Hadoop automatically sets a number of configuration variables. You can use the preset variables, or you can modify them as needed by editing the provider.

For more information about editing these variables in the configuration file, see Set up a provider and virtual index in the configuration file.

For information about editing providers in Splunk Web, see Add an HDFS provider.

For information about setting provider configuration variables for YARN, see Required configuration variables for YARN.

Setting:	Use it to:
`vix.splunk.setup.onsearch`	Determines whether to perform setup (install and bundle replication) on search.
`vix.splunk.setup.package`	The location of the Splunk `.tgz` package that Splunk can install and use on data nodes (in `vix.splunk.home.datanode`). A value of `current` uses the current installation.
`vix.splunk.home.datanode`	The location of `SPLUNK_HOME` on the DataNode and/or TaskTracker.
`vix.splunk.home.hdfs`	The location of scratch space on HDFS for this Splunk instance.
`vix.splunk.search.debug`	Determines whether search is run in debug mode.
`vix.splunk.search.recordreader`	Provides a comma-separated list of data pre-processing classes. Each class must extend `BaseSplunkRecordReader` and return the data to be consumed by Splunk as the value.
`vix.splunk.search.recordreader.avro.regex`	Specifies a regex that files must match to be considered Avro files. Defaults to `\.avro$`.
`vix.splunk.search.mr.threads`	Determines the number of threads to use when reading map results from HDFS.
`vix.splunk.search.mr.maxsplits`	Determines the maximum number of splits in a MapReduce job.
`vix.splunk.search.mr.poll`	Determines the polling period for job status, in milliseconds.
`vix.splunk.search.mixedmode`	Determines whether mixed mode execution is enabled.
`vix.splunk.search.mixedmode.maxstream`	Determines the maximum number of bytes to stream during mixed mode. The default value is 10GB. A value of `0` indicates that there is no stream limit. Streaming stops after the first split that takes the total over the limit.
`vix.splunk.jars`	Provides a comma-delimited list of directories or JAR files to use on the search head (SH) and in MapReduce (MR) jobs.
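
As a rough illustration, some of the variables above might appear in a provider stanza in `indexes.conf` as sketched below. The stanza name `MyHadoopProvider`, the paths, and the polling value are hypothetical placeholders, and the `true`/`false` boolean format is an assumption. Only the setting names, the `current` package value, the `0` no-limit value, and the `\.avro$` default come from the table. Other settings that a working provider needs are omitted.

```
# Hypothetical provider stanza: the name, paths, and values are placeholders.
[provider:MyHadoopProvider]
# Use the current Splunk installation as the package for data nodes.
vix.splunk.setup.package = current
# Placeholder locations for SPLUNK_HOME on the data nodes and for HDFS scratch space.
vix.splunk.home.datanode = /opt/splunk-datanode
vix.splunk.home.hdfs = /user/splunk/scratch
# Poll job status every 1000 milliseconds (illustrative value).
vix.splunk.search.mr.poll = 1000
# Boolean format is an assumption.
vix.splunk.search.mixedmode = true
# 0 means no stream limit in mixed mode.
vix.splunk.search.mixedmode.maxstream = 0
# Default regex for identifying Avro files.
vix.splunk.search.recordreader.avro.regex = \.avro$
```

Comments are placed on their own lines here rather than after the values, which keeps the values themselves unambiguous.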

High-Availability configurations

HA-NN

Setting:	Use it to:
`vix.fs.default.name`	The URI of the default file system. Example: `hdfs://sveserv51-ha`
`vix.dfs.nameservices`	A comma-separated list of nameservices. Example: `sveserv51-ha`
`vix.dfs.ha.namenodes.sveserv51-ha`	A comma-separated list of NameNodes for a given nameservice, for example `sveserv51-ha`. Example: `nn1,nn2`
`vix.dfs.namenode.rpc-address.sveserv51-ha.nn1`	The RPC server address and port for a given NameNode, for example `nn1`, of a given nameservice, for example `sveserv51-ha`. Example: `sveserv51-vm6.sv.splunk.com:8020`
`vix.dfs.namenode.rpc-address.sveserv51-ha.nn2`	The RPC server address and port for a given NameNode, for example `nn2`, of a given nameservice, for example `sveserv51-ha`. Example: `sveserv51-vm5.sv.splunk.com:8020`
`vix.dfs.client.failover.proxy.provider.sveserv51-ha`	A FailoverProxyProvider implementation for a given nameservice, for example `sveserv51-ha`. Example: `org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider`
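
Put together, the HA-NN settings above could look like the following sketch of a provider stanza in `indexes.conf`. The stanza name `MyHAProvider` is a hypothetical placeholder. The nameservice, NameNode IDs, hosts, and class name are the example values from the table. Other settings that a working provider needs are omitted.

```
# Hypothetical provider stanza: only the stanza name is invented.
[provider:MyHAProvider]
vix.fs.default.name = hdfs://sveserv51-ha
vix.dfs.nameservices = sveserv51-ha
# NameNode IDs for the nameservice, followed by one RPC address per ID.
vix.dfs.ha.namenodes.sveserv51-ha = nn1,nn2
vix.dfs.namenode.rpc-address.sveserv51-ha.nn1 = sveserv51-vm6.sv.splunk.com:8020
vix.dfs.namenode.rpc-address.sveserv51-ha.nn2 = sveserv51-vm5.sv.splunk.com:8020
vix.dfs.client.failover.proxy.provider.sveserv51-ha = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
```

Note that the nameservice name and the NameNode IDs are part of the setting names themselves, so they must match the values given in `vix.dfs.nameservices` and `vix.dfs.ha.namenodes.sveserv51-ha`.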

HA-JT

Setting:	Use it to:
`vix.mapred.job.tracker`	The logical name for a list of JobTrackers. Example: `sveserv51-ha-jt`
`vix.mapred.jobtrackers.sveserv51-ha-jt`	A comma-separated list of JobTrackers for a given logical JobTracker name, for example `sveserv51-ha-jt`. Example: `jt1,jt2`
`vix.mapred.jobtracker.rpc-address.sveserv51-ha-jt.jt1`	The RPC server address and port for a given JobTracker, for example `jt1`, of a given logical JobTracker name, for example `sveserv51-ha-jt`. Example: `sveserv51-vm6.sv.splunk.com:8021`
`vix.mapred.jobtracker.rpc-address.sveserv51-ha-jt.jt2`	The RPC server address and port for a given JobTracker, for example `jt2`, of a given logical JobTracker name, for example `sveserv51-ha-jt`. Example: `sveserv51-vm5.sv.splunk.com:8021`
`vix.mapred.client.failover.proxy.provider.sveserv51-ha-jt`	A FailoverProxyProvider implementation for a given logical JobTracker name, for example `sveserv51-ha-jt`. Example: `org.apache.hadoop.mapred.ConfiguredFailoverProxyProvider`
`vix.mapred.map.max.attempts`	The number of times a map task is attempted before it is marked as failed. Example: `4`
`vix.mapred.max.map.failures.percent`	The percentage of task failures tolerated before the whole MR job fails. Example: `5`
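
The HA-JT settings above could be combined in a provider stanza in `indexes.conf` along the lines of the sketch below. The stanza name `MyHAProvider` is a hypothetical placeholder. The logical JobTracker name `sveserv51-ha-jt`, the JobTracker IDs, hosts, class name, and retry values are the example values from the table. Other settings that a working provider needs are omitted.

```
# Hypothetical provider stanza: only the stanza name is invented.
[provider:MyHAProvider]
vix.mapred.job.tracker = sveserv51-ha-jt
# JobTracker IDs for the logical name, followed by one RPC address per ID.
vix.mapred.jobtrackers.sveserv51-ha-jt = jt1,jt2
vix.mapred.jobtracker.rpc-address.sveserv51-ha-jt.jt1 = sveserv51-vm6.sv.splunk.com:8021
vix.mapred.jobtracker.rpc-address.sveserv51-ha-jt.jt2 = sveserv51-vm5.sv.splunk.com:8021
vix.mapred.client.failover.proxy.provider.sveserv51-ha-jt = org.apache.hadoop.mapred.ConfiguredFailoverProxyProvider
# Attempt each map task up to 4 times, and tolerate up to 5 percent failed tasks.
vix.mapred.map.max.attempts = 4
vix.mapred.max.map.failures.percent = 5
```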
