Splunk® Enterprise

Splunk Analytics for Hadoop


Configure pass-through authentication in the configuration file

Splunk Analytics for Hadoop reaches End of Life on January 31, 2025.

This topic describes how to edit indexes.conf and impersonation.conf so that Splunk Analytics for Hadoop users can act as Hadoop users. This lets you give specific Splunk Analytics for Hadoop users the ability to submit MapReduce jobs to a specific queue as different Hadoop users. To configure pass-through authentication using the Splunk Web user interface, see Map pass-through authentication.

With pass-through authentication, Splunk Analytics for Hadoop uses its Superuser as a proxy to Hadoop, letting you interact with Hadoop. You can configure this proxy to be a Hadoop user with the same name as the Superuser, or a user with a different name.

To learn more about how pass-through authentication works, see About pass-through authentication.

Configure Hadoop Cluster to support pass-through authentication

Once you enable pass-through authentication, interactions with Hadoop happen as the Hadoop user with the same name as the Splunk Analytics for Hadoop user who is logged in. Hadoop must be configured as follows to support this:

1. Make sure that any Hadoop user you want Splunk Analytics for Hadoop users to act as exists on each Hadoop node. You can manually create them or use LDAP to create them.

2. Ensure your Superuser is in the Hadoop Supergroup. You can find the name of the Hadoop supergroup in the hdfs-site.xml file as dfs.permissions.supergroup (newer Hadoop releases name this property dfs.permissions.superusergroup).

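If your Superuser is not in the Supergroup on each Hadoop node, use the following command to add the Superuser to the Supergroup on each node. The -a option appends the group. Without it, -G replaces the user's existing supplementary groups:

sudo usermod -a -G <group name> <user name>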

3. Create home directories in HDFS for the users in your Hadoop clusters, and ensure that the provider's HDFS working directory (vix.splunk.home.hdfs) is readable and executable by all of these users.

4. Add a stanza to core-site.xml to allow the Hadoop user (with the same name as the Splunk Analytics for Hadoop Superuser) to act as a proxy for Hadoop users in designated node user groups:

Note: For best results, we recommend you do this against Kerberized clusters. For more information about using Kerberos, see Configure Kerberos Authentication.

<property>
<name>hadoop.proxyuser.<name of your Splunk Analytics for Hadoop Superuser>.groups</name>
<value>group1,group2</value>
<description>Allows the Splunk Analytics for Hadoop Superuser to impersonate any
member of the groups group1 and group2</description>
</property>

5. Optionally limit connections by host:

<property>
<name>hadoop.proxyuser.<name of your Splunk Analytics for Hadoop Superuser>.hosts</name>
<value>host1,host2</value>
<description>The superuser can connect only from host1 and
host2 to impersonate a user</description>
</property>
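
As a quick sanity check on steps 4 and 5, the following Python sketch reads a core-site.xml document and reports which groups and hosts are configured for a given proxy user. It is an illustration only, not a Splunk or Hadoop tool. The function name proxyuser_settings, the sample XML, and the Superuser name "splunk" are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. In practice, read the text of your cluster's core-site.xml.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>hadoop.proxyuser.splunk.groups</name>
    <value>group1,group2</value>
  </property>
  <property>
    <name>hadoop.proxyuser.splunk.hosts</name>
    <value>host1,host2</value>
  </property>
</configuration>"""


def proxyuser_settings(core_site_xml, superuser):
    """Return the proxy-user groups and hosts configured for superuser.

    Each key maps to a list of values, or to None when the matching
    hadoop.proxyuser.<superuser>.* property is absent.
    """
    root = ET.fromstring(core_site_xml)
    prefix = "hadoop.proxyuser." + superuser + "."
    settings = {"groups": None, "hosts": None}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name.startswith(prefix):
            key = name[len(prefix):]
            if key in settings:
                # Split the comma-separated value and drop empty entries.
                settings[key] = [v.strip() for v in value.split(",") if v.strip()]
    return settings


if __name__ == "__main__":
    print(proxyuser_settings(SAMPLE_CORE_SITE, "splunk"))
```

A result of None for "groups" suggests that the stanza from step 4 is missing for that Superuser name. A result of None for "hosts" only means that the optional restriction from step 5 is not configured.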

Configure pass-through authentication in Splunk Analytics for Hadoop

In the next steps, you can configure Splunk Analytics for Hadoop users to submit jobs to a specific queue and/or interact with Hadoop as a Hadoop user with a different name than the Splunk Analytics for Hadoop user logged in.

1. Turn on the feature in indexes.conf for each provider:

[provider:myprovider]
vix.splunk.impersonation = 1

2. Optionally map Splunk Analytics for Hadoop users to a specific Hadoop user alias and queue by updating impersonation.conf:

[provider:myprovider]
admin = {"user": "hadoopadmin"}
splunkanalyticsforhadoopuser1 = {"queue": "red"}

[provider:mycdh]
splunkanalyticsforhadoopuser1 = {"user": "hadoopuser", "queue": "blue"}
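
To make the mapping in the example above easier to read, the following Python sketch resolves which Hadoop user and queue a given Splunk Analytics for Hadoop user would get under each provider. It is not how Splunk itself reads impersonation.conf. The function name resolve_mapping is hypothetical. Falling back to the same user name when no "user" key is present follows the behavior described earlier in this topic. Returning None when no "queue" key is present is an assumption of this sketch, standing in for whatever queue the cluster would otherwise use.

```python
import configparser
import json

# The impersonation.conf example from this topic.
SAMPLE_IMPERSONATION_CONF = """
[provider:myprovider]
admin = {"user": "hadoopadmin"}
splunkanalyticsforhadoopuser1 = {"queue": "red"}

[provider:mycdh]
splunkanalyticsforhadoopuser1 = {"user": "hadoopuser", "queue": "blue"}
"""


def resolve_mapping(conf_text, provider, splunk_user):
    """Return (hadoop_user, queue) for splunk_user under the given provider."""
    # Split keys from values on "=" only, because the JSON values contain ":".
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep user names case-sensitive
    parser.read_string(conf_text)

    section = "provider:" + provider
    mapping = {}
    if parser.has_option(section, splunk_user):
        mapping = json.loads(parser.get(section, splunk_user))

    # No "user" key: act as the Hadoop user with the same name.
    hadoop_user = mapping.get("user", splunk_user)
    # No "queue" key: None (an assumption of this sketch).
    queue = mapping.get("queue")
    return hadoop_user, queue


if __name__ == "__main__":
    for provider in ("myprovider", "mycdh"):
        for user in ("admin", "splunkanalyticsforhadoopuser1"):
            print(provider, user, resolve_mapping(SAMPLE_IMPERSONATION_CONF, provider, user))
```

Read this way, admin acts as hadoopadmin only under myprovider, while splunkanalyticsforhadoopuser1 keeps its own name but submits to the red queue under myprovider, and acts as hadoopuser on the blue queue under mycdh.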
Last modified on 30 October, 2023

This documentation applies to the following versions of Splunk® Enterprise: 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.0.13, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.1.9, 7.1.10, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10, 8.1.0, 8.1.1, 8.1.2, 8.1.3, 8.1.4, 8.1.5, 8.1.6, 8.1.7, 8.1.8, 8.1.9, 8.1.10, 8.1.11, 8.1.12, 8.1.13, 8.1.14, 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 8.2.11, 8.2.12, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5, 9.0.6, 9.0.7, 9.0.8, 9.0.9, 9.0.10, 9.1.0, 9.1.1, 9.1.2, 9.1.3, 9.1.4, 9.1.5, 9.1.6, 9.1.7, 9.2.0, 9.2.1, 9.2.2, 9.2.3, 9.2.4, 9.3.0, 9.3.1, 9.3.2

