Splunk® Hadoop Connect

Deploy and Use Splunk Hadoop Connect

How Splunk Hadoop Connect fits into your Splunk deployment

This topic describes how Splunk Hadoop Connect fits into a Splunk deployment, where to deploy the app in your environment, and the decisions to make before you install it.

Splunk Hadoop Connect allows users to export data to disk, so give access to this app only to trusted administrators. To update sharing settings, edit default.meta as described in the topic "Security responsibilities with custom commands" in the Splunk Enterprise Search Manual.

Splunk Hadoop Connect is an app available within the Splunk ecosystem. All Splunk apps and add-ons run on top of a Splunk Enterprise installation. Install Splunk Enterprise first, then install the app components of Splunk Hadoop Connect.

Where to deploy Splunk Hadoop Connect

Where you install Splunk Hadoop Connect depends on which Hadoop Distributed File System (HDFS) features you want to use and on the size and scope of your Splunk deployment.

The diagram shows the locations where you can install Splunk Hadoop Connect.


To use the HDFS or mounted file system export and explore features, install the app as follows:

  • On a single-server Splunk instance, install the app onto that instance.
  • On a distributed Splunk deployment, install the app onto a search head.

If you want to use the HDFS import feature, the process is slightly different:

  • On a single-server Splunk instance, install the app onto that instance.
  • On a distributed Splunk deployment, install the app onto either a search head that is also configured to forward data or an external heavy forwarder.
  • See "Import HDFS data considerations" in this topic for additional requirements.

Note: Importing from a mounted file system is configured as a Splunk input. For information about adding inputs to the Splunk platform, see "Getting data in."
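The placement rules above can be summarized as a small lookup. The following Python sketch is illustrative only: the `Feature` enum and the `install_targets` function are hypothetical names introduced here, not part of Splunk Hadoop Connect or any Splunk API.

```python
from __future__ import annotations

from enum import Enum


class Feature(Enum):
    """Hypothetical labels for the two feature groups described above."""
    EXPORT_EXPLORE = "export_explore"  # HDFS or mounted file system export and explore
    HDFS_IMPORT = "hdfs_import"        # HDFS import


def install_targets(feature: Feature, distributed: bool) -> list[str]:
    """Return the instance types on which the app can be installed."""
    if not distributed:
        # Single-server deployments: the app goes on that one instance,
        # regardless of which feature you plan to use.
        return ["single-server instance"]
    if feature is Feature.EXPORT_EXPLORE:
        return ["search head"]
    # HDFS import in a distributed deployment needs an instance that forwards data.
    return ["search head configured to forward data", "external heavy forwarder"]


if __name__ == "__main__":
    for feature in Feature:
        for distributed in (False, True):
            print(feature.name, distributed, install_targets(feature, distributed))
```

The sketch only encodes the choices listed in this topic. It does not cover importing from a mounted file system, which is configured as a regular Splunk input.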

Windows support

You must deploy Splunk Hadoop Connect on a *nix instance of Splunk Enterprise.

If you have a Windows Splunk environment and want to deploy Splunk Hadoop Connect, you can add one or more *nix Splunk instances to your environment as search heads. You can then install Splunk Hadoop Connect on the *nix search heads.

For more information on what Splunk Hadoop Connect requires for installation, see System Requirements.

Import HDFS data considerations

To import and index data from Hadoop Distributed File System (HDFS) into the Splunk platform, use version 5.0 or later of the Splunk platform. Versions 5.0 and later include modular inputs, which the HDFS import feature requires. You can deploy Splunk Hadoop Connect onto a heavy forwarder running version 5.0 or later.

Forwarder support

Splunk Hadoop Connect requires Python to run, so it does not support installation on a universal forwarder. Install it on a full Splunk instance, and configure that instance to forward data to your Splunk indexers.
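As a planning aid, the platform constraints in this topic (a *nix host, a full Splunk instance rather than a universal forwarder, and version 5.0 or later for HDFS import) can be expressed as a simple pre-install check. This is a minimal sketch under stated assumptions: the function name `check_host`, its parameters, and the string values it accepts are hypothetical and do not correspond to any Splunk tool or API.

```python
from __future__ import annotations


def check_host(
    os_family: str,
    instance_type: str,
    splunk_version: tuple[int, ...],
    wants_hdfs_import: bool,
) -> list[str]:
    """Return a list of problems; an empty list means no constraint is violated.

    All argument values are supplied by the caller. Nothing here inspects
    a real Splunk installation.
    """
    problems = []
    if os_family.lower() == "windows":
        problems.append("Splunk Hadoop Connect must run on a *nix instance.")
    if instance_type == "universal_forwarder":
        problems.append("Universal forwarders are not supported; use a full Splunk instance.")
    # Tuples compare element by element, so (4, 3) < (5, 0) is True.
    if wants_hdfs_import and splunk_version < (5, 0):
        problems.append("HDFS import requires Splunk 5.0 or later (modular inputs).")
    return problems


if __name__ == "__main__":
    for problem in check_host("windows", "universal_forwarder", (4, 3), True):
        print(problem)
```

For the authoritative list of supported platforms and versions, see System Requirements.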

Last modified on 07 April, 2017

This documentation applies to the following versions of Splunk® Hadoop Connect: 1.2, 1.2.1, 1.2.2, 1.2.3, 1.2.4, 1.2.5
