Splunk® Hadoop Connect

Deploy and Use Splunk Hadoop Connect

Import from HDFS

If you are running Splunk Enterprise 5.0 or later, you can import files from Hadoop Distributed File System (HDFS) into Splunk Enterprise for indexing.

You can import any files or directories that reside in the Hadoop clusters that you configured for the Splunk platform. For cluster setup, see "Configure Splunk Hadoop Connect". The Splunk platform monitors the directories that you import, and when it detects a change in a monitored directory, it imports that information into the indexers.

Note: You can add mounted file systems as input using the input configuration features provided by the Splunk platform.

After Splunk Enterprise indexes an imported HDFS file, it no longer monitors that file for changes.
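The difference between directory monitoring and one-time file indexing can be illustrated with a minimal Python sketch. This is not how Hadoop Connect is implemented. The function name `files_to_import` and the example paths are hypothetical, and the sketch assumes that a file is identified only by its path.

```python
def files_to_import(listing, already_indexed):
    """Return paths from a directory listing that have not been indexed yet.

    Files that are already indexed are skipped even if their content
    changed, mirroring the behavior described above (illustration only).
    """
    return sorted(set(listing) - set(already_indexed))


indexed = set()

# First scan of the monitored directory: both files are new.
first_scan = ["/logs/a.log", "/logs/b.log"]
indexed.update(files_to_import(first_scan, indexed))

# Later scan: a.log may have been modified, but only c.log is imported.
second_scan = ["/logs/a.log", "/logs/b.log", "/logs/c.log"]
new_files = files_to_import(second_scan, indexed)
```

In this sketch, a new file in the directory is picked up on the next scan, while edits to a file that was already indexed are ignored.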

To import data from HDFS, complete the following steps:

1. In the dashboard, click Manage HDFS Inputs.

2. In the Data Inputs page, click New.

3. Complete the following fields:

  • Resource name: Enter a fully qualified path (without the leading hdfs://) to the data that you want to index. For example, hadoop.example.com:8020/path/to/data/.
  • Whitelist regex: Enter a regular expression that matches files that you want to index.
  • Blacklist regex: Enter a regular expression that matches files that you do not want to index.
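Before you save the input, it can help to check the three field values offline. The following Python sketch is one way to do that, not the app's own logic. The helper names `to_resource_name` and `should_index` are hypothetical. The sketch assumes that each regular expression is matched anywhere in the full file path and that a blacklist match takes precedence over a whitelist match. Confirm both assumptions against your deployment.

```python
import re


def to_resource_name(uri):
    """Strip a leading hdfs:// scheme, which the Resource name field omits."""
    prefix = "hdfs://"
    return uri[len(prefix):] if uri.startswith(prefix) else uri


def should_index(path, whitelist=None, blacklist=None):
    """Decide whether a file path passes the whitelist and blacklist.

    Assumptions (not confirmed by the documentation): patterns are
    searched anywhere in the path, and the blacklist wins on conflict.
    """
    if blacklist and re.search(blacklist, path):
        return False
    if whitelist and not re.search(whitelist, path):
        return False
    return True


resource = to_resource_name("hdfs://hadoop.example.com:8020/path/to/data/")

candidates = [
    "/path/to/data/app.log",
    "/path/to/data/tmp/app.log",
    "/path/to/data/app.gz",
]
# Keep .log files, but skip anything under a tmp directory.
selected = [p for p in candidates if should_index(p, r"\.log$", r"/tmp/")]
```

Running a few representative paths through a check like this makes it easier to spot a pattern that is too broad or too narrow before any data is indexed.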

4. Set the source type for the imported data.

  • Automatic: The Splunk platform classifies the imported data and assigns a source type. Unknown source types get placeholder names.
  • Manual: Enter the source type in the field provided.
  • From list: Choose from the list of source types.

5. (Optional) Click More settings to change the host or index values:

  • Host field value: The host value that is given to your imported data. The default value is the Splunk host where the app is running.
  • Index: Select the index where you want to store your imported data. If you do not select an index, the imported data is stored in the default index.

6. Click Save.

This documentation applies to the following versions of Splunk® Hadoop Connect: 1.1, 1.2, 1.2.1, 1.2.2, 1.2.3, 1.2.4, 1.2.5