Splunk® Enterprise

Splunk Analytics for Hadoop

About virtual indexes

Virtual indexes let Splunk Analytics for Hadoop address data stored in external systems and push computations to those systems. With virtual indexes you can access and report on structured, unstructured and polystructured data residing within your Hadoop cluster.

Splunk Analytics for Hadoop leverages the MapReduce framework to execute report-generating searches on Hadoop nodes. Data does not need to be pre-processed before it is accessed because Splunk Analytics for Hadoop lets you run analytics searches against the data where it rests in Hadoop.

Splunk Analytics for Hadoop treats virtual indexes as read-only data stores and binds a schema to the data at search time. This means the data you report on remains accessible in the same format as before to other systems and tools that use it, such as Hive and Pig.
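The idea of binding a schema at search time can be illustrated with a minimal Python sketch. This is a conceptual illustration only, not how Splunk Analytics for Hadoop is implemented: the sample events, the key=value pattern, and the helper names (`extract_fields`, `count_by`) are all assumptions made for the example. The point it shows is that the raw data is never rewritten; fields are derived only while the data is being read.

```python
import re

# Hypothetical raw events as they might sit in a file in Hadoop.
# A tuple is used to emphasize that the stored data is read-only.
RAW_EVENTS = (
    "2019-06-01T12:00:00 status=200 bytes=512",
    "2019-06-01T12:00:05 status=404 bytes=0",
)

# Illustrative "schema": a key=value pattern applied only at read time.
FIELD_PATTERN = re.compile(r"(\w+)=(\S+)")


def extract_fields(raw_line):
    """Derive a dict of fields from one raw line without modifying the line."""
    timestamp, _, rest = raw_line.partition(" ")
    fields = dict(FIELD_PATTERN.findall(rest))
    fields["_time"] = timestamp
    return fields


def count_by(events, field):
    """Count events per value of a field that is extracted on the fly."""
    counts = {}
    for line in events:
        value = extract_fields(line).get(field)
        counts[value] = counts.get(value, 0) + 1
    return counts


if __name__ == "__main__":
    print(count_by(RAW_EVENTS, "status"))
```

Because the extraction happens on read, a different pattern could be applied to the same files later without any change to the stored data, which is why other tools can keep using the files in their original format.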

Configuring virtual indexes

Before you set up a virtual index, set up a provider. When you configure a provider, you give Splunk Analytics for Hadoop details about your Hadoop cluster, which the ERP process uses to carry out reporting tasks. The ERP is a search helper process that carries out searches on Hadoop data.

You then configure virtual indexes by giving Splunk Analytics for Hadoop information about your Hadoop data, such as the data location and sets of whitelisted and blacklisted files or directories. When properly configured, virtual indexes recognize certain directory structures and use the information they extract to optimize searches. For example, if your data is partitioned into directories by date, Splunk Analytics for Hadoop can reduce the amount of data it processes by reading only the paths that are relevant to the search.
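The following Python sketch illustrates the general technique of combining whitelist and blacklist filters with date-based path pruning. It is a sketch under stated assumptions, not the product's actual logic or configuration syntax: the directory layout (`/data/weblogs/YYYYMMDD/`), the regular expressions, and the function names are hypothetical.

```python
import re
from datetime import date

# Hypothetical layout: /data/weblogs/YYYYMMDD/<file>
DATE_IN_PATH = re.compile(r"/(\d{4})(\d{2})(\d{2})/")
# Hypothetical filters: accept only .log files, ignore temporary data.
WHITELIST = re.compile(r"\.log$")
BLACKLIST = re.compile(r"/_tmp/|\.tmp$")


def partition_date(path):
    """Return the date encoded in the path, or None if the path has no date."""
    match = DATE_IN_PATH.search(path)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def select_paths(paths, earliest, latest):
    """Keep paths that pass the filters and fall inside the search time range."""
    selected = []
    for path in paths:
        # Blacklisted or non-whitelisted paths are never read.
        if BLACKLIST.search(path) or not WHITELIST.search(path):
            continue
        day = partition_date(path)
        # Paths without a recognizable date cannot be pruned, so they are kept.
        if day is not None and not (earliest <= day <= latest):
            continue
        selected.append(path)
    return selected


if __name__ == "__main__":
    candidates = [
        "/data/weblogs/20190601/a.log",
        "/data/weblogs/20190602/b.log",
        "/data/weblogs/20190602/b.tmp",
        "/data/weblogs/20190603/c.log",
    ]
    print(select_paths(candidates, date(2019, 6, 2), date(2019, 6, 2)))
```

In this sketch a search limited to a single day touches only that day's directory, so the amount of data read grows with the size of the time range rather than with the size of the whole data set.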


This documentation applies to the following versions of Splunk® Enterprise: 6.5.0, 6.5.1, 6.5.2, 6.5.3, 6.5.4, 6.5.5, 6.5.6, 6.5.7, 6.5.8, 6.5.9, 6.5.10, 6.6.0, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 6.6.5, 6.6.6, 6.6.7, 6.6.8, 6.6.9, 6.6.10, 6.6.11, 6.6.12, 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.3.0, 7.3.1

