Hunk®(Legacy)

Hunk User Manual

About virtual indexes

With virtual indexes you can access and report on structured, unstructured, and polystructured data residing within your Hadoop cluster. Hunk uses the MapReduce framework to execute report-generating searches on Hadoop nodes. Data does not need to be pre-processed before it is accessed because Hunk lets you run analytics searches against the data where it rests in Hadoop.

Hunk treats Hadoop virtual indexes as read-only data stores and binds a schema to the data at search time. This means the data you report on with Hunk remains accessible in the same format as before to other systems and tools that use it, such as Hive and Pig.
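To illustrate what binding a schema at search time means, here is a minimal Python sketch. It is not Hunk code: the sample events, the `FIELD_PATTERN` expression, and the `search` function are all hypothetical, chosen only to show that the stored data is never rewritten and that fields are derived each time a search reads it.

```python
import re

# Raw events stay exactly as stored; nothing is pre-processed or rewritten.
# (Sample data invented for this illustration.)
RAW_EVENTS = [
    "2016-01-27 10:15:02 status=200 bytes=512",
    "2016-01-27 10:15:07 status=404 bytes=0",
]

# Hypothetical "schema": a key=value extraction applied only when reading.
FIELD_PATTERN = re.compile(r"(\w+)=(\S+)")


def search(raw_events, predicate):
    """Yield extracted fields for matching events without modifying the input."""
    for line in raw_events:
        fields = dict(FIELD_PATTERN.findall(line))
        if predicate(fields):
            yield fields


errors = list(search(RAW_EVENTS, lambda f: f.get("status") == "404"))
print(errors)
```

Because the extraction happens only inside `search`, another tool reading `RAW_EVENTS` sees the same unmodified lines, which mirrors how the data remains usable by Hive and Pig.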

You can also create streaming resource libraries to access data that resides in NoSQL or other data stores. See About streaming resource libraries for more information about creating an interface that lets you create virtual indexes for this type of data.

Configuring virtual indexes

Before you set up a virtual index, you must set up providers and configure an External Resource Provider (ERP). An ERP is a search helper process we've created that carries out searches on Hadoop data. When you configure a provider, you give Hunk the details about your Hadoop cluster that the ERP needs to carry out reporting tasks. This procedure is described in detail elsewhere in this manual.

Configure virtual indexes by giving Hunk information about your Hadoop data, such as the data location and the sets of whitelisted and blacklisted files or directories. When properly configured, virtual indexes recognize certain directory structures, then extract and use that information to optimize searches. For example, if your data is partitioned in a directory structure using dates, then Hunk can reduce the amount of data it processes by reading only the paths that fall within the search time range.
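The idea of whitelisting, blacklisting, and date-based path pruning can be sketched in a few lines of Python. This is an illustration of the concept only, not how Hunk implements it: the directory layout (`/data/weblogs/YYYYMMDD/`), the regular expressions, and the `relevant_paths` function are assumptions made up for this example.

```python
import re
from datetime import date

# Hypothetical layout: /data/weblogs/YYYYMMDD/<file>
DATE_IN_PATH = re.compile(r"/(\d{4})(\d{2})(\d{2})/")
WHITELIST = re.compile(r"\.gz$")       # only files matching this are considered
BLACKLIST = re.compile(r"\.tmp\.gz$")  # files matching this are always skipped


def relevant_paths(paths, earliest, latest):
    """Return only the paths a search over [earliest, latest] needs to read."""
    selected = []
    for path in paths:
        if not WHITELIST.search(path) or BLACKLIST.search(path):
            continue
        match = DATE_IN_PATH.search(path)
        if match is None:
            # No date in the path: it cannot be pruned, so it must be scanned.
            selected.append(path)
            continue
        year, month, day = (int(part) for part in match.groups())
        if earliest <= date(year, month, day) <= latest:
            selected.append(path)
    return selected


paths = [
    "/data/weblogs/20160125/access.log.gz",
    "/data/weblogs/20160126/access.log.gz",
    "/data/weblogs/20160127/access.log.gz",
    "/data/weblogs/20160127/partial.tmp.gz",
    "/data/weblogs/20160127/README.txt",
]
to_scan = relevant_paths(paths, date(2016, 1, 26), date(2016, 1, 27))
print(to_scan)
```

In this sketch, a search restricted to 26 and 27 January reads two of the five files: the 25 January directory is pruned by date, and the other two files are excluded by the whitelist and blacklist before any data is opened.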

Last modified on 27 January, 2016

This documentation applies to the following versions of Hunk®(Legacy): 6.1, 6.1.1, 6.1.2, 6.1.3, 6.2, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.4.9, 6.4.10, 6.4.11

