About streaming resource libraries
To access and process data in external systems, Hunk uses External Results Providers (ERPs), which carry out the implementation details of data retrieval and computation.
You can use Hunk's built-in ERPs to process your Hadoop data or build your own ERPs as streaming resource library interfaces that connect to other types of external provider systems.
Creating an ERP as a streaming resource lets you query data from different systems and multiple data sources from one location.
You can run a single search across multiple indexes, so that in the same search you could query:
- Native Splunk Enterprise indexes.
- Virtual indexes configured to return data from your external Hadoop system.
- Streaming libraries configured to return data from NoSQL or other types of external systems.
With this, you can harness the awesomeness of Hunk on dispersed data and:
- Create visualizations in Splunk
- Correlate data across different systems
- Create reports
When you configure an ERP as a streaming resource library interface to your external systems, the ERP parses the search and executes it partially or fully, returning a superset of the final results to Hunk. Hunk then performs further processing and filtering to arrive at the final result set.
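The division of work between the ERP and Hunk can be sketched in Python. This is only an illustration, not Hunk's implementation: the function names, the record shape, and the assumption that the provider can push down only an equality filter on `status` are all made up for the example.

```python
from typing import Any, Callable, Dict, Iterable, List

Record = Dict[str, Any]


def erp_partial_search(records: Iterable[Record], status: int) -> List[Record]:
    """Hypothetical ERP side: applies only the filter it can push down to
    the external system, so it may return more rows than the search needs."""
    return [r for r in records if r.get("status") == status]


def hunk_final_filter(superset: Iterable[Record],
                      predicate: Callable[[Record], bool]) -> List[Record]:
    """Hypothetical Hunk side: applies the remaining search logic to the
    superset returned by the ERP."""
    return [r for r in superset if predicate(r)]


# Sample data standing in for an external system (invented for the example).
external_data = [
    {"status": 500, "host": "web1", "bytes": 120},
    {"status": 500, "host": "web2", "bytes": 9000},
    {"status": 200, "host": "web1", "bytes": 300},
]

# The ERP returns every status=500 record (a superset of the final result).
superset = erp_partial_search(external_data, status=500)

# Hunk then applies the part of the search the ERP did not execute.
final = hunk_final_filter(superset, lambda r: r["bytes"] > 1000)
```

The point of the sketch is that the ERP is allowed to over-return: correctness comes from the second filtering pass, so a provider only has to implement the parts of a search that its external system handles well.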
How it works
For example, you could configure Hunk so that you can create a single query that pulls data from:
- Local Splunk indexes
- File system
Here's an example of how a search is implemented when at least one ERP is configured as a streaming resource library (in this example, MongoDB):
1. The user logs in to the Hunk user interface and creates a search on one or more data collections (that is, indexes, whether native or virtual).
2. Hunk starts a search process. For each index, the search process:
- Detects that a virtual index is referenced in the search.
- Determines that the virtual index is mapped to an external results provider (the provider).
- If the external results provider is configured as a streaming resource library, the search process determines that the provider maps to a provider family, which contains the common information for all provider instances for that external system. An example of this would be a development MongoDB instance running on mongo-dev1.example.com. The provider family in this case contains connection information for MongoDB, while the provider contains details about how to connect to that specific instance.
Note: For information about virtual indexes that are not configured as streaming resource libraries, see About virtual indexes in this manual. For information about how Splunk Enterprise indexes work, see the Managing Indexes and Clusters manual for Splunk Enterprise.
3. The search process starts an ERP process to query the external system (in this example, MongoDB) to get the information that the user has requested.
4. The result data is returned to Hunk where it gets processed further and merged with results from all of your indexes.
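The steps above can be sketched as a small routing loop. This is a simplified model under stated assumptions, not Hunk's actual code: the lookup tables, the index name `mongo_orders`, and the functions `native_search` and `erp_search` are hypothetical stand-ins.

```python
from typing import Dict, List

# Hypothetical lookup tables standing in for Hunk's configuration.
VIRTUAL_INDEX_TO_PROVIDER: Dict[str, str] = {"mongo_orders": "mongo-dev1"}
PROVIDER_TO_FAMILY: Dict[str, str] = {"mongo-dev1": "mongodb"}


def native_search(index: str) -> List[str]:
    """Stand-in for a search against a native index."""
    return [f"native result from {index}"]


def erp_search(index: str, provider: str, family: str) -> List[str]:
    """Stand-in for starting an ERP process and collecting its output."""
    return [f"{family} result from {index} via {provider}"]


def run_search(indexes: List[str]) -> List[str]:
    """Route each index to the right backend and merge all results."""
    merged: List[str] = []
    for index in indexes:
        provider = VIRTUAL_INDEX_TO_PROVIDER.get(index)
        if provider is None:
            # Not a virtual index: search it natively.
            merged.extend(native_search(index))
        else:
            # Virtual index: resolve the provider, then its family, then query.
            family = PROVIDER_TO_FAMILY[provider]
            merged.extend(erp_search(index, provider, family))
    return merged


results = run_search(["main", "mongo_orders"])
```

In this model a single search over `main` and `mongo_orders` returns one merged list, which mirrors how results from the ERP are combined with results from the other indexes.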
Developing your own ERP
An ERP is a search helper process that does the following:
- Reads configurations and search parameters through standard input (stdin)
- Produces results over a predefined protocol for a specific search via stdout
An ERP can be written in any programming language as long as it implements the above logic.
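As a rough illustration of that shape, the following Python sketch reads its input from one stream and writes results to another. The wire format is an assumption: this topic does not describe the predefined protocol, so JSON input and line-delimited JSON output are used here purely as placeholders, and `fetch_results`, the `index` and `limit` keys, and the sample events are hypothetical.

```python
import io
import json
from typing import Any, Dict, Iterator, TextIO


def fetch_results(config: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Placeholder for the query against the external system.
    A real ERP would connect using values from the configuration."""
    for i in range(int(config.get("limit", 2))):
        yield {"index": config.get("index"), "event": f"sample event {i}"}


def run_erp(stdin: TextIO, stdout: TextIO) -> int:
    """Read configuration and search parameters from stdin, then write
    one result per line to stdout. Returns the number of results written."""
    config = json.loads(stdin.read())  # assumed JSON input, not the real protocol
    count = 0
    for result in fetch_results(config):
        stdout.write(json.dumps(result) + "\n")  # assumed line-delimited output
        count += 1
    stdout.flush()
    return count


# Demonstration with in-memory streams. A real helper process would pass
# sys.stdin and sys.stdout instead.
fake_stdin = io.StringIO('{"index": "mongo_orders", "limit": 2}')
fake_stdout = io.StringIO()
written = run_erp(fake_stdin, fake_stdout)
```

Passing the streams as parameters keeps the helper testable without starting a separate process. The same structure carries over to any language that can read stdin and write stdout.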
Configuring Hunk's default ERP as a streaming resource library
You can configure your streaming resource library to be as complex or simple as you need. You can also package it as an app and distribute it to your users.
1. You first create a provider family. The provider family provides the connection paths and information that the ERP needs in order to connect to your database, and tells it how to start the process. See "Configure a provider family" in this manual to create a provider family.
2. Then you create one or more providers. You associate each provider with its provider family, then include any specific or overriding configuration information. You can create multiple providers for a provider family. See "Configure a provider" in this manual to create a provider.
3. Finally you can create virtual indexes for your provider. The virtual index specifies a particular data collection in the database to be queried during a search. You can configure multiple virtual indexes for a provider. Each virtual index has an association to the provider plus any index-specific properties such as Blacklist, the path to use for the data, etc. See the following topics for more information about virtual indexes:
- "About virtual indexes" to learn more about how virtual indexes work.
- "Set up a virtual index" to set up a virtual index by editing configuration files.
- "Add or edit a virtual index" to set up a virtual index in the Hunk user interface.
4. To distribute your streaming resource library to users, you can package it as an app. You can configure indexes as part of the packaged app or you can have users download the app then set up indexes based on their needs. See "Working with Splunk Apps" to learn more about packaging apps.
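The relationship between provider family, provider, and virtual index described in steps 1 through 3 can be modeled as layered settings. The sketch below is a conceptual model only: every key, value, and host other than mongo-dev1.example.com is hypothetical, none of them are Hunk setting names, and the precedence order (family, then provider, then index) is an assumption made for the example.

```python
from typing import Dict

Settings = Dict[str, str]

# Common information shared by all providers of one external system.
families: Dict[str, Settings] = {
    "mongodb": {"start_command": "run-mongo-erp", "port": "27017"},
}

# Each provider names its family and adds or overrides settings.
providers: Dict[str, Settings] = {
    "mongo-dev1": {"family": "mongodb", "host": "mongo-dev1.example.com"},
    "mongo-alt": {"family": "mongodb", "host": "mongo-alt.example.com",
                  "port": "27018"},
}

# Each virtual index names its provider and a data collection to query.
virtual_indexes: Dict[str, Settings] = {
    "orders": {"provider": "mongo-dev1", "collection": "orders"},
    "orders_alt": {"provider": "mongo-alt", "collection": "orders"},
}


def effective_settings(index_name: str) -> Settings:
    """Resolve index -> provider -> family and merge their settings."""
    index = virtual_indexes[index_name]
    provider = providers[index["provider"]]
    family = families[provider["family"]]
    # Later dictionaries win: family defaults, then provider overrides,
    # then index-specific properties.
    return {**family, **provider, **index}


dev_settings = effective_settings("orders")
alt_settings = effective_settings("orders_alt")
```

Here `dev_settings` inherits the family's port, while `alt_settings` uses the port that its provider overrides. Several virtual indexes can point at the same provider, and several providers can share one family.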
Licensing streaming resource libraries
In the full version of Hunk, any byte that the ERP brings back is charged against the Enterprise indexing volume. To license a full version of Hunk that allows you to work with streaming resource libraries, contact your sales representative.
To license a trial version of streaming libraries, you can simply download the Hunk license and begin building ERPs against your data. The trial license includes 500 MB of indexing volume.
This documentation applies to the following versions of Hunk®(Legacy): 6.1, 6.1.1, 6.1.2, 6.1.3, 6.2, 6.2.1, 6.2.2, 6.2.3, 6.2.4, 6.2.5, 6.2.6, 6.2.7, 6.2.8, 6.2.9, 6.2.10, 6.2.11, 6.2.12, 6.2.13, 6.3.0, 6.3.1, 6.3.2, 6.3.3, 6.3.4, 6.3.5, 6.3.6, 6.3.7, 6.3.8, 6.3.9, 6.3.10, 6.3.11, 6.3.12, 6.3.13, 6.4.0, 6.4.1, 6.4.2, 6.4.3, 6.4.4, 6.4.5, 6.4.6, 6.4.7, 6.4.8, 6.4.9, 6.4.10, 6.4.11