Meet Splunk Analytics for Hadoop
Hadoop lets you store massive amounts of structured, polystructured, and unstructured data, but extracting value from that data can be a challenging and time-consuming task.
Splunk Analytics for Hadoop lets you access data in remote Hadoop clusters through virtual indexes and use the Splunk Search Processing Language (SPL) to analyze your data in Hadoop and NoSQL data stores. With Splunk Analytics for Hadoop, you can:
- Process, report, and visualize large amounts of structured, polystructured, and unstructured data.
- Run combined reports on Hadoop data and data from your Splunk Enterprise indexes.
- Use SDKs and apps with Hadoop data.
Because of how data is stored in Hadoop, certain Splunk Enterprise index behaviors cannot be duplicated:
- Splunk Analytics for Hadoop currently doesn't support real-time search of Hadoop data, although preview functionality and report acceleration are available.
- Because events are not sorted in any particular order, any search command that depends on implicit time order (for example, head, tail, and delta) behaves differently with Splunk Analytics for Hadoop. For more information about how certain timestamp-sensitive commands work with virtual indexes, see Search a virtual index in this manual.
- Searches against Hadoop data do not always return results as quickly as searches against a local index.
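The effect of unordered events on order-dependent commands can be illustrated with a small Python sketch. This is not Splunk code: the events, the `head` helper, and the assumption that a local index hands back results newest first are all illustrative stand-ins for the behavior described above.

```python
from datetime import datetime

# Hypothetical events in the order they happen to be stored.
# Timestamps and messages are invented for illustration.
events = [
    {"time": datetime(2017, 1, 1, 11, 55), "msg": "a"},
    {"time": datetime(2017, 1, 1, 12, 5), "msg": "c"},
    {"time": datetime(2017, 1, 1, 12, 0), "msg": "b"},
]

def head(results, n):
    """Return the first n results, in whatever order they arrive."""
    return results[:n]

# Assumed local-index behavior: results arrive newest first,
# so head picks the most recent events.
newest_first = sorted(events, key=lambda e: e["time"], reverse=True)
local_head = [e["msg"] for e in head(newest_first, 2)]

# Virtual index: no order is guaranteed, so head picks whichever
# events arrive first (here, simply the storage order).
virtual_head = [e["msg"] for e in head(events, 2)]

print(local_head)    # ['c', 'b']
print(virtual_head)  # ['a', 'c']
```

The same `head` call selects different events depending only on arrival order, which is why commands that assume a time order can return different results on a virtual index.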
To set up Splunk Analytics for Hadoop to work with your own HDFS data, see Install Splunk Analytics for Hadoop.
To learn about configuring and searching data in Splunk Web, see Search and report on virtual index data.
To learn more about how Splunk Analytics for Hadoop works, see Splunk Analytics for Hadoop concepts.
This documentation applies to the following versions of Splunk® Enterprise: 6.5.0, 6.5.1, 6.5.2, 6.5.3, 6.5.4, 6.5.5, 6.5.6, 6.5.7, 6.5.8, 6.6.0, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 6.6.5, 6.6.6, 6.6.7, 7.0.0, 7.0.1, 7.0.2, 7.0.3