Working with Hive and Parquet data
Splunk Analytics for Hadoop reaches End of Life on January 31, 2025.
Data Preprocessors
When Splunk Analytics for Hadoop initializes a search for non-HDFS input data, it uses the information contained in the FileSplitGenerator class to determine how to split data for parallel processing.
The default FileSplitGenerator contains the same data split logic defined in Hadoop's FileInputFormat. This means that it works for any data format that can be read by a Hadoop InputFormat implementation that has the same split logic as FileInputFormat.
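To make "file-based split logic" concrete, the following is a minimal Python sketch of how Hadoop's FileInputFormat divides a single file into byte ranges. It is an illustration only, not Splunk's FileSplitGenerator code: the function names `compute_split_size` and `file_splits` are hypothetical, and the sketch ignores block locations, compression, and non-splittable formats.

```python
from typing import List, Tuple

# FileInputFormat lets the final split run up to 10% over the target size
# rather than emitting a tiny trailing split.
SPLIT_SLOP = 1.1


def compute_split_size(block_size: int, min_size: int, max_size: int) -> int:
    """Clamp the block size between the configured minimum and maximum."""
    return max(min_size, min(max_size, block_size))


def file_splits(file_length: int, split_size: int) -> List[Tuple[int, int]]:
    """Return (offset, length) byte ranges covering one file.

    Hypothetical helper: a simplified model of file-based splitting.
    """
    if split_size <= 0:
        raise ValueError("split_size must be positive")

    splits: List[Tuple[int, int]] = []
    remaining = file_length

    # Emit full-size splits while more than SPLIT_SLOP splits' worth remains.
    while remaining / split_size > SPLIT_SLOP:
        splits.append((file_length - remaining, split_size))
        remaining -= split_size

    # Whatever is left (possibly slightly larger than split_size) is the last split.
    if remaining > 0:
        splits.append((file_length - remaining, remaining))

    return splits
```

For example, `file_splits(300, 128)` yields three ranges, while `file_splits(140, 128)` yields a single range because the remainder falls within the 10% slop. Formats whose record boundaries cannot be recovered from byte offsets alone are the ones that need a different split generator.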
Because the default FileSplitGenerator does not work for Hive or Parquet files, Splunk Analytics for Hadoop provides HiveSplitGenerator and ParquetSplitGenerator for Hive and Parquet. Any custom Hive files with file-based split logic (such as files created with Hadoop's FileOutputFormat and its subclasses) work with HiveSplitGenerator. If you have custom Hive file formats that do not use file-based data split logic, you can implement a custom SplitGenerator that uses your split logic.
Parquet files created by any tool (including Hive) work with ParquetSplitGenerator, and only with ParquetSplitGenerator.
- To configure Splunk Analytics for Hadoop to work with Hive, see Configure Hive connectivity.
- To configure Splunk Analytics for Hadoop to work with Parquet tables, see Configure Parquet tables.
This documentation applies to the following versions of Splunk® Enterprise: 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.0.13, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.1.9, 7.1.10, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10, 8.1.0, 8.1.1, 8.1.2, 8.1.3, 8.1.4, 8.1.5, 8.1.6, 8.1.7, 8.1.8, 8.1.9, 8.1.10, 8.1.11, 8.1.12, 8.1.13, 8.1.14, 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 8.2.11, 8.2.12, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5, 9.0.6, 9.0.7, 9.0.8, 9.0.9, 9.0.10, 9.1.0, 9.1.1, 9.1.2, 9.1.3, 9.1.4, 9.1.5, 9.1.6, 9.2.0, 9.2.1, 9.2.2, 9.2.3, 9.3.0, 9.3.1