Performance best practices
Splunk Analytics for Hadoop reaches End of Life on January 31, 2025.
When you search raw HDFS data, the data passes through index-time processing as part of the search. (Index-time extractions run at search time and cannot be turned off.)
To process this data more efficiently, optimize your index-time settings, particularly timestamping and aggregation (line merging). You can configure the following settings for your data source in props.conf to improve performance:
DATETIME_CONFIG
MAX_TIMESTAMP_LOOKAHEAD
TIME_PREFIX
TIME_FORMAT
SHOULD_LINEMERGE
ANNOTATE_PUNCT
For example, for single-line data without timestamps, the following settings can improve throughput roughly fourfold:
    [source::MyDataSource]
    ANNOTATE_PUNCT = false
    SHOULD_LINEMERGE = false
    DATETIME_CONFIG = NONE
Note: If you need to use timestamping, we strongly recommend that you use TIME_PREFIX and TIME_FORMAT to improve processing.
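As an illustration of the note above, here is a minimal props.conf sketch for timestamped, single-line data. The stanza name `MyTimestampedSource` and the event layout (each event starting with `ts=2024-01-15 10:30:45`) are assumptions made for this example, not values from this page. Adjust the prefix, format, and lookahead to match your own events.

```ini
# Hypothetical stanza name; replace with your own source.
[source::MyTimestampedSource]
# Assumed event layout: ts=2024-01-15 10:30:45 level=INFO ...
# Anchor the timestamp search right after the literal "ts=" at the start of the event.
TIME_PREFIX = ^ts=
# strptime-style pattern for the assumed timestamp "2024-01-15 10:30:45".
TIME_FORMAT = %Y-%m-%d %H:%M:%S
# The assumed timestamp is 19 characters long, so look no further than that past the prefix.
MAX_TIMESTAMP_LOOKAHEAD = 19
# Single-line events: skip line merging and punctuation annotation.
SHOULD_LINEMERGE = false
ANNOTATE_PUNCT = false
```

Telling the timestamp processor exactly where the timestamp starts and what it looks like avoids trying multiple candidate formats on every event. Comments are kept on their own lines because .conf files do not treat trailing text on a setting line as a comment.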
The table below lists example timestamping and line-breaking configurations and how long (in seconds) each one takes to process a file with 10 million single-line events:
Timestamping and breaking options | Time (seconds) |
---|---|
Default configuration | 190 |
 | 179 |
 | 105 |
 | 107 |
 | 51 |
 | 53 |
 | 44 |
 | 109 |
 | 99 |
 | 54 |
 | 54 |
 | 49 |
 | 50 |
 | 35 |
Disable streaming to speed up searches
If you want data only from MapReduce jobs, without previews, you can disable the Streaming feature of Splunk Analytics for Hadoop to speed up searches.
By default, Splunk Analytics for Hadoop uses Mix Mode, which combines Streaming (Splunk only) and Reporting (Hadoop MapReduce jobs) modes. If you do not require a preview, you can disable the streaming part of Splunk Analytics for Hadoop.
To enable or disable streaming:
- Mix Mode:
vix.mode = report and vix.splunk.search.mixedmode = 1
- Report Mode only:
vix.mode = report and vix.splunk.search.mixedmode = 0
- Streaming Mode only:
vix.mode = stream
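As a sketch of how the Report Mode only option might look in practice, the fragment below sets both values in a provider stanza. It assumes the settings belong in the provider stanza in indexes.conf, and the provider name `MyHadoopProvider` is hypothetical. Check your existing provider configuration before applying it.

```ini
# Hypothetical provider name; use the stanza that already defines your Hadoop provider.
[provider:MyHadoopProvider]
# Report Mode only: run searches as MapReduce jobs, without streaming previews.
vix.mode = report
vix.splunk.search.mixedmode = 0
```

To return to the default Mix Mode behavior, set `vix.splunk.search.mixedmode = 1` and leave `vix.mode = report`.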
This documentation applies to the following versions of Splunk® Enterprise: 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.0.13, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.1.9, 7.1.10, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10, 8.1.0, 8.1.1, 8.1.2, 8.1.3, 8.1.4, 8.1.5, 8.1.6, 8.1.7, 8.1.8, 8.1.9, 8.1.10, 8.1.11, 8.1.12, 8.1.13, 8.1.14, 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 8.2.11, 8.2.12, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5, 9.0.6, 9.0.7, 9.0.8, 9.0.9, 9.0.10, 9.1.0, 9.1.1, 9.1.2, 9.1.3, 9.1.4, 9.1.5, 9.1.6, 9.2.0, 9.2.1, 9.2.2, 9.2.3, 9.3.0, 9.3.1