Splunk® Enterprise

Splunk Analytics for Hadoop

Splunk Enterprise version 7.0 is no longer supported as of October 23, 2019. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see How to upgrade Splunk Enterprise.

Performance best practices

Splunk Analytics for Hadoop reaches End of Life on January 31, 2025.

When you search raw HDFS data, the data passes through index-time processing as part of the search. (Index-time extractions run at search time and cannot be turned off.)

To process this data more efficiently, optimize your index-time settings, particularly timestamping and aggregation. You can configure the following settings for your data source in props.conf to improve performance:

  • DATETIME_CONFIG
  • MAX_TIMESTAMP_LOOKAHEAD
  • TIME_PREFIX
  • TIME_FORMAT
  • SHOULD_LINEMERGE
  • ANNOTATE_PUNCT

For example, for single-line data without timestamps, the following settings can improve throughput roughly fourfold:

[source::MyDataSource]
ANNOTATE_PUNCT   = false
SHOULD_LINEMERGE = false
DATETIME_CONFIG  = NONE

Note: If you need to use timestamping, we strongly recommend that you use TIME_PREFIX and TIME_FORMAT to improve processing.
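As a rough sketch of that recommendation, a props.conf stanza for single-line events that begin with a timestamp might look like the following. The source name MyTimestampedSource and the sample timestamp layout are assumptions for illustration. The setting values are the ones used in the timing table in this topic, so adjust TIME_FORMAT to match your own data.

```ini
# props.conf
# "MyTimestampedSource" is a hypothetical source name.
# Assumes each event starts with a timestamp like: Mon, 30 Oct 2023 14:05:09 UTC
[source::MyTimestampedSource]
# Treat every line as its own event (skip line merging).
SHOULD_LINEMERGE = false
# The timestamp starts at the beginning of the event.
TIME_PREFIX = ^
# Look no further than 30 characters past TIME_PREFIX for the timestamp.
MAX_TIMESTAMP_LOOKAHEAD = 30
# Give the exact timestamp layout.
TIME_FORMAT = %a, %d %b %Y %H:%M:%S %Z
# Skip generation of the punct field.
ANNOTATE_PUNCT = false
```

Comments sit on their own lines because a trailing comment on a setting line would be read as part of the value.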

The table below shows example combinations of timestamping and line-breaking options and how long (in seconds) each combination can take to process a file with 10 million single-line events:

Timestamping and breaking options         Time (seconds)
----------------------------------------  --------------
Default configuration                     190

MAX_TIMESTAMP_LOOKAHEAD = 30              179

MAX_TIMESTAMP_LOOKAHEAD = 30              105
SHOULD_LINEMERGE = false

MAX_TIMESTAMP_LOOKAHEAD = 30              107
SHOULD_LINEMERGE = false
TIME_PREFIX = ^

MAX_TIMESTAMP_LOOKAHEAD = 30              51
SHOULD_LINEMERGE = false
TIME_FORMAT = %a, %d %b %Y %H:%M:%S %Z

MAX_TIMESTAMP_LOOKAHEAD = 30              53
SHOULD_LINEMERGE = false
TIME_PREFIX = ^
TIME_FORMAT = %a, %d %b %Y %H:%M:%S %Z

MAX_TIMESTAMP_LOOKAHEAD = 30              44
SHOULD_LINEMERGE = false
TIME_FORMAT = %a, %d %b %Y %H:%M:%S %Z
ANNOTATE_PUNCT = false

SHOULD_LINEMERGE = false                  109

SHOULD_LINEMERGE = false                  99
TIME_PREFIX = ^

SHOULD_LINEMERGE = false                  54
TIME_FORMAT = %a, %d %b %Y %H:%M:%S %Z

SHOULD_LINEMERGE = false                  54
TIME_PREFIX = ^
TIME_FORMAT = %a, %d %b %Y %H:%M:%S %Z

MAX_TIMESTAMP_LOOKAHEAD = 30              49
SHOULD_LINEMERGE = false
DATETIME_CONFIG = NONE

SHOULD_LINEMERGE = false                  50
DATETIME_CONFIG = CURRENT

MAX_TIMESTAMP_LOOKAHEAD = 30              35
SHOULD_LINEMERGE = false
DATETIME_CONFIG = NONE
ANNOTATE_PUNCT = false

Disable streaming to speed up searches

If you want data only from MapReduce jobs, without previews, you can disable the Streaming feature of Splunk Analytics for Hadoop to speed up searches.

By default, Splunk Analytics for Hadoop uses Mix Mode, which combines Streaming mode (Splunk only) and Reporting mode (Hadoop MapReduce jobs). If you do not require previews, you can disable the streaming part of Splunk Analytics for Hadoop.

To enable or disable streaming, use the settings for the mode you want:

  • Mix Mode: vix.mode = report and vix.splunk.search.mixedmode = 1
  • Report Mode only: vix.mode = report and vix.splunk.search.mixedmode = 0
  • Streaming Mode only: vix.mode = stream
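As a minimal sketch, the Report Mode only combination could be written as follows. This assumes the vix.* settings live in a provider stanza in indexes.conf. The provider name MyHadoopProvider is hypothetical, and the other settings a working provider needs are omitted.

```ini
# indexes.conf (assumed location of the provider settings)
# "MyHadoopProvider" is a hypothetical provider name.
# Other settings required by a working provider are omitted here.
[provider:MyHadoopProvider]
# Report Mode only: results come from MapReduce jobs, with no streamed previews.
vix.mode = report
vix.splunk.search.mixedmode = 0
```

Setting vix.splunk.search.mixedmode back to 1 with vix.mode = report gives the Mix Mode combination listed above.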
Last modified on 30 October, 2023

This documentation applies to the following versions of Splunk® Enterprise: 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.0.13, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.1.9, 7.1.10, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10, 8.1.0, 8.1.1, 8.1.2, 8.1.3, 8.1.4, 8.1.5, 8.1.6, 8.1.7, 8.1.8, 8.1.9, 8.1.10, 8.1.11, 8.1.12, 8.1.13, 8.1.14, 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 8.2.11, 8.2.12, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5, 9.0.6, 9.0.7, 9.0.8, 9.0.9, 9.0.10, 9.1.0, 9.1.1, 9.1.2, 9.1.3, 9.1.4, 9.1.5, 9.1.6, 9.1.7, 9.2.0, 9.2.1, 9.2.2, 9.2.3, 9.2.4, 9.3.0, 9.3.1, 9.3.2

