
Performance reference for the Splunk Add-on for AWS data inputs

Many factors affect throughput performance. The rate at which the Splunk Add-on for AWS ingests data depends on your deployment topology, the number of keys in a bucket, file size, file compression format, the number of events in a file, event size, and hardware and networking conditions.

This section provides throughput data measured under specific operating conditions and draws rough conclusions and guidelines from those results for tuning add-on throughput performance. Use the information here as a basis for estimating and optimizing the AWS add-on throughput performance in your own production environment. Because performance varies based on user characteristics, application usage, server configurations, and other factors, specific performance results cannot be guaranteed. Consult Splunk Support for accurate performance tuning and sizing.

Reference hardware and software environment

The throughput data and conclusions provided here are based on performance testing using Splunk platform instances (dedicated heavy forwarders and indexers) running in the following environment:

Instance type: m4.4xlarge
Memory: 64 GB
Compute Units (ECU): 53.5
vCPU: 16
Storage: 0 GB (EBS only)
Architecture: 64-bit
EBS Optimized (Max Bandwidth): 2,000 Mbps
Network performance: High

The following settings are configured in outputs.conf on the heavy forwarder:

useACK = true

maxQueueSize = 15MB
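
For reference, a minimal outputs.conf sketch with these settings in place might look like the following. The output group name and indexer addresses are placeholders, not part of the tested environment:

[tcpout]
defaultGroup = my_indexer_group

[tcpout:my_indexer_group]
# Placeholder indexer addresses; replace with your own receivers
server = 10.0.1.10:9997, 10.0.1.11:9997
# Settings used in the performance tests
useACK = true
maxQueueSize = 15MB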

Measured performance data

The throughput data provided here is the maximum performance achieved for each single input in performance testing under specific operating conditions, and it is subject to change when any of the hardware or software variables change. Use this data as a rough reference only. The GB/day column is the KB/s rate sustained over a full day (KB/s × 86,400 seconds ÷ 1,000,000).

Single-input max throughput

Data Input | Sourcetype (test data characteristics) | Max Throughput (KB/s) | Max EPS (events/s) | Max Throughput (GB/day)
Generic S3 | aws:elb:accesslogs (plain text, syslog, event size 250B, S3 key size 2MB) | 17,000 | 86,000 | 1,470
Generic S3 | aws:cloudtrail (gz, json, event size 720B, S3 key size 2MB) | 11,000 | 35,000 | 950
Incremental S3 | aws:elb:accesslogs (plain text, syslog, event size 250B, S3 key size 2MB) | 11,000 | 43,000 | 950
Incremental S3 | aws:cloudtrail (gz, json, event size 720B, S3 key size 2MB) | 7,000 | 10,000 | 600
SQS-based S3 | aws:elb:accesslogs (plain text, syslog, event size 250B, S3 key size 2MB) | 12,000 | 50,000 | 1,000
SQS-based S3 | aws:elb:accesslogs (gz, syslog, event size 250B, S3 key size 2MB) | 24,000 | 100,000 | 2,000
SQS-based S3 | aws:cloudtrail (gz, json, event size 720B, S3 key size 2MB) | 13,000 | 19,000 | 1,100
CloudWatch Logs [1] | aws:cloudwatchlog:vpcflow | 1,000 | 6,700 | 100
CloudWatch (ListMetrics, 10,000 metrics) | aws:cloudwatch | 240 (metrics/s) | N/A | N/A
CloudTrail | aws:cloudtrail (gz, json, sqs=1000, 9K events/key) | 5,000 | 7,000 | 400
Kinesis | aws:cloudwatchlog:vpcflow (json, 10 shards) | 15,000 | 125,000 | 1,200
SQS | aws:sqs (json, event size 2.8K) | N/A | 160 | N/A

[1] API throttling errors occur if the number of input streams exceeds 1,000.

Multi-input max throughput

The following throughput data was measured with multiple inputs configured on a heavy forwarder in an indexer cluster distributed environment.

Configuring more AWS accounts increases CPU usage and lowers throughput performance because of the additional API calls required. Consolidate AWS accounts where possible when you configure the Splunk Add-on for AWS.

Data Input | Sourcetype (test data characteristics) | Max Throughput (KB/s) | Max EPS (events/s) | Max Throughput (GB/day)
Generic S3 | aws:elb:accesslogs (plain text, syslog, event size 250B, S3 key size 2MB) | 23,000 | 108,000 | 1,980
Generic S3 | aws:cloudtrail (gz, json, event size 720B, S3 key size 2MB) | 45,000 | 130,000 | 3,880
Incremental S3 | aws:elb:accesslogs (plain text, syslog, event size 250B, S3 key size 2MB) | 34,000 | 140,000 | 2,930
Incremental S3 | aws:cloudtrail (gz, json, event size 720B, S3 key size 2MB) | 45,000 | 65,000 | 3,880
SQS-based S3 [1] | aws:elb:accesslogs (plain text, syslog, event size 250B, S3 key size 2MB) | 35,000 | 144,000 | 3,000
SQS-based S3 [1] | aws:elb:accesslogs (gz, syslog, event size 250B, S3 key size 2MB) | 42,000 | 190,000 | 3,600
SQS-based S3 [1] | aws:cloudtrail (gz, json, event size 720B, S3 key size 2MB) | 45,000 | 68,000 | 3,900
CloudWatch Logs | aws:cloudwatchlog:vpcflow | 1,000 | 6,700 | 100
CloudWatch (ListMetrics, 10,000 metrics) | aws:cloudwatch | 240 (metrics/s) | N/A | N/A
CloudTrail | aws:cloudtrail (gz, json, sqs=100, 9K events/key) | 20,000 | 15,000 | 1,700
Kinesis | aws:cloudwatchlog:vpcflow (json, 10 shards) | 18,000 | 154,000 | 1,500
SQS | aws:sqs (json, event size 2.8K) | N/A | 670 | N/A

[1] Performance testing of the SQS-based S3 input indicates that optimal throughput is reached when running four inputs on a single heavy forwarder instance. To achieve higher throughput beyond this bottleneck, scale out data collection by creating multiple heavy forwarder instances, each configured with up to four SQS-based S3 inputs that concurrently ingest data by consuming messages from the same SQS queue, as sketched below.
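
For illustration only, this scale-out pattern amounts to repeating up to four SQS-based S3 input stanzas in inputs.conf on each heavy forwarder, all pointing at the same queue. The stanza and parameter names shown here (aws_sqs_based_s3, aws_account, sqs_queue_url, sourcetype, interval) are a rough sketch of the add-on's SQS-based S3 input, and the account name and queue URL are placeholders; verify the exact parameter names against the inputs.conf.spec shipped with your version of the add-on.

# inputs.conf on heavy forwarder 1 (repeat the same pattern, with different stanza names, on additional heavy forwarders)
[aws_sqs_based_s3://elb_sqs_s3_1]
aws_account = my_aws_account
sqs_queue_url = https://sqs.us-east-1.amazonaws.com/111111111111/elb-access-log-notifications
sourcetype = aws:elb:accesslogs
interval = 300

[aws_sqs_based_s3://elb_sqs_s3_2]
aws_account = my_aws_account
sqs_queue_url = https://sqs.us-east-1.amazonaws.com/111111111111/elb-access-log-notifications
sourcetype = aws:elb:accesslogs
interval = 300

Because all inputs consume messages from the same SQS queue, each message is normally processed by a single input, so adding inputs or forwarders scales aggregate throughput.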

Max inputs benchmark per heavy forwarder

The following maximum input counts were measured with multiple inputs configured on a heavy forwarder in an indexer cluster distributed environment, with CPU and memory resources fully utilized.

If you have a smaller event size, fewer keys per bucket, or more available CPU and memory resources in your environment, you can configure more inputs than the maximum input number indicated in the table.

Data Input | Sourcetype | Format | Number of Keys per Bucket | Event Size | Max Inputs
S3 | aws:s3 | zip, syslog | 100K | 100B | 300
S3 | aws:cloudtrail | gz, json | 1,300K | 1KB | 30
Incremental S3 | aws:cloudtrail | gz, json | 1,300K | 1KB | 20
SQS-based S3 | aws:cloudtrail, aws:config | gz, json | 1,000K | 1KB | 50

Memory usage benchmark for generic S3 inputs

Event Size | Number of Events per Key | Total Number of Keys | Archive Type | Number of Inputs | Memory Used
1KB | 1,000 | 10,000 | zip | 20 | 20 GB
1KB | 1,000 | 1,000 | zip | 20 | 12 GB
1KB | 1,000 | 10,000 | zip | 10 | 18 GB
100B | 1,000 | 10,000 | zip | 10 | 15 GB

Performance tuning and sizing guidelines

If you do not achieve the expected AWS data ingestion throughput, follow these steps to tune the throughput performance:

  1. Identify the bottleneck in your system that prevents it from achieving a higher level of throughput performance. The bottleneck in AWS data ingestion may lie in one of the following components:
    • The Splunk Add-on for AWS: its capacity to pull in AWS data through API calls
    • Heavy forwarder: its capacity to parse and forward data to the indexer tier, which involves the throughput of the parsing, merging, and typing pipelines
    • Indexer: the index pipeline throughput
    To troubleshoot the indexing performance on the heavy forwarder and indexer, refer to Troubleshooting indexing performance in the Capacity Planning Manual.
    A chain is only as strong as its weakest link: the capacity of the bottleneck component determines the capacity of the system as a whole. Only by identifying and tuning the performance of the bottleneck component can you improve the overall system performance.
  2. Tune the performance of the bottleneck component.
    If the bottleneck lies in heavy forwarders or indexers, refer to the Summary of performance recommendations in the Capacity Planning Manual.
    If the bottleneck lies in the Splunk Add-on for AWS, adjust the following key factors that usually impact the AWS data input throughput:
    • Parallelization settings
      To achieve optimal throughput performance, you can set the parallelIngestionPipelines value to 2 in server.conf if your resource capacity permits (see the server.conf sketch after this list). For information about parallelIngestionPipelines, see Parallelization settings in the Splunk Enterprise Capacity Planning Manual.
    • AWS data inputs
      When there is no shortage of resources, adding more inputs in the add-on increases throughput, but it also consumes more memory and CPU. Increase the number of inputs to improve throughput until memory or CPU runs short.
      If you are using SQS-based S3 inputs, you can horizontally scale out data collection by configuring more inputs on multiple heavy forwarders to consume messages from the same SQS queue.
    • Number of keys in a bucket
      For both the Generic S3 and Incremental S3 inputs, the number of keys (objects) in a bucket affects initial data collection performance. The first time a Generic or Incremental S3 input collects data from a bucket, the more keys the bucket contains, the longer the list operation takes and the more memory is consumed. A large number of keys in a bucket requires a large amount of memory for S3 inputs during initial data collection and limits the number of inputs you can configure in the add-on.
      If applicable, you can use a log file prefix to divide the keys in a bucket into smaller groups and configure separate inputs to ingest them (see the inputs.conf sketch after this list). For information about how to configure inputs to use a log file prefix, see Add an S3 input for the Splunk Add-on for AWS.
      For SQS-based S3 inputs, the number of keys in a bucket is not a primary factor, because data collection can be horizontally scaled out based on messages consumed from the same SQS queue.
    • File format
      Compressed files consume much more memory than plain text files.
  3. When you have resolved the bottleneck, check whether the improved performance meets your requirements. If not, repeat the previous steps to identify the next bottleneck in the system and address it until you achieve the expected overall throughput performance.
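
For the parallelization setting in step 2, a minimal server.conf sketch on the heavy forwarder is shown below. Applying it assumes the instance has spare CPU and memory capacity, and the change requires a restart of the Splunk platform instance:

# server.conf on the heavy forwarder
[general]
# Run two ingestion pipeline sets if resource capacity permits
parallelIngestionPipelines = 2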
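
For the log file prefix technique in step 2, the sketch below splits one bucket across two generic S3 inputs by key prefix so that each input lists and ingests only the keys under its prefix. The stanza and parameter names (aws_s3, aws_account, bucket_name, key_name) reflect the generic S3 input as understood here, and the account, bucket, and prefixes are placeholders; confirm the exact parameter names in the add-on's inputs.conf.spec.

# inputs.conf: one input per key prefix instead of one input for the whole bucket
[aws_s3://elb_logs_2020_07]
aws_account = my_aws_account
bucket_name = my-elb-log-bucket
key_name = AWSLogs/elasticloadbalancing/2020/07/
sourcetype = aws:elb:accesslogs

[aws_s3://elb_logs_2020_08]
aws_account = my_aws_account
bucket_name = my-elb-log-bucket
key_name = AWSLogs/elasticloadbalancing/2020/08/
sourcetype = aws:elb:accesslogs

Splitting the keys this way shortens the initial list operation and reduces the memory each input needs.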
Last modified on 25 August, 2020
 
