Splunk® Supported Add-ons

Splunk Add-on for AWS

Troubleshoot the Splunk Add-on for AWS

Use the following information to troubleshoot the Splunk Add-on for Amazon Web Services (AWS). For troubleshooting tips that apply to all add-ons, see Troubleshoot add-ons and Support and resource links for add-ons in the Splunk Add-ons manual.

Data collection errors and performance issues

You can choose dashboards from the Health Check menu to troubleshoot data collection errors and performance issues.

The Health Overview dashboard gives you an at-a-glance view of the following data collection errors and performance metrics for all input types:

  • Error count by error category
  • Error count over time by input type, host, data input, and error category
  • Throughput over time by host, input type, and data input

The S3 Health Details dashboard focuses on the generic, incremental, and SQS-based S3 input types. It shows the indexing time lag and detailed error information for these multi-purpose inputs.

You can customize the health dashboard. See the About the Dashboard Editor topic in the Dashboards and Visualizations manual.

Internal logs

You can directly access internal log data for help with troubleshooting. Data collected with these source types is used in the Health Check dashboards.

| Data source | Source type |
| --- | --- |
| splunk_ta_aws_cloudtrail_cloudtrail_{input_name}.log | aws:cloudtrail:log |
| splunk_ta_aws_cloudwatch.log | aws:cloudwatch:log |
| splunk_ta_aws_cloudwatch_logs.log | aws:cloudwatchlogs:log |
| splunk_ta_aws_config_{input_name}.log | aws:config:log |
| splunk_ta_aws_config_rule.log | aws:configrule:log |
| splunk_ta_aws_inspector_main.log, splunk_ta_aws_inspector_app_env.log, splunk_ta_aws_inspector_proxy_conf.log, and splunk_ta_aws_inspector_util.log | aws:inspector:log |
| splunk_ta_aws_description.log | aws:description:log |
| splunk_ta_aws_billing_{input_name}.log | aws:billing:log |
| splunk_ta_aws_generic_s3_{input_name} | aws:s3:log |
| splunk_ta_aws_logs_{input_name}.log (each incremental S3 input has one log file, with the input name in the file name) | aws:logs:log |
| splunk_ta_aws_kinesis.log | aws:kinesis:log |
| splunk_ta_aws_sqs_based_s3_{input_name} | aws:sqsbaseds3:log |
| splunk_ta_aws_sns_alert_modular.log and splunk_ta_aws_sns_alert_search.log | aws:sns:alert:log |
| splunk_ta_aws_rest.log (populated by REST API handlers called when setting up the add-on or data input) | aws:resthandler:log |
| splunk_ta_aws_proxy_conf.log (the proxy handler used in all AWS data inputs) | aws:proxy-conf:log |
| splunk_ta_aws_s3util.log (populated by the S3, CloudWatch, and SQS connectors) | aws:resthandler:log |
| splunk_ta_aws_util.log (a shared utilities library) | aws:util:log |

Configure log levels

  1. Click Splunk Add-on for AWS in the navigation bar on Splunk Web.
  2. Click Configuration in the app navigation bar.
  3. Click the Logging tab.
  4. Adjust the log levels for each of the AWS services as needed by changing the default level of INFO to DEBUG or ERROR.

These log level configurations apply only to runtime logs. Some REST endpoint logs from configuration activity are always written at DEBUG, and some validation logs are always written at ERROR. These levels cannot be configured.

Low throughput for the Splunk Add-on for AWS

If you do not achieve the expected AWS data ingestion throughput, follow these steps to troubleshoot the throughput performance:

  1. Identify the problem in your system.
  2. Adjust the factors affecting performance.
  3. Verify whether performance meets your requirements.
  1. Identify the problem in your system that prevents it from achieving a higher level of throughput performance. The problem in AWS data ingestion might be caused by one of the following components:
    • The amount of data the Splunk Add-on for AWS can pull in through API calls
    • The heavy forwarder's capacity to parse and forward data to the indexer tier, which involves the throughput of the parsing, merging, and typing pipelines
    • The index pipeline throughput
    To troubleshoot the indexing performance on the heavy forwarder and indexer, refer to Troubleshooting indexing performance in the Capacity Planning Manual.
  2. Troubleshoot the performance of the problem component.
    If heavy forwarders or indexers are affecting performance, refer to the Summary of performance recommendations in the Splunk Enterprise Capacity Planning Manual.
    If the Splunk Add-on for AWS is affecting performance, adjust the following factors:
    • Parallelization settings
      To achieve optimal throughput performance, set the value of parallelIngestionPipelines to 2 in the server.conf file if your resource capacity permits. For information about parallelIngestionPipelines, see Parallelization settings in the Splunk Enterprise Capacity Planning Manual.
    • AWS data inputs
      If you have sufficient resources, you can increase the number of inputs to improve throughput. Each additional input also consumes more memory and CPU, so add inputs only until memory or CPU starts to run short.
      If you are using SQS-based S3 inputs, you can horizontally scale data collection by configuring more inputs on multiple heavy forwarders to consume messages from the same SQS queue.
    • Number of keys in a bucket
      For both the Generic S3 and Incremental S3 inputs, the number of keys or objects in a bucket can impact initial data collection performance. A large number of keys in a bucket requires more memory for S3 inputs in the initial data collection and limits the number of inputs you can configure in the add-on.
      If applicable, you can use a log file prefix to divide the keys in a bucket into smaller groups and configure different inputs to ingest them separately. For information about how to configure inputs to use a log file prefix, see Configure Generic S3 inputs for the Splunk Add-on for AWS.
      For SQS-based S3 inputs, the number of keys in a bucket is not a primary factor since data collection can be horizontally scaled out based on messages consumed from the same SQS queue.
    • File format
      Compressed files consume much more memory than plain text files.
  3. When you resolve the performance issue, see if the improved performance meets your requirements. If not, repeat the previous steps to identify the next bottleneck in the system and address it until you're satisfied with the overall throughput performance.
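
The log file prefix approach from step 2 can be sketched in Python as a way to preview how the keys in a bucket would be divided among inputs. This is a minimal illustration only: the function name group_keys_by_prefix, the sample keys, and the prefixes are hypothetical, and the add-on's own prefix handling may differ.

```python
from collections import defaultdict

def group_keys_by_prefix(keys, prefixes):
    """Assign each key to the first prefix it starts with.

    Keys that match no prefix are collected under None so that
    uncovered data is easy to spot.
    """
    groups = defaultdict(list)
    for key in keys:
        for prefix in prefixes:
            if key.startswith(prefix):
                groups[prefix].append(key)
                break
        else:
            # No prefix matched this key.
            groups[None].append(key)
    return dict(groups)

# Hypothetical key names, one input per prefix.
sample_keys = ["logs/2020/a.gz", "logs/2021/b.gz", "other/c.gz"]
print(group_keys_by_prefix(sample_keys, ["logs/2020/", "logs/2021/"]))
```

Each resulting group corresponds to one input, which keeps the key listing that any single input has to hold in memory smaller.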

Problem saving during account or input configuration

If you experience errors or trouble saving while configuring your AWS accounts on the setup page, go to $SPLUNK_HOME/etc/system/local/web.conf and change the following timeout setting:

  [settings]
  splunkdConnectionTimeout = 300

Problems deploying with a deployment server

If you use a deployment server to deploy the Splunk Add-on for Amazon Web Services to multiple heavy forwarders, you must configure the Amazon Web Services accounts using the Splunk Web setup page for each instance separately because the deployment server does not support sharing hashed password storage across instances.

S3 issues

Troubleshoot the S3 inputs for the Splunk Add-on for AWS.

S3 input performance issues

You can configure multiple S3 inputs for a single S3 bucket to improve performance. The Splunk platform dedicates one process for each data input, so provided that your system has sufficient processing power, you can improve performance with multiple inputs. See Hardware and software requirements for the Splunk Add-on for AWS.

To prevent indexing duplicate data, don't overlap the S3 key names in multiple inputs against the same bucket.
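
As a rough way to check for overlap before creating the inputs, the following Python sketch flags pairs of key prefixes where one prefix contains the other. It assumes each input is restricted by a simple key prefix; the function name overlapping_prefixes and the sample prefixes are hypothetical.

```python
def overlapping_prefixes(prefixes):
    """Return pairs of prefixes where one is a prefix of the other.

    Two inputs configured with such a pair would both collect the
    keys under the longer prefix, which produces duplicate events.
    """
    pairs = []
    for i, first in enumerate(prefixes):
        for second in prefixes[i + 1:]:
            if first.startswith(second) or second.startswith(first):
                pairs.append((first, second))
    return pairs

# Hypothetical prefixes for three inputs against the same bucket.
print(overlapping_prefixes(["logs/", "logs/2020/", "audit/"]))
```

An empty result means that no two prefixes in the list select the same keys.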

S3 key name filtering issues

Review your regular expressions to fix filtering issues. For example, the deny list and the allow list match against the full key name, not just the last segment:

Allow list .*abc/.* matches /a/b/abc/e.gz.
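
You can test a pattern against sample key names outside the Splunk platform before saving the input. The following Python sketch assumes that the pattern must cover the whole key name, which is modeled here with re.fullmatch; the add-on's exact matching semantics are not confirmed by this topic, and the function name key_allowed is hypothetical.

```python
import re

def key_allowed(pattern: str, key: str) -> bool:
    """Return True if the pattern matches the full S3 key name."""
    # Assumption: the pattern has to match the entire key, not a substring.
    return re.fullmatch(pattern, key) is not None

# The allow list example from this topic matches the full key.
print(key_allowed(r".*abc/.*", "/a/b/abc/e.gz"))

# A pattern written only for the last segment does not match the full key.
print(key_allowed(r"e\.gz", "/a/b/abc/e.gz"))
```

If a pattern that targets only the file name fails to match, prefix it with .* so that it also covers the leading path segments.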

For more help with regex, see the following resources:

S3 event line breaking issues

If your indexed S3 data has incorrect line breaking, configure a custom source type in props.conf to control how the lines break for your events.

If S3 events are too long and get truncated, set TRUNCATE = 0 in props.conf to prevent line truncating.

For more information, see Configure event line breaking in the Getting Data In manual.

CloudWatch configuration issues

Troubleshoot your CloudWatch configuration.

API throttling issues

If you have a high volume of CloudWatch data, search index=_internal Throttling to determine if you are experiencing an API throttling issue. If you are, contact AWS support to increase your CloudWatch API rate. You can also decrease the number of metrics you collect or increase the granularity of your indexed data in order to make fewer API calls.

Granularity

If the granularity of your indexed data does not match your expectations, check that your configured granularity falls within what AWS supports for the metric you have selected. Different AWS metrics support different minimum granularities, based on the allowed sampling period for that metric. For example, CPUUtilization has a sampling period of 5 minutes, whereas Billing Estimated Charge has a sampling period of 4 hours.

If you configured a granularity that is less than the sampling period for the selected metric, the reported granularity in your indexed data reflects the actual sampling granularity but is labeled with your configured granularity. Clear the local/inputs.conf cloudwatch stanza with the problem, adjust the granularity configuration to match the supported sampling granularity so that newly indexed data is correct, and reindex the data.
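
The relationship between the configured granularity and the sampling period can be sketched in Python. The sampling periods below are the two examples given in this topic, expressed in seconds; the dictionary, the function names, and the use of EstimatedCharges as the key for the billing metric are illustrative assumptions, not part of the add-on.

```python
# Sampling periods in seconds, taken from the examples in this topic.
SAMPLING_PERIOD_SECONDS = {
    "CPUUtilization": 5 * 60,          # 5 minutes
    "EstimatedCharges": 4 * 60 * 60,   # 4 hours (Billing Estimated Charge)
}

def is_mislabeled(metric: str, configured_seconds: int) -> bool:
    """True if the configured granularity is finer than the sampling period."""
    return configured_seconds < SAMPLING_PERIOD_SECONDS[metric]

def effective_granularity(metric: str, configured_seconds: int) -> int:
    """Granularity the indexed data actually has, in seconds."""
    return max(configured_seconds, SAMPLING_PERIOD_SECONDS[metric])

# A 60-second setting for CPUUtilization still yields 300-second data.
print(is_mislabeled("CPUUtilization", 60))
print(effective_granularity("CPUUtilization", 60))
```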

CloudTrail data indexing problems

If you are not seeing CloudTrail data in the Splunk platform, follow this troubleshooting process.

  1. Review the internal logs with the following search: index=_internal source=*cloudtrail*
  2. Verify that the Splunk platform is connecting to SQS successfully by searching for the string Connected to SQS.
  3. Verify that the Splunk platform is processing messages successfully. Look for strings that follow the pattern: X completed, Y failed while processing notification batch.
  4. Review your Amazon Web Services configuration to verify that SQS messages are being placed into the queue. If messages are being removed and the logs do not show that the input is removing them, then there might be another script or input consuming messages from the queue. Review your data inputs to ensure there are no other inputs configured to consume the same queue.
  5. Go to the AWS console to view CloudWatch metrics with the detail set to 1 minute to view the trend. For more details, see https://aws.amazon.com/blogs/aws/amazon-cloudwatch-search-and-browse-metrics-in-the-console/. If you see messages consumed but no Splunk platform inputs are consuming them, check for remote services that might be accessing the same queue.
  6. If your AWS deployment contains large S3 buckets with a large number of subdirectories for 60 or more AWS accounts, perform one of the following tasks:
    • Enable SQS notification for each S3 bucket and switch to an SQS-based S3 input. This lets you add multiple copies of the input for scaling purposes.
    • Split your inputs into one bucket per account and use multiple incremental inputs.
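
If you export the internal log lines for review, the batch-processing messages described in the steps above can be totaled with a short Python sketch. The regular expression follows the "X completed, Y failed while processing notification batch" pattern quoted in this topic; the function name summarize_batches and the sample lines are hypothetical, and real log lines carry additional fields.

```python
import re

# Matches the batch summary pattern quoted in the troubleshooting steps.
BATCH_RE = re.compile(
    r"(\d+) completed, (\d+) failed while processing notification batch"
)

def summarize_batches(lines):
    """Return the total (completed, failed) counts across all log lines."""
    completed = failed = 0
    for line in lines:
        match = BATCH_RE.search(line)
        if match:
            completed += int(match.group(1))
            failed += int(match.group(2))
    return completed, failed

# Hypothetical log excerpts for illustration only.
sample = [
    "Connected to SQS",
    "10 completed, 0 failed while processing notification batch",
    "3 completed, 2 failed while processing notification batch",
]
print(summarize_batches(sample))
```

A nonzero failed total points to a processing problem, while zero matching lines suggests the input is not receiving notifications at all.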

Billing Report issues

Troubleshoot the Splunk Add-on for AWS Billing inputs.

Problems accessing billing reports from AWS

If you have problems accessing billing reports from AWS, ensure that:

  • There are Billing Reports available on the S3 bucket you select when you configure the billing input.
  • The AWS account you specify has permission to read the files inside that bucket.

Problems understanding the billing report data

If you have problems understanding the billing report data, access the saved searches included with the add-on to analyze billing report data.

Problems configuring the billing data interval

The default collection intervals for billing report data are designed to minimize license usage. Review the default behavior and make adjustments with caution.

Configure the interval by which the Splunk platform pulls Monthly and Detailed Billing Reports:

  1. In Splunk Web, go to the Splunk Add-on for AWS inputs screen.
  2. Create a new Billing input or click to edit your existing one.
  3. Click the Settings tab.
  4. Customize the value in the Interval field.

SNS alert issues

Because the modular input module is inactive, it cannot check whether the AWS account is correctly configured or whether the topic exists in AWS SNS. If you cannot send a message to the AWS SNS account, check the following:

  • Ensure the SNS topic name exists in AWS and the region ID is correctly configured.
  • Ensure the AWS account is correctly configured in Splunk Add-on for AWS.

If you still have the issue, use the following search to check the log for AWS SNS:

index=_internal sourcetype=aws:sns:alert:log

Proxy settings for VPC endpoints

You must add each S3 region endpoint to the no_proxy setting, and use the correct hostname for your region: s3.<your_aws_region>.amazonaws.com. The no_proxy setting does not allow spaces between entries.

When using a proxy with VPC endpoints, check the proxy setting defined in the splunk-launch.conf file located at $SPLUNK_HOME/etc/splunk-launch.conf. For example:

no_proxy = 169.254.169.254,127.0.0.1,s3.amazonaws.com,s3.ap-southeast-2.amazonaws.com
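
To avoid stray spaces when the list of endpoints grows, the value can be generated rather than typed by hand. The following Python sketch is only an illustration: the function names s3_endpoint and build_no_proxy are hypothetical, and the host list reuses the example values from this topic.

```python
def s3_endpoint(region: str) -> str:
    """Return the regional S3 hostname in the form used in this topic."""
    return f"s3.{region}.amazonaws.com"

def build_no_proxy(hosts) -> str:
    """Join hosts into a comma-separated value with no spaces."""
    # Strip surrounding whitespace and drop empty entries before joining.
    cleaned = [host.strip() for host in hosts if host.strip()]
    return ",".join(cleaned)

hosts = [
    "169.254.169.254",
    " 127.0.0.1",  # a stray leading space is removed
    "s3.amazonaws.com",
    s3_endpoint("ap-southeast-2"),
]
print("no_proxy = " + build_no_proxy(hosts))
```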

Certificate verify failed (_ssl.c:741) error message

If you create a new input, you might receive the following error message:
certificate verify failed (_ssl.c:741)
Perform the following steps to resolve the error:

  1. Navigate to $SPLUNK_HOME/etc/auth/cacert.pem and open the cacert.pem file with a text editor.
  2. Copy the text from your deployment's proxy server certificate, and paste it into the cacert.pem file.
  3. Save your changes.
Last modified on 29 August, 2020