Splunk Cloud Platform

Getting Data In

Troubleshoot the input process

Following are some initial steps you can take to troubleshoot the data input process.

Determine why you do not find the events you expect

When you add an input to your deployment, that input gets added relative to the app you are in. Some apps write input data to a specific index. If you cannot find data that you are certain is in your deployment, confirm that you are looking at the right index. See Retrieve events from indexes in the Search Manual. You might want to add indexes to the list of default indexes for the role you are using.

  • For more information about roles, see Add and edit roles in the Securing the Splunk Platform manual.
  • For more information about troubleshooting data input issues, read the rest of this topic or see I can't find my data! in the Troubleshooting Manual.

If you use Splunk Enterprise and add inputs by editing the inputs.conf configuration file, Splunk Enterprise might not recognize the inputs immediately. Splunk Enterprise looks for inputs every 24 hours, starting from the time it was last restarted, so if you add a new stanza to monitor a directory or file, it could take up to 24 hours for Splunk Enterprise to start indexing the contents of that directory or file. To ensure that your input is immediately recognized and indexed, add the input through Splunk Web or the CLI, or restart Splunk services after you make edits to the inputs.conf file.

Troubleshoot your tailed files

You can use the FileStatus Representational State Transfer (REST) endpoint to get the status of your tailed files. For example:

curl https://serverhost:8089/services/admin/inputstatus/TailingProcessor:FileStatus
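
The same request can be sketched in Python with the standard library. This is a minimal illustration, not an official client: the function name build_file_status_request and the username and password parameters are hypothetical, and the sketch assumes that the management port accepts HTTP basic authentication, which the curl example does not show.

```python
import base64
import urllib.request

# Path taken from the curl example in this topic.
FILE_STATUS_PATH = "/services/admin/inputstatus/TailingProcessor:FileStatus"


def build_file_status_request(host, username, password, port=8089):
    """Build (but do not send) a GET request for the FileStatus endpoint.

    Assumption: the instance accepts HTTP basic authentication on its
    management port. The host name and credentials are placeholders.
    """
    url = f"https://{host}:{port}{FILE_STATUS_PATH}"
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request
```

Passing the returned object to urllib.request.urlopen would send the request. If the instance uses a self-signed certificate, the call also needs a suitable ssl context.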

You can also monitor the fishbucket, a subdirectory that Splunk software uses to keep track of how much of a file's contents has been indexed. In Splunk Enterprise deployments, the fishbucket resides at $SPLUNK_DB/fishbucket/splunk_private_db. In Splunk Cloud Platform deployments, you do not have physical access to this subdirectory.

To monitor the fishbucket, use the REST endpoint. Review the REST API Reference manual for additional information.

Troubleshoot ingestion congestion on Splunk Enterprise

Sometimes, Splunk Enterprise data ingestion slows for no apparent reason. One possible cause is the number of inactive input channels available on your Splunk Enterprise indexers.

Description of an input channel

An indexer must track the state of each unique stream of data that it processes. For example, when it breaks up lines of data that it has ingested from a set of tailed files, the indexer receives data from these files in an order that cannot be predicted. Parts of various files can be interleaved with one another. To keep the line breaking of one file from interfering with the line breaking of another, the indexer tracks the state of each file with a data structure called an input channel.

An input channel stores a variety of information, including the following:

  • The state of the linebreaker
  • The state of the aggregator
  • The punct state
  • The settings in props.conf for the input

There is a unique input channel for each combination of source, source type, and host that the indexer encounters.
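
The idea of per-stream state can be illustrated with a toy Python sketch. It is not how the indexer is implemented: the ToyChannel class and the ingest function are hypothetical, the channel holds only an unfinished line rather than the linebreaker, aggregator, punct, and props.conf state listed above, and the sketch assumes one channel per source, source type, and host combination.

```python
from collections import defaultdict


class ToyChannel:
    """Holds the line-breaking state for one stream: the unfinished line."""

    def __init__(self):
        self.partial = ""

    def feed(self, chunk):
        """Return the complete lines in chunk; keep any trailing fragment."""
        data = self.partial + chunk
        *lines, self.partial = data.split("\n")
        return lines


# One channel per (source, sourcetype, host) key, created on first use.
channels = defaultdict(ToyChannel)


def ingest(source, sourcetype, host, chunk):
    """Route a chunk to the channel for its stream and return whole lines."""
    return channels[(source, sourcetype, host)].feed(chunk)
```

Because each stream has its own channel, a chunk that ends in the middle of a line from one file does not get joined to the next chunk from a different file.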

Description of an inactive input channel

An indexer does not, for performance and memory usage reasons, keep input channels around forever. After a channel has not been used for a while, for example, after data for a particular source, source type, and host tuple has not appeared for a while, a channel becomes eligible for reuse by a different stream. Splunk Enterprise has several settings that control the recycling behavior for inactive channels. You configure these settings in the limits.conf configuration file.

For example, suppose the indexer has just encountered a new stream. As a result, it needs an input channel into which it can save the state of this stream as it ingests it. At this point, it must decide whether to create a new input channel, thus using more memory, or to reuse an inactive channel, and incur a performance penalty if that inactive channel becomes active again.

When determining whether to reuse an inactive input channel, the indexer uses the following decision process:

  1. If the number of inactive channels is less than or equal to the value set for the lowater_inactive setting in the limits.conf configuration file, the indexer creates an input channel.
  2. Otherwise, if the number of inactive channels is greater than the value set for the max_inactive setting, or the age of the oldest inactive channel is greater than the value set for the inactive_eligibility_age_seconds setting in the limits.conf file, the indexer recycles the oldest inactive input channel.
  3. Otherwise, the indexer creates an input channel.

Put another way:

  • The indexer always creates a new input channel if the number of inactive channels is at or below the lowater_inactive value.
  • The indexer always recycles an inactive input channel if the number of inactive channels is above the max_inactive value.
  • If the number of inactive channels is above the lowater_inactive value and at or below the max_inactive value, the indexer recycles the oldest inactive channel if that channel is older than inactive_eligibility_age_seconds seconds; otherwise, it creates a new input channel.
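
The decision process above can be sketched as a small Python function. This is an illustration of the rules as described in this topic, not Splunk source code: the function name choose_channel_action and its return values are hypothetical, and the handling of the exact boundary values follows the numbered steps.

```python
def choose_channel_action(
    inactive_count,
    oldest_inactive_age_seconds,
    lowater_inactive,
    max_inactive,
    inactive_eligibility_age_seconds,
):
    """Return "create" or "recycle" for a newly encountered stream.

    The three setting names mirror limits.conf; the remaining names are
    hypothetical and exist only for this sketch.
    """
    # At or below the low-water mark: always create a new channel.
    if inactive_count <= lowater_inactive:
        return "create"
    # Above the maximum: always recycle the oldest inactive channel.
    if inactive_count > max_inactive:
        return "recycle"
    # In between: recycle only if the oldest channel is old enough.
    if oldest_inactive_age_seconds > inactive_eligibility_age_seconds:
        return "recycle"
    return "create"
```

For example, with hypothetical limits of lowater_inactive = 10 and max_inactive = 20, an indexer that holds 15 inactive channels recycles the oldest one only when its age exceeds inactive_eligibility_age_seconds.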

The max_inactive setting also accepts the value auto. This value configures the indexer to adjust the max_inactive limit based on the amount of memory that is present in the machine that runs the instance.

Configure manual or automatic inactive input channel limits

You can adjust the maximum number of inactive input channels that an indexer keeps available. Increasing this number manually increases the amount of memory that the indexer uses. Lower numbers mean less memory usage by the indexer, but an increase in the number of new input channels that the indexer creates, which can significantly reduce performance depending on the number of sources, source types, and hosts that the indexer encounters while it processes incoming data. Each inactive input channel takes around 5 kB of memory.
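
To get a feel for the memory cost, the per-channel figure can be turned into a rough estimate. This is a back-of-the-envelope sketch: the function name is hypothetical, the 5 kB figure is the approximate value stated above, and treating 1 kB as 1024 bytes is an assumption.

```python
# "Around 5 kB" per inactive channel, per this topic. Treating 1 kB as
# 1024 bytes is an assumption made for this sketch.
BYTES_PER_INACTIVE_CHANNEL = 5 * 1024


def estimate_inactive_channel_memory_mib(max_inactive):
    """Approximate memory, in MiB, held by max_inactive inactive channels."""
    if max_inactive < 0:
        raise ValueError("max_inactive must not be negative")
    return max_inactive * BYTES_PER_INACTIVE_CHANNEL / (1024 * 1024)
```

Under these assumptions, 1024 inactive channels account for roughly 5 MiB. Actual usage can differ, because the per-channel figure is approximate.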

  1. On the indexer where you want to adjust inactive input channel limits, open a shell, command prompt, or text editor.
  2. Open the $SPLUNK_HOME/etc/system/local/limits.conf file for editing.
  3. In this file, locate the [input_channels] stanza. If the stanza does not exist in the file, create it.
  4. Under the [input_channels] stanza, add the following line:
    max_inactive = <positive integer>
    

    If you want the indexer to manage the number of inactive channels automatically, use the following line instead:

    max_inactive = auto
    
  5. Save the file and close it.
  6. Restart the Splunk Enterprise instance to apply the change.
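
If you generate this configuration from a script, a small helper can check the value before you write it. The following Python sketch only builds the stanza text described in the steps above. The function name render_input_channels_stanza is hypothetical, and the sketch does not read or modify an existing limits.conf file.

```python
def render_input_channels_stanza(max_inactive):
    """Return the [input_channels] stanza text for a max_inactive value.

    Accepts a positive integer or the string "auto", the two forms
    described in the steps above. Anything else raises ValueError.
    """
    if isinstance(max_inactive, str):
        if max_inactive.strip().lower() != "auto":
            raise ValueError("the only accepted string value is 'auto'")
        value = "auto"
    elif (
        isinstance(max_inactive, bool)
        or not isinstance(max_inactive, int)
        or max_inactive <= 0
    ):
        raise ValueError("max_inactive must be a positive integer or 'auto'")
    else:
        value = str(max_inactive)
    return f"[input_channels]\nmax_inactive = {value}\n"
```

The returned text still has to be merged into $SPLUNK_HOME/etc/system/local/limits.conf by hand or by your own tooling, and the instance restarted, as described in the steps.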

For more information on the max_inactive, lowater_inactive, and inactive_eligibility_age_seconds settings for limits.conf, see the limits.conf specification file.

Can't find forwarded data?

Confirm that the forwarder functions properly and is visible to the indexer. You can use the Distributed Management Console (DMC) to troubleshoot Splunk topologies and get to the root of any forwarder issues. See Monitoring Splunk Enterprise for details.

Last modified on 27 October, 2021

This documentation applies to the following versions of Splunk Cloud Platform: 9.3.2408, 8.2.2112, 8.2.2201, 8.2.2202, 8.2.2203, 9.0.2205, 9.0.2208, 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403, 9.2.2406 (latest FedRAMP release)

