Troubleshoot the input process
Following are some initial steps you can take to troubleshoot the data input process on Splunk Cloud.
Determine why you do not find the events you expect
When you add an input to your Splunk Cloud deployment, that input gets added relative to the app you are in. Some apps write input data to a specific index. If you cannot find data that you are certain is in your Splunk Cloud deployment, confirm that you are looking at the right index. See Retrieve events from indexes in the Search Manual. You might want to add indexes to the list of default indexes for the role you are using.
- For more information about roles, see Add and edit roles in the Securing the Splunk Platform manual.
- For more information about troubleshooting data input issues, read the rest of this topic or see I can't find my data! in the Troubleshooting Manual.
If you use Splunk Enterprise and add inputs by editing the inputs.conf configuration file, Splunk Enterprise might not recognize the inputs immediately. Splunk Enterprise looks for inputs every 24 hours, starting from the time it was last restarted, so if you add a new stanza to monitor a directory or file, it could take up to 24 hours for Splunk Enterprise to start indexing the contents of that directory or file. To ensure that your input is immediately recognized and indexed, add the input through Splunk Web or the CLI, or restart Splunk services after you make edits to the inputs.conf file.
Troubleshoot your tailed files
You can use the FileStatus Representational State Transfer (REST) endpoint to get the status of your tailed files. For example:
You can also monitor the fishbucket, a subdirectory that Splunk software uses to keep track of how much of a file's contents have been indexed. In Splunk Enterprise deployments, the fishbucket resides at $SPLUNK_DB/fishbucket/splunk_private_db. In Splunk Cloud deployments you do not have physical access to this subdirectory.
To monitor the fishbucket, use the REST endpoint. Review the REST API Reference manual for additional information.
Troubleshoot ingestion congestion on Splunk Enterprise
Sometimes, Splunk Enterprise data ingestion can slow down for no apparent reason. One possible cause is the number of inactive input channels available on your Splunk Enterprise indexers.
Description of an input channel
An indexer must track the state of each unique stream of data that it processes. For example, when it breaks up lines of data that it has ingested from a set of tailed files, the indexer receives data from these files in an order that cannot be predicted. Parts of various files can be interleaved with one another. To keep the line breaking of one file from interfering with the line breaking of another, the indexer tracks the state of each file with a data structure called an input channel.
An input channel stores a variety of information, including the following information:
- The state of the linebreaker
- The state of the aggregator
- The punct state
- The settings in props.conf for the input
There is a unique input channel for each combination of source, source type, and host that the indexer encounters.
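The following minimal sketch illustrates why per-stream state matters. It is not Splunk's implementation: the `InputChannel` class, its `partial_line` field, and the `feed` function are hypothetical names, and the only state modeled is a simplified line breaker that splits on newlines.

```python
from dataclasses import dataclass, field


@dataclass
class InputChannel:
    """Hypothetical stand-in for the per-stream state an indexer keeps."""
    partial_line: str = ""                      # text seen so far without a newline
    lines: list = field(default_factory=list)   # completed lines for this stream


def feed(channels, key, chunk):
    """Route a chunk to the channel for its (source, sourcetype, host) key."""
    channel = channels.setdefault(key, InputChannel())
    data = channel.partial_line + chunk
    # Everything before the last newline is complete; the remainder is kept
    # as state until the next chunk for the same stream arrives.
    *complete, channel.partial_line = data.split("\n")
    channel.lines.extend(complete)


channels = {}
stream_a = ("/var/log/a.log", "syslog", "host1")
stream_b = ("/var/log/b.log", "syslog", "host1")

# Chunks from the two files arrive interleaved, and mid-line.
feed(channels, stream_a, "first li")
feed(channels, stream_b, "alpha\nbe")
feed(channels, stream_a, "ne\nsecond\n")
feed(channels, stream_b, "ta\n")
```

Because each stream has its own channel, the partial line from one file is never joined to a chunk from the other.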
Description of an inactive input channel
An indexer does not, for performance and memory usage reasons, keep input channels around forever. After a channel has not been used for a while, for example, after data for a particular source, source type, and host tuple has not appeared for a while, a channel becomes eligible for reuse by a different stream. Splunk Enterprise has several settings that control the recycling behavior for inactive channels. You configure these settings in the limits.conf configuration file.
For example, suppose the indexer has just encountered a new stream. As a result, it needs an input channel into which it can save the state of this stream as it ingests it. At this point, it must decide whether to create a new input channel, thus using more memory, or to reuse an inactive channel, and incur a performance penalty if that inactive channel becomes active again.
When determining whether or not to use an inactive input channel, the indexer uses the following decision process:
- If the number of inactive channels is less than or equal to the value set for the lowater_inactive setting in the limits.conf configuration file, it creates an input channel.
- If the number of inactive channels is greater than the value set for the max_inactive setting, or the age of the oldest inactive channel is greater than the value set for the inactive_eligibility_age_seconds setting in the limits.conf file, it recycles the oldest inactive input channel.
- Otherwise, it creates an input channel.
Put another way:
- The indexer always creates a new input channel if it is currently below the lowater_inactive value.
- The indexer always recycles an inactive input channel if it is currently above the max_inactive value.
- If the indexer is above the lowater_inactive value and below the max_inactive value at the same time, it recycles the oldest inactive channel if that channel is older than inactive_eligibility_age_seconds seconds; otherwise, it creates a new input channel.
The max_inactive setting now accepts the value auto. This value configures the indexer to adjust the max_inactive setting based on the amount of memory that is present in the machine that runs the instance.
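The decision process described above can be sketched as a small function. This is an illustration of the documented rules only, not Splunk's code: the `ChannelLimits` class and `choose_action` function are hypothetical names, the numeric limits in the usage lines are made-up values rather than Splunk defaults, and the `auto` value for max_inactive is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelLimits:
    """Hypothetical container for the three limits.conf settings."""
    lowater_inactive: int
    max_inactive: int
    inactive_eligibility_age_seconds: int


def choose_action(inactive_count, oldest_inactive_age_seconds, limits):
    """Return "create" or "recycle" for a newly encountered stream."""
    # At or below the low-water mark: always create a new channel.
    if inactive_count <= limits.lowater_inactive:
        return "create"
    # Above the maximum: always recycle the oldest inactive channel.
    if inactive_count > limits.max_inactive:
        return "recycle"
    # In between: recycle only if the oldest inactive channel is old enough.
    # (How the exact boundary at max_inactive is treated is an assumption.)
    if oldest_inactive_age_seconds > limits.inactive_eligibility_age_seconds:
        return "recycle"
    return "create"


# Illustrative values only, not Splunk defaults.
limits = ChannelLimits(lowater_inactive=10, max_inactive=100,
                       inactive_eligibility_age_seconds=300)
print(choose_action(5, 1000, limits))   # below the low-water mark
print(choose_action(150, 0, limits))    # above the maximum
print(choose_action(50, 400, limits))   # in between, oldest channel is old
print(choose_action(50, 100, limits))   # in between, oldest channel is young
```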
Configure manual or automatic inactive input channel limits
You can adjust the maximum number of inactive input channels that an indexer keeps available. Increasing this number increases the amount of memory that the indexer uses. A lower number means less memory usage by the indexer, but an increase in the number of new input channels that the indexer creates, which can significantly reduce performance depending on the number of sources, source types, and hosts that the indexer encounters while it processes incoming data. Each inactive input channel takes around 5 kB of memory.
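As a rough aid when choosing a manual value, the per-channel figure above can be turned into a back-of-the-envelope estimate. The function name is hypothetical, and the sketch assumes exactly 5,000 bytes per inactive channel, which is only an approximation of "around 5 kB".

```python
# Assumption: "around 5 kB" is treated as exactly 5,000 bytes per channel.
BYTES_PER_INACTIVE_CHANNEL = 5_000


def estimate_inactive_channel_memory(max_inactive):
    """Approximate bytes used if max_inactive inactive channels are kept."""
    if max_inactive < 0:
        raise ValueError("max_inactive must not be negative")
    return max_inactive * BYTES_PER_INACTIVE_CHANNEL


# 1,000 inactive channels come to roughly 5 MB under this assumption.
print(estimate_inactive_channel_memory(1000) / 1_000_000, "MB")
```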
- On the indexer where you want to adjust inactive input channel limits, open a shell or command prompt or text editor.
- Open the $SPLUNK_HOME/etc/system/local/limits.conf file for editing.
- In this file, locate the [input_channels] stanza. If the stanza does not exist in the file, create it.
- Under the [input_channels] stanza, add the following line:
max_inactive = <positive integer>
If you want the indexer to manage the number of inactive channels automatically, change the line to the following:
max_inactive = auto
- Save the file and close it.
- Restart the Splunk Enterprise instance to apply the change.
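If you generate configuration with a script, the stanza from the steps above could be produced and validated as in the following sketch. The helper name `render_input_channels_stanza` is hypothetical, and the sketch only builds the text of the stanza; it does not read or modify an existing limits.conf file.

```python
def render_input_channels_stanza(max_inactive):
    """Return the [input_channels] stanza text for a given max_inactive value.

    Accepts either the string "auto" or a positive integer, matching the
    two forms of the setting described in the steps above.
    """
    if max_inactive != "auto":
        is_int = isinstance(max_inactive, int) and not isinstance(max_inactive, bool)
        if not is_int or max_inactive <= 0:
            raise ValueError("max_inactive must be a positive integer or 'auto'")
    return f"[input_channels]\nmax_inactive = {max_inactive}\n"


# 500 is an illustrative value, not a recommendation.
print(render_input_channels_stanza(500))
print(render_input_channels_stanza("auto"))
```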
For more information on the inactive_eligibility_age_seconds setting and the other input channel settings for limits.conf, see the limits.conf specification file.
Can't find forwarded data?
Confirm that the forwarder functions properly and is visible to the indexer. You can use the Distributed Management Console (DMC) to troubleshoot Splunk topologies and get to the root of any forwarder issues. See Monitoring Splunk Enterprise for details.
This documentation applies to the following versions of Splunk Cloud™: 8.2.2104, 8.0.2007, 8.1.2008, 8.1.2009, 8.1.2011, 8.1.2012 (latest FedRAMP release), 8.1.2101, 8.1.2103