Splunk® Enterprise

Getting Data In

Troubleshoot the input process

This topic discusses some initial steps you can take to troubleshoot the data input process.

Determine why you do not find the events you expect

When you add an input to your Splunk deployment, that input gets added relative to the app you are in. Some apps write input data to a specific index. If you cannot find data that you are certain is in your Splunk deployment, confirm that you are looking at the right index. See Retrieve events from indexes in the Search Manual. You might want to add indexes to the list of default indexes for the role you are using.

  • For more information about roles, refer to the topic about roles in the Securing Splunk Enterprise manual.
  • For more information about troubleshooting data input issues, read the rest of this topic or see I can't find my data! in the Troubleshooting Manual.

Note: If you have Splunk Enterprise and add inputs by editing inputs.conf, the inputs might not be recognized immediately. Splunk Enterprise looks for inputs every 24 hours, starting from the time it was last restarted, so if you add a new stanza to monitor a directory or file, it could take up to 24 hours for Splunk Enterprise to start indexing the contents of that directory or file. To ensure that your input is immediately recognized and indexed, add the input through Splunk Web or the CLI, or restart Splunk services after making edits to inputs.conf.

Troubleshoot your tailed files

You can use the FileStatus Representational State Transfer (REST) endpoint to get the status of your tailed files. For example:

curl https://serverhost:8089/services/admin/inputstatus/TailingProcessor:FileStatus
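The same request can be issued from a script. The following is a minimal Python sketch, not an official client. It assumes the management port is 8089 as in the example above, that the instance accepts HTTP basic authentication, and that you supply your own credentials and TLS settings. The function names are hypothetical, and the response body is returned unparsed.

```python
import base64
import urllib.request

# Path taken from the curl example above.
FILE_STATUS_PATH = "/services/admin/inputstatus/TailingProcessor:FileStatus"


def file_status_url(host, port=8089):
    """Build the FileStatus endpoint URL for a given management host and port."""
    return f"https://{host}:{port}{FILE_STATUS_PATH}"


def build_request(host, username, password, port=8089):
    """Create a GET request carrying an HTTP basic authentication header.

    Assumption: the instance accepts basic authentication for this account.
    """
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(file_status_url(host, port))
    request.add_header("Authorization", f"Basic {token}")
    return request


def fetch_file_status(host, username, password, port=8089, context=None):
    """Send the request and return the raw response body as text.

    Pass an ssl.SSLContext as `context` if the server certificate needs
    custom trust settings; by default the standard verification applies.
    """
    request = build_request(host, username, password, port)
    with urllib.request.urlopen(request, context=context, timeout=30) as response:
        return response.read().decode("utf-8")
```

For example, fetch_file_status("serverhost", "<username>", "<password>") returns the response text for the host in the curl example, provided that the host is reachable and the credentials are valid.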

You can also monitor the fishbucket, a subdirectory used to keep track of how much of a file's contents has been indexed. In Splunk Enterprise deployments, the fishbucket resides at $SPLUNK_DB/fishbucket/splunk_private_db. In Splunk Cloud deployments you do not have physical access to this subdirectory.

To monitor the fishbucket, use the REST endpoint. Review the REST API Reference manual for additional information.

Troubleshoot monitor inputs

For a variety of information on dealing with monitor input issues, read "Troubleshooting Monitor Inputs" in the Community Wiki.

Troubleshoot ingestion congestion

Sometimes, ingestion can slow down for no apparent reason. One possible cause is the number of inactive input channels available on your Splunk Enterprise indexers.

Description of an input channel

An indexer must track the state of each unique "stream" of data that it processes. For example, when it breaks lines in data that it has ingested from a set of tailed files, the indexer receives data from these files in an order that cannot be predicted, and parts of various files can be interleaved with one another. To keep the line breaking of one file from interfering with the line breaking of another, the indexer tracks the state of each file with a data structure called an "input channel".

An input channel stores a variety of information, including:

  • The state of the linebreaker
  • The state of the aggregator
  • The punct state
  • The settings in props.conf for the input

There is a unique input channel for each (source, sourcetype, host) "stream" that the indexer encounters.
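To illustrate why per-stream state matters, here is a simplified Python sketch. It is not how the indexer is implemented: it models only a newline-based line breaker and omits the aggregator, punct state, and props.conf settings. The class and method names are hypothetical. It keeps one channel per (source, sourcetype, host) key so that interleaved chunks from different files do not corrupt each other's partial lines.

```python
from dataclasses import dataclass, field


@dataclass
class InputChannel:
    """Hypothetical per-stream state; only a partial-line buffer is modeled."""
    pending: str = ""                            # text not yet ended by a newline
    events: list = field(default_factory=list)   # complete lines seen so far


class ChannelTable:
    """Maps each (source, sourcetype, host) tuple to its own InputChannel."""

    def __init__(self):
        self._channels = {}

    def feed(self, source, sourcetype, host, chunk):
        """Add a chunk to its stream and return the lines it completed."""
        key = (source, sourcetype, host)
        channel = self._channels.setdefault(key, InputChannel())
        data = channel.pending + chunk
        # Everything before the last newline is complete; the rest stays pending.
        *complete, channel.pending = data.split("\n")
        channel.events.extend(complete)
        return complete


table = ChannelTable()
# Chunks from two files arrive interleaved, splitting a line of a.log in two.
first = table.feed("/var/log/a.log", "syslog", "web01", "first li")
other = table.feed("/var/log/b.log", "syslog", "web01", "other line\n")
second = table.feed("/var/log/a.log", "syslog", "web01", "ne\n")
```

Because each stream has its own buffer, the split line from a.log is reassembled as "first line" even though a chunk from b.log arrived in between.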

Description of an inactive input channel

For performance and memory usage reasons, an indexer does not keep input channels indefinitely. After a channel has not been used for a while, for example, because no data for a particular source, source type, and host tuple has arrived, the channel becomes eligible for reuse by a different "stream". Splunk Enterprise has several settings that control the recycling behavior for inactive channels. You configure these settings in the limits.conf configuration file.

For example, suppose the indexer has just encountered a new stream. As a result, it needs an input channel into which it can save the state of this stream as it ingests it. At this point, it must decide whether to create a new input channel, thus using more memory, or to reuse an inactive channel, and thus incur a performance penalty if that inactive channel becomes active again.

To decide whether to reuse an inactive input channel, the indexer follows this decision process:

  1. If the number of inactive channels is less than or equal to the value set for the lowater_inactive setting in limits.conf, create a new input channel.
  2. Otherwise, if the number of inactive channels is greater than the value set for the max_inactive setting in limits.conf, or the age of the oldest inactive channel is greater than the value set for inactive_eligibility_age_seconds in limits.conf, recycle the oldest inactive input channel.
  3. Otherwise, create a new input channel.

Put another way:

  • The indexer always creates a new input channel if the number of inactive channels is at or below lowater_inactive.
  • The indexer always recycles an inactive input channel if the number of inactive channels is above max_inactive.
  • If the number of inactive channels is above lowater_inactive and at or below max_inactive, the indexer recycles the oldest inactive channel if it is older than inactive_eligibility_age_seconds; otherwise, it creates a new input channel.
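The decision process described above can be sketched as a small Python function. This is an illustration of the documented rules only, not the indexer's actual code. The function name and the sample threshold values are hypothetical and are not Splunk defaults.

```python
def choose_channel_action(
    inactive_count,
    oldest_inactive_age_seconds,
    lowater_inactive,
    max_inactive,
    inactive_eligibility_age_seconds,
):
    """Return "create" or "recycle" for a newly encountered stream."""
    # At or below the low-water mark: always create a new channel.
    if inactive_count <= lowater_inactive:
        return "create"
    # Too many inactive channels, or the oldest one is old enough: recycle it.
    if (
        inactive_count > max_inactive
        or oldest_inactive_age_seconds > inactive_eligibility_age_seconds
    ):
        return "recycle"
    # Between the two thresholds with no sufficiently old channel: create.
    return "create"


# Hypothetical thresholds, chosen only to exercise each branch.
limits = dict(lowater_inactive=10, max_inactive=100, inactive_eligibility_age_seconds=300)

below_lowater = choose_channel_action(5, 1000, **limits)
above_max = choose_channel_action(150, 1, **limits)
between_old = choose_channel_action(50, 400, **limits)
between_young = choose_channel_action(50, 100, **limits)
```

With these sample values, the four calls exercise, in order, the low-water rule, the max_inactive rule, the age rule, and the fall-through case.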

The max_inactive setting also accepts the value auto. This value configures the indexer to adjust the max_inactive setting based on the amount of memory that is present on the machine that runs the instance.

Configure manual or automatic inactive input channel limits

You can adjust the maximum number of inactive input channels that an indexer keeps available. Increasing this number increases the amount of memory that the indexer uses. Decreasing it reduces memory usage but increases the number of new input channels that the indexer creates, which can significantly reduce performance depending on the number of sources, source types, and hosts that the indexer encounters while it processes incoming data. Each inactive input channel takes around 5 KB of memory.
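As a rough sizing aid, the per-channel figure above can be turned into a memory estimate. The Python sketch below assumes exactly 5,000 bytes per inactive channel, which is only an approximation of "around 5 KB". The function name and the sample value of 20,000 channels are hypothetical.

```python
# Assumption: "around 5 KB" is approximated as 5,000 bytes per inactive channel.
BYTES_PER_INACTIVE_CHANNEL = 5_000


def estimate_inactive_channel_memory(max_inactive):
    """Estimate the bytes used if max_inactive inactive channels are kept."""
    return max_inactive * BYTES_PER_INACTIVE_CHANNEL


# Example: a hypothetical limit of 20,000 inactive channels.
estimated_bytes = estimate_inactive_channel_memory(20_000)
estimated_megabytes = estimated_bytes / 1_000_000
```

Under these assumptions, 20,000 inactive channels correspond to roughly 100 MB of memory.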

  1. On the indexer where you want to adjust inactive input channel limits, open a shell, command prompt, or text editor.
  2. Open the $SPLUNK_HOME/etc/system/local/limits.conf file for editing.
  3. In this file, locate the [input_channels] stanza. If the stanza does not exist in the file, create it.
  4. Under the [input_channels] stanza, add the following line:
    max_inactive = <positive integer>
    

    If you want the indexer to manage the number of inactive channels automatically, change the line to:

    max_inactive = auto
    
  5. Save the file and close it.
  6. Restart the Splunk Enterprise instance to apply the change.
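If you prefer to script steps 3 and 4, the following Python sketch shows one way to do it on the text of a limits.conf file. It is a simplified, hypothetical helper that assumes one "key = value" setting per line and does not handle every syntax that a .conf file can contain. Review the result before you save it, and restart the instance afterward as described in step 6.

```python
def set_max_inactive(conf_text, value):
    """Return conf_text with max_inactive set under [input_channels].

    Creates the stanza if it is missing. Commented-out lines are left alone.
    """
    lines = conf_text.splitlines()
    new_line = f"max_inactive = {value}"
    in_stanza = False
    stanza_start = None

    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A stanza header: note whether it is the one we want.
            in_stanza = stripped == "[input_channels]"
            if in_stanza:
                stanza_start = index
        elif in_stanza and stripped.split("=", 1)[0].strip() == "max_inactive":
            # The setting already exists in the stanza: replace it in place.
            lines[index] = new_line
            return "\n".join(lines) + "\n"

    if stanza_start is not None:
        # The stanza exists but lacks the setting: add it after the header.
        lines.insert(stanza_start + 1, new_line)
    else:
        # The stanza does not exist: append it, separated by a blank line.
        if lines and lines[-1].strip():
            lines.append("")
        lines.extend(["[input_channels]", new_line])
    return "\n".join(lines) + "\n"


updated = set_max_inactive("[input_channels]\nmax_inactive = 100\n", "auto")
```

In this example, updated contains the [input_channels] stanza with the single line max_inactive = auto.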

For more information on the max_inactive, lowater_inactive, and inactive_eligibility_age_seconds settings for limits.conf, see the limits.conf specification file.

Can't find forwarded data?

Confirm that the forwarder functions properly and is visible to the indexer. You can use the Distributed Management Console (DMC) to troubleshoot Splunk topologies and get to the root of any forwarder issues. Read Monitoring Splunk Enterprise for details.

Last modified on 02 September, 2020

This documentation applies to the following versions of Splunk® Enterprise: 7.3.7, 8.0.6, 8.0.7, 8.1.0