Splunk Cloud Platform

Getting Data In

Use persistent queues to help prevent data loss

Persistent queuing lets you store the data in an input queue on disk. In a Splunk Cloud Platform deployment, persistent queues can help prevent data loss if a forwarder that you configured to send data to your Splunk Cloud Platform instance gets backed up. In a Splunk Enterprise deployment, persistent queues work on both forwarders and indexers. You can't configure persistent queues directly on a Splunk Cloud Platform instance.

By default, forwarders and indexers have an in-memory input queue of 500 KB. If the input stream arrives faster than the forwarder or indexer can process it, the input queue eventually fills up, and the consequences depend on the input type. For network data sent over UDP, incoming data is dropped and lost. For other types of data inputs, the application that generates the data can get backed up.

By implementing persistent queues, you can help prevent this data drop or loss from happening. With persistent queuing, after the in-memory queue is full, the forwarder or indexer writes the input stream to files on disk. It then processes data from the in-memory and disk queues until it reaches the point when it can again start processing directly from the data stream.
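
As a sketch of how the two queues relate, the following inputs.conf stanza sets both for a hypothetical UDP input. The queueSize setting controls the in-memory queue, and persistentQueueSize controls the on-disk queue that takes over after the in-memory queue fills. The port number and both sizes are illustrative assumptions, not recommendations.

```
# Hypothetical UDP input on port 514 (port and sizes are assumed values)
[udp://514]
# In-memory input queue; the default is 500KB
queueSize = 1MB
# On-disk queue used after the in-memory queue is full
persistentQueueSize = 100MB
```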

While persistent queues help prevent data loss if processing gets backed up, you can still lose data if the forwarder or indexer crashes. For example, the forwarder holds some input data in the in-memory queue as well as in the persistent queue files. The in-memory data can get lost if the forwarder crashes. Similarly, data that is in the parsing or indexing pipeline but that has not yet been written to disk can get lost in a crash.

When can you use persistent queues?

Persistent queuing is available for certain types of inputs, but not all. Generally speaking, persistent queuing is available for inputs of an ephemeral nature, such as network inputs, but isn't available for inputs that have their own form of persistence, such as monitoring files.

Persistent queues are available for these input types:

  • Network inputs that use the TCP protocol
  • Network inputs that use the UDP protocol
  • First-In, First-Out (FIFO) inputs
  • Scripted inputs
  • Windows Event Log inputs
  • HTTP Event Collector tokens

Persistent queues aren't available for these input types:

  • Monitor inputs
  • Batch inputs
  • File system change monitor

Configure a persistent queue

Use the inputs.conf configuration file to configure a persistent queue. You can configure the persistent queue on the universal forwarder that you configured to send data to Splunk Cloud Platform. You can also configure persistent queues on Splunk Enterprise indexers. The procedure is the same whether you perform it directly on the indexer or on the forwarder that sends data to the indexer.

Inputs don't share queues. You configure a persistent queue in the stanza for the specific input.

  1. On the machine that forwards data to Splunk Cloud Platform, use a text editor to open the $SPLUNK_HOME/etc/system/local/inputs.conf file for editing.
  2. Locate or add the input stanza where you want to enable persistent queuing.
  3. Specify the following setting within that input stanza:
    persistentQueueSize = <integer>[KB|MB|GB|TB]
    
  4. Save the file and close it.
  5. Restart the forwarder.

For more information about the inputs.conf file, see inputs.conf in the Splunk Enterprise Admin Manual.

Example of configuring a persistent queue

Here's an example that specifies a 10 MB persistent queue for a TCP network input:

[tcp://9994]
persistentQueueSize=10MB

Here's another example that specifies a 15 MB persistent queue for a Windows Event Log input:

[WinEventLog]
persistentQueueSize=15MB

The Windows Event Log monitor accepts a persistent queue configuration for the default Windows Event Log stanza only. You cannot configure a persistent queue for a specific Event Log channel. You can configure a persistent queue for a specific Windows host monitoring input.
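
Because inputs don't share queues, each stanza carries its own persistentQueueSize value. The following sketch assumes a hypothetical scripted input and a hypothetical FIFO input. The script path, FIFO path, interval, and sizes are placeholders chosen for illustration.

```
# Hypothetical scripted input; the script path and interval are assumptions
[script://$SPLUNK_HOME/bin/scripts/poll_status.sh]
interval = 60
persistentQueueSize = 50MB

# Hypothetical FIFO input; the path is an assumption
[fifo:///var/run/myapp/events.fifo]
persistentQueueSize = 20MB
```

Each setting applies only to the stanza that contains it, so an input without the setting gets no persistent queue.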

Persistent queue location

The persistent queue has a hardcoded location, which varies according to the type of input.

For network inputs, the persistent queue is located at $SPLUNK_HOME/var/run/splunk/[tcpin|udpin]/pq__<port>.

The file name contains two underscores: pq__, not pq_.

See the following examples:

  • The persistent queue for TCP port 2012 is $SPLUNK_HOME/var/run/splunk/tcpin/pq__2012.
  • The persistent queue for UDP port 2012 is $SPLUNK_HOME/var/run/splunk/udpin/pq__2012.

For FIFO inputs, the persistent queue resides in $SPLUNK_HOME/var/run/splunk/fifoin/<encoded path>.

For scripted inputs, the persistent queue resides in $SPLUNK_HOME/var/run/splunk/exec/<encoded path>. For both FIFO and scripted inputs, the <encoded path> is derived from the input stanza in the inputs.conf file.

Last modified on 25 August, 2023

This documentation applies to the following versions of Splunk Cloud Platform: 8.2.2112, 8.2.2201, 8.2.2202, 8.2.2203, 9.0.2205, 9.0.2208, 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403 (latest FedRAMP release), 9.2.2406

