Splunk® Data Stream Processor

Use the Data Stream Processor

Using activation checkpoints to activate your pipeline

When you first create a pipeline, it is in a deactivated state. You must activate a pipeline to start ingesting data. The first time you activate a pipeline, it starts with no state. The Data Stream Processor frequently adds checkpoints to active pipelines, and these checkpoints are used if the pipeline fails or is restarted. By default, when a data pipeline is deactivated, its progress state is saved as a savepoint and used the next time the data pipeline is activated.

Checkpoints and savepoints

Checkpoints and savepoints are two separate features in the Data Stream Processor. They serve different needs, but both help ensure consistency and fault tolerance, and both help make sure that no data is dropped when you deactivate and reactivate a pipeline. Checkpoints are primarily used to provide a recovery mechanism in case of unexpected errors. The Data Stream Processor adds checkpoints frequently and automatically, without any user interaction. In contrast, savepoints are created on a best-effort basis after a particular user action, such as deactivating a pipeline. Savepoints save the progress state of each function in the pipeline so that the pipeline picks up right where it left off when you reactivate it.
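
The difference between the two mechanisms can be illustrated with a toy model. The following Python sketch is not how the Data Stream Processor is implemented. The class name, the checkpoint interval, and the offset bookkeeping are all assumptions made for illustration. It shows only the behavior described above: checkpoints are taken automatically as data flows, a savepoint is taken when the user deactivates the pipeline, and a restore prefers the savepoint when one exists.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyPipeline:
    """Hypothetical model of pipeline progress. Not a Data Stream Processor API."""

    checkpoint_interval: int = 100  # assumed: checkpoint every 100 records
    offset: int = 0                 # number of records processed so far
    checkpoints: List[int] = field(default_factory=list)
    savepoint: Optional[int] = None

    def process(self, records: int) -> None:
        """Process records, checkpointing automatically at each interval."""
        for _ in range(records):
            self.offset += 1
            if self.offset % self.checkpoint_interval == 0:
                self.checkpoints.append(self.offset)

    def deactivate(self) -> None:
        """User action: record the exact progress as a savepoint."""
        self.savepoint = self.offset

    def restore_offset(self) -> int:
        """Resume from the savepoint if one exists, else the last checkpoint."""
        if self.savepoint is not None:
            return self.savepoint
        return self.checkpoints[-1] if self.checkpoints else 0


# A pipeline that is deactivated cleanly resumes exactly where it stopped.
clean = ToyPipeline()
clean.process(250)
clean.deactivate()
print(clean.restore_offset())    # 250

# A pipeline that fails unexpectedly falls back to its last checkpoint.
crashed = ToyPipeline()
crashed.process(250)
print(crashed.restore_offset())  # 200
```

In this model the savepoint captures the exact position, while the checkpoint can lag behind by up to one checkpoint interval, which is why the two features complement each other.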

Data pipeline activation options

There are three activation options. If you are activating your pipeline for the first time, choose Activate in the UI with no other settings enabled. The following table describes the available activation options.

Activation option | Description | Additional notes
Activate | Activate a pipeline. | Use this option first before using the other activation options. If you are reactivating a previously activated pipeline, this option reactivates the pipeline from the pipeline savepoint.
Skip Restore State | Reactivate a pipeline without attempting to restore any state from a savepoint or, in the case that a savepoint doesn't exist, a checkpoint. | When you use this option, the pipeline starts ingesting data from the initial position of your data source, ignoring the last savepoint. If the initial position of your data source is earliest, or TRIM HORIZON, then you could experience either data duplication, for data still in the stream that was already ingested, or data loss, for data that expired in the stream before it could be ingested. If the initial position of your data source is LATEST, then you might experience data loss, because data sent between the time the last checkpoint was saved and the time the pipeline resumes at the latest position is missed.
Allow Non Restored State | Reactivate a pipeline using a best-effort, partial restore from the savepoint. | Modifying a pipeline might prevent the existing savepoint from being restored, which normally results in a pipeline activation failure. When you use this option, the Data Stream Processor uses the existing savepoint to recover as much state as possible and ignores all errors while doing so. Use this option instead of Skip Restore State when you want to use the last savepoint to recover as much state as possible.

Enabling Allow Non Restored State or Skip Restore State can lead to data loss.
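
The way the three activation options interact with saved state can be summarized as a small decision function. The following Python sketch is illustrative only. The enum, the function name, and the parameters `has_saved_state` and `state_is_restorable` are hypothetical and do not correspond to any Data Stream Processor API. The sketch also simplifies by treating a savepoint and a fallback checkpoint as a single "saved state".

```python
from enum import Enum


class ActivationOption(Enum):
    ACTIVATE = "Activate"
    SKIP_RESTORE_STATE = "Skip Restore State"
    ALLOW_NON_RESTORED_STATE = "Allow Non Restored State"


def plan_activation(
    option: ActivationOption,
    has_saved_state: bool,
    state_is_restorable: bool = True,
) -> str:
    """Describe the expected outcome of an activation (illustrative model)."""
    # Skip Restore State ignores any saved state, even if one exists.
    if option is ActivationOption.SKIP_RESTORE_STATE:
        return "start from the initial position of the data source"
    # First activation: there is nothing to restore.
    if not has_saved_state:
        return "start with no state"
    # Unmodified pipeline: the saved state can be restored in full.
    if state_is_restorable:
        return "restore state from the savepoint"
    # Modified pipeline: only Allow Non Restored State tolerates the mismatch.
    if option is ActivationOption.ALLOW_NON_RESTORED_STATE:
        return "restore as much state as possible, ignoring errors"
    return "activation fails"


# A pipeline that was edited after its savepoint was taken.
for option in ActivationOption:
    outcome = plan_activation(option, has_saved_state=True, state_is_restorable=False)
    print(f"{option.value}: {outcome}")
```

The loop at the end covers the case where the options actually differ: a pipeline that was modified after deactivation, so that its savepoint no longer matches.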

Pipeline deactivation options

In addition to pipeline activation options, you also have the following deactivation option available.

Deactivation option | Description | Additional notes
skipSavepoint | Deactivate a pipeline without creating a savepoint. | By default, the Data Stream Processor creates a savepoint when you deactivate a pipeline to ensure that your pipeline picks up right where it left off when you reactivate it. However, attempts to deactivate your pipeline might fail if the pipeline is blocked and no data is flowing through, which prevents a savepoint from being created. To bypass this issue, enable this setting to deactivate the pipeline without creating a savepoint. In most cases, you will not experience data loss when you reactivate the pipeline, because the Data Stream Processor uses the last checkpoint instead. However, if the pipeline was blocked for a long time, data loss might occur because an older checkpoint is used. You can only modify this setting using the Splunk Cloud Services CLI or the Streams service.
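
The trade-off behind skipSavepoint can be sketched as follows. This Python example is a simplified model, not the Streams service or the Splunk Cloud Services CLI. The function name, the exception class, and the offset parameters are hypothetical, and the model assumes that a blocked pipeline always fails to create a savepoint, whereas the behavior described above is only that deactivation might fail.

```python
class BlockedPipelineError(RuntimeError):
    """Hypothetical error: a savepoint could not be created."""


def deactivate(
    current_offset: int,
    last_checkpoint_offset: int,
    blocked: bool,
    skip_savepoint: bool = False,
) -> int:
    """Return the offset that the next activation would restore from."""
    if skip_savepoint:
        # No savepoint is attempted, so the restore uses the last checkpoint.
        return last_checkpoint_offset
    if blocked:
        # Assumed: no data is flowing, so the savepoint cannot complete.
        raise BlockedPipelineError("cannot create a savepoint while blocked")
    # Default behavior: the savepoint records the exact current progress.
    return current_offset


# Default deactivation of a healthy pipeline resumes at the exact position.
print(deactivate(250, 200, blocked=False))                      # 250

# A blocked pipeline can still be deactivated by skipping the savepoint,
# at the cost of resuming from the older checkpoint position.
print(deactivate(250, 200, blocked=True, skip_savepoint=True))  # 200
```

The larger the gap between the current position and the last checkpoint, the more the reactivated pipeline depends on that older data still being available in the source.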
Last modified on 23 June, 2021

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0, 1.2.1-patch02, 1.2.1, 1.2.2-patch02, 1.2.4, 1.3.0
