All DSP releases prior to DSP 1.4.0 use Gravity, a Kubernetes orchestrator, which has been announced end-of-life. We have replaced Gravity with an alternative component in DSP 1.4.0. Therefore, we will no longer provide support for versions of DSP prior to DSP 1.4.0 after July 1, 2023. We advise all of our customers to upgrade to DSP 1.4.0 in order to continue to receive full product support from Splunk.
Upgrade the Splunk Data Stream Processor from 1.2.4 to 1.3.0
This topic describes how to upgrade the Splunk Data Stream Processor (DSP) to 1.3.0.
Currently, we only support upgrades from 1.2.4 to 1.3.0. If you are on any of the *-patch02 versions of DSP, you must first upgrade to 1.2.4. See Upgrade the Splunk Data Stream Processor to 1.2.4.
To upgrade to DSP 1.3.0 successfully, you must complete several prerequisite tasks before starting the upgrade. Make sure to read through all of the Before you upgrade sections in this topic and complete the relevant prerequisite tasks before upgrading DSP.
DSP does not provide a means of downgrading to previous versions. If you need to revert to an older DSP release, uninstall the upgraded version and reinstall the version you want.
Before you upgrade
Complete the following tasks before upgrading DSP. If you don't complete these tasks, you might encounter issues such as pipeline failures.
- Make sure you are using DSP 1.2.4
- Review known issues
- Review the features planned for deprecation or removal
- Identify connections that need to be replaced
- Disable scheduled jobs
- Remove all machine learning functions
Make sure you are using DSP 1.2.4
You must upgrade to DSP 1.3.0 from DSP 1.2.4. If you are using a patch version of DSP, such as DSP 1.2.1-patch02 or DSP 1.2.2-patch02, then you must upgrade to DSP 1.2.4 first. See Upgrade the Splunk Data Stream Processor to 1.2.4.
Review known issues
Review the known issues related to the upgrade process. Depending on what functions you have in your pipelines, you might need to complete some additional steps to restore those pipelines after the upgrade is complete.
Review the features planned for deprecation or removal
Review the Features planned for deprecation or removal to see what features are scheduled for future deprecation.
Identify connections that need to be replaced
Identify any pipelines that connect to the following data sources or destinations, and track how these connections are being used:
- Amazon Kinesis Data Streams
- Apache Kafka
- Apache Pulsar
- Microsoft Azure Event Hubs
Specifically, keep track of the names of all the pipelines where these connections are being used and whether each connection points to a data source or a data destination. After upgrading to DSP 1.3.0, you need to replace these connections with ones that use the new source-specific and sink-specific connectors. See the Update pipelines to use new DSP 1.3.0 connections section on this page for more information.
After you upgrade to DSP 1.3.0, pipelines with the aforementioned connections will continue to run successfully. However, these connections won't appear on the Connections page, and you won't be able to modify them. You can only delete them.
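To help inventory these connections, you can also list all connections through the DSP Streams API from the command line. This is a minimal sketch rather than an official procedure: it assumes you have a bearer token from the Splunk Cloud Services CLI (see the Enable automatic updates for lookups section for how to obtain one), and it pipes the JSON response through jq for readability, which must be installed separately.
# List all connections and pretty-print the JSON response. Look for
# connections to Kinesis, Kafka, Pulsar, or Event Hubs that you'll
# need to recreate after the upgrade.
curl -X GET -k "https://<DSP_HOST>:31000/default/streams/v3beta1/connections" \
-H "Authorization: Bearer <my-bearer-token>" \
-H "Content-Type: application/json" | jq '.'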
Disable scheduled jobs
If you're running any scheduled data collection jobs using the following source connectors, you must disable those jobs before upgrading DSP:
- Amazon CloudWatch Metrics
- Amazon S3
- AWS Metadata
- Google Cloud Monitoring
- Microsoft 365
- Microsoft Azure Monitor
If you don't disable all the scheduled jobs in these connectors before upgrading your DSP deployment, the Kubernetes container image name used by these connectors is not updated. See the troubleshooting topic ImagePullBackoff status shown in Kubernetes after upgrading DSP for more information.
- In DSP, select the Connections page.
- For each connection that uses one of the source connectors listed earlier, do the following:
- Select the connection to open it for editing.
- Toggle the Scheduled parameter off.
- Save your changes.
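If a scheduled job is left enabled and the upgrade leaves connector pods unable to pull their container image, you can spot the symptom from a master node. This is an informal check, assuming kubectl is available and configured on the node:
# List pods in all namespaces and filter for the ImagePullBackOff status.
kubectl get pods --all-namespaces | grep -i imagepullbackoff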
Remove all machine learning functions
If you are not using the Streaming ML plugin, skip this step. The Streaming ML Plugin beta feature and all of the machine learning functions included in the plugin have been removed in DSP 1.3.0. See Feature deprecation and removal notices.
Before upgrading DSP, you must remove the following machine learning functions from all active pipelines:
- Adaptive Thresholding
- Apply ML Model
- Datagen
- Drift Detection
- Pairwise Categorical Outlier Detection
- Sentiment Analysis
- Sequential Outlier Detection
- Time Series Decomposition (STL)
- estdc
- perc
If you don't remove these functions, the pipelines containing them will fail and all other pipelines will also need to be restarted after upgrading.
- In DSP, select the Pipelines page.
- For each pipeline that uses a machine learning function, do the following:
- Open the pipeline for editing. If the pipeline is active, click Deactivate.
- Delete the machine learning function from the pipeline.
- Click Save, and reactivate the pipeline if it was active before. When you reactivate a pipeline, you must select where you want to resume data ingestion. See Using activation checkpoints to activate your pipeline in the Use the Data Stream Processor manual for more information.
Upgrade the Splunk Data Stream Processor
Once you've prepared your DSP environment for upgrade by completing the tasks described in the Before you upgrade section, follow these steps to upgrade DSP.
- Download the new DSP tarball on one of the master nodes of your cluster.
- Extract the tarball.
tar xf <dsp-version>.tar
- Navigate to the extracted file.
cd <dsp-version>
- (Optional) If your environment has a small root volume (6GB or less of free space) in /tmp, your upgrade might fail when you run out of space. Choose a different directory to write temporary files to during the upgrade process. A disk space check example appears after these steps.
export TMPDIR=/<directory-on-larger-volume>
- From the extracted file directory, run the upgrade script.
sudo ./upgrade
Upgrading can take a while, depending on the number of nodes you have in your cluster. When the upgrade is done, the message Upgrade completed successfully is shown, followed by some garbage collection logs. Once you see those logs, you can start using the latest version of DSP. Any pipelines that were active before the upgrade are reactivated.
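Before you run the upgrade script, you can check whether the optional TMPDIR step applies to your environment. This check uses standard Linux tools and makes no assumptions about DSP itself:
# Show available space on the filesystem that backs /tmp.
df -h /tmp
# If the Avail column shows roughly 6G or less, set TMPDIR to a directory on a larger volume.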
Validate the upgrade
Log in to DSP to confirm that your upgrade was successful.
- In the browser you use to access the DSP UI, clear the browser cache.
- Navigate to the DSP UI.
https://<DSP_HOST>:30000/
- On the login page, enter the following:
User: dsp-admin
Password: <the dsp-admin password>
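Optionally, you can confirm that the DSP UI endpoint is reachable from the command line before opening a browser. This is an informal check; as with the other examples in this topic, the -k flag skips certificate verification:
# Request the response headers from the DSP UI endpoint.
# A 200 or redirect response indicates the UI is up.
curl -k -I "https://<DSP_HOST>:30000/"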
After upgrading
After successfully upgrading DSP, complete the following tasks:
- Delete old DSP directories
- Re-enable scheduled jobs
- Update pipelines to use new DSP 1.3.0 connections
- Enable automatic updates for lookups
- Upgrade the Splunk App for DSP
- Review known issues and apply workarounds
After upgrading to the latest version of DSP, any command-line operations must be performed in the new upgraded directory on the master node.
Delete old DSP directories
On each node, delete the directories containing the old version of DSP. This is an optional clean-up step.
To delete old DSP directories, run the following command on each node:
rm -r <dsp-version-upgraded-from>
Re-enable scheduled jobs
Re-enable the scheduled jobs that you disabled in the Disable scheduled jobs section.
- In DSP, select the Connections page.
- For each scheduled data collection job that you need to re-enable, do the following:
- Select the connection where the scheduled job is defined.
- Toggle the Scheduled parameter on.
- Save your changes.
Update pipelines to use new DSP 1.3.0 connections
Starting in DSP 1.3.0, connectors that supported both source and sink functions have been replaced by connectors that specifically support source functions only or sink functions only. If you have any pipelines that connect to Amazon Kinesis Data Streams, Apache Kafka, Apache Pulsar, or Microsoft Azure Event Hubs, you need to recreate the connections using the new source-specific and sink-specific connectors and then update your pipelines to use these new connections.
Pipelines with the aforementioned connections will continue to run successfully. However, these connections won't appear on the Connections page, and you won't be able to modify them. You can only delete them.
- Recreate your connections using the new connectors as needed. For example, if your pipeline connects to Amazon Kinesis Data Streams as a data source, then recreate that Kinesis connection using the Connector for Amazon Kinesis Data Streams Source. For detailed instructions on creating connections, see the Connect to Data Sources and Destinations with DSP manual.
- Select the Pipelines page.
- For each pipeline that needs to be updated to use a source-specific or sink-specific connection, do the following:
- Open the pipeline for editing. If the pipeline is active, click Deactivate.
- Select the source or sink function for which you need to update the connection.
- On the View Configurations tab, click the Delete icon next to the Connection id field to delete the connection that's being replaced.
- Set Connection id to the appropriate source-specific or sink-specific connection that you created during step 1.
- Click Save, and reactivate the pipeline if it was active before. When you reactivate a pipeline, you must select where you want to resume data ingestion. See Using activation checkpoints to activate your pipeline in the Use the Data Stream Processor manual for more information.
Enable automatic updates for lookups
Starting in version 1.3.0, DSP automatically checks for updates to CSV lookup files. However, any active pipelines that are using CSV lookups from a previous version of DSP are not automatically migrated to this new behavior. Complete the following steps to enable automatic updates for lookups.
As an alternative option, you can also enable automatic updates by uploading a new version of a previous lookup file and restarting all pipelines using the lookup file.
- Log in to the Splunk Cloud Services CLI. Copy the bearer token that is returned and save it to a preferred location.
./scloud login --verbose
- Get a list of all of the CSV lookups being used.
curl -X GET -k "https://<DSP_HOST>:31000/default/streams/v3beta1/connections?connectorId=b5dfcb94-142e-470f-9045-ad0b83603bdb" \
-H "Authorization: Bearer <my-bearer-token>" \
-H "Content-Type: application/json"
- Copy and save the id for each CSV lookup that you want to enable automatic updates for in a preferred location. Any CSV lookup that does not have the check_for_new_connection_secs and trim_edge_whitespace configurations does not have automatic updates enabled.
- Enable automatic updates for each CSV lookup, using the id values that you saved.
curl -X PATCH -k "https://<DSP_HOST>:31000/default/streams/v3beta1/connections/<csv-lookup-id>" \
-H "Authorization: Bearer <my-bearer-token>" \
-H "Content-Type: application/json" \
-d '{ "data": {"check_for_new_connection_secs":60, "trim_edge_whitespace":null}}'
- (Optional) Verify your changes.
curl -X GET -k "https://<DSP_HOST>:31000/default/streams/v3beta1/connections" \
-H "Authorization: Bearer <my-bearer-token>" \
-H "Content-Type: application/json"
- Open the DSP UI and restart all pipelines using this CSV lookup. To restart a pipeline, deactivate it and reactivate it again.
You only need to restart the affected pipelines once. Afterwards, DSP will automatically detect when you upload a new version of a CSV lookup file and active pipelines will automatically switch to using the latest version of the CSV file.
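If you have many CSV lookups to update, you can script the PATCH call from the previous steps. The following bash sketch is a hypothetical helper, not part of the product: it assumes you have already gathered the lookup id values in step 3 and pasted them into the LOOKUP_IDS variable, and it reuses the same endpoint and request body shown above.
#!/bin/bash
# Hypothetical helper: enable automatic updates for a list of CSV lookups.
# Replace the placeholders with your own values before running.
TOKEN="<my-bearer-token>"
DSP_HOST="<DSP_HOST>"
LOOKUP_IDS="<csv-lookup-id-1> <csv-lookup-id-2>"

for id in $LOOKUP_IDS; do
  # Same PATCH request as in the step above, applied to each lookup id.
  curl -X PATCH -k "https://${DSP_HOST}:31000/default/streams/v3beta1/connections/${id}" \
    -H "Authorization: Bearer ${TOKEN}" \
    -H "Content-Type: application/json" \
    -d '{ "data": {"check_for_new_connection_secs":60, "trim_edge_whitespace":null}}'
done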
Upgrade the Splunk App for DSP
If you have the Splunk App for DSP installed on your Splunk DSP cluster, you must upgrade it to the latest version. See Install the Splunk App for DSP for more information.
Review known issues and apply workarounds
There are some known issues that can occur when upgrading. Review the Known issues for DSP topic, and follow any workarounds that apply to you.