Splunk® Data Stream Processor

Install and administer the Data Stream Processor



On April 3, 2023, the Splunk Data Stream Processor reached its end of sale, and it will reach its end of life on February 28, 2025. If you are an existing DSP customer, please reach out to your account team for more information.

All DSP releases prior to DSP 1.4.0 use Gravity, a Kubernetes orchestrator that has been declared end-of-life. We have replaced Gravity with an alternative component in DSP 1.4.0. Therefore, after July 1, 2023, we will no longer provide support for versions of DSP prior to DSP 1.4.0. We advise all of our customers to upgrade to DSP 1.4.0 to continue to receive full product support from Splunk.

Upgrade the Splunk Data Stream Processor on the Google Cloud Platform

You can do a blue-green upgrade of the Splunk Data Stream Processor (DSP) on the Google Cloud Platform (Google Cloud or GCP). This type of upgrade reduces downtime by running two identical GCP environments during the DSP upgrade process. At any given time, only one GCP environment is live and handling your DSP components. After you prepare your new GCP environment and install the new version of DSP, you can switch each DSP pipeline over to the new environment. This upgrade gives you the opportunity to troubleshoot and correct any errors during the DSP pipeline migration and prevent data loss. You can perform this upgrade only when starting from a DSP version later than 1.3.0, but these instructions are also available for planning purposes or if you want to switch GCP regions.

This topic shows you how to do the following upgrade tasks:

  • Create a new Google Cloud environment.
  • Back up the resources in the previous Google Cloud environment.
  • Set up the new Google Cloud environment using the resources of the previous Google Cloud environment.
  • Install the new version of DSP.
  • Switch the pipelines over to the new DSP cluster.
  • Decommission the previous Google Cloud environment.

Prerequisites

Before you can upgrade the Splunk Data Stream Processor on Google Cloud Platform, you must have an existing Splunk Data Stream Processor deployment on the Google Cloud Platform.

This blue-green upgrade procedure will require two GCP environments, complete with all required resources including GCP Cloud SQL instances, GCP Buckets, GCP Service Accounts, and GCP Compute Engine instances. After the upgrade is complete and your DSP pipelines are moved over to the new GCP environment, you can decommission and delete the old GCP environment.

Step 1: Create a new Google Cloud environment

First you must create a new Google Cloud environment that will run in parallel with your existing Google Cloud environment.

  1. Create a Google Cloud auto mode or custom mode VPC network. We recommend that you reuse the network, subnetwork, and firewall rules used in your original configuration. This will simplify the migration and make it easier to ensure that your configurations are identical between the old and new environments.
  2. Create a dedicated Google Cloud service account. You must create a new Google Cloud service account because the roles in the service account are conditional on the DSP cluster name.
  3. Create GCP Cloud SQL for PostgreSQL instances. Create new passwords when you create the GCP Cloud SQL instances. Do not create the databases and users when you create the new GCP Cloud SQL instances for the new environment. You will restore the old databases and users in a later step.
  4. Create and set up a Google Cloud Storage bucket. Create a multi-region Google Cloud Storage bucket with a name in the format {prefix}-{new-cluster-name}-{suffix} (see the example commands after this list).
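
For example, you can create these resources with the gcloud CLI. This is only a sketch: the region, PostgreSQL version, and machine tier shown here are assumptions, and the network and service account names are placeholders. Match these values to your original environment and to the DSP installation documentation.

    gcloud compute networks create <new-vpc-name> --subnet-mode=auto
    gcloud iam service-accounts create <new-dsp-service-account> --display-name="DSP service account for {new-cluster-name}"
    gcloud sql instances create {prefix}-{new-cluster-name}-streams-{suffix} --database-version=POSTGRES_12 --region=us-central1 --tier=db-custom-2-7680
    gsutil mb -l US gs://{prefix}-{new-cluster-name}-{suffix}

Repeat the gcloud sql instances create command for the hec, iac, s2s, and uaa instances.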

Step 2: Backup the GCP Cloud SQL instances in your original GCP environment

Create an on-demand backup for each GCP Cloud SQL instance. Search for "Creating an on-demand backup" in the Google Cloud documentation.

When the backups are complete, you will have backups of the following GCP Cloud SQL instances:

  • {prefix}-{old-cluster-name}-hec-{suffix}
  • {prefix}-{old-cluster-name}-iac-{suffix}
  • {prefix}-{old-cluster-name}-s2s-{suffix}
  • {prefix}-{old-cluster-name}-streams-{suffix}
  • {prefix}-{old-cluster-name}-uaa-{suffix}
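
If you prefer the command line to the Google Cloud Console, you can create the same on-demand backups with the gcloud CLI. This sketch uses the placeholder instance names listed above; note the backup IDs reported by the list command, because you need them when you restore the backups in a later step.

    gcloud sql backups create --instance={prefix}-{old-cluster-name}-hec-{suffix}
    gcloud sql backups create --instance={prefix}-{old-cluster-name}-iac-{suffix}
    gcloud sql backups create --instance={prefix}-{old-cluster-name}-s2s-{suffix}
    gcloud sql backups create --instance={prefix}-{old-cluster-name}-streams-{suffix}
    gcloud sql backups create --instance={prefix}-{old-cluster-name}-uaa-{suffix}
    gcloud sql backups list --instance={prefix}-{old-cluster-name}-hec-{suffix}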

Step 3: Use the Google Cloud Secret Manager to copy the secrets

The Google Cloud Platform provides Secret Manager, which the Splunk Data Stream Processor can use to store application secrets. You can use Secret Manager to view, copy, and update the secret values. All secrets in the clusters follow the format {prefix}-{cluster-name}-{secret-name}. Search for "Creating and accessing secrets" in the Google Cloud documentation. Example gcloud commands for copying secrets follow the steps below.

  1. Navigate to the Secret Manager in the Google Cloud Console.
  2. For each secret in the old environment, create a corresponding secret in the new environment. The secrets must follow the format {prefix}-{new-cluster-name}-{secret-name}.
  3. Select the most recent version of each secret in the old environment and copy the secret value to the corresponding secret in the new environment.
  4. Repeat these steps until all old secret values have been copied to the corresponding new secrets.
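
As an alternative to the Console, the following gcloud sketch copies the latest version of one secret from the old naming scheme to the new one. The secret name is a placeholder and the replication policy is an assumption; repeat the commands for every secret in the cluster.

    gcloud secrets create {prefix}-{new-cluster-name}-{secret-name} --replication-policy=automatic
    gcloud secrets versions access latest --secret={prefix}-{old-cluster-name}-{secret-name} | gcloud secrets versions add {prefix}-{new-cluster-name}-{secret-name} --data-file=-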

Step 4: Restore the GCP Cloud SQL instances to your new GCP environment

At this point you will have a new environment with five GCP Cloud SQL instances. These five GCP Cloud SQL instances can now be populated with the backup you created in "Step 2: Backup the GCP Cloud SQL instances in your original GCP environment". The following table shows you the GCP Cloud SQL instances in the new cluster and the corresponding GCP Cloud SQL backup from the old environment.

New GCP Cloud SQL instance                        Old GCP Cloud SQL instance
{prefix}-{new-cluster-name}-hec-{suffix}          {prefix}-{old-cluster-name}-hec-{suffix}
{prefix}-{new-cluster-name}-iac-{suffix}          {prefix}-{old-cluster-name}-iac-{suffix}
{prefix}-{new-cluster-name}-s2s-{suffix}          {prefix}-{old-cluster-name}-s2s-{suffix}
{prefix}-{new-cluster-name}-streams-{suffix}      {prefix}-{old-cluster-name}-streams-{suffix}
{prefix}-{new-cluster-name}-uaa-{suffix}          {prefix}-{old-cluster-name}-uaa-{suffix}
  1. Restore the backup for each GCP Cloud SQL instance. Search for "Restoring an instance from backup" in the Google Cloud documentation for information.
  2. Update the password for the user in each database instance to the password that you created in "Step 1: Create a new Google Cloud environment" (see the example commands after this table).
    GCP Cloud SQL instance name                       User      Password
    {prefix}-{new-cluster-name}-hec-{suffix}          hec       Password created when the instance was created
    {prefix}-{new-cluster-name}-iac-{suffix}          splunk    Password created when the instance was created
    {prefix}-{new-cluster-name}-s2s-{suffix}          s2s       Password created when the instance was created
    {prefix}-{new-cluster-name}-streams-{suffix}      streams   Password created when the instance was created
    {prefix}-{new-cluster-name}-uaa-{suffix}          uaa       Password created when the instance was created
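
If you are scripting the restore instead of using the Console, the equivalent gcloud commands look like the following sketch for the streams instance. The backup ID comes from the gcloud sql backups list output on the old instance, and the user name and password are the ones shown in the table above; repeat for the hec, iac, s2s, and uaa instances.

    gcloud sql backups list --instance={prefix}-{old-cluster-name}-streams-{suffix}
    gcloud sql backups restore <backup-id> --restore-instance={prefix}-{new-cluster-name}-streams-{suffix} --backup-instance={prefix}-{old-cluster-name}-streams-{suffix}
    gcloud sql users set-password streams --instance={prefix}-{new-cluster-name}-streams-{suffix} --prompt-for-password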

Step 5: Copy the JAR files from the old GCP environment to the new GCP environment

Use the Google Cloud Console to copy the JAR files from {prefix}-{old-cluster-name}-{suffix}/{cluster-name} to {prefix}-{new-cluster-name}-{suffix}/{cluster-name}. Search for "Copying, renaming, and moving objects" in the Google Cloud documentation.
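
From the command line, a recursive copy with gsutil achieves the same result. The bucket and folder names below follow the placeholder format used in this topic.

    gsutil cp -r gs://{prefix}-{old-cluster-name}-{suffix}/{cluster-name}/* gs://{prefix}-{new-cluster-name}-{suffix}/{cluster-name}/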

Step 6: Create an installation file for Google Cloud and install the Splunk Data Stream Processor

Once all the Google Cloud resources are created in the new environment, you must create a config.yml file to install the Splunk Data Stream Processor, copy over any additional configuration items from the old environment, and deploy your Splunk Data Stream Processor cluster.

Make sure you meet all prerequisites given in Create an installation file for Google Cloud and install the Splunk Data Stream Processor.

  1. Perform Steps 1 through 3 of Create an installation file for Google Cloud and install the Splunk Data Stream Processor.
  2. Log in to the old environment and run the command dsp config list to display the configuration of the old environment.
  3. Compare the configuration of the old environment with the contents of your new configuration file. If any configuration items in the old environment are not included in the new configuration file, copy them into the new configuration file (see the example workflow after these steps).
  4. Continue with Step 4 of Create an installation file for Google Cloud and install the Splunk Data Stream Processor and complete the installation.
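
One way to compare the two configurations is to capture the output of dsp config list on the old environment and review it next to the new configuration file. The file names and host below are arbitrary, and because the dsp config list output and the config.yml layout differ, treat the comparison as a checklist rather than an exact diff.

    # On the old environment, from the DSP installation directory:
    dsp config list > /tmp/old-dsp-config.txt
    # Copy the file to the new environment and compare it with the new configuration file:
    scp /tmp/old-dsp-config.txt <new-environment-host>:/tmp/
    diff -y /tmp/old-dsp-config.txt <dsp-install-dir>/config.yml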

Step 7: Reset the DSP administrator password

When you restored the databases in "Step 4: Restore the GCP Cloud SQL instances to your new GCP environment", the administrator password from the old environment was also restored. You must reset the DSP administrator password on the new environment.

  1. Log in to the new DSP environment.
  2. Navigate to the DSP install directory and run the following command.
    dsp admin reset-password

Step 8: Switch pipelines over to the new DSP cluster

Now you can switch your pipelines over from the old environment to the new environment. When you restored the backup to the new environment, the pipelines were restored in the state they were in at the moment the backup was created. This means that some pipelines will appear to be in a running state even though they are not actually running on the DSP backend. You must manually activate each pipeline before you can start using it to ingest data.

  1. Deactivate the pipelines on the new environment.
    1. List all the pipelines in the new environment.
      dsp admin pipelines
    2. Log in to DSP on the new environment with SCloud.
      ./scloud login
    3. Deactivate each pipeline.
      ./scloud streams deactivate-pipeline --id <pipeline-id>
  2. Deactivate the pipelines on the old environment, and make a note of the save points.
    1. List all the pipelines in the old environment.
      dsp admin pipelines
    2. Log in to DSP on the old environment with SCloud.
      ./scloud login
    3. Deactivate each pipeline.
      ./scloud streams deactivate-pipeline --id <pipeline-id>
    4. Make sure the pipeline deactivation was successful. If the deactivation fails, for example because the savepoint fails, the pipeline can't be migrated to the new environment without data loss. You must troubleshoot and fix the deactivation problem before completing the migration.
    5. Once the pipeline is successfully deactivated, make a note of the savepoint location. The savepoint is formatted as gs://bucket/<savepoint_location>. For example, gs://my-dsp-bucket/pipelines/flink/savepoints/default/38c4c22b-eeda-41d7-8a69-24e35ac00bbe/savepoint-8ced49-98df534b6ad7.
      dsp admin pipelines
  3. For each pipeline in the new environment, do the following.
    1. Update the savepoint for the pipeline.
      dsp admin pipelines <id> --set-savepoint gs://bucket/<savepoint_location>
    2. Activate the pipeline to resume data ingestion from where it left off (see the example after these steps). If the pipeline activation is unsuccessful, you can reactivate the pipeline on the old environment while you troubleshoot and correct any errors found on the new pipeline.
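
For example, migrating a single pipeline might look like the following. The pipeline ID is a placeholder, the savepoint path is the example from the previous step, and the activate-pipeline command is assumed to mirror the deactivate-pipeline command shown earlier; check the SCloud documentation for your DSP version.

    dsp admin pipelines <pipeline-id> --set-savepoint gs://my-dsp-bucket/pipelines/flink/savepoints/default/38c4c22b-eeda-41d7-8a69-24e35ac00bbe/savepoint-8ced49-98df534b6ad7
    ./scloud streams activate-pipeline --id <pipeline-id>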

Step 9: Optional. Decommission and delete the old GCP environment

Once all DSP pipelines have been migrated to the new GCP environment, the old GCP environment and all associated resources can safely be stopped, suspended, or deleted. Refer to the Google Cloud documentation for information about stopping, suspending, and deleting GCP resources and GCP instances.
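
For example, after you confirm that all pipelines are running correctly on the new cluster, you can remove the old resources with gcloud commands like the following. The resource names are placeholders; double-check each target before you delete it, because deletion is not reversible.

    gcloud sql instances delete {prefix}-{old-cluster-name}-streams-{suffix}
    gsutil rm -r gs://{prefix}-{old-cluster-name}-{suffix}
    gcloud compute instances delete <old-dsp-node-name> --zone=<zone>
    gcloud iam service-accounts delete <old-service-account-email>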


This documentation applies to the following versions of Splunk® Data Stream Processor: 1.4.0, 1.4.1, 1.4.2, 1.4.3

