All DSP releases prior to DSP 1.4.0 use Gravity, a Kubernetes orchestrator that has reached end-of-life. In DSP 1.4.0, Gravity has been replaced with an alternative component. Therefore, after July 1, 2023, we will no longer provide support for versions of DSP prior to 1.4.0. We advise all of our customers to upgrade to DSP 1.4.0 in order to continue receiving full product support from Splunk.
Cluster autoscaling for DSP on Google Kubernetes Engine
Data Stream Processor (DSP) deployments built on Google Kubernetes Engine (GKE) can be configured to use the GKE cluster autoscaling feature to increase or decrease node resources. Search for "About cluster autoscaling" in the Google Cloud documentation for more information about this GKE feature.
When the Splunk Data Stream Processor is running on Google Kubernetes Engine, the GKE cluster autoscaler must be turned on when you provision the GKE cluster. When the cluster autoscaler is turned on, GKE manages all Kubernetes cluster resources such as nodes, memory, and CPU. As a result, the process for scaling DSP services up or down on GKE differs from the standard DSP scaling process.
To scale DSP up or down on GKE, you must use the commands provided on this page. Because GKE automatically increases or decreases the physical resources to match the needs of the DSP cluster, you cannot manually add nodes to or remove nodes from the DSP cluster. Manually adding or removing DSP cluster nodes on GKE can cause system instability.
Using GKE cluster autoscaling with DSP involves setting a minimum and maximum number of nodes in the GKE UI, and then configuring DSP with node counts within that range each time you want to scale your resources up or down.
Disclaimer
GKE autoscaling is different from the scaling capabilities in DSP, and DSP does not support triggering GKE autoscaling directly. Triggering the GKE autoscaler by independent means does not necessarily result in corresponding scaling in DSP, and can destabilize the DSP cluster. Scale DSP resources using the documented DSP commands only; the GKE autoscaler then scales the underlying node resources as needed.
Prerequisites
- Turned on the GKE cluster autoscaling feature when provisioning your DSP cluster in GKE.
- Set your desired Minimum number of nodes and Maximum number of nodes in the GKE UI. (A gcloud CLI alternative is sketched after this list.)
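If you provision the GKE cluster from the command line rather than the Google Cloud console, the autoscaler and node bounds can be set at cluster creation time. The following gcloud sketch is illustrative only; the cluster name, zone, machine type, and node counts are assumptions, not values prescribed by this documentation.
# Create a GKE cluster with the cluster autoscaler turned on.
# All values shown are illustrative assumptions.
gcloud container clusters create dsp-cluster \
    --zone us-central1-a \
    --machine-type e2-standard-8 \
    --num-nodes 5 \
    --enable-autoscaling \
    --min-nodes 5 \
    --max-nodes 10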
Steps
Scale up
- Increase the resources on the task manager and job manager. (A worked example with illustrative values follows these steps.)
./dsp config set flink jm_cpu_limit=<number of nodes> jm_cpu_request=<number of nodes>
./dsp config set flink tm_cpu_limit=<number of nodes> tm_cpu_request=<number of nodes>
- Scale the task manager replica count.
./dsp config set flink tm_replicas=<number of nodes>
- Deploy your changes.
./dsp deploy flink
- Verify the list of all active nodes by running the following command.
kubectl get nodes
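For example, the following sequence scales a hypothetical deployment up to 8 task manager replicas. The values are illustrative assumptions only and must stay within the Minimum and Maximum number of nodes set in the GKE UI.
# Illustrative values: raise job manager and task manager CPU, then replicas, to 8
./dsp config set flink jm_cpu_limit=8 jm_cpu_request=8
./dsp config set flink tm_cpu_limit=8 tm_cpu_request=8
./dsp config set flink tm_replicas=8
./dsp deploy flink
# Confirm that GKE has added nodes to match the new resource needs
kubectl get nodes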
Scale down
- Scale the task manager replica count. This number must be less than the value you entered in step 2 of Scale up. (A worked example follows these steps.)
./dsp config set flink tm_replicas=<number of nodes>
- Deploy your changes.
./dsp deploy flink
- After a few minutes, GKE scales down the node count. Verify the list of all active nodes by running the following command.
kubectl get nodes
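For example, assuming the deployment was previously scaled up to 8 replicas as in the sketch above, the following sequence scales it back down to 4. The value is illustrative only.
# Illustrative value: reduce the replica count below its previous setting
./dsp config set flink tm_replicas=4
./dsp deploy flink
# After a few minutes, confirm that GKE has removed the unneeded nodes
kubectl get nodes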
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.4.1, 1.4.2, 1.4.3, 1.4.4, 1.4.5