All DSP releases prior to DSP 1.4.0 use Gravity, a Kubernetes orchestrator that has reached end-of-life. In DSP 1.4.0, Gravity has been replaced with an alternative component. Therefore, after July 1, 2023, we will no longer provide support for versions of DSP prior to 1.4.0. We advise all of our customers to upgrade to DSP 1.4.0 in order to continue receiving full product support from Splunk.
Cluster autoscaling for DSP on Google Kubernetes Engine
Data Stream Processor (DSP) deployments built on Google Kubernetes Engine (GKE) can be configured to use the GKE cluster autoscaling feature to increase or decrease node resources. Search for "About cluster autoscaling" in the Google Cloud documentation for more information about this GKE feature.
When the Splunk Data Stream Processor is running on Google Kubernetes Engine, the GKE cluster autoscaler must be turned on when you provision the GKE cluster. When the cluster autoscaler is turned on, GKE manages all Kubernetes cluster resources such as nodes, memory, and CPU. As a result, the process for scaling DSP services up or down on GKE differs from the standard DSP scaling process.
To scale DSP up or down on GKE, you must use the commands provided on this page. Because GKE automatically increases or decreases the physical resources to match the needs of the DSP cluster, you cannot manually add nodes to or remove nodes from the DSP cluster. Manually adding or removing DSP cluster nodes on GKE can cause system instability.
Using GKE cluster autoscaling with DSP involves setting a minimum and maximum number of nodes in the GKE UI, and then configuring DSP with node counts within that range each time you want to scale your resources up or down.
Disclaimer
GKE autoscaling is different from the scaling capabilities in DSP, and DSP does not support triggering GKE autoscaling directly. Triggering the GKE autoscaler by independent means does not necessarily result in corresponding scaling in DSP, and can destabilize the DSP cluster. Scale DSP resources using the documented DSP commands only; the GKE autoscaler then scales the underlying node resources as needed.
Prerequisites
- Turned on the GKE cluster autoscaling feature when provisioning your DSP cluster in GKE.
- Set your desired Minimum number of nodes and Maximum number of nodes in the GKE UI. (A gcloud CLI alternative is sketched after this list.)
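If you provision the GKE cluster from the command line rather than the Google Cloud console, the autoscaler and node bounds can be set at cluster creation time. The following gcloud sketch is illustrative only; the cluster name, zone, machine type, and node counts are assumptions, not values prescribed by this documentation.
# Create a GKE cluster with the cluster autoscaler turned on.
# All values shown are illustrative assumptions.
gcloud container clusters create dsp-cluster \
    --zone us-central1-a \
    --machine-type e2-standard-8 \
    --num-nodes 5 \
    --enable-autoscaling \
    --min-nodes 5 \
    --max-nodes 10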
Steps
Scale up
- Increase the resources on the task manager and job manager. (A worked example with illustrative values follows these steps.)
./dsp config set flink jm_cpu_limit=<number of nodes> jm_cpu_request=<number of nodes>
./dsp config set flink tm_cpu_limit=<number of nodes> tm_cpu_request=<number of nodes>
- Scale the task manager replica count.
./dsp config set flink tm_replicas=<number of nodes>
- Deploy your changes.
./dsp deploy flink
- Verify the list of all active nodes by running the following command.
kubectl get nodes
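For example, the following sequence scales a hypothetical deployment up to 8 task manager replicas. The values are illustrative assumptions only and must stay within the Minimum and Maximum number of nodes set in the GKE UI.
# Illustrative values: raise job manager and task manager CPU, then replicas, to 8
./dsp config set flink jm_cpu_limit=8 jm_cpu_request=8
./dsp config set flink tm_cpu_limit=8 tm_cpu_request=8
./dsp config set flink tm_replicas=8
./dsp deploy flink
# Confirm that GKE has added nodes to match the new resource needs
kubectl get nodes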
Scale down
- Scale the task manager replica count. This number must be less than the value you entered in step 2 of Scale up. (A worked example follows these steps.)
./dsp config set flink tm_replicas=<number of nodes>
- Deploy your changes.
./dsp deploy flink
- After a few minutes, GKE scales down the node count. Verify the list of all active nodes by running the following command.
kubectl get nodes
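For example, assuming the deployment was previously scaled up to 8 replicas as in the sketch above, the following sequence scales it back down to 4. The value is illustrative only.
# Illustrative value: reduce the replica count below its previous setting
./dsp config set flink tm_replicas=4
./dsp deploy flink
# After a few minutes, confirm that GKE has removed the unneeded nodes
kubectl get nodes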
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.4.1, 1.4.2, 1.4.3, 1.4.4, 1.4.5