Configure Kubernetes integration for the Splunk App for Data Science and Deep Learning
Integrate the Splunk App for Data Science and Deep Learning (DSDL) with a Kubernetes environment to run data science workloads in a scalable and secure manner. Kubernetes provides container orchestration to manage and deploy containerized applications across a cluster of machines. This integration is suitable for production environments where performance, reliability, and security are critical.
For Kubernetes documentation, see https://kubernetes.io/docs/home/.
Prerequisites
The following prerequisites must be met to configure a Kubernetes integration for DSDL:
- Splunk Enterprise installed and running.
- Splunk Machine Learning Toolkit (MLTK) and Python for Scientific Computing (PSC) installed on the Splunk Enterprise instance.
- DSDL installed on the Splunk Enterprise instance.
- Access to a Kubernetes cluster with appropriate permissions.
- The Kubernetes command-line tool (`kubectl`) configured to interact with your Kubernetes cluster.
- Network connectivity between the Splunk Enterprise instance and the Kubernetes cluster.
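As a quick check of the `kubectl` prerequisite, a sketch like the following confirms connectivity and basic permissions. The namespace is a placeholder; substitute the one you plan to use for DSDL:

```
# Confirm that kubectl can reach the cluster and that your credentials
# are allowed to manage pods (substitute your target namespace).
kubectl cluster-info
kubectl auth can-i create pods --namespace default
```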
Kubernetes configuration guidelines
Consider the following guidelines if you are configuring Kubernetes integration for DSDL:
Guideline | Description |
---|---|
Secure authentication | Use secure authentication, such as certificates or bearer tokens, with limited RBAC privileges. |
Transport layer security (TLS) | Ensure that the Kubernetes API server uses TLS, and set any external DSDL endpoints to use SSL. |
Permissions | Assign minimal permissions to manage pods and resources. |
Monitor and scale | Use Splunk Observability or cluster metrics to watch resource usage and scale as needed. |
Component updates | Keep Kubernetes, DSDL, and related components updated to their latest and compatible versions to benefit from security fixes and performance improvements. |
Set up a Kubernetes cluster
Before integrating with DSDL, set up a Kubernetes cluster that meets the following requirements:
Requirement | Details |
---|---|
Kubernetes version | Version 1.16 or higher is needed for compatibility with DSDL. |
Networking | Provide connectivity between Splunk Enterprise and the cluster. Configure network plugins such as Calico and Flannel as needed. |
Load balancer or Ingress controller | Expose services externally for production use if required. |
Persistent storage | Configure dynamic PVC provisioning if you plan to store model artifacts or data externally. |
Role-based access control (RBAC) | DSDL does not automate RBAC creation. You can manually assign RBAC details in the DSDL setup page. |
Configuration steps
Complete the following steps to configure a Kubernetes cluster:
- Install the Kubernetes cluster:
  - Use `kubeadm` or a managed provider such as Amazon EKS, Red Hat OpenShift, GKE, or AKS to install the cluster.
  - Ensure that all nodes can communicate with each other and the control plane.
- Configure the network plugin:
  - Choose a plugin that matches your cluster's version.
  - Install it using directions from the plugin's documentation.
- Set up persistent storage:
  - Install a storage provisioner such as NFS, Ceph, or AWS EBS.
  - Create a StorageClass for dynamic provisioning.
  - Test by creating and binding a sample PersistentVolumeClaim (PVC). A sample PVC manifest appears in the Storage configuration section later on this page.
- Install an ingress controller or load balancer:
  - Use an ingress controller such as the NGINX Ingress Controller, or configure a load balancer such as AWS Elastic Load Balancing (ELB).
  - Enable SSL/TLS termination if you need secure external access.
- Verify cluster functionality:
  - Deploy a simple test application, as shown in the sketch after these steps.
  - Confirm that services and the ingress controller or load balancer configurations work as expected.
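As one way to perform the verification step, the following sketch deploys and exposes a throwaway application. The deployment name and image are illustrative:

```
# Deploy and expose a throwaway test application.
kubectl create deployment hello --image=nginx
kubectl expose deployment hello --port=80 --type=NodePort

# Confirm that the pod reaches the Running state and note the assigned NodePort.
kubectl get pods -l app=hello
kubectl get svc hello

# Clean up when done.
kubectl delete svc,deployment hello
```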
Configure DSDL for Kubernetes
After the Kubernetes cluster is set up, you can configure DSDL in Splunk Enterprise to deploy and manage your containerized data science workloads:
- In DSDL, go to Configuration and then Setup.
- Select Kubernetes and enter your cluster details.
- Choose the Service type.
- Provide a hostname, for example `dsdl.apps.<cluster-domain>`, if you want a custom route.
- Test and save: DSDL attempts to deploy containers in your Kubernetes project.
Automatic deployment
After saving the necessary details on the DSDL setup page in Splunk Enterprise, DSDL automatically triggers the deployment of its containers and any required Kubernetes resources. This includes creating pods, persistent volumes, and services according to your configuration.
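To confirm that the automatic deployment succeeded, you can inspect the resources DSDL created. A minimal sketch, assuming the `dsdl-namespace` namespace described later on this page:

```
# List the pods, services, and PVCs that DSDL created (namespace is a placeholder).
kubectl get pods,svc,pvc -n dsdl-namespace

# Inspect events if a pod does not reach the Running state.
kubectl describe pod <pod-name> -n dsdl-namespace
```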
Authentication modes
DSDL supports multiple authentication methods for connecting to your Kubernetes cluster. Choose the mode that best suits your environment and security requirements:
Authentication mode | Details | When to use |
---|---|---|
Certificate and Key | Obtain certificates from a trusted CA rather than self-signing certificates. | Use for high-security environments where you have properly signed certificates. Use if you prefer mutual TLS over other mechanisms. |
User Token | Use a bearer token associated with a Kubernetes service account. | Use when you want a simple setup without managing certificates. Service accounts can have minimal or limited permissions through RBAC. |
User Login | Use a username and password for basic authentication. | Use for simple testing or development. Not suitable for production due to weaker security. |
Service Account (In-Cluster) | Use a service account automatically when Splunk Enterprise runs inside the same Kubernetes cluster. | Use if Splunk Enterprise is itself deployed on Kubernetes. Use for in-cluster authentication for DSDL tasks. |
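For the User Token mode, the following sketch shows one way to create a service account with limited RBAC privileges and mint a bearer token for the DSDL setup page. The namespace, account, and role names, as well as the verb and resource lists, are assumptions to adapt to your cluster; the `kubectl create token` command requires Kubernetes 1.24 or later:

```
# Create an isolated namespace and a service account for DSDL (names are placeholders).
kubectl create namespace dsdl
kubectl create serviceaccount dsdl-sa -n dsdl

# Grant only the permissions needed to manage workloads in that namespace.
kubectl create role dsdl-role -n dsdl \
  --verb=get,list,watch,create,update,delete \
  --resource=pods,services,deployments,persistentvolumeclaims
kubectl create rolebinding dsdl-binding -n dsdl \
  --role=dsdl-role --serviceaccount=dsdl:dsdl-sa

# Mint a bearer token to paste into the DSDL setup page (Kubernetes 1.24+).
kubectl create token dsdl-sa -n dsdl
```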
Service types
Choose how DSDL services such as notebooks and API endpoints are exposed within Kubernetes:
Service type | Details | When to use |
---|---|---|
LoadBalancer | Specify Namespace and StorageClass in the DSDL configuration. | Use for direct external access on cloud providers such as AWS, Azure, and GCP that support external load balancers. |
NodePort | Provide internal and external hostnames if needed. | Use for internal or test environments where you bind a high port on each node. Use for quick, local testing without an ingress. |
Ingress | Provide an ingress host pattern, for example `dsdl.apps.<cluster-domain>`. | Use when you want advanced routing, TLS termination, or path-based rules. An ingress controller must be installed in your cluster. |
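For the Ingress service type, a manifest with TLS termination might look like the following sketch, assuming Kubernetes 1.19 or later and an NGINX ingress controller. The hostname, namespace, TLS secret, and backend service name are placeholders; port 5000 matches the default DSDL API port listed in the firewall section:

```
# A sketch of an Ingress with TLS termination in front of the DSDL API.
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dsdl-ingress
  namespace: dsdl-namespace
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - dsdl.apps.example.com
      secretName: dsdl-tls
  rules:
    - host: dsdl.apps.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: dsdl-api
                port:
                  number: 5000
EOF
```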
Namespace and resource management guidelines
See the following guidelines for namespace and resource management when using Kubernetes:
- Use `dsdl-namespace` to create a new namespace and isolate the DSDL workloads.
- Specify the Namespace on the DSDL Configuration page.
- Set resource requests and limits. Ensure DSDL pods have enough CPU and memory if performing large-scale model training.
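One way to apply the requests-and-limits guideline is a LimitRange that sets defaults for every container in the DSDL namespace. This is a sketch; the sizing values are assumptions to tune against your actual model-training workloads:

```
# Default requests and limits for containers in the DSDL namespace (values are placeholders).
kubectl apply -f - <<EOF
apiVersion: v1
kind: LimitRange
metadata:
  name: dsdl-limits
  namespace: dsdl-namespace
spec:
  limits:
    - type: Container
      defaultRequest:
        cpu: "1"
        memory: 2Gi
      default:
        cpu: "4"
        memory: 8Gi
EOF
```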
Storage configuration
Use persistent storage for storing models, logs, and data. Complete the following steps:
- Verify the StorageClass by running `kubectl get storageclass`.
- Specify the Storage Class in the DSDL configuration.
- Check that dynamic provisioning is working by creating sample PVCs.
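A minimal sketch for that check, assuming a StorageClass named `standard`; substitute a class from the `kubectl get storageclass` output:

```
# Create a sample PVC to verify dynamic provisioning.
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dsdl-test-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  resources:
    requests:
      storage: 1Gi
EOF

# The PVC should reach the Bound state; if it stays Pending, check the provisioner.
kubectl get pvc dsdl-test-pvc
```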
Certificate guidelines
See the following guidelines for certificate management when using Kubernetes:
- Self-signed certificates trigger browser warnings and introduce potential vulnerabilities. For external production use, obtain certificates from a public CA such as Let's Encrypt or DigiCert, or from an internal certificate authority (CA).
- Include certificates in your DSDL container. Place `dltk.pem` and `dltk.key` in the `/dltk/.jupyter/` location, or specify a custom path in the DSDL configuration.
- Enable hostname verification in DSDL. Set "Check Hostname" to "Enabled" on the DSDL setup page.
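As a sketch of the certificate workflow, the following generates a private key and a certificate signing request (CSR) to submit to your CA. The common name is a placeholder; use your DSDL hostname:

```
# Generate a private key and CSR for a CA-signed certificate.
openssl req -new -newkey rsa:2048 -nodes \
  -keyout dltk.key -out dltk.csr \
  -subj "/CN=dsdl.apps.example.com"

# After your CA returns the signed certificate, save it as dltk.pem and place
# both dltk.pem and dltk.key in /dltk/.jupyter/ inside the DSDL container.
```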
Firewall considerations
DSDL requires certain ports to communicate with Kubernetes resources:
Component | Description |
---|---|
Kubernetes API | Port 6443. Outbound traffic from Splunk to manage cluster. |
DSDL API | Port 5000 or dynamically generated. Bidirectional traffic for the `fit`, `apply`, and `summary` commands. |
Splunk REST API | Port 8089. If container-based notebooks call back to Splunk. |
Splunk HTTP Event Collector (HEC) | Port 8088 for on-premises or port 443 for Splunk Cloud. Outbound traffic from notebooks and pods to Splunk for logs and results. |
Ensure that your firewall rules allow the necessary ports, especially for any dynamic assignments in development (DEV) mode, for example Jupyter on port 8888 or TensorBoard on port 6006.
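A few quick reachability checks can confirm that the firewall rules are in place. The hostnames below are placeholders, and the HEC health endpoint check should run from a pod inside the cluster:

```
# Run from the Splunk Enterprise host:
curl -k https://k8s-api.example.com:6443/version      # Kubernetes API server reachable?
curl -k https://dsdl.apps.example.com:5000/           # DSDL API endpoint reachable?

# Run from a pod in the cluster to verify the path back to Splunk HEC:
curl -k https://splunk.example.com:8088/services/collector/health
```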
Troubleshoot Kubernetes configuration
Issue | Troubleshoot |
---|---|
Authentication failures | Check that tokens, certificates, or user credentials are valid. Confirm RBAC roles and permissions in your cluster. |
Service exposure problems | Verify that the correct service type (NodePort, LoadBalancer, or Ingress) is configured. Check ingress controller logs or load balancer configuration if external access fails. |
Resource limitations | Pods can be stuck in "Pending" if there is insufficient CPU or memory, or a lack of storage. Scale your resources or adjust requests and limits. |
Networking issues | DNS resolution within the cluster might need debugging if Splunk cannot reach container endpoints. Check your cluster's network policy or plugin settings. |
Storage issues | PersistentVolumeClaims (PVCs) can remain in "Pending" if no suitable StorageClass is available. Review the provisioner logs for errors. |