Configure OpenShift integration for the Splunk App for Data Science and Deep Learning
Integrate the Splunk App for Data Science and Deep Learning (DSDL) with Red Hat OpenShift to run data science workloads in a scalable, secure, and enterprise-ready manner. OpenShift provides a Kubernetes-based platform with additional features such as Routes for external service exposure, integrated software-defined networking (SDN), and robust role-based access control (RBAC). This integration is suitable for production environments where performance, reliability, and security are essential.
For OpenShift documentation, see https://docs.openshift.com/.
Prerequisites
The following prerequisites must be met to configure an OpenShift integration for DSDL:
- Splunk Enterprise installed and running.
- Splunk Machine Learning Toolkit (MLTK) and Python for Scientific Computing (PSC) installed on the Splunk Enterprise instance.
- DSDL installed on the Splunk Enterprise instance.
- Access to an OpenShift cluster with appropriate permissions.
- The OpenShift Container Platform command-line interface tool (oc) configured to interact with your OpenShift cluster.
- Network connectivity between the Splunk Enterprise instance and the OpenShift cluster.
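You can sanity-check the last three prerequisites from the Splunk Enterprise host before you begin. The following is a minimal sketch; the cluster API hostname is a placeholder you must replace with your own:

```shell
# Confirm the oc CLI is logged in and can reach the cluster.
oc whoami
oc cluster-info

# Confirm network connectivity from the Splunk host to the OpenShift API.
# Replace api.example.com with your cluster's API hostname.
curl -sk https://api.example.com:6443/healthz
```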
OpenShift configuration guidelines
Consider the following guidelines if you are configuring OpenShift integration for DSDL:
Guideline | Description |
---|---|
Secure authentication | Use certificate-based or token-based authentication rather than basic authentication. |
Publicly signed certificates | Using publicly signed certificates minimizes browser warnings and security risks. |
Transport layer security (TLS) termination | Match the Route's TLS termination with the container's certificate approach. For example, use passthrough for self-signed certificates and reencrypt for custom certificates. |
Permissions | Assign minimal permissions to your service accounts or user tokens. |
Component updates | Keep OpenShift, MLTK, DSDL, and any containers updated to the latest compatible versions to benefit from security fixes and performance improvements. |
Monitor resources | Use OpenShift's built-in monitoring or external tools to monitor CPU and memory usage. |
Network policies | If you need advanced security, define network policies explicitly to control Pod traffic. |
Project isolation | Keep DSDL workloads in a dedicated project for clarity and resource governance. |
NFS storage | Optional. If you have an external NFS server, installing the NFS container storage interface (CSI) driver can enable a shared filesystem for DSDL data across Pods. |
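For the network policies guideline above, a minimal example of restricting Pod traffic might look like the following. The project name dsdl-project and the policy name are example values; adjust them to your deployment:

```shell
# Allow ingress to DSDL Pods only from other Pods in the same project.
# dsdl-project is an example project name.
oc apply -n dsdl-project -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: dsdl-allow-same-namespace
spec:
  podSelector: {}
  ingress:
    - from:
        - podSelector: {}
EOF
```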
Set up an OpenShift cluster
Before integrating with DSDL, set up an OpenShift cluster that meets the following requirements:
Requirement | Details |
---|---|
OpenShift version | Version 4.x or higher is needed for compatibility with DSDL. |
Networking | OpenShift uses an integrated software-defined networking (SDN) layer. Verify that it is properly configured and that Pods can communicate across the cluster. |
Ingress controller and routes | OpenShift uses Routes to expose external services. Ensure the default OpenShift Router is configured and can expose routes. |
Persistent storage | DSDL requires PersistentVolumeClaims (PVCs) for model data and logs. Set up a storage provisioner, such as NFS, OpenShift Container Storage, or another CSI solution. |
Role-based access control (RBAC) | DSDL does not automate RBAC creation. You can manually assign RBAC details on the DSDL setup page. |
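Because DSDL does not create RBAC objects for you, you might prepare a dedicated project and service account ahead of time. The following is a sketch with example names; your cluster policies may require different roles:

```shell
# Create a dedicated project and service account for DSDL (example names).
oc new-project dsdl-project
oc create serviceaccount dsdl-sa -n dsdl-project

# Grant the account edit rights in this project only, keeping permissions minimal.
oc adm policy add-role-to-user edit -z dsdl-sa -n dsdl-project
```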
Configuration steps
Complete the following steps to configure an OpenShift cluster:
- Install an OpenShift cluster:
- Use the OpenShift installer or an operator-based approach. See https://docs.openshift.com/.
- Ensure nodes are active and can communicate.
- Configure networking:
- Validate the integrated SDN configuration.
- Confirm Pods can communicate across the cluster.
- Set up persistent storage:
- Install a storage provisioner. For example, OpenShift Container Storage, the NFS CSI driver, or another CSI solution.
- Create a StorageClass for dynamic PersistentVolumeClaim (PVC) provisioning.
- Test by creating a sample PVC to ensure it binds properly.
- Verify OpenShift router configuration:
- Confirm the default router is running.
- Configure wildcard DNS if needed for dynamic Route hostnames.
- Test cluster functionality:
- Deploy a simple app to ensure services are reachable using Routes.
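The final verification step can be sketched as follows, using a throwaway test app. The image, project name, and port are example values:

```shell
# Deploy a minimal test app and expose it through a Route.
oc new-project route-test
oc create deployment hello --image=registry.access.redhat.com/ubi9/httpd-24
oc expose deployment hello --port=8080
oc expose service hello

# Fetch the generated Route hostname and check that it responds.
HOST=$(oc get route hello -o jsonpath='{.spec.host}')
curl -s -o /dev/null -w "%{http_code}\n" "http://$HOST"

# Clean up when done.
oc delete project route-test
```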
(Optional) Use NFS for persistent storage
If you prefer to use Network File System (NFS) for shared persistent storage, you can configure an NFS CSI driver in your OpenShift cluster. This setup is useful if you plan to run models in their dedicated containers and require storage to be accessible by multiple containers.
Complete the following steps:
- Install the NFS CSI driver. See https://github.com/kubernetes-csi/csi-driver-nfs.
See the following example:

oc apply -k "github.com/kubernetes-csi/csi-driver-nfs/deploy/kubernetes/overlays/stable?ref=master"

Adjust the path and version for your cluster.
- Configure an NFS server:
- Ensure you have an NFS server accessible to OpenShift worker nodes.
- Export a share, for example /srv/nfs, with the correct permissions.
- Create a StorageClass that references the NFS CSI driver, as shown in the following example:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-csi
provisioner: nfs.csi.k8s.io
parameters:
  server: <NFS_SERVER_IP_OR_HOSTNAME>
  share: /srv/nfs
reclaimPolicy: Delete
volumeBindingMode: Immediate
Replace <NFS_SERVER_IP_OR_HOSTNAME> with your NFS server address.
- Verify that a PVC can bind using this storage class, as shown in the following example:
oc create -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-nfs
spec:
  storageClassName: nfs-csi
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
EOF
Confirm the PVC transitions to Bound.
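You can confirm the binding from the command line. Note that the jsonpath form of oc wait requires a recent oc release; on older versions, poll with oc get instead:

```shell
# Wait for the test PVC to reach the Bound phase, then inspect it.
oc wait --for=jsonpath='{.status.phase}'=Bound pvc/test-nfs --timeout=60s
oc get pvc test-nfs

# Remove the test claim afterwards.
oc delete pvc test-nfs
```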
- Use the NFS StorageClass in your DSDL configuration:
- On the DSDL setup page, specify nfs-csi, or the name you created, as the Storage Class.
- DSDL automatically creates PVCs using NFS for storing data and model artifacts.
Configure DSDL for OpenShift
Once your cluster is ready, configure DSDL within Splunk Enterprise:
- In DSDL, go to Configuration and then Setup.
- Select OpenShift and enter your cluster details:
- Authentication Mode. For example, Cert & Key or User Token.
- Cluster Base URL. For example, https://api.<cluster-domain>:6443.
- Namespace. Use an OpenShift project, for example dsdl-project.
- Storage Class for PVCs. For example, nfs-csi or ocs-storagecluster-ceph-rbd.
- Service type: Typically Route in OpenShift for external exposure.
- Route hostname: Provide a hostname, for example dsdl.apps.<cluster-domain>, if you want a custom route.
- Test and save: DSDL attempts to deploy containers in your OpenShift project.
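Before entering these details, you can sanity-check them from the Splunk host. The following sketch assumes User Token authentication; the cluster URL is a placeholder:

```shell
# Example values; replace with your own cluster URL and token.
CLUSTER_URL="https://api.example.com:6443"
TOKEN=$(oc whoami -t)

# The API answers /version with a JSON payload if the URL and token are valid.
curl -sk -H "Authorization: Bearer $TOKEN" "$CLUSTER_URL/version"
```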
Automatic deployment
After you save the necessary details on the DSDL setup page in Splunk Enterprise, DSDL automatically deploys its containers, persistent storage, and additional resources to your OpenShift cluster, simplifying the integration process.
Authentication modes
DSDL supports multiple authentication modes to connect to OpenShift. Choose the mode that best suits your environment:
Authentication mode | When to use |
---|---|
Certificate and Key | Use for strong, mutual TLS authentication. Offers direct client certificates from OpenShift or a custom CA. |
User Token | Use for quick setup with minimal certificate handling. Ideal if you prefer to manage service accounts with limited permissions. |
User Login | Not suitable for production due to less secure authentication. |
Service Account (In-Cluster) | Use if Splunk Enterprise itself is deployed inside OpenShift. |
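For User Token mode, one way to obtain a token for a dedicated service account is sketched below. The oc create token subcommand is available in OpenShift 4.11 and later, and the account and project names are examples:

```shell
# Create a service account and mint a time-limited token for it.
oc create serviceaccount dsdl-sa -n dsdl-project
oc create token dsdl-sa -n dsdl-project --duration=8760h
```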
Service types
In OpenShift, the common approach is to use Routes, but NodePort is also supported:
Service type | Details |
---|---|
Route | Route hostname example: dsdl.apps.<cluster-domain>. TLS termination must match the container's certificate expectations; a mismatch can cause SSL handshake errors. For example, if the container uses self-signed certificates, passthrough is often required. |
NodePort | Exposes a static port on each node. Common option for internal use or development and testing scenarios. |
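As an illustration, Routes with different TLS termination modes can be created with the oc create route subcommands. The service name, hostname, and certificate file names below are examples:

```shell
# Passthrough: TLS is terminated by the container itself (self-signed certs).
oc create route passthrough dsdl --service=dsdl-api \
  --hostname=dsdl.apps.example.com

# Reencrypt: the router terminates TLS and re-encrypts to the container.
oc create route reencrypt dsdl-re --service=dsdl-api \
  --cert=tls.crt --key=tls.key --dest-ca-cert=container-ca.crt
```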
Namespace and resource management guidelines
See the following guidelines for namespace and resource management when using OpenShift:
- Use oc new-project dsdl-project to create a new project for DSDL.
- Specify dsdl-project as the Namespace on the DSDL Configuration page.
- Adjust resource requests and set limits for large workloads.
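Resource requests and limits can be adjusted on a running deployment. The deployment name and sizes below are placeholders; tune them to your workloads:

```shell
# Raise CPU/memory requests and limits for a DSDL container deployment.
oc set resources deployment/dsdl-dev -n dsdl-project \
  --requests=cpu=1,memory=2Gi --limits=cpu=4,memory=8Gi
```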
Certificate guidelines
See the following guidelines for transport layer security (TLS) and certificate management when using OpenShift:
- Self-signed certificates trigger browser warnings and potential vulnerabilities. For external production, use publicly signed certificates such as Let's Encrypt, DigiCert, or an internal certificate authority (CA).
- Generate or obtain certificates matching your domain.
- Add the certificates, for example dltk.pem and dltk.key, to the certificates directory in your DSDL container build context.
- Build and push a custom container image if needed.
- Configure your Route for either reencrypt or edge TLS termination.
- If you want to enforce domain matching, enable Hostname Verification in DSDL.
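You can check that an obtained certificate covers your domain, and that the key and certificate belong together, before building the image. The file names follow the dltk.pem and dltk.key examples:

```shell
# Inspect the certificate's subject and SAN entries to confirm domain coverage.
openssl x509 -in dltk.pem -noout -subject -ext subjectAltName

# Confirm the key and certificate match: the two public-key hashes must be equal.
openssl x509 -in dltk.pem -noout -pubkey | openssl sha256
openssl pkey -in dltk.key -pubout | openssl sha256
```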
Firewall considerations
Configure firewall rules to ensure secure communication between Splunk and OpenShift:
Component | Description |
---|---|
OpenShift API | Port 6443. Outbound traffic from Splunk to manage resources. |
DSDL API | Port 5000 or a dynamically generated port. Bidirectional traffic for the fit, apply, and summary commands. |
Splunk REST API | Port 8089. Use if the container-based notebooks or services call back to Splunk. |
Splunk HTTP Event Collector (HEC) | Port 8088 for on-premises or port 443 for Splunk Cloud. |
When making any firewall rule changes, ensure the cluster nodes can reach Splunk on the relevant ports, and Splunk can reach the cluster API.
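The reachability requirements above can be spot-checked with standard tools. The hostnames are placeholders; substitute your own:

```shell
# From the Splunk host: check the OpenShift API and the exposed DSDL Route.
nc -zv api.example.com 6443
nc -zv dsdl.apps.example.com 443

# From a throwaway Pod in the cluster: check the callback path to Splunk REST.
oc run nettest --rm -it --image=registry.access.redhat.com/ubi9/ubi -- \
  bash -c 'curl -sk https://splunk.example.com:8089 >/dev/null && echo REST ok'
```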
Configure TLS in an OpenShift Sandbox
The following are example steps if you are using OpenShift Sandbox or a development trial environment and want Transport Layer Security (TLS):
- Set the Route to passthrough for DSDL notebooks, including Jupyter and TensorBoard.
- Extract the container's self-signed certificate with the openssl s_client -connect ... -showcerts command.
- Add the .pem path in the DSDL setup under Certificate Settings.
- Manually create or delete PVCs in the sandbox if they get stuck.
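The certificate-extraction step can be sketched as a single pipeline. The Route hostname is an example:

```shell
# Extract the container's self-signed certificate through the passthrough Route
# and save it as a PEM file to reference in the DSDL Certificate Settings.
openssl s_client -connect dsdl.apps.example.com:443 \
  -servername dsdl.apps.example.com -showcerts </dev/null \
  | openssl x509 -outform pem > dltk.pem
```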
TLS in an OpenShift Sandbox has the following limitations:
- Only the gp2 storage class is typically available. Storage classes such as gp3 or NFS might not be supported.
- NodePort or advanced network settings might not be allowed.
- You might need to manually configure passthrough routes for self-signed certificate usage.
Troubleshoot OpenShift configuration
Issue | Troubleshoot |
---|---|
502 Bad Gateway or invalid and incomplete response | This can be due to a TLS mismatch or incorrect passthrough settings. Ensure the Route's TLS termination matches the container's certificate configuration, for example passthrough for self-signed certificates. |
Authentication failures | Can occur because of incorrect tokens or an expired service account token. Fix this by regenerating tokens and updating the DSDL configuration. |
Service exposure problems | Can be caused by Route not pointing to the correct service or to an invalid TLS termination. Check the router logs and verify that your Route's hostname and port are correct. |
Resource limitations | Pods can be stuck in "Pending" if there is insufficient CPU or memory, or a lack of storage. Scale your resources or adjust requests and limits. |
Storage issues | PersistentVolumeClaims (PVCs) can remain in "Pending" if no suitable StorageClass is available. For NFS, confirm the NFS CSI driver installation and server accessibility. |
This documentation applies to the following versions of Splunk® App for Data Science and Deep Learning: 5.2.0