Splunk® App for Data Science and Deep Learning

Use the Splunk App for Data Science and Deep Learning

Configure OpenShift integration for the Splunk App for Data Science and Deep Learning

Integrate the Splunk App for Data Science and Deep Learning (DSDL) with Red Hat OpenShift to run data science workloads in a scalable, secure, and enterprise-ready manner. OpenShift provides a Kubernetes-based platform with additional features such as Routes for external service exposure, integrated software-defined networking (SDN), and robust role-based access control (RBAC). This integration is suitable for production environments where performance, reliability, and security are essential.

For OpenShift documentation, see https://docs.openshift.com/.

Prerequisites

The following prerequisites must be met to configure an OpenShift integration for DSDL:

  • Splunk Enterprise installed and running.
  • Splunk Machine Learning Toolkit (MLTK) and Python for Scientific Computing (PSC) installed on the Splunk Enterprise instance.
  • DSDL installed on the Splunk Enterprise instance.
  • Access to an OpenShift cluster with appropriate permissions.
  • The OpenShift Container Platform command-line interface tool (oc) configured to interact with your OpenShift cluster.
  • Network connectivity between the Splunk Enterprise instance and the OpenShift cluster.

OpenShift configuration guidelines

Consider the following guidelines if you are configuring OpenShift integration for DSDL:

Guideline Description
Secure authentication Use certificate-based or token-based authentication rather than basic authentication.
Publicly signed certificates Using publicly signed certificates minimizes browser warnings and security risks.
Transport layer security (TLS) termination Match the container's certificate approach with the Route settings. For example, use passthrough for self-signed certificates and reencrypt for custom certificates.
Permissions Assign minimal permissions to your service accounts or user tokens.
Component updates Keep OpenShift, MLTK, DSDL, and any containers updated to the latest compatible versions to benefit from security fixes and performance improvements.
Monitor resources Use OpenShift's built-in monitoring or external tools to monitor CPU and memory usage.
Network policies If you need advanced security, define network policies explicitly to control Pod traffic. See the sketch after this table.
Project isolation Keep DSDL workloads in a dedicated project for clarity and resource governance.
NFS storage Optional. If you have an external NFS server, installing the NFS container storage interface (CSI) driver can enable a shared filesystem for DSDL data across Pods.
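
As a sketch of the network policies guideline, the following NetworkPolicy restricts ingress to Pods in the DSDL project to traffic from the Splunk search head. The namespace and CIDR placeholder are example values; adjust them for your environment:

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-splunk-to-dsdl
      namespace: dsdl-project
    spec:
      podSelector: {}
      policyTypes:
        - Ingress
      ingress:
        - from:
            - ipBlock:
                cidr: <SPLUNK_HOST_IP>/32

Replace <SPLUNK_HOST_IP> with the address of your Splunk Enterprise instance.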

Set up an OpenShift cluster

Before integrating with DSDL, set up an OpenShift cluster that meets the following requirements:

Requirement Details
OpenShift version Version 4.x or higher is needed for compatibility with DSDL.
Networking Verify that Pods can communicate across the cluster using OpenShift's integrated software-defined networking (SDN), and ensure the SDN is properly configured.
Ingress controller and routes OpenShift uses Routes for external services.
  Ensure the default OpenShift Router is configured and can expose routes.
Persistent storage DSDL requires PersistentVolumeClaims (PVCs) for model data and logs. Set up a storage provisioner, such as NFS, OpenShift Container Storage, or another CSI solution.

Role-based access control (RBAC) DSDL does not automate RBAC creation. You can manually assign RBAC details in the DSDL setup page.
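
Because DSDL does not automate RBAC creation, you can prepare a dedicated project and service account yourself. The following is a minimal sketch; the project name dsdl-project, the account name dsdl-user, and the built-in edit role are example choices:

    oc new-project dsdl-project
    oc create serviceaccount dsdl-user -n dsdl-project
    oc adm policy add-role-to-user edit -z dsdl-user -n dsdl-project

The edit role lets the service account create and manage Deployments, Services, and PVCs in the project without granting cluster-wide permissions.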

Configuration steps

Complete the following steps to configure an OpenShift cluster:

  1. Install an OpenShift cluster:
    1. Use the OpenShift installer or an operator-based approach. See https://docs.openshift.com/.
    2. Ensure nodes are active and can communicate.
  2. Configure networking:
    1. Validate the integrated SDN configuration.
    2. Confirm Pods can communicate across the cluster.
  3. Set up persistent storage:
    1. Install a storage provisioner, for example OpenShift Container Storage, the NFS CSI driver, or another CSI solution.
    2. Create a StorageClass for dynamic PersistentVolumeClaim (PVC) provisioning.
    3. Test by creating a sample PVC to ensure it binds properly.
  4. Verify OpenShift router configuration:
    1. Confirm the default router is running.
    2. Configure wildcard DNS if needed for dynamic Route hostnames.
  5. Test cluster functionality:
    1. Deploy a simple app to ensure services are reachable using Routes, as shown in the sketch after these steps.
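
The following is a minimal sketch of such a functionality test, assuming a hypothetical application named test-app based on a Red Hat UBI Apache image:

    oc new-app registry.access.redhat.com/ubi8/httpd-24 --name=test-app
    oc expose service/test-app
    oc get route test-app

Browse to the hostname shown by the last command to confirm that the Route is reachable, then remove the test resources with oc delete all -l app=test-app.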

(Optional) Use NFS for persistent storage

If you prefer to use Network File System (NFS) for shared persistent storage, you can configure an NFS CSI driver in your OpenShift cluster. This setup is useful if you plan to run models in dedicated containers and require storage that is accessible by multiple containers.

Complete the following steps:

  1. Install the NFS CSI driver. See https://github.com/kubernetes-csi/csi-driver-nfs.
    See the following example:
    oc apply -k "github.com/kubernetes-csi/csi-driver-nfs/deploy/kubernetes/overlays/stable?ref=master"
    

    Adjust the path and version for your cluster.

  2. Configure an NFS server:
    1. Ensure you have an NFS server accessible to OpenShift worker nodes.
    2. Export a share, for example /srv/nfs, with the correct permissions.
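    For example, the export entry in /etc/exports on the NFS server might look like the following. The share path and client subnet are example values; adjust them for your environment:
    /srv/nfs 10.0.0.0/16(rw,sync,no_root_squash)

    After editing /etc/exports, run exportfs -ra on the NFS server to apply the change.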
  3. Create a StorageClass that references the NFS CSI driver, as shown in the following example:
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: nfs-csi
    provisioner: nfs.csi.k8s.io
    parameters:
      server: <NFS_SERVER_IP_OR_HOSTNAME>
      share: /srv/nfs
    reclaimPolicy: Delete
    volumeBindingMode: Immediate
    

    Replace <NFS_SERVER_IP_OR_HOSTNAME> with your NFS server address.

  4. Verify that a PVC can bind using this storage class, as shown in the following example:
    oc create -f - <<EOF
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: test-nfs
    spec:
      storageClassName: nfs-csi
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 5Gi
    EOF

    Confirm the PVC transitions to Bound.

  5. Use the NFS StorageClass in your DSDL configuration:
    1. In the DSDL Setup page, specify nfs-csi, or the name you created, as the Storage Class.
    2. DSDL automatically creates PVCs using NFS for storing data and model artifacts.
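
To confirm that the automatically created PVCs bound successfully, you can list them in your project. The project name dsdl-project is an example:

    oc get pvc -n dsdl-project

Each PVC that DSDL created should show a STATUS of Bound and reference your NFS StorageClass.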

Configure DSDL for OpenShift

Once your cluster is ready, configure DSDL within Splunk Enterprise:

  1. In DSDL, go to Configuration and then Setup.
  2. Select OpenShift and enter your cluster details:
    1. Authentication Mode. For example, Cert & Key or User Token.
    2. Cluster Base URL. For example, https://api.<cluster-domain>:6443.
    3. Namespace. Use an OpenShift Project, for example dsdl-project.
    4. Storage Class for PVCs. For example, nfs-csi or ocs-storagecluster-ceph-rbd.
  3. Service type: Typically Route in OpenShift for external exposure.
  4. Route hostname: Provide a hostname, for example dsdl.apps.<cluster-domain>, if you want a custom route.
  5. Test and save: DSDL attempts to deploy containers in your OpenShift project.

Automatic deployment

After saving the necessary details on the DSDL setup page in Splunk Enterprise, DSDL automatically deploys its containers, persistent storage, and additional resources to your OpenShift cluster, simplifying the integration process.
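
To verify the deployment, you can list the resources that DSDL created in your project. The project name dsdl-project is an example:

    oc get pods,svc,routes,pvc -n dsdl-project

The DSDL Pods should reach the Running state, and the Route should display the hostname that exposes the DSDL API.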

Authentication modes

DSDL supports multiple authentication modes to connect to OpenShift. Choose the mode that best suits your environment:

Authentication mode Details When to use
Certificate and Key
  • Cluster base URL: https://api.<cluster-domain>:6443
  • Cluster certificate authority: Path to the CA certificate that signs the OpenShift API.
  • Client certificate and client key: Typically retrieved from oc whoami --show-client-certificate and oc whoami --show-client-key.
Use for strong, mutual TLS authentication. Offers direct client certificates from OpenShift or a custom CA.
User Token
  • Cluster base URL: The OpenShift API address.
  • User token: Bearer token from a service account. For example, oc sa get-token dsdl-user.
Use for quick setup with minimal certificate handling. Ideal if you prefer to manage service accounts with limited permissions.
User Login
  • User name and password: Basic authentication credentials.
  • Cluster CA: CA certificate.
Use only for testing. Not suitable for production because basic authentication is less secure.
Service Account (In-Cluster)
  • Namespace: The project containing the service account.
Use if Splunk Enterprise itself is deployed inside OpenShift.
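
For the User Token mode, the following sketch retrieves a bearer token for a service account. The account name dsdl-user and project dsdl-project are examples, and the sketch assumes the service account already exists:

    oc sa get-token dsdl-user -n dsdl-project

On newer OpenShift versions where oc sa get-token is no longer available, oc create token dsdl-user -n dsdl-project is the replacement. Paste the resulting token into the User token field in the DSDL setup.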

Service types

In OpenShift, the common approach is to use Routes, but NodePort is also supported:

Service type Details
Route Route hostname example: dsdl.apps.<cluster-domain>

TLS termination:

  • Use passthrough if you use self-signed certificates in the container.
  • Use edge or reencrypt if you have your own publicly signed certificates.

A mismatch between the TLS termination and the container's certificate configuration can cause SSL handshake errors. For example, if the container uses self-signed certificates, passthrough is often required.

NodePort Exposes a static port on each node. A common option for internal use or for development and testing scenarios.
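
As a sketch, the following command creates a passthrough Route for a DSDL service. The Route name, service name, and hostname are example values:

    oc create route passthrough dsdl-notebook --service=<DSDL_SERVICE_NAME> --hostname=dsdl.apps.<cluster-domain>

With passthrough termination, the Router forwards encrypted traffic unmodified, so the container's own certificate, self-signed or otherwise, is presented to clients.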

Namespace and resource management guidelines

See the following guidelines for namespace and resource management when using OpenShift:

  • Use oc new-project dsdl-project to create a new project for DSDL.
  • Specify dsdl-project as the Namespace in the DSDL Configuration page.
  • Adjust resource requests and set limits for large workloads.
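
One way to enforce sensible defaults for container resources in the project is a LimitRange. The following is a sketch with example values; tune them to your workloads:

    apiVersion: v1
    kind: LimitRange
    metadata:
      name: dsdl-limits
      namespace: dsdl-project
    spec:
      limits:
        - type: Container
          default:
            cpu: "2"
            memory: 4Gi
          defaultRequest:
            cpu: 500m
            memory: 1Gi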

Certificate guidelines

See the following guidelines for transport layer security (TLS) and certificate management when using OpenShift:

  • Self-signed certificates trigger browser warnings and potential vulnerabilities. For external production, use publicly signed certificates such as Let's Encrypt, DigiCert, or an internal certificate authority (CA).
  • Generate or obtain certificates matching your domain.
  • Add the certificates to the certificates directory in your DSDL container build context. For example, dltk.pem and dltk.key.
  • Build and push a custom container image if needed.
  • Configure your Route for either reencrypt or edge TLS.
  • If you want to enforce domain matching, enable Hostname Verification in DSDL.
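
As a sketch of the edge TLS option, the following command creates a Route that terminates TLS at the Router using your own certificate and key. The Route name, service name, and hostname are example values:

    oc create route edge dsdl-edge --service=<DSDL_SERVICE_NAME> --cert=dltk.pem --key=dltk.key --hostname=dsdl.apps.<cluster-domain>

For reencrypt termination, oc create route reencrypt additionally accepts --dest-ca-cert to validate the container's internal certificate.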

Firewall considerations

Configure firewall rules to ensure secure communication between Splunk and OpenShift:

Component Description
OpenShift API Port 6443. Outbound traffic from Splunk to manage resources.
DSDL API Port 5000 or a dynamically generated port. Bidirectional traffic for fit, apply, and summary commands.
Splunk REST API Port 8089. Use if the container-based notebooks or services call back to Splunk.
Splunk HTTP Event Collector (HEC) Port 8088 for on-premises or port 443 for Splunk Cloud.

When making any firewall rule changes, ensure the cluster nodes can reach Splunk on the relevant ports, and Splunk can reach the cluster API.
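
To spot-check connectivity after changing firewall rules, you can probe the relevant ports from each side. The hostnames are example values:

    # From the Splunk Enterprise host, confirm the OpenShift API is reachable:
    curl -k https://api.<cluster-domain>:6443/version

    # From a cluster node or Pod, confirm Splunk HEC is reachable:
    curl -k https://<SPLUNK_HOST>:8088/services/collector/health

Even an authentication error in the response indicates that the port is reachable; a timeout indicates that the firewall is still blocking traffic.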

Configure TLS in an OpenShift Sandbox

The following are example steps if you are using OpenShift Sandbox or a development trial environment and want to use Transport Layer Security (TLS):

  1. Set the Route to passthrough for DSDL notebooks, including Jupyter and TensorBoard.
  2. Extract the container's self-signed certificate using the openssl s_client -connect ... -showcerts command, as shown in the example after these steps.
  3. Add the .pem path in the DSDL setup under Certificate Settings.
  4. Manually create or delete PVCs in the sandbox if they get stuck.
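
The following is a sketch of step 2, extracting the container's self-signed certificate to a .pem file. The hostname is an example; replace it with your DSDL Route's hostname:

    openssl s_client -connect dsdl.apps.<cluster-domain>:443 -showcerts </dev/null | openssl x509 -outform PEM > dsdl-selfsigned.pem

The second openssl command keeps only the first certificate in the chain and writes it in PEM format, ready to reference in the DSDL Certificate Settings.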

TLS in an OpenShift Sandbox has the following limitations:

  • Typically, only the gp2 storage class is available. Storage classes such as gp3 or NFS might not be supported.
  • NodePort or advanced network settings might not be allowed.
  • You might need to manually configure pass-through routes for self-signed certificate usage.

Troubleshoot OpenShift configuration

Issue Troubleshooting
502 Bad Gateway or an invalid or incomplete response Can be caused by a TLS mismatch or incorrect passthrough settings.

Ensure termination is set to passthrough if the container is using self-signed certificates.

Authentication failures Can occur because of incorrect tokens or expired service account tokens.

Fix this by regenerating tokens and updating the DSDL configuration. For example, oc sa get-token.

Service exposure problems Can be caused by a Route not pointing to the correct service or by an invalid TLS termination.

Check the router logs and verify that your Route's hostname and port are correct.

Resource limitations Pods can be stuck in "Pending" if there is insufficient CPU or memory, or a lack of storage.

Scale your resources or adjust requests and limits.

Storage issues PersistentVolumeClaims (PVCs) can remain in "Pending" if no suitable StorageClass is available.

For NFS, confirm the NFS CSI driver installation and server accessibility.
Check the cluster's storage operator and its logs for errors.
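
The following commands are a starting point for diagnosing these issues. The project name dsdl-project and the resource names are example values:

    # Inspect why a Pod or PVC is stuck in Pending:
    oc describe pod <POD_NAME> -n dsdl-project
    oc describe pvc <PVC_NAME> -n dsdl-project

    # Review recent events and container logs:
    oc get events -n dsdl-project --sort-by=.lastTimestamp
    oc logs <POD_NAME> -n dsdl-project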
