Splunk® App for Data Science and Deep Learning

Use the Splunk App for Data Science and Deep Learning

Firewall requirements for Splunk and Docker communication

Docker integration with the Splunk App for Data Science and Deep Learning (DSDL) is suitable for single-machine setups with minimal firewall rules. External exposure is optional, and primarily for Jupyter or other data science interfaces.

Docker integration is not suitable for production environments due to the lack of Transport Layer Security (TLS) on port 2375. Always verify the exact port mappings in the DSDL UI or logs before adjusting firewall rules. Use Kubernetes if you need secure, scalable, multi-instance deployments.

All external container processes such as Docker, Kubernetes, JupyterLab, and GPU management, including firewall management, are out of scope for Splunk platform support. Ensure that your environment is configured to securely connect Splunk to the container resources.

Docker integration summary

Local integration:

  • Designed for single-machine setups with minimal firewall rules.
  • External exposure is optional, primarily for Jupyter or other data science interfaces.

Remote integration:

  • Not advisable for production due to a lack of TLS on port 2375.
  • Use Kubernetes if you need secure, scalable, multi-instance deployments.

Dynamic ports:

  • In DEV mode, DSDL can dynamically assign ports to services such as Jupyter and Spark.
  • Always verify the exact port mappings in the DSDL UI or logs before adjusting firewall rules.
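
As one way to cross-check dynamically assigned ports, the following minimal sketch parses text in the format printed by the Docker CLI command `docker port <container>`. Using that command is an assumption made here for illustration; the DSDL UI and logs remain the sources named above. The function name and the sample mappings are hypothetical.

```python
from typing import Dict, List, Tuple


def parse_docker_port_output(text: str) -> Dict[str, List[Tuple[str, int]]]:
    """Map 'container_port/proto' to a list of (host_ip, host_port) bindings.

    Expects lines such as '8888/tcp -> 0.0.0.0:49153', which is the format
    printed by `docker port <container>`.
    """
    mappings: Dict[str, List[Tuple[str, int]]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "->" not in line:
            continue  # skip blank or unrelated lines
        container_side, host_side = (part.strip() for part in line.split("->", 1))
        # Split on the last colon so IPv6 bindings such as '[::]:49154' also work.
        host_ip, _, host_port = host_side.rpartition(":")
        mappings.setdefault(container_side, []).append(
            (host_ip.strip("[]"), int(host_port))
        )
    return mappings


# Illustrative sample only; real host ports differ for every container.
SAMPLE_OUTPUT = """\
8888/tcp -> 0.0.0.0:49153
5000/tcp -> 0.0.0.0:49154
"""
```

Calling `parse_docker_port_output(SAMPLE_OUTPUT)` yields the host ports that a firewall rule would have to reference instead of the defaults.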

Docker integration limitations

The following are limitations of Docker integration with DSDL:

  • Security: Docker integration in DSDL does not support Transport Layer Security (TLS) on port 2375.
    • Remote connections over port 2375 are unencrypted, which is insecure for production.
  • Scalability: Docker is less flexible for distributed or large-scale deployments.
    • Kubernetes is a better option for secure, large-scale, or multi-instance environments, and for resource management.
    • Kubernetes supports TLS for the API and provides good scalability, making it a more robust solution for production.

Local Docker integration

When Docker and the Splunk search head are co-located on the same machine, external firewall rules are typically not needed for communication between Docker and Splunk. Docker is reached through the local socket unix:///var/run/docker.sock or through local port 2375, neither of which requires external exposure.
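
The distinction between a local and a remote Docker endpoint can be sketched as a small check on the Docker host URL. This is a minimal illustration, not part of DSDL: the function name, the set of local hostnames, and the example address 10.0.0.5 are assumptions.

```python
from urllib.parse import urlparse

# Hostnames treated as "same machine" in this sketch (an assumption).
LOCAL_HOSTNAMES = {"localhost", "127.0.0.1", "::1"}


def docker_endpoint_needs_firewall_rule(docker_host: str) -> bool:
    """Return True if the Docker endpoint is reached over the network.

    A unix socket or a loopback TCP address stays on the local machine,
    so no external firewall rule is involved.
    """
    parsed = urlparse(docker_host)
    if parsed.scheme == "unix":
        return False
    return parsed.hostname not in LOCAL_HOSTNAMES
```

For example, `docker_endpoint_needs_firewall_rule("unix:///var/run/docker.sock")` and `docker_endpoint_needs_firewall_rule("tcp://localhost:2375")` both return `False`, while a hypothetical remote endpoint such as `tcp://10.0.0.5:2375` returns `True`.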

If certain DSDL services such as Jupyter or MLflow need external access, you must open the appropriate ports. In most local setups these services are only accessed from the same machine.

See the following table for a summary of guidelines for local Docker integration:

Local integration | Description
Docker | No additional firewall rules:
  • Docker to Splunk communication typically uses a local socket or port 2375 on the same host.
  • External firewall settings are rarely needed.
External services: Jupyter, TensorBoard, MLflow, or Spark | If Jupyter, TensorBoard, MLflow, or Spark must be accessed from outside, open their respective ports:
  • Jupyter: Inbound if external usage is required. Default port 8888.
  • MLflow: Optional if this service is exposed outside the local machine. Default port 6060.
  • Spark: Optional if this service is exposed outside the local machine. Default port 4040.
  • TensorBoard: Optional if this service is exposed outside the local machine. Default port 6006.
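
Before opening any of these ports, it can help to confirm which services are actually listening. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the port numbers are the defaults listed above, which might differ if ports are dynamically assigned.

```python
import socket
from typing import Dict

# Default ports as listed in the table above; dynamic assignment can change them.
DEFAULT_SERVICE_PORTS = {
    "Jupyter": 8888,
    "MLflow": 6060,
    "Spark": 4040,
    "TensorBoard": 6006,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host: str, ports: Dict[str, int] = DEFAULT_SERVICE_PORTS) -> Dict[str, bool]:
    """Report, per service name, whether its port accepts TCP connections."""
    return {name: is_port_open(host, port) for name, port in ports.items()}
```

Running `check_services("127.0.0.1")` on the Docker host shows which services are up locally. Running it from another machine against the host's address shows which ports are reachable through the firewall.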

Remote Docker integration

While Docker can be configured for remote access, doing so poses a security risk to production environments due to the lack of TLS on port 2375. If you must temporarily enable remote Docker, consider the following firewall rules:

Traffic direction | Port | Purpose | Firewall rule
Outbound | 2375 (no TLS) | Docker API for container management. | Required for remote access to Docker and remote Docker management.
Bidirectional | 8089 | Splunk REST API. | Optional. Use to connect the container to the Splunk REST API.
Bidirectional | 5000 or dynamically assigned | DSDL commands such as fit and apply. | Required for DSDL operations.
Inbound | 443 for Splunk Cloud, otherwise 8088 | Splunk HEC for data submissions. | Optional. Required only if using HEC ingestion.
External | 8888, 6060, 4040, 6006 | Jupyter, MLflow, Spark, TensorBoard. | Only open if externally accessed.
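
The table can also be read as a decision procedure: two rules are always needed for remote Docker, and the rest depend on which optional features are in use. The following sketch encodes that reading. The class name, function name, and parameter names are hypothetical, and the output is a checklist for review, not a firewall configuration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FirewallRule:
    direction: str  # "outbound", "inbound", or "bidirectional"
    port: int
    purpose: str


def remote_docker_rules(
    use_splunk_rest: bool = False,
    use_hec: bool = False,
    splunk_cloud: bool = False,
    dsdl_api_port: int = 5000,  # default; might be dynamically assigned
    external_service_ports: Iterable[int] = (),
) -> List[FirewallRule]:
    """Build the list of rules from the remote Docker integration table."""
    # Always required for remote Docker integration.
    rules = [
        FirewallRule("outbound", 2375, "Docker API (no TLS)"),
        FirewallRule("bidirectional", dsdl_api_port, "DSDL API (fit, apply)"),
    ]
    if use_splunk_rest:
        rules.append(FirewallRule("bidirectional", 8089, "Splunk REST API"))
    if use_hec:
        # 443 for Splunk Cloud, otherwise 8088.
        hec_port = 443 if splunk_cloud else 8088
        rules.append(FirewallRule("inbound", hec_port, "Splunk HEC"))
    for port in external_service_ports:
        rules.append(FirewallRule("inbound", port, "externally accessed service"))
    return rules
```

For example, `remote_docker_rules(use_hec=True, splunk_cloud=True, external_service_ports=(8888,))` lists ports 2375, 5000, 443, and 8888, in that order.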

See the following table for further information on both Splunk and service ports:

Port Description
Docker API port
  • Port 2375 provides remote Docker control without encryption.
  • Security warning: Unencrypted traffic is not suitable for sensitive or production deployments.
  • Firewall rule: Allow outbound traffic from Splunk to port 2375 if Splunk needs to manage Docker containers remotely.
Splunk management port
  • Used for the Splunk REST API, which containers might call.
  • Firewall rule: If container-based services need to query Splunk, allow bidirectional access on port 8089.
DSDL API port
  • DSDL typically runs on port 5000 in DEV mode, but it can change dynamically.
  • Firewall rule: Allow bidirectional access on port 5000 or the assigned port for DSDL operations.
Splunk HEC port
  • Used if containerized tasks must send data to Splunk:
    • Port 443 for Splunk Cloud.
    • Port 8088 for Splunk on-premises.
  • Firewall rule: Allow inbound traffic on the relevant port to receive data from Docker containers.
Jupyter
  • Default port 8888.
  • Used for accessing Jupyter notebooks.
  • Firewall rule: Allow inbound traffic on port 8888 or the dynamically assigned port. Required only if you access Jupyter from outside the machine.
MLflow
  • Default port 6060.
  • Used for tracking experiments and managing models. By default, this service runs locally in DEV mode.
  • Firewall rule: Optional. If you expose the service to external networks, open an inbound rule for port 6060.
Spark
  • Default port 4040.
  • Used for monitoring Spark job execution. By default, this service runs locally in DEV mode.
  • Firewall rule: Optional. If you expose the service to external networks, open an inbound rule for port 4040.
TensorBoard
  • Default port 6006.
  • Used for real-time insights into model training. By default, this service runs locally in DEV mode.
  • Firewall rule: Optional. If you expose the service to external networks, open an inbound rule for port 6006.
Last modified on 24 January, 2025

This documentation applies to the following versions of Splunk® App for Data Science and Deep Learning: 5.2.0

