How the Splunk App for Data Science and Deep Learning can help you
The Splunk App for Data Science and Deep Learning (DSDL) extends the Splunk platform to provide advanced analytics, machine learning, and deep learning by leveraging external containerized environments and popular data science frameworks.
DSDL can help you in the following ways:
- Advanced analytics and machine learning integration
- Seamless data handling
- Model training and deployment
- Integration and automation
- Container environment options
Advanced analytics and machine learning integration
DSDL includes the following options to perform advanced analytics and machine learning integration:
Option | Description |
---|---|
Deep learning frameworks | Incorporate libraries such as TensorFlow, PyTorch, and Keras for neural network tasks like image recognition and natural language processing (NLP). |
External computing resources | Offload resource-intensive computations to external container environments, optionally leveraging GPUs for accelerated model training. |
Data science environments | Use tools such as JupyterLab and MLflow, and optionally Spark and TensorBoard, for development, experimentation, and visualization. |
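For example, a notebook running inside the DSDL container can define a small deep learning model and use a GPU when the container has one attached. The following is a minimal, illustrative sketch that assumes PyTorch is available in the container image; the model architecture and layer sizes are placeholders only.

```python
# Minimal sketch: a small PyTorch network that could be trained inside a DSDL
# container, using a GPU if one is exposed to the container.
# Assumption: PyTorch is installed in the container image; sizes are illustrative.
import torch
import torch.nn as nn

# Prefer a GPU when the container has one attached, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class SimpleClassifier(nn.Module):
    """A small feed-forward network for tabular features."""
    def __init__(self, n_features: int, n_classes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x):
        return self.net(x)

model = SimpleClassifier(n_features=10, n_classes=2).to(device)
print(f"Training on {device}")
```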
Seamless data handling
DSDL offers the following data handling options:
Option | Description |
---|---|
Data ingestion | Ingest and index data at scale, in real time or batch mode, using Splunk. |
In-place data transformation | Use Splunk Search Processing Language (SPL) to clean, enrich, and transform data at the source. |
Pull data into notebooks | Use the Splunk REST API to execute SPL searches within JupyterLab. |
Push data to the notebook | Use staging commands such as `| fit MLTKContainer mode=stage ...` to transfer data from Splunk to the DSDL container. |
Feature engineering | Leverage SPL or Python-based transformations to create refined features for improved model accuracy. |
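As an illustration of the "Pull data into notebooks" option above, the following sketch runs an SPL search from JupyterLab over the Splunk REST API and loads the results into a pandas DataFrame. The hostname, credentials, query, and time range are placeholders for your deployment.

```python
# Minimal sketch: pulling Splunk search results into a JupyterLab notebook over
# the Splunk REST API. Hostname, credentials, and the SPL query are placeholders.
import io
import pandas as pd
import requests

SPLUNK_HOST = "https://splunk.example.com:8089"   # management port (placeholder)
AUTH = ("admin", "changeme")                       # placeholder credentials

# Run an export search and return the results as CSV text.
response = requests.post(
    f"{SPLUNK_HOST}/services/search/jobs/export",
    auth=AUTH,
    data={
        "search": "search index=_internal | stats count by sourcetype",
        "output_mode": "csv",
        "earliest_time": "-15m",
    },
    verify=False,  # only for lab setups without trusted TLS certificates
)
response.raise_for_status()

df = pd.read_csv(io.StringIO(response.text))
print(df.head())
```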
Model training and deployment
DSDL includes the following model training and deployment options:
Option | Description |
---|---|
Model training | Execute model training on GPU-enabled or CPU-only containers, reducing load on the Splunk search head and speeding up deep learning workloads. |
Inference execution | Perform inference in the external container environment and pull results back into the Splunk platform or dashboards. |
Results integration | Return inference outputs to the Splunk platform using the Splunk HTTP Event Collector (HEC) for real-time monitoring. |
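The DSDL notebook templates follow a convention of stage-specific functions that the `| fit ... MLTKContainer` and `| apply` commands invoke in the container. The following is a minimal sketch assuming that convention; the scikit-learn model and the `feature_variables` and `target_variables` parameter keys are illustrative and should be checked against the template you use.

```python
# Minimal sketch of the stage-specific functions a DSDL notebook typically
# implements so the | fit and | apply MLTKContainer commands can call it.
# Assumption: parameter keys and function names follow the DSDL template convention.
import pandas as pd
from sklearn.linear_model import LogisticRegression

def init(df: pd.DataFrame, param: dict):
    """Create a fresh model object before training."""
    return LogisticRegression(max_iter=1000)

def fit(model, df: pd.DataFrame, param: dict):
    """Train on the staged data; feature and target fields come from the SPL call."""
    features = param["feature_variables"]
    target = param["target_variables"][0]
    model.fit(df[features], df[target])
    return {"message": "model trained", "n_samples": len(df)}

def apply(model, df: pd.DataFrame, param: dict):
    """Run inference and return a DataFrame that Splunk merges into the results."""
    features = param["feature_variables"]
    return pd.DataFrame({"prediction": model.predict(df[features])})
```

With functions like these in place, the search head stays responsible for orchestration while the container performs the compute.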
Integration and automation
DSDL supports the following integrations with other apps and products:
Integration | Description |
---|---|
Splunk REST API | Dynamically pull data from Splunk into notebooks, fostering an iterative development approach. |
Splunk HTTP Event Collector (HEC) | Stream inference results and logs back into Splunk for further analysis and alerting. |
DSDL API | Run model training and inference commands from the Splunk search head, while containers handle the compute. |
Notebook environments | Develop and monitor experiments using Jupyter, and optionally MLflow, Spark, and TensorBoard. |
Splunk Observability | Monitor containers, inference processes, and performance metrics to ensure a stable and efficient deployment. |
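As an example of streaming results back through HEC, the following sketch posts a single inference result to a Splunk HTTP Event Collector endpoint from Python. The HEC URL, token, index, and sourcetype are placeholders for your environment.

```python
# Minimal sketch: sending inference results back to Splunk over the HTTP Event
# Collector (HEC). URL, token, index, and sourcetype are placeholders.
import json
import requests

HEC_URL = "https://splunk.example.com:8088/services/collector/event"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"  # placeholder token

def send_prediction(record: dict) -> None:
    """Post a single inference result as a Splunk event."""
    payload = {
        "event": record,
        "sourcetype": "dsdl:inference",   # illustrative sourcetype
        "index": "ml_results",            # assumed target index
    }
    response = requests.post(
        HEC_URL,
        headers={"Authorization": f"Splunk {HEC_TOKEN}"},
        data=json.dumps(payload),
        verify=False,  # only for lab setups without trusted TLS certificates
    )
    response.raise_for_status()

send_prediction({"entity": "host-42", "prediction": 1, "score": 0.93})
```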
Container environment options
DSDL offers the following container environment options:
Container option | Description |
---|---|
Docker | Set up a straightforward environment, typically without Transport Layer Security (TLS), for smaller or development use cases. |
Kubernetes | Orchestrate larger-scale environments using TLS-enabled Kubernetes clusters such as Amazon EKS or Red Hat OpenShift. This option provides a secure, scalable deployment of containers. |