How the Splunk App for Data Science and Deep Learning can help you
The Splunk App for Data Science and Deep Learning (DSDL) extends the Splunk platform to provide advanced analytics, machine learning, and deep learning by leveraging external containerized environments and popular data science frameworks.
DSDL can help you in the following ways:
- Advanced analytics and machine learning integration
- Seamless data handling
- Model training and deployment
- Integration and automation
- Container environment options
Advanced analytics and machine learning integration
DSDL includes the following options to perform advanced analytics and machine learning integration:
Option | Description |
---|---|
Deep learning frameworks | Incorporate libraries such as TensorFlow, PyTorch, and Keras for neural network tasks like image recognition and natural language processing (NLP). |
External computing resources | Offload resource-intensive computations to external container environments, optionally leveraging GPUs for accelerated model training. |
Data science environments | Use tools such as JupyterLab and MLflow, and optionally Spark and TensorBoard, for development, experimentation, and visualization. |
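For example, a notebook running inside the DSDL container can define a small deep learning model and use a GPU when the container has one attached. The following is a minimal, illustrative sketch that assumes PyTorch is available in the container image; the model architecture and layer sizes are placeholders only.

```python
# Minimal sketch: a small PyTorch network that could be trained inside a DSDL
# container, using a GPU if one is exposed to the container.
# Assumption: PyTorch is installed in the container image; sizes are illustrative.
import torch
import torch.nn as nn

# Prefer a GPU when the container has one attached, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class SimpleClassifier(nn.Module):
    """A small feed-forward network for tabular features."""
    def __init__(self, n_features: int, n_classes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x):
        return self.net(x)

model = SimpleClassifier(n_features=10, n_classes=2).to(device)
print(f"Training on {device}")
```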
Seamless data handling
DSDL offers the following data handling options:
Option | Description |
---|---|
Data ingestion | Ingest and index data at scale, in real time or batch mode, using Splunk. |
In-place data transformation | Use Splunk Search Processing Language (SPL) to clean, enrich, and transform data at the source. |
Pull data into notebooks | Use the Splunk REST API to execute SPL searches within JupyterLab. |
Push data to the notebook | Use staging commands such as `| fit MLTKContainer mode=stage ...` to transfer data from Splunk to the DSDL container. |
Feature engineering | Leverage SPL or Python-based transformations to create refined features for improved model accuracy. |
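As an illustration of the "Pull data into notebooks" option above, the following sketch runs an SPL search from JupyterLab over the Splunk REST API and loads the results into a pandas DataFrame. The hostname, credentials, query, and time range are placeholders for your deployment.

```python
# Minimal sketch: pulling Splunk search results into a JupyterLab notebook over
# the Splunk REST API. Hostname, credentials, and the SPL query are placeholders.
import io
import pandas as pd
import requests

SPLUNK_HOST = "https://splunk.example.com:8089"   # management port (placeholder)
AUTH = ("admin", "changeme")                       # placeholder credentials

# Run an export search and return the results as CSV text.
response = requests.post(
    f"{SPLUNK_HOST}/services/search/jobs/export",
    auth=AUTH,
    data={
        "search": "search index=_internal | stats count by sourcetype",
        "output_mode": "csv",
        "earliest_time": "-15m",
    },
    verify=False,  # only for lab setups without trusted TLS certificates
)
response.raise_for_status()

df = pd.read_csv(io.StringIO(response.text))
print(df.head())
```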
Model training and deployment
DSDL includes the following model training and deployment options:
Option | Description |
---|---|
Model training | Execute model training on GPU-enabled or CPU-only containers, reducing load on the Splunk search head and speeding up deep learning workloads. |
Inference execution | Perform inference in the external container environment and pull results back into the Splunk platform or dashboards. |
Results integration | Return inference outputs to the Splunk platform using the Splunk HTTP Event Collector (HEC) for real-time monitoring. |
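The DSDL notebook templates follow a convention of stage-specific functions that the `| fit ... MLTKContainer` and `| apply` commands invoke in the container. The following is a minimal sketch assuming that convention; the scikit-learn model and the `feature_variables` and `target_variables` parameter keys are illustrative and should be checked against the template you use.

```python
# Minimal sketch of the stage-specific functions a DSDL notebook typically
# implements so the | fit and | apply MLTKContainer commands can call it.
# Assumption: parameter keys and function names follow the DSDL template convention.
import pandas as pd
from sklearn.linear_model import LogisticRegression

def init(df: pd.DataFrame, param: dict):
    """Create a fresh model object before training."""
    return LogisticRegression(max_iter=1000)

def fit(model, df: pd.DataFrame, param: dict):
    """Train on the staged data; feature and target fields come from the SPL call."""
    features = param["feature_variables"]
    target = param["target_variables"][0]
    model.fit(df[features], df[target])
    return {"message": "model trained", "n_samples": len(df)}

def apply(model, df: pd.DataFrame, param: dict):
    """Run inference and return a DataFrame that Splunk merges into the results."""
    features = param["feature_variables"]
    return pd.DataFrame({"prediction": model.predict(df[features])})
```

With functions like these in place, the search head stays responsible for orchestration while the container performs the compute.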
Integration and automation
DSDL supports the following integrations with other apps and products:
Integration | Description |
---|---|
Splunk REST API | Dynamically pull data from Splunk into notebooks, fostering an iterative development approach. |
Splunk HTTP Event Collector (HEC) | Stream inference results and logs back into Splunk for further analysis and alerting. |
DSDL API | Run model training and inference commands from the Splunk search head, while containers handle the compute. |
Notebook environments | Develop and monitor experiments using Jupyter, and optionally MLflow, Spark, and TensorBoard. |
Splunk Observability | Monitor containers, inference processes, and performance metrics to ensure a stable and efficient deployment. |
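As an example of streaming results back through HEC, the following sketch posts a single inference result to a Splunk HTTP Event Collector endpoint from Python. The HEC URL, token, index, and sourcetype are placeholders for your environment.

```python
# Minimal sketch: sending inference results back to Splunk over the HTTP Event
# Collector (HEC). URL, token, index, and sourcetype are placeholders.
import json
import requests

HEC_URL = "https://splunk.example.com:8088/services/collector/event"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"  # placeholder token

def send_prediction(record: dict) -> None:
    """Post a single inference result as a Splunk event."""
    payload = {
        "event": record,
        "sourcetype": "dsdl:inference",   # illustrative sourcetype
        "index": "ml_results",            # assumed target index
    }
    response = requests.post(
        HEC_URL,
        headers={"Authorization": f"Splunk {HEC_TOKEN}"},
        data=json.dumps(payload),
        verify=False,  # only for lab setups without trusted TLS certificates
    )
    response.raise_for_status()

send_prediction({"entity": "host-42", "prediction": 1, "score": 0.93})
```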
Container environment options
DSDL offers the following container environment options:
Container option | Description |
---|---|
Docker | Set up a straightforward environment, typically without Transport Layer Security (TLS), for smaller or development use cases. |
Kubernetes | Orchestrate larger-scale environments using TLS-enabled Kubernetes clusters such as Amazon EKS or Red Hat OpenShift. This option provides a secure, scalable deployment of containers. |