Splunk App for Data Science and Deep Learning components
The Splunk App for Data Science and Deep Learning (DSDL) enhances your data analytics capabilities with advanced analytics, external computing resources, streamlined workflows, model collaboration, and model monitoring.
DSDL consists of the following components:
- Splunk Machine Learning Toolkit (MLTK) app and Python for Scientific Computing (PSC) add-on
- Splunk search head with DSDL installed
- External containerized environment
- Integration components
Splunk Machine Learning Toolkit app and Python for Scientific Computing add-on
DSDL relies on both the Splunk Machine Learning Toolkit (MLTK) and its dependency, the Python for Scientific Computing (PSC) add-on. Both must be installed on your Splunk search head.
MLTK provides machine learning commands in the Splunk platform, including fit and apply, which are essential for training and applying models within Splunk searches. MLTK also stores model names and manages non-DSDL models trained directly on the search head. DSDL introduces the additional benefit of executing inference on external container resources.
PSC supplies the Python libraries and dependencies required for scientific computing and machine learning tasks within the Splunk platform.
Splunk search head with DSDL installed
DSDL is installed on the Splunk search head alongside MLTK and PSC, providing the interface and commands necessary to integrate with external data science environments.
This setup provides the following benefits:
- Centralized management: Configure data science tasks, manage models, and run searches from the Splunk search head.
- Extended capabilities: Offload inference and training to external containers, adding scalability and GPU acceleration.
- Shared resources: In a distributed setup, DSDL commands can be accessed by other search heads if permissions are configured accordingly.
External containerized environment
DSDL connects the Splunk platform to an external, containerized environment where advanced computations take place.
This external containerized environment includes the following options and benefits:
- Notebook environments: Run Jupyter, MLflow, and optionally Spark and TensorBoard.
- GPU utilization: Speed up deep learning tasks by taking advantage of GPU hardware.
- Splunk REST API access: Execute SPL searches directly in notebooks for real-time data exploration. See Creating searches using the REST API in the REST API Tutorials manual.
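As a rough illustration of running an SPL search from a notebook, the following Python sketch builds a request for the REST API search export endpoint (services/search/jobs/export) and parses the line-delimited JSON that the endpoint returns. The helper names build_export_request and parse_export_lines, the host name, the token, and the time range are assumptions made for this example and are not part of DSDL. The sketch also assumes token-based authentication is enabled on the management port.

```python
import json
import urllib.parse
import urllib.request


def build_export_request(base_url, token, spl, earliest="-15m", latest="now"):
    """Build, but do not send, a POST request for the search export endpoint.

    base_url and token are placeholders for your own management URL
    (typically port 8089) and a Splunk authentication token.
    """
    # The REST API expects the query to begin with a generating command,
    # so prepend "search" unless the query already starts with it or a pipe.
    starts_ok = spl.lstrip().startswith(("search ", "|"))
    query = spl if starts_ok else "search " + spl
    body = urllib.parse.urlencode({
        "search": query,
        "earliest_time": earliest,
        "latest_time": latest,
        "output_mode": "json",
    }).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/search/jobs/export",
        data=body,
        headers={"Authorization": "Bearer " + token},
        method="POST",
    )


def parse_export_lines(lines):
    """Yield the "result" dict from each JSON line of an export response."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Lines without a "result" key carry only metadata and are skipped.
        if "result" in record:
            yield record["result"]
```

To run the search, pass the request to urllib.request.urlopen, decode each line of the response, and feed the lines to parse_export_lines. Building the request separately from sending it keeps the sketch easy to inspect before it touches a live system.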
Integration components
DSDL can integrate with other components in the Splunk platform:
- Splunk REST API: Enable interactive data retrieval from Splunk into the container environment.
- Splunk HTTP Event Collector (HEC): Send inference results and logs back to Splunk.
- DSDL API: Run model training and inference commands in external containers.
- Splunk Observability: Monitor performance and container health using Splunk Observability tools.
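To make the HEC path more concrete, the following Python sketch shows how code in a container might package an inference result as an HEC event. It uses the standard HEC event endpoint (services/collector/event) and the "Splunk <token>" authorization header. The helper name build_hec_request, the sourcetype dsdl:inference, the host name, and the token are hypothetical values chosen for this example, not names defined by DSDL.

```python
import json
import urllib.request


def build_hec_request(hec_url, hec_token, event, sourcetype="dsdl:inference", index=None):
    """Build, but do not send, a POST request for the HEC event endpoint.

    hec_url and hec_token are placeholders for your own HEC URL
    (typically port 8088) and HEC token. The default sourcetype is a
    hypothetical name used only for illustration.
    """
    payload = {"event": event, "sourcetype": sourcetype}
    # Include the index only when given, so the token's default index applies otherwise.
    if index is not None:
        payload["index"] = index
    return urllib.request.Request(
        url=hec_url.rstrip("/") + "/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Splunk " + hec_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request with urllib.request.urlopen posts the event to the Splunk platform, where it becomes searchable under the chosen sourcetype.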
This documentation applies to the following versions of Splunk® App for Data Science and Deep Learning: 5.2.0