Splunk® Machine Learning Toolkit

User Guide

Share data in the Splunk Machine Learning Toolkit

When the Splunk Machine Learning Toolkit (MLTK) is deployed on Splunk Enterprise, the Splunk platform sends aggregated usage data to Splunk Inc. ("Splunk") to help improve MLTK in future releases. For information about how to opt in or out, and how the data is collected, stored, and governed, see Share data in Splunk Enterprise.

What data is collected

The Splunk Machine Learning Toolkit collects the following basic usage information:

Component Description Example
algo_name Name of algorithm used in fit or apply.
{
  "algo_name": "StandardScaler"
}
app_context Name of the app from which search is run.
{
 "app_context": "Splunk_ML_Toolkit"
}
apply_time Time the apply command took.
{
  'apply_time': 0.005
}
app.session.Splunk_ML_Toolkit.changeSmartAssistantStep User progress through an MLTK Smart Assistant.
{
   component: app.session.Splunk_ML_Toolkit.changeSmartAssistantStep
   data: { [-]
     app: Splunk_ML_Toolkit
     experiment_id: 63fb7afba756455d8056b5e547f8545f
     experimentType: smart_outlier_detection
     page: smart_outlier_detection
     previousStep: learn
     step: define
   }
   deploymentID: 88A80D96D80B30B6F48E3FF9A0B318
   eventID: 7185ae51-04aa-2025-8a57-6e0340e50c46
   experienceID: d914fba4-7ca1-4370-a123-3a03a01d2569
   optInRequired: 3
   timestamp: 1585251931
   userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841
   version: 4
   visibility: anonymous,support
}
app.session.Splunk_ML_Toolkit.createExperiment User creating an MLTK Experiment.
{
   component: app.session.Splunk_ML_Toolkit.createExperiment
   data: {
     app: Splunk_ML_Toolkit
     experiment_id: 09ca5db894894c86b20b083941acaae0
     experimentType: smart_forecast
     page: experiments
   }
   deploymentID: 88A80D96D80B30B6F48E3FF9A0B318
   eventID: 8318866b-f2f5-35a4-1348-b82486b3a41f
   experienceID: dfbde5b8-eb57-10a3-5ced-3be47f2b8ad2
   optInRequired: 3
   timestamp: 1583786919
   userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841
   version: 4
   visibility: anonymous,support
}
app.session.Splunk_ML_Toolkit.createExperimentAlert Users creating alerts for MLTK Experiments.
{
   component: app.session.Splunk_ML_Toolkit.createExperimentAlert
   data: {
     app: Splunk_ML_Toolkit
     experiment_id: 46221dd8661d420aaa988ca7d41821ae
     experimentType: smart_forecast
     page: experiments
   }
   deploymentID: 88A80D96D80B30B6F48E3FF9A0B318
   eventID: 6bd85948-4f9b-ff9d-bf02-18defe062eec
   experienceID: f2c4f65b-a723-88af-875a-73737bbc9061
   optInRequired: 3
   timestamp: 1584480173
   userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841
   version: 3
   visibility: anonymous,support
}
app.session.Splunk_ML_Toolkit.loadAssistant Number of times the user has loaded an MLTK Assistant.
{
   component: app.session.Splunk_ML_Toolkit.loadAssistant
   data: { [-]
     app: Splunk_ML_Toolkit
     experiment_id: 6196da5dc78f4606925295ead869f023
     experimentType: smart_clustering
     page: smart_clustering
   }
   deploymentID: 88A80D96D80B30B6F48E3FF9A0B318
   eventID: 54e3887b-acf3-ba6c-7f4f-cef1373c4d99
   experienceID: d914fba4-7ca1-4370-a123-3a03a01d2569
   optInRequired: 3
   timestamp: 1585270611
   userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841
   version: 4
   visibility: anonymous,support
}
app.session.Splunk_ML_Toolkit.saveExperiment Users saving their work in MLTK Experiments.
{
   component: app.session.Splunk_ML_Toolkit.saveExperiment
   data: {
     app: Splunk_ML_Toolkit
     experiment_id: 4f390e49096c43adb05feb29fe9bfbbc
     experimentType: smart_outlier_detection
     page: smart_outlier_detection
   }
   deploymentID: 88A80D96D80B30B6F48E3FF9A0B318
   eventID: bdc34718-163c-56c0-3c7b-7d51380a258e
   experienceID: dfbde5b8-eb57-10a3-5ced-3be47f2b8ad2
   optInRequired: 3
   timestamp: 1583873964
   userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841
   version: 4
   visibility: anonymous,support
}
app.session.Splunk_ML_Toolkit.scheduleExperimentTraining Users scheduling model re-training for MLTK Experiments.
{
   component: app.session.Splunk_ML_Toolkit.scheduleExperimentTraining
   data: {
     app: Splunk_ML_Toolkit
     experiment_id: 46221dd8661d420aaa988ca7d41821ae
     experimentType: smart_forecast
     page: experiments
     scheduleEnabled: true
   }
   deploymentID: 88A80D96D80B30B6F48E3FF9A0B318
   eventID: 629db0e3-0db1-0424-5e0d-f7e06e9965fb
   experienceID: f2c4f65b-a723-88af-875a-73737bbc9061
   optInRequired: 3
   timestamp: 1584480148
   userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841
   version: 3
   visibility: anonymous,support
}
col_dimension Dimension of the dataset, taken from the model schema. Triggered during apply.
{
"col_dimension" : "Multiple columns single-dim input",
"command" : "onnx_input_shape"
}
columns Number of columns passed to the fit command.
{
 "columns": 10
}
command The command that was run: fit, apply, or score.
{
  "command":"fit"
}
{
  "command":"apply"
}
{
  "command":"score"
}
csv_parse_time CSV parse time.
{
  "csv_parse_time": 0.019296
}
csv_read_time CSV read time.
{
  "csv_read_time": 0.019296
}
csv_render_time CSV render time.
{
  "csv_render_time" : 0.01162
}
deployment.app Apps installed per Splunk instance.
{
   component: deployment.app
   data: {
     enabled: true
     host: monitoring
     name: alert_webhook
     version: 7.0.1
   }
   date: 2018-10-26
   deploymentID: 99b6ffd8-2e80-5e3b-905c-8c6f6fd743a0
   executionID: F0AE995E8653D768A360E73BE3F544
   timestamp: 1540570045
   transactionID: 89F7329E-86AD-BBFD-034F-209CB8A06F05
   version: 3
   visibility: anonymous,support
}
df_shape Shape of the input data received from Splunk. Triggered during apply.
{ 
"command" : "onnx_input_shape", 
"dataframe_shape" : "(768, 8)"
}
example_name Name of the Showcase example being run.
{
  'example_name': "'Predict Server Power Consumption'"
}
experiment_id ID of the fit and apply run on the Experiments page. All preprocessing steps and final fit have the same ID.
{
  "experiment_id": "6c47bca2776d4b6cb82685461d918180"
}
fit_time Amount of time it took to run the fit command.
{
  "fit_time": 39.87447
}
full_punct The punct of the data during fit or apply.
{
 "full_punct": [
...s-s-s[//:::.s-]s"s/-/////.s/."sss"://:/-//@:///-."s"/.s(;sssss)s/.s(,ss)s/...s/."s-ss
]
}
handle_time Time for the handler to handle the data.
{
  "handle_time": 0.274072
}
metrics_type Collects the type of request sent. Used to differentiate model upload and model inference call flows.

Takes one of two values:

  • onnx_upload
  • onnx_infer
{ 
"command" : "metrics_type",
"metrics_type" : "onnx_upload"
}
modelId ID of the model that the user saves.
{
modelId: 56ce5ff2442604580eca0f57f36b5b9c
}
numColumns Total number of columns in the dataset.
{
 numColumns: 16 
}
numRows Total number of rows (events) in the dataset.
{
 numRows: 150
}
num_fields Total number of fields.
{
  "num_fields": 4
}
num_fields_fs Number of fields that have the fs (Field Selector) prefix.
{
  "num_fields_fs": 9
}
num_fields_PC Number of fields that have the PC (preprocessed) prefix.
{
  "num_fields_PC": 70
}
num_fields_prefixed Total number of preprocessed fields.
{
  "num_fields_prefixed": 28
}
num_fields_RS Number of fields that have the RS (Robust Scaler) prefix.
{
  "num_fields_RS": 17
}
num_fields_SS Number of fields that have the SS (Standard Scaler) prefix.
{
  "num_fields_SS": 30
}
num_fields_tfidf Number of fields that have used term frequency-inverse document frequency preprocessing.
{
  "num_fields_tfidf": 9
}
onnx_input_shape Shape of the input data stored in the ONNX model schema. Triggered during apply.
{ 
 "command" : "onnx_input_shape",
 "onnx_input_shape" : "['unk__16', 8]"
}
onnx_model_size_on_disk Total size in MB taken up by the model file on the disk after encoding. Triggered during model upload.
{
"command": "onnx_model_size_on_disk_mb",
"onnx_model_size_on_disk_mb": 0.001977
}
onnx_upload_time Time taken to upload an ONNX model file from the UI. Triggered during model upload.
{
"command": "onnx_model_validate_and_upload",
"onnx_upload_time":0.8969220000000001
}
orig_sourcetype The original sourcetype of the machine data.
{
  "orig_sourcetype" : "access_combined_wcookie"
}
params Optional parameters used in the fit step.
{
 "params": "{{\"with_std\": \"true\", \"with_mean\": \"true\"}}"
}
partialFit Whether the fit is a partial fit.
{
partialFit: True
}
PID Process identifier associated with the command.
{
 "PID" : 63654
}
pipeline_stage Each preprocessing step on the Experiments page is assigned a number starting from 0. This helps determine the order of the preprocessing steps and length of the pipeline.
{
  "pipeline_stage": 0
}
rows Number of rows passed to the fit command.
{
  'rows': 15627
}
scoringName Name of the scoring operation, if it is whitelisted. If the name is not whitelisted, the hash of the scoringName is logged instead.
{
 scoringName: mean_squared_error
}
scoringTimeSec Time taken by the scoring operation.
{
 scoringTimeSec: 3.398707
}
UUID Universally unique identifier associated with the command. This 128-bit value keeps each fit or apply run unique.
{
 "UUID": "7e0828e7-3059-4a43-8419-acc0e81f2f2d"
}
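
Several of the example values above can be decoded with standard tooling. The following is a minimal Python sketch, not part of MLTK. The helper names parse_timestamp, parse_shape, and parse_uuid are hypothetical. The sketch assumes that timestamp holds seconds since the Unix epoch in UTC (the deployment.app example, whose timestamp 1540570045 falls on its date of 2018-10-26, is consistent with this) and that dataframe_shape is a Python-style tuple literal.

```python
import ast
import uuid
from datetime import datetime, timezone


def parse_timestamp(epoch_seconds: int) -> datetime:
    """Interpret a timestamp value as Unix epoch seconds in UTC (an assumption)."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def parse_shape(shape_text: str) -> tuple:
    """Parse a shape string such as "(768, 8)" into a tuple of integers."""
    shape = ast.literal_eval(shape_text)  # evaluates literals only, never code
    if not (isinstance(shape, tuple) and all(isinstance(n, int) for n in shape)):
        raise ValueError(f"not a tuple of integers: {shape_text!r}")
    return shape


def parse_uuid(uuid_text: str) -> uuid.UUID:
    """Validate a UUID string; raises ValueError if it is malformed."""
    return uuid.UUID(uuid_text)


# Values copied from the examples in the table above.
step_time = parse_timestamp(1585251931)
app_report_time = parse_timestamp(1540570045)
rows, cols = parse_shape("(768, 8)")
command_id = parse_uuid("7e0828e7-3059-4a43-8419-acc0e81f2f2d")

print(step_time.isoformat())          # 2020-03-26T19:45:31+00:00
print(app_report_time.date())         # 2018-10-26
print(rows, cols)                     # 768 8
print(command_id.version)             # 4
```

ast.literal_eval is used rather than eval so that a shape string is never executed as code. The onnx_input_shape example mixes a string and an integer, so parse_shape as written rejects it.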
Last modified on 14 November, 2023

This documentation applies to the following versions of Splunk® Machine Learning Toolkit: 5.4.1, 5.4.2