Splunk® Machine Learning Toolkit

User Guide



This documentation does not apply to the most recent version of Splunk® Machine Learning Toolkit. For documentation on the most recent version, go to the latest release.

Share data in the Machine Learning Toolkit

When the Machine Learning Toolkit is deployed on Splunk Enterprise, the Splunk platform sends aggregated usage data to Splunk Inc. ("Splunk") to help improve MLTK in future releases. For information about how to opt in or out, and how the data is collected, stored, and governed, see Share data in Splunk Enterprise.

What data is collected

The Splunk Machine Learning Toolkit collects the following basic usage information:

Component Description Example
algo_name Name of algorithm used in fit or apply.
{
  "algo_name": "StandardScaler"
}
app_context Name of the app from which the search is run.
{
 "app_context": "Splunk_ML_Toolkit"
}
apply_time Time the apply command took.
{
  'apply_time': 0.005
}
app.session.Splunk_ML_Toolkit.changeSmartAssistantStep User progress through an MLTK Smart Assistant.
{
   component: app.session.Splunk_ML_Toolkit.changeSmartAssistantStep
   data: {
     app: Splunk_ML_Toolkit
     experiment_id: 63fb7afba756455d8056b5e547f8545f
     experimentType: smart_outlier_detection
     page: smart_outlier_detection
     previousStep: learn
     step: define
   }
   deploymentID: 88A80D96D80B30B6F48E3FF9A0B318
   eventID: 7185ae51-04aa-2025-8a57-6e0340e50c46
   experienceID: d914fba4-7ca1-4370-a123-3a03a01d2569
   optInRequired: 3
   timestamp: 1585251931
   userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841
   version: 4
   visibility: anonymous,support
}
app.session.Splunk_ML_Toolkit.createExperiment User creating an MLTK Experiment.
{
   component: app.session.Splunk_ML_Toolkit.createExperiment
   data: {
     app: Splunk_ML_Toolkit
     experiment_id: 09ca5db894894c86b20b083941acaae0
     experimentType: smart_forecast
     page: experiments
   }
   deploymentID: 88A80D96D80B30B6F48E3FF9A0B318
   eventID: 8318866b-f2f5-35a4-1348-b82486b3a41f
   experienceID: dfbde5b8-eb57-10a3-5ced-3be47f2b8ad2
   optInRequired: 3
   timestamp: 1583786919
   userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841
   version: 4
   visibility: anonymous,support
}
app.session.Splunk_ML_Toolkit.createExperimentAlert Users creating alerts for MLTK Experiments.
{
   component: app.session.Splunk_ML_Toolkit.createExperimentAlert
   data: {
     app: Splunk_ML_Toolkit
     experiment_id: 46221dd8661d420aaa988ca7d41821ae
     experimentType: smart_forecast
     page: experiments
   }
   deploymentID: 88A80D96D80B30B6F48E3FF9A0B318
   eventID: 6bd85948-4f9b-ff9d-bf02-18defe062eec
   experienceID: f2c4f65b-a723-88af-875a-73737bbc9061
   optInRequired: 3
   timestamp: 1584480173
   userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841
   version: 3
   visibility: anonymous,support
}
app.session.Splunk_ML_Toolkit.loadAssistant Number of times the user has loaded an MLTK Assistant.
{
   component: app.session.Splunk_ML_Toolkit.loadAssistant
   data: {
     app: Splunk_ML_Toolkit
     experiment_id: 6196da5dc78f4606925295ead869f023
     experimentType: smart_clustering
     page: smart_clustering
   }
   deploymentID: 88A80D96D80B30B6F48E3FF9A0B318
   eventID: 54e3887b-acf3-ba6c-7f4f-cef1373c4d99
   experienceID: d914fba4-7ca1-4370-a123-3a03a01d2569
   optInRequired: 3
   timestamp: 1585270611
   userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841
   version: 4
   visibility: anonymous,support
}
app.session.Splunk_ML_Toolkit.saveExperiment Users saving their work in MLTK Experiments.
{
   component: app.session.Splunk_ML_Toolkit.saveExperiment
   data: {
     app: Splunk_ML_Toolkit
     experiment_id: 4f390e49096c43adb05feb29fe9bfbbc
     experimentType: smart_outlier_detection
     page: smart_outlier_detection
   }
   deploymentID: 88A80D96D80B30B6F48E3FF9A0B318
   eventID: bdc34718-163c-56c0-3c7b-7d51380a258e
   experienceID: dfbde5b8-eb57-10a3-5ced-3be47f2b8ad2
   optInRequired: 3
   timestamp: 1583873964
   userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841
   version: 4
   visibility: anonymous,support
}
app.session.Splunk_ML_Toolkit.scheduleExperimentTraining Users scheduling model re-training for MLTK Experiments.
{
   component: app.session.Splunk_ML_Toolkit.scheduleExperimentTraining
   data: {
     app: Splunk_ML_Toolkit
     experiment_id: 46221dd8661d420aaa988ca7d41821ae
     experimentType: smart_forecast
     page: experiments
     scheduleEnabled: true
   }
   deploymentID: 88A80D96D80B30B6F48E3FF9A0B318
   eventID: 629db0e3-0db1-0424-5e0d-f7e06e9965fb
   experienceID: f2c4f65b-a723-88af-875a-73737bbc9061
   optInRequired: 3
   timestamp: 1584480148
   userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841
   version: 3
   visibility: anonymous,support
}
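The session events above all share the component prefix app.session.Splunk_ML_Toolkit, followed by the name of the user action. As a minimal sketch of how such events could be tallied by action, the following Python splits each component on its last dot. The action_name helper and the sample list are hypothetical illustrations, not part of MLTK.

```python
from collections import Counter

# Component names copied from the examples above; the list itself is illustrative.
components = [
    "app.session.Splunk_ML_Toolkit.changeSmartAssistantStep",
    "app.session.Splunk_ML_Toolkit.createExperiment",
    "app.session.Splunk_ML_Toolkit.saveExperiment",
    "app.session.Splunk_ML_Toolkit.saveExperiment",
]


def action_name(component: str) -> str:
    """Return the text after the last dot, for example 'saveExperiment'."""
    return component.rsplit(".", 1)[-1]


# Count how often each user action appears in the sample.
action_counts = Counter(action_name(c) for c in components)
print(action_counts)
```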
columns The number of columns being run through the fit command.
{
 "columns": 10
}
command Command that was run: fit, apply, or score.
{
  "command":"fit"
}
{
  "command":"apply"
}
{
  "command":"score"
}
csv_parse_time CSV parse time.
{
  "csv_parse_time": 0.019296
}
csv_read_time CSV read time.
{
  "csv_read_time": 0.019296
}
csv_render_time CSV render time.
{
  "csv_render_time" : 0.01162
}
deployment.app Apps installed per Splunk instance.
   component: deployment.app
   data: {
     enabled: true
     host: monitoring
     name: alert_webhook
     version: 7.0.1
   }
   date: 2018-10-26
   deploymentID: 99b6ffd8-2e80-5e3b-905c-8c6f6fd743a0
   executionID: F0AE995E8653D768A360E73BE3F544
   timestamp: 1540570045
   transactionID: 89F7329E-86AD-BBFD-034F-209CB8A06F05
   version: 3
   visibility: anonymous, support
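The timestamp values in these events appear to be Unix epoch seconds. The page does not state this, so treat it as an assumption. The following sketch checks that reading against the deployment.app example above, where the timestamp can be compared with the date field of the same event.

```python
from datetime import datetime, timezone

# Values copied from the deployment.app example above.
event = {"date": "2018-10-26", "timestamp": 1540570045}

# Assumption: timestamp is Unix epoch seconds, interpreted here in UTC.
event_time = datetime.fromtimestamp(event["timestamp"], tz=timezone.utc)
derived_date = event_time.date().isoformat()

print(event_time.isoformat())
print(derived_date == event["date"])
```

Under that assumption the derived date matches the date field of the example event.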
example_name Name of the Showcase example being run.
{
  'example_name': "'Predict Server Power Consumption'"
}
experiment_id ID of the fit and apply run on the Experiments page. All preprocessing steps and the final fit share the same ID.
{
  "experiment_id": "6c47bca2776d4b6cb82685461d918180"
}
fit_time Amount of time it took to run the fit command.
{
  "fit_time": 39.87447
}
full_punct The punct (punctuation pattern) of the data during fit or apply.
{
 "full_punct": [
...s-s-s[//:::.s-]s"s/-/////.s/."sss"://:/-//@:///-."s"/.s(;sssss)s/.s(,ss)s/...s/."s-ss
]
}
handle_time Time for the handler to handle the data.
{
  "handle_time": 0.274072
}
modelId ID of the model that the user saves.
{
 modelId: 56ce5ff2442604580eca0f57f36b5b9c
}
numColumns Total number of columns in the dataset.
{
 numColumns: 16 
}
numRows Total number of rows (events) in the dataset.
{
 numRows: 150
}
num_fields Total number of fields.
{
  "num_fields": 4
}
num_fields_fs Number of fields that have the fs (Field Selector) prefix.
{
  "num_fields_fs": 9
}
num_fields_PC Number of fields that have the PC (preprocessed) prefix.
{
  "num_fields_PC": 70
}
num_fields_prefixed Total number of preprocessed fields.
{
  "num_fields_prefixed": 28
}
num_fields_RS Number of fields that have the RS (Robust Scaler) prefix.
{
  "num_fields_RS": 17
}
num_fields_SS Number of fields that have the SS (Standard Scaler) prefix.
{
  "num_fields_SS": 30
}
num_fields_tfidf Number of fields that have used term frequency-inverse document frequency preprocessing.
{
  "num_fields_tfidf": 9
}
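The num_fields counters above can be pictured as counts of field names grouped by prefix. The following is a minimal sketch of that idea, not the MLTK implementation. The sample field names, the count_prefixed helper, and the underscore separator after each prefix are assumptions made for illustration.

```python
# Hypothetical field names; the "<prefix>_" naming pattern is an assumption.
fields = [
    "SS_petal_length",
    "SS_petal_width",
    "RS_sepal_width",
    "PC_1",
    "fs_petal_width",
    "species",
]

# Prefixes named in the table above.
PREFIXES = ("fs", "PC", "RS", "SS")


def count_prefixed(names):
    """Count how many field names start with each known prefix."""
    counts = {prefix: 0 for prefix in PREFIXES}
    for name in names:
        for prefix in PREFIXES:
            if name.startswith(prefix + "_"):
                counts[prefix] += 1
                break
    return counts


prefix_counts = count_prefixed(fields)

# Shape the result like the usage fields described above.
usage = {"num_fields": len(fields)}
usage.update({f"num_fields_{p}": n for p, n in prefix_counts.items()})
print(usage)
```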
orig_sourcetype The original sourcetype of the machine data.
{
  "orig_sourcetype" : "access_combined_wcookie"
}
params Optional parameters used in fit step.
{
 "params": "{{\"with_std\": \"true\", \"with_mean\": \"true\"}}"
}
partialFit Whether the fit is a partial fit.
{
 partialFit: True
}
PID Process identifier associated with the command.
{
 "PID" : 63654
}
pipeline_stage Each preprocessing step on the Experiments page is assigned a number starting from 0. This helps determine the order of the preprocessing steps and length of the pipeline.
{
  "pipeline_stage": 0
}
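Because all steps of one Experiment share an experiment_id and each step carries a pipeline_stage, the order and length of a pipeline can be recovered by grouping and sorting. The following sketch shows one way to do that. The sample records and the pipelines_by_experiment helper are hypothetical; only the field names come from the table.

```python
from collections import defaultdict

# Hypothetical usage records; only the field names come from the table above.
records = [
    {"experiment_id": "exp_a", "pipeline_stage": 1, "algo_name": "PCA"},
    {"experiment_id": "exp_a", "pipeline_stage": 0, "algo_name": "StandardScaler"},
    {"experiment_id": "exp_b", "pipeline_stage": 0, "algo_name": "StandardScaler"},
]


def pipelines_by_experiment(rows):
    """Group records by experiment_id and order each group by pipeline_stage."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["experiment_id"]].append(row)
    return {
        exp_id: [r["algo_name"] for r in sorted(steps, key=lambda r: r["pipeline_stage"])]
        for exp_id, steps in grouped.items()
    }


pipelines = pipelines_by_experiment(records)
for exp_id, steps in pipelines.items():
    # The number of steps is the pipeline length for that Experiment.
    print(exp_id, len(steps), steps)
```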
rows The number of rows being run through the fit command.
{
  'rows': 15627
}
scoringName Name of the scoring operation if it is whitelisted. If the name is not whitelisted, the hash of the scoringName is logged instead.
scoringName: mean_squared_error
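The name-or-hash behavior described for scoringName can be sketched as follows. This is an illustration only: the contents of the list, the loggable_scoring_name helper, and the choice of SHA-256 are assumptions, because this page does not say which names are whitelisted or which hash function is used.

```python
import hashlib

# Assumption: an illustrative list of permitted names, not the real MLTK list.
ALLOWED_SCORING_NAMES = {"mean_squared_error"}


def loggable_scoring_name(name: str) -> str:
    """Return the name if it is permitted, otherwise a hash of the name."""
    if name in ALLOWED_SCORING_NAMES:
        return name
    # Assumption: SHA-256 stands in for whatever hash is actually used.
    return hashlib.sha256(name.encode("utf-8")).hexdigest()


print(loggable_scoring_name("mean_squared_error"))
print(loggable_scoring_name("my_custom_metric"))
```

Hashing keeps custom operation names out of the shared data while still letting repeated use of the same name be counted.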
scoringTimeSec Time taken by the scoring operation.
 scoringTimeSec: 3.398707
UUID Universally unique identifier associated with the command. This 128-bit value keeps each fit or apply run unique.
{
 "UUID": "7e0828e7-3059-4a43-8419-acc0e81f2f2d"
}
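The 128-bit size of the example value can be checked with the Python standard library. The sketch below only parses the example shown above; how MLTK generates its UUIDs is not stated on this page.

```python
import uuid

# Example value copied from the table above.
value = uuid.UUID("7e0828e7-3059-4a43-8419-acc0e81f2f2d")

# A UUID is 16 bytes, which is 128 bits.
bit_length = len(value.bytes) * 8
print(bit_length)

# The version digit encoded in the example value.
print(value.version)
```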
Last modified on 22 February, 2023

This documentation applies to the following versions of Splunk® Machine Learning Toolkit: 5.2.0, 5.2.1, 5.2.2, 5.3.0, 5.3.1, 5.3.3

