Share data in the Splunk Machine Learning Toolkit

When the Splunk Machine Learning Toolkit (MLTK) is deployed on Splunk Enterprise, the Splunk platform sends aggregated usage data to Splunk Inc. ("Splunk") to help improve MLTK in future releases. For information about how to opt in or out, and how the data is collected, stored, and governed, see Share data in Splunk Enterprise.

What data is collected

The Splunk Machine Learning Toolkit collects the following basic usage information:

Component	Description	Example
`ai_processing_time`	Time taken to process the `ai` command request. Triggered during `ai` command usage.	{ "command": "ai", "ai_processing_time":0.7969220000000001 }
`algo_name`	Name of algorithm used in `fit` or `apply`.	{ "algo_name": "StandardScaler" }
`app_context`	Name of the app from which search is run.	{ "app_context": "Splunk_ML_Toolkit" }
`apply_time`	Time the `apply` command took.	{ 'apply_time': 0.005 }
`app.session.Splunk_ML_Toolkit.changeSmartAssistantStep`	User progress through an MLTK Smart Assistant.	{ component: app.session.Splunk_ML_Toolkit.changeSmartAssistantStep data: { [-] app: Splunk_ML_Toolkit experiment_id: 63fb7afba756455d8056b5e547f8545f experimentType: smart_outlier_detection page: smart_outlier_detection previousStep: learn step: define } deploymentID: 88A80D96D80B30B6F48E3FF9A0B318 eventID: 7185ae51-04aa-2025-8a57-6e0340e50c46 experienceID: d914fba4-7ca1-4370-a123-3a03a01d2569 optInRequired: 3 timestamp: 1585251931 userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841 version: 4 visibility: anonymous,support }
`app.session.Splunk_ML_Toolkit.createExperiment`	User creating an MLTK Experiment.	{ component: app.session.Splunk_ML_Toolkit.createExperiment data: { app: Splunk_ML_Toolkit experiment_id: 09ca5db894894c86b20b083941acaae0 experimentType: smart_forecast page: experiments } deploymentID: 88A80D96D80B30B6F48E3FF9A0B318 eventID: 8318866b-f2f5-35a4-1348-b82486b3a41f experienceID: dfbde5b8-eb57-10a3-5ced-3be47f2b8ad2 optInRequired: 3 timestamp: 1583786919 userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841 version: 4 visibility: anonymous,support }
`app.session.Splunk_ML_Toolkit.createExperimentAlert`	Users creating alerts for MLTK Experiments.	{ component: app.session.Splunk_ML_Toolkit.createExperimentAlert data: { app: Splunk_ML_Toolkit experiment_id: 46221dd8661d420aaa988ca7d41821ae experimentType: smart_forecast page: experiments } deploymentID: 88A80D96D80B30B6F48E3FF9A0B318 eventID: 6bd85948-4f9b-ff9d-bf02-18defe062eec experienceID: f2c4f65b-a723-88af-875a-73737bbc9061 optInRequired: 3 timestamp: 1584480173 userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841 version: 3 visibility: anonymous,support }
`app.session.Splunk_ML_Toolkit.loadAssistant`	Number of times the user has loaded an MLTK Assistant.	{ component: app.session.Splunk_ML_Toolkit.loadAssistant data: { [-] app: Splunk_ML_Toolkit experiment_id: 6196da5dc78f4606925295ead869f023 experimentType: smart_clustering page: smart_clustering } deploymentID: 88A80D96D80B30B6F48E3FF9A0B318 eventID: 54e3887b-acf3-ba6c-7f4f-cef1373c4d99 experienceID: d914fba4-7ca1-4370-a123-3a03a01d2569 optInRequired: 3 timestamp: 1585270611 userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841 version: 4 visibility: anonymous,support }
`app.session.Splunk_ML_Toolkit.saveExperiment`	Users saving their work in MLTK Experiments.	{ component: app.session.Splunk_ML_Toolkit.saveExperiment data: { app: Splunk_ML_Toolkit experiment_id: 4f390e49096c43adb05feb29fe9bfbbc experimentType: smart_outlier_detection page: smart_outlier_detection } deploymentID: 88A80D96D80B30B6F48E3FF9A0B318 eventID: bdc34718-163c-56c0-3c7b-7d51380a258e experienceID: dfbde5b8-eb57-10a3-5ced-3be47f2b8ad2 optInRequired: 3 timestamp: 1583873964 userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841 version: 4 visibility: anonymous,support }
`app.session.Splunk_ML_Toolkit.scheduleExperimentTraining`	Users scheduling model re-training for MLTK Experiments.	{ component: app.session.Splunk_ML_Toolkit.scheduleExperimentTraining data: { app: Splunk_ML_Toolkit experiment_id: 46221dd8661d420aaa988ca7d41821ae experimentType: smart_forecast page: experiments scheduleEnabled: true } deploymentID: 88A80D96D80B30B6F48E3FF9A0B318 eventID: 629db0e3-0db1-0424-5e0d-f7e06e9965fb experienceID: f2c4f65b-a723-88af-875a-73737bbc9061 optInRequired: 3 timestamp: 1584480148 userID: 60749ba2789ec1eee0ada6a0b5680512460559541023017ad6f5b4a3b0172841 version: 3 visibility: anonymous,support }
`col_dimension`	Dimension of the dataset, taken from the model schema. Triggered during `apply`.	{ "col_dimension" : "Multiple columns single-dim input", "command" : "onnx_input_shape" }
`columns`	The number of columns run through the `fit` command.	{ "columns": 10 }
`command`	`fit`, `apply`, or `score`	{ "command":"fit" } { "command":"apply" } { "command":"score" }
`csv_parse_time`	CSV parse time.	{ "csv_parse_time": 0.019296 }
`csv_read_time`	CSV read time.	{ "csv_read_time": 0.019296 }
`csv_render_time`	CSV render time.	{ "csv_render_time" : 0.01162 }
`deployment.app`	Apps installed per Splunk instance.	component: deployment.app data: { enabled: true host: monitoring name: alert_webhook version: 7.0.1 } date: 2018-10-26 deploymentID: 99b6ffd8-2e80-5e3b-905c-8c6f6fd743a0 executionID: F0AE995E8653D768A360E73BE3F544 timestamp: 1540570045 transactionID: 89F7329E-86AD-BBFD-034F-209CB8A06F05 version: 3 visibility: anonymous, support
`df_shape`	Shape of the input data received from the Splunk platform. Triggered during `apply`.	{ "command" : "onnx_input_shape", "dataframe_shape" : "(768, 8)" }
`example_name`	Name of the Showcase example being run.	{ 'example_name': "'Predict Server Power Consumption'" }
`experiment_id`	ID of the `fit` and `apply` run on the Experiments page. All preprocessing steps and final `fit` have the same ID.	{ "experiment_id": "6c47bca2776d4b6cb82685461d918180" }
`fit_time`	Amount of time it took to run the `fit` command.	{ "fit_time": 39.87447 }
`full_punct`	The punct of the data during `fit` or `apply`.	{ "full_punct": [ ...s-s-s[//:::.s-]s"s/-/////.s/."sss"://:/-//@:///-."s"/.s(;sssss)s/.s(,ss)s/...s/."s-ss ] }
`handle_time`	Time for the handler to handle the data.	{ "handle_time": 0.274072 }
`metrics_type`	Type of request sent. Used to differentiate the model upload and model inference call flows. Takes one of two values: `onnx_upload` or `onnx_infer`.	{ "command" : "metrics_type", "metrics_type" : "onnx_upload" }
`model`	Name of the LLM model used under the specified provider when running the `ai` command.	{ "command": "ai", "model":"gpt-4o" }
`modelId`	ID under which the user saves their model.	{ modelId: 56ce5ff2442604580eca0f57f36b5b9c }
`model_upload`	Monitors the model upload process to determine whether the model has been successfully uploaded and is ready for inference.	{ "command": "upload", "metrics_type": "onnx_upload", "model_upload": "1" }
`numColumns`	Total number of columns in the dataset.	{ numColumns: 16 }
`numRows`	Total number of rows (events) in the dataset.	{ numRows: 150 }
`num_fields`	Total number of fields.	{ "num_fields": 4 }
`num_fields_fs`	Number of fields that have the `fs` (Field Selector) prefix.	{ "num_fields_fs": 9 }
`num_fields_PC`	Number of fields that have the `PC` (preprocessed) prefix.	{ "num_fields_PC": 70 }
`num_fields_prefixed`	Total number of preprocessed fields.	{ "num_fields_prefixed": 28 }
`num_fields_RS`	Number of fields that have the `RS` (Robust Scaler) prefix.	{ "num_fields_RS": 17 }
`num_fields_SS`	Number of fields that have the `SS` (Standard Scaler) prefix.	{ "num_fields_SS": 30 }
`num_fields_tfidf`	Number of fields that have used term frequency-inverse document frequency preprocessing.	{ "num_fields_tfidf": 9 }
`onnx_input_shape`	Shape of the input data stored in the ONNX model schema. Triggered during `apply`.	{ "command" : "onnx_input_shape", "onnx_input_shape" : "['unk__16', 8]" }
`onnx_model_size_on_disk`	Total size, in MB, of the model file on disk after encoding. Triggered during model upload.	{ "command": "onnx_model_size_on_disk_mb", "onnx_model_size_on_disk_mb": 0.001977 }
`onnx_upload_time`	Time taken to upload an ONNX model file from the UI. Triggered during model upload.	{ "command": "onnx_model_validate_and_upload", "onnx_upload_time":0.8969220000000001 }
`orig_sourcetype`	The original sourcetype of the machine data.	{ "orig_sourcetype" : "access_combined_wcookie" }
`params`	Optional parameters used in the `fit` step.	{ "params": "{{\"with_std\": \"true\", \"with_mean\": \"true\"}}" }
`params`	Boolean value of `supervise_split_by`. Indicates whether DecisionTreeRegressor is used as part of DensityFunction.	{ "command": " "supervise_split_by": "true" " }
`partialFit`	Whether or not the `fit` is a type of partial fit action.	{ partialFit: True }
`PID`	Process identifier associated with the command.	{ "PID" : 63654 }
`pipeline_stage`	Each preprocessing step on the Experiments page is assigned a number starting from 0. This helps determine the order of the preprocessing steps and length of the pipeline.	{ "pipeline_stage": 0 }
`provider`	Name of the provider used when running the `ai` command.	{ "command": "ai", "provider": "Openai" }
`rows`	The number of rows run through the `fit` command.	{ 'rows': 15627 }
`rows`	The number of rows processed in a given `ai` command request.	{ "command": "ai", "rows":100 }
`rows_processor_time`	Time, in seconds, taken to process the rows during an `ai` command request.	{ "command": "ai", "rows_processor_time":0.7969220000000001 }
`scoringName`	Name of the scoring operation if whitelisted. If name is not whitelisted, logs the hash of the `scoringName`.	scoringName: mean_squared_error
`scoringTimeSec`	Time taken by the scoring operation.	scoringTimeSec: 3.398707
`UUID`	Universally unique identifier associated with command. This is 128-bit and used to keep each `fit`/`apply` unique.	{ "UUID": "7e0828e7-3059-4a43-8419-acc0e81f2f2d" }
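
To make the shape of these records more concrete, the following Python sketch assembles a `fit` usage record from a few of the fields listed above and illustrates the `scoringName` rule (report the name if it is on an approved list, otherwise report a hash). It is an illustration only, not MLTK's implementation: the function names, the contents of the approved list, and the choice of SHA-256 as the hash are all assumptions that this topic does not specify.

```python
import hashlib
import json

# Hypothetical approved list. The actual list that MLTK uses is not given in this topic.
APPROVED_SCORING_NAMES = {"mean_squared_error", "accuracy_score"}


def reported_scoring_name(name: str) -> str:
    """Return the scoring name if approved, otherwise a hash of it.

    SHA-256 is an assumption; this topic does not state which hash is used.
    """
    if name in APPROVED_SCORING_NAMES:
        return name
    return hashlib.sha256(name.encode("utf-8")).hexdigest()


def build_fit_record(algo_name: str, rows: int, columns: int,
                     fit_time: float, params: dict) -> dict:
    """Build a record using field names from the table above (hypothetical helper)."""
    return {
        "command": "fit",
        "algo_name": algo_name,
        "rows": rows,
        "columns": columns,
        "fit_time": fit_time,
        # The `params` example in the table is a serialized string, so serialize here too.
        "params": json.dumps(params),
    }


record = build_fit_record(
    algo_name="StandardScaler",
    rows=15627,
    columns=10,
    fit_time=39.87447,
    params={"with_std": "true", "with_mean": "true"},
)
print(json.dumps(record, indent=2))
print(reported_scoring_name("mean_squared_error"))
```

Note that the record carries only counts, timings, and names such as the algorithm, which matches the aggregated nature of the fields described in the table.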
