Splunk® Machine Learning Toolkit

User Guide

This documentation does not apply to the most recent version of Splunk® Machine Learning Toolkit. For documentation on the most recent version, go to the latest release.
Configure the fit and apply commands

You can configure the fit and apply commands by setting properties in the mlspl.conf configuration file located in the default directory:

$SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/default/mlspl.conf

In this file, you can specify default settings for all algorithms or for an individual algorithm. To apply global settings, use the [default] stanza. To apply algorithm-specific settings, use a stanza named for the algorithm, for example, [LinearRegression] for the LinearRegression algorithm. Be aware that not all global settings can be set or overridden in an algorithm-specific stanza.

To avoid losing your configuration file changes when you upgrade the app, create a copy of the mlspl.conf file that contains only the modified stanzas and settings, then save it to $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/local/. For details, see How to copy and edit a configuration file.
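The stanza layout described above can be sketched with Python's standard configparser module. This is only an illustration of how an algorithm stanza falls back to the [default] stanza, not the toolkit's own loader. The file contents, the values, and the helper name effective_settings are hypothetical, and the sketch assumes that max_fit_time is one of the settings that can be overridden per algorithm, which this topic does not confirm.

```python
import configparser

# Hypothetical contents of a local/mlspl.conf copy that holds only the
# modified stanzas. The values are illustrative, not recommendations.
LOCAL_MLSPL_CONF = """
[default]
max_inputs = 200000
use_sampling = true

[LinearRegression]
max_fit_time = 1200
"""


def effective_settings(conf_text, algorithm):
    """Return the settings for `algorithm`, falling back to [default].

    Assumption: algorithm stanzas inherit from the [default] stanza.
    Values are returned as strings, exactly as written in the file.
    """
    # Treat [default] as the fallback section for every other stanza.
    parser = configparser.ConfigParser(default_section="default")
    parser.read_string(conf_text)
    if parser.has_section(algorithm):
        # A section view includes its own keys plus the [default] keys.
        return dict(parser[algorithm])
    return dict(parser.defaults())


settings = effective_settings(LOCAL_MLSPL_CONF, "LinearRegression")
print(settings)
```

An algorithm without its own stanza, such as a hypothetical lookup for "KMeans", receives only the [default] values in this sketch.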

Setting: max_inputs
Default: 100000
Description: The maximum number of events an algorithm considers when fitting a model. If this limit is exceeded and use_sampling is true, the fit command downsamples its input using the Reservoir Sampling algorithm before fitting a model. If use_sampling is false and this limit is exceeded, the fit command throws an error.

Setting: use_sampling
Default: true
Description: Indicates whether to use Reservoir Sampling for data sets that exceed max_inputs or to throw an error instead.

Setting: max_fit_time
Default: 600
Description: The maximum time, in seconds, to spend in the "fit" phase of an algorithm. This setting does not relate to the other phases of a search, such as retrieving events from an index.

Setting: max_memory_usage_mb
Default: 1000
Description: The maximum allowed memory usage, in megabytes, by the fit command while fitting a model.

Setting: max_model_size_mb
Default: 15
Description: The maximum allowed size of a model, in megabytes, created by the fit command. Some algorithms (for example, SVM and RandomForest) might create unusually large models, which can lead to performance problems with bundle replication.
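
The interaction between max_inputs and use_sampling can be sketched as follows. This is a minimal illustration of classic reservoir sampling (Algorithm R) under the rules stated above, not the fit command's actual implementation. The function name limit_inputs, the seed parameter, and the use of ValueError are assumptions made for the example.

```python
import random


def limit_inputs(events, max_inputs=100000, use_sampling=True, seed=None):
    """Cap an event stream at max_inputs events.

    Hypothetical helper: if the stream exceeds max_inputs, either
    downsample uniformly with reservoir sampling (use_sampling=True)
    or raise an error (use_sampling=False).
    """
    rng = random.Random(seed)
    reservoir = []
    for i, event in enumerate(events):
        if i < max_inputs:
            # Fill the reservoir with the first max_inputs events.
            reservoir.append(event)
        elif not use_sampling:
            raise ValueError(
                "Input exceeds max_inputs and use_sampling is false"
            )
        else:
            # Keep event i with probability max_inputs / (i + 1) by
            # replacing a uniformly chosen slot in the reservoir.
            j = rng.randint(0, i)  # inclusive on both ends
            if j < max_inputs:
                reservoir[j] = event
    return reservoir


sample = limit_inputs(range(1000), max_inputs=10, seed=0)
print(len(sample))
```

Because the reservoir never grows beyond max_inputs entries, the sketch processes the stream in a single pass without holding the full input in memory.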
Last modified on 07 December, 2017

This documentation applies to the following versions of Splunk® Machine Learning Toolkit: 2.3.0, 2.4.0, 3.0.0

