Splunk® Machine Learning Toolkit

User Guide

This documentation does not apply to the most recent version of Splunk® Machine Learning Toolkit. For documentation on the most recent version, go to the latest release.

What's new

Here's what's new in each version of the Splunk Machine Learning Toolkit:

Version 2.4.0

Features and improvements

Version 2.3.0

Features and improvements

  • Entries in the "Load Existing Settings" tab are now unique to each user instead of being shared among all users. Entries created before version 2.3 remain accessible to all users.
  • Two new algorithms have been added:
    • ACF (autocorrelation function)
    • PACF (partial autocorrelation function)
  • The Forecast Time Series assistant now allows for the selection of the ARIMA forecasting algorithm. Additional panels have been added for inspecting properties unique to ARIMA models.
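
For readers new to the ACF algorithm listed above, the sketch below computes the sample autocorrelation of a series in plain Python. It only illustrates the statistic and is not the Toolkit's implementation. The function name `autocorrelation` and its parameters are hypothetical.

```python
def autocorrelation(series, max_lag):
    """Return sample autocorrelations for lags 0..max_lag.

    Illustrative only: the function name and signature are assumptions,
    not part of the Machine Learning Toolkit.
    """
    n = len(series)
    if n == 0 or max_lag >= n:
        raise ValueError("need a non-empty series longer than max_lag")

    mean = sum(series) / n
    centered = [x - mean for x in series]

    # Sum of squared deviations; the lag-0 value normalizes to exactly 1.
    variance_sum = sum(c * c for c in centered)
    if variance_sum == 0:
        raise ValueError("autocorrelation is undefined for a constant series")

    # For each lag, correlate the series with a copy of itself shifted by `lag`.
    return [
        sum(centered[t] * centered[t - lag] for t in range(lag, n)) / variance_sum
        for lag in range(max_lag + 1)
    ]


values = autocorrelation([1, 2, 3, 4, 5], max_lag=2)
print([round(v, 3) for v in values])
```

PACF extends the same idea by removing the influence of shorter lags before measuring the correlation at each lag.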

Version 2.2.1

This version contains bug fixes. See Fixed issues for details.

Version 2.2.0

Features and improvements

  • The preprocessing feature has been redesigned and is offered in the Predict Numeric Fields, Predict Categorical Fields, and Cluster Numeric Events assistants. See Preprocessing for information.
  • The ML-SPL API has been updated to make it easier for developers and partners to import custom algorithms in order to extend the capabilities of the Splunk Machine Learning Toolkit. See ML-SPL API Guide for information.
  • A new video overview of what's new in versions 2.1.0 and 2.2.0 of the Splunk Machine Learning Toolkit is available at http://tiny.cc/splunkmlupdate.

Version 2.1.0

Features and improvements

Enhancements to the Detect Numeric Outliers assistant:

  • You can now specify up to five fields to split by. In visualizations, the values of the field you are analyzing are then grouped by the values of the split by fields.
  • Visualizations have been enhanced and include a new Data Distribution histogram that shows the number of data points within the threshold and the number of data points outside the threshold.
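
The grouping described above can be sketched in plain Python. The sketch assumes a threshold of the mean plus or minus a multiple of the standard deviation, which is only one possible method. The function name `count_outliers_by_group`, its parameters, and the sample field names are hypothetical, and this is not the assistant's actual search.

```python
from collections import defaultdict
from statistics import mean, pstdev


def count_outliers_by_group(events, value_field, split_by, multiplier=2.0):
    """Count points inside and outside a per-group threshold.

    Assumption: the threshold is mean +/- multiplier * standard deviation,
    computed separately for each combination of split by field values.
    """
    if not 1 <= len(split_by) <= 5:
        raise ValueError("specify between one and five split by fields")

    # Group the analyzed values by the tuple of split by field values.
    groups = defaultdict(list)
    for event in events:
        key = tuple(event[field] for field in split_by)
        groups[key].append(event[value_field])

    summary = {}
    for key, values in groups.items():
        center = mean(values)
        spread = pstdev(values)
        lower = center - multiplier * spread
        upper = center + multiplier * spread
        inside = sum(1 for v in values if lower <= v <= upper)
        summary[key] = {"inside": inside, "outside": len(values) - inside}
    return summary


events = [{"host": "a", "latency": 10} for _ in range(9)]
events.append({"host": "a", "latency": 100})
events += [{"host": "b", "latency": 5} for _ in range(3)]
print(count_outliers_by_group(events, "latency", ["host"]))
```

The per-group inside and outside counts are the kind of figures a distribution histogram would display for each split by value.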


For more information, see Detect Numeric Outliers.

For bug fixes, see Fixed issues.

Version 2.0.1

The Downsampled Line Chart custom visualization now supports the same drilldown actions as the built-in Line Chart visualization.

For bug fixes, see Fixed issues.

Version 2.0.0

Features and improvements

  • The app has been renamed to "Machine Learning Toolkit."
  • A new Cluster Numeric Events assistant steps you through clustering your own data. This assistant can preprocess data by applying the StandardScaler, PCA, or KernelPCA methods. See Cluster Numeric Events.
  • Updated examples for the Cluster Numeric Events showcase.
  • A streaming_apply setting has been added to the mlspl.conf file, which allows you to run the apply command on your indexers. For details, see Use your indexers to apply models.
  • The Predict Numeric Fields and Predict Categorical Fields assistants now support multiple algorithms.
  • A new visualization type has been added: Scatterplot matrix. This visualization is available in the Cluster Numeric Events assistant.
  • The Machine Learning Toolkit app has a walk-through tour and each assistant has its own walk-through tour.
  • A link to machine learning video tutorials has been added to the top menu bar and the Showcase page.
  • Tooltips have been added for the fields in each of the assistants.

Algorithms

  • The SGDClassifier algorithm is now supported. For details, see Algorithms.
  • The SGDRegressor algorithm is now supported. For details, see Algorithms.
  • The ARIMA algorithm is now supported. For details, see Algorithms.
  • The LogisticRegression algorithm supports a new parameter probabilities=<true|false>. For details, see Algorithms.
  • Summary support has been added to the RandomForestClassifier and RandomForestRegressor algorithms. For details, see Algorithms.
  • The BernoulliNB, GaussianNB, Birch, and StandardScaler algorithms support a new parameter partial_fit=<true|false>. For details, see Algorithms.

Version 1.3.0

Features and improvements

  • You can now create alerts within the Machine Learning Toolkit from some of the panels in the assistants. Alerts can be viewed under Scheduled Jobs > Alerts.
  • You can now schedule model training in the Predict Numeric Fields and Predict Categorical Fields assistants by clicking the icon on the right side of the Fit Model button. Schedules can be viewed under Scheduled Jobs > Scheduled Training.
  • The Training/Test split can now be set to a 100/0 split (no split).

Version 1.2.0

Features and improvements

  • The DecisionTreeClassifier and DecisionTreeRegressor algorithms are now supported. For details, see Algorithms.
  • The Detect Numeric Outliers assistant now includes an Include current point checkbox to support the "current" parameter of the streamstats command.
  • The Predict Numeric Fields assistant has an improved Actual vs. Predicted Line Chart, which replaces the Actual vs. Predicted Overlay.
  • Two macros in the Forecast Time Series assistant have been merged into one macro.
  • The max_features parameter of the RandomForestClassifier and RandomForestRegressor algorithms now accepts values with the float data type.
  • The Remove from history confirmation dialog box has been improved.
  • A basic framework has been implemented for displaying Bootstrap's modal dialog boxes in the Machine Learning Toolkit and Showcase UI.
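
To show why the Include current point checkbox mentioned above matters, the sketch below computes a sliding-window mean that either includes or excludes the point being evaluated. It is a simplified model written for illustration, not the streamstats command itself. The function name `rolling_mean` and the assumption that the window covers the most recent points are hypothetical.

```python
def rolling_mean(values, window, current=True):
    """Mean of up to `window` recent points for each position.

    Assumption: with current=True the window ends at the point itself;
    with current=False it ends just before the point. The first result
    is None when current=False because no earlier points exist.
    """
    results = []
    for i in range(len(values)):
        end = i + 1 if current else i
        start = max(0, end - window)
        chunk = values[start:end]
        results.append(sum(chunk) / len(chunk) if chunk else None)
    return results


data = [1, 2, 3, 100]
print(rolling_mean(data, window=3, current=True))
print(rolling_mean(data, window=3, current=False))
```

In this model, excluding the current point keeps an extreme value from inflating the baseline it is compared against, which can make the outlier easier to detect.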

Version 1.1.0

Features and improvements

  • The visualizations in the Cluster Events showcase have been updated.
  • The Predict Numeric Fields and Predict Categorical Fields assistants now allow you to enter wildcards in Fields to use for predicting. For example, to specify both the Packets Received and Packets Sent fields, enter "Packets*". Wildcards are case sensitive.
  • The Select All and Select None buttons on the Predict Numeric Fields and Predict Categorical Fields assistants have been moved inside the dropdown list.
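
The case-sensitive wildcard behavior described above can be modeled with Python's standard `fnmatch.fnmatchcase` function. This is a sketch of the matching rule only. The helper name `expand_field_patterns` is hypothetical, and `fnmatchcase` also accepts `?` and `[...]` patterns that the assistants may not support.

```python
from fnmatch import fnmatchcase


def expand_field_patterns(patterns, available_fields):
    """Return the available fields that match any pattern, case sensitively.

    Illustrative helper; not part of the Machine Learning Toolkit.
    """
    return [
        field
        for field in available_fields
        if any(fnmatchcase(field, pattern) for pattern in patterns)
    ]


fields = ["Packets Received", "Packets Sent", "packets dropped", "Bytes"]
print(expand_field_patterns(["Packets*"], fields))
```

Because matching is case sensitive, the pattern "Packets*" selects Packets Received and Packets Sent but not a field named "packets dropped".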


Algorithms

  • The KernelRidge regression algorithm is now supported. For details, see Algorithms.


Bug fixes

The following bugs were fixed. For details, see Fixed issues.

  • Changing the time range or search mode on assistant search bars will now re-run the search in the search bar, the same as the default Search page in Splunk Enterprise.
  • Custom visualizations will now display time stamps correctly when the event time differs from browser time.
  • Caching issues have been fixed, and the app no longer loads old versions of resources after an update.
  • Exit points in assistants now correctly have the same time range as that assistant's search bar.

Version 1.0.0

This is the first release of the Machine Learning Toolkit and Showcase app.

Last modified on 01 September, 2017

This documentation applies to the following versions of Splunk® Machine Learning Toolkit: 2.4.0

