Splunk® Machine Learning Toolkit

ML-SPL API Guide

Support Vector Regressor example

You can add custom algorithms to the MLTK app. This example adds the Support Vector Regressor algorithm from scikit-learn. For more information, see the scikit-learn documentation: https://scikit-learn.org/stable/user_guide.html

Add the Support Vector Regressor algorithm

Follow these steps to add the Support Vector Regressor algorithm.

  1. Register the Support Vector Regressor algorithm.
  2. Create the Python file.
  3. Define the class.
  4. Define the init method.
  5. Define the register codecs method.

Register the Support Vector Regressor algorithm

Register the algorithm in algos.conf using one of the following methods.

Register the algorithm using the REST API

Use the following curl command to register using the REST API:

$ curl -k -u admin:<admin pass> https://localhost:8089/servicesNS/nobody/Splunk_ML_Toolkit/configs/conf-algos -d name="SVR"
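
For scripted setups, the same request can be prepared from Python. The following is a minimal sketch using only the standard library. It builds the POST request without sending it. The function name build_register_request and the password "changeme" are placeholders introduced for illustration, not part of the MLTK.

```python
import base64
import urllib.parse
import urllib.request

# Endpoint copied from the curl command above.
ALGOS_ENDPOINT = (
    "https://localhost:8089/servicesNS/nobody/"
    "Splunk_ML_Toolkit/configs/conf-algos"
)


def build_register_request(algo_name, user, password, url=ALGOS_ENDPOINT):
    """Build (but do not send) the POST request that registers an algorithm.

    Hypothetical helper for illustration only.
    """
    # curl -d name="SVR" sends a form-encoded body.
    body = urllib.parse.urlencode({"name": algo_name}).encode("ascii")
    # curl -u user:password sends HTTP Basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Authorization", f"Basic {token}")
    return request


# "changeme" is a placeholder password, not a real credential.
request = build_register_request("SVR", "admin", "changeme")
print(request.get_method(), request.full_url)
print(request.data)
```

To send the request, pass it to urllib.request.urlopen. The -k flag in the curl command skips certificate verification, so a default Python SSL context is likely to reject a self-signed certificate unless verification is relaxed in the same way.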

Register the algorithm manually

Modify or create the algos.conf file located in $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/local/ and add the following stanza to register your algorithm:

 [SVR]

When you register the algorithm with this method, you must restart Splunk.
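
If you prefer to script the manual edit, the following sketch appends the stanza only when it is missing. The helper name ensure_stanza is hypothetical, and the demonstration writes to a temporary directory rather than a real $SPLUNK_HOME. It treats the file as plain text and assumes that a stanza header occupies its own line.

```python
import os
import tempfile


def ensure_stanza(conf_path, stanza_name):
    """Append an empty [stanza_name] stanza to conf_path if it is missing.

    Hypothetical helper. Returns True when the file was changed.
    """
    header = f"[{stanza_name}]"
    existing = ""
    if os.path.exists(conf_path):
        with open(conf_path, "r", encoding="utf-8") as handle:
            existing = handle.read()

    # Leave the file alone if the stanza header is already present.
    if any(line.strip() == header for line in existing.splitlines()):
        return False

    parent = os.path.dirname(conf_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(conf_path, "a", encoding="utf-8") as handle:
        if existing and not existing.endswith("\n"):
            handle.write("\n")
        handle.write(header + "\n")
    return True


# Demonstrate against a throwaway directory, not a real Splunk installation.
demo_dir = tempfile.mkdtemp()
demo_conf = os.path.join(demo_dir, "local", "algos.conf")
first_call_changed = ensure_stanza(demo_conf, "SVR")
second_call_changed = ensure_stanza(demo_conf, "SVR")
print(first_call_changed, second_call_changed)
```

Running the helper twice leaves a single stanza in the file. The restart requirement described above still applies.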

Create the Python file

Create the Python file in the algos folder. For this example, you create $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin/algos/SVR.py.

from sklearn.svm import SVR as _SVR
 
from base import BaseAlgo, RegressorMixin
from util.param_util import convert_params

Define the class

Inherit from both the RegressorMixin and the BaseAlgo. The class name is the name of the algorithm.

When inheriting from multiple classes, list RegressorMixin first. BaseAlgo raises an error for any method that is not implemented, and Python looks up methods in the base classes from left to right. The methods for this example are defined in RegressorMixin, so that class must come before BaseAlgo.
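
The effect of the ordering can be seen with stand-in classes. The following sketch does not use the real MLTK classes: BaseAlgoSketch and RegressorMixinSketch are hypothetical, and the use of NotImplementedError is an assumption about the kind of error raised.

```python
class BaseAlgoSketch:
    """Stand-in for BaseAlgo: methods raise until a subclass provides them."""

    def fit(self, df, options):
        # Assumption: the real base class raises an error of some kind here.
        raise NotImplementedError("fit is not implemented")


class RegressorMixinSketch:
    """Stand-in for RegressorMixin: supplies a working fit method."""

    def fit(self, df, options):
        return "fit handled by mixin"


class MixinFirst(RegressorMixinSketch, BaseAlgoSketch):
    """Mixin listed first, so its fit method is found first."""


class BaseFirst(BaseAlgoSketch, RegressorMixinSketch):
    """Base listed first, so the raising fit method shadows the mixin."""


# The method resolution order shows which class is searched first.
print([cls.__name__ for cls in MixinFirst.__mro__])
print(MixinFirst().fit(None, {}))

try:
    BaseFirst().fit(None, {})
except NotImplementedError as err:
    print("BaseFirst failed:", err)
```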

class SVR(RegressorMixin, BaseAlgo):
    """Predict numeric target variables via scikit-learn's SVR algorithm."""

Define the init method

The __init__ method passes the options from the search to the algorithm:

  • The handle_options method from RegressorMixin checks the options for feature and target variables.
  • RegressorMixin implicitly expects the estimator to be attached as an attribute named estimator.

    def __init__(self, options):
        self.handle_options(options)

        params = options.get('params', {})
        out_params = convert_params(
            params,
            floats=['C', 'gamma'],
            strs=['kernel'],
            ints=['degree'],
        )

        self.estimator = _SVR(**out_params)
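
The parameter conversion step can be pictured with a small stand-in. The following sketch is not the MLTK implementation of convert_params. The function convert_params_sketch is hypothetical and assumes that search options arrive as strings, that each listed name is cast to its declared type, and that unlisted names are rejected.

```python
def convert_params_sketch(params, floats=(), ints=(), strs=()):
    """Cast string-valued parameters to the declared types.

    Hypothetical stand-in for illustration only.
    """
    converters = {}
    for name in floats:
        converters[name] = float
    for name in ints:
        converters[name] = int
    for name in strs:
        converters[name] = str

    converted = {}
    for name, raw_value in params.items():
        if name not in converters:
            # Assumption: unknown parameter names are treated as errors.
            raise ValueError(f"Unexpected parameter: {name}")
        try:
            converted[name] = converters[name](raw_value)
        except ValueError:
            raise ValueError(f"Invalid value for {name}: {raw_value!r}") from None
    return converted


# Values as they might appear in the 'params' entry of the options.
raw_params = {"C": "0.5", "kernel": "rbf", "degree": "3"}
out_params = convert_params_sketch(
    raw_params,
    floats=["C", "gamma"],
    strs=["kernel"],
    ints=["degree"],
)
print(out_params)
```

Parameters that the search does not supply, such as gamma here, are left out of the result, so the estimator falls back to its own defaults for them.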


Define the register codecs method

Save the model so that it can be applied to new data. RegressorMixin already defines the fit and apply methods, but to save the model you must define the register_codecs method.

Register the following two items so that they can be serialized:

  1. The algorithm itself.
  2. The imported scikit-learn SVR class.

    @staticmethod
    def register_codecs():
        from codec.codecs import SimpleObjectCodec
        from codec import codecs_manager
        codecs_manager.add_codec('algos.SVR', 'SVR', SimpleObjectCodec)
        codecs_manager.add_codec('sklearn.svm.classes', 'SVR', SimpleObjectCodec)

Most often, you will not need to use anything outside of the SimpleObjectCodec. If there are circular references or unusual properties to the algorithm, you might need to write your own. A codec defines how to serialize (save) and deserialize (load) Python objects into and from strings.
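
To make the idea of a codec concrete, the following sketch round-trips a small object through a string. It is not the MLTK codec machinery. SimpleObjectCodecSketch, CodecRegistrySketch, and TinyModel are hypothetical, and unlike the real add_codec call, this registry takes the class object directly instead of a module name and class name.

```python
import json


class SimpleObjectCodecSketch:
    """Hypothetical codec that saves an object's attribute dictionary."""

    @classmethod
    def encode(cls, obj):
        return {"attributes": dict(vars(obj))}

    @classmethod
    def decode(cls, target_class, payload):
        # Create the instance without calling __init__, then restore attributes.
        restored = target_class.__new__(target_class)
        restored.__dict__.update(payload["attributes"])
        return restored


class CodecRegistrySketch:
    """Hypothetical registry mapping a class to the codec that handles it."""

    def __init__(self):
        self._entries = {}

    def add_codec(self, target_class, codec):
        key = (target_class.__module__, target_class.__name__)
        self._entries[key] = (target_class, codec)

    def dumps(self, obj):
        key = (type(obj).__module__, type(obj).__name__)
        if key not in self._entries:
            raise KeyError(f"No codec registered for {key}")
        _, codec = self._entries[key]
        return json.dumps({"type": list(key), "payload": codec.encode(obj)})

    def loads(self, text):
        document = json.loads(text)
        target_class, codec = self._entries[tuple(document["type"])]
        return codec.decode(target_class, document["payload"])


class TinyModel:
    """Stand-in for a fitted model with plain numeric attributes."""

    def __init__(self, slope, intercept):
        self.slope = slope
        self.intercept = intercept


registry = CodecRegistrySketch()
registry.add_codec(TinyModel, SimpleObjectCodecSketch)

saved = registry.dumps(TinyModel(2.0, 1.0))   # serialize to a string
loaded = registry.loads(saved)                # rebuild from the string
print(loaded.slope, loaded.intercept)
```

In this sketch, an object whose class has no registered codec cannot be saved, which is one way to read the requirement to register each class involved.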

The following is an example of a custom codec for a subcomponent in the DecisionTreeClassifier algorithm:

from codec.codecs import BaseCodec


class TreeCodec(BaseCodec):
    @classmethod
    def encode(cls, obj):
        import sklearn.tree
        assert type(obj) == sklearn.tree._tree.Tree

        init_args = obj.__reduce__()[1]
        state = obj.__getstate__()

        return {
            '__mlspl_type': [type(obj).__module__, type(obj).__name__],
            'init_args': init_args,
            'state': state
        }

    @classmethod
    def decode(cls, obj):
        import sklearn.tree

        init_args = obj['init_args']
        state = obj['state']

        t = sklearn.tree._tree.Tree(*init_args)
        t.__setstate__(state)

        return t
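
The same encode and decode pattern can be tried without scikit-learn. The following sketch uses a stand-in class, CountingTree, that keeps its constructor arguments and internal state separate in the way the codec above expects. The class, the codec name, and the node values are hypothetical and only illustrate the round trip.

```python
import json


class CountingTree:
    """Stand-in for an object rebuilt from constructor arguments plus state."""

    def __init__(self, n_features):
        self.n_features = n_features
        self.nodes = []

    def __reduce__(self):
        # Second element holds the arguments needed to call the constructor.
        return (CountingTree, (self.n_features,), self.__getstate__())

    def __getstate__(self):
        return {"nodes": list(self.nodes)}

    def __setstate__(self, state):
        self.nodes = list(state["nodes"])


class CountingTreeCodec:
    """Hypothetical codec following the init_args and state pattern."""

    @classmethod
    def encode(cls, obj):
        assert type(obj) is CountingTree
        return {
            "__mlspl_type": [type(obj).__module__, type(obj).__name__],
            "init_args": list(obj.__reduce__()[1]),
            "state": obj.__getstate__(),
        }

    @classmethod
    def decode(cls, obj):
        tree = CountingTree(*obj["init_args"])
        tree.__setstate__(obj["state"])
        return tree


original = CountingTree(3)
original.nodes = ["root", "left", "right"]

# Encode to a JSON string, then decode back into a new object.
text = json.dumps(CountingTreeCodec.encode(original))
restored = CountingTreeCodec.decode(json.loads(text))
print(restored.n_features, restored.nodes)
```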

Then in DecisionTreeClassifier.py, the register_codecs method looks as follows:

    @staticmethod
    def register_codecs():
        from codec.codecs import SimpleObjectCodec, TreeCodec
        from codec import codecs_manager
        codecs_manager.add_codec('algos.DecisionTreeClassifier', 'DecisionTreeClassifier', SimpleObjectCodec)
        codecs_manager.add_codec('sklearn.tree.tree', 'DecisionTreeClassifier', SimpleObjectCodec)
        codecs_manager.add_codec('sklearn.tree._tree', 'Tree', TreeCodec)

End-to-end example

This Support Vector Regressor example covers the following tasks:

  • Using the BaseAlgo and a mixin
  • Converting parameters
  • Using register_codecs

In addition to inheriting from the BaseAlgo class, this example also uses the RegressorMixin class. The mixin already provides the fit and apply methods, so you only need to define the __init__ and register_codecs methods. See the scikit-learn documentation for more details on the Support Vector Regressor (SVR) algorithm.

from sklearn.svm import SVR as _SVR

from base import BaseAlgo, RegressorMixin
from util.param_util import convert_params


class SVR(RegressorMixin, BaseAlgo):

    def __init__(self, options):
        self.handle_options(options)

        params = options.get('params', {})
        out_params = convert_params(
            params,
            floats=['C', 'gamma'],
            strs=['kernel'],
            ints=['degree'],
        )

        self.estimator = _SVR(**out_params)

    @staticmethod
    def register_codecs():
        from codec.codecs import SimpleObjectCodec
        from codec import codecs_manager
        codecs_manager.add_codec('algos.SVR', 'SVR', SimpleObjectCodec)
        codecs_manager.add_codec('sklearn.svm.classes', 'SVR', SimpleObjectCodec)

Last modified on 13 February, 2024

This documentation applies to the following versions of Splunk® Machine Learning Toolkit: 4.4.0, 4.4.1, 4.4.2, 4.5.0, 5.0.0, 5.1.0, 5.2.0, 5.2.1, 5.2.2, 5.3.0, 5.3.1, 5.3.3, 5.4.0, 5.4.1, 5.4.2, 5.5.0

