Support Vector Regressor example
You can add custom algorithms to the MLTK app. This example adds the Support Vector Regressor algorithm from scikit-learn. For more information, see the scikit-learn documentation: https://scikit-learn.org/stable/user_guide.html
Add the Support Vector Regressor algorithm
Follow these steps to add the Support Vector Regressor algorithm.
- Register the Support Vector Regressor algorithm.
- Create the Python file.
- Define the class.
- Define the init method.
- Define the register codecs method.
Register the Support Vector Regressor algorithm
Register the algorithm in algos.conf using one of the following methods.
Register the algorithm using the REST API
Use the following curl command to register using the REST API:
$ curl -k -u admin:<admin pass> https://localhost:8089/servicesNS/nobody/Splunk_ML_Toolkit/configs/conf-algos -d name="SVR"
Register the algorithm manually
Modify or create the algos.conf file located in $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/local/ and add the following stanza to register your algorithm:
[SVR]
When you register the algorithm with this method, you must restart Splunk.
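The manual step can also be scripted. The following is a minimal sketch, assuming that algos.conf can be treated as plain text and that an empty stanza is enough to register the algorithm, as described above. The helper name ensure_stanza is hypothetical and is not part of Splunk or the MLTK.

```python
import os


def ensure_stanza(conf_path, name):
    """Append an empty [name] stanza to conf_path unless it already exists.

    Hypothetical helper for illustration; not part of Splunk or the MLTK.
    Returns True if the stanza was added, False if it was already present.
    """
    header = "[{}]".format(name)
    lines = []
    if os.path.exists(conf_path):
        with open(conf_path, "r") as f:
            lines = f.read().splitlines()
    # Stanza headers occupy a whole line, so compare stripped lines.
    if any(line.strip() == header for line in lines):
        return False
    # Create the local/ directory if it does not exist yet.
    directory = os.path.dirname(conf_path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(conf_path, "a") as f:
        # Keep a blank line between stanzas when the file has content.
        if lines:
            f.write("\n")
        f.write(header + "\n")
    return True
```

Calling ensure_stanza with the path to local/algos.conf and the name "SVR" adds the stanza once and leaves the file unchanged on later calls. A restart is still required after the file changes.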
Create the Python file
Create the Python file in the algos folder. For this example, you create $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin/algos/SVR.py.
    from sklearn.svm import SVR as _SVR

    from base import BaseAlgo, RegressorMixin
    from util.param_util import convert_params
Define the class
Inherit from both the RegressorMixin and the BaseAlgo. The class name is the name of the algorithm.
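When inheriting from multiple classes, make sure RegressorMixin comes first. BaseAlgo raises errors if a method is not implemented, and Python looks up methods in the listed base classes from left to right. In this case, the methods are defined in RegressorMixin, so you must list that class first for its implementations to be found before the BaseAlgo versions.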
    class SVR(RegressorMixin, BaseAlgo):
        """Predict numeric target variables via scikit-learn's SVR algorithm."""
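The effect of base class order can be seen in isolation. The following sketch uses two stand-in classes, BaseAlgoSketch and RegressorMixinSketch, which are hypothetical simplifications and not the actual MLTK classes. It assumes only what the text above states: the base class raises for unimplemented methods and the mixin supplies working ones.

```python
class BaseAlgoSketch(object):
    """Stand-in for BaseAlgo: fit raises until a subclass overrides it."""

    def fit(self, df, options):
        raise NotImplementedError("fit is not implemented")


class RegressorMixinSketch(object):
    """Stand-in for RegressorMixin: supplies a working fit method."""

    def fit(self, df, options):
        return "fit from mixin"


class MixinFirst(RegressorMixinSketch, BaseAlgoSketch):
    """Mixin listed first, so its fit is found first."""


class BaseFirst(BaseAlgoSketch, RegressorMixinSketch):
    """Base listed first, so the raising fit shadows the mixin."""


# The method resolution order shows which class is searched first.
print([c.__name__ for c in MixinFirst.__mro__])
print(MixinFirst().fit(None, None))

try:
    BaseFirst().fit(None, None)
except NotImplementedError as exc:
    print("BaseFirst failed:", exc)
```

With the mixin listed first, the call reaches the mixin's fit. With the order reversed, the same call reaches the raising method in the base class.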
Define the init method
The __init__ method passes the options from the search to the algorithm:
- The RegressorMixin handle_options method is used to check for feature and target variables.
- The RegressorMixin implicitly expects an attribute named estimator to be attached.
    def __init__(self, options):
        self.handle_options(options)

        params = options.get('params', {})
        out_params = convert_params(
            params,
            floats=['C', 'gamma'],
            strs=['kernel'],
            ints=['degree'],
        )

        self.estimator = _SVR(**out_params)
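The parameter conversion step can be illustrated without the MLTK installed. The following sketch assumes that search options arrive as strings and must be cast before they reach the estimator. The function convert_params_sketch is a hypothetical, simplified stand-in for util.param_util.convert_params, and the real function might behave differently.

```python
def convert_params_sketch(params, floats=(), ints=(), strs=()):
    """Cast string-valued search options to the types an estimator expects.

    Hypothetical, simplified stand-in for util.param_util.convert_params.
    """
    # Map each allowed parameter name to the type it is cast to.
    casters = {}
    casters.update({name: float for name in floats})
    casters.update({name: int for name in ints})
    casters.update({name: str for name in strs})

    out_params = {}
    for name, value in params.items():
        if name not in casters:
            raise RuntimeError("Unexpected parameter: {}".format(name))
        try:
            out_params[name] = casters[name](value)
        except ValueError:
            raise RuntimeError(
                "Invalid value for {}: {!r}".format(name, value)
            )
    return out_params


# Options as they might arrive from a search, all as strings.
search_params = {"C": "0.5", "kernel": "rbf", "degree": "3"}
out_params = convert_params_sketch(
    search_params, floats=["C", "gamma"], strs=["kernel"], ints=["degree"]
)
print(out_params)
```

In this sketch, gamma is allowed but not supplied, so it is left out of out_params and the estimator would use its own default.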
Define the register codecs method
Save the model so that it can be applied on new data. RegressorMixin has already defined the fit and apply methods, but to save, you must define the register_codecs method.
Add the following 2 items to serialize:
- The algorithm itself.
- The imported SVR module.
    @staticmethod
    def register_codecs():
        from codec.codecs import SimpleObjectCodec
        from codec import codecs_manager
        codecs_manager.add_codec('algos.SVR', 'SVR', SimpleObjectCodec)
        codecs_manager.add_codec('sklearn.svm.classes', 'SVR', SimpleObjectCodec)
Most often, you will not need to use anything outside of the SimpleObjectCodec. If there are circular references or unusual properties to the algorithm, you might need to write your own. A codec defines how to serialize (save) and deserialize (load) Python objects into and from strings.
The following is an example of a custom codec for a subcomponent in the DecisionTreeClassifier algorithm:
    from codec.codecs import BaseCodec


    class TreeCodec(BaseCodec):
        @classmethod
        def encode(cls, obj):
            import sklearn.tree
            assert type(obj) == sklearn.tree._tree.Tree

            init_args = obj.__reduce__()[1]
            state = obj.__getstate__()

            return {
                '__mlspl_type': [type(obj).__module__, type(obj).__name__],
                'init_args': init_args,
                'state': state
            }

        @classmethod
        def decode(cls, obj):
            import sklearn.tree

            init_args = obj['init_args']
            state = obj['state']

            t = sklearn.tree._tree.Tree(*init_args)
            t.__setstate__(state)

            return t
Then in DecisionTreeClassifier.py, the register_codecs method looks as follows:
    @staticmethod
    def register_codecs():
        from codec.codecs import SimpleObjectCodec, TreeCodec
        from codec import codecs_manager
        codecs_manager.add_codec('algos.DecisionTreeClassifier', 'DecisionTreeClassifier', SimpleObjectCodec)
        codecs_manager.add_codec('sklearn.tree.tree', 'DecisionTreeClassifier', SimpleObjectCodec)
        codecs_manager.add_codec('sklearn.tree._tree', 'Tree', TreeCodec)
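The relationship between a codec and its registration can be sketched with the standard library alone. The following is an illustration only, under the assumption that a codec manager looks up a codec by module name and class name. The names CodecRegistrySketch, AttributeCodecSketch, and TinyModel are hypothetical and are not the MLTK implementation.

```python
import json


class AttributeCodecSketch(object):
    """Encode an object as its attribute dict plus a type tag."""

    def __init__(self, klass):
        self.klass = klass

    def encode(self, obj):
        return {
            "__type": [self.klass.__module__, self.klass.__name__],
            "attrs": dict(vars(obj)),
        }

    def decode(self, payload):
        # Bypass __init__ and restore the saved attributes directly.
        obj = self.klass.__new__(self.klass)
        vars(obj).update(payload["attrs"])
        return obj


class CodecRegistrySketch(object):
    """Look up codecs by (module name, class name), then save or load."""

    def __init__(self):
        self._codecs = {}

    def add_codec(self, module_name, class_name, codec):
        self._codecs[(module_name, class_name)] = codec

    def dumps(self, obj):
        key = (type(obj).__module__, type(obj).__name__)
        # An unregistered type raises KeyError and cannot be saved.
        return json.dumps(self._codecs[key].encode(obj))

    def loads(self, text):
        payload = json.loads(text)
        return self._codecs[tuple(payload["__type"])].decode(payload)


class TinyModel(object):
    """Toy object standing in for a fitted estimator."""

    def __init__(self, coef, intercept):
        self.coef = coef
        self.intercept = intercept


registry = CodecRegistrySketch()
registry.add_codec(
    TinyModel.__module__, "TinyModel", AttributeCodecSketch(TinyModel)
)

saved = registry.dumps(TinyModel([0.5, 2.0], 1.0))  # object to string
restored = registry.loads(saved)                     # string to object
print(restored.coef, restored.intercept)
```

In this sketch, an object whose type has no registered codec cannot be saved, which suggests why the examples above register a codec for each class that appears in the saved model.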
End-to-end example
This Support Vector Regressor example covers the following tasks:
- Using the BaseAlgo and a mixin
- Converting parameters
- Using register_codecs
In addition to inheriting from the BaseAlgo class, this example also uses the RegressorMixin class. The mixin already implements the fit and apply methods, so you only need to define the __init__ and register_codecs methods. See the scikit-learn documentation for more details on the Support Vector Regressor (SVR) algorithm.
    from sklearn.svm import SVR as _SVR

    from base import BaseAlgo, RegressorMixin
    from util.param_util import convert_params


    class SVR(RegressorMixin, BaseAlgo):

        def __init__(self, options):
            self.handle_options(options)

            params = options.get('params', {})
            out_params = convert_params(
                params,
                floats=['C', 'gamma'],
                strs=['kernel'],
                ints=['degree'],
            )

            self.estimator = _SVR(**out_params)

        @staticmethod
        def register_codecs():
            from codec.codecs import SimpleObjectCodec
            from codec import codecs_manager
            codecs_manager.add_codec('algos.SVR', 'SVR', SimpleObjectCodec)
            codecs_manager.add_codec('sklearn.svm.classes', 'SVR', SimpleObjectCodec)
This documentation applies to the following versions of Splunk® Machine Learning Toolkit: 4.4.0, 4.4.1, 4.4.2, 4.5.0, 5.0.0, 5.1.0, 5.2.0, 5.2.1, 5.2.2, 5.3.0, 5.3.1, 5.3.3, 5.4.0, 5.4.1, 5.4.2, 5.5.0