
Support Vector Regressor example
This example adds scikit-learn's Support Vector Regressor algorithm to the Splunk Machine Learning Toolkit. This Support Vector Regressor example covers the following tasks:
- Using the BaseAlgo class and a mixin
- Converting parameters
- Using register_codecs
In addition to inheriting from the BaseAlgo class, this example uses the RegressorMixin class. The mixin already implements the fit and apply methods, so you only need to define the __init__ and register_codecs methods. See the scikit-learn documentation for more details on the Support Vector Regressor (SVR) algorithm.
Steps
Follow these steps to add the Support Vector Regressor algorithm.
- Register the algorithm in algos.conf using one of the following methods.
- Register the algorithm using the REST API:
$ curl -k -u admin:<admin pass> https://localhost:8089/servicesNS/nobody/Splunk_ML_Toolkit/configs/conf-algos -d name="SVR"
- Register the algorithm manually:
Modify or create the algos.conf file located in $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/local/ and add the following stanza to register your algorithm:
    [SVR]
When you register the algorithm with this method, you must restart Splunk.
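Before restarting, it can help to confirm that the stanza is actually present in the file. The following is a minimal sketch, not part of the toolkit: has_stanza is a hypothetical helper, and it assumes a stanza header is a line that consists only of the name in square brackets and that comment lines start with #.

```python
def has_stanza(conf_text, name):
    """Return True if conf_text contains a [name] stanza header line."""
    header = "[{}]".format(name)
    for line in conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines (assumed to start with '#').
        if not stripped or stripped.startswith("#"):
            continue
        if stripped == header:
            return True
    return False


# Example text standing in for the contents of local/algos.conf.
sample = "# local overrides\n[SVR]\n"
print(has_stanza(sample, "SVR"))
print(has_stanza(sample, "LinearRegression"))
```

To check a real installation, read the file contents from the path given above and pass them as conf_text.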
- Create the Python file in the algos folder. For this example, we create $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin/algos/SVR.py and start it with the following imports.
    from sklearn.svm import SVR as _SVR
    from base import BaseAlgo, RegressorMixin
    from util.param_util import convert_params
- Define the class.
Here we inherit from both RegressorMixin and BaseAlgo. When inheriting from multiple classes, RegressorMixin must come first. BaseAlgo raises errors for methods that are not implemented, and Python looks up methods on the listed base classes from left to right, so listing RegressorMixin first ensures that its implementations are found before BaseAlgo's.
    class SVR(RegressorMixin, BaseAlgo):
        """Predict numeric target variables via scikit-learn's SVR algorithm."""
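The effect of base-class order can be seen with two stand-in classes. This is only an illustrative sketch: BaseAlgoSketch and RegressorMixinSketch are hypothetical stand-ins, not the toolkit's classes, and the use of NotImplementedError is an assumption about what "raise errors" means for BaseAlgo.

```python
class BaseAlgoSketch(object):
    """Stand-in for BaseAlgo: fit raises until a subclass or mixin overrides it."""

    def fit(self, df, options):
        raise NotImplementedError("fit is not implemented")


class RegressorMixinSketch(object):
    """Stand-in for RegressorMixin: supplies a working fit."""

    def fit(self, df, options):
        return "fit from mixin"


class MixinFirst(RegressorMixinSketch, BaseAlgoSketch):
    """Mixin listed first, as in the SVR class above."""


class BaseFirst(BaseAlgoSketch, RegressorMixinSketch):
    """Base listed first: its raising fit shadows the mixin's fit."""


# The method resolution order shows which class is searched first.
print([cls.__name__ for cls in MixinFirst.__mro__])
print(MixinFirst().fit(None, {}))

try:
    BaseFirst().fit(None, {})
except NotImplementedError as exc:
    print("BaseFirst failed:", exc)
```

With the mixin first, the lookup for fit stops at the mixin. With the base first, the lookup stops at the raising method and never reaches the mixin.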
- Define the __init__ method.
- Note we use the RegressorMixin's handle_options method to check for feature & target variables.
- The RegressorMixin also implicitly expects a variable 'estimator' to be attached to self.
    def __init__(self, options):
        self.handle_options(options)

        params = options.get('params', {})
        out_params = convert_params(
            params,
            floats=['C', 'gamma'],
            strs=['kernel'],
            ints=['degree'],
        )

        self.estimator = _SVR(**out_params)
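To make the conversion step more concrete, here is a rough sketch of what a helper like convert_params might do. It is not the toolkit's implementation: convert_params_sketch is a hypothetical stand-in, and it assumes that parameter values arrive as strings and that parameters not listed in any type group are rejected.

```python
def convert_params_sketch(params, floats=(), ints=(), strs=()):
    """Hypothetical stand-in: cast each param to the type it is listed under."""
    casters = {}
    casters.update({name: float for name in floats})
    casters.update({name: int for name in ints})
    casters.update({name: str for name in strs})

    out = {}
    for name, value in params.items():
        # Assumption of this sketch: unknown parameters are an error.
        if name not in casters:
            raise ValueError("Unexpected parameter: {}".format(name))
        try:
            out[name] = casters[name](value)
        except ValueError:
            raise ValueError("Invalid value for {}: {!r}".format(name, value))
    return out


# String values, as they might arrive from a search command.
raw = {"C": "0.5", "kernel": "rbf", "degree": "3"}
typed = convert_params_sketch(
    raw,
    floats=["C", "gamma"],
    strs=["kernel"],
    ints=["degree"],
)
print(typed)
```

The typed dictionary can then be unpacked into the estimator constructor, which is what _SVR(**out_params) does in the __init__ method above. Parameters the user did not supply, such as gamma here, are simply absent, so the estimator's own defaults apply.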
- Define the register_codecs method.
- We would like to save the model so that it can be applied on new data.
- RegressorMixin has already defined the fit and apply methods for us, but to save the model, we must define the register_codecs method.
- Here we add two things to serialize:
- one, the algorithm itself
- two, the imported SVR module
    @staticmethod
    def register_codecs():
        from codec.codecs import SimpleObjectCodec
        from codec import codecs_manager
        codecs_manager.add_codec('algos.SVR', 'SVR', SimpleObjectCodec)
        codecs_manager.add_codec('sklearn.svm.classes', 'SVR', SimpleObjectCodec)
Most often, you will not need anything other than the SimpleObjectCodec, but if the algorithm has circular references or unusual properties, you may need to write your own codec. Writing your own codec sounds harder than it really is. A codec defines how to serialize (save) and deserialize (load) Python objects into and from strings. Here is an example of a custom codec needed for a subcomponent in the DecisionTreeClassifier algorithm.
    from codec.codecs import BaseCodec

    class TreeCodec(BaseCodec):
        @classmethod
        def encode(cls, obj):
            import sklearn.tree
            assert type(obj) == sklearn.tree._tree.Tree

            # Constructor arguments and internal state needed to rebuild the tree.
            init_args = obj.__reduce__()[1]
            state = obj.__getstate__()

            return {
                '__mlspl_type': [type(obj).__module__, type(obj).__name__],
                'init_args': init_args,
                'state': state
            }

        @classmethod
        def decode(cls, obj):
            import sklearn.tree

            init_args = obj['init_args']
            state = obj['state']

            # Rebuild an empty tree, then restore its saved state.
            t = sklearn.tree._tree.Tree(*init_args)
            t.__setstate__(state)

            return t
So then in DecisionTreeClassifier.py, the register_codecs method looks like this:
    @staticmethod
    def register_codecs():
        from codec.codecs import SimpleObjectCodec, TreeCodec
        from codec import codecs_manager
        codecs_manager.add_codec('algos.DecisionTreeClassifier', 'DecisionTreeClassifier', SimpleObjectCodec)
        codecs_manager.add_codec('sklearn.tree.tree', 'DecisionTreeClassifier', SimpleObjectCodec)
        codecs_manager.add_codec('sklearn.tree._tree', 'Tree', TreeCodec)
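The encode, register, and decode cycle can be tried outside Splunk with a toy object. The sketch below is illustrative only: Interval, IntervalCodec, registry, dumps, and loads are hypothetical names, a plain dictionary plays the role of codecs_manager, and JSON is assumed as the string format. The toolkit's real manager and codecs may work differently.

```python
import json


class Interval(object):
    """Toy object standing in for a model component."""

    def __init__(self, low, high):
        self.low = low
        self.high = high


class IntervalCodec(object):
    """Hypothetical codec with the same encode/decode shape as TreeCodec."""

    @classmethod
    def encode(cls, obj):
        assert type(obj) == Interval
        return {
            '__mlspl_type': [type(obj).__module__, type(obj).__name__],
            'init_args': [obj.low, obj.high],
        }

    @classmethod
    def decode(cls, obj):
        return Interval(*obj['init_args'])


# A plain dict keyed by (module, class name) stands in for codecs_manager.
registry = {(Interval.__module__, 'Interval'): IntervalCodec}


def dumps(obj):
    """Look up the codec for obj's type and serialize its encoding to a string."""
    codec = registry[(type(obj).__module__, type(obj).__name__)]
    return json.dumps(codec.encode(obj))


def loads(text):
    """Read the type tag from the payload and decode with the matching codec."""
    payload = json.loads(text)
    module, name = payload['__mlspl_type']
    return registry[(module, name)].decode(payload)


restored = loads(dumps(Interval(1, 5)))
print(restored.low, restored.high)
```

The type tag stored under '__mlspl_type' is what lets the loader pick the right codec, which is why each add_codec call above names both a module path and a class name.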
Finished example
    from sklearn.svm import SVR as _SVR

    from base import BaseAlgo, RegressorMixin
    from util.param_util import convert_params


    class SVR(RegressorMixin, BaseAlgo):

        def __init__(self, options):
            self.handle_options(options)

            params = options.get('params', {})
            out_params = convert_params(
                params,
                floats=['C', 'gamma'],
                strs=['kernel'],
                ints=['degree'],
            )

            self.estimator = _SVR(**out_params)

        @staticmethod
        def register_codecs():
            from codec.codecs import SimpleObjectCodec
            from codec import codecs_manager
            codecs_manager.add_codec('algos.SVR', 'SVR', SimpleObjectCodec)
            codecs_manager.add_codec('sklearn.svm.classes', 'SVR', SimpleObjectCodec)
This documentation applies to the following versions of Splunk® Machine Learning Toolkit: 4.4.0, 4.4.1, 4.4.2, 4.5.0, 5.0.0, 5.1.0, 5.2.0, 5.2.1, 5.2.2, 5.3.0, 5.3.1, 5.3.3, 5.4.0, 5.4.1