Support Vector Regressor
This example covers the following tasks:
- using BaseAlgo and a mixin class
- converting parameters
- using the register_codecs method
In this example, you will add scikit-learn's Support Vector Regressor algorithm to the Splunk Machine Learning Toolkit. See the scikit-learn documentation for details on the Support Vector Regressor class.
The custom algorithm inherits from two classes: BaseAlgo and RegressorMixin. The mixin already implements the fit and apply methods, so you only need to define the __init__ and register_codecs methods.
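To see how a mixin can supply methods that a base class leaves unimplemented, consider the minimal sketch below. The classes are hypothetical stand-ins, not the toolkit's BaseAlgo or RegressorMixin, and the method signatures are simplified assumptions. The sketch relies only on Python's left-to-right search of base classes.

```python
# Stand-in classes for illustration only; these are NOT the toolkit's
# BaseAlgo or RegressorMixin, and the signatures are assumed.
class BaseAlgoSketch(object):
    """Base class whose methods raise until something overrides them."""

    def fit(self, data):
        raise NotImplementedError("fit is not implemented")

    def apply(self, data):
        raise NotImplementedError("apply is not implemented")


class RegressorMixinSketch(object):
    """Mixin that supplies fit and apply, relying on self.estimator."""

    def fit(self, data):
        return "fit handled by mixin using %s" % self.estimator

    def apply(self, data):
        return "apply handled by mixin using %s" % self.estimator


class MixinFirst(RegressorMixinSketch, BaseAlgoSketch):
    def __init__(self):
        self.estimator = "estimator"


class BaseFirst(BaseAlgoSketch, RegressorMixinSketch):
    def __init__(self):
        self.estimator = "estimator"


# Python searches base classes left to right, so the mixin's fit wins here.
mixin_first_result = MixinFirst().fit(None)

# With the order reversed, the base class's raising fit is found first.
try:
    BaseFirst().fit(None)
    base_first_raised = False
except NotImplementedError:
    base_first_raised = True
```

In this sketch the subclass only sets the estimator attribute. Everything else comes from the mixin, provided the mixin is listed before the base class.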
This example uses the ML-SPL API available in the Splunk Machine Learning Toolkit version 2.2.0 and later. Verify your Splunk Machine Learning Toolkit version before using this example.
Steps
Do the following:
- Register the algorithm in __init__.py.
  Modify the __init__.py file located in $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin/algos to register your algorithm by adding it to the list:

      __all__ = [ "SVR", "LinearRegression", "Lasso", ... ]
- Create the Python file in the algos folder. For this example, create $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin/algos/SVR.py.

      from sklearn.svm import SVR as _SVR

      from base import BaseAlgo, RegressorMixin
      from util.param_util import convert_params
- Define the class.
  Inherit from both RegressorMixin and BaseAlgo. When inheriting from multiple classes here, make sure RegressorMixin comes first. BaseAlgo raises errors if a method is not implemented. In this case, the methods are defined in RegressorMixin, so you must list that class first.

      class SVR(RegressorMixin, BaseAlgo):
          """Predict numeric target variables via scikit-learn's SVR algorithm."""
- Define the __init__ method.
  - Use the RegressorMixin's handle_options method to check for feature and target variables.
  - The RegressorMixin also implicitly expects the estimator attribute to be set on self.

      def __init__(self, options):
          self.handle_options(options)

          params = options.get('params', {})
          out_params = convert_params(
              params,
              floats=['C', 'gamma'],
              strs=['kernel'],
              ints=['degree'],
          )

          self.estimator = _SVR(**out_params)
- Define the register_codecs method.
  - To apply the model to new data, you must save the model. RegressorMixin has already defined the fit and apply methods, but to save a model you must define the register_codecs method.
  - Add two things to serialize:
    - the custom algorithm class
    - the imported SVR module

      @staticmethod
      def register_codecs():
          from codec.codecs import SimpleObjectCodec
          from codec import codecs_manager
          codecs_manager.add_codec('algos.SVR', 'SVR', SimpleObjectCodec)
          codecs_manager.add_codec('sklearn.svm.classes', 'SVR', SimpleObjectCodec)
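The two add_codec calls register one codec for the custom algorithm class and one for the scikit-learn estimator it holds. The sketch below illustrates why both might be needed. It is not the toolkit's codecs_manager. It assumes that a codec is looked up by the module and class name of each object encountered while saving, and every name in it is hypothetical.

```python
# Hypothetical stand-in for a codec registry; not the toolkit's codecs_manager.
class CodecRegistrySketch(object):
    def __init__(self):
        self._codecs = {}

    def add_codec(self, module_name, class_name, codec):
        # Key each codec by where the class lives and what it is called.
        self._codecs[(module_name, class_name)] = codec

    def get_codec(self, obj):
        key = (type(obj).__module__, type(obj).__name__)
        if key not in self._codecs:
            raise KeyError("no codec registered for %s.%s" % key)
        return self._codecs[key]


class InnerEstimator(object):
    """Stands in for the scikit-learn estimator."""


class Wrapper(object):
    """Stands in for the custom algorithm class that holds the estimator."""

    def __init__(self):
        self.estimator = InnerEstimator()


registry = CodecRegistrySketch()
registry.add_codec(Wrapper.__module__, "Wrapper", "simple-codec")

wrapper = Wrapper()
wrapper_codec = registry.get_codec(wrapper)  # found: the wrapper is registered

# The nested estimator has no codec yet, so looking it up fails.
try:
    registry.get_codec(wrapper.estimator)
    inner_lookup_failed = False
except KeyError:
    inner_lookup_failed = True

# Registering the inner class as well makes both lookups succeed.
registry.add_codec(InnerEstimator.__module__, "InnerEstimator", "simple-codec")
inner_codec = registry.get_codec(wrapper.estimator)
```

Under this assumption, a class with no registered codec stops the save. That is consistent with the step above asking you to register both the algorithm class and the imported SVR class.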
Most often, you will not need to use anything other than the SimpleObjectCodec. If the algorithm has circular references or unusual properties, you may need to write your own codec. A codec defines how to serialize (save) Python objects into strings and deserialize (load) them from strings. Here is an example of a custom codec needed for a subcomponent in the DecisionTreeClassifier algorithm.
    from codec.codecs import BaseCodec


    class TreeCodec(BaseCodec):
        @classmethod
        def encode(cls, obj):
            import sklearn.tree
            assert type(obj) == sklearn.tree._tree.Tree

            init_args = obj.__reduce__()[1]
            state = obj.__getstate__()

            return {
                '__mlspl_type': [type(obj).__module__, type(obj).__name__],
                'init_args': init_args,
                'state': state
            }

        @classmethod
        def decode(cls, obj):
            import sklearn.tree

            init_args = obj['init_args']
            state = obj['state']

            t = sklearn.tree._tree.Tree(*init_args)
            t.__setstate__(state)

            return t
In DecisionTreeClassifier.py, the register_codecs method is:

    @staticmethod
    def register_codecs():
        from codec.codecs import SimpleObjectCodec, TreeCodec
        codecs_manager.add_codec('algos.DecisionTreeClassifier', 'DecisionTreeClassifier', SimpleObjectCodec)
        codecs_manager.add_codec('sklearn.tree.tree', 'DecisionTreeClassifier', SimpleObjectCodec)
        codecs_manager.add_codec('sklearn.tree._tree', 'Tree', TreeCodec)
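To make the save-and-load round trip concrete, the sketch below shows what a simple attribute-based codec might do. It is an illustration under stated assumptions, not the toolkit's SimpleObjectCodec: the class names are hypothetical, the decode signature is simplified to take the target class directly, and JSON is used only as a convenient string format. The '__mlspl_type' key is borrowed from the TreeCodec example above.

```python
import json


class PointSketch(object):
    """Hypothetical plain object whose state lives entirely in __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class SimpleCodecSketch(object):
    """Illustrative codec; not the toolkit's SimpleObjectCodec."""

    @classmethod
    def encode(cls, obj):
        # Record the type so a loader knows which class to rebuild.
        return {
            "__mlspl_type": [type(obj).__module__, type(obj).__name__],
            "dict": dict(obj.__dict__),
        }

    @classmethod
    def decode(cls, payload, target_class):
        # Rebuild without calling __init__, then restore the attributes.
        new_obj = target_class.__new__(target_class)
        new_obj.__dict__.update(payload["dict"])
        return new_obj


original = PointSketch(1.5, -2)
as_string = json.dumps(SimpleCodecSketch.encode(original))  # save
restored = SimpleCodecSketch.decode(json.loads(as_string), PointSketch)  # load
```

This approach only works when an object's state can be restored by assigning attributes. An object that must be rebuilt through its constructor and a separate state call needs a custom codec that stores both pieces, which is what TreeCodec does with init_args and state.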
Finished example
    from sklearn.svm import SVR as _SVR

    from base import BaseAlgo, RegressorMixin
    from util.param_util import convert_params


    class SVR(RegressorMixin, BaseAlgo):

        def __init__(self, options):
            self.handle_options(options)

            params = options.get('params', {})
            out_params = convert_params(
                params,
                floats=['C', 'gamma'],
                strs=['kernel'],
                ints=['degree'],
            )

            self.estimator = _SVR(**out_params)

        @staticmethod
        def register_codecs():
            from codec.codecs import SimpleObjectCodec
            from codec import codecs_manager
            codecs_manager.add_codec('algos.SVR', 'SVR', SimpleObjectCodec)
            codecs_manager.add_codec('sklearn.svm.classes', 'SVR', SimpleObjectCodec)
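The convert_params call in __init__ names which parameters to treat as floats, strings, and integers before they are passed to the estimator. The sketch below is a rough illustration of that idea, assuming parameter values arrive as strings. It is a hypothetical re-implementation, not the code in util.param_util, and its handling of unknown or invalid parameters is an assumption.

```python
def convert_params_sketch(params, floats=(), ints=(), strs=()):
    """Cast string-valued params to the types named in each list."""
    converters = {}
    for name in floats:
        converters[name] = float
    for name in ints:
        converters[name] = int
    for name in strs:
        converters[name] = str

    out_params = {}
    for name, value in params.items():
        if name not in converters:
            # Assumed behavior: reject anything the algorithm did not declare.
            raise ValueError("Unexpected parameter: %s" % name)
        try:
            out_params[name] = converters[name](value)
        except ValueError:
            raise ValueError("Invalid value for %s: %r" % (name, value))
    return out_params


# Parameters as they might arrive from a search, all as strings.
raw = {"C": "0.5", "kernel": "rbf", "degree": "3"}
converted = convert_params_sketch(
    raw, floats=["C", "gamma"], strs=["kernel"], ints=["degree"]
)
```

Here converted holds C as the float 0.5, kernel as the string 'rbf', and degree as the integer 3. Because gamma was not supplied, it is absent from the result, so the estimator would fall back to its own default for it.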
This documentation applies to the following versions of Splunk® Machine Learning Toolkit: 2.2.0, 2.2.1