Write an algorithm class
The algorithm class must implement certain methods to operate with upstream processes. These methods are the entry points to an algorithm, where the data and options are specified as arguments.
Methods
Method | Required | Arguments |
---|---|---|
__init__ | yes | self, options |
fit | yes | self, df, options |
apply | only for saved models | self, df, options |
register_codecs | only for saved models | (none) |
partial_fit | no | self, df, options |
summary | no | self, options |
Arguments
Argument | Description |
---|---|
options | A dictionary of information from the search. An example and a description of its keys follow this table. |
df | A pandas DataFrame of the input data from the search results. See http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html. |

Example options dictionary:

    {
        'args': [u'sepal_width', u'petal*'],
        'params': {u'fit_intercept': u't'},
        'feature_variables': ['petal*'],
        'target_variable': ['sepal_width'],
        'algo_name': u'LinearRegression',
        'mlspl_limits': { ... },
    }

This dictionary of options includes:
- args (list): a list of the fields used
- params (dict): any parameters (key-value pairs) in the search
- feature_variables (list): fields to be used as features
- target_variable (list): the target field for prediction
- algo_name (str): the name of the algorithm
- mlspl_limits (dict): mlspl.conf stanza properties that may be used in utility methods

Other keys that may exist depending on the search:
- model_name (str): the name of the model being saved ('into' clause)
- output_name (str): the name of the output ('as' clause)
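As an illustration of reading this dictionary, the sketch below pulls a parameter out of options['params'] and converts it to a Python value, since the example above shows parameter values arriving as strings such as u't'. The helper name get_bool_param and the set of accepted true/false spellings are assumptions made for this example, not part of the toolkit.

```python
def get_bool_param(options, name, default=False):
    """Return options['params'][name] as a bool (hypothetical helper).

    The accepted spellings below are an assumption made for this sketch.
    """
    params = options.get('params', {})
    if name not in params:
        return default
    value = str(params[name]).strip().lower()
    if value in ('t', 'true', '1', 'yes'):
        return True
    if value in ('f', 'false', '0', 'no'):
        return False
    raise ValueError('Invalid value for %s: %r' % (name, params[name]))


# Example options, following the structure documented above
options = {
    'args': ['sepal_width', 'petal*'],
    'params': {'fit_intercept': 't'},
    'feature_variables': ['petal*'],
    'target_variable': ['sepal_width'],
    'algo_name': 'LinearRegression',
}

fit_intercept = get_bool_param(options, 'fit_intercept', default=True)

# Optional keys may be absent, so read them with .get()
model_name = options.get('model_name')  # None unless an 'into' clause was used
```

Reading optional keys such as model_name and output_name with .get() avoids a KeyError when the search has no 'into' or 'as' clause.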
Attributes
Inside of the fit method, two attributes can be attached to self by the search command.
Attribute | Description |
---|---|
self.feature_variables | list - the wildcard matched list of fields present from the search |
self.target_variable | str - the name of the target field (only present if from-clause is used) |
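To make "wildcard matched" concrete: when the search uses a pattern such as petal*, self.feature_variables holds the matching field names rather than the pattern itself. The sketch below imitates that expansion with Python's standard fnmatch module. The helper match_fields is hypothetical, and shell-style matching is an assumption; the toolkit's own matching logic may differ.

```python
from fnmatch import fnmatchcase


def match_fields(patterns, available_fields):
    """Return the available fields matching any pattern, in field order.

    Hypothetical helper for illustration only; it is not the toolkit's
    own matching code.
    """
    return [
        field for field in available_fields
        if any(fnmatchcase(field, pattern) for pattern in patterns)
    ]


# Field names as they might appear in the search results (example data)
fields_in_search = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']

# 'petal*' expands to the two petal fields, in their original order
feature_variables = match_fields(['petal*'], fields_in_search)
```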
BaseAlgo class
A custom algorithm template for you to use is provided below.
    from base import BaseAlgo


    class CustomAlgoTemplate(BaseAlgo):
        def __init__(self, options):
            # Option checking & initializations here
            pass

        def fit(self, df, options):
            # Fit an estimator to df, a pandas DataFrame of the search results
            pass

        def partial_fit(self, df, options):
            # Incrementally fit a model
            pass

        def apply(self, df, options):
            # Apply a saved model
            # Modify df, a pandas DataFrame of the search results
            return df

        @staticmethod
        def register_codecs():
            # Add codecs to the codec manager
            pass
Using the template above in a search, as in the example below, reflects the input data back to the search.
| fit CustomAlgoTemplate *
These methods are all described in detail in the BaseAlgo class in $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin/base.py.
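For a sense of what a filled-in template might look like, here is a minimal sketch of an algorithm that learns the mean of the target field in fit and writes it to a new column in apply. The class name MeanPredictor, the default output column name, and the stand-in BaseAlgo used when base.py is not importable are all assumptions made for this example. In this sketch fit only stores state; see base.py for how the toolkit treats return values.

```python
import pandas as pd

try:
    from base import BaseAlgo  # importable from the toolkit's bin directory
except ImportError:
    class BaseAlgo(object):
        """Stand-in so the sketch also runs outside Splunk (assumption)."""


class MeanPredictor(BaseAlgo):
    """Hypothetical algorithm: predict the training mean of the target."""

    def __init__(self, options):
        # Option checking: this sketch needs exactly one target field
        target = options.get('target_variable', [])
        if len(target) != 1:
            raise RuntimeError('Specify exactly one target field')
        self.target = target[0]
        self.mean_ = None

    def fit(self, df, options):
        # Learn the mean of the target column and keep it on the instance
        self.mean_ = float(df[self.target].mean())

    def apply(self, df, options):
        # Use the 'as' clause name if present; the default name is an assumption
        output_name = options.get('output_name', 'predicted(%s)' % self.target)
        result = df.copy()  # leave the input DataFrame untouched
        result[output_name] = self.mean_
        return result


# Usage outside Splunk, with a hand-built options dictionary and DataFrame
options = {'target_variable': ['sepal_width'], 'feature_variables': ['petal*']}
df = pd.DataFrame({'sepal_width': [3.0, 3.5, 2.5],
                   'petal_length': [1.4, 1.3, 4.5]})
algo = MeanPredictor(options)
algo.fit(df, options)
out = algo.apply(df, options)
```

Copying the DataFrame in apply keeps the input unchanged, which makes the sketch easier to test; modifying df in place, as the template comment suggests, is also possible.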
This documentation applies to the following versions of Splunk® Machine Learning Toolkit: 3.2.0, 3.3.0, 3.4.0, 4.0.0, 4.1.0, 4.2.0, 4.3.0