Write a Python algorithm class
To add a custom algorithm to the Splunk Machine Learning Toolkit (MLTK), you must write a Python algorithm class. The class must implement certain methods to operate properly with upstream processes. These methods are the entry points to the algorithm, and they receive the data and options as arguments.
Best practices
Follow these best practices when writing an algorithm class:
- Assume invalid input.
- If a parameter is passed in, check that it is valid.
- If you require a particular field, for example, _time, check for its presence and raise an error if it is missing.
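The practices above can be sketched as two small helpers. This is a minimal sketch, not part of the toolkit: the names validate_params and require_field, the choice of RuntimeError, and the error messages are all assumptions made for illustration. The only thing assumed about df is that it exposes a columns attribute, as a pandas DataFrame does.

```python
def validate_params(options, allowed_params):
    """Return the 'params' dict from options, rejecting unknown parameters.

    Hypothetical helper: the set of allowed names is supplied by the caller.
    """
    params = options.get('params', {})
    unexpected = sorted(set(params) - set(allowed_params))
    if unexpected:
        raise RuntimeError(
            'Unexpected parameter(s): {}'.format(', '.join(unexpected)))
    return params


def require_field(df, field):
    """Raise a clear error if a required field is missing from the input.

    Hypothetical helper: relies only on df.columns supporting membership tests.
    """
    if field not in df.columns:
        raise RuntimeError('Required field "{}" is missing'.format(field))
```

Calling both helpers at the top of fit keeps the checks in one place, so invalid input fails early with a readable message.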
Methods
Methods are the entry points to the custom algorithm. Refer to the following table for details about each method:
Method | Required | Arguments |
---|---|---|
__init__ | Yes | self, options |
fit | Yes | self, df, options |
apply | Only for saved models | self, df, options |
register_codecs | Only for saved models | (none) |
partial_fit | No | self, df, options |
summary | No | self, options |
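The signatures in the table can be laid out as a class skeleton. This is only a sketch: the class name ExampleAlgo is hypothetical, the method bodies are placeholders, and the return values are assumptions, since the table specifies the arguments but not what each method returns.

```python
class ExampleAlgo(object):
    """Hypothetical skeleton showing the method signatures from the table."""

    def __init__(self, options):
        # Required. Keep the search parameters; default to an empty dict.
        self.params = options.get('params', {})
        self.columns = None

    def fit(self, df, options):
        # Required. Placeholder "training": remember which columns were seen.
        self.columns = list(df.columns)
        return df  # return value is an assumption

    def apply(self, df, options):
        # Only for saved models. Placeholder: return the input unchanged.
        return df

    @staticmethod
    def register_codecs():
        # Only for saved models. Takes no arguments; left empty here.
        pass

    def partial_fit(self, df, options):
        # Optional. Placeholder incremental update that simply refits.
        self.fit(df, options)

    def summary(self, options):
        # Optional. Placeholder report of what fit recorded.
        return {'columns': self.columns}
```

Making register_codecs a static method is one way to match the "(none)" entry in the Arguments column; the other methods all take self first.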
Arguments
Specify data and options as arguments. Refer to the following table for details about each argument:
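Argument | Description |
---|---|
options | A dictionary of information from the search. Keys shown in the example that follows include args, params, feature_variables, target_variable, algo_name, and mlspl_limits. Other options may exist depending on the search. |
df | A pandas DataFrame of the input data from the search results. See http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html. |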
Example:
The following example is a dictionary of information from the search:
    {
        'args': [u'sepal_width', u'petal*'],
        'params': {u'fit_intercept': u't'},
        'feature_variables': ['petal*'],
        'target_variable': ['sepal_width'],
        'algo_name': u'LinearRegression',
        'mlspl_limits': { ... },
    }
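Parameter values in the example arrive as strings ('t' rather than True), so an algorithm typically has to convert them before use. The following sketch shows one way to do that. The helper parse_bool and the set of strings it accepts are assumptions, not toolkit behavior, and the sample dictionary omits mlspl_limits because its contents are not shown in the example.

```python
def parse_bool(value):
    """Convert a search string such as 't' or 'false' to a bool.

    Hypothetical helper: the accepted spellings are an assumption.
    """
    text = str(value).strip().lower()
    if text in ('t', 'true', '1', 'yes', 'y'):
        return True
    if text in ('f', 'false', '0', 'no', 'n'):
        return False
    raise ValueError('Cannot interpret {!r} as a boolean'.format(value))


# Sample options modeled on the example dictionary (mlspl_limits omitted).
options = {
    'args': ['sepal_width', 'petal*'],
    'params': {'fit_intercept': 't'},
    'feature_variables': ['petal*'],
    'target_variable': ['sepal_width'],
    'algo_name': 'LinearRegression',
}

params = options.get('params', {})
# Defaulting to 't' when the parameter is absent is an assumption.
fit_intercept = parse_bool(params.get('fit_intercept', 't'))
# target_variable is a list in the options dictionary.
target = options['target_variable'][0]
```

Raising on an unrecognized value, instead of silently falling back to a default, follows the "assume invalid input" practice.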
Attributes
Inside of the fit method, two attributes can be attached to self by the search command. Refer to the following table for details about each attribute:
Attribute | Description |
---|---|
self.feature_variables (list) | The wildcard-matched list of fields present from the search. |
self.target_variable (str) | The name of the target field. This attribute is only present if the from clause is used. |
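Because self.target_variable is only present when the from clause is used, a fit method may want to read both attributes defensively. The sketch below assumes a hypothetical class AttributeAwareAlgo and uses getattr with a fallback. How the search command attaches the attributes is not shown here, so the usage note sets them by hand.

```python
class AttributeAwareAlgo(object):
    """Hypothetical class that reads the attributes described in the table."""

    def __init__(self, options):
        self.used_fields = []

    def fit(self, df, options):
        # feature_variables is a list of field names; fall back to empty.
        features = list(getattr(self, 'feature_variables', []))
        # target_variable is absent when the search has no from clause.
        target = getattr(self, 'target_variable', None)
        self.used_fields = features + ([target] if target else [])
        # Fail early if any expected field is missing from the input.
        missing = [f for f in self.used_fields if f not in df.columns]
        if missing:
            raise RuntimeError(
                'Missing field(s): {}'.format(', '.join(missing)))
        return df  # return value is an assumption
```

To try the sketch outside of a search, assign algo.feature_variables and algo.target_variable manually before calling fit, passing any object with a columns attribute as df.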
This documentation applies to the following versions of Splunk® Machine Learning Toolkit: 4.4.0, 4.4.1, 4.4.2, 4.5.0, 5.0.0, 5.1.0, 5.2.0, 5.2.1, 5.2.2, 5.3.0, 5.3.1, 5.3.3, 5.4.0, 5.4.1, 5.4.2, 5.5.0