Savitzky-Golay Filter
This example uses the ML-SPL API available in the Splunk Machine Learning Toolkit version 2.2.0 and later. Verify your Splunk Machine Learning Toolkit version before using this example.
This example covers the following:
- using BaseAlgo
- converting parameters
- using the prepare_features utility
- using an arbitrary function to transform data
In this example, we will add scipy's implementation of a Savitzky-Golay signal processing filter to the Splunk Machine Learning Toolkit. See https://docs.scipy.org/doc/scipy-0.16.1/reference/generated/scipy.signal.savgol_filter.html.
Because savgol_filter is a plain function rather than an estimator, we do all the work in the fit method and return the transformed values from it.
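To see what the function does before wrapping it, here is a minimal standalone sketch that calls savgol_filter directly on a NumPy array. The sample data and variable names are made up for illustration. The window_length, polyorder, and deriv values match the defaults used later in this example.

```python
import numpy as np
from scipy.signal import savgol_filter

# Illustrative data: a quadratic trend plus an alternating +/-0.5 disturbance.
t = np.arange(20, dtype=float)
trend = 0.5 * t ** 2 - 3.0 * t + 2.0
noisy = trend + np.where(t % 2 == 0, 0.5, -0.5)

# Same settings as the defaults used below: 5-point window, quadratic fit.
smoothed = savgol_filter(noisy, 5, 2, 0)

# A degree-2 filter leaves a purely quadratic signal unchanged (up to rounding).
trend_filtered = savgol_filter(trend, 5, 2, 0)

# deriv=1 returns the smoothed first derivative instead of the signal itself.
slope = savgol_filter(trend, 5, 2, 1)
```

The output always has the same length as the input, which is why the result can be written back next to the original field as a new column.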
Steps
- Register the algorithm in __init__.py.
Modify the __init__.py file located in $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin/algos to "register" your algorithm by adding it to the list here:
    __all__ = [
        "SavgolFilter",
        "LinearRegression",
        "Lasso",
        ...
    ]
- Create the Python file in the algos folder. For this example, we create $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin/algos/SavgolFilter.py.
Start the file with the imports it needs:
    import numpy as np
    from scipy.signal import savgol_filter

    from base import BaseAlgo
    from util.param_util import convert_params
    from util import df_util
- Define the class.
    class SavgolFilter(BaseAlgo):
        """Use scipy's savgol_filter to run a filter over fields."""
- Define the __init__ method.
Since there isn't an "estimator" per se, we just attach the params to self to use later.
        def __init__(self, options):
            # set parameters
            params = options.get('params', {})
            out_params = convert_params(
                params,
                ints=['window_length', 'polyorder', 'deriv']
            )

            # set defaults for parameters
            if 'window_length' in out_params:
                self.window_length = out_params['window_length']
            else:
                self.window_length = 5

            if 'polyorder' in out_params:
                self.polyorder = out_params['polyorder']
            else:
                self.polyorder = 2

            if 'deriv' in out_params:
                self.deriv = out_params['deriv']
            else:
                self.deriv = 0
- Define the fit method.
        def fit(self, df, options):
            X = df.copy()
            X, nans, columns = df_util.prepare_features(X, self.feature_variables)

            # Define a wrapper function
            def f(x):
                return savgol_filter(x, self.window_length, self.polyorder, self.deriv)

            # Apply that function along each column of X
            y_hat = np.apply_along_axis(f, 0, X)

            names = ['SG_%s' % col for col in columns]
            output_df = df_util.create_output_dataframe(y_hat, nans, names)
            df = df_util.merge_predictions(df, output_df)

            return df
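The fit logic above can be tried outside Splunk with plain pandas. The sketch below is an approximation under stated assumptions, not the toolkit's implementation: it assumes prepare_features drops rows with missing values and that create_output_dataframe and merge_predictions put NaN back in those rows and append the new columns. The function name savgol_columns and the sample fields cpu and host are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.signal import savgol_filter


def savgol_columns(df, fields, window_length=5, polyorder=2, deriv=0):
    """Filter each named column and append the results as SG_<field>."""
    # Keep only rows where every requested field is present. This stands in
    # for what prepare_features is assumed to do with missing values.
    X = df[fields].astype(float)
    complete = X.notna().all(axis=1)

    # Wrapper so apply_along_axis can call the filter with fixed settings.
    def f(x):
        return savgol_filter(x, window_length, polyorder, deriv)

    # Apply the filter down each column (axis 0) of the complete rows.
    y_hat = np.apply_along_axis(f, 0, X[complete].to_numpy())

    # Rows that were dropped keep NaN in the output columns.
    names = ['SG_%s' % col for col in fields]
    output = pd.DataFrame(np.nan, index=df.index, columns=names)
    output.loc[complete, names] = y_hat
    return pd.concat([df, output], axis=1)


# Made-up events: one numeric field with a gap, one untouched string field.
events = pd.DataFrame({
    'cpu': [1.0, 2.0, 4.0, np.nan, 7.0, 11.0, 16.0, 22.0],
    'host': list('abababab'),
})
result = savgol_columns(events, ['cpu'])
```

In this sketch the rows on either side of a dropped row are filtered as if they were consecutive samples, so long gaps in the data distort the smoothing near the gap.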
Finished example
    import numpy as np
    from scipy.signal import savgol_filter

    from base import BaseAlgo
    from util.param_util import convert_params
    from util import df_util


    class SavgolFilter(BaseAlgo):

        def __init__(self, options):
            # set parameters
            params = options.get('params', {})
            out_params = convert_params(
                params,
                ints=['window_length', 'polyorder', 'deriv']
            )

            # set defaults for parameters
            if 'window_length' in out_params:
                self.window_length = out_params['window_length']
            else:
                self.window_length = 5

            if 'polyorder' in out_params:
                self.polyorder = out_params['polyorder']
            else:
                self.polyorder = 2

            if 'deriv' in out_params:
                self.deriv = out_params['deriv']
            else:
                self.deriv = 0

        def fit(self, df, options):
            X = df.copy()
            X, nans, columns = df_util.prepare_features(X, self.feature_variables)

            def f(x):
                return savgol_filter(x, self.window_length, self.polyorder, self.deriv)

            y_hat = np.apply_along_axis(f, 0, X)

            names = ['SG_%s' % col for col in columns]
            output_df = df_util.create_output_dataframe(y_hat, nans, names)
            df = df_util.merge_predictions(df, output_df)

            return df
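The finished example passes its parameters straight to savgol_filter, which raises a ValueError when polyorder is not less than window_length. If you want a clearer message earlier, one option is to check the values up front. The helper below is a hypothetical sketch and not part of the toolkit. It assumes parameter values may arrive as strings and converts them itself instead of calling convert_params, so that it can run on its own.

```python
def check_savgol_params(params):
    """Return (window_length, polyorder, deriv) as ints, or raise ValueError."""
    # Same defaults as the example above.
    defaults = {'window_length': 5, 'polyorder': 2, 'deriv': 0}
    values = {}
    for name, default in defaults.items():
        raw = params.get(name, default)
        try:
            values[name] = int(raw)
        except (TypeError, ValueError):
            raise ValueError('%s must be an integer, got %r' % (name, raw))

    if values['window_length'] < 1:
        raise ValueError('window_length must be positive')
    if values['polyorder'] < 0 or values['polyorder'] >= values['window_length']:
        raise ValueError('polyorder must be non-negative and less than window_length')
    if values['deriv'] < 0:
        raise ValueError('deriv must be non-negative')
    return values['window_length'], values['polyorder'], values['deriv']


# Hypothetical usage: missing keys fall back to the defaults.
window_length, polyorder, deriv = check_savgol_params({'window_length': '11'})
```

A check like this could run at the end of __init__, after the defaults are set, so that a bad search fails before any data is processed.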
This documentation applies to the following versions of Splunk® Machine Learning Toolkit: 2.3.0