Splunk® Machine Learning Toolkit

ML-SPL API Guide

Savitzky-Golay Filter

This example uses the ML-SPL API available in the Splunk Machine Learning Toolkit version 2.2.0 and later. Verify your Splunk Machine Learning Toolkit version before using this example.

This example covers the following:

  • using BaseAlgo
  • converting parameters
  • using the prepare_features utility
  • using an arbitrary function to transform data


In this example, we add SciPy's implementation of the Savitzky-Golay signal processing filter to the Splunk Machine Learning Toolkit. See the SciPy reference for savgol_filter at https://docs.scipy.org/doc/scipy-0.16.1/reference/generated/scipy.signal.savgol_filter.html.

Because savgol_filter is a plain function rather than an estimator, all the work happens in the fit method, which also returns the transformed values.
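
Before wiring the filter into the toolkit, it can help to call savgol_filter directly. The following is a minimal sketch outside Splunk, assuming NumPy and SciPy are installed; the sample signal is made up for illustration. It uses the same window_length, polyorder, and deriv arguments that this example exposes as parameters.

```python
import numpy as np
from scipy.signal import savgol_filter

# Hypothetical sample signal: an exact quadratic, y = t^2
t = np.arange(10, dtype=float)
signal = t ** 2

# window_length=5 and polyorder=2 are the defaults used later in this example;
# polyorder must be less than window_length.
smoothed = savgol_filter(signal, window_length=5, polyorder=2)

# A second-order polynomial fit reproduces a quadratic, so nothing changes
print(np.allclose(smoothed, signal))

# deriv=1 returns the first derivative of the fitted polynomial, here 2*t
slope = savgol_filter(signal, window_length=5, polyorder=2, deriv=1)
print(np.allclose(slope, 2 * t))
```

On noisy data the output would differ from the input; the quadratic is used here only because the expected result is easy to check by hand.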

Steps

  1. Register the algorithm in __init__.py.
    To register your algorithm, add its name to the __all__ list in the __init__.py file located in $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin/algos:
    __all__ = [
        "SavgolFilter",
        "LinearRegression",
        "Lasso",
        ...
        ]
    
  2. Create the Python file in the algos folder. For this example, we create $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin/algos/SavgolFilter.py.
    import numpy as np
    from scipy.signal import savgol_filter
     
    from base import BaseAlgo
    from util.param_util import convert_params
    from util import df_util
    
  3. Define the class.
    class SavgolFilter(BaseAlgo):
        """Use scipy's savgol_filter to run a filter over fields."""
    
  4. Define the __init__ method.
    Since there isn't an "estimator" per se, we just attach the params to self to use later.
        def __init__(self, options):
            # set parameters
            params = options.get('params', {})
            out_params = convert_params(
                params,
                ints=['window_length', 'polyorder', 'deriv']
            )
    
            # set defaults for parameters
            if 'window_length' in out_params:
                self.window_length = out_params['window_length']
            else:
                self.window_length = 5
    
            if 'polyorder' in out_params:
                self.polyorder = out_params['polyorder']
            else:
                self.polyorder = 2
    
            if 'deriv' in out_params:
                self.deriv = out_params['deriv']
            else:
                self.deriv = 0
    
  5. Define the fit method.
        def fit(self, df, options):
            X = df.copy()
            X, nans, columns = df_util.prepare_features(X, self.feature_variables)
     
            # Define a wrapper function
            def f(x):
                return savgol_filter(x, self.window_length, self.polyorder, self.deriv)
     
            # Apply that function along each column of X
            y_hat = np.apply_along_axis(f, 0, X)
     
            names = ['SG_%s' % col for col in columns]
            output_df = df_util.create_output_dataframe(y_hat, nans, names)
            df = df_util.merge_predictions(df, output_df)
     
            return df
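
The column-wise step in the fit method can be tried on its own. The sketch below runs outside Splunk and assumes NumPy and SciPy are installed; the feature matrix and the field names cpu and mem are hypothetical, and the toolkit's df_util helpers are left out. It shows how np.apply_along_axis with axis 0 hands each column to the wrapper function, and how the SG_ output names are built.

```python
import numpy as np
from scipy.signal import savgol_filter

# Hypothetical feature matrix: 8 rows (events) by 2 columns (fields)
columns = ['cpu', 'mem']
X = np.column_stack([
    np.arange(8, dtype=float),        # a straight line
    np.arange(8, dtype=float) ** 2,   # a quadratic
])

def f(x):
    # x is one column of X, passed in as a 1-D array
    return savgol_filter(x, 5, 2, 0)  # window_length, polyorder, deriv

# axis=0 takes 1-D slices down the rows, so f receives each column in turn
y_hat = np.apply_along_axis(f, 0, X)

names = ['SG_%s' % col for col in columns]
print(y_hat.shape)  # (8, 2)
print(names)        # ['SG_cpu', 'SG_mem']
```

The output has the same shape as the input, which is what lets the filtered values line up with the original events when the results are merged back.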
    

Finished example

    import numpy as np
    from scipy.signal import savgol_filter
    
    from base import BaseAlgo
    from util.param_util import convert_params
    from util import df_util
    
    
    class SavgolFilter(BaseAlgo):
        """Use scipy's savgol_filter to run a filter over fields."""
    
        def __init__(self, options):
            # set parameters
            params = options.get('params', {})
            out_params = convert_params(
                params,
                ints=['window_length', 'polyorder', 'deriv']
            )
    
            # set defaults for parameters
            if 'window_length' in out_params:
                self.window_length = out_params['window_length']
            else:
                self.window_length = 5
    
            if 'polyorder' in out_params:
                self.polyorder = out_params['polyorder']
            else:
                self.polyorder = 2
    
            if 'deriv' in out_params:
                self.deriv = out_params['deriv']
            else:
                self.deriv = 0
    
        def fit(self, df, options):
            X = df.copy()
            X, nans, columns = df_util.prepare_features(X, self.feature_variables)
    
            def f(x):
                return savgol_filter(x, self.window_length, self.polyorder, self.deriv)
    
            y_hat = np.apply_along_axis(f, 0, X)
    
            names = ['SG_%s' % col for col in columns]
            output_df = df_util.create_output_dataframe(y_hat, nans, names)
            df = df_util.merge_predictions(df, output_df)
    
            return df
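
The parameter handling in __init__ follows a convert-then-default pattern that can be sketched in plain Python. In the sketch below, to_int_params is a hypothetical stand-in that only mimics casting the named values to int; it is not the toolkit's convert_params, and it assumes parameter values arrive as strings. DEFAULTS and resolve_params are likewise illustrative names.

```python
def to_int_params(params, ints):
    """Hypothetical stand-in for convert_params: cast the named values to int."""
    return {name: int(value) for name, value in params.items() if name in ints}

# Defaults taken from this example's __init__ method
DEFAULTS = {'window_length': 5, 'polyorder': 2, 'deriv': 0}

def resolve_params(params):
    out_params = to_int_params(params, ints=list(DEFAULTS))
    # dict.get falls back to the default when a parameter was not supplied
    return {name: out_params.get(name, default)
            for name, default in DEFAULTS.items()}

resolved = resolve_params({'window_length': '7'})
print(resolved)  # {'window_length': 7, 'polyorder': 2, 'deriv': 0}
```

Using dict.get with a default is equivalent to the if/else blocks in the example and keeps each parameter's default independent of the others.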
    
Last modified on 19 July, 2017

This documentation applies to the following versions of Splunk® Machine Learning Toolkit: 2.3.0

