Splunk® Machine Learning Toolkit

ML-SPL API Guide



This documentation does not apply to the most recent version of Splunk® Machine Learning Toolkit. For documentation on the most recent version, go to the latest release.

Savitzky-Golay Filter

This example covers the following:

  • using BaseAlgo
  • converting parameters
  • using the prepare_features utility
  • using an arbitrary function to transform data


In this example, you will add SciPy's implementation of a Savitzky-Golay signal processing filter to the Splunk Machine Learning Toolkit. See the SciPy documentation for details on the filter.

Because SciPy's savgol_filter is a plain function rather than an estimator class, you do all the work in the fit method and return the transformed values from it.
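To get a feel for the function before wrapping it, the following minimal sketch calls savgol_filter directly on a synthetic signal. The signal, the random seed, and the variable names are assumptions made for illustration and are not part of the Splunk example.

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic data (an assumption for illustration): a slow sine wave plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 50)
noisy = np.sin(t) + rng.normal(scale=0.2, size=t.size)

# window_length=5 and polyorder=2 match the defaults used later in this example.
smoothed = savgol_filter(noisy, window_length=5, polyorder=2)

# The output has the same shape as the input: one filtered value per sample.
print(smoothed.shape == noisy.shape)

# An evenly sampled quadratic passes through a polyorder=2 filter unchanged
# (up to rounding), because each window is fitted with a degree-2 polynomial.
quadratic = t ** 2
print(np.allclose(savgol_filter(quadratic, 5, 2), quadratic))
```

Because the function takes an array and returns an array of the same shape, it can be applied column by column inside a single method without keeping any fitted state.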

This example uses the ML-SPL API available in the Splunk Machine Learning Toolkit version 2.2.0 and later. Verify your Splunk Machine Learning Toolkit version before using this example.

Steps

  1. Register the algorithm in __init__.py.
    Modify the __init__.py file located in $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin/algos to "register" your algorithm by adding its name to the __all__ list:
    __all__ = [
        "SavgolFilter",
        "LinearRegression",
        "Lasso",
        ...
        ]
    
  2. Create the Python file in the algos folder. For this example, create $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin/algos/SavgolFilter.py and add the imports.
    import numpy as np
    from scipy.signal import savgol_filter
     
    from base import BaseAlgo
    from util.param_util import convert_params
    from util import df_util
    
  3. Define the class.
    class SavgolFilter(BaseAlgo):
        """Use SciPy's savgol_filter to run a filter over fields."""
    
  4. Define the __init__ method.
    Unlike other examples, this one has no estimator class, so attach the parameters to the object (self) to use later.
        def __init__(self, options):
            # set parameters
            params = options.get('params', {})
            out_params = convert_params(
                params,
                ints=['window_length', 'polyorder', 'deriv']
            )
    
            # set defaults for parameters
            if 'window_length' in out_params:
                self.window_length = out_params['window_length']
            else:
                self.window_length = 5
    
            if 'polyorder' in out_params:
                self.polyorder = out_params['polyorder']
            else:
                self.polyorder = 2
    
            if 'deriv' in out_params:
                self.deriv = out_params['deriv']
            else:
                self.deriv = 0
    
  5. Define the fit method.
        def fit(self, df, options):
            X = df.copy()
            X, nans, columns = df_util.prepare_features(X, self.feature_variables)
    
            # Define a wrapper function
            def f(x):
                return savgol_filter(x, self.window_length, self.polyorder, self.deriv)
    
            # Apply that function along each column of X
            y_hat = np.apply_along_axis(f, 0, X)
    
            names = ['SG_%s' % col for col in columns]
            output_df = df_util.create_output_dataframe(y_hat, nans, names)
            df = df_util.merge_predictions(df, output_df)
    
            return df
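
    The column-wise step in the fit method can be tried outside Splunk. The sketch below is an illustration only: the feature matrix and the field names are hypothetical, and a plain NumPy array stands in for the output of prepare_features. It shows that np.apply_along_axis with axis 0 passes each column to the wrapper, and that the result should match calling savgol_filter with its own axis argument.

```python
import numpy as np
from scipy.signal import savgol_filter

# Hypothetical feature matrix: 8 rows (events) by 2 columns (fields).
X = np.column_stack([
    np.arange(8, dtype=float) ** 2,
    np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 5.0, 8.0]),
])
columns = ["field_a", "field_b"]  # hypothetical field names

window_length, polyorder, deriv = 5, 2, 0


def f(x):
    # x is one column of X, passed in as a 1-D array
    return savgol_filter(x, window_length, polyorder, deriv)


# axis=0 hands each column to f and stacks the results back into columns
y_hat = np.apply_along_axis(f, 0, X)

# savgol_filter can also filter along a chosen axis directly
y_hat_direct = savgol_filter(X, window_length, polyorder, deriv=deriv, axis=0)

# Output field names follow the same SG_ prefix convention as the fit method
names = ["SG_%s" % col for col in columns]
print(names)
print(y_hat.shape == X.shape)
print(np.allclose(y_hat, y_hat_direct))
```

    The wrapper form mirrors the documented example and works for any function that maps a 1-D array to a 1-D array, while the axis argument is a shorter alternative specific to savgol_filter.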
    

    Finished example

    import numpy as np
    from scipy.signal import savgol_filter
    
    from base import BaseAlgo
    from util.param_util import convert_params
    from util import df_util
    
    
    class SavgolFilter(BaseAlgo):
    
        def __init__(self, options):
            # set parameters
            params = options.get('params', {})
            out_params = convert_params(
                params,
                ints=['window_length', 'polyorder', 'deriv']
            )
    
            # set defaults for parameters
            if 'window_length' in out_params:
                self.window_length = out_params['window_length']
            else:
                self.window_length = 5
    
            if 'polyorder' in out_params:
                self.polyorder = out_params['polyorder']
            else:
                self.polyorder = 2
    
            if 'deriv' in out_params:
                self.deriv = out_params['deriv']
            else:
                self.deriv = 0
    
        def fit(self, df, options):
            X = df.copy()
            X, nans, columns = df_util.prepare_features(X, self.feature_variables)
    
            def f(x):
                return savgol_filter(x, self.window_length, self.polyorder, self.deriv)
    
            y_hat = np.apply_along_axis(f, 0, X)
    
            names = ['SG_%s' % col for col in columns]
            output_df = df_util.create_output_dataframe(y_hat, nans, names)
            df = df_util.merge_predictions(df, output_df)
    
            return df
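
    The defaulting logic in the __init__ method can also be written more compactly, and it is a natural place to reject parameter combinations that the filter cannot use. The sketch below is one possible approach, not part of the ML-SPL API: resolve_savgol_params and DEFAULTS are hypothetical names, and a plain dictionary stands in for the output of convert_params. The checks reflect SciPy's requirements that polyorder be less than window_length and that deriv be nonnegative. Some SciPy releases also require window_length to be odd, which this sketch does not check.

```python
# Hypothetical helper for illustration; not part of the ML-SPL API.
DEFAULTS = {"window_length": 5, "polyorder": 2, "deriv": 0}


def resolve_savgol_params(out_params):
    """Fill in defaults and reject combinations the filter cannot use."""
    # Each parameter is resolved independently of the others.
    resolved = {name: out_params.get(name, default)
                for name, default in DEFAULTS.items()}

    if resolved["window_length"] < 1:
        raise ValueError("window_length must be a positive integer")
    if resolved["polyorder"] >= resolved["window_length"]:
        raise ValueError("polyorder must be less than window_length")
    if resolved["deriv"] < 0:
        raise ValueError("deriv must be a nonnegative integer")
    return resolved


# No parameters supplied: every value falls back to its default.
print(resolve_savgol_params({}))

# polyorder and deriv supplied together: both are kept.
print(resolve_savgol_params({"polyorder": 3, "deriv": 1}))
```

    Inside the class, the resolved values could then be assigned to self.window_length, self.polyorder, and self.deriv in the __init__ method.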
    
Last modified on 20 June, 2017

This documentation applies to the following versions of Splunk® Machine Learning Toolkit: 2.2.0, 2.2.1

