
Custom search commands
Although the Splunk search language is extensive, you may want to write your own custom search commands. To create your own search command, add a custom search script written in Python.
Note: The search command API does not support recursive searching. To build a search that runs recursively, use the REST search API.
Get started
There are two steps to building a custom search command:
- Build the search command in Python.
- Add an entry to commands.conf to make your custom command accessible.
Types of commands
Command | Description |
---|---|
streaming | A streaming command is applied as results travel through the search pipeline. If your script is not streaming, it processes only a single chunk of results. You can specify a search (that contains only streaming commands) to be executed before your non-streaming script, if your script is the very first non-streaming command in the pipeline or if you have 'requires_preop' set to true (false by default). |
generating | A generating command must be the first command specified in a search. Generating commands rely on being passed useful arguments. |
retevs | Represents 'retainsevents' in commands.conf. This setting indicates that this script outputs events when given events as input. By default this is set to false, meaning that the Timeline never represents the output of this command. Although there is no universal definition of what an event is, generally, if you intend to retain the _raw and _time fields, set this to true. |
reqsop | Represents 'requires_preop' in commands.conf. This setting indicates whether the string in the 'preop' variable must be executed, regardless of whether this script is the first non-streaming command in a search pipeline. |
timeorder | Represents both 'generates_timeorder' and 'overrides_timeorder' in commands.conf. 'overrides_timeorder' overrides the order of the input to the script. For example, if the input to this script is in descending time order, the output will be in ascending time order. 'generates_timeorder' applies only to generating commands. This setting indicates that the script ignores the order of the input and always generates output in descending time order. |
Build your search command in Python
Python search commands rely on Intersplunk.py to grab events from the search pipeline and pass the modified events back. The arguments passed to your script in sys.argv are the same arguments you use when searching with the command.
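For example, if a search invokes your command as | mycommand foo bar (a hypothetical command name and arguments), those tokens arrive in sys.argv. A minimal sketch of reading them:
import sys

# sys.argv[0] is the path to this script; the remaining entries carry the
# arguments used with the command in the search. If 'supports_getinfo' is
# enabled, the first of these is __GETINFO__ or __EXECUTE__ (see
# "Debugging your script" below), and the user's arguments follow.
args = sys.argv[1:]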
Handling input
The simplest way to get data to your search command is to use splunk.Intersplunk.readResults, which takes three optional parameters and returns a list of dicts representing the list of input events. The optional parameters are input_buf, settings, and has_header.
Parameter | Values | Description |
---|---|---|
input_buf = None | file or None | Indicates where to read input from. Set to None by default, which means your search command expects to get data from sys.stdin. |
settings = None | dict or None | Indicates where to store any information found in the input header. Set to None by default, which means do not record the settings. |
has_header = True | True or False | Indicates whether or not to expect an input header. |
Here's an example call to splunk.Intersplunk.readResults:
results = splunk.Intersplunk.readResults(None, None, True)
This indicates that you are reading results from the search pipeline. The input to your script is either pure CSV, or a header section followed by a blank line and then pure CSV. Passing True for has_header, as in the example above, means your command expects a header with your results. If you set it to False, you must also set the enableheader key in the commands.conf entry for your command.
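For example, a commands.conf entry for a command that reads headerless input might look like the following sketch (the stanza and script names are placeholders, and setting enableheader to false is our reading of the no-header case described above):
[mycommand]
filename = mycommand.py
enableheader = false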
If your script does not expect a header section in the input, you can directly use the Python csv module to read the input. For example:
import csv
import sys

r = csv.reader(sys.stdin)
for l in r:
    ...
The advantage of this approach is that you can break out of the for loop at any point, and only the input lines you have iterated over are read into memory, which gives much better performance in some cases.
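For instance, a command that only needs the first rows of its input could stop reading as soon as it has enough; the 1,000-row limit in this sketch is arbitrary:
import csv
import sys

r = csv.reader(sys.stdin)
rows = []
for i, row in enumerate(r):
    if i >= 1000:
        # Stop early; lines past this point are never read into memory.
        break
    rows.append(row)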
Sending output
Intersplunk can also be used to construct your script's output. splunk.Intersplunk.generateErrorResults takes a string and writes the correct error output to sys.stdout. splunk.Intersplunk.outputResults takes a list of dict objects and writes the appropriate CSV output to sys.stdout.
To output data, add:
splunk.Intersplunk.outputResults(results)
The output of your script is expected to be pure CSV. To indicate an error, return a CSV with a single "ERROR" column and a single row (besides the header row) with the contents of the message.
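Putting input and output together, a minimal sketch of a command that tags every result with an extra field might look like this; the field name, value, and error message are purely illustrative:
import splunk.Intersplunk

try:
    # Read the incoming results from the search pipeline.
    results = splunk.Intersplunk.readResults(None, None, True)

    # Add a hypothetical marker field to every result.
    for result in results:
        result['my_marker'] = 'processed'

    # Write the modified results back to the pipeline as CSV.
    splunk.Intersplunk.outputResults(results)
except Exception as e:
    # Report the failure to the search as an error result.
    splunk.Intersplunk.generateErrorResults("Unexpected error: %s" % e)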
Debugging your script
If your script has 'supports_getinfo' = true, the first argument to your script must be either __GETINFO__ or __EXECUTE__. Setting 'supports_getinfo' = true is a good tool for debugging because it allows your script to be called with the command arguments at parse time, before any execution of the search. Any syntax error stops the search from being executed. If you call your script with __GETINFO__, you can also dynamically specify the properties of your script (such as whether it is streaming) depending on your arguments.
If your script has 'supports_getinfo' set to True, you should first make a call such as:
(isgetinfo, sys.argv) = splunk.Intersplunk.isGetInfo(sys.argv)
This call strips the first argument from sys.argv and checks if you are in GETINFO mode or EXECUTE mode. If you are in GETINFO mode, your script should use splunk.Intersplunk.outputInfo() to return the properties of your script or splunk.Intersplunk.parseError() if the arguments are invalid. The definition of outputInfo() is as follows:
def outputInfo(streaming, generating, retevs, reqsop, preop, timeorder=False):
Note: You can also set these attributes in commands.conf.
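For example, with 'supports_getinfo' = true, the top of a script might handle the two modes along these lines; the property values passed to outputInfo are just one possible combination:
import sys
import splunk.Intersplunk

# Strip __GETINFO__ or __EXECUTE__ from sys.argv and record the mode.
(isgetinfo, sys.argv) = splunk.Intersplunk.isGetInfo(sys.argv)

if isgetinfo:
    if len(sys.argv) < 2:
        # Report bad arguments at parse time, before the search runs.
        splunk.Intersplunk.parseError("No arguments provided")
    # Declare the script's properties: not streaming, not generating,
    # retains events, no preop required, no preop string, time-ordered.
    splunk.Intersplunk.outputInfo(False, False, True, False, None, True)
    # outputInfo() calls sys.exit(), so nothing below runs in GETINFO mode.

# __EXECUTE__ mode: read, process, and return results.
results = splunk.Intersplunk.readResults(None, None, True)
splunk.Intersplunk.outputResults(results)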
Add an entry to commands.conf
You must create a commands.conf entry for your command in $SPLUNK_HOME/etc/apps/<app_name>/local/commands.conf. To see all the possible settings in commands.conf, see commands.conf.spec in the Admin Manual.
The following is a basic example that just enables a script:
[<script_name>]
filename = mypyscript.py
The stanza name in commands.conf is the name of the search script. Use this name to call the script from a search. You must also set the 'filename' key, which is the name of the script file. The script should be in either $SPLUNK_HOME/etc/apps/<app_name>/bin/ or $SPLUNK_HOME/etc/searchscripts. The app directory is usually the best location for the script.
Example
# Copyright (C) 2005-2009 Splunk Inc. All Rights Reserved. Version 3.0
import csv
import sys
import splunk.Intersplunk
import string

(isgetinfo, sys.argv) = splunk.Intersplunk.isGetInfo(sys.argv)

if len(sys.argv) < 2:
    splunk.Intersplunk.parseError("No arguments provided")

trendInfoList = []  # list of dictionaries of information about trendlines

validTypes = ['sma', 'wma', 'ema']
maxPeriod = 10000

i = 1
while i < len(sys.argv):
    # expect argument in format: <type><period>(<fieldname>) [as <newname>]
    arg = sys.argv[i]

    pos = arg.find('(')
    if (pos < 1) or arg[-1] != ')':
        splunk.Intersplunk.parseError("Invalid argument '%s'" % arg)

    name = arg[0:pos]
    field = arg[pos+1:len(arg)-1]

    if len(field) == 0 or field[0:2] == '__':
        splunk.Intersplunk.parseError("Invalid or empty field '%s'" % field)

    trendtype = None
    period = 0
    try:
        for t in validTypes:
            if name[0:len(t)] == t:
                trendtype = t
                period = int(name[len(t):])
                if (period < 2) or (period > maxPeriod):
                    raise ValueError
    except ValueError:
        splunk.Intersplunk.parseError("Invalid trend period for argument '%s'" % arg)

    if trendtype is None:
        splunk.Intersplunk.parseError("Invalid trend type for argument '%s'" % arg)

    newname = arg
    if (i+2 < len(sys.argv)) and (string.lower(sys.argv[i+1]) == "as"):
        newname = sys.argv[i+2]
        i += 3
    else:
        i += 1

    trendInfoList.append({'type': trendtype, 'period': period, 'field': field,
                          'newname': newname, 'vals': [], 'last': None})

if isgetinfo:
    splunk.Intersplunk.outputInfo(False, False, True, False, None, True)
    # outputInfo automatically calls sys.exit()

results = splunk.Intersplunk.readResults(None, None, True)

for res in results:
    # each res is a dict of fields to values
    for ti in trendInfoList:
        if ti['field'] not in res:
            continue
        try:
            ti['vals'].append(float(res[ti['field']]))
        except ValueError:
            continue  # ignore non-numeric values
        if len(ti['vals']) > ti['period']:
            ti['vals'].pop(0)
        elif len(ti['vals']) < ti['period']:
            continue  # not enough data yet
        newval = None
        if ti['type'] == 'sma':
            # simple moving average
            newval = sum(ti['vals']) / ti['period']
        elif ti['type'] == 'wma':
            # weighted moving average
            Total = 0
            for i in range(len(ti['vals'])):
                Total += (i+1) * (ti['vals'][i])
            newval = Total / (ti['period'] * (ti['period']+1) / 2)
        elif ti['type'] == 'ema':
            # exponential moving average
            if (ti['last'] is None):
                newval = ti['vals'][-1]
            else:
                alpha = float(2.0 / (ti['period'] + 1.0))
                newval = (alpha * ti['vals'][-1]) + (1 - alpha) * ti['last']
            ti['last'] = newval
        res[ti['newname']] = str(newval)

splunk.Intersplunk.outputResults(results)
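Assuming the script is saved as, say, mytrend.py and registered in commands.conf under a [mytrend] stanza (both names are placeholders), you could call it from a search along these lines, where count is whatever numeric field your earlier search produces:
... | mytrend sma5(count) as count_sma5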
Answers
Have questions? Visit Splunk Answers to see what questions and answers other Splunk users had about custom search commands.