Get data from APIs and other remote data interfaces through scripted inputs
Splunk can accept events from scripts that you provide. Scripted input is useful in conjunction with Windows and *nix command-line tools, such as top. You can use scripted input to get data from application program interfaces (APIs) and other remote data interfaces and message queues. You can then use commands like iostat on that data to generate metrics and status data.
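As a rough illustration of the kind of script this topic has in mind, the following Python sketch turns a record obtained from a remote interface into one line of text on standard output, which is the output Splunk collects from the script. The fetch_status function and its field names are hypothetical stand-ins for a real API call, and the timestamp plus key=value layout is just one reasonable choice, not a format that Splunk requires.

```python
import sys
import time


def fetch_status():
    """Hypothetical stand-in for a call to a remote API or message queue."""
    return {"service": "billing", "queue_depth": 12, "status": "ok"}


def format_event(epoch_seconds, fields):
    """Render one event as 'UTC timestamp key=value key=value ...'."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(epoch_seconds))
    pairs = " ".join(f"{key}={fields[key]}" for key in sorted(fields))
    return f"{stamp} {pairs}"


def main():
    # Write one line per event to standard output and flush it right away.
    sys.stdout.write(format_event(time.time(), fetch_status()) + "\n")
    sys.stdout.flush()


if __name__ == "__main__":
    main()
```

Putting the timestamp first in each line is a design choice made here so that every event carries its own time; the script works the same way with any other line layout.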
Note: This topic describes how to add scripted inputs that you've already written to your set of Splunk inputs. To learn how to develop scripted inputs in the first place, see "Build scripted inputs" in the Developing Views and Apps for Splunk Web manual.
You configure scripted inputs from Splunk Manager or by editing inputs.conf.
Note: On Windows platforms, you can enable text-based scripts, such as those in Perl and Python, with an intermediary Windows batch (.bat) or PowerShell file.
Caution: Scripts launched through scripted input inherit Splunk's environment, so be sure to clear environment variables that can affect your script's operation. The only environment variable that's likely to cause problems is the library path (most commonly known as LD_LIBRARY_PATH on Linux, Solaris, and FreeBSD).
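One way to follow this caution is to have the script launch the real command with a cleaned copy of the environment. The Python sketch below assumes that LD_LIBRARY_PATH is the only variable that needs to be dropped; the helper names are illustrative, and other platforms may use a different name for the library path.

```python
import os
import subprocess


def cleaned_env(env, unwanted=("LD_LIBRARY_PATH",)):
    """Return a copy of env without the variables named in unwanted."""
    return {name: value for name, value in env.items() if name not in unwanted}


def run_clean(command):
    """Run command (a list of arguments) with the cleaned environment."""
    return subprocess.run(
        command,
        env=cleaned_env(os.environ),
        capture_output=True,
        text=True,
        check=False,
    )
```

The caller can then write the stdout attribute of the returned result to its own standard output.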
Starting with release 4.2, Splunk logs any
stderr messages generated by scripted inputs to
Add a scripted input in Splunk Web
To add a scripted input in Splunk Web:
A. Go to the Add New page
You add a scripted input from the Add New page in Splunk Web. You can get there through two routes:
- Splunk Manager
- Splunk Home
It doesn't matter which route you use to get there; the Add New page itself is the same either way.
Via Splunk Manager:
1. Click Manager in the upper right-hand corner of Splunk Web.
2. In the Data section of the Manager page, click Data Inputs.
3. Click Scripts.
4. Click the New button to add an input.
Via Splunk Home:
1. Click the Add Data link in Splunk Home. This brings you to a page called "Data recipes".
2. Click the Run and collect the output of a script link to add an input.
B. Specify the scripted input
1. In the Command text box, specify the script command, including the path to the script.
2. In Interval, specify the interval in seconds between script runtimes. The default is 60 (seconds).
3. Enter a new Source name to override the default source value, if necessary.
Important: Consult Splunk support before changing this value.
4. To access other settings, check More settings. A number of additional settings appear. You can usually go with the defaults for these settings. If you want to set them explicitly, here's what they're for:
a. You can change the Host value, if necessary.
b. You can set the Source type. Source type is a default field added to events. Source type is used to determine processing characteristics, such as timestamps and event boundaries. For information on overriding Splunk's automatic source typing, see "Override automatic source type assignment" in this manual.
c. You can set the Index for this input. Leave the value as "default", unless you have defined multiple indexes to handle different types of events. In addition to indexes for user data, Splunk has a number of utility indexes, which also appear in this dropdown box.
5. Click Save.
Add a scripted input via inputs.conf
You add a scripted input in inputs.conf by adding a stanza for the script. Here is the syntax for the stanza:
[script://$SCRIPT]
<attribute1> = <val1>
<attribute2> = <val2>
...
Note the following:
- $SCRIPT is the fully qualified path to the location of the script.
- As a best practice, put your script in the bin/ directory nearest the inputs.conf where your script is specified. For example, if you are configuring $SPLUNK_HOME/etc/system/local/inputs.conf, place your script in $SPLUNK_HOME/etc/system/bin/. If you're working on an application in $SPLUNK_HOME/etc/apps/$APPLICATION/, put your script in that application's bin/ directory.
All attributes are optional. Here is the list of available attributes:
interval = <number>|<cron schedule>
- Indicates how often to execute the specified command. Specify either an integer value representing seconds or a valid cron schedule.
- Defaults to 60.0 seconds.
- When a cron schedule is specified, the script does not execute on start up, but rather at the times defined by the cron schedule.
- Splunk keeps one invocation of a script per instance. Intervals are based on when the script completes. So if you have a script configured to run every 10 minutes and the script takes 20 minutes to complete, the next run will occur 30 minutes after the first run.
- For constant data streams, enter 1 (or a value smaller than the script's interval).
- For one-shot data streams, enter -1. Setting interval to -1 causes the script to run each time the Splunk daemon restarts.
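The completion-based timing described in the list above can be checked with a little arithmetic. This Python sketch is only a model of the rule as stated in this topic (next start = previous start + run time + interval). The function name is made up for illustration, and the code does not reproduce Splunk's scheduler.

```python
def next_start_minutes(previous_start, run_time, interval):
    """Model: the interval is counted from when the script completes."""
    completed_at = previous_start + run_time
    return completed_at + interval


# A script configured to run every 10 minutes that takes 20 minutes to complete.
print(next_start_minutes(previous_start=0, run_time=20, interval=10))  # 30
```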
index = <string>
- Sets the index where events from this input will be stored.
- Splunk prepends the
- Defaults to main, or whatever you have set as your default index.
- For more information about the index field, see "How indexing works" in the Managing Indexers and Clusters manual.
sourcetype = <string>
- Sets the sourcetype key/field for events from this input.
- Explicitly declares the source type for this data, as opposed to allowing it to be determined automatically. This is important both for searchability and for applying the relevant formatting for this type of data during parsing and indexing.
- Sets the sourcetype key's initial value. The key is used during parsing/indexing, in particular to set the source type field during indexing. It is also the source type field used at search time.
- <string> is prepended with 'sourcetype::'.
- If not set explicitly, Splunk picks a source type based on various aspects of the data. There is no hard-coded default.
- For more information about source types, see "Why source types matter", in this manual.
source = <string>
- Sets the source key/field for events from this input.
- Note: Splunk does not recommend that you override the source key. Typically, the input layer provides a more accurate string to aid in problem analysis and investigation, accurately recording the file from which the data was retrieved. Consider using source types, tagging, and search wildcards before overriding this value.
- Splunk prepends
- Defaults to the input file path.
disabled = <true | false>
- disabled is a Boolean value that can be set to true if you want to disable the input.
- Defaults to
If you want the script to run continuously, write the script to never exit and set it on a short interval. This helps to ensure that if there is a problem the script gets restarted. Splunk keeps track of scripts it has spawned and will shut them down upon exit.
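A long-running script along these lines might look like the following Python sketch. The read_sample function is a hypothetical placeholder for whatever the script actually measures. The max_events parameter exists only so that the loop can be exercised in a test; a real continuous input would leave it as None and loop until Splunk shuts the script down.

```python
import sys
import time


def read_sample():
    """Hypothetical placeholder for the real data source."""
    return f"{time.strftime('%Y-%m-%dT%H:%M:%S')} heartbeat=1"


def stream(out=sys.stdout, pause_seconds=1.0, max_events=None):
    """Write one event per pause; never returns when max_events is None."""
    emitted = 0
    while max_events is None or emitted < max_events:
        out.write(read_sample() + "\n")
        out.flush()  # do not hold events in a buffer between samples
        emitted += 1
        time.sleep(pause_seconds)
    return emitted


# A real script would end with a bare call to stream(), which never returns.
```

Flushing after every event matters for a script that never exits, because buffered output might otherwise not reach Splunk until much later.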
Using a wrapper script
Generally, it is good practice to write a wrapper script for scripted inputs that use commands with arguments. In some cases, the command can contain special characters that Splunk escapes when validating text entered in Splunk Web. This causes updates to a previously configured input to fail to save.
- Note: Characters that Splunk escapes when validating text are those that should not be in paths, such as the equals sign (=) and the semicolon (;).
For example, the following scripted input is not correctly saved when edited in Splunk Web because Splunk escapes the equals sign (=) in the parameter to the myUtil.py script:
[script://$SPLUNK_HOME/etc/apps/myApp/bin/myUtil.py file=my_datacsv]
disabled = false
To avoid this problem, write a wrapper script that contains the scripted input. (Inputs updated by editing the conf file directly are not subject to this input validation.) For information on writing wrapper scripts, see "Scripted inputs overview" in the Developing Views and Apps for Splunk Web manual.
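A wrapper for the myUtil.py example could be as small as the following Python sketch. It assumes that the wrapper sits in the same bin/ directory as myUtil.py and that myUtil.py can be run by the current Python interpreter. The helper names are hypothetical. The stanza header then names only the wrapper, so no equals sign appears in it.

```python
import os
import subprocess
import sys


def build_command(script_dir):
    """Command line for the wrapped utility, with its argument hard-coded."""
    return [sys.executable, os.path.join(script_dir, "myUtil.py"), "file=my_datacsv"]


def main():
    here = os.path.dirname(os.path.abspath(__file__))
    # The child inherits stdout and stderr, so its output passes straight through.
    return subprocess.call(build_command(here))


# A real wrapper would end with: sys.exit(main())
```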
Example using inputs.conf
This example shows the use of the UNIX top command as a data input source:
1. Create a new application directory. This example uses scripts/:
$ mkdir $SPLUNK_HOME/etc/apps/scripts
2. All scripts should be run out of a bin/ directory inside your application directory:
$ mkdir $SPLUNK_HOME/etc/apps/scripts/bin
3. This example uses a small shell script, top.sh:
#!/bin/sh
top -bn 1  # linux only - different OSes have different parameters
4. Make sure the script is executable:
chmod +x $SPLUNK_HOME/etc/apps/scripts/bin/top.sh
5. Test that the script works by running it via the shell:
The script should send one
6. Add the script entry to inputs.conf:
[script:///opt/splunk/etc/apps/scripts/bin/top.sh]
# run every 5 seconds
interval = 5
# set sourcetype to top
sourcetype = top
# set source to name of script
source = script://./bin/top.sh
Note: You may need to modify props.conf:
- By default Splunk breaks the single top entry into multiple events.
- The easiest way to fix this problem is to tell the Splunk server to break only before something that does not exist in the output.
For example, adding the following to $SPLUNK_HOME/etc/apps/scripts/default/props.conf forces all lines into a single event:
[top]
BREAK_ONLY_BEFORE = <stuff>
Since there is no timestamp in the top output, we need to tell Splunk to use the current time. This is done in props.conf by setting:
DATETIME_CONFIG = CURRENT
Set interval attribute to cron schedule
In the above example, you can also set the interval attribute to a "cron" schedule by specifying strings like the following:
0 * * * *: Means run once an hour, at the top of the hour.
*/15 9-17 * * 1-5: Means run every 15 minutes from 9 am until 5 pm, on Monday to Friday.
15,35,55 0-6,20-23 1 */2 *: Means run at 15, 35, and 55 minutes after the hour, between midnight and 7 am and again between 8pm and midnight, on the first of every even month (February, April, June and so on).
For more information about setting cron schedules, read "CRONTAB(5)" on the Crontab website.
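To get a feel for how the five fields of a schedule string select run times, the following Python sketch checks whether a given moment matches a schedule. It is a simplified reading of the crontab(5) conventions, not Splunk's scheduler: it supports only *, */n, a-b, and comma lists, requires all five fields to match, and counts steps from the start of each field's range. The function names are made up for illustration, and Splunk may treat edge cases differently.

```python
from datetime import datetime

# (lowest, highest) value allowed in each of the five fields, in order:
# minute, hour, day of month, month, day of week (0 = Sunday).
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand_field(text, low, high):
    """Expand one cron field into the set of values it allows."""
    allowed = set()
    for part in text.split(","):
        span, _, step = part.partition("/")
        if span == "*":
            start, end = low, high
        elif "-" in span:
            first, last = span.split("-")
            start, end = int(first), int(last)
        else:
            start = end = int(span)
        allowed.update(range(start, end + 1, int(step) if step else 1))
    return allowed


def matches(schedule, moment):
    """True if moment falls on a minute selected by the five-field schedule."""
    fields = schedule.split()
    if len(fields) != 5:
        raise ValueError("expected five space-separated fields")
    values = [
        moment.minute,
        moment.hour,
        moment.day,
        moment.month,
        (moment.weekday() + 1) % 7,  # datetime: Monday=0; cron: Sunday=0
    ]
    return all(
        value in expand_field(text, low, high)
        for text, value, (low, high) in zip(fields, values, FIELD_RANGES)
    )


# 4 March 2013 was a Monday; 9 March 2013 was a Saturday.
print(matches("0 * * * *", datetime(2013, 3, 4, 14, 0)))          # True
print(matches("*/15 9-17 * * 1-5", datetime(2013, 3, 4, 9, 30)))  # True
print(matches("*/15 9-17 * * 1-5", datetime(2013, 3, 9, 9, 30)))  # False
```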
This documentation applies to the following versions of Splunk® Enterprise: 5.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 5.0.7, 5.0.8, 5.0.9, 5.0.10, 5.0.11, 5.0.12, 5.0.13, 5.0.14, 5.0.15, 5.0.16, 5.0.17, 5.0.18