Pre-process binary formats
This documentation does not apply to the most recent version of Splunk.
The Splunk Server can be configured to run a pre-processor script on files before it processes them internally. This is especially useful for files with binary formats that can't easily be parsed into searchable segments.
Preprocessing can only be used on files configured as batch inputs, rather than tail inputs or non-file sources.
The example below runs gunzipit on every file under /u3/archives/ that has a .dat extension.
etc/bundles/local/props.conf
[source::/u3/archives/....dat]
invalid_cause = needs_preprocess
preprocessing_script = gunzipit
The included script $SPLUNK_HOME/bin/untarit is a good example of a preprocessing script. The batchfile input module invokes scripts with three parameters set: $0 is the script name, $1 is the pathname of the file to be processed (the script changes into it with cd, so it is the directory that holds the file), and $2 is the name of the file.
#!/bin/bash
cd "$1" || exit 1
# Wait until the file has been copied entirely; don't start untarring before that.
while /sbin/fuser -s "$2"
do
    sleep 0.5
done
tar --overwrite -zxf "$2"  # untar the file
rm -f "$2"
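The example configuration names a gunzipit script that is not shown here. A minimal sketch of what such a script might look like, following the same $1/$2 convention as untarit, is given below. It is an illustration, not the script shipped with Splunk: the function name gunzip_batch_file, the .txt output suffix, and the choice to delete the compressed original are all assumptions.

```shell
#!/bin/bash
# Hypothetical gunzipit sketch; not the script shipped with Splunk.
# $1 = directory holding the file, $2 = file name (same convention as untarit).

gunzip_batch_file() (
    # The body runs in a subshell, so the cd does not leak to the caller.
    cd "$1" || exit 1

    # Wait while another process still has the file open (for example, a copy
    # that is still writing it). Skipped when fuser is not installed.
    if command -v fuser >/dev/null 2>&1; then
        while fuser -s "$2" 2>/dev/null; do
            sleep 0.5
        done
    fi

    # gunzip refuses names without a .gz suffix, so stream through gzip -dc.
    # The ".txt" output suffix is an assumption made for this sketch.
    if gzip -dc -- "$2" > "$2.txt"; then
        rm -f -- "$2"          # remove the compressed original, as untarit does
    else
        rm -f -- "$2.txt"      # discard partial output on failure
        exit 1
    fi
)

# Run only when called with both arguments.
if [ "$#" -ge 2 ]; then
    gunzip_batch_file "$1" "$2"
fi
```

Saved as an executable file, the sketch can be tried by hand with a directory and a file name as its two arguments before it is referenced from preprocessing_script.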
Skipping the binary check
The Splunk Server checks each file it loads to determine if it is a binary file or not. To skip this check and process files regardless of their format, set the following parameter in props.conf.
NO_BINARY_CHECK=true
You can set this property for all files, or for those that match a specific host, source or source type.
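As an illustration of scoping the property to a source, the fragment below reuses the source stanza pattern from the preprocessing example earlier on this page. It is a sketch only; the path is carried over from that example and is not a recommendation.

```ini
# etc/bundles/local/props.conf
# Skip the binary check only for .dat files under /u3/archives/
[source::/u3/archives/....dat]
NO_BINARY_CHECK = true
```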
This documentation applies to the following versions of Splunk: 2.1, 2.2, 2.2.1, 2.2.3, 2.2.6.