Extract additional meta data (e.g. user, severity) from events
This documentation does not apply to the most recent version of Splunk.
This example creates a searchable ssn:: meta data value in events that contain US Social Security Numbers, which have the form "017-46-3879".
Note: If you do not need to search on the custom field but only need to show it in a report, you can avoid the overhead of indexing and storage by using a search-time field. This reduces indexing time and the storage needed to persist custom metadata, and gives you more flexibility to change your logic later without re-indexing. See Define Search-Time Report Fields for more information.
etc/bundles/local/regexes.conf
First, create a regular expression that finds Social Security Numbers in the format "[nnn-nn-nnnn]", prepends them with ssn::, and stores them as ssn keys that are displayed in Splunk results.
[ssn]
# This is the default for [ssn]:
REGEX = \[(\d\d\d-\d\d-\d\d\d\d)\]
DEST_KEY = _meta
FORMAT = $0 ssn::$1
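To see what this stanza matches, the sketch below reproduces the REGEX with Python's re module. It only illustrates the pattern and is not how Splunk applies it internally. The function name ssn_meta and the sample event text are made up for this example, and the $0 part of FORMAT is left out.

```python
import re

# Same pattern as REGEX in the [ssn] stanza above.
SSN_REGEX = re.compile(r"\[(\d\d\d-\d\d-\d\d\d\d)\]")


def ssn_meta(event):
    """Return the 'ssn::<number>' term for an event, or None if nothing matches.

    Hypothetical helper: it mimics only the 'ssn::$1' part of FORMAT.
    """
    match = SSN_REGEX.search(event)
    if match is None:
        return None
    return "ssn::" + match.group(1)


# Sample events invented for illustration.
print(ssn_meta("user lookup for [017-46-3879] succeeded"))
print(ssn_meta("user lookup for 017-46-3879 succeeded"))
```

The second call returns None because the pattern requires square brackets around the number.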
Attributes
You may need to set one or more of these.
- DEFAULT_VALUE
The fallback value to set if the regex does not match.
- DEST_KEY
Where the matching value should be stored: either a built-in Splunk meta data value, such as MetaData:Host, or a new value you create, such as pid. The special format MetaData: is only for the built-in meta data values host, source, and sourcetype. If you create a new key, users will be able to Splunk for it in the Splunk box, for example:
pid::4853
Don't start DEST_KEY values with an underscore, e.g. use pid not _pid . Underscored values are reserved for special use.
- FORMAT
How to place matching regular expression section(s) into each event. Specify each matching section as $1, $2, etc.
- PREPEND
String to put in front of the set value at DEST_KEY. This is the meta data name you want users to search for, such as host:: or pid::. You must also set WRITE_META to True for it to be Splunkable from the interface.
- REGEX
The regular expression. It can have more than one matching section.
- SOURCE_KEY
A parallel to DEST_KEY. This is where Splunk should look for matches. The default is _raw , the stream of data passing through the pipeline. You can also use any value defined as DEST_KEY or built in, such as MetaData:Host or pid from the above example.
- WRITE_META
A boolean value - True or False - that tells Splunk whether or not to display the set value in search results along with host:: , source:: and sourcetype:: . Either way, users can search for the value.
etc/bundles/local/props.conf
Then add an entry to map the regex function to a sourcetype.
[my_custom_sourcetype]
REGEXES-ssn = ssn
Attributes
- REGEXES-*
String
Default: There is no default set.
Specifies a comma-delimited list of regular expression entries in regexes.conf. Give the attribute an arbitrary name, such as REGEXES-ssn, so that your regexes for metadata extraction do not override other regexes that should also apply to the event.
Extracting syslog severity:: meta data
The configuration below shows how to make syslog-ng events splunkable by severity, by indexing severity:: meta data on each event. (Thanks to dtshaffer for the example.)
syslog-ng.conf
Configure syslog-ng to log event severity levels.
template("$MONTH/$DAY/$YEAR $HOUR:$MIN:$SEC $HOST [$LEVEL] $MSG\n")
template_escape(no)
etc/bundles/local/regexes.conf
Create a regex to assign a severity:: meta data value to each event.
[syslog-severity]
DEST_KEY = severity
REGEX = ^(?:\d+\/\d+\/\d+ \d{2}:\d{2}:\d{2})\s+\S+\s+\[(\S+)\]
FORMAT = severity::$1
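The sketch below checks this REGEX against a line in the template layout, using Python's re module. It is an illustration only, not Splunk's own processing. The helper name severity_meta and the sample log line are hypothetical.

```python
import re

# Same pattern as REGEX in the [syslog-severity] stanza above.
SEVERITY_REGEX = re.compile(
    r"^(?:\d+\/\d+\/\d+ \d{2}:\d{2}:\d{2})\s+\S+\s+\[(\S+)\]"
)


def severity_meta(event):
    """Return the 'severity::<level>' term, or None if the line does not match."""
    match = SEVERITY_REGEX.search(event)
    if match is None:
        return None
    return "severity::" + match.group(1)


# Hypothetical line in the
# "$MONTH/$DAY/$YEAR $HOUR:$MIN:$SEC $HOST [$LEVEL] $MSG" layout.
sample = "09/06/2006 21:55:08 mailhost [err] disk quota exceeded"
print(severity_meta(sample))
```

Because the pattern is anchored with ^ and expects the date and time first, lines that do not follow the template produce no severity value.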
etc/bundles/local/props.conf
Add an entry for your syslog events that invokes the regex created above.
[syslog]
REGEXES-syslog-severity = syslog-severity
Splunk it!
After restarting Splunk, you should be able to search new syslog events by severity:: as shown below.
sourcetype::syslog index::main ( severity::alert OR severity::crit OR severity::err )
Real-world example: Windows Event IDs
A customer is attempting to get Splunk to operate in a primarily Windows environment. A major part of this is being able to generate a set of 'canned queries' that level 1 support can run against systems on a scheduled basis (replicating current Snare server functionality). The easiest way to do this appeared to be to search on Windows event IDs (as Snare does). To prevent matches with other numbers in event data, the customer wants to extract only the event IDs from data with sourcetype::windows.
cbrsq002.asda.org.au MSWinEventLog 1 Security 725 Wed Sep 06 21:55:08 2006 576 Security SYSTEM User Success Audit CBRSQ002 Privilege Use Special privileges assigned to new logon: User Name: Domain: Logon ID: (0x0,0x4F41CE73) Assigned: SeBackupPrivilege SeRestorePrivilege SeDebugPrivilege SeChangeNotifyPrivilege SeAssignPrimaryTokenPrivilege 527
To extract the event IDs, you need an entry in props.conf like:
[windows]
MAX_TIMESTAMP_LOOKAHEAD = 32
REGEXES = syslog-host,eventid
SHOULD_LINEMERGE = False
TYPING_CONFIG = /etc/event-types/current/windows.xml
maxDist = 150
In regexes.conf you need to have an entry to extract the data and append it to the meta tag:
[eventid]
DEST_KEY = eventid
REGEX = \d\d:\d\d:\d\d\s\d\d\d\d\s+(\w+)
FORMAT = eventid::$1
DEFAULT_VALUE = eventid::0
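As a rough check of this stanza, the sketch below runs the same REGEX over the start of the sample event shown earlier and falls back to DEFAULT_VALUE when nothing matches. It uses Python's re module and is not Splunk's implementation. The helper name eventid_meta is hypothetical.

```python
import re

# Same pattern and fallback as the [eventid] stanza above.
EVENTID_REGEX = re.compile(r"\d\d:\d\d:\d\d\s\d\d\d\d\s+(\w+)")
DEFAULT_VALUE = "eventid::0"


def eventid_meta(event):
    """Return 'eventid::<id>' for an event, or DEFAULT_VALUE if there is no match."""
    match = EVENTID_REGEX.search(event)
    if match is None:
        return DEFAULT_VALUE
    return "eventid::" + match.group(1)


# Start of the sample Windows event, cut off after the audit fields.
sample = (
    "cbrsq002.asda.org.au MSWinEventLog 1 Security 725 "
    "Wed Sep 06 21:55:08 2006 576 Security SYSTEM User Success Audit"
)
print(eventid_meta(sample))
```

The pattern keys on the time and four-digit year, so the captured value is whatever word follows the timestamp, which in this layout is the event ID.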
Real-world example
Customer has an in-house trading application. This application logs a transaction like:
5234074322006061300000000020060616000000000ARB
Splunk needs to parse it as shown below:
- 523407432 - account number
- 20060613 - trade date
- 20060616 - settle date
- ARB - symbol
To achieve this without modifying the event itself, you need to modify two configuration files found in the /opt/splunk/etc/bundles/local directory. Add the following to your props.conf:
[source::<path of the logfile>]
REGEXES-inhouse = inhouse
This stanza will call the regex "inhouse" contained in your regexes.conf. You need to add the following stanza to regexes.conf:
[inhouse]
REGEX = ^(.{9})(.{8})(.{9})(.{8})(.{9})(.{3})
DEST_KEY = _MetaData:IndexTerms
FORMAT = $0 AccountNumber::$1 TradeDate::$2 SettleDate::$4 Symbol::$6
This change creates the meta tags AccountNumber::, TradeDate::, SettleDate::, and Symbol::. This allows you to search for the term AccountNumber::523407432 and see all events that contain that meta tag.
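The fixed-width split can be sketched in Python as follows. This illustrates the REGEX and the group numbers used in FORMAT; it is not how Splunk processes the event. The names FIELDS and trade_meta are hypothetical, and the $0 part of FORMAT is left out.

```python
import re

# Same pattern as REGEX in the [inhouse] stanza: six fixed-width sections.
INHOUSE_REGEX = re.compile(r"^(.{9})(.{8})(.{9})(.{8})(.{9})(.{3})")

# Capture group number -> meta data name, following the FORMAT line.
# Groups 3 and 5 are captured by the pattern but not used in FORMAT.
FIELDS = {1: "AccountNumber", 2: "TradeDate", 4: "SettleDate", 6: "Symbol"}


def trade_meta(event):
    """Return the list of 'Name::value' terms, or an empty list if no match."""
    match = INHOUSE_REGEX.search(event)
    if match is None:
        return []
    return [f"{name}::{match.group(n)}" for n, name in FIELDS.items()]


# The transaction line from the example above.
sample = "5234074322006061300000000020060616000000000ARB"
for term in trade_meta(sample):
    print(term)
```

Because every section has a fixed width, an event shorter than 46 characters does not match and yields no terms.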
This documentation applies to the following versions of Splunk: 2.1, 2.2, 2.2.1, 2.2.3, 2.2.6.