Splunk® Data Stream Processor

Function Reference

Apply Timestamp Extraction

This topic describes how to use the Apply Timestamp Extraction function in the Splunk Data Stream Processor (DSP).

Description

This function extracts a timestamp from your record's body using a provided extraction type. When you send data to DSP without a timestamp, DSP assigns the time of ingestion, in epoch time, to the record. To use a timestamp from the record itself instead, use the Apply Timestamp Extraction function. By default, this function looks for a timestamp in body and places the extracted timestamp in timestamp.

If the timestamp extraction is successful, the rule, pattern, and configuration type used to match and extract the timestamp are output in a new field called _rule. If timestamp extraction is not successful, the rule returned in _rule is NO_MATCH.

Because the universal forwarder does not do any timestamp extraction on its own, you may need to use this function to extract timestamps from a universal forwarder's event body. For convenience, this function is also included with the universal forwarder template. See Process data from a universal forwarder in DSP.

Function Input/Output Schema

Function Input
collection<record<R>>
This function takes in collections of records with schema R.
Function Output
collection<record<S>>
This function outputs collections of records with schema S.

Syntax

extraction_type is required. The remaining arguments depend on the extraction type.

If extraction_type = auto:

apply_timestamp_extraction
extraction_type = auto
source_field = <field>
destination_field = <field>
tz = <timezone>

If extraction_type = current:

apply_timestamp_extraction
extraction_type = current
source_field = <field>
destination_field = <field>

If extraction_type = advanced:

apply_timestamp_extraction
extraction_type = advanced
time_format = <string>
time_prefix = <regex>
max_time_lookahead = <integer>
tz = <timezone>
fallback_to_auto = <bool>

If extraction_type = config:

apply_timestamp_extraction
extraction_type = config
props_conf = <string>
fallback_to_auto = <bool>

Required arguments

extraction_type
Syntax: auto | current | advanced | config
Description: See the table for a description of each extraction type.
Example: auto
Extraction type | Description
Auto | Extract timestamps automatically using both the built-in DSP timestamp rules and Splunk software's datetime.xml file. For more details on how the auto setting extracts timestamps, see "How the auto timestamp extraction setting works".
Current | Apply the current server time, in epoch format, to all records.
Advanced | Extract a timestamp by specifying a strptime() format and other optional parameters. The following strptime variables are not supported: %c, %+, %Ez, %X, %x, %w, %s. See the Enhanced strptime() support section in the Splunk Enterprise documentation for more information.
Config | Extract a timestamp based on an existing timestamp props.conf configuration. See the props.conf timestamp extraction configuration. Use this setting to reuse and migrate existing timestamp extraction configurations that you already have. The following strptime variables are not supported: %c, %+, %Ez, %X, %x, %w, %s.
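
Before migrating a format string to the Advanced or Config extraction type, it can help to check it against the unsupported variables listed above. The following Python sketch is illustrative only: unsupported_variables is a hypothetical helper, not part of DSP, and it assumes that "%%" is a literal percent sign and that "%E" always introduces a three-character variable.

```python
# Hypothetical helper, not part of DSP: flags the strptime variables that the
# Advanced and Config extraction types are documented as not supporting.
UNSUPPORTED = {"%c", "%+", "%Ez", "%X", "%x", "%w", "%s"}


def unsupported_variables(time_format):
    """Return the unsupported variables found in time_format, in order."""
    found = []
    i = 0
    while i < len(time_format):
        if time_format[i] != "%":
            i += 1
            continue
        # Assumption: "%%" is a literal percent sign, not a variable.
        if time_format[i:i + 2] == "%%":
            i += 2
            continue
        # Assumption: "%E" introduces a three-character variable such as "%Ez".
        width = 3 if time_format[i + 1:i + 2] == "E" else 2
        token = time_format[i:i + width]
        if token in UNSUPPORTED:
            found.append(token)
        i += width
    return found
```

An empty list means that none of the listed variables appear in the format string.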

How the auto timestamp extraction setting works

The auto setting uses both built-in DSP timestamp rules and the datetime.xml file to detect and extract timestamps. The first rule that matches is used. See the following table for built-in DSP timestamp rules and examples. For datetime.xml information, see datetime.xml in the Splunk Cloud documentation. For a copy of datetime.xml, download this zip file.

Timestamp rule | Timestamp example | Extracted epoch time example
catalina_timestamp() | Apr 15, 2010 1:51:22 AM org.apache.catalina.loader.WebappClassLoader validateJarFile | 1271296282000L
cisco_timestamp() | Tag=49: Msg: May 9 2018 21:30:45.493: %IOSXE-4-PLATFORM: R0/0: kernel: hrtime | 1525901445493L
date_timestamp() | 12/31/2017-05:43:11.325 test_user Provider=any oledb provider's name;OledbKey1=someValue;OledbKey2=someValue; | 1514698991325L
eventlog_timestamp() | 20120623053423.123 Audit Success | 1340429663123L
haproxy_timestamp() | 127.0.0.1:39759 09/Dec/2013:12:59:46.633 loadbalancer default/instance8 0/51536/1/48082/99627 200 83285 | 1386593986633L
http_timestamp() | 04/May/2015:13:17:15 +0200 evita postfix/smtpd1713: connect from camomile.cloud9.net168.100.1.3 | 1430745435000L
iso8601_timestamp() | 2014-02-15T23:39:43.945958Z my-test-loadbalancer 192.168.131.39:2817 10.0.0.1:80 0.000073 0.001048 0.000057 200 200 0 29 "GET http://www.example.com:80/ HTTP/1.1" | 1392507583945L
nagios_timestamp() | 1427925600 CURRENT HOST STATE: nagioshost;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 2.24 ms | 1427925600L
other_timestamp() | Mon Aug 31 09:30:48 PST 2015 proxy_fcgi:error pid 28787:tid 140169587934976 (70008)Partial results are valid but processing is incomplete | 1441038648000L
redis_timestamp() | 30200:C 06 May 21:25:10.186 * RDB: 6 MB of memory used by copy-on-write | 1557177910186L
rfc822_timestamp() | <34>Jan 12 06:30:00 2432 apache_server: 1.2.3.4 - - 12/Jan/2011:06:29:59 +0100 "GET /foo/bar.html HTTP/1.1" 301 96 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.9.2.12) | 1299096000000L
rfc2822_timestamp() | Sat Mar 02 2011 15:00:00 EST | 1547274600000L
syslog_timestamp() | May 11 15:17:02 meow.soy.se CRON10973: pam_unix(cron:session): session opened for user root by (uid=0) | 1557587822000L
syslog3164_timestamp() | <34>Jan 12 06:30:00 2432 apache_server: 1.2.3.4 - - [12/Jan/2011:06:29:59 +0100] "GET /foo/bar.html HTTP/1.1" 301 96 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.9.2.12) | 1557587822000L
tomcat_timestamp() | 2014-01-09 20:03:28,269 -0800 ERROR com.example.service.ExampleService - something completely unexpected happened... | 1389326608269L
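
The "first rule that matches is used" behavior can be pictured as an ordered list of rules that is scanned until one matches. The Python sketch below is not DSP's implementation: the two regular expressions, the rule order, and the extract_first_match name are assumptions chosen only to reproduce the iso8601_timestamp() and tomcat_timestamp() rows of the table.

```python
import re
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def _to_epoch_ms(dt):
    # Exact integer milliseconds since the Unix epoch (no float rounding).
    return (dt - EPOCH) // timedelta(milliseconds=1)


def _parse_iso8601(text):
    # The trailing "Z" means UTC, so attach the UTC zone explicitly.
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def _parse_tomcat(text):
    # "%z" reads the numeric offset (for example -0800) from the text.
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S,%f %z")


# Illustrative stand-ins for two rules; patterns and order are assumptions.
RULES = [
    ("iso8601_timestamp",
     re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{1,6}Z"),
     _parse_iso8601),
    ("tomcat_timestamp",
     re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} [+-]\d{4}"),
     _parse_tomcat),
]


def extract_first_match(body):
    """Return (rule_name, epoch_ms) for the first rule that matches body."""
    for name, pattern, parse in RULES:
        match = pattern.search(body)
        if match:
            return name, _to_epoch_ms(parse(match.group(0)))
    return "NO_MATCH", None
```

With these assumed patterns, the iso8601 and tomcat examples from the table produce the epoch values shown in their rows, and a body without a recognizable timestamp yields NO_MATCH.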

Optional arguments

source_field
Syntax: string
Description: The field name to extract a timestamp from. You can use this optional argument for all extraction types.
Default: body
Example: body
destination_field
Syntax: string
Description: The field name to place the extracted timestamp in. You can use this optional argument for all extraction types.
Default: timestamp
Example: timestamp
tz
Syntax: string
Description: The timezone for the record. You can use this optional argument for the auto and advanced extraction types.
Default: UTC
Example: PST
fallback_to_auto
Syntax: boolean
Description: Boolean value to specify whether to use auto in cases where advanced or config do not extract a timestamp. You can use this optional argument for the advanced and config extraction types.
Default: false
Example: true
max_time_lookahead
Syntax: integer
Description: The number of characters into a field that DSP should look for a timestamp. You can use this optional argument in the advanced extraction type.
Default: 128
Example: 100
time_prefix
Syntax: regex
Description: A regular expression to match text that appears before the timestamp. If specified, this function only looks for a timestamp in the text following the end of the first regular expression match. For example, if time_prefix is set to abc123, only text following the first occurrence of abc123 is used for timestamp extraction. If the time_prefix cannot be found in the field, then timestamp extraction does not occur. You can use this optional argument in the advanced extraction type.
Default: empty string
Example: <date>
time_format
Syntax: string
Description: A strptime format string used to extract the timestamp. The time_format starts reading after the time_prefix. If both are specified, the time_prefix regular expression must match up to and including the character before the timestamp. You can use this optional argument in the advanced extraction type. The following strptime variables are not supported: %c, %+, %Ez, %X, %x, %w, %s. See the Enhanced strptime() support section in the Splunk Enterprise documentation for more information.
Default: empty string
Example: %Y-%m-%d
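
The interaction between time_prefix, max_time_lookahead, and time_format can be sketched in Python as follows. This is a rough model, not DSP's implementation: extract_advanced is a hypothetical name, the timestamp is assumed to be in UTC, and the lookahead is assumed to be counted from the end of the time_prefix match (or from the start of the field when no prefix is given).

```python
import calendar
import re
from datetime import datetime


def extract_advanced(field, time_format, time_prefix="", max_time_lookahead=128):
    """Return epoch milliseconds (UTC assumed), or None if nothing is extracted."""
    start = 0
    if time_prefix:
        match = re.search(time_prefix, field)
        if match is None:
            return None  # prefix not found, so no extraction takes place
        start = match.end()
    # Assumption: the lookahead window begins where the search begins.
    window = field[start:start + max_time_lookahead]
    # Python's strptime must consume its whole input, so try the longest
    # candidate first and shorten it until the format fits.
    for end in range(len(window), 0, -1):
        try:
            parsed = datetime.strptime(window[:end], time_format)
        except ValueError:
            continue
        return calendar.timegm(parsed.timetuple()) * 1000 + parsed.microsecond // 1000
    return None
```

For instance, extract_advanced("checkin_date:1998-12-31", "%Y-%m-%d", r"checkin_date:") yields 915062400000, and the same call with a prefix that does not occur in the field yields None.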

Usage

This section contains additional usage information about the Apply Timestamp Extraction function.

Organize the data stream based on whether a timestamp was extracted

This function outputs a new top-level field called _rule containing information on whether the extraction was successful. Specifically, this field contains the rule, pattern, and configuration type used to match and extract the timestamp. If timestamp extraction is not successful, then the rule returned in the _rule field is NO_MATCH.

To organize your data, you can branch your pipeline so that one branch carries the records that matched a rule and the other branch carries the records that did not. To do this, branch your pipeline after the Apply Timestamp Extraction function and add a where function to each branch. Configure one where function to keep records that matched a rule and the other where function to keep records that did not. The resulting pipeline looks similar to this:

$statement_2 = | from splunk_firehose() | apply_timestamp_extraction fallback_to_auto=false extraction_type="auto";
| from $statement_2 | where "NO_MATCH"!=map_get(_rule, "_rule_id");
| from $statement_2 | where "NO_MATCH"=map_get(_rule, "_rule_id");
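
Outside of DSP, the same split can be modeled in a few lines of Python, which may help when testing downstream logic against sample records. The split_by_match name and the dictionary-shaped records are assumptions made for illustration; the sketch also assumes every record already carries a _rule map.

```python
def split_by_match(records):
    """Partition records on whether _rule reports NO_MATCH."""
    matched, unmatched = [], []
    for record in records:
        # Mirrors map_get(_rule, "_rule_id") in the SPL2 statements above.
        rule_id = record["_rule"]["_rule_id"]
        if rule_id == "NO_MATCH":
            unmatched.append(record)
        else:
            matched.append(record)
    return matched, unmatched
```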

Which props.conf settings can I reuse

This table lists the props.conf settings that you can reuse in the Apply Timestamp Extraction function. Other timestamp extraction settings are not supported. If you have multiple props.conf stanzas, DSP also applies the Splunk software timestamp extraction precedence. See props.conf in the Admin Manual for more information on the settings themselves and on precedence order.

Setting | Description | Default
TIME_PREFIX = <regular expression> | If set, Splunk software scans the event text for a match for this regex before attempting to extract a timestamp. | empty string
MAX_TIMESTAMP_LOOKAHEAD = <integer> | The number of characters into an event that Splunk software looks for a timestamp. | 128
TIME_FORMAT = <strptime-style format> | Specifies a "strptime" format string to extract the date. | empty string
TZ = <timezone identifier> | Specifies a timezone to use. | empty string
MAX_DAYS_AGO = <integer> | The maximum number of days in the past, from the current date as provided by the input layer, that an extracted date can be valid. | 2000 (5.48 years)
MAX_DAYS_HENCE = <integer> | The maximum number of days in the future, from the current date as provided by the input layer, that an extracted date can be valid. | 2
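
MAX_DAYS_AGO and MAX_DAYS_HENCE together define a window around the current date in which an extracted timestamp is considered valid. The Python sketch below shows the arithmetic with the default values from the table. The is_within_window name is hypothetical, and treating both boundaries as inclusive is an assumption.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def is_within_window(extracted_ms, now_ms, max_days_ago=2000, max_days_hence=2):
    """Check an extracted epoch-millisecond timestamp against the validity window."""
    earliest = now_ms - max_days_ago * MS_PER_DAY
    latest = now_ms + max_days_hence * MS_PER_DAY
    # Assumption: timestamps exactly on either boundary count as valid.
    return earliest <= extracted_ms <= latest
```

As a worked case, a timestamp of 1271296282000 (April 2010) checked against a current time of 1591110134963 (June 2020) is roughly 3,701 days old, which falls outside the default 2000-day limit.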

SPL2 examples

Examples of common use cases follow. These examples assume that you are using the SPL2 Pipeline Builder.

1. Extract a timestamp from the body field using built-in rules into the timestamp field

Suppose the body field contains a timestamp that you want to use as the record's timestamp. This example uses extraction_type="auto" to search the body field for a timestamp and extract the timestamp that matches either datetime.xml or DSP's built-in timestamp rules. Once a timestamp has been matched and extracted, the pattern and rule used for the match are returned in the new top-level field _rule.

... | apply_timestamp_extraction extraction_type="auto" |... ;

Incoming record:

Record {
 body = "Apr 15, 2010 1:51:22 AM org.apache.catalina.loader.WebappClassLoader validateJarFile"
 timestamp = "1591110134963"
}

Outgoing record:

Record {
 body = "Apr 15, 2010 1:51:22 AM org.apache.catalina.loader.WebappClassLoader validateJarFile"
 timestamp = "1271296282000"
 _rule = "{"_rule_id":"GROK_TIMESTAMP","_rule_description":"Grok based timestamp patterns","_metadata":{"_pattern":"%{CATALINA_DATESTAMP}"}}"
}
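
To see where the outgoing timestamp value comes from, the conversion can be reproduced by hand. The Python sketch below parses the date portion of the body as UTC (the documented default for tz) and converts it to epoch milliseconds. The catalina_to_epoch_ms name and the format string are assumptions for illustration, not the pattern DSP uses internally.

```python
import calendar
from datetime import datetime


def catalina_to_epoch_ms(text):
    """Convert a date such as 'Apr 15, 2010 1:51:22 AM' (UTC assumed) to epoch ms."""
    parsed = datetime.strptime(text, "%b %d, %Y %I:%M:%S %p")
    # timegm interprets the broken-down time as UTC.
    return calendar.timegm(parsed.timetuple()) * 1000


print(catalina_to_epoch_ms("Apr 15, 2010 1:51:22 AM"))  # 1271296282000
```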

2. Set the timestamp for all incoming records to current server time

... | apply_timestamp_extraction extraction_type="current" | ...;

Incoming record:

Record {
 body = "Apr 15, 2010 1:51:22 AM org.apache.catalina.loader.WebappClassLoader validateJarFile"
 timestamp = "1591110134963"
}

Outgoing record:

Record {
 body = "Apr 15, 2010 1:51:22 AM org.apache.catalina.loader.WebappClassLoader validateJarFile"
 timestamp = "1591280304957"
 _rule = "{"_rule_id":"CURRENT_TIMESTAMP","_rule_description":"Current processing timestamp","_metadata":null}"
}

3. Extract a timestamp from the body field by specifying a time_format and a time_prefix

In this example, suppose you want to use a date that is in the event's body field rather than the existing timestamp, which is the time that the event was ingested into DSP. This example uses time_prefix to specify the text that always precedes the date, and time_format to extract the date.

...| apply_timestamp_extraction time_format="%Y-%m-%d" time_prefix=/checkin_date:/ extraction_type="advanced" | ...;

Incoming record:

Record {
 body = "checkin_date:1998-12-31"
 timestamp = "1591284656777"
}

Outgoing record:

Record {
 body = "checkin_date:1998-12-31"
 timestamp = "915062400000"
 _rule = "{"_rule_id":"ADVANCED_TIMESTAMP","_rule_description":"Advanced time-stamp rule with user defined timestamp pattern","_metadata":{"_pattern":"%Y-%m-%d"}}"
}

4. Extract a timestamp from body using a pre-existing props.conf configuration

Suppose you have an existing props.conf configuration that you'd like to migrate to DSP in order to perform timestamp extraction. In this example, we use the config extraction type to reuse and migrate an existing props.conf configuration that extracts timestamps using two different formats depending on the event's sourcetype.

... | apply_timestamp_extraction props_conf="TIME_PREFIX=a.b\nTIME_FORMAT=%y-%B-%d %T\nTZ=MST\n[props_timestamp]\nTIME_FORMAT=%A, %B %d %Y %I:%M:%S%p\nTZ=PST" fallback_to_auto=false extraction_type="config" | ...;

Incoming records:

Record { 
body =  "a.b Hello Tuesday, April 09 2019 12:00:00AM rootfs/usr/bin/planet[3422]: ERRO \"StartProcess(&{0xc00056a2d0 0xc0003b61a8 <nil> [/bin/systemctl is-active registry.service] [{PATH /usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin} {HTTP_PROXY } {http_proxy } {HTTPS_PROXY } {https_proxy } {NO_PROXY 0.0.0.0/0,.local} {ENV /etc/container-environment} {BASH_ENV /etc/container-environment} {ETCDCTL_CERT_FILE /var/state/etcd.cert} {ETCDCTL_KEY_FILE /var/state/etcd.key} {ETCDCTL_CA_FILE /var/state/root.cert} {ETCDCTL_PEERS https://127.0.0.1:2379} {KUBECONFIG /etc/kubernetes/kubectl.kubeconfig}]}) failed with",
timestamp = 11111111,
sourcetype = "props_timestamp",
source = "config"
}
Record { 
body =  "a.b Hello 18-December-25 15:00:00 rootfs/usr/bin/planet[3422]: ERRO \"StartProcess(&{0xc00056a2d0 0xc0003b61a8 <nil> [/bin/systemctl is-active registry.service] [{PATH /usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin} {HTTP_PROXY } {http_proxy } {HTTPS_PROXY } {https_proxy } {NO_PROXY 0.0.0.0/0,.local} {ENV /etc/container-environment} {BASH_ENV /etc/container-environment} {ETCDCTL_CERT_FILE /var/state/etcd.cert} {ETCDCTL_KEY_FILE /var/state/etcd.key} {ETCDCTL_CA_FILE /var/state/root.cert} {ETCDCTL_PEERS https://127.0.0.1:2379} {KUBECONFIG /etc/kubernetes/kubectl.kubeconfig}]}) failed with",
timestamp = 11111111,
source = "config"
}

Outgoing records:

Record { 
body = "a.b Hello Tuesday, April 09 2019 12:00:00AM rootfs/usr/bin/planet[3422]: ERRO \"StartProcess(&{0xc00056a2d0 0xc0003b61a8 <nil> [/bin/systemctl is-active registry.service] [{PATH /usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin} {HTTP_PROXY } {http_proxy } {HTTPS_PROXY } {https_proxy } {NO_PROXY 0.0.0.0/0,.local} {ENV /etc/container-environment} {BASH_ENV /etc/container-environment} {ETCDCTL_CERT_FILE /var/state/etcd.cert} {ETCDCTL_KEY_FILE /var/state/etcd.key} {ETCDCTL_CA_FILE /var/state/root.cert} {ETCDCTL_PEERS https://127.0.0.1:2379} {KUBECONFIG /etc/kubernetes/kubectl.kubeconfig}]}) failed with",
timestamp = 1554793200000,
sourcetype = "props_timestamp",
source = "config"
_rule = {"_rule_id":"PROPS_CONF_TIMESTAMP","_rule_description":"","_metadata":{"_pattern":"%A, %B %d %Y %I:%M:%S%p"}}
}
Record { 
body = "a.b Hello 18-December-25 15:00:00 rootfs/usr/bin/planet[3422]: ERRO \"StartProcess(&{0xc00056a2d0 0xc0003b61a8 <nil> [/bin/systemctl is-active registry.service] [{PATH /usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin} {HTTP_PROXY } {http_proxy } {HTTPS_PROXY } {https_proxy } {NO_PROXY 0.0.0.0/0,.local} {ENV /etc/container-environment} {BASH_ENV /etc/container-environment} {ETCDCTL_CERT_FILE /var/state/etcd.cert} {ETCDCTL_KEY_FILE /var/state/etcd.key} {ETCDCTL_CA_FILE /var/state/root.cert} {ETCDCTL_PEERS https://127.0.0.1:2379} {KUBECONFIG /etc/kubernetes/kubectl.kubeconfig}]}) failed with",
timestamp = 1545775200000,
source = "config"
_rule = {"_rule_id":"PROPS_CONF_TIMESTAMP","_rule_description":"n","_metadata":{"_pattern":"%y-%B-%d %T"}}
}
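
The way the two records above end up with different formats can be modeled as a lookup keyed on sourcetype. The Python sketch below parses the props_conf string from this example and resolves the settings for a given sourcetype. It is a simplified model rather than DSP's parser: parse_props_conf and settings_for are hypothetical names, and it assumes that settings placed before any stanza header act as defaults that a matching stanza overrides.

```python
def parse_props_conf(props_conf):
    """Split a props_conf string into default settings and named stanzas."""
    defaults, stanzas = {}, {}
    current = defaults
    for line in props_conf.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            # A stanza header: later settings belong to this stanza.
            current = stanzas.setdefault(line[1:-1], {})
            continue
        key, _, value = line.partition("=")
        current[key.strip()] = value.strip()
    return defaults, stanzas


def settings_for(props_conf, sourcetype=None):
    """Assumption: stanza settings override the leading default settings."""
    defaults, stanzas = parse_props_conf(props_conf)
    return {**defaults, **stanzas.get(sourcetype, {})}


# The props_conf value used in this example.
PROPS_CONF = (
    "TIME_PREFIX=a.b\nTIME_FORMAT=%y-%B-%d %T\nTZ=MST\n"
    "[props_timestamp]\nTIME_FORMAT=%A, %B %d %Y %I:%M:%S%p\nTZ=PST"
)
```

Under these assumptions, settings_for(PROPS_CONF, "props_timestamp") selects the %A, %B %d %Y %I:%M:%S%p format with TZ=PST, and a record without a sourcetype falls back to %y-%B-%d %T with TZ=MST, which matches the _pattern values in the outgoing records.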
Last modified on 13 November, 2020

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0

