Apply Timestamp Extraction
This topic describes how to use the Apply Timestamp Extraction function in the Splunk Data Stream Processor.
Description
This function extracts a timestamp from your record's body using a provided extraction type. When you send data to the Data Stream Processor (DSP) with a missing timestamp, the time of ingestion, in epoch time, is assigned to your record. To extract a timestamp from your record and use it as the record's timestamp instead, use the Apply Timestamp Extraction function. By default, this function looks for a timestamp in body and places the extracted timestamp in timestamp.
If the timestamp extraction is successful, the rule, pattern, and configuration type used to match and extract the timestamp in your data are output in a new field called _rule. If timestamp extraction is not successful, then the rule returned in _rule is NO_MATCH.
Because the universal forwarder does not do any timestamp extraction on its own, you may need to use this function to extract timestamps from a universal forwarder's event body. For convenience, this function is also included with the universal forwarder template. See Process data from a universal forwarder.
Function Input/Output Schema
- Function Input
- collection<record<R>>
- This function takes in collections of records with schema R.
- Function Output
- collection<record<S>>
- This function outputs collections of records with schema S.
Syntax
The required fields are in bold.
If extraction_type = auto:
- apply_timestamp_extraction
- extraction_type = auto
- source_field = <field>
- destination_field = <field>
- tz = <timezone>
If extraction_type = current:
- apply_timestamp_extraction
- extraction_type = current
- source_field = <field>
- destination_field = <field>
If extraction_type = advanced:
- apply_timestamp_extraction
- extraction_type = advanced
- time_format = <string>
- time_prefix = <regex>
- max_time_lookahead = <integer>
- tz = <timezone>
- fallback_to_auto = <bool>
If extraction_type = config:
- apply_timestamp_extraction
- extraction_type = config
- props_conf = <string>
- fallback_to_auto = <bool>
Required arguments
- extraction_type
- Syntax: auto | current | advanced | config
- Description: See the table for a description of each extraction type.
- Example in Canvas View: auto
Extraction Type | Description |
---|---|
Auto | Extract timestamps automatically using both the built-in DSP timestamp rules and Splunk software's datetime.xml file. For more details on how the auto setting extracts timestamps, see "Auto timestamp rules". |
Current | Apply the current server time, in epoch format, to all records. |
Advanced | Extract a timestamp by inputting a specific strptime() format and specifying other optional parameters. The following strptime variables are not supported: %c, %+, %Ez, %X, %x, %w, %s. See the Enhanced strptime() support section in the Splunk Enterprise documentation for more information. |
Config | Extract a timestamp based on an existing timestamp props.conf configuration. See the props.conf timestamp extraction configuration. Use this setting to reuse and migrate existing timestamp extraction configurations that you already have. The following strptime variables are not supported: %c, %+, %Ez, %X, %x, %w, %s. |
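As an illustration of the restriction on strptime variables, the following Python sketch scans a strptime-style format string for the unsupported variables before you use it in time_format or in a props.conf TIME_FORMAT setting. It is not part of DSP: the function name find_unsupported and the tokenizing regular expression are assumptions made for this example.

```python
import re

# Variables that the extraction type table lists as unsupported.
UNSUPPORTED = {"%c", "%+", "%Ez", "%X", "%x", "%w", "%s"}

# A token is "%" followed by an E-modified letter (such as %Ez) or by one character.
# "%%" is consumed as a single token, so an escaped percent sign is never flagged.
_TOKEN = re.compile(r"%(?:E[A-Za-z]|.)")


def find_unsupported(time_format: str) -> list:
    """Return the unsupported variables found, in the order they appear."""
    return [token for token in _TOKEN.findall(time_format) if token in UNSUPPORTED]


print(find_unsupported("%Y-%m-%d %X"))              # ['%X']
print(find_unsupported("%A, %B %d %Y %I:%M:%S%p"))  # []
```

The check is case-sensitive, so %S (seconds) passes while %s is reported.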
How the auto timestamp extraction setting works
The auto setting uses both built-in DSP timestamp rules and the datetime.xml file to detect and extract timestamps. The first rule that matches is used. See the following table for built-in DSP timestamp rules and examples. For datetime.xml information, see datetime.xml in the Splunk Cloud documentation. For a copy of datetime.xml, download this zip file.
Timestamp rule | Timestamp example | Extracted Epoch time example |
---|---|---|
catalina_timestamp() | Apr 15, 2010 1:51:22 AM org.apache.catalina.loader.WebappClassLoader validateJarFile | 1271296282000L |
cisco_timestamp() | Tag=49: Msg: May 9 2018 21:30:45.493: %IOSXE-4-PLATFORM: R0/0: kernel: hrtime | 1525901445493L |
date_timestamp() | 12/31/2017-05:43:11.325 test_user Provider=any oledb provider's name;OledbKey1=someValue;OledbKey2=someValue; | 1514698991325L |
eventlog_timestamp() | 20120623053423.123 Audit Success | 1340429663123L |
haproxy_timestamp() | 127.0.0.1:39759 09/Dec/2013:12:59:46.633 loadbalancer default/instance8 0/51536/1/48082/99627 200 83285 | 1386593986633L |
http_timestamp() | 04/May/2015:13:17:15 +0200 evita postfix/smtpd1713: connect from camomile.cloud9.net168.100.1.3 | 1430745435000L |
iso8601_timestamp() | 2014-02-15T23:39:43.945958Z my-test-loadbalancer 192.168.131.39:2817 10.0.0.1:80 0.000073 0.001048 0.000057 200 200 0 29 "GET http://www.example.com:80/ HTTP/1.1" | 1392507583945L |
nagios_timestamp() | 1427925600 CURRENT HOST STATE: nagioshost;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 2.24 ms | 1427925600L |
other_timestamp() | Mon Aug 31 09:30:48 PST 2015 proxy_fcgi:error pid 28787:tid 140169587934976 (70008)Partial results are valid but processing is incomplete | 1441038648000L |
redis_timestamp() | "30200:C 06 May 21:25:10.186 * RDB: 6 MB of memory used by copy-on-write | 1557177910186L |
rfc822_timestamp() | <34>Jan 12 06:30:00 2432 apache_server: 1.2.3.4 - - 12/Jan/2011:06:29:59 +0100 "GET /foo/bar.html HTTP/1.1" 301 96 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.9.2.12) | 1547274600000L |
rfc2822_timestamp() | Sat Mar 02 2011 15:00:00 EST | 1299096000000L |
syslog_timestamp() | May 11 15:17:02 meow.soy.se CRON10973: pam_unix(cron:session): session opened for user root by (uid=0) | 1557587822000L |
syslog3164_timestamp() | <34>Jan 12 06:30:00 2432 apache_server: 1.2.3.4 - - [12/Jan/2011:06:29:59 +0100] "GET /foo/bar.html HTTP/1.1" 301 96 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.9.2.12) | 1557587822000L |
tomcat_timestamp() | 2014-01-09 20:03:28,269 -0800 ERROR com.example.service.ExampleService - something completely unexpected happened... | 1389326608269L |
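The "first rule that matches is used" behavior can be sketched in Python as follows. This is an illustration under stated assumptions, not the DSP implementation: the two rules, their names (catalina_like and iso8601_like), and their regular expressions are hypothetical stand-ins for the built-in rules, and timestamps are treated as UTC.

```python
import calendar
import re
from datetime import datetime

# Hypothetical stand-ins for two rules; the real rule definitions are not shown in this topic.
# Each entry is (rule name, pattern that locates the timestamp, strptime format that parses it).
RULES = [
    ("catalina_like",
     re.compile(r"[A-Z][a-z]{2} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M"),
     "%b %d, %Y %I:%M:%S %p"),
    ("iso8601_like",
     re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{1,6}Z"),
     "%Y-%m-%dT%H:%M:%S.%fZ"),
]


def extract_epoch_ms(body: str):
    """Return (rule_name, epoch_ms) for the first matching rule, else ("NO_MATCH", None)."""
    for name, pattern, time_format in RULES:
        match = pattern.search(body)
        if match:
            parsed = datetime.strptime(match.group(0), time_format)
            # Treat the parsed time as UTC and truncate microseconds to milliseconds.
            seconds = calendar.timegm(parsed.timetuple())
            return name, seconds * 1000 + parsed.microsecond // 1000
    return "NO_MATCH", None


print(extract_epoch_ms("Apr 15, 2010 1:51:22 AM org.apache.catalina.loader.WebappClassLoader validateJarFile"))
# ('catalina_like', 1271296282000)
```

Because the rules are tried in list order, reordering RULES changes which rule wins when more than one pattern could match the same text.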
Optional arguments
- source_field
- Syntax: string
- Description: The field name to extract a timestamp from. You can use this optional argument for all extraction types.
- Default: body
- Example in Canvas View: body
- destination_field
- Syntax: string
- Description: The field name to place the extracted timestamp in. You can use this optional argument for all extraction types.
- Default: timestamp
- Example in Canvas View: timestamp
- tz
- Syntax: string
- Description: The timezone for the record. You can use this optional argument for the auto and advanced extraction types.
- Default: UTC
- Example in Canvas View: America/Los_Angeles
- fallback_to_auto
- Syntax: boolean
- Description: Boolean value to specify whether to use auto in cases where advanced or config do not extract a timestamp. You can use this optional argument for the advanced and config extraction types.
- Default: false
- Example in Canvas View: true
- max_time_lookahead
- Syntax: integer
- Description: The number of characters into a field that the function should look for a timestamp. You can use this optional argument in the advanced extraction type.
- Default: 128
- Example in Canvas View: 100
- time_prefix
- Syntax: regex
- Description: A regular expression to match text that appears before the timestamp. If specified, this function only looks for a timestamp in the text following the end of the first regular expression match. For example, if time_prefix is set to abc123, only text following the first occurrence of abc123 is used for timestamp extraction. If the time_prefix cannot be found in the field, then timestamp extraction does not occur. You can use this optional argument in the advanced extraction type.
- Default: empty string
- Example in Canvas View: <date>
- time_format
- Syntax: string
- Description: Specify a strptime format string to extract the timestamp. The time_format starts reading after the time_prefix. If both are specified, the time_prefix regular expression must match up to and including the character before the date that time_format parses. You can use this optional argument in the advanced extraction type. The following strptime variables are not supported: %c, %+, %Ez, %X, %x, %w, %s. See the Enhanced strptime() support section in the Splunk Enterprise documentation for more information.
- Default: empty string
- Example in Canvas View: %Y-%m-%d
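To make the interaction between time_prefix, max_time_lookahead, and time_format more concrete, here is a minimal Python sketch of the behavior described in the argument list. It is an approximation, not the DSP implementation: advanced_extract is a hypothetical name, Python's strptime stands in for the function's own parser, the time zone is assumed to be UTC, and counting the lookahead from the end of the prefix match is an assumption.

```python
import calendar
import re
from datetime import datetime
from typing import Optional


def advanced_extract(text: str, time_format: str, time_prefix: str = "",
                     max_time_lookahead: int = 128) -> Optional[int]:
    """Return epoch milliseconds (UTC assumed), or None when nothing is extracted."""
    start = 0
    if time_prefix:
        prefix_match = re.search(time_prefix, text)
        if prefix_match is None:
            return None  # Prefix not found, so no extraction takes place.
        start = prefix_match.end()
    # Assumption: the lookahead window starts where timestamp parsing starts.
    window = text[start:start + max_time_lookahead]
    # Try the longest candidate first so that text after the date is tolerated.
    for end in range(len(window), 0, -1):
        try:
            parsed = datetime.strptime(window[:end], time_format)
        except ValueError:
            continue
        return calendar.timegm(parsed.timetuple()) * 1000 + parsed.microsecond // 1000
    return None


print(advanced_extract("checkin_date:1998-12-31", "%Y-%m-%d", r"checkin_date:"))
# 915062400000
```

With max_time_lookahead=4 the same call returns None, because only the four characters 1998 fall inside the window and they cannot satisfy the whole format.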
Usage
This section contains additional usage information about the Apply Timestamp Extraction function.
Organize the data stream based on whether a timestamp was able to be extracted
This function outputs a new top-level field called _rule containing information on whether the extraction was successful. Specifically, this field contains the rule, pattern, and configuration type used to match and extract the timestamp. If timestamp extraction is not successful, then the rule returned in the _rule field is NO_MATCH.
In order to better organize your data, you might find it helpful to create a branch in your data pipeline where one branch contains a stream of data that matched a rule and the other branch contains a stream of data that did not match a rule. To do this, branch your pipeline after the Apply Timestamp Extraction function and add a where function to each branched data stream. Configure one where function to filter for records that match a rule and the other where function to filter for records that do not match a rule. After adding those changes to a pipeline, you should end up with a pipeline that looks similar to this:
$statement_2 = | from splunk_firehose() | apply_timestamp_extraction fallback_to_auto=false extraction_type="auto";
| from $statement_2 | where "NO_MATCH"!=map_get(_rule, "_rule_id");
| from $statement_2 | where "NO_MATCH"=map_get(_rule, "_rule_id");
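The effect of the two where branches can be sketched outside of DSP in Python. This is an illustration only: records are modeled as plain dictionaries, _rule is assumed to be a map with a _rule_id key (mirroring map_get(_rule, "_rule_id")), and split_by_match is a hypothetical helper name.

```python
def split_by_match(records):
    """Partition records into (matched, unmatched) based on _rule["_rule_id"]."""
    matched, unmatched = [], []
    for record in records:
        rule_id = record.get("_rule", {}).get("_rule_id")
        # Records whose rule is NO_MATCH go to the second branch; all others go to the first.
        (unmatched if rule_id == "NO_MATCH" else matched).append(record)
    return matched, unmatched


records = [
    {"body": "Apr 15, 2010 1:51:22 AM ...", "_rule": {"_rule_id": "GROK_TIMESTAMP"}},
    {"body": "no timestamp here", "_rule": {"_rule_id": "NO_MATCH"}},
]
matched, unmatched = split_by_match(records)
print(len(matched), len(unmatched))  # 1 1
```

Every record lands in exactly one of the two lists, which is the property the pair of complementary where filters provides in the pipeline.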
Which props.conf settings can I reuse
This table lists the props.conf settings that you can reuse in the Apply Timestamp Extraction function. Any other timestamp extraction configurations are not supported in the Apply Timestamp Extraction function. DSP also uses Splunk software timestamp extraction precedence if you have multiple props.conf stanzas. See props.conf in the Admin Manual for more information on the settings themselves and for information about precedence order.
Setting | Description | Default |
---|---|---|
TIME_PREFIX = <regular expression> | If set, Splunk software scans the event text for a match for this regex before attempting to extract a timestamp. | empty string |
MAX_TIMESTAMP_LOOKAHEAD = <integer> | The number of characters into an event Splunk software should look for a timestamp. | 128 |
TIME_FORMAT = <strptime-style format> | Specifies a "strptime" format string to extract the date. | empty string |
TZ = <timezone identifier> | Specifies a timezone to use. | empty string |
MAX_DAYS_AGO = <integer> | The maximum number of days in the past, from the current date as provided by the input layer that an extracted date can be valid. | 2000 (5.48 years) |
MAX_DAYS_HENCE = <integer> | The maximum number of days in the future, from the current date as provided by the input layer that an extracted date can be valid. | 2 |
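The MAX_DAYS_AGO and MAX_DAYS_HENCE settings describe a validity window around the current date. The arithmetic can be sketched in Python as shown here. The helper name within_window, the use of epoch milliseconds, and the inclusive boundaries are assumptions for this example. This topic does not state what happens to a timestamp that falls outside the window.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def within_window(extracted_ms: int, now_ms: int,
                  max_days_ago: int = 2000, max_days_hence: int = 2) -> bool:
    """True when extracted_ms is at most max_days_ago old and at most max_days_hence ahead."""
    earliest = now_ms - max_days_ago * MS_PER_DAY
    latest = now_ms + max_days_hence * MS_PER_DAY
    return earliest <= extracted_ms <= latest


now_ms = 1591110134963  # An example "current date" in epoch milliseconds.
print(within_window(now_ms - 10 * MS_PER_DAY, now_ms))  # True
print(within_window(now_ms + 3 * MS_PER_DAY, now_ms))   # False
```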
SPL2 examples
Examples of common use cases follow. The following examples in this section assume that you are in the SPL View.
1. Extract a timestamp from the body field using built-in rules into the timestamp field
Suppose the body field contains a timestamp that you want to use as the record's timestamp. This example uses extraction_type="auto" to search the body field for a timestamp and extract the timestamp that matches either datetime.xml or DSP's built-in timestamp rules. Once a timestamp has been matched and extracted, the pattern and rule used to match are returned in the new top-level field _rule.
... | apply_timestamp_extraction extraction_type="auto" | ...;
Incoming record:
Record { body = "Apr 15, 2010 1:51:22 AM org.apache.catalina.loader.WebappClassLoader validateJarFile" timestamp = "1591110134963" }
Outgoing record:
Record { body = "Apr 15, 2010 1:51:22 AM org.apache.catalina.loader.WebappClassLoader validateJarFile" timestamp = "1271296282000" _rule = "{"_rule_id":"GROK_TIMESTAMP","_rule_description":"Grok based timestamp patterns","_metadata":{"_pattern":"%{CATALINA_DATESTAMP}"}}" }
2. Set the timestamp for all incoming records to current server time
... | apply_timestamp_extraction extraction_type="current" | ...;
Incoming record:
Record { body = "Apr 15, 2010 1:51:22 AM org.apache.catalina.loader.WebappClassLoader validateJarFile" timestamp = "1591110134963" }
Outgoing record:
Record { body = "Apr 15, 2010 1:51:22 AM org.apache.catalina.loader.WebappClassLoader validateJarFile" timestamp = "1591280304957" _rule = "{"_rule_id":"CURRENT_TIMESTAMP","_rule_description":"Current processing timestamp","_metadata":null}" }
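The timestamp values in these records are epoch times in milliseconds. To read them at a glance, a small helper outside of DSP can convert them to UTC date strings. The following Python sketch is an illustration only, and epoch_ms_to_iso is a hypothetical helper name.

```python
from datetime import datetime, timezone


def epoch_ms_to_iso(epoch_ms) -> str:
    """Format epoch milliseconds (int or numeric string) as an ISO 8601 UTC string."""
    seconds, millis = divmod(int(epoch_ms), 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%S") + f".{millis:03d}Z"


print(epoch_ms_to_iso("1591110134963"))  # 2020-06-02T15:02:14.963Z
print(epoch_ms_to_iso("1591280304957"))  # 2020-06-04T14:18:24.957Z
```

The first value is the incoming timestamp and the second is the timestamp assigned by the current extraction type in the records above.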
3. Extract a timestamp from the body field by specifying a time_format and a time_prefix
In this example, suppose you want to use some date that is in the event's body field rather than the existing timestamp, which is the time that the event was ingested into DSP. This example uses time_prefix to specify the text that always precedes the date, and time_format to extract the date.
... | apply_timestamp_extraction time_format="%Y-%m-%d" time_prefix=/checkin_date:/ extraction_type="advanced" | ...;
Incoming record:
Record { body = "checkin_date:1998-12-31" timestamp = "1591284656777" }
Outgoing record:
Record { body = "checkin_date:1998-12-31" timestamp = "915062400000" _rule = "{"_rule_id":"ADVANCED_TIMESTAMP","_rule_description":"Advanced time-stamp rule with user defined timestamp pattern","_metadata":{"_pattern":"%Y-%m-%d"}} " }
4. Extract a timestamp from body using a pre-existing props.conf configuration
Suppose you have an existing props.conf configuration that you'd like to migrate to DSP in order to perform timestamp extraction. In this example, we use the config extraction type to reuse and migrate an existing props.conf configuration that extracts timestamps using two different formats depending on the event's sourcetype.
... | apply_timestamp_extraction props_conf="TIME_PREFIX=a.b\nTIME_FORMAT=%y-%B-%d %T\nTZ=MST\n[props_timestamp]\nTIME_FORMAT=%A, %B %d %Y %I:%M:%S%p\nTZ=PST" fallback_to_auto=false extraction_type="config" | ...;
Incoming records:
Record { body = "a.b Hello Tuesday, April 09 2019 12:00:00AM rootfs/usr/bin/planet[3422]: ERRO \"StartProcess(&{0xc00056a2d0 0xc0003b61a8 <nil> [/bin/systemctl is-active registry.service] [{PATH /usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin} {HTTP_PROXY } {http_proxy } {HTTPS_PROXY } {https_proxy } {NO_PROXY 0.0.0.0/0,.local} {ENV /etc/container-environment} {BASH_ENV /etc/container-environment} {ETCDCTL_CERT_FILE /var/state/etcd.cert} {ETCDCTL_KEY_FILE /var/state/etcd.key} {ETCDCTL_CA_FILE /var/state/root.cert} {ETCDCTL_PEERS https://127.0.0.1:2379} {KUBECONFIG /etc/kubernetes/kubectl.kubeconfig}]}) failed with", timestamp = 11111111, sourcetype = "props_timestamp", source = "config" }
Record { body = "a.b Hello 18-December-25 15:00:00 rootfs/usr/bin/planet[3422]: ERRO "StartProcess(&{0xc00056a2d0 0xc0003b61a8 <nil> [/bin/systemctl is-active registry.service] [{PATH /usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin} {HTTP_PROXY } {http_proxy } {HTTPS_PROXY } {https_proxy } {NO_PROXY 0.0.0.0/0,.local} {ENV /etc/container-environment} {BASH_ENV /etc/container-environment} {ETCDCTL_CERT_FILE /var/state/etcd.cert} {ETCDCTL_KEY_FILE /var/state/etcd.key} {ETCDCTL_CA_FILE /var/state/root.cert} {ETCDCTL_PEERS https://127.0.0.1:2379} {KUBECONFIG /etc/kubernetes/kubectl.kubeconfig}]}) failed with", timestamp = 11111111, source = "config" }
Outgoing records:
Record { body = "a.b Hello Tuesday, April 09 2019 12:00:00AM rootfs/usr/bin/planet[3422]: ERRO \"StartProcess(&{0xc00056a2d0 0xc0003b61a8 <nil> [/bin/systemctl is-active registry.service] [{PATH /usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin} {HTTP_PROXY } {http_proxy } {HTTPS_PROXY } {https_proxy } {NO_PROXY 0.0.0.0/0,.local} {ENV /etc/container-environment} {BASH_ENV /etc/container-environment} {ETCDCTL_CERT_FILE /var/state/etcd.cert} {ETCDCTL_KEY_FILE /var/state/etcd.key} {ETCDCTL_CA_FILE /var/state/root.cert} {ETCDCTL_PEERS https://127.0.0.1:2379} {KUBECONFIG /etc/kubernetes/kubectl.kubeconfig}]}) failed with", timestamp = 1554793200000, sourcetype = "props_timestamp", source = "config" _rule = {"_rule_id":"PROPS_CONF_TIMESTAMP","_rule_description":"","_metadata":{"_pattern":"%A, %B %d %Y %I:%M:%S%p"}} }
Record { body = "a.b Hello 18-December-25 15:00:00 rootfs/usr/bin/planet[3422]: ERRO "StartProcess(&{0xc00056a2d0 0xc0003b61a8 <nil> [/bin/systemctl is-active registry.service] [{PATH /usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin} {HTTP_PROXY } {http_proxy } {HTTPS_PROXY } {https_proxy } {NO_PROXY 0.0.0.0/0,.local} {ENV /etc/container-environment} {BASH_ENV /etc/container-environment} {ETCDCTL_CERT_FILE /var/state/etcd.cert} {ETCDCTL_KEY_FILE /var/state/etcd.key} {ETCDCTL_CA_FILE /var/state/root.cert} {ETCDCTL_PEERS https://127.0.0.1:2379} {KUBECONFIG /etc/kubernetes/kubectl.kubeconfig}]}) failed with", timestamp = 1545775200000, source = "config" _rule = {"_rule_id":"PROPS_CONF_TIMESTAMP","_rule_description":"n","_metadata":{"_pattern":"%y-%B-%d %T"}} }
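The props_conf argument in this example is a single string in which settings and stanza headers are separated by newlines. As a sketch of how such a string could be assembled from an existing configuration, the following Python helper joins the pieces in order. build_props_conf is a hypothetical name. Reading the settings before the [props_timestamp] header as defaults, and the stanza as applying to that sourcetype, is inferred from the records above and is not stated in this topic.

```python
def build_props_conf(default_settings: dict, stanzas: dict) -> str:
    """Join default settings, then each [stanza] header and its settings, with newlines."""
    lines = [f"{key}={value}" for key, value in default_settings.items()]
    for stanza_name, settings in stanzas.items():
        lines.append(f"[{stanza_name}]")
        lines.extend(f"{key}={value}" for key, value in settings.items())
    return "\n".join(lines)


props_conf = build_props_conf(
    {"TIME_PREFIX": "a.b", "TIME_FORMAT": "%y-%B-%d %T", "TZ": "MST"},
    {"props_timestamp": {"TIME_FORMAT": "%A, %B %d %Y %I:%M:%S%p", "TZ": "PST"}},
)
# Show the value with newlines written as \n, as in the SPL2 example.
print(props_conf.replace("\n", "\\n"))
```

The printed text is the same value that is passed to props_conf in the SPL2 statement of this example.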
This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0, 1.2.1-patch02