Evaluation functions specific to sdselect
This topic describes evaluation functions that are specific to the sdselect
command, for use with federated searches of remote datasets. You use evaluation functions to evaluate an expression, based on your events, and return a result.
extract(<str>,<regular_expression>)
Description
This function returns the first occurrence of the pattern matched by the regular expression in the string.
Usage
The <str>
argument is the name of a string field.
The <regular_expression>
argument uses Java regular expression syntax to match and extract a pattern in the values of the field identified by <str>
.
The extract()
evaluation expression supports only one capturing group per instance of the regular expression. If you want to use the same regular expression to capture different parts of a field, you must have multiple instances of extract()
in the search and have different capturing groups in the expression in each instance.
You can use this function in the following parts of sdselect
searches:
- Fields you select with
sdselect
. - Application of aggregate statistical functions to evaluation functions.
- WHERE, GROUPBY, and ORDERBY clauses.
See sdselect command syntax details.
Examples
These examples show different ways to use the extract()
evaluation function to extract fields from events in your sdselect
datasets.
1. Extract a single value from a field
This search returns the first word in the description
field as the value of a new short_description
field.
| sdselect clientip, extract(description, "^(.+)") AS short_description FROM s3-provider
2. Extract different values from the same field
If you need to extract multiple values from the same field or string, you must use multiple instances of extract()
to do it, with a different capturing group identified in each instance, as extract()
does not support multiple capturing groups in the same regular expression.
For example, say you have a field in your data named sourceipaddress
that is composed of IP address values like trustedadvisor.buttercupgames.com
, and you want to break those addresses up into start, middle, and end terms. This is how you might use the extract()
function to do this in an an sdselect
search:
|sdselect extract(sourceipaddress, "([^.]+)") AS start_term, extract(sourceipaddress, "[^.]+\\.([^.]+)") AS middle_term, extract(sourceipaddress, "[^.]+\\.[^.]+\\.([^.]+)") AS end_term FROM s3-provider
The results of that search would look something like this:
start_term | middle_term | end_term |
---|---|---|
trustedadvisor |
buttercupgames |
com |
support |
buttercupgames |
com |
professionalservices |
buttercupgames |
com |
timestamp_from_unixtime(<time>)
Description
This function takes a numeric UNIX time format timestamp or a field with a numeric UNIX time format timestamp as its argument and converts that input into an equivalent SQL timestamp
data type value.
Usage
Use the timestamp_from_unixtime
evaluation function in the WHERE clause of an sdselect
search to directly compare timestamps in the numeric UNIX time format with SQL timestamp
data type values. This evaluation function is necessary because Splunk software cannot apply type casting to evaluate searches like this:
| sdselect accountid, region FROM my_asl_data WHERE time_dt < 1728348383
The timestamp_from_unixtime
function is especially useful for federated searches of Amazon Security Lake datasets that take advantage of partition time fields. The default partition time field for Amazon Security Lake events is time_dt
, which has SQL timestamp
values.
You can use the timestamp_from_unixtime
function only with the sdselect
command. You can apply the timestamp_from_unixtime
function to fields that you select with sdselect
, as well as to WHERE clause arguments in sdselect
searches.
You cannot apply the timestamp_from_unixtime
function to GROUPBY or ORDERBY clause arguments in sdselect
searches.
For more information about running federated searches of Amazon Security Lake datasets, see About Federated Analytics.
Using timestamp_from_unixtime for search optimization
Because timestamp_from_unixtime
does not transform SQL timestamp
data type field values, it provides better optimization for searches that involve partition fields than timestamp_to_unixtime
.
In other words, although the following two searches return the same results, the timestamp_from_unixtime
option might offer better search performance.
| sdselect accountid, region FROM my_asl_data WHERE time_dt < timestamp_from_unixtime(1728348383)
| sdselect accountid, region FROM my_asl_data WHERE 1728348383 >= timestamp_to_unixtime(time_dt)
Basic examples
When you run an sdselect
search against an Amazon Security Lake dataset, and you select a specific time window for that search, Splunk software applies timestamp_from_unixtime
to the WHERE clause in that search when it processes the search.
However, you can also run a search against an Amazon Security Lake dataset that explicitly adds filters, including the partition time field, time_dt
, to the WHERE clause, like this:
| sdselect accountid, region FROM my_ASL_data WHERE time_dt < timestamp_from_unixtime(1728348383)
Like many other evaluation functions, timestamp_from_unixtime
supports nested functions:
| sdselect accountid, region FROM my_ASL_data WHERE time_dt < timestamp_from_unixtime(relative_time(now(), "-1h@h"))
timestamp_to_unixtime(<time>)
Description
This function takes a field with a SQL timestamp
data type as its main argument and converts its value into an equivalent numeric UNIX time format timestamp.
Usage
Use the timestamp_to_unixtime
evaluation function in the WHERE clause of an sdselect
search to directly compare time fields that have SQL timestamp
values with timestamps in the numeric UNIX time format.
The timestamp_to_unixtime
function is especially useful for federated searches of Amazon Security Lake datasets that take advantage of partition time fields. The default partition time field for Amazon Security Lake events is time_dt
, which has SQL timestamp
values.
The timestamp_to_unixtime
function fails if its argument is a field with a data type other than SQL timestamp
.
Basic example
This example converts the SQL timestamp
data type values of time_dt
to numeric UNIX time format timestamp values and compares them to the provided UNIX time format timestamp.
| sdselect accountid, region FROM my_ASL_data WHERE 1728348383 >= timestamp_to_unixtime(time_dt)
See also
- Function information
- Evaluation functions quick reference
- Related functions
- now function
- relative time function
Use time fields in sdselect searches | sdselect command examples |
This documentation applies to the following versions of Splunk Cloud Platform™: 9.3.2411
Feedback submitted, thanks!