Splunk Cloud Platform

Federated Search

Evaluation functions specific to sdselect

This topic describes evaluation functions that are specific to the sdselect command, for use with federated searches of remote datasets. You use evaluation functions to evaluate an expression, based on your events, and return a result.

extract(<str>,<regular_expression>)

Description

This function returns the first occurrence of the pattern matched by the regular expression in the string.

Usage

The <str> argument is the name of a string field.

The <regular_expression> argument uses Java regular expression syntax to match and extract a pattern in the values of the field identified by <str>.

The extract() evaluation expression supports only one capturing group per instance of the regular expression. If you want to use the same regular expression to capture different parts of a field, you must have multiple instances of extract() in the search and have different capturing groups in the expression in each instance.

You can use this function in the following parts of sdselect searches:

  • Fields you select with sdselect.
  • Application of aggregate statistical functions to evaluation functions.
  • WHERE, GROUPBY, and ORDERBY clauses.

See sdselect command syntax details.

Examples

These examples show different ways to use the extract() evaluation function to extract fields from events in your sdselect datasets.

1. Extract a single value from a field

This search returns the first word in the description field as the value of a new short_description field.

| sdselect clientip, extract(description, "^(.+)") AS short_description FROM s3-provider

2. Extract different values from the same field

If you need to extract multiple values from the same field or string, you must use multiple instances of extract() to do it, with a different capturing group identified in each instance, as extract() does not support multiple capturing groups in the same regular expression.

For example, say you have a field in your data named sourceipaddress that is composed of IP address values like trustedadvisor.buttercupgames.com, and you want to break those addresses up into start, middle, and end terms. This is how you might use the extract() function to do this in an an sdselect search:

|sdselect extract(sourceipaddress, "([^.]+)") AS start_term, extract(sourceipaddress, "[^.]+\\.([^.]+)") AS middle_term, extract(sourceipaddress, "[^.]+\\.[^.]+\\.([^.]+)") AS end_term FROM s3-provider

The results of that search would look something like this:

start_term middle_term end_term
trustedadvisor
buttercupgames
com
support
buttercupgames
com
professionalservices
buttercupgames
com

timestamp_from_unixtime(<time>)

Description

This function takes a numeric UNIX time format timestamp or a field with a numeric UNIX time format timestamp as its argument and converts that input into an equivalent SQL timestamp data type value.

Usage

Use the timestamp_from_unixtime evaluation function in the WHERE clause of an sdselect search to directly compare timestamps in the numeric UNIX time format with SQL timestamp data type values. This evaluation function is necessary because Splunk software cannot apply type casting to evaluate searches like this:

| sdselect accountid, region FROM my_asl_data WHERE time_dt < 1728348383

The timestamp_from_unixtime function is especially useful for federated searches of Amazon Security Lake datasets that take advantage of partition time fields. The default partition time field for Amazon Security Lake events is time_dt, which has SQL timestamp values.

You can use the timestamp_from_unixtime function only with the sdselect command. You can apply the timestamp_from_unixtime function to fields that you select with sdselect, as well as to WHERE clause arguments in sdselect searches.

You cannot apply the timestamp_from_unixtime function to GROUPBY or ORDERBY clause arguments in sdselect searches.

For more information about running federated searches of Amazon Security Lake datasets, see About Federated Analytics.

Using timestamp_from_unixtime for search optimization

Because timestamp_from_unixtime does not transform SQL timestamp data type field values, it provides better optimization for searches that involve partition fields than timestamp_to_unixtime.

In other words, although the following two searches return the same results, the timestamp_from_unixtime option might offer better search performance.

| sdselect accountid, region FROM my_asl_data WHERE time_dt < timestamp_from_unixtime(1728348383)

| sdselect accountid, region FROM my_asl_data WHERE 1728348383 >= timestamp_to_unixtime(time_dt)

Basic examples

When you run an sdselect search against an Amazon Security Lake dataset, and you select a specific time window for that search, Splunk software applies timestamp_from_unixtime to the WHERE clause in that search when it processes the search.

However, you can also run a search against an Amazon Security Lake dataset that explicitly adds filters, including the partition time field, time_dt, to the WHERE clause, like this:

| sdselect accountid, region FROM my_ASL_data WHERE time_dt < timestamp_from_unixtime(1728348383)

Like many other evaluation functions, timestamp_from_unixtime supports nested functions:

| sdselect accountid, region FROM my_ASL_data WHERE time_dt < timestamp_from_unixtime(relative_time(now(), "-1h@h"))

timestamp_to_unixtime(<time>)

Description

This function takes a field with a SQL timestamp data type as its main argument and converts its value into an equivalent numeric UNIX time format timestamp.

Usage

Use the timestamp_to_unixtime evaluation function in the WHERE clause of an sdselect search to directly compare time fields that have SQL timestamp values with timestamps in the numeric UNIX time format.

The timestamp_to_unixtime function is especially useful for federated searches of Amazon Security Lake datasets that take advantage of partition time fields. The default partition time field for Amazon Security Lake events is time_dt, which has SQL timestamp values.

The timestamp_to_unixtime function fails if its argument is a field with a data type other than SQL timestamp.

Basic example

This example converts the SQL timestamp data type values of time_dt to numeric UNIX time format timestamp values and compares them to the provided UNIX time format timestamp.

| sdselect accountid, region FROM my_ASL_data WHERE 1728348383 >= timestamp_to_unixtime(time_dt)

See also

Function information
Evaluation functions quick reference
Related functions
now function
relative time function
sdselect command
sdselect command overview
sdselect command syntax details
sdselect command usage
sdselect command WHERE clause operations
Use time fields in sdselect searches
sdselect command examples for Amazon S3
Last modified on 18 February, 2025
Use time fields in sdselect searches   sdselect command examples

This documentation applies to the following versions of Splunk Cloud Platform: 9.3.2411


Please expect delayed responses to documentation feedback while the team migrates content to a new system. We value your input and thank you for your patience as we work to provide you with an improved content experience!

Was this topic useful?







You must be logged into splunk.com in order to post comments. Log in now.

Please try to keep this discussion focused on the content covered in this documentation topic. If you have a more general question about Splunk functionality or are experiencing a difficulty with Splunk, consider posting a question to Splunkbase Answers.

0 out of 1000 Characters