sdselect command syntax details
Syntax
The required syntax is in bold.
- | sdselect
- [reuse_search_results=<bool>]
- ( <field-list> | <stats-func> | <eval-func>)...
- <from-clause>
- [WHERE <eval-expression>]
- [GROUPBY ((<field-list> | <eval-func>)... [span=[<unsigned_int>]<timescale>])]
- [ORDERBY (<field-list> | <eval-func>)...]
- [LIMIT <unsigned_int>]
Required arguments
- <field-list>
- Syntax: <field>, …
- Description: List 1 or more fields to group results.
- Do not place quotation marks around nested fields. Place single quotation marks around the names of flattened fields that begin with numeric characters or contain special characters or spaces. See Special handling for sdselect syntax elements.
- <stats-func>
- Syntax: (count | count(<field>) | <function>(<field>))... [AS <string>]
- Description: Run a count of all events, a basic count of a field, or a statistical function on a field. You have the option of renaming the results with the AS keyword.
- The following table lists the supported functions for
sdselect
by type of function. Use the links in the table to see descriptions and examples for each function. For an overview about using functions with commands, see Statistical and charting functions in the Splunk Cloud Platform Search Reference.
Type of function Supported functions and syntax Aggregate functions avg()
count()
distinct_count()
,dc()
estdc()
max()
median()
min()
mode()
perc<int>()
range()
stdev()
stdevp()
sum()
var()
varp()
Multivalue stats and chart functions values()
Time functions earliest()
earliest_time()
latest()
latest_time()
- The
sdselect
command does not support usage ofc()
, the abbreviated form ofcount()
.
- The
sdselect
command does not support usage of thedistinct_count()
aggregate function or thevalues()
multivalue function in conjunction with any of the time functions.
- When you apply a field with string values to a statistical function that expects fields with numeric values,
sdselect
converts the string field values into numeric field values. For example, say you have a search that starts like this:| sdselect avg(bytes) FROM...
Ifbytes
is a field with string values,sdselect
converts those string values into numeric values before it applies the values to theavg()
function.
- If you use either the
earliest_time()
function or thelatest_time()
function in a search, you must apply all functions in the search to the same field.
- For example, the following search is valid. It uses the
earliest_time()
function and applies the same field,end_time
, to both functions in the search. | sdselect avg(end_time), earliest_time(end_time) from my_csv_data
- The following search is invalid. It uses the
earliest_time()
function, but it applies different fields,end_time
andnap_time
, to the functions in the search. | sdselect avg(end_time), earliest_time(nap_time) from my_csv_data
- <eval-func>
- Syntax: (json_extract(<json>, <path>) | json_extract_exact(<json>, <key>))... [AS <string>]
- Description: Use the
json_extract()
orjson_extract_exact()
evaluation functions to extract and return values from JSON material in your events. Optionally rename those results. See sdselect command usage for detailed information about how application ofjson_extract()
andjson_extract_exact()
is limited insdselect
searches.
- No other evaluation functions are supported for projection in
sdselect
searches.
- <from-clause>
- Syntax: FROM ( federated:<federated_index_name> | <federated_index_name>)
- Description: Use the FROM clause to specify the federated index that maps to an AWS Glue Data Catalog table dataset that you want to search. You can optionally prefix federated index names with
federated:
.
- In Splunk Web, to see a list of federated indexes that you have defined for your deployment, navigate to Settings and then Federated Search and select the Federated Indexes tab.
- For more information, see Map a federated index to an AWS Glue Data Catalog dataset.
Optional arguments
- reuse_search_results
- Syntax: reuse_search_results=<bool>
- Description: Specifies whether an
sdselect
search can reuse the result set from the last successful run of the samesdselect
search, as long as that previous search run took place within the previous 24 hours.sdselect
searches that reuse search results can benefit from improved search performance and a reduction in data scan unit consumption. Thereuse_search_results
argument defaults totrue
, which means thatsdselect
searches reuse results whenever it is possible for them to do so.
- Reuse of
sdselect
search results is useful forsdselect
searches that do not return different result sets within a given time frame. For example,sdselect
searches with absolute or fixed date-to-date time ranges are good candidates for search result reuse. If you run such searches with a frequency below 24 hours, you might see an increase in their performance and a reduction in their data scan consumption.
- When you run an
sdselect
search that reuses search results, an informational message appears in the Job Inspector that states that the results for the last successful run of the search have been reused for the current search job. The only exception to this rule aresdselect
searches with an all-time time range. When you run ansdselect
with an all-time time range, Splunk software displays a warning message under the search bar stating that the results from the last successful run of the all-time search have been reused.
- Turn search result reuse off for an
sdselect
search by addingreuse_search_results=false
to the search string. You might turn search result reuse off forsdselect
searches that return different result sets each time they run, when you run them on a frequent basis. For example, you might turn off search result reuse forsdselect
searches with relative time ranges, such as the last week to date, or the hour before the current hour, when you run those searches more than once in a 24 hour period.
- You do not need to turn off search result reuse for an
sdselect
search with a relative time range when more than 24 hours elapses between runs of the search.
Search result reuse is available only for Federated Search for Amazon S3
sdselect
searches. Federated Analytics does not support thereuse_search_results
argument insdselect
searches of remote Amazon Security Lake datasets.- Default: true
WHERE clause arguments
Use the WHERE clause to filter results. The WHERE clause is optional.
The sdselect
command uses an application of the WHERE clause that is similar to that of the where
command. See where in the Splunk Cloud Platform Search Reference.
For additional information about using the WHERE clause, see sdselect command WHERE clause operations.
- <where-clause>
- Syntax: WHERE <eval-expression>
- Description: Use the WHERE clause to filter the search results.
- The WHERE clause must precede the GROUPBY, ORDERBY, and LIMIT clauses if you use those clauses in your
sdselect
search.
- <eval-expression>
- Syntax: <eval-expression>
- Description: A combination of values, variables, operators, and functions that represent the value of your destination field. See sdselect command usage.
- If you use the WHERE clause you must include an
<eval-expression>
. This<eval-expression>
must be a Boolean expression, where the expression returns eithertrue
orfalse
. The WHERE clause returns only the results for which the<eval-expression>
returnstrue
.
- The
<eval-expression>
is case-sensitive with regard to field values. The search head checks the syntax of theeval-expression
before it runs the search and returns an error message if the expression is invalid.
- The following table lists the supported evaluation functions for
sdselect
by type of function. Use the links in the table to see descriptions and examples of each type of function.<eval-expression>
supports the following evaluation functions:
Type of function Supported functions and syntax Comparison and conditional functions like(<str>,<pattern>)
Conversion functions tonumber(<str>)
tostring(<value>)
Date and time functions now()
relative_time(<time>,<specifier>)
strftime(<time>,<format>)
strptime(<str>,<format>)
JSON functions json_extract(<json>,<path>)
json_extract_exact(<json>,<string>)
Text functions lower(<str>)
upper(<str>)
- See Evaluation functions in the Splunk Cloud Platform Search Reference. The
<eval-expression>
additionally supports Boolean, mathematical, and comparison operators.
- The
sdselect
command supports only the required arguments for thetonumber()
andtostring()
functions. For thetonumber()
function,sdselect
does not allow the<base>
argument. Thesdselect
command processes alltonumber()
string-to-number conversions in default base 10.
- For the
tostring()
function, thesdselect
command does not support the<format>
argument. Insdselect
searches, you can usetostring()
only to facilitate straightforward number-to-string conversions.
- There are some restrictions to the usage of date and time evaluation functions in the
sdselect
WHERE clause. See Apply date and time evaluation functions to fields in the WHERE clause.
- There are restrictions to the usage of the
json_extract()
andjson_extract_exact()
functions insdselect
operations. For more information, see sdselect command usage.
- Nested fields, certain kinds of field names, and literal strings require special handling in the
<eval-expression>
. See Special handling for sdselect syntax elements.
The
<eval-expression>
does not support usage of plus ( + ) and dot ( . ) characters to concatenate field names. You can use plus and dot characters to concatenate literal strings that are enclosed within double quotes.
GROUPBY clause arguments
You can use the GROUPBY clause to organize search results by field values or time spans. The GROUPBY clause is optional.
- <groupby-clause>
- Syntax: GROUPBY ((<field-list> | <eval-func>)... [span=[<unsigned_int>]<timescale>])
- Description: Group search results together according to field values and time ranges.
- If you use GROUPBY you must specify a
field-list
oreval-func
.
- You can optionally specify a
span
for a GROUPBY clause. If you specify aspan
, the<field-list>
must include the name of the Unix time field for the federated index invoked in thesdselect
search. See Map a federated index to an AWS Glue Data Catalog table dataset for more information about the Unix time field.
- The GROUPBY clause must follow the WHERE clause if you are using both clauses in conjunction with
sdselect
.
- The GROUPBY clause must precede the ORDERBY and LIMIT clauses, if you use either of those clauses.
- If you use the GROUPBY clause in a
sdselect
search without an ORDERBY clause,sdselect
sorts the search results by the fields according to the following sequence: - By the order that you have listed the fields in the GROUPBY clause.
- By the values of the fields in the GROUPBY clause in ascending alphanumeric order.
- For information about how ORDERBY clause sort operations interact with the GROUPBY clause, see GROUPBY and ORDERBY event sort interoperation.
- <field-list>
- Syntax: <field>, ...
- Description: Specify 1 or more fields by which to group results. If you specify a time field in the
<field-list>
, you must also specify a<span>
argument. Separate each field name in the field list with a comma.
- Nested fields, certain kinds of other field names, and literal strings require special handling in the
GROUPBY
clause. See Special handling for sdselect syntax elements.
- <eval-func>
- Syntax: (json_extract(<json>, <path>) | json_extract_exact(<json>, <key>)), ...
- Description: Use the
json_extract()
orjson_extract_exact()
evaluation functions to group by values extracted from JSON material in your events.
- See sdselect command usage for detailed information about how application of
json_extract()
andjson_extract_exact()
is limited insdselect
searches.
- No other evaluation functions are supported for projection in
sdselect
searches.
- <span>
- Syntax: span=[<unsigned_int>]<timescale>
- Description: The
<span>
of each time bin. If you use the GROUPBY clause to group by a time field, use the<span>
argument to group the time buckets. You can specify time spans such asGROUPBY <Unix time field> span=1h
orGROUPBY <Unix time field> span=5d
.
The
sdselect
command does not supportauto
as a value for thespan
argument.
- <timescale>
- Syntax: <sec> | <min> | <hr> | <day> | <month> | <year>
- Description: Time scale units. The
sdselect
command does not support subseconds. - Default: 1 second
- The following table describes the different kinds of time scale units that
span
supports and the valid values for each type of time scale unit.
- The following table describes the different kinds of time scale units that
Time scale Syntax Description <sec> s | sec | secs | second | seconds Time scale in seconds. <min> m | min | mins | minute | minutes Time scale in minutes. <hr> h | hr | hrs | hour | hours Time scale in hours. <day> d | day | days Time scale in days. <month> mon | month | months Time scale in months. <year> y | yr | year | years Time scale in years.
ORDERBY clause arguments
You can use the ORDERBY clause to sort the search results. The ORDERBY clause is optional.
- <orderby-clause>
- Syntax: ORDERBY (<field-list> | <eval-func>)...
- Description: Sort search results by the values of the field or fields specified for the clause.
- The ORDERBY clause must follow the WHERE and GROUPBY clauses if you use either clause in your
sdselect
search. - The ORDERBY clause must precede the LIMIT clause.
- <field-list>
- Syntax: <field> [ASC | DESC], ...
- Description: You must specify at least 1
<field>
for an ORDERBY clause. You can optionally use the ASC or DESC modifiers to indicate whethersdselect
sorts events by thefield
values in ascending or descending order. If you do not specify ASC or DESC for a field, by defaultsdselect
sorts the field in ascending order.
- If you specify multiple fields for an ORDERBY clause, separate the fields and their ASC or DESC modifiers by commas. When you specify multiple ORDERBY fields or evaluation functions,
sdselect
sorts search results by the fields and functions in the order that you list them.
- For example, say your search has the following ORDERBY clause:
ORDERBY id name city
. In this case, the ORDERBY clause sorts the search results first byid
. Then, the clause sorts rows that have matchingid
values byname
. Finally, the clause sorts rows with matchingid
andname
values bycity
.
- Nested fields, certain kinds of field names, and literal strings require special handling in the ORDERBY clause. See Special handling for sdselect syntax elements.
- If your search includes a GROUPBY clause, the fields you specify in the ORDERBY clause must be aggregated fields, or fields that appear in the GROUPBY clause. If your search does not include a GROUPBY clause, for the ORDERBY clause you can specify any field that exists in the dataset you are searching.
- If you want to use a renamed aggregated field, the ORDERBY clause must refer to the field by its rename. The following ORDERBY clause example renames an aggregated field from
scan_count
toscan_avg
:
| sdselect avg(scan_count) AS scan_avg FROM my_csv_data GROUPBY sid,action ORDERBY scan_avg DESC
- <eval-func>
- Syntax: (json_extract(<json>, <path>) | json_extract_exact(<json>, <key>)) [ASC | DESC], ...
- Description: Use the
json_extract()
andjson_extract_exact()
functions to sort by values extracted from JSON material in your events.
- You can optionally use the ASC or DESC modifiers to indicate whether
sdselect
sorts the extracted values in ascending or descending order. If you do not specify ASC or DESC for an evaluation function, by defaultsdselect
sorts the extracted values in ascending order.
- If you specify multiple evaluation functions for an ORDERBY clause, separate the evaluation functions and their ASC or DESC modifiers by commas. When you specify multiple ORDERBY fields or evaluation functions,
sdselect
sorts search results by the fields and evaluation functions in the order that you list them.
- For more information about how ORDERBY clause sort operations interact with the GROUPBY clause, see GROUPBY and ORDERBY event sort interoperation.
- See sdselect command usage for detailed information about how application of
json_extract()
andjson_extract_exact()
is limited insdselect
searches.
- No other evaluation functions are supported for GROUPBY in
sdselect
searches.
LIMIT clause arguments
You can use the LIMIT clause to specify the maximum number of search results to return. The LIMIT clause is optional.
- <limit-clause>
- Syntax: LIMIT <unsigned_int>
- Description: Set the maximum number of search results that an
sdselect
search can return. - Default: 100,000 results
- You must place the LIMIT clause after the WHERE, GROUPBY, and ORDERBY clauses, if you use any of them in the search.
- For information about changing the default number of results returned by an
sdselect
search without a LIMIT clause, see Control the maximum number of returned results if a LIMIT clause is not present.
See also
sdselect command overview | sdselect command usage |
This documentation applies to the following versions of Splunk Cloud Platform™: 9.2.2406
Feedback submitted, thanks!