Splunk® Cloud Services

SPL2 Search Reference

union command usage

The union command is a generating command. Generating commands fetch information from the datasets, without any transformations.

You can use the union command at the beginning of your search to combine two datasets, or later in your search to combine the incoming search results with a dataset.

Specifying a dataset

You can declare, or specify, a dataset in several different ways. Here are some examples:

Type of declaration: Dataset references
Description: Specifying an existing dataset that is defined in the Metadata Catalog. The datasets in this example are indexes.
Example:

... | union main, customers, purchases

Type of declaration: Transient
Description: Specifying an SPL subsearch as the dataset. Subsearches are enclosed in square brackets.
Example:

... | union [search main | stats count() by host], [from customers | stats count() by host]

Type of declaration: Fluent
Description: The search results that are piped into the union command are referred to as a fluent dataset. This type of declaration has a union command that contains one or more subsearches.
Example:

... <some search criteria> | union [<subsearch1>], [<subsearch2>]

Type of declaration: Literal
Description: Using literal values that you type in as subsearches. Each subsearch is a dataset. This example shows three separate literal dataset declarations.
Example:

from [{state:"Washington", population:39557045}] | union [{state:"California", population:753591}, {state:"Oregon", population:4190713}]

Type of declaration: Mixed
Description: Specifying a mixture of the types of declarations. This example begins with a fluent dataset, contains a dataset reference <ds1>, includes a subsearch composed of SPL syntax <subsearch1>, and ends with a subsearch composed of literal values.
Example:

... | union <ds1>, [ <subsearch1> ], [ { "state": "Washington", "population": 39557045 } ]


Semantics

If all of the datasets that are unioned together are streamable time-series, the union command attempts to interleave the data from all datasets into one globally sorted list of events or metrics. The list is based on the _time field in descending order. Otherwise, the union command returns all the rows from the first dataset, followed by all the rows from the second dataset, and so on.
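The ordering rule above can be modeled outside of Splunk. The following Python sketch is only an illustration of the two outcomes, not how the union command is implemented. The function name union_rows and the flag all_streamable_time_series are hypothetical, and each input list is assumed to be already sorted by _time in descending order.

```python
import heapq
from itertools import chain


def union_rows(datasets, all_streamable_time_series):
    """Model of the ordering rule only; not Splunk's implementation."""
    if all_streamable_time_series:
        # Interleave inputs that are each sorted by _time, newest first.
        return list(
            heapq.merge(*datasets, key=lambda row: row["_time"], reverse=True)
        )
    # Otherwise: all rows of the first dataset, then the second, and so on.
    return list(chain.from_iterable(datasets))


# Hypothetical rows, used only to show the two orderings.
first = [{"_time": 9, "src": "a"}, {"_time": 2, "src": "a"}]
second = [{"_time": 5, "src": "b"}]

interleaved = union_rows([first, second], all_streamable_time_series=True)
appended = union_rows([first, second], all_streamable_time_series=False)

print([row["_time"] for row in interleaved])
print([row["_time"] for row in appended])
```

In this model the interleaved result is ordered by _time across both inputs, while the appended result keeps each dataset's rows together in the order the datasets were listed.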

Interleaving results

When two datasets are retrieved from disk in time descending order, which is the default sort order, the union command interleaves the results. The interleave is based on the _time field. For example, suppose you have the following datasets:

dataset_A

_time Host Bytes
4 mailsrv1 2412
1 dns15 231

dataset_B

_time Host Bytes
3 router1 23
2 dns12 220

Both datasets are in descending order by _time. Running | union dataset_A, dataset_B produces the following result:

_time Host Bytes
4 mailsrv1 2412
3 router1 23
2 dns12 220
1 dns15 231
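To make the interleaving step concrete, here is a minimal Python sketch that walks the two example datasets with a two-pointer merge. It is a model of the behavior described above, not Splunk's code. The function name interleave_desc is hypothetical, only the _time and Host columns are reproduced, and the choice to take the row from the first dataset when two _time values are equal is an assumption that the source does not specify.

```python
def interleave_desc(left, right):
    """Merge two lists that are each sorted by _time, newest first."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        # Assumption: on equal _time values, the row from `left` comes first.
        if left[i]["_time"] >= right[j]["_time"]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    # One input is exhausted; the rest of the other is already in order.
    out.extend(left[i:])
    out.extend(right[j:])
    return out


dataset_a = [{"_time": 4, "Host": "mailsrv1"}, {"_time": 1, "Host": "dns15"}]
dataset_b = [{"_time": 3, "Host": "router1"}, {"_time": 2, "Host": "dns12"}]

result = interleave_desc(dataset_a, dataset_b)
for row in result:
    print(row["_time"], row["Host"])
```

The merge never re-sorts the inputs. It relies on each dataset already being in descending _time order, which is why the interleave applies only when the datasets are retrieved in that order.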

See also

union command
union command overview
union command syntax details
union command examples
Last modified on 18 March, 2024

This documentation applies to the following versions of Splunk® Cloud Services: current

