Splunk® Enterprise

Search Reference

union

Description

Merges the results from two or more datasets into one dataset. One of the datasets can be a result set that is then piped into the union command and merged with a second dataset.

The union command appends or merges events from the specified datasets, depending on whether the dataset is streaming or non-streaming and where the command is run. The union command runs on indexers in parallel where possible, and automatically interleaves results based on the _time field when processing events. See Usage.

If you are familiar with SQL but new to SPL, see Splunk SPL for SQL users.

Syntax

union [<subsearch-options>] <dataset> [<dataset>...]

Required arguments

dataset
Syntax: <dataset-type>:<dataset-name> | <subsearch>
Description: The dataset that you want to perform the union on. The dataset can be either a named or unnamed dataset.
  • A named dataset consists of <dataset-type>:<dataset-name>. For <dataset-type> you can specify a data model, a saved search, or an inputlookup. For example, datamodel:"internal_server.splunkdaccess".
  • A subsearch is an unnamed dataset.


When specifying more than one dataset, use a space or a comma separator between the dataset names.

Optional arguments

subsearch-options
Syntax: maxtime=<int> maxout=<int> timeout=<int>
Description: You can specify one set of subsearch-options that applies to all of the subsearches. The set can include one or more of the following options. These options apply only when the subsearch is treated as a non-streaming search.
  • The maxtime argument specifies the maximum number of seconds to run the subsearch before finalizing. The default is 60 seconds.
  • The maxout argument specifies the maximum number of results to return from the subsearch. The default is 50000 results. This value is the maxresultrows setting in the [searchresults] stanza in the limits.conf file.
  • The timeout argument specifies the maximum amount of time, in seconds, to cache the subsearch results. The default is 300 seconds.
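
As a rough illustration of how the three options combine on the command line, the following Python sketch renders a union command string. The helper build_union and the DEFAULTS table are hypothetical names introduced only for this example and are not part of Splunk software. The default values are the ones listed above, and the options are emitted in the order shown in the Syntax line.

```python
# Hypothetical helper for illustration only; not a Splunk API.
# Default values are taken from the option descriptions above.
DEFAULTS = {"maxtime": 60, "maxout": 50000, "timeout": 300}


def build_union(datasets, **options):
    """Render a union command string from datasets and subsearch options."""
    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unsupported subsearch option(s): {sorted(unknown)}")
    # Emit only the options that differ from the documented defaults.
    opts = [
        f"{name}={options[name]}"
        for name in DEFAULTS
        if name in options and options[name] != DEFAULTS[name]
    ]
    # Datasets are separated by spaces, one of the two separators allowed.
    return " ".join(["union", *opts, *datasets])


query = build_union(
    ["[search error | chart count by category2]"],
    maxout=20000,
    maxtime=120,
    timeout=600,
)
print(query)
```

The sketch omits any option that is left at its default, so the rendered string lists only the settings that change the subsearch behavior.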

Usage

The union command is a generating command.

Optimized syntax for streaming datasets

With streaming datasets, instead of this syntax:

<streaming_dataset1> | union <streaming_dataset2>

Your search is more efficient with this syntax:

... | union <streaming_dataset1>, <streaming_dataset2>

When the <streaming_dataset1> is placed before the union command, the search is processed as an append.
When the <streaming_dataset1> is placed after the union command, the search is processed as a multisearch, which is more efficient.

Where the command is run

Whether the datasets are streaming or non-streaming determines if the union command is run on the indexers or the search head. The following table specifies where the command is run.

Dataset type                  Dataset 1 is streaming    Dataset 1 is non-streaming
Dataset 2 is streaming        Indexers                  Search head
Dataset 2 is non-streaming    Search head               Search head
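
The table reduces to a single rule: the command runs on the indexers only when both datasets are streaming. The following Python sketch restates that rule. The function name union_run_location is hypothetical and exists only for this illustration. It models the two-dataset case that the table covers and makes no claim about how the search scheduler is implemented.

```python
def union_run_location(dataset1_streaming: bool, dataset2_streaming: bool) -> str:
    """Return where union runs for two datasets, following the table above."""
    if dataset1_streaming and dataset2_streaming:
        return "indexers"
    # Any non-streaming dataset moves the work to the search head.
    return "search head"


# Print all four combinations from the table.
for d1 in (True, False):
    for d2 in (True, False):
        print(d1, d2, union_run_location(d1, d2))
```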

Interleaving results

When two datasets are retrieved from disk in time descending order, which is the default sort order, the union command interleaves the results. The interleave is based on the _time field. For example, you have the following datasets:

dataset_A

_time host bytes
4 mailsrv1 2412
1 dns15 231

dataset_B

_time host bytes
3 router1 23
2 dns12 220

Both datasets are in descending order by _time. When | union dataset_A, dataset_B is run, the following dataset is the result.

_time host bytes
4 mailsrv1 2412
3 router1 23
2 dns12 220
1 dns15 231
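
The interleaving behaves like a merge of two lists that are already sorted in descending order. The following Python sketch reproduces the ordering from the example using only the _time and host columns. It is an illustrative model, not the implementation used by the union command, and the function name interleave_by_time is hypothetical.

```python
import heapq

# Each row is (_time, host). Both lists are already in descending _time order.
dataset_a = [(4, "mailsrv1"), (1, "dns15")]
dataset_b = [(3, "router1"), (2, "dns12")]


def interleave_by_time(*datasets):
    """Merge descending-sorted datasets into one descending-sorted list."""
    # heapq.merge assumes each input is already sorted in the requested order.
    return list(heapq.merge(*datasets, key=lambda row: row[0], reverse=True))


merged = interleave_by_time(dataset_a, dataset_b)
for row in merged:
    print(*row)
```

Because the merge relies on each input already being sorted, it only needs to compare the next row of each dataset, which is why interleaving applies when the datasets are retrieved in time descending order.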

Examples

1. Union events from two subsearches

The following example merges events from index a and index b. New fields type and mytype are added in each subsearch using the eval command.

| union [search index=a | eval type = "foo"] [search index=b | eval mytype = "bar"]

2. Union the results of a subsearch to the results of the main search

The following example appends the tabular results of errors from the subsearch to the current results of the main search.

... | chart count by category1 | union [search error | chart count by category2]

3. Union events from a data model and events from an index

The following example unions a built-in data model that is an internal server log for REST API calls and the events from index a.

... | union datamodel:"internal_server.splunkdaccess" [search index=a]

4. Specify the subsearch options

The following example sets a maximum of 20,000 results to return from the subsearch. The example limits the duration of the subsearch to 120 seconds. The example also sets a maximum time of 600 seconds (10 minutes) to cache the subsearch results.

... | chart count by category1 | union maxout=20000 maxtime=120 timeout=600 [search error | chart count by category2]

This documentation applies to the following versions of Splunk® Enterprise: 6.6.0, 6.6.1, 6.6.2, 6.6.3
