union command: Overview, syntax, and usage
The SPL2 union
command merges the results from two or more datasets into one larger dataset. One of the datasets can be the incoming search results that are then piped into the union
command and merged with a second dataset.
Use these links to quickly navigate to the main sections in this topic:
How the SPL2 union command works
You can use the SPL2 union
command to merge datasets that have identical or different field names. The following sections illustrate how to use the union
command in both of these situations.
Datasets with identical field names
Consider the following two datasets:
products-amer
productID | product_name | supplierID | supplier_name | categoryID |
---|---|---|---|---|
BS-AG-G09 | Benign Space Debris | A51G-USA | Area 51 Games | ARCADE |
SF-BVS-G01 | Grand Theft Scooter | IP-PAN | Isthmus Pastimes | ARCADE |
products-apac
productID | product_name | supplierID | supplier_name | categoryID |
---|---|---|---|---|
DC-SG-G02 | Dream Crusher | PMG-KOR | Play More Games | STRATEGY |
PZ-SG-G05 | Puppies vs. Zombies | TF-JAP | Tiger Fun | STRATEGY |
SC-MG-G10 | SIM Cubicle | PMG-KOR | Play More Games | SIMULATION |
You can use the SPL2 union
command to bring these dataset together. For example:
$products = union products-amer, products-apac
The results look something like this:
productID | product_name | supplierID | supplier_name | categoryID |
---|---|---|---|---|
BS-AG-G09 | Benign Space Debris | A51G-USA | Area 51 Games | ARCADE |
DC-SG-G02 | Dream Crusher | PMG-KOR | Play More Games | STRATEGY |
PZ-SG-G05 | Puppies vs. Zombies | TF-JAP | Tiger Fun | STRATEGY |
SC-MG-G10 | SIM Cubicle | PMG-KOR | Play More Games | SIMULATION |
SF-BVS-G01 | Grand Theft Scooter | IP-PAN | Isthmus Pastimes | ARCADE |
Datasets with different field names
Consider the following events from two datasets:
products-apac
productID | product_name | supplierID | supplier_name | categoryID |
---|---|---|---|---|
DC-SG-G02 | Dream Crusher | PMG-KOR | Play More Games | STRATEGY |
suppliers_apac
supplierId | supplier_name | contact_name | address | |
---|---|---|---|---|
PMG-KOR | Play More Games | Vanya Patel | vanya@sample.com | 234 Sejong-daero ... Seoul South Korea |
Notice that both events have a field called supplier_name
and fields for the supplier ID, but with different capitalization: supplierID
and supplierId
.
You can use the union
command to bring these dataset together. For example:
$products = union products-apac, suppliers_apac
When the datasets are unioned, the fields from both datasets added to the output. The NULL value is added to fields that were not in the original event.
The results look something like this:
address | categoryID | contact_name | productID | product_name | supplierID | supplierId | supplier_name | |
---|---|---|---|---|---|---|---|---|
NULL | STRATEGY | NULL | NULL | DC-SG-G02 | Dream Crusher | PMG-KOR | NULL | Play More Games |
234 Sejong-daero ... Seoul South Korea | NULL | Vanya Patel | vanya@sample.com | NULL | NULL | NULL | PMG-KOR | Play More Games |
Both events have a field called supplier_name
that appears in the output with the same value. However, because the supplier ID fields have different capitalization, both fields appear in the output, even though the fields have the same value.
Syntax
The required syntax is in bold.
- union
- <dataset> ["," <dataset>...]
Required arguments
- dataset
- Syntax: [ <dataset-kind>"."]<dataset-name>
- Description: The dataset that you want to perform the union on. Because dataset names must be unique, you only need to specify the dataset kind for built-in datasets that include the kind. The dataset can be the incoming set of search results, a dataset that has been defined in the Metadata Catalog, or a literal dataset that you type in. To perform a union with the incoming search results, you only need to specify one dataset. See the Usage section.
Usage
The union
command is a generating command. Generating commands fetch information from the datasets, without any transformations.
You can use the union
command at the beginning of your search to combine two datasets or later in your search where you can combine the incoming search results with a dataset.
Specifying a dataset
You can declare, or specify, a dataset several different ways. Here are some examples:
Type of declaration | Description | Example |
---|---|---|
Dataset references | Specifying an existing dataset. The datasets in this example are indexes. |
|
Transient | Specifying a SPL subsearch as the dataset. Subsearches are enclosed in square brackets. |
|
Fluent | The search results that are piped into the union command are referred to as a fluent dataset. This type of declaration has a union command that contains one or more subsearches.
|
|
Literal | Using literal values that you type in as subsearches. Each subsearch is a dataset. This example shows three separate literal dataset declarations. |
|
Mixed | Specifying a mixture of the types of declarations.
|
|
Semantics
If all of the datasets that are unioned together are streamable time-series, the union
command attempts to interleave the data from all datasets into one globally sorted list of events or metrics. The list is based on the _time
field in descending order. Otherwise, the union
command returns all the rows from the first dataset, followed by all the rows from the second dataset, and so on.
Interleaving results
When two datasets are retrieved from disk in time descending order, which is the default sort order, the union
command interleaves the results. The interleave is based on the _time
field. For example, suppose you have the following datasets:
dataset_A
_time | Host | Bytes |
---|---|---|
4 | mailsrv1 | 2412 |
1 | dns15 | 231 |
dataset_B
_time | Host | Bytes |
---|---|---|
3 | router1 | 23 |
2 | dns12 | 22o |
Both datasets are descending order by _time
. When | union dataset_A, dataset_B
is run, the following dataset is the result.
_time | Host | Bytes |
---|---|---|
4 | mailsrv1 | 2412 |
3 | router1 | 23 |
2 | dns12 | 22o |
1 | dns15 | 231 |
See also
- union command
- union command: Examples
timewrap command: Examples | union command: Examples |
This documentation applies to the following versions of Splunk® Cloud Services: current
Feedback submitted, thanks!