from command: Examples

The following are examples for using the SPL2 from command. To learn more about the from command, see How the SPL2 from command works.

You can specify the clauses in the from command in uppercase or lowercase. These examples use uppercase for readability.

Some of these examples start with the SELECT clause and others start with the FROM clause. Both of these clauses are valid syntax for the from command when it is used in a search.

1. Specify string values in quotations

The following search shows that string values in field-value pairs must be enclosed in double quotation marks.

FROM my_index sourcetype="syslog" ...

Because string values must be enclosed in double quotation marks, you can reverse the order of field-value pairs. For example, the previous search can also be specified this way:

FROM my_index "syslog"=sourcetype ...

2. Specify a JSON array as a dataset literal

The following search uses a JSON array of nested objects as the source dataset:

$city_bridges = from [{"name": "London","Bridges": [{ "name": "Tower Bridge", "length": 801 }, { "name": "Millennium Bridge", "length": 1066 }] }, {"name": "Venice", "Bridges": [{ "name": "Rialto Bridge", "length": 157 }, { "name": "Bridge of Sighs", "length": 36 }, { "name": "Ponte della Paglia" }] }, {"name": "San Francisco","Bridges": [{ "name": "Golden Gate Bridge", "length": 8981 }, { "name": "Bay Bridge", "length": 23556 }] } ]

The results look like this:

name	Bridges
London	[{ "name": "Tower Bridge", "length": 801 }, { "name": "Millennium Bridge", "length": 1066 }]
Venice	[{ "name": "Rialto Bridge", "length": 157 }, { "name": "Bridge of Sighs", "length": 36 }, { "name": "Ponte della Paglia" }]
San Francisco	[{ "name": "Golden Gate Bridge", "length": 8981 }, { "name": "Bay Bridge", "length": 23556 }]

For a list of sample arrays that you can use, see Sample dataset literals in the SPL2 Search Manual.

3. Search a metric index

The following search looks for data in the _metrics index:

SELECT earliest_time(_value), metric_name FROM _metrics WHERE like(metric_name, "deploy%") GROUP BY metric_name

To use a wildcard in the WHERE clause, you cannot use the asterisk ( * ) wildcard character. You must use the like function. See Comparison and Conditional functions.

4. Pipeline examples

These examples show how to use the from command in a pipeline.

Select data to be processed in a pipeline

The following pipeline selects a subset of the data received by the Edge Processor or Ingest Processor and then sends it to a destination, such as an index or Amazon S3 bucket. The $source parameter refers to an internal dataset in the Edge Processor or Ingest Processor.

$pipeline = | from $source | into $destination

Mask sensitive information using a pipeline

The following pipeline selects a subset of the data received by the Edge Processor or Ingest Processor, replaces the credit card numbers in the _raw field with the word "<redacted>", and then sends the masked data to a destination. The $source parameter refers to an internal dataset in the Edge Processor or Ingest Processor.

$pipeline = | from $source 
| eval _raw=replace(_raw, /[1-5][0-9]{15}/i, "<redacted>") 
| into $destination

Filter extracted data in pipelines

The following example filters a set of audit logs so that only failed login attempts remain. First, the record types and result values must be extracted from the logs.

$pipeline = | from $source  
| rex field=_raw /type=(?P<RecordType>[A-Z_]+).*res=(?P<Result>\w+)/
| where RecordType = "USER_LOGIN"
| where Result = "failed"
| fields - RecordType, Result
| into $destination

5. Search using wildcards

You can use a wildcard character ( * ) in the SELECT clause to search for similar field names. You must enclose the wildcard syntax in single quotation marks. For example:

SELECT 'host*' FROM main ...

You can use a wildcard to search for only internal fields, which begin with an underscore ( _ ) character . For example:

FROM main SELECT '_*'

The WHERE clause does not support the wildcard character ( * ). However you can use the like function to perform a wildcard search. For example:

FROM main WHERE ipaddress LIKE "198.%"...

The like function supports several syntaxes, see Comparison and Conditional functions.

6. Specify multiple expressions in the WHERE clause

Use the WHERE clause to filter the data by specifying one or more expressions. You need to separate multiple expressions using logical operators, such as AND and OR.

FROM index=_internal WHERE like(source, "%license%") AND type="usage" | stats sum(b) BY idx

The WHERE clause uses the like function to perform a search with wildcard. The WHERE clause does not support the asterisk ( * ) wildcard character. For more information about the like function, see Comparison and Conditional functions.

For more information about logical operators, see Predicate expressions in the SPL2 Search Manual.

7. Search for multiple terms in events

You can search for multiple terms in your events by using a search literal in the WHERE clause. An AND operator is implied between the terms specified in the search literal. To specify a search literal, you enclose the list of terms in backtick characters ( ` ).

The following search looks for the terms invalid AND user AND sshd[5258] and returns the events that contain all three terms:

SELECT _time, source FROM main WHERE `invalid user sshd[5258]`

For more information, see Search literals in expressions in the SPL2 Search Manual.

8. Specify a single field in the GROUP BY clause

You can specify one or more fields to group by. In this example a single field, host, is specified.

When using the from command, if the GROUP BY clause is specified, the SELECT clause must also be specified. The SELECT clause must contain either an aggregation or the fields in the GROUP BY clause. In this example, the SELECT clause contains the aggregation avg(cpu_usage):

SELECT avg(cpu_usage) AS 'Avg Usage' FROM my_index WHERE sourcetype="syslog" GROUP BY host

9. Specify a time span in the GROUP BY clause

You can arrange search results in groups using a time span.

When using the from command, if the GROUP BY clause is specified, the SELECT clause must also be specified.

The following search returns web access error information, grouped by host in 5 minute time spans.

SELECT count(), host, _time FROM index WHERE sourcetype="webaccess" AND `ERROR` GROUP BY host, span(_time, 5m)

There are several ways to specify a time span with the GROUP BY clause, see from command: Syntax.

10. Sorting search results using the ORDER BY clause

Suppose you use the following search to return count of the actions taken, grouped by the productId field.

FROM sample_data_index WHERE status=200 AND host="www4" GROUP BY productId SELECT count(action), productId

The results look something like this:

productId	count(action)
DC-SG-G02	12
FS-SG-G03	10
MB-AG-G07	17
PZ-SG-G05	4
SF-BVS-G01	11
SF-BVS-T01	6
WC-SH-G04	2
WC-SH-T02	15

By default the results are sorted on the GROUP BY field, productId.

You want to sort the results in descending order based on the count. However, the name of the count field in the output is the name of the aggregation specified in the SELECT clause, count(action). The ORDER BY clause will not sort on a field name that is an aggregation because it contains special characters, the parenthesis. You have two options, you can either rename the aggregation field count(action) in the SELECT clause using the AS keyword, or you can enclose the field name in single quotations, such as ORDER BY 'count(action)' DESC.

Here's the updated search using the rename option:

FROM sample_data_index WHERE status=200 AND host="www4" GROUP BY productId SELECT count(action) AS Count, productId ORDER BY Count DESC

The results look something like this:

productId	Count
MB-AG-G07	17
WC-SH-T02	15
DC-SG-G02	12
SF-BVS-G01	11
FS-SG-G03	10
SF-BVS-T01	6
PZ-SG-G05	4
WC-SH-G04	2

11. Enrich event data with a lookup dataset using the JOIN clause

Consider the following data from a set of events with login information:

_time	action	userID	host	port
8:00 AM 29 Nov 2021	Failed password	patel	yangtze.buttercupgames.com	3390
7:15 AM 29 Nov 2021	Failed password	zhang	nile.example.net	1851
9:30 PM 15 Nov 2021	Session opened	dubois	danube.sample.com	1260
6:11 AM 14 Nov 2021	Failed password	sullivan	volga.example.com	2766
11:20 AM 15 Nov 2021	Failed password	martin	volga.example.com	3622
08:13 AM 31 Oct 2021	Failed password	mayer	ganger.example.com	3658
11:59 PM 23 Oct 2021	Failed password	patel	yangtze.buttercupgames.com	1214

You want to enrich the event data with information from the host_info lookup dataset, which contains information about known hosts:

hostname	kind	status	host_contact
mekong.buttercupgames.com	internal	allowed	alex@buttercupgames.com
yangtze.buttercupgames.com	internal	allowed	claudia@buttercupgames.com
danube.sample.com	supplier	allowed	martin@sample.com
ganger.example.com	external	allowed	david@example.com
volga.example.com	external	banned

Specifically, you want every event that matches the search criteria to appear in the search results. If there is a match between an event and the host_info lookup dataset, you want to display the kind and status from the host_info lookup dataset with each event. This is referred to as a left join, which is shown in the following image.

The A circle represents the event dataset and the B circle represents the host_info lookup dataset.

The following example enriches data in the main event dataset with data from the host_info lookup dataset, where there is a matching host name. An alias for each dataset is created using the AS clause. The WHERE clause filters out events where the host kind is not internal. The SELECT clause specifies which fields to return. The results are organized by the host field.

FROM main AS m LEFT JOIN host_info AS h ON m.host=h.hostname WHERE h.kind!="internal" SELECT m.host, m.action, m.userID, h.kind, h.status GROUP BY m.host

When you use the JOIN clause, the aliases you specify in the search are not propagated to the search results. For example, in this search you specified m.host, but the search results display host.

The results of this search are shown in the following table. As you can see, the events that have a host with a kind of internal, the buttercupgames.com hosts, have been removed. The results also show that there is no host information for the nile.example.net host.

host	action	userID	kind	status
danube.sample.com	Session opened	dubois	supplier	allowed
ganger.example.com	Failed password	mayer	external	allowed
nile.example.net	Failed password	zhang
volga.example.com	Failed password	sullivan	external	banned
volga.example.com	Failed password	martin	external	banned

12. Use consecutive JOIN clauses to return data from multiple datasets

You can create a stacked join search that uses multiple JOIN clauses to return data from multiple datasets.

Consider the following data from a set of events in the orders dataset:

_time	clientip	action	pid	quantity
12:00:01 PM 20 Jan 2022	192.0.2.0	purchase	DC-SG-G02	1
10:13:34 AM 20 Jan 2022	203.0.113.255	addtochart	MB-AG-G07	3
9:55:51 AM 20 Jan 2022	203.0.113.0	purchase	WC-SH-A01	1
9:21:25 AM 20 Jan 2022	198.51.100.255	changequantity	PZ-SG-G05	2
9:14:17 AM 20 Jan 2022	192.0.2.0	purchase	SF-BVS-01	1
8:42:23 AM 20 Jan 2022	198.51.100.0	purchase	SF-BVS-G01	1
8:30:45 AM 20 Jan 2022	192.0.2.0	purchase	WC-SH-T02	2
7:57:14 AM 20 Jan 2022	198.51.100.0	purchase	PZ-SG-G05	1

You want to enrich the orders event data with information from the products lookup dataset, which contains product and price information. Here is an example of the data in the products dataset:

productId	product_name	price	sale_price	supplierId
DC-SG-G02	Dream Crusher	39.99	24.99	1238
FS-SG-G03	Final Sequel	24.99	16.99	5017
WC-SH-G04	World of Cheese	24.99	19.99	7024
WC-SH-T02	World of Cheese Tee	19.99	16.99	7024
PZ-SG-G05	Puppies vs. Zombies	4.99	3.99	7045
MB-AG-G07	Manganiello Bros.	38.99	27.99	4111
SF-BVS-G01	Grand Theft Scooter	26.99	21.99	5007
SF-BVS-01	Pony Run	49.99	41.99	5007

You want to display the product names instead of the product IDs in your search results.

You want every order event that matches the search criteria to appear in the results, even if the item ordered does not have a matching entry in the products lookup dataset. Notice that the third order contains the product ID WC-SH-A01, which does not appear in the products lookup dataset.

You can display the product names in the search results by including a JOIN clause to your search that enriches the orders dataset with the data from the products dataset. Specifically, you need to use a LEFT JOIN to accomplish this result. The datasets are joined on the field that the datasets have in common, which is the product ID field.

Here is the search you can use to add the product names to the orders events:

FROM orders AS o LEFT JOIN products AS p ON o.pid=p.productId SELECT o._time, o.pid, p.product_name, o.quantity

The results look like this:

_time	pid	product_name	quantity
12:00:01 PM 20 Jan 2022	DC-SG-G02	Dream Crusher	1
10:13:34 AM 20 Jan 2022	MB-AG-G07	Manganiello Bros.	3
9:55:51 AM 20 Jan 2022	WC-SH-A01		1
9:21:25 AM 20 Jan 2022	PZ-SG-G05	Puppies vs. Zombies	2
9:14:17 AM 20 Jan 2022	SF-BVS-01	Pony Run	1
8:42:23 AM 20 Jan 2022	SF-BVS-G01	Grand Theft Scooter	1
8:30:45 AM 20 Jan 2022	WC-SH-T02	World of Cheese Tee	2
7:57:14 AM 20 Jan 2022	PZ-SG-G05	Puppies vs. Zombies	1

Because there is no matching product ID for WC-SH-A01 in the products dataset, there is no product name in the search results. Using a LEFT JOIN is a way to highlight missing information from the second, or right-side, dataset.

Now you want to find out the name and city of the supplier that provides each product. You can enrich the search results data with information from the suppliers lookup dataset, based on the supplier ID.

Here is an example of the data in the suppliers dataset:

supplier_id	supplier_name	city	state/province	country
1009	Mile High Games	Denver	Colorado	United States
1237	Area 51 Games	Roswell	New Mexico	United States
4111	Isthmus Pastimes	Panama City	Panama	Panama
5007	EuroToys	Prague	Central Bohemia	Czech Republic
5017	Der Kriegsspiel	Cologne	North Rhine-Westphalia	Germany
7024	Happy Fun Games	Kyoto	Kyoto	Japan
7045	Kiwi Game Warehouse	Auckland	Auckland	New Zealand

To display the supplier names and city in your search results you need to add another JOIN clause to your search. Because you want every product in the search results returned, whether or not there is a corresponding supplier, you will use a LEFT JOIN. The products and suppliers datasets can be joined on the supplier ID field.

Looking back at the products dataset, the Dream Crusher product has a supplier ID of 1238, which does not appear in the suppliers dataset. The LEFT JOIN will highlight the absence of this information.

Here is the updated search:

FROM orders AS o LEFT JOIN products AS p ON o.pid=p.productId LEFT JOIN suppliers AS s ON p.supplierId=s.supplier_id SELECT o._time, o.pid, p.product_name, p.supplierId, s.supplier_name, o.quantity

The results of the search look like this:

_time	pid	product_name	supplierId	supplier_name	quantity
12:00:01 PM 20 Jan 2022	DC-SG-G02	Dream Crusher			1
10:13:34 AM 20 Jan 2022	MB-AG-G07	Manganiello Bros.	4111	Isthmus Pastimes	3
9:55:51 AM 20 Jan 2022	WC-SH-A01				1
9:21:25 AM 20 Jan 2022	PZ-SG-G05	Puppies vs. Zombies	7045	Kiwi Game Warehouse	2
9:14:17 AM 20 Jan 2022	SF-BVS-01	Pony Run	5007	EuroToys	1
8:42:23 AM 20 Jan 2022	SF-BVS-G01	Grand Theft Scooter	5007	EuroToys	1
8:30:45 AM 20 Jan 2022	WC-SH-T02	World of Cheese Tee	7024	Happy Fun Games	2
7:57:14 AM 20 Jan 2022	PZ-SG-G05	Puppies vs. Zombies	7045	Kiwi Game Warehouse	1

13. Return data from a view

This search returns the timestamp and client IP fields from a view called mysecurityview.

FROM mysecurityview | fields _time, clientip ...

14. Use the HAVING clause to filter after aggregations

The following example calculate the sum of the bytes field in the main index from events that occurred in the last 5 minutes. The results are grouped by the host field. The sum and the host fields are returned, where the sum of the bytes is greater than I MB.

SELECT sum(bytes) AS sum, host FROM main WHERE earliest=-5m@m AND latest=@m GROUP BY host HAVING sum > 1024*1024

15. Specify offsets and limits

The following search returns web access error information, grouped by host and 5 minute time spans, that have a count greater than 10. The LIMIT clause is used to return up to 50 results. The OFFSET clause is used to skip the first 20 results, starting with the 21st result.

SELECT count(), host, _time FROM index WHERE sourcetype="webaccess" AND `ERROR` GROUP BY host, span(_time, 5m) HAVING count > 10 ORDER BY count desc LIMIT 50 OFFSET 20

Related answers from Splunk Community