from command overview

The SPL2 from command retrieves data from a dataset.

This command works differently in different product contexts:

When used in a search, the from command can retrieve data from various kinds of datasets such as indexes, metric indexes, lookups, views, saved searches, or jobs. Additionally, the command has optional clauses to filter, aggregate, project, and order the data.
When used in an Edge Processor or Ingest Processor pipeline, the from command selects a subset of the data received by the Edge Processor or Ingest Processor by retrieving that data from an internal dataset. The command has limited support for optional clauses.

When used in a search, the from command has a flexible syntax, which enables you to start a search with either the FROM clause or the SELECT clause. For example, these two searches are identical and return the exact same results:

Start with the FROM clause Start with the SELECT clause

| FROM main

WHERE earliest=-5m@m GROUP BY host


SELECT sum(bytes) AS sum, host

| SELECT sum(bytes) AS sum, host

FROM main WHERE earliest=-5m@m


GROUP BY host

The only difference between these searches is that one starts with the FROM clause and the other search starts with the SELECT clause.

Regardless of which clause you start a search with, to use the optional clauses you must specify the clauses in a specific hierarchical order. See from command usage.

Syntax

The SPL2 from command supports different syntaxes in different product contexts. For detailed information about the different syntaxes, see from command syntax details.

Syntax for searches

The required syntax is in bold.

FROM <dataset> [ AS <alias>]

[ JOIN <dataset> AS <alias> ON <join-condition>... ] ...

[ WHERE ( <predicate-expression> [<logical-operator> <predicate-expression>] ) ... ]

[ (GROUP BY | GROUPBY) ( <expression>[,<expression>... ] | span ( <field> ) | span ( <field>, [<int>]<timescale> ) | <field> span=( [<int>]<timescale> ) ]

[ (SELECT | SELECT DISTINCT) <expression> [, <expression>] ... ]

[ HAVING <expression> ]

[ (ORDER BY | ORDERBY) <expression>[,<expression>... ] [ASC | DESC] ]

[ LIMIT <integer> ]

[ OFFSET <integer> ]

Syntax for pipelines

The required syntax is in bold.

FROM <dataset>

[ WHERE ( <predicate-expression> [<logical-operator> <predicate-expression>] ) ... ]

[ (SELECT <field-list> ]

Uppercase and lowercase clause names

The clause names are shown in uppercase in the syntax for readability. You can specify the clause names in uppercase or lowercase. For example, you can specify FROM or from, GROUP BY or group by. However, you cannot specify the names in mixed case, such as Group By.

Using the from command clauses

The following table provides a brief explanation of what each clause is used for:

Clause	Explanation
FROM	Use the FROM clause to specify the dataset that you want to search.
JOIN	Use the JOIN clause to enrich your event data with data from another dataset.
WHERE	Use the WHERE clause to filter data. You use the WHERE clause before the data is aggregated.
GROUP BY	Use the GROUP BY clause to organize the search results.
SELECT	Use the SELECT clause to specify the fields you want returned from the search, or to aggregate the data.
HAVING	Use the HAVING clause to filter the results after the data is aggregated
ORDER BY	Use the ORDER BY clause to sort the results
LIMIT	Use the LIMIT clause to set a maximum for the number of results to return.
OFFSET	Use the OFFSET clause to return a window of records from the results by skipping rows in the result set.

For detailed information about each clause, see from command syntax details.

How the SPL2 from command works

The from command works differently in different product contexts:

In searches

The SPL2 from command is very SQL-like, but you don't have to know SQL to use it.

The from command supports two different syntax hierarchies. One hierarchy starts with the FROM clause. The other hierarchy starts with the SELECT clause. See from command usage.

The following examples focus on the FROM clause. To see examples that start with the SELECT clause and examples using the other clauses, see from command examples.

Use the from command to read data in any kind of dataset, such as a timestamped index, a metric index, a view, or a lookup.

The only required syntax is:

| FROM <dataset>

This simple search returns all of the data in the dataset. That might be a lot of data. Most of the time you will want to add a filter to your search to either narrow the results down to what you are looking for, or to exclude the data that you don't want to see.

To show you how this works, let's start with an event index called main that contains events with HTTP status codes. You want to find all of the events with a status code of 200.

You can search the main index using a simple search like this:

| FROM main WHERE status=200

You can use the WHERE clause to filter data by specifying a field-value pair or specifying a time-range.

Specifying field-value pairs

When you specify a field-value pair, if the value is a <string> it must be enclosed in double quotation marks. For example, the following search looks for a specific value linux_secure1 in the sourcetype field. The value must be enclosed in double quotation marks.

| FROM main WHERE sourcetype="linux_secure1"

Likewise, numbers that are interpreted as string values must also be enclosed in double quotation marks. For example:

| FROM main WHERE clientip="192.0.2.14"

Specifying a time range

You can use a Time Range Picker to specify a time range or you can specify a time range directly in your search syntax.

To specify a time range directly in your search, you use the earliest and latest time modifiers in the WHERE clause. You can specify an exact time such as earliest="10/5/2019:20:00:00", or a relative time such as starting with the previous hour earliest=-h.

Here's an example that specifies both a field-value pair and a beginning time range which goes back in time 1 hour.

| FROM main WHERE clientip="192.0.2.14" AND earliest=-1h

When a search doesn't specify an ending time range using latest, the current time now() is used.

For more information about specifying time modifiers, see Time modifiers in the SPL2 Search Manual.

Snap-to time

You can specify that your search time range starts at the beginning of a time unit, such as the beginning of a day or hour. This is referred to as a snap-to time.

Here's an example of using a time range in a search that is going back 5 minutes, snapping to the beginning of the minute. The end of the time range is the beginning of the current minute.

FROM main WHERE earliest=-5m@m AND latest=@m

To learn more about using a snap-to time, see Specifying relative time in the SPL2 Search Manual.

Other clauses

There are many clauses that you can use with the from command.

For information about the hierarchy of commands, see from command usage.

See from command examples for a range of examples using the from command clauses.

In pipelines

The from command selects a subset of data from the internal dataset inside the Edge Processor or Ingest Processor. This data can then be processed by subsequent SPL2 commands specified in the pipeline.

Consider the following pipeline:

$pipeline = | from $source | eval index="main" | into $destination

The following table describes what each command is doing in the pipeline:

Command or clause	Description
from command	Selects a subset of the data received by the Edge Processor or Ingest Processor. This subset is determined by the partition of the pipeline, which you configure in the pipeline builder.
eval command	Sets the value of the `index` field to `main` for all of the events in the selected subset of data.
into command	Sends the processed data to the destination dataset specified by the pipeline settings, which you configure in the pipeline builder.

You can use the optional WHERE and SELECT clauses to filter data and specify which fields to keep.

Filtering data

You can use the WHERE clause to filter data by specifying a field-value pair. For example, the following pipeline selects a subset of the data received by the Edge Processor or Ingest Processor, and then filters it so that only data with the high value in the priority field continues to the next command in the pipeline:

$pipeline = | FROM $source WHERE priority = "high" | eval index = "main" | into $destination

Specifying fields to keep

You can use the SELECT clause to specify the fields that you want to include in the downstream data. For example, the following pipeline selects a subset of the data received by the Edge Processor or Ingest Processor and only keeps the host, sourcetype, and body fields. All other fields are dropped.

$pipeline = | FROM $source SELECT host, sourcetype, body | into $destination

Related answers from Splunk Community