Updates to partitioning and filtering behavior in Edge Processor pipelines

The January 22, 2024 release of the Edge Processor solution includes an update to how Edge Processors interpret where commands in pipelines. This page describes the update in detail and provides examples to clarify the before-and-after impact.

Change overview

When configuring a pipeline, you must define a partition. A partition is a subset of data that is selected for processing by a pipeline, and you define a partition by specifying the conditions that the data must meet in order to be selected for processing. In addition to defining a partition, you have the option of using where commands in the SPL2 statement of your pipeline to filter the data that the pipeline receives. When data is excluded by partition conditions, the Edge Processor sends that excluded data to the default destination. However, when data is excluded by where commands in the SPL2 statement, the Edge Processor drops that excluded data. See Partitions for more information.

Before the January 22, 2024 update, if a pipeline had one or more where commands as the first processing commands in the SPL2 statement, Edge Processors interpreted those commands as being partition conditions. As a result, any data that did not match the where clauses was sent to the Edge Processor's default destination instead of being dropped.

After the update, Edge Processors interpret all where commands in the SPL2 statement of a pipeline as being filters in the main body of the pipeline instead of partition conditions. Going forward, data that does not match the where clauses will be dropped.
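The difference between the two behaviors can be sketched as a small routing model. This is an illustrative simplification, not the Edge Processor implementation: the `route` function, the `legacy_behavior` flag, and the host value `daisy` are hypothetical names introduced only for this sketch.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]
Condition = Callable[[Event], bool]


def route(event: Event,
          partition: List[Condition],
          leading_where: List[Condition],
          legacy_behavior: bool) -> str:
    """Return where a single event ends up in this simplified model."""
    if legacy_behavior:
        # Before the update: where commands directly after `from $source`
        # were treated as additional partition conditions.
        partition = partition + leading_where
        leading_where = []
    if not all(cond(event) for cond in partition):
        return "default destination"  # excluded by the partition
    if not all(cond(event) for cond in leading_where):
        return "dropped"  # excluded by a where filter in the SPL2 statement
    return "pipeline destination"


# Hypothetical event that matches the partition but not the where command.
partition = [lambda e: e.get("sourcetype") == "cisco_syslog"]
leading_where = [lambda e: e.get("host") == "buttercup"]
event = {"sourcetype": "cisco_syslog", "host": "daisy"}

print(route(event, partition, leading_where, legacy_behavior=True))   # default destination
print(route(event, partition, leading_where, legacy_behavior=False))  # dropped
```

In this model, the same event moves from the default destination to being dropped once the leading where command is no longer treated as a partition condition.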

Which pipelines are affected by this update?

This update affects all pipelines where the from $source command is immediately followed by one or more where commands. For example, the update has an impact on this pipeline:

$pipeline = | from $source | where host="buttercup" | where source="test_server" | eval index="my_test_index" | into $destination;

But the update has no impact on this pipeline:

$pipeline = | from $source | eval index="my_test_index" | where host="buttercup" | where source="test_server" | into $destination;
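If you keep your SPL2 statements as text, a rough check like the following can help you spot affected pipelines. It is a minimal sketch that splits the statement on the pipe character, so it assumes that no pipe characters appear inside quoted strings. The `leading_where_commands` helper is a hypothetical name and is not part of any Splunk tooling.

```python
from typing import List


def leading_where_commands(spl2: str) -> List[str]:
    """Return the where commands that immediately follow `from $source`."""
    # Drop the `$pipeline =` assignment and the trailing semicolon.
    _, _, body = spl2.partition("=")
    body = body.strip().rstrip(";")
    # Naive split: assumes "|" never appears inside a quoted value.
    commands = [c.strip() for c in body.split("|") if c.strip()]
    if not commands or commands[0] != "from $source":
        return []
    leading = []
    for command in commands[1:]:
        if not command.startswith("where "):
            break  # stop at the first command that is not a where command
        leading.append(command)
    return leading


affected = ('$pipeline = | from $source | where host="buttercup" '
            '| where source="test_server" | eval index="my_test_index" '
            '| into $destination;')
unaffected = ('$pipeline = | from $source | eval index="my_test_index" '
              '| where host="buttercup" | where source="test_server" '
              '| into $destination;')

print(leading_where_commands(affected))    # two where commands: affected
print(leading_where_commands(unaffected))  # empty list: not affected
```

A non-empty result means that the pipeline starts with where commands and is therefore subject to the update.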

To ensure that this update does not cause unexpected changes in how your Edge Processors handle data, the Edge Processor service has exempted applied pipelines from this update. All pipelines that were applied to an Edge Processor before the update continue to work the same way as they did before the update, and the configurations have not been altered in any way.

However, the next time you open a pipeline for editing, the update will take effect if it is relevant for that pipeline, and the Edge Processor service will automatically try to adjust the configuration of the pipeline in order to preserve the pre-existing data processing behavior. This adjustment involves converting all the where commands that immediately follow the from $source command into partition conditions. The Edge Processor service will display the following message to confirm that it has made these changes to the pipeline:

Splunk has released a software update that affects how filtering clauses in pipelines are interpreted. In order to preserve the behavior of pipelines that are already in use, when you opened this pipeline for editing, some processing commands were automatically moved from the SPL2 statement of the pipeline into the partition definition. Save these changes in your pipeline before proceeding.

What actions do I need to take?

If you open a pipeline for editing and receive the system message confirming that the Edge Processor service has updated the pipeline configuration, make sure to save the changes to your pipeline. Doing so ensures that the pipeline continues to work as intended and that the configuration is up to date.

Be aware that, in the following scenarios, the pipeline configuration does not get automatically adjusted even though the updates to the partitioning and filtering behavior take effect for the pipeline:

- You open a pipeline for editing, but the Edge Processor service fails to adjust the pipeline configuration. In this case, the Edge Processor service displays the following message:

  Splunk has released a software update that affects how filtering clauses in pipelines are interpreted. This change can impact how Edge Processors determine which data to drop or send to the default destination. Update the partition and "where" command configurations in this pipeline as needed, and then save your changes.

- You apply a pipeline to an Edge Processor without editing it first.

In these cases, you must review and manually update the configuration of the partition and where commands in your pipeline. Otherwise, your pipeline might work differently than expected.

The following examples illustrate how the January 22, 2024 update can impact a pipeline, and outline the steps you might need to take in order to preserve your desired data processing behavior.

Example 1: The Edge Processor service adjusts the pipeline configuration

This example describes the system messages and configuration changes that you might notice when the Edge Processor service successfully adjusts the pipeline configuration to preserve the current data processing behavior.

Before the update

Assume that, before the update, you had a pipeline with the following configurations:

Partition condition: sourcetype equals cisco_syslog

SPL2 statement:

$pipeline = | from $source | where host="buttercup" | where source="test_server" | eval index="my_test_index" | where result="success" | into $destination;

This pipeline accepted and processed incoming events only if they met all of these conditions:

- The sourcetype field contains the value cisco_syslog.
- The host field contains the value buttercup.
- The source field contains the value test_server.

Any data that did not match all of these conditions was sent to the Edge Processor's default destination without being processed by the pipeline.

For example, given 2 events that have the following fields:

Event 1:

sourcetype	host	source	result
cisco_syslog	buttercup	test_server	success

Event 2:

sourcetype	host	source	result
cisco_syslog	buttercup	main_server	success

The pipeline would accept Event 1, process it, and send it to the data destination specified in the pipeline. However, the pipeline would reject Event 2 and send it to the Edge Processor's default destination instead.

After the update

After the update, you open the pipeline for editing. The Edge Processor service displays the following message to confirm that it has made some changes to the pipeline:

Splunk has released a software update that affects how filtering clauses in pipelines are interpreted. In order to preserve the behavior of pipelines that are already in use, when you opened this pipeline for editing, some processing commands were automatically moved from the SPL2 statement of the pipeline into the partition definition. Save these changes in your pipeline before proceeding.

The Edge Processor service changes the pipeline configuration to the following:

Partition conditions:
- sourcetype equals cisco_syslog
- host equals buttercup
- source equals test_server

SPL2 statement:

$pipeline = | from $source | eval index="my_test_index" | where result="success" | into $destination;

These changes ensure that the pipeline continues to work the same way as before the update. Make sure to save these changes to your pipeline.

Example 2: The pipeline configuration is not automatically adjusted

This example describes the unwanted changes in data processing behavior that can happen if the pipeline is not adjusted after the update takes place, and what you can do to fix your pipeline.

In this example, the pipeline starts with a where command that contains two conditions joined by the AND operator. The AND operator is supported for partition conditions, so you can fix the pipeline by converting those conditions into partition conditions.

Before the update

Assume that, before the update, you had a pipeline with the following configurations:

Partition condition: sourcetype equals cisco_syslog

SPL2 statement:

$pipeline = | from $source | where host="buttercup" AND source="test_server" | eval index="my_test_index" | where result="success" | into $destination;

This pipeline accepted and processed incoming events only if they met all of these conditions:

- The sourcetype field contains the value cisco_syslog.
- The host field contains the value buttercup.
- The source field contains the value test_server.

Any data that did not match all of these conditions was sent to the Edge Processor's default destination without being processed by the pipeline.

For example, given 2 events that have the following fields:

Event 1:

sourcetype	host	source	result
cisco_syslog	buttercup	test_server	success

Event 2:

sourcetype	host	source	result
cisco_syslog	buttercup	main_server	success

The pipeline would accept Event 1, process it, and send it to the data destination specified in the pipeline. However, the pipeline would reject Event 2 and send it to the Edge Processor's default destination instead.

After the update

After the update, one of the following scenarios occurs:

- You open the pipeline for editing, and the Edge Processor service displays the following message indicating that it failed to adjust the pipeline configuration:

  Splunk has released a software update that affects how filtering clauses in pipelines are interpreted. This change can impact how Edge Processors determine which data to drop or send to the default destination. Update the partition and "where" command configurations in this pipeline as needed, and then save your changes.

- You apply this pipeline to an Edge Processor without editing the pipeline first.

In these scenarios, the pipeline configuration remains unchanged, but the pipeline is now subject to the updated Edge Processor behavior.

The pipeline now accepts and processes incoming events as long as the sourcetype field contains the value cisco_syslog. The pipeline still accepts Event 1, processes it, and sends it to the data destination specified in the pipeline. However, instead of rejecting Event 2 and sending it to the Edge Processor's default destination, the pipeline accepts Event 2, starts processing it, and then drops it because the event does not match the source="test_server" condition in the where command.

Fix the pipeline manually

To fix the pipeline so that it works the same way as it did before the update, you'll need to do the following:

1. Open the pipeline for editing.
2. Take note of any where commands that immediately follow the from $source command, and then add partition conditions that correspond to these where commands.
   For example, in this pipeline, the where host="buttercup" AND source="test_server" command immediately follows the from $source command:
   ```
   $pipeline = | from $source | where host="buttercup" AND source="test_server" | eval index="my_test_index" | where result="success" | into $destination;
   ```
   In this case, you would need to add the following conditions to the partition:
   - host equals buttercup
   - source equals test_server
3. In the SPL2 editor, delete all the where commands that immediately follow the from $source command.
   For example, the pipeline from step 2 becomes the following:
   ```
   $pipeline = | from $source | eval index="my_test_index" | where result="success" | into $destination;
   ```
4. Save these changes to your pipeline.

Now, the pipeline will continue to work the same way as it did before the January 22, 2024 update.
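The manual fix above can also be sketched programmatically, which may help if you need to review many pipelines. This is a rough illustration under stated assumptions rather than the logic that the Edge Processor service uses: it handles only simple field="value" conditions joined by AND, it assumes that no pipe characters or AND keywords appear inside quoted values, and the `move_leading_where_to_partition` helper is a hypothetical name.

```python
import re
from typing import List, Tuple


def move_leading_where_to_partition(spl2: str) -> Tuple[List[str], str]:
    """Return (partition conditions to add, rewritten SPL2 statement)."""
    name, _, body = spl2.partition("=")
    commands = [c.strip() for c in body.strip().rstrip(";").split("|") if c.strip()]
    if not commands or commands[0] != "from $source":
        return [], spl2  # nothing to convert
    conditions: List[str] = []
    index = 1
    # Walk the where commands that immediately follow `from $source`.
    while index < len(commands) and commands[index].startswith("where "):
        clause = commands[index][len("where "):]
        for term in re.split(r"\s+AND\s+", clause):
            match = re.fullmatch(r'(\w+)\s*=\s*"([^"]*)"', term.strip())
            if match is None:
                # Anything more complex needs manual review.
                raise ValueError(f"unsupported where clause: {term!r}")
            conditions.append(f"{match.group(1)} equals {match.group(2)}")
        index += 1
    remaining = [commands[0]] + commands[index:]
    return conditions, f"{name.strip()} = | " + " | ".join(remaining) + ";"


original = ('$pipeline = | from $source | where host="buttercup" AND '
            'source="test_server" | eval index="my_test_index" '
            '| where result="success" | into $destination;')
conditions, rewritten = move_leading_where_to_partition(original)
print(conditions)
print(rewritten)
```

For the Example 2 pipeline, the sketch yields the two partition conditions from step 2 and the SPL2 statement from step 3. The later where result="success" command stays in the statement because it does not immediately follow the from $source command.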
