Updates to partitioning and filtering behavior in Edge Processor pipelines
The January 22, 2024 release of the Edge Processor solution includes an update to how Edge Processors interpret where commands in pipelines. This page describes the update in detail and provides examples to clarify the before-and-after impact.
Change overview
When configuring a pipeline, you must define a partition. A partition is a subset of data that is selected for processing by a pipeline, and you define a partition by specifying the conditions that the data must meet in order to be selected for processing. In addition to defining a partition, you have the option of using where commands in the SPL2 statement of your pipeline to filter the data that the pipeline receives. When data is excluded by partition conditions, the Edge Processor sends that excluded data to the default destination. However, when data is excluded by where commands in the SPL2 statement, the Edge Processor drops that excluded data. See Partitions for more information.
Before the January 22, 2024 update, if a pipeline had one or more where commands as the first processing commands in the SPL2 statement, Edge Processors interpreted those commands as being partition conditions. As a result, any data that did not match the where clauses was sent to the Edge Processor's default destination instead of being dropped.
After the update, Edge Processors interpret all where commands in the SPL2 statement of a pipeline as being filters in the main body of the pipeline instead of partition conditions. Going forward, data that does not match the where clauses will be dropped.
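The difference between the two behaviors can be sketched as a small Python model. This is an illustration only, not Edge Processor code: the route function, the legacy flag, and the dictionary representation of events and conditions are all hypothetical, and only simple equality conditions are modeled.

```python
def route(event, partition, leading_where, legacy):
    """Model where an event ends up (hypothetical helper, not a Splunk API).

    partition      -- dict of field -> required value (partition conditions)
    leading_where  -- dict of field -> required value, standing in for the
                      where commands that immediately follow `from $source`
    legacy         -- True models the behavior before January 22, 2024
    """
    partition = dict(partition)
    filters = dict(leading_where)
    if legacy:
        # Before the update, leading where commands acted as partition conditions.
        partition.update(filters)
        filters = {}
    # Data excluded by partition conditions goes to the default destination.
    if any(event.get(name) != value for name, value in partition.items()):
        return "default_destination"
    # Data excluded by where commands in the SPL2 statement is dropped.
    if any(event.get(name) != value for name, value in filters.items()):
        return "dropped"
    return "processed"


event = {"sourcetype": "cisco_syslog", "host": "buttercup", "source": "main_server"}
partition = {"sourcetype": "cisco_syslog"}
leading_where = {"host": "buttercup", "source": "test_server"}

before = route(event, partition, leading_where, legacy=True)   # "default_destination"
after = route(event, partition, leading_where, legacy=False)   # "dropped"
```

In this model, the same event and the same configuration produce a different outcome depending only on how the leading where conditions are interpreted.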
Which pipelines are affected by this update?
This update affects all pipelines in which the from $source command is immediately followed by one or more where commands. For example, the update has an impact on this pipeline:
$pipeline = | from $source | where host="buttercup" | where source="test_server" | eval index="my_test_index" | into $destination;
But the update has no impact on this pipeline:
$pipeline = | from $source | eval index="my_test_index" | where host="buttercup" | where source="test_server" | into $destination;
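As a rough way to check a pipeline against this rule, the following Python sketch lists the where commands that immediately follow from $source. The leading_where_commands function is a hypothetical helper, not part of any Splunk tooling, and it assumes a single-statement pipeline with no pipe characters inside string literals.

```python
def leading_where_commands(spl2):
    """Return the where commands that immediately follow `from $source`.

    Hypothetical helper for illustration; it uses a naive split on "|" and
    does not parse SPL2 properly.
    """
    # Drop the `$pipeline =` assignment and the trailing semicolon.
    body = spl2.split("=", 1)[1].rstrip().rstrip(";")
    commands = [part.strip() for part in body.split("|") if part.strip()]
    if not commands or commands[0] != "from $source":
        return []
    leading = []
    for command in commands[1:]:
        if not command.startswith("where "):
            break  # the first non-where command ends the leading run
        leading.append(command)
    return leading


affected = leading_where_commands(
    '$pipeline = | from $source | where host="buttercup" '
    '| where source="test_server" | eval index="my_test_index" | into $destination;'
)
unaffected = leading_where_commands(
    '$pipeline = | from $source | eval index="my_test_index" '
    '| where host="buttercup" | where source="test_server" | into $destination;'
)
```

A non-empty result suggests that the pipeline is affected by the update. An empty result, as for the second pipeline, suggests that it is not.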
To ensure that this update does not cause unexpected changes in how your Edge Processors handle data, the Edge Processor service has exempted applied pipelines from this update. All pipelines that were applied to an Edge Processor before the update continue to work the same way as they did before the update, and the configurations have not been altered in any way.
However, the next time you open a pipeline for editing, the update will take effect if it is relevant for that pipeline, and the Edge Processor service will automatically try to adjust the configuration of the pipeline in order to preserve the pre-existing data processing behavior. This adjustment involves converting all the where commands that immediately follow the from $source command into partition conditions. The Edge Processor service will display the following message to confirm that it has made these changes to the pipeline:
Splunk has released a software update that affects how filtering clauses in pipelines are interpreted. In order to preserve the behavior of pipelines that are already in use, when you opened this pipeline for editing, some processing commands were automatically moved from the SPL2 statement of the pipeline into the partition definition. Save these changes in your pipeline before proceeding.
What actions do I need to take?
If you open a pipeline for editing and receive the system message confirming that the Edge Processor service has updated the pipeline configuration, make sure to save the changes to your pipeline. Doing so ensures that the pipeline continues to work as intended and that the configuration is up to date.
Be aware that, in the following scenarios, the pipeline configuration does not get automatically adjusted even though the updates to the partitioning and filtering behavior take effect for the pipeline:
- You open a pipeline for editing, but the Edge Processor service fails to adjust the pipeline configuration. In this case, the Edge Processor service displays the following message:
Splunk has released a software update that affects how filtering clauses in pipelines are interpreted. This change can impact how Edge Processors determine which data to drop or send to the default destination. Update the partition and "where" command configurations in this pipeline as needed, and then save your changes.
- You apply a pipeline to an Edge Processor without editing it first.
In these cases, you must review and manually update the configuration of the partition and where commands in your pipeline. Otherwise, your pipeline might work differently than expected.
The following examples illustrate how the January 22, 2024 update can impact a pipeline, and outline the steps you might need to take in order to preserve your desired data processing behavior.
- Example 1: The Edge Processor service adjusts the pipeline configuration
- Example 2: The pipeline configuration is not automatically adjusted
Example 1: The Edge Processor service adjusts the pipeline configuration
This example describes the system messages and configuration changes that you might notice when the Edge Processor service successfully adjusts the pipeline configuration to preserve the current data processing behavior.
Before the update
Assume that, before the update, you had a pipeline with the following configurations:
- Partition condition: sourcetype equals cisco_syslog
- SPL2 statement:
$pipeline = | from $source | where host="buttercup" | where source="test_server" | eval index="my_test_index" | where result="success" | into $destination;
This pipeline accepted and processed incoming events only if they met all of these conditions:
- The sourcetype field contains the value cisco_syslog.
- The host field contains the value buttercup.
- The source field contains the value test_server.
Any data that did not match all of these conditions was sent to the Edge Processor's default destination without being processed by the pipeline.
For example, given 2 events that have the following fields:
Event 1:
sourcetype | host | source | result |
---|---|---|---|
cisco_syslog | buttercup | test_server | success |
Event 2:
sourcetype | host | source | result |
---|---|---|---|
cisco_syslog | buttercup | main_server | success |
The pipeline would accept Event 1, process it, and send it to the data destination specified in the pipeline. However, the pipeline would reject Event 2 and send it to the Edge Processor's default destination instead.
After the update
After the update, you open the pipeline for editing. The Edge Processor service displays the following message to confirm that it has made some changes to the pipeline:
Splunk has released a software update that affects how filtering clauses in pipelines are interpreted. In order to preserve the behavior of pipelines that are already in use, when you opened this pipeline for editing, some processing commands were automatically moved from the SPL2 statement of the pipeline into the partition definition. Save these changes in your pipeline before proceeding.
The Edge Processor service changes the pipeline configuration to the following:
- Partition conditions:
- sourcetype equals cisco_syslog
- host equals buttercup
- source equals test_server
- SPL2 statement:
$pipeline = | from $source | eval index="my_test_index" | where result="success" | into $destination;
These changes ensure that the pipeline continues to work the same way as before the update. Make sure to save these changes to your pipeline.
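The adjustment in this example can be approximated with the Python sketch below. It is not how the Edge Processor service is implemented: the move_leading_where_to_partition function is hypothetical, partition conditions are represented as plain strings, and only where commands of the form field="value" are handled.

```python
import re


def move_leading_where_to_partition(partition, spl2):
    """Convert leading where commands into partition conditions.

    Hypothetical illustration of the adjustment described above; returns the
    new list of partition conditions and the rewritten SPL2 statement.
    """
    body = spl2.split("=", 1)[1].rstrip().rstrip(";")
    commands = [part.strip() for part in body.split("|") if part.strip()]
    new_partition = list(partition)
    index = 1  # commands[0] is expected to be `from $source`
    while index < len(commands):
        match = re.fullmatch(r'where (\w+)="([^"]*)"', commands[index])
        if match is None:
            break  # stop at the first command that is not a simple where
        new_partition.append(f"{match.group(1)} equals {match.group(2)}")
        index += 1
    remaining = [commands[0]] + commands[index:]
    return new_partition, "$pipeline = | " + " | ".join(remaining) + ";"


new_partition, new_spl2 = move_leading_where_to_partition(
    ["sourcetype equals cisco_syslog"],
    '$pipeline = | from $source | where host="buttercup" | where source="test_server" '
    '| eval index="my_test_index" | where result="success" | into $destination;',
)
```

The where result="success" command stays in the SPL2 statement because it does not immediately follow the from $source command.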
Example 2: The pipeline configuration is not automatically adjusted
This example describes the unwanted changes in data processing behavior that can happen if the pipeline is not adjusted after the update takes place, and what you can do to fix your pipeline.
In this example, the pipeline starts with two where clauses that are joined by the AND operator. The AND operator is supported for partition conditions, so you can fix the pipeline by converting those where clauses into partition conditions.
Before the update
Assume that, before the update, you had a pipeline with the following configurations:
- Partition condition: sourcetype equals cisco_syslog
- SPL2 statement:
$pipeline = | from $source | where host="buttercup" AND source="test_server" | eval index="my_test_index" | where result="success" | into $destination;
This pipeline accepted and processed incoming events only if they met all of these conditions:
- The sourcetype field contains the value cisco_syslog.
- The host field contains the value buttercup.
- The source field contains the value test_server.
Any data that did not match all of these conditions was sent to the Edge Processor's default destination without being processed by the pipeline.
For example, given 2 events that have the following fields:
Event 1:
sourcetype | host | source | result |
---|---|---|---|
cisco_syslog | buttercup | test_server | success |
Event 2:
sourcetype | host | source | result |
---|---|---|---|
cisco_syslog | buttercup | main_server | success |
The pipeline would accept Event 1, process it, and send it to the data destination specified in the pipeline. However, the pipeline would reject Event 2 and send it to the Edge Processor's default destination instead.
After the update
After the update, one of the following scenarios occurs:
- You open the pipeline for editing, and the Edge Processor service displays the following message indicating that it failed to adjust the pipeline configuration:
Splunk has released a software update that affects how filtering clauses in pipelines are interpreted. This change can impact how Edge Processors determine which data to drop or send to the default destination. Update the partition and "where" command configurations in this pipeline as needed, and then save your changes.
- You apply this pipeline to an Edge Processor without editing the pipeline first.
In these scenarios, the pipeline configuration remains unchanged, but the pipeline is now subject to the updated Edge Processor behavior.
The pipeline now accepts and processes incoming events as long as the sourcetype field contains the value cisco_syslog. The pipeline would still accept Event 1, process it, and send it to a data destination. However, instead of rejecting Event 2 and sending it to the Edge Processor's default destination, this pipeline would accept and start processing Event 2 before dropping it because the where source="test_server" clause filters it out.
Fix the pipeline manually
To fix the pipeline so that it works the same way as it did before the update, you'll need to do the following:
1. Open the pipeline for editing.
2. Take note of any where commands that immediately follow the from $source command, and then add partition conditions that correspond to these where commands.
   For example, in this pipeline, the where clauses host="buttercup" and source="test_server" immediately follow the from $source command:
   $pipeline = | from $source | where host="buttercup" AND source="test_server" | eval index="my_test_index" | where result="success" | into $destination;
   In this case, you would need to add the following conditions to the partition:
   - host equals buttercup
   - source equals test_server
3. In the SPL2 editor, delete all the where commands that immediately follow the from $source command.
   For example, the pipeline from step 2 becomes the following:
   $pipeline = | from $source | eval index="my_test_index" | where result="success" | into $destination;
4. Save these changes to your pipeline.
Now, the pipeline will continue to work the same way as it did before the January 22, 2024 update.
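When working out which partition conditions correspond to a where command with AND-joined clauses, a sketch such as the following might help. The where_to_partition_conditions function is hypothetical, and it assumes that every clause is a simple field="value" comparison joined by AND. Anything else raises an error instead of being guessed at.

```python
import re


def where_to_partition_conditions(where_command):
    """Split an AND-joined where command into partition condition strings.

    Hypothetical helper for illustration; supports only field="value"
    clauses joined by the AND operator.
    """
    if not where_command.startswith("where "):
        raise ValueError("expected a where command")
    expression = where_command[len("where "):]
    conditions = []
    for clause in re.split(r"\s+AND\s+", expression):
        match = re.fullmatch(r'(\w+)="([^"]*)"', clause.strip())
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        conditions.append(f"{match.group(1)} equals {match.group(2)}")
    return conditions


conditions = where_to_partition_conditions('where host="buttercup" AND source="test_server"')
```

For the pipeline in this example, the result lists the two conditions to add to the partition: host equals buttercup and source equals test_server.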
See also
- Extract timestamps from event data using an Edge Processor
- Routing data in the same Edge Processor pipeline to different actions and destinations
This documentation applies to the following versions of Splunk Cloud Platform™: 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403, 9.2.2406 (latest FedRAMP release), 9.3.2408