Routing data in the same Edge Processor pipeline to different actions and destinations

You can create a pipeline that contains multiple paths for data processing and routing. For example, you can create pipelines that do the following:

Send identical copies of data to 2 different Splunk Cloud Platform deployments.
Send a specific subset of data to an index, while sending the rest of the data to an Amazon S3 bucket.
Process several copies of the same data in different ways before sending all of the results to the same destination, or to different destinations.

To create a pipeline that has multiple paths, you can use the route, thru, and branch SPL2 commands. The following table summarizes the different ways that you can control the flow of data in a pipeline by using these commands. See the linked documentation for more detailed information.

SPL2 command	Usage	Documentation
`route`	Create an additional path in the pipeline and divert a specific subset of data to it.	Process a subset of data using an Edge Processor
`thru`	Create an additional path in the pipeline and send a complete copy of the data to it.	Process a copy of data using an Edge Processor
`branch`	Create two or more paths in the pipeline and send a complete copy of the data to each path.	Process multiple copies of data using an Edge Processor

After creating an additional pipeline path, you can process the data in that path differently from the rest of the data in the pipeline, or send it to a different destination.
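The following sketches show a minimal form of each command, one command per pipeline. They follow the syntax used in the combined example later in this topic. The destination names ($subset_destination, $main_destination, $copy_destination, $first_destination, and $second_destination) and the sourcetype predicate are placeholders chosen for illustration, not required names.

```spl2
// Each $pipeline statement below is a separate pipeline.

// route: divert only the matching subset to the additional path.
// Data that does not match continues down the main path.
import route from /splunk.ingest.commands

$pipeline = | from $source
| route sourcetype == "cisco_syslog",
    [
        | into $subset_destination
    ]
| into $main_destination;

// thru: send a complete copy to the additional path.
// The main path still receives all of the data.
$pipeline = | from $source
| thru
    [
        | into $copy_destination
    ]
| into $main_destination;

// branch: send a complete copy to each path.
// Every path ends with its own into command.
$pipeline = | from $source
| branch
    [ | into $first_destination],
    [ | into $second_destination];
```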

If necessary, you can use a combination of the route, thru, and branch commands, and use the same command multiple times in the same pipeline.

Using the route, thru, and branch commands together

Depending on your specific use case, you might use one or several of these commands in the same pipeline. You can nest these commands or use them sequentially in the pipeline.

For example, you can split the pipeline into increasingly specific or limited paths by starting with a branch command followed by thru and route commands:

import route from /splunk.ingest.commands

$pipeline = | from $source
| branch 
    [ 
        | eval index="buttercup" 
        | route sourcetype == "cisco_syslog", 
        [
            | into $cisco_syslog_destination
        ]
        | into $splunk_destination_1
    ],
    [ 
        | eval ip_address = sha256(ip_address)
        | thru 
        [
            | into $aws_s3_destination
        ]
        | eval index="splunk" 
        | into $splunk_destination_2
    ],
    [ 
        | eval index="cisco" 
        | into $splunk_destination_3
    ];

This pipeline does the following:

Imports the route command to make it available for use.
Makes 3 total copies of the incoming data.
For the first copy:
- Sends any data that has the source type cisco_syslog to an index named buttercup in a destination dedicated to cisco_syslog data.
- Sends the data that does not have the source type cisco_syslog to an index named buttercup in a different destination.
For the second copy:
- Obfuscates IP addresses by hashing them.
- Sends a copy of this obfuscated data to Amazon S3.
- Sends the other copy of this obfuscated data to an index named splunk in a Splunk platform destination.
For the third copy:
- Sends this data to an index named cisco in yet another Splunk platform destination.
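The previous example nests the route and thru commands inside branch paths. As a contrast, the following sketch uses thru and route sequentially in the main path of a single pipeline. The destination names ($archive_destination, $cisco_destination, and $default_destination) and the index names are hypothetical, and the sketch assumes that a route path can contain processing commands such as eval before its into command, in the same way that the branch paths do in the previous example.

```spl2
import route from /splunk.ingest.commands

$pipeline = | from $source
// First, send a complete, unmodified copy of the data to an archive.
| thru
    [
        | into $archive_destination
    ]
// Next, divert the cisco_syslog subset and give it its own index.
| route sourcetype == "cisco_syslog",
    [
        | eval index="cisco"
        | into $cisco_destination
    ]
// Everything that was not diverted continues to the default destination.
| eval index="main"
| into $default_destination;
```

In this arrangement the order of the commands matters: the archive copy is taken before any data is diverted, so it includes the cisco_syslog data as well.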

Choosing to use one multipath pipeline or multiple single-path pipelines

You can achieve the same results by applying a single pipeline with multiple paths to your Edge Processor, or by applying multiple single-path pipelines. However, each configuration method offers different advantages:

Applying a single pipeline with multiple paths allows you to centralize the SPL2 configurations that are common to all of the paths. See Example: Using a single pipeline with multiple paths for more information.
Applying multiple single-path pipelines allows you to adjust the SPL2 configurations with greater flexibility and precision. See Example: Using multiple single-path pipelines for more information.

Example: Using a single pipeline with multiple paths

The following pipeline hashes the values in the ip_address field using the SHA-256 algorithm, then uses the branch command to create 3 pipeline paths and route the data in 3 different ways:

$pipeline = | from $source | eval ip_address = sha256(ip_address)
| branch 
    [ | eval index="buttercup" | into $first_destination],
    [ | eval index="splunk" | into $second_destination],
    [ | eval index="cisco" | into $third_destination];

If you want to change the hashing algorithm to SHA-512 for the data that's being sent to all 3 destinations, you only need to make a minor update to the first eval command, as follows:

$pipeline = | from $source | eval ip_address = sha512(ip_address)
| branch 
    [ | eval index="buttercup" | into $first_destination],
    [ | eval index="splunk" | into $second_destination],
    [ | eval index="cisco" | into $third_destination];

However, if you wanted to use the SHA-512 algorithm for one path only, then you would need to make a more substantial update and specify an eval command in each pipeline path. For example:

$pipeline = | from $source 
| branch 
    [ | eval ip_address = sha256(ip_address) | eval index="buttercup" | into $first_destination],
    [ | eval ip_address = sha256(ip_address) | eval index="splunk" | into $second_destination],
    [ | eval ip_address = sha512(ip_address) | eval index="cisco" | into $third_destination];

Additionally, if you wanted to temporarily stop sending data to one of the destinations, you would have to save a backup copy of the SPL2 clause for that pipeline path, delete the clause from the pipeline, and then add it back in later.
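As a sketch of that temporary change, assuming the third destination is the one being paused, the first pipeline in this example would be reduced to 2 paths. The removed clause is not preserved anywhere in the pipeline, so it would need to be stored elsewhere until it is added back.

```spl2
// The path for $third_destination has been removed.
// The comma after the second path is replaced by the closing semicolon.
$pipeline = | from $source | eval ip_address = sha256(ip_address)
| branch
    [ | eval index="buttercup" | into $first_destination],
    [ | eval index="splunk" | into $second_destination];
```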

Example: Using multiple single-path pipelines

You can achieve the same results as the first pipeline in the previous example by applying the following 3 pipelines to an Edge Processor:

$pipeline = | from $source | eval ip_address = sha256(ip_address)
| eval index="buttercup" | into $first_destination;

$pipeline = | from $source | eval ip_address = sha256(ip_address)
| eval index="splunk" | into $second_destination;

$pipeline = | from $source | eval ip_address = sha256(ip_address)
| eval index="cisco" | into $third_destination;

In this case, if you wanted to change the hashing algorithm to SHA-512 for the data that's being sent to all 3 destinations, you would need to update the eval command in each pipeline separately. You would not be able to make one update and have it take effect for all 3 pipelines.

However, this configuration is advantageous in other ways:

To change the hashing algorithm for one destination only, you would only need to make a minor update to the eval command in one pipeline.
To temporarily stop sending data to one of the destinations, you can remove the corresponding pipeline from the Edge Processor and then re-apply it later.
You can choose to reuse parts of this configuration by applying some but not all of the pipelines to other Edge Processors.
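For instance, switching only the data bound for the third destination to SHA-512 would be a change to that one pipeline, sketched below using the same names as the pipelines in this example. The other 2 pipelines stay exactly as they are.

```spl2
// Only this pipeline changes: sha256 becomes sha512.
$pipeline = | from $source | eval ip_address = sha512(ip_address)
| eval index="cisco" | into $third_destination;
```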
