Routing data in the same Edge Processor pipeline to different actions and destinations

You can create a pipeline that contains multiple paths for data processing and routing. For example, you can create pipelines that do the following:

  • Send identical copies of data to 2 different Splunk Cloud Platform deployments.
  • Send a specific subset of data to an index, while sending the rest of the data to an Amazon S3 bucket.
  • Process several copies of the same data in different ways before sending all of the results to the same destination, or to different destinations.

To create a pipeline that has multiple paths, you can use the route, thru, and branch SPL2 commands. The following table summarizes the different ways that you can control the flow of data in a pipeline by using these commands. See the linked documentation for more detailed information.

| SPL2 command | Usage | Documentation |
| --- | --- | --- |
| route | Create an additional path in the pipeline and divert a specific subset of data into it. | Process a subset of data using an Edge Processor |
| thru | Create an additional path in the pipeline and send a complete copy of the data into it. | Process a copy of data using an Edge Processor |
| branch | Create two or more paths in the pipeline and send complete copies of the data into them. | Process multiple copies of data using an Edge Processor |

After creating an additional pipeline path, you can process the data in the new path differently from the data in the rest of the pipeline, or send it to a different destination.
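
As a minimal sketch of the route row in the table above, the following pipeline diverts a subset of data to an index and sends the rest to an Amazon S3 bucket. The destination names, the index name, and the cisco_syslog predicate are illustrative assumptions, not values required by the route command:

```spl2
/* Hypothetical names: $splunk_destination, $aws_s3_destination, and the index "cisco". */
$pipeline = | from $source
| route sourcetype == "cisco_syslog",
    [
        | eval index="cisco"
        | into $splunk_destination
    ]
| into $aws_s3_destination;
```

In this sketch, events that match the predicate follow the bracketed path, and all other events continue down the main path to the S3 destination.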

Using the route, thru, and branch commands together

Depending on your specific use case, you might use one or several of these commands in the same pipeline. You can nest these commands or use them sequentially in the pipeline.

For example, you can split the pipeline into increasingly specific or limited paths by starting with a branch command and then nesting route and thru commands inside its paths:

$pipeline = | from $source
| branch 
    [ 
        | eval index="buttercup" 
        | route sourcetype == "cisco_syslog", 
        [
            | into $cisco_syslog_destination
        ]
        | into $splunk_destination_1
    ],
    [ 
        | eval ip_address = sha256(ip_address)
        | thru 
        [
            | into $aws_s3_destination
        ]
        | eval index="splunk" 
        | into $splunk_destination_2
    ],
    [ 
        | eval index="cisco" 
        | into $splunk_destination_3
    ];

This pipeline does the following:

  • Makes 3 total copies of the incoming data.
  • For the 1st copy:
    • Sends any data that has source type cisco_syslog to an index named buttercup in a destination dedicated to cisco_syslog data.
    • Sends the data that does not have the source type cisco_syslog to an index named buttercup in a different destination.
  • For the 2nd copy:
    • Obfuscates IP addresses by hashing them with the SHA-256 algorithm.
    • Sends a copy of this obfuscated data to Amazon S3.
    • Sends the other copy of this obfuscated data to an index named splunk in a Splunk platform destination.
  • For the 3rd copy:
    • Sends this data to an index named cisco in yet another Splunk platform destination.
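
In contrast to the nested example above, the commands can also be used one after another in the main path. The following is a sketch only, reusing destination names from the earlier example as placeholders: it first sends a complete copy of the data to Amazon S3, then diverts the cisco_syslog subset, and finally sends whatever remains to a Splunk platform destination.

```spl2
/* Placeholder destination names; substitute the destinations defined for your pipeline. */
$pipeline = | from $source
| thru
    [
        | into $aws_s3_destination
    ]
| route sourcetype == "cisco_syslog",
    [
        | into $cisco_syslog_destination
    ]
| into $splunk_destination_1;
```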

Choosing to use one multipath pipeline or multiple single-path pipelines

You can achieve the same results by applying a single pipeline with multiple paths to your Edge Processor, or by applying multiple single-path pipelines. However, each configuration method offers different advantages, as the following examples show.

Example: Using a single pipeline with multiple paths

The following pipeline hashes the values in the ip_address field using the SHA-256 algorithm, then uses the branch command to create 3 pipeline paths that each set a different index and send the data to a different destination:

$pipeline = | from $source | eval ip_address = sha256(ip_address)
| branch 
    [ | eval index="buttercup" | into $first_destination],
    [ | eval index="splunk" | into $second_destination],
    [ | eval index="cisco" | into $third_destination];

If you want to change the hashing algorithm to SHA-512 for the data that's being sent to all 3 destinations, you only need to make a minor update to the first eval command, as follows:

$pipeline = | from $source | eval ip_address = sha512(ip_address)
| branch 
    [ | eval index="buttercup" | into $first_destination],
    [ | eval index="splunk" | into $second_destination],
    [ | eval index="cisco" | into $third_destination];

However, if you wanted to use the SHA-512 algorithm for one path only, then you would need to make a more substantial update: remove the hashing eval command from the shared part of the pipeline and specify it separately in each pipeline path. For example:

$pipeline = | from $source 
| branch 
    [ | eval ip_address = sha256(ip_address) | eval index="buttercup" | into $first_destination],
    [ | eval ip_address = sha256(ip_address) | eval index="splunk" | into $second_destination],
    [ | eval ip_address = sha512(ip_address) | eval index="cisco" | into $third_destination];

Additionally, if you wanted to temporarily stop sending data to one of the destinations, you would have to save a backup copy of the SPL2 clause for that pipeline path, delete the clause from the pipeline, and then add it back in later.
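
As an illustration, temporarily stopping the flow of data to $third_destination in the first pipeline of this example could look like the following sketch, where the third path has been removed from the branch command and must be restored by hand later:

```spl2
/* The removed path, saved separately: [ | eval index="cisco" | into $third_destination] */
$pipeline = | from $source | eval ip_address = sha256(ip_address)
| branch
    [ | eval index="buttercup" | into $first_destination],
    [ | eval index="splunk" | into $second_destination];
```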

Example: Using multiple single-path pipelines

You can achieve the same results as the first pipeline in the previous example by applying the following 3 pipelines to an Edge Processor:

$pipeline = | from $source | eval ip_address = sha256(ip_address)
| eval index="buttercup" | into $first_destination;

$pipeline = | from $source | eval ip_address = sha256(ip_address)
| eval index="splunk" | into $second_destination;

$pipeline = | from $source | eval ip_address = sha256(ip_address)
| eval index="cisco" | into $third_destination;

In this case, if you wanted to change the hashing algorithm to SHA-512 for the data that's being sent to all 3 destinations, you would need to update the eval command in each pipeline separately. You would not be able to make one update and have it take effect for all 3 pipelines.

However, this configuration is advantageous in other ways:

  • To change the hashing algorithm for one destination only, you would only need to make a minor update to the eval command in one pipeline.
  • To temporarily stop sending data to one of the destinations, you can remove the corresponding pipeline from the Edge Processor and then re-apply it later.
  • You can choose to reuse parts of this configuration by applying some but not all of the pipelines to other Edge Processors.
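
For instance, switching only the data bound for $third_destination to SHA-512 might require nothing more than the following change to the third pipeline, while the other 2 pipelines stay exactly as they are. This is a sketch based on the pipelines shown above:

```spl2
/* Only the hashing function in this one pipeline changes, from sha256 to sha512. */
$pipeline = | from $source | eval ip_address = sha512(ip_address)
| eval index="cisco" | into $third_destination;
```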

Last modified on 10 June, 2024

This documentation applies to the following versions of Splunk Cloud Platform: 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308 (latest FedRAMP release), 9.1.2312, 9.2.2403

