Splunk® Data Stream Processor

Use the Data Stream Processor


On October 30, 2022, all 1.2.x versions of the Splunk Data Stream Processor reach their end-of-support date. See the Splunk Software Support Policy for details. For information about upgrading to a supported version, see the Upgrade the Splunk Data Stream Processor topic.
This documentation does not apply to the most recent version of Splunk® Data Stream Processor. For documentation on the most recent version, go to the latest release.
Filtering and routing data in the Splunk Data Stream Processor

The Splunk Data Stream Processor can filter data so that only records that match a specified condition pass downstream to other functions in your pipeline. For example, you can configure the Where function to look for a specific event code from Windows event logs, and only allow records with that event code to be sent to the end of the pipeline.

You can also route data to different destinations depending on whether the data meets certain filter criteria. For example, you can branch the pipeline and configure multiple Where functions so that all data from one data source is sent to a Splunk index, all data from a second data source is sent to a different Splunk index, and all data from a third data source is sent to a third-party destination like Apache Kafka. See the Create a pipeline that routes data to different Splunk indexes section on this page for an end-to-end example of how to build a pipeline that routes data depending on filter criteria.

Filter records using the Where function

Add a Where function to your pipeline to only allow records that match a specified condition to pass downstream to other functions in your pipeline.

  1. From the Canvas view of your pipeline, click the + icon and add the Where function to your pipeline.
  2. In the Where function, provide a predicate expression. For examples of predicate expressions, see the Where function topic, the Predicate expressions section on this page, or the predicates topic in the SPL2 Search Manual.
  3. With the Where function highlighted, click Start Preview to verify that the expression is working as expected.
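
As a rough illustration of what these steps produce, the following SPL2 sketch shows a single-branch pipeline that contains one Where function. It reuses the splunk_firehose source and the into_splunk_enterprise_indexes sink from the routing example later on this page. The connection ID is a placeholder, and the kind="metric" predicate and the main index are assumptions chosen only for illustration.

```
| from splunk_firehose() | where kind="metric" | into into_splunk_enterprise_indexes("<your-connection-id>", "", "main");
```

Only records whose kind field equals metric reach the sink. All other records are dropped at the Where function.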

Predicate expressions

The following table lists some common expressions that you can use in the Where function. The Where function accepts a predicate, which is an expression that returns a boolean value (true or false). See Predicates for more information on predicates.

Expression
    Description

kind="metric"
    If the kind field is equal to metric, the record passes downstream. If not, the record doesn't get passed through the pipeline.

source_type LIKE "CISCO%"
    If the value of the source_type field begins with the string "CISCO", the record passes downstream. If not, the record doesn't get passed further through the pipeline.

ttms > 5000
    Here, ttms is a custom top-level field that contains latency information. If the record has a ttms value of over 5000 (more than 5 seconds, with ttms measured in milliseconds), the record is passed downstream. If not, the record doesn't get passed through the pipeline.

NOT (timestamp IS NULL)
    If the timestamp field is not null, the record passes downstream. If timestamp is null, the record doesn't get passed through the pipeline.

logGroup LIKE "/c/windows/%" OR logGroup LIKE "/aws/directoryservice/%"
    Here, logGroup is a custom top-level field containing a specific path. If the value of the logGroup field begins with "/c/windows/" or "/aws/directoryservice/", the record passes downstream. If not, the record doesn't get passed further through the pipeline.

map_get(attributes, "spl_fwd_type")="uf"
    Here, we are ingesting data from a Splunk universal forwarder, which outputs a map of key-value pairs in the attributes field. One of these key-value pairs is spl_fwd_type: "uf". In this example, the map_get scalar function looks up the spl_fwd_type key in the attributes map. If the value for that key is "uf", the record passes downstream. If not, the record doesn't get passed further through the pipeline.
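
Predicates can also be combined into a single expression. The following sketch assumes the standard SPL2 AND operator and reuses two of the expressions listed above. It passes only records whose source_type value begins with "CISCO" and whose timestamp field is not null.

```
source_type LIKE "CISCO%" AND NOT (timestamp IS NULL)
```

Enter an expression like this in the predicate field of the Where function, and use Start Preview to confirm that it matches the records you expect.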

Create a pipeline that routes data to different Splunk indexes

In this example, we create a pipeline that does the following:

  • Ingests data from the Splunk DSP Firehose. In this case, the Splunk DSP Firehose is receiving data from a combination of sources including a Splunk Connect for Syslog (SC4S) instance.
  • Sends syslog data to a Splunk index called syslog.
  • Sends all other data to the Splunk index specified in the attributes field of the record, or to the default index called main if the attributes field does not specify an index.

Prerequisites

Steps

  1. From the Build Pipeline page, select the Splunk DSP Firehose data source.
  2. Configure the pipeline to send syslog records to a Splunk index named syslog. To achieve this, filter for records that have the source_type field set to syslog, so that only records that match that criterion continue downstream to the sink function.
    1. Click the + icon beside the Splunk DSP Firehose function and select the Where function.
    2. On the View Configurations tab, in the predicate field, enter the following expression:
      source_type="syslog"
      
    3. Click the + icon beside the Where function and select the Send to a Splunk Index with Batching sink function.
    4. On the View Configurations tab, set connection_id to the connection for your destination Splunk instance.
    5. In the index field, enter "" (two quotation marks).
    6. In the default_index field, enter "syslog" (including the quotation marks).
  3. Configure the pipeline to send all other records to the Splunk index specified in the attributes field, or to the main index if the attributes field doesn't specify an index. To achieve this, create a branch in the pipeline that filters for records that have a source_type other than syslog, so that only records that match that criterion continue downstream to the second sink function.
    1. Click the branching icon beside the Splunk DSP Firehose function and select the Where function.
    2. On the View Configurations tab, in the predicate field, enter the following expression:
      source_type!="syslog"
      
    3. Click the + icon beside the Where function and select the Send to a Splunk Index with Batching sink function.
    4. On the View Configurations tab, set connection_id to the connection for your destination Splunk instance.
    5. In the index field, enter the following expression:
      cast(map_get(attributes, "index"), "string")
      
    6. In the default_index field, enter "main" (including the quotation marks).
  4. Validate your pipeline to confirm that all of the functions are configured correctly. Click the More Options button located beside the Activate Pipeline button, and then select Validate.
  5. Click Save, enter a name for your pipeline, and then click Save again.
  6. (Optional) Click Activate to activate your pipeline. If it's the first time activating your pipeline, do not enable any of the optional Activate settings.

You now have a pipeline that receives data from the Splunk DSP Firehose, sends any data that was ingested from a syslog server to the syslog index in your Splunk instance, and sends all other data to either the index specified in the attributes field or the default main index.

[Screenshot: a Splunk Data Stream Processor pipeline that splits into two branches. Each branch includes a Where function and a Send to a Splunk Index with Batching sink function.]

The following is the complete SPL2 statement for this pipeline:

$statement_2 = | from splunk_firehose();
| from $statement_2 | where source_type!="syslog" | into into_splunk_enterprise_indexes("2f1ce641-baeb-4695-82cc-8f16ae64eb71", cast(map_get(attributes, "index"), "string"), "main");
| from $statement_2 | where source_type="syslog" | into into_splunk_enterprise_indexes("2f1ce641-baeb-4695-82cc-8f16ae64eb71", "", "syslog");
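
The same pattern extends to more branches. As a sketch only, the following variation adds a third branch that sends records whose source_type value begins with "cisco" to a hypothetical index named cisco, and narrows the catch-all branch so that those records are not indexed twice. The connection ID placeholder, the cisco index, and the "cisco%" pattern are assumptions for illustration and are not part of the pipeline built in the steps above.

```
$statement_2 = | from splunk_firehose();
| from $statement_2 | where source_type="syslog" | into into_splunk_enterprise_indexes("<your-connection-id>", "", "syslog");
| from $statement_2 | where source_type LIKE "cisco%" | into into_splunk_enterprise_indexes("<your-connection-id>", "", "cisco");
| from $statement_2 | where source_type!="syslog" AND NOT (source_type LIKE "cisco%") | into into_splunk_enterprise_indexes("<your-connection-id>", cast(map_get(attributes, "index"), "string"), "main");
```

Each branch reads from the same $statement_2 source, so the three Where predicates together decide which sink a record reaches. Validate the pipeline after making a change like this to confirm that every function is configured correctly.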

See also

Functions
Where
Get data from Splunk DSP Firehose
Send data to a Splunk Index with Batching
Related topics
Predicates
Send data from a pipeline to multiple destinations
Create a pipeline using the Canvas Builder
Last modified on 11 March, 2022

This documentation applies to the following versions of Splunk® Data Stream Processor: 1.2.0, 1.2.1-patch02, 1.2.1, 1.2.2-patch02, 1.2.4, 1.2.5

