Create pipelines for Ingest Processor

A pipeline is a set of data processing instructions written in the Search Processing Language, version 2 (SPL2). Create pipelines in your Ingest Processor to specify how you want the Ingest Processor to route and process particular subsets of the received data. When you apply a pipeline, the Ingest Processor uses those instructions to process the data that it receives.

To create a valid pipeline, you must complete the following tasks:

Verify that you have configured your Splunk Cloud Platform tenant to connect to a Splunk Cloud Platform index. See Allow your tenant to access Splunk Cloud Platform indexes for more information.
Specify the destination that the pipeline sends processed data to. See How the destination for Ingest Processor works for more information.
Verify that your pipeline is sending data to your configured data destination. See Verify your Ingest Processor and pipeline configurations for more information.

As a best practice, verify that your newly created pipeline works before you gradually add commands and functions to it.

After you have created your pipeline and verified that it works, you can extend the pipeline with additional commands and arguments. For example, you can do the following:

Specify the partition of the incoming data for your pipeline to process. See Partitions for more information.
Write an SPL2 statement that defines what data to further process, how to process it, and where to send the processed data.
Add any pipeline processes, such as filtering and masking, field hashing, field extraction, or generating metrics from logs.

For more information on adding arguments and commands to your pipeline, see the Process data using pipelines chapter of this manual.

Preventing data loss

Each pipeline filters the incoming data by a specified source type, host, or source, and processes only the data that meets that criterion. Any data that does not meet the criterion, such as data associated with a different source type, is excluded from the pipeline. If the Ingest Processor doesn't have an additional pipeline that accepts the excluded data, that data is either routed to the default destination or dropped.

As a best practice for preventing unwanted data loss, make sure to always have a default destination for your Ingest Processor pipeline. Otherwise, all unprocessed data is dropped. See Partitions to learn about what qualifies as unprocessed data.
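The behavior described in this section can be sketched as a small Python model. This is an illustration only, not Ingest Processor code or an API. The Pipeline class, the route function, and the equality-only partition check are hypothetical names and simplifications introduced for this example. The sketch lists every pipeline whose condition matches an event. How the service handles overlapping partitions is outside the scope of the model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pipeline:
    """Hypothetical stand-in for a pipeline and its partition condition."""
    name: str
    field: str  # "sourcetype", "host", or "source"
    value: str  # assumption: only an equality check is modeled


def route(event: dict, pipelines: List[Pipeline],
          default_destination: Optional[str]) -> str:
    """Return where this toy model sends a single event."""
    # Pipelines whose partition condition the event satisfies.
    matching = [p.name for p in pipelines if event.get(p.field) == p.value]
    if matching:
        return "pipeline:" + ",".join(matching)
    # No pipeline accepts the event: use the default destination if one exists.
    if default_destination is not None:
        return "default:" + default_destination
    # No pipeline and no default destination: the event is lost.
    return "dropped"


pipelines = [Pipeline(name="p1", field="sourcetype", value="cisco:asa")]
print(route({"sourcetype": "cisco:asa"}, pipelines, None))
print(route({"sourcetype": "syslog"}, pipelines, "my_default_destination"))
print(route({"sourcetype": "syslog"}, pipelines, None))
```

In this model, the last call is the data loss case: the event matches no partition and no default destination is configured.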

Prerequisites

Before starting to create a pipeline, confirm the following:

If you want to partition your data by source type, the source type of the data that you want the pipeline to process is listed on the Source types page of your tenant.
The destination that you want the pipeline to send data to is listed on the Destinations page of your tenant. If your destination is not listed, then you must add that destination to your tenant. See Add or manage destinations for more information.

Steps

Complete these steps to create a pipeline that receives data associated with a specific source type, host, or source, optionally processes that data, and sends it to a destination.

Navigate to the Pipelines page, then select New pipeline and then Ingest Processor pipeline.
On the Get started page, select Blank pipeline and then Next.
On the Define your pipeline's partition page, do the following:

Select how you want to partition the incoming data that you send to your pipeline. You can partition by source type, host, or source.
Enter the conditions for your partition, including the operator and the value. Your pipeline will receive and process the incoming data that meets these conditions.
Select Next to confirm the pipeline partition.

(Optional) On the Add sample data page, enter or upload sample data for generating previews that show how your pipeline processes data.
The sample data must be in the same format as the actual data that you want to process. See Getting sample data for previewing data transformations for more information.
Select Next to confirm any sample data that you want to use for your pipeline.
On the Select a metrics destination page, select the name of the destination that you want to send metrics to.
(Optional) If you selected Splunk Metrics store as your metrics destination, specify the name of the target metrics index where you want to send your metrics.
On the Select a data destination page, select the name of the destination that you want to send logs to.

(Optional) If you selected a Splunk platform destination, you can configure index routing:

Select one of the following options in the expanded destination panel:

Option	Description
Default	The pipeline does not route events to a specific index. If the event metadata already specifies an index, then the event is sent to that index. Otherwise, the event is sent to the default index of the Splunk Cloud Platform deployment.
Specify index for events with no index	The pipeline only routes events to your specified index if the event metadata does not already specify an index.
Specify index for all events	The pipeline routes all events to your specified index.

If you selected Specify index for events with no index or Specify index for all events, then from the Index name drop-down list, select the name of the index that you want to send your data to.
If your desired index is not available in the drop-down list, then confirm that the index is configured to be available to the tenant and then refresh the connection between the tenant and the Splunk Cloud Platform deployment. For detailed instructions, see Make more indexes available to the tenant.

If you're sending data to a Splunk Cloud Platform deployment, be aware that the destination index is determined by a precedence order of configurations. See How does Ingest Processor know which index to send data to? for more information.
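The three index routing options in the table above can be modeled with a short Python function. This is a sketch under stated assumptions, not Splunk code. The resolve_index function, the option constants, and the index names are hypothetical. The sketch covers only the pipeline-level option and does not model the other configurations in the precedence order.

```python
from typing import Optional

# Hypothetical labels for the three options in the destination panel.
DEFAULT = "Default"
IF_NO_INDEX = "Specify index for events with no index"
ALL_EVENTS = "Specify index for all events"


def resolve_index(event_index: Optional[str], option: str,
                  specified_index: Optional[str],
                  deployment_default: str) -> str:
    """Return the index that this toy model sends an event to."""
    if option == DEFAULT:
        # Keep the index from the event metadata, else use the deployment default.
        return event_index if event_index else deployment_default
    if specified_index is None:
        raise ValueError("this option requires an index name")
    if option == IF_NO_INDEX:
        # Only events without an index are routed to the specified index.
        return event_index if event_index else specified_index
    if option == ALL_EVENTS:
        # Every event is routed to the specified index.
        return specified_index
    raise ValueError("unknown option: " + option)


# Placeholder index names, chosen only for illustration.
print(resolve_index("web", DEFAULT, None, "main"))
print(resolve_index(None, IF_NO_INDEX, "security", "main"))
print(resolve_index("web", ALL_EVENTS, "security", "main"))
```

The difference between the last two options shows up only for events that already carry an index: the second option leaves them alone, and the third overrides them.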

Select Done to confirm the data destination.
After you complete the on-screen instructions, the pipeline builder displays the SPL2 statement for your pipeline.
(Optional) To process the incoming data before sending it to a destination, on the SPL2 editor page, add processing commands to the SPL2 statement. You can do that by selecting the plus icon next to Actions and selecting a data processing action, or by typing SPL2 commands and functions directly in the editor. For information about the supported SPL2 syntax, see Ingest Processor pipeline syntax.
The pipeline builder includes the SPL to SPL2 conversion tool, which you can use to convert SPL into SPL2 that is valid for Ingest Processor pipelines. See SPL to SPL2 Conversion tool in the SPL2 Search Reference.
(Optional) Select the Preview Pipeline icon to generate a preview that shows what the sample data looks like when it passes through the pipeline.
To save your pipeline, do the following:

Select Save pipeline.
In the Name field, enter a name for your pipeline.
(Optional) In the Description field, enter a description for your pipeline.
Select Save. The pipeline is now listed on the Pipelines page, and you can now apply it, as needed.

To apply this pipeline, do the following:

Navigate to the Pipelines page.
In the row that lists your pipeline, select the Actions icon, and then select Apply/Remove.
Select the pipelines that you want to apply, and then select Save. It can take a few minutes to finish applying your pipeline. During this time, all applied pipelines enter the Pending status.
(Optional) To confirm that the Ingest Processor has finished applying your pipeline, navigate to the Ingest Processor page and check if all affected pipelines have returned to the Healthy status.

Your applied pipelines can now process and route data as specified in the pipeline configuration.
