Getting sample data for previewing data transformations
You can generate previews to see how your pipeline or source type configurations can change the incoming data. These previews are based on the sample data that you specify in the pipeline or source type.
For example, when editing a pipeline, you can provide Windows event logs as sample data and then generate previews that show how the processing commands in your pipeline transform Windows event logs. To specify sample data when editing a pipeline, select Sample data in the side panel of the pipeline builder.
Similarly, when editing a source type, you can provide Cisco syslog output as sample data and then generate previews that show how the event breaking and merging definitions in your source type preprocess cisco_syslog data into distinct events. To specify sample data when editing a source type, select Edit sample data.
If you don't specify sample data in your pipeline, but you configure the pipeline to receive data from a source type that has sample data, then the sample data from the source type is used when you preview the pipeline.
To generate accurate previews, you must provide sample data that has the same format as the actual data that you want to process.
Supported formats for sample data
You can generate pipeline previews using either raw data or parsed data that has values stored in event fields. Source type previews support raw data only.
Parsed data must be specified in CSV format, with the header containing the names of the event fields. If your sample events include the _time field, the values in that field must be ISO 8601 timestamps. Splunk software uses the _time field for its internal processes.
For example, if you want to preview a pipeline that's designed to process HTTP Event Collector (HEC) events that contain fields named severity and category, then you need to provide parsed data that has values stored in the severity and category fields:
_raw,_time,severity,category
Hello World,2023-04-24T13:00:05.105+0000,INFO,system
Unexpected failure,2023-04-24T13:25:48.128+0000,ERROR,system
Shutting down,2023-04-24T13:30:57.306+0000,INFO,system
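As a minimal sketch of how parsed sample data in this shape could be produced, the following Python script writes events to CSV with a header row and ISO 8601 values in the _time field. The helper names format_time and build_sample_csv are hypothetical and are not part of any Splunk tooling. The timestamp layout simply mirrors the example above.

```python
import csv
import io
from datetime import datetime, timezone


def format_time(dt):
    """Format a timezone-aware datetime like 2023-04-24T13:00:05.105+0000."""
    millis = dt.microsecond // 1000
    return f"{dt.strftime('%Y-%m-%dT%H:%M:%S')}.{millis:03d}{dt.strftime('%z')}"


def build_sample_csv(events, field_names):
    """Return CSV text whose header row lists the event field names."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=field_names, lineterminator="\n")
    writer.writeheader()
    for event in events:
        row = dict(event)  # copy so the caller's event is not modified
        row["_time"] = format_time(row["_time"])
        writer.writerow(row)
    return buffer.getvalue()


# Illustrative events that match the example above.
events = [
    {
        "_raw": "Hello World",
        "_time": datetime(2023, 4, 24, 13, 0, 5, 105000, tzinfo=timezone.utc),
        "severity": "INFO",
        "category": "system",
    },
    {
        "_raw": "Unexpected failure",
        "_time": datetime(2023, 4, 24, 13, 25, 48, 128000, tzinfo=timezone.utc),
        "severity": "ERROR",
        "category": "system",
    },
]

sample_csv = build_sample_csv(events, ["_raw", "_time", "severity", "category"])
print(sample_csv)
```

Using the csv module rather than joining strings by hand means that values containing commas or quotation marks are quoted correctly in the output.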
As another example, if you want to process raw data from a universal forwarder where the start and end of each event is delimited by a line break, then you must provide plain text strings that are on separate lines.
Wed Feb 14 2023 23:16:57 mailsv1 sshd: Failed password for apache from 126.96.36.199 port 3801 ssh2
Wed Feb 14 2023 15:51:38 mailsv1 sshd: Failed password for grumpy from 188.8.131.52 port 1244 ssh2
Mon Feb 12 2023 09:31:03 mailsv1 sshd: Failed password for invalid user guest from 184.108.40.206 port 2903 ssh2
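Before pasting line-delimited raw data as sample data, it can help to confirm that each event really sits on its own line. The following Python sketch is one possible check. It assumes that every event starts with a timestamp shaped like the one in the example above, and the names TIMESTAMP_PREFIX, split_line_delimited_events, and find_suspect_lines are hypothetical.

```python
import re

# Assumption: each event starts with a timestamp such as
# "Wed Feb 14 2023 23:16:57 ", as in the example above.
TIMESTAMP_PREFIX = re.compile(
    r"^[A-Z][a-z]{2} [A-Z][a-z]{2} \d{1,2} \d{4} \d{2}:\d{2}:\d{2} "
)


def split_line_delimited_events(raw_text):
    """Return one string per non-empty line of the raw sample text."""
    return [line for line in raw_text.splitlines() if line.strip()]


def find_suspect_lines(events):
    """Return lines that do not begin with the expected timestamp."""
    return [event for event in events if not TIMESTAMP_PREFIX.match(event)]


raw_text = (
    "Wed Feb 14 2023 23:16:57 mailsv1 sshd: Failed password for apache from 126.96.36.199 port 3801 ssh2\n"
    "Wed Feb 14 2023 15:51:38 mailsv1 sshd: Failed password for grumpy from 188.8.131.52 port 1244 ssh2\n"
    "Mon Feb 12 2023 09:31:03 mailsv1 sshd: Failed password for invalid user guest from 184.108.40.206 port 2903 ssh2\n"
)

events = split_line_delimited_events(raw_text)
suspects = find_suspect_lines(events)
print(f"{len(events)} events, {len(suspects)} suspect lines")
```

A line that is reported as suspect often indicates that two events were merged onto one line or that a single event was wrapped across several lines.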
Methods for getting sample data
The following are a few methods that you can use to get sample data for generating previews:
- Copy the sample data that is included with default source types. If the owner of the source type is system, then it is a default source type.
- Copy the sample data that is included with pipeline templates.
- Use Splunk Cloud Platform or Splunk Enterprise to search for relevant data, and then export the search results to a CSV file. For more information, see Export data using Splunk Web in the Splunk Cloud Platform Search Manual.
- Use the Search Experience to find relevant data from the Splunk Cloud Platform deployment that's connected to your tenant, and then copy values from the _raw field to use as sample data.
To copy results from the Search Experience in a format that you can immediately use as sample data for previews, do the following:
- Navigate to the Search page.
- Select a dataset that you want to search, and then select Apply.
- In the SPL Editor, enter a search statement written in SPL2. For more information about writing SPL2 search statements, see the Splunk Cloud Services SPL2 Search Manual.
- Select the Run icon to run your search. The events returned by your search statement appear in the search results panel.
- In the search results panel, hover over the header for the _raw field to make the Options for "_raw" icon appear. Select that icon to open the Options menu, and then select Copy field values.
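If you use the CSV export method listed earlier and need raw sample data instead of parsed data, one option is to pull the _raw column out of the exported file. The following Python sketch assumes that the export contains a _raw column. The function name raw_values_from_export and the sample export text are hypothetical.

```python
import csv
import io


def raw_values_from_export(csv_text):
    """Return the non-empty _raw values from exported CSV search results."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or "_raw" not in reader.fieldnames:
        raise ValueError("The export does not contain a _raw column.")
    return [row["_raw"] for row in reader if row["_raw"]]


# Hypothetical export content. In practice, read the text from the
# exported file, for example with open(path, newline="").read().
export_text = (
    "_raw,_time,host\n"
    '"sshd: Failed password for apache",2023-02-14T23:16:57.000+0000,mailsv1\n'
    '"sshd: Failed password for grumpy",2023-02-14T15:51:38.000+0000,mailsv1\n'
)

raw_events = raw_values_from_export(export_text)
print("\n".join(raw_events))
```

Printing one value per line gives line-delimited raw data. If a _raw value itself contains line breaks, this simple approach splits that event across several lines, so review the output before using it as sample data.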
This documentation applies to the following versions of Splunk Cloud Platform™: 9.0.2209, 9.0.2303, 9.0.2305 (latest FedRAMP release), 9.1.2308, 9.1.2312