Troubleshoot the Data Stream Processor
Review this topic if you are having difficulties with the Splunk Data Stream Processor.
Support
To report bugs or receive additional support, do the following.
- Ask questions and get answers through community support on the Splunk Community page.
- Submit a case using the Splunk Support Portal.
- Contact Splunk Customer Support.
When contacting Splunk support, provide the following information.
- Pipeline ID
- Pipeline name
- Your build ID, located in Help & Feedback > About in the Data Stream Processor UI.
- Summary of the problem and any additional relevant information
Generate a diagnostic file for troubleshooting
Splunk Support might request diagnostic files to help troubleshoot your DSP environment. You can generate a diagnostic report from the master node of your cluster by running sudo ./report. This command creates a dsp-report-<timestamp>.tar.gz file in your working directory. The report contains all DSP application logs as well as system and monitoring logs.
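The following is a minimal sketch of generating the report and confirming that the archive was created. Only the sudo ./report command comes from the procedure above; the ls step is simply a convenient way to locate the output file.

```
# On the master node, run the report script from your DSP working directory.
sudo ./report

# The archive is written to the working directory; list it to confirm.
ls -lh dsp-report-*.tar.gz
```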
Share the SPL2 of your pipeline with Splunk Support
You can use the SPL2 Pipeline Builder UI to send the full SPL2 of your pipeline to someone who isn't a member of your tenant. Splunk Support might ask you to do this to help troubleshoot your pipeline.
- From the Data Management page, click on the pipeline that you want to get the SPL2 for.
- (Optional) If this pipeline is currently active, click Edit to enter the Canvas view.
- From the Canvas view, click on the SPL button to toggle to the SPL2 Pipeline Builder.
- Copy the SPL2 and send it to Splunk Support.
You can now share your pipeline with people who aren't in your tenant.
Output of aggregate function is delayed
When previewing or sending data through a pipeline that contains an aggregate function, you might notice slow or no data output past the aggregate function.
Causes and solutions
The following table lists possible causes and solutions for the delay in aggregate output.
Cause | Solution |
---|---|
The volume of data you are sending is too low. | Send more data to your pipeline. |
If you are using either of the Read from Amazon Kinesis Stream source functions, you might be using too many shards in your Kinesis stream. | Decrease the number of shards in your Kinesis stream in your AWS console. |
If you are using either the Read from Apache Kafka or Read from Apache Kafka with SSL source functions, you might be using too many Kafka partitions. | Lower the parallelism of the Flink job by setting the consumer property dsp.flink.parallelism in the Read from Apache Kafka function to a lower value, as shown in the example after this table. The dsp.flink.parallelism setting defaults to the number of Kafka partitions available in the Kafka topic that you are reading from. |
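For example, to cap the Flink job's parallelism, you might add a consumer property like the following to the Read from Apache Kafka function. The value 4 is illustrative; choose a parallelism appropriate for your topic and environment.

```
dsp.flink.parallelism = 4
```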
Cannot log in to SCloud
When you log in to SCloud, you might see the following error.
error: failed to get session token: failed to get valid response from csrfToken endpoint: failed to get valid response from csrfToken endpoint: parse <ipaddr>:31000/csrfToken: first path segment in URL cannot contain colon
Cause
You are using standard HTTP authentication instead of HTTPS.
Solution
Confirm that your auth-url and host-url include https. See Get Started with SCloud.
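As an illustration, with the default ~/.scloud settings file, correctly configured URLs might look like the following sketch. The host is a placeholder for your environment, your file can contain additional settings, and the exact file location and key names depend on your SCloud version.

```
cat ~/.scloud
{
    "auth-url": "https://<DSP_HOST>:31000",
    "host-url": "https://<DSP_HOST>:31000"
}
```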
Duplicate events
After sending data to Splunk Enterprise or a supported third-party platform, you notice that your data contains duplicate events.
Cause
The Data Stream Processor guarantees at least once delivery of your data, and duplicates can occur. If a failure causes the Data Stream Processor to stop while processing data, upon restart, your data is reprocessed to ensure that no data is lost. This may result in some data being duplicated in your sinks.
Solution
This is expected behavior. For performance reasons and to minimize duplicate events, the best practice is to have as few pipelines as possible delivering the same events.
Unexpected fields in the attributes field
When previewing or sending data with the DSP event or metrics schema, you might see unexpected fields in attributes.
Cause
Your data has the DSP event or metrics schema, but the top-level fields in your data do not have the expected data types. For example, if you changed the timestamp field from Long to String, then the String timestamp is inserted as an attribute named timestamp and the current time is used for the timestamp field.
Solution
Make sure that the following reserved field names match the expected data types. A casting example follows the table.
Field name | Expected data type |
---|---|
timestamp | long |
nanos | integer |
id | string |
host | string |
source | string |
source_type | string |
attributes | map |
kind | string |
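For example, to coerce a timestamp field back to the expected type, you could add an Eval function to your pipeline. The following is a minimal sketch that assumes your DSP version provides the ucast casting function and that the incoming value can be cast to a long; check the scalar functions reference for your version, and use a parsing function instead if the value arrives as a numeric string.

```
| eval timestamp=ucast(timestamp, "long", null)
```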
The Data Stream Processor UI shows deprecated components
The Data Stream Processor UI shows deprecated components.
Cause
The Data Stream Processor contains deprecated functions for beta users. These functions are labeled as deprecated in the UI.
Solution
Use the supported functions instead.
The Data Stream Processor shows that my data is making it through my pipeline, but I can't find my data in my Splunk Index
The Monitoring Console provides prebuilt dashboards with detailed topology and performance information about your Splunk Enterprise deployment. See About the Monitoring Console.
HTTP Event Collector dashboards
The Monitoring Console comes with prebuilt dashboards for monitoring the HTTP Event Collector. To interpret the HTTP Event Collector dashboard information panels correctly, be aware that the Data Received and Indexed panel shows data as "indexed" even when the data is sent to a deleted or disabled index. The HEC dashboards show the data that is acknowledged by the indexer acknowledgment feature, even if that data isn't successfully indexed.
For more information about the specific HTTP event collector dashboards, see HTTP Event Collector dashboards.
The HTTP event collector dashboards show all indexes, even if they are disabled or have been deleted.
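If you suspect this is happening, one quick check is a standard Splunk search against the destination index; the index name and time range here are placeholders. If the search returns nothing while the dashboards report the data as indexed, verify in Splunk Web that the target index exists and is enabled.

```
index=<your_index> earliest=-15m | stats count
```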
Use metrics to troubleshoot your data pipeline
Use the metrics feature to troubleshoot your data pipeline.
If you don't see data flowing into your pipeline, check the metrics on your source function to see whether it is sending any data out. Similarly, you can use the metrics feature on different streaming functions to see where in the pipeline your records might be getting dropped.
Check the status of a pipeline
Once a pipeline is activated, you can check its status in the Status column on the Data Management page.
Your pipeline can have the following statuses.
- NOT_SUBMITTED: The pipeline has never been activated.
- DEACTIVATED: The pipeline has been activated before but is currently deactivated.
- CREATED: The pipeline is newly started, but no task has started to run.
- RUNNING: Some tasks are scheduled, running, or finished.
- FAILING: The pipeline failed and is waiting for backend cleanup to complete.
- FAILED: The pipeline failed with a non-recoverable task failure.
- RESTARTING: The pipeline is being restarted.
- CANCELLING: The pipeline is being deactivated.
If a pipeline is in the FAILING, FAILED, or RESTARTING state, the system continues to retry the pipeline until it succeeds. If a pipeline is continuously cycling through these states, try one of the following:
- Check that your functions are configured correctly. Your pipeline can be in a FAILING, FAILED, or RESTARTING state if one of your functions is misconfigured. For example, a Kafka source function might be configured with the wrong Kafka broker name.
- Wait for the pipeline to be migrated to a different node in the system. This happens occasionally, and the system recovers the pipeline automatically.
This documentation applies to the following version of Splunk® Data Stream Processor: 1.1.0