
Troubleshoot the Edge Processor solution
Review this page if you are having difficulties with sending data through the Edge Processor solution. If the problem that you're experiencing is not described on this page, you can find more information by doing the following:
- Review the list of known issues in the product. See Known issues.
- Check the logs associated with your Edge Processor. See View logs for the Edge Processor solution.
If the problem persists, contact your Splunk representative for assistance. To help expedite the support process, you can generate a diagnostic report and send it to your Splunk representative.
Generate a diagnostic report for an Edge Processor instance
You can run the edge_diagnostic tool to generate a diagnostic report for a specific Edge Processor instance. The diagnostic report contains logs about the performance and activity of the Edge Processor instance. Include this report when contacting your Splunk representative for assistance.
- On the host machine of your Edge Processor instance, download the package containing the edge_diagnostic tool.
curl -o 'edge-diagnostic-linux.tar.gz' 'https://beam.scs.splunk.com/acies/diagnostic/edge-diagnostic-linux.tar.gz'
- Extract the edge_diagnostic tool into <install_directory>, where <install_directory> is the installation directory of the Edge Processor instance.
tar -xf edge-diagnostic-linux.tar.gz -C <install_directory>
- Navigate to the installation directory.
cd <install_directory>
- Validate the edge_diagnostic tool by running the following command and confirming whether the returned value matches the expected value:
sha256sum edge_diagnostic_linux_amd64
The expected value is:
b7ccb5592f22b259bebd03250085eee07779be1442ec5aa5dd2603529d79b081
If these values don't match, the edge_diagnostic tool is invalid. This problem can occur if the package download did not complete as expected. If this occurs, delete the package and the extracted edge_diagnostic tool, and then try again from step 1.
- Use the edge_diagnostic tool to generate a diagnostic report. By default, the tool generates a file named edge-diag-<host_name>-<timestamp>.tar.gz, where <host_name> is the host name of the machine that the Edge Processor instance is running on and <timestamp> is a timestamp indicating when the file was generated. You can use the -out option to specify a different file name.
- To generate a diagnostic report using the default settings, run this command:
./edge_diagnostic_linux_amd64
- To specify a different output file name, run this command, where <file_name> is the name that you want to use:
./edge_diagnostic_linux_amd64 -out <file_name>.tar.gz
The edge_diagnostic tool generates a diagnostic report file in the current directory. While the tool is running, it returns INFO logs about its status.
- (Optional) If any unexpected behavior occurs with the edge_diagnostic tool, you can get more information about the status of the tool by running it with the -verbose option. This option causes the tool to print DEBUG level logs about its status while generating the diagnostic report.
./edge_diagnostic_linux_amd64 -verbose
When contacting your Splunk representative for assistance, send them a copy of the generated file.
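The validation step in the procedure above can be scripted so that the two checksums are not compared by eye. The following is a minimal sketch, assuming a POSIX shell with sha256sum and awk available on the host. The verify_sha256 function name is hypothetical and is not part of the edge_diagnostic tool.

```shell
#!/bin/sh
# Expected checksum, copied from the validation step above.
EXPECTED_SHA256="b7ccb5592f22b259bebd03250085eee07779be1442ec5aa5dd2603529d79b081"

# verify_sha256 FILE EXPECTED (hypothetical helper)
# Returns 0 only when the SHA-256 digest of FILE equals EXPECTED.
verify_sha256() {
    actual=$(sha256sum "$1" 2>/dev/null | awk '{print $1}')
    [ -n "$actual" ] && [ "$actual" = "$2" ]
}

# Run the check only if the extracted tool is in the current directory.
if [ -f edge_diagnostic_linux_amd64 ]; then
    if verify_sha256 edge_diagnostic_linux_amd64 "$EXPECTED_SHA256"; then
        echo "checksum OK"
    else
        echo "checksum mismatch: delete the package and the extracted tool, then download again" >&2
    fi
fi
```

Run the script from <install_directory> after extracting the package. A mismatch means you should repeat the procedure from the download step.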
An Edge Processor is not detecting my forwarder
If the Data sources pane in the detailed view of your Edge Processor does not include the data from your forwarder, or the pane displays a "No data sources to display" message, check the time range that's specified in the Metrics drop-down list. Confirm that your forwarders were sending data during that time window. If necessary, increase the time range and try again.
If data sources are still missing, then on the machine where your forwarder is installed, confirm the following:
- The outputs.conf file defines a target group for the Edge Processor.
- The outputs.conf, inputs.conf, and props.conf files don't define any advanced routing or filtering settings that would prevent data from being forwarded to the target group for the Edge Processor.
- The forwarder is running and has an "active forward" sending data to the Edge Processor. To confirm this, run this command from the $SPLUNK_HOME/bin directory:
splunk list forward-server
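If you check many forwarders, you can script the review of this command's output. The sketch below assumes that the command prints an "Active forwards:" section followed by a "Configured but inactive forwards:" section, which is an assumption about the output format and not something this page states. The has_active_forward function name and the host and port in the usage comment are hypothetical.

```shell
#!/bin/sh
# has_active_forward TARGET (hypothetical helper)
# Reads the output of "splunk list forward-server" on standard input and
# returns 0 when TARGET appears in the "Active forwards:" section.
# Assumption: the output uses the two section headings matched below.
has_active_forward() {
    awk -v target="$1" '
        /^Active forwards:/                   { active = 1; next }
        /^Configured but inactive forwards:/  { active = 0; next }
        active && index($0, target)           { found = 1 }
        END { exit !found }
    '
}

# Example usage (hypothetical Edge Processor host and port):
#   "$SPLUNK_HOME/bin/splunk" list forward-server | has_active_forward "edge-host.example.com:9997"
```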
An Edge Processor instance that was previously "Healthy" is now "Disconnected"
When viewing details about an Edge Processor, you notice that an instance that previously had the Healthy status now has the Disconnected status.
Cause
The Edge Processor service lost contact with that instance. This problem can happen for a variety of reasons, such as the host machine of the instance going down or the communication between the instance and the Edge Processor service getting blocked.
This problem can also happen if you remove an Edge Processor instance from its host machine using any method other than the uninstallation command provided in the Edge Processor service. In this case, the service fails to detect that the instance has been removed.
Solution
To determine the root cause of the problem, check the logs for the supervisor that's associated with the disconnected Edge Processor instance.
- Log in to the host machine of the disconnected Edge Processor instance.
- Navigate to the <install_directory>/var/log directory, where <install_directory> is the installation directory of the Edge Processor instance.
- Review the logs in the supervisor.log file.
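When the supervisor.log file is long, it can help to start with the most recent lines that look like problems. The following is a rough sketch that assumes problem lines contain the words "error", "warn", or "fatal". This page does not describe the log format, so treat the keyword list as a starting point. The recent_problem_lines function name is hypothetical.

```shell
#!/bin/sh
# recent_problem_lines FILE [COUNT] (hypothetical helper)
# Prints the last COUNT lines (default 20) of FILE that mention "error",
# "warn", or "fatal" in any letter case, each prefixed with its line number.
recent_problem_lines() {
    grep -n -i -E 'error|warn|fatal' "$1" | tail -n "${2:-20}"
}

# Example usage, replacing <install_directory> with your installation directory:
#   recent_problem_lines "<install_directory>/var/log/supervisor.log" 50
```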
If the Edge Processor instance was removed using a method other than the uninstallation command, then you need to reinstall the instance and then uninstall it using the appropriate command. Using the command provided in the Edge Processor service allows the service to detect this change and stop listing the instance in the Edge Processor details.
- Reinstall the instance:
- On the Edge Processors page, select the Edge Processor that has the Disconnected instance.
- In the panel that contains your Edge Processor details, select Manage instances.
- Select the Install/uninstall tab, and then expand the Step 1: Run commands to install/uninstall instances section.
- Select Install to view the commands for downloading and installing an Edge Processor instance on a Linux machine, and then select Copy to clipboard.
- On the machine that previously hosted the removed instance, open a command-line interface in a directory of your choice and then paste and run the commands. When the installation is complete, you will see the following message:
splunk-edge.service - Splunk edge starter
Loaded: loaded (/etc/systemd/system/splunk-edge.service, enabled)
Active: active (running)
This command contains sensitive information about your cloud environment. Do not share this command with anyone except your Splunk representative or trusted members in your organization.
- Verify that your instance is healthy by checking its status on the Edge Processors page or in the detailed view of your Edge Processor. It may take up to 1 minute for the status to change to Healthy.
- Uninstall the instance using the command provided in the Edge Processor service:
- In the Manage instances panel for your Edge Processor, on the Install/uninstall tab, expand the Step 1: Run commands to install/uninstall instances section and then select Uninstall.
- To copy the command for uninstalling an instance from a Linux machine, select Copy to clipboard.
- On the host machine where you reinstalled the instance, open a command-line interface in a directory of your choice and then paste and run the command.
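For the reinstall step in the procedure above, you can confirm from the command line that the splunk-edge.service unit named in the installation message is running before you check the status in the Edge Processor service. The following is a minimal sketch, assuming a systemd-based host. The wait_until function name is hypothetical, and the check covers only the local service, not the Healthy status reported in the Edge Processor service.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...] (hypothetical helper)
# Re-runs COMMAND about once per second until it succeeds.
# Returns 1 if COMMAND has still not succeeded after TIMEOUT_SECONDS attempts.
wait_until() {
    remaining=$1
    shift
    while ! "$@"; do
        remaining=$((remaining - 1))
        [ "$remaining" -le 0 ] && return 1
        sleep 1
    done
    return 0
}

# Example usage: wait up to 60 seconds for the local service to become active.
#   wait_until 60 systemctl is-active --quiet splunk-edge.service && echo "service is running"
```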
An Edge Processor instance is in the "Warning" status
When viewing details about an Edge Processor, you notice that an instance is in the Warning status.
Cause
This Edge Processor instance is consuming more resources than what is acceptable based on the CPU threshold and Memory threshold settings configured in the Edge Processor service.
Solution
First, confirm that the CPU threshold and Memory threshold settings are configured to allow a reasonable amount of resource usage. The recommended amount is 80% of the total allocated resources.
- On the Edge Processors page, select Global settings.
- Select the Other settings tab.
- Check the values specified in the CPU threshold and Memory threshold fields.
- If you want to adjust those settings, do the following:
- Select Edit.
- In the CPU threshold field, enter the percentage of the total allocated CPU power that an Edge Processor instance can use before it enters the Warning state.
- In the Memory threshold field, enter the percentage of the total allocated memory that an Edge Processor instance can use before it enters the Warning state.
- Select Save to save your changes, and then select Edge Processors to return to the Edge Processor management page.
If the problem persists, then try the following solutions.
Reduce CPU usage per instance by scaling up the Edge Processor
Add more instances to the Edge Processor. Having more instances associated with the Edge Processor reduces the amount of CPU processing power that any one instance needs to consume.
For more information, see Add more instances to an Edge Processor.
Reinstall the instance on a machine that has more memory
Install a new Edge Processor instance on a host machine that has more memory available. Then, uninstall the instance that has the Warning status. Make sure to update your data sources to start sending data to the new instance and stop sending data to the uninstalled instance.
For more information, see Add more instances to an Edge Processor and Uninstall an Edge Processor instance.
An Edge Processor instance is in the "Error" status
When viewing details about an Edge Processor, you notice that an instance is in the Error status.
Cause
The Error status indicates that something is wrong with the Edge Processor instance, but it is still in contact with the Edge Processor service. This problem can occur for a variety of reasons. For example, this status can occur if the Edge Processor is configured incorrectly, or if an internal component is stuck in a restart loop.
To get more information about the root cause of an Error status, view the logs for the instance. The logs are located on the host machine of the instance in <install_directory>/var/log/edge.log, where <install_directory> is the installation directory of the Edge Processor instance.
For example, the following messages in the logs indicate that TLS or global settings for your Edge Processors are not configured correctly:
Log message | Cause of problem |
---|---|
failed to read client private key from path open <PEM_file_path>: no such file or directory | The private key file that you specified in the Server private key field is invalid. |
tls: private key does not match public key | The private key and server certificate files that you specified in the Server private key and Server certificate fields are not part of the same key pair. |
connecting socket: Connection refused | The port you configured to use is not available. |
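To check an edge.log file for the messages in the table above without reading it line by line, you can search for each message. The following is a minimal sketch that covers only the three messages listed in the table. The explain_edge_log function name is hypothetical.

```shell
#!/bin/sh
# explain_edge_log FILE (hypothetical helper)
# Prints one line for each message from the table above that appears in FILE.
# Prints nothing when none of those messages are present.
explain_edge_log() {
    if grep -q 'failed to read client private key from path' "$1"; then
        echo "Server private key: the specified private key file is invalid"
    fi
    if grep -q 'tls: private key does not match public key' "$1"; then
        echo "TLS: the private key and server certificate are not part of the same key pair"
    fi
    if grep -q 'connecting socket: Connection refused' "$1"; then
        echo "Port: the configured port is not available"
    fi
}

# Example usage, replacing <install_directory> with your installation directory:
#   explain_edge_log "<install_directory>/var/log/edge.log"
```

An empty result does not mean that the instance is healthy. It only means that none of these three messages were found.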
Solution
The solution varies depending on the root cause of the problem.
For example, if the logs for the instance indicate that TLS has not been configured correctly, then review the TLS settings for your Edge Processor and re-upload the private key or certificate files. If the logs indicate a misconfigured port, configure your settings to use an available port. For more information, see Obtain TLS certificates for data sources and Edge Processors and Add an Edge Processor.
An Edge Processor instance is stuck in the "Pending" status
When viewing details about an Edge Processor, you notice that an instance has been in the Pending status for longer than 2 minutes. The Pending status is supposed to change to the Healthy status once the Edge Processor service finishes applying configuration changes to the instance, which typically takes a few minutes.
Cause
The Edge Processor service is not working as expected. Something is preventing the service from completing the configuration changes on your instance.
Solution
Contact your Splunk representative for assistance.
My data is missing from a destination
If you suspect that data is missing, first confirm that the Edge Processor is sending the volume of data that you expect.
- Log in to your cloud tenant.
- Navigate to the Data management page.
- Select View debug logs to troubleshoot the system and then select the Run icon.
- Confirm that there haven't been any errors recently that would lead to data loss.
- If there were no recent errors, navigate to the Edge Processors page.
- Select the Edge Processor that you want. The Edge Processor details page opens.
- Confirm that the Outbound data shown in the Pipelines section matches the data volume that you expect.
If the data volume shown does not match what you expect, then check your pipeline configurations to make sure that you are not accidentally filtering out data that you want to keep. If the data volume shown matches what you expect but you cannot find some data, follow these troubleshooting steps.
Cause: The Edge Processor queue has filled up
Edge Processors currently provide no data delivery guarantees. However, to help prevent data loss, the Edge Processor holds data in a queue if the destination that it is sending data to is unavailable or if it receives more data than it can send. If the queue fills up before the destination is available again, then data loss still occurs as the Edge Processor starts dropping any additional data that it receives. To avoid losing data due to a full queue, you can increase the size of the queue.
Queued data is stored on the hard drive of the Edge Processor host. By default, the queue is configured to hold up to 10000 batches of data. The amount of data contained in each batch and how quickly the queue fills up varies depending on the rate at which the Edge Processor is receiving data. For example, if the Edge Processor receives a large amount of data over a short period of time, then the queue fills up quickly with larger batches of data. Typically, a queue size of 10000 holds at least 3 minutes of processing data.
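As a rough starting point for choosing a new queue size, you can scale the figure above in proportion to the outage duration that you want to cover. The sketch below assumes that queue capacity grows linearly with queue size, which this page does not state. Batch sizes vary with the inbound data rate, so treat the result as an estimate only. The estimate_queue_size function name is hypothetical.

```shell
#!/bin/sh
# estimate_queue_size MINUTES (hypothetical helper)
# Scales the baseline from this page (10000 batches for about 3 minutes)
# linearly to MINUTES and rounds up to a whole number of batches.
# Assumption: capacity scales linearly with queue size.
estimate_queue_size() {
    echo $(( ($1 * 10000 + 2) / 3 ))
}

estimate_queue_size 3    # prints 10000
estimate_queue_size 10   # prints 33334
```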
If you want to adjust the queue size, perform the following steps.
- In a browser, log in to the Splunk Cloud Console.
- Click More Options and select Settings.
- Click the Copy to clipboard icon beside the Access Token field.
- From the command line, set the following environment variables.
- Set TOKEN to be the access token that you copied in step 3.
- Set TENANT to be the name of your tenant.
- Set API_URL to be https://<tenant>.api.scs.splunk.com, where <tenant> is the name of your tenant.
- Set DATASET_NAME to be the name of the destination dataset that you'd like to modify the queue for. For example, if you'd like to increase the size of the queue for your connected Splunk Cloud deployment, then this would be the name of the destination dataset associated with the connected Splunk Cloud deployment.
- Run the following command to modify the queue size. Replace <updated_queue_size> with the maximum number of data batches that you want the queue to hold.
curl --location --request PATCH "$API_URL/$TENANT/search/v3alpha1/datasets/$DATASET_NAME" \
--header "Authorization: Bearer $TOKEN" \
--header 'Content-Type: application/json' \
--data-raw "{ \"sendQueueSize\": <updated_queue_size> }"
- Refresh your pipelines so that they use the updated sendQueueSize setting. See Refresh a pipeline for more information.
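The environment variable and PATCH steps above can be wrapped in a small script that checks its inputs before sending the request. The following is a minimal sketch, not an official tool. It assumes that API_URL already includes the https:// scheme, as described in the environment variable step, and the is_positive_integer, queue_patch_body, and set_queue_size function names are hypothetical.

```shell
#!/bin/sh
# is_positive_integer VALUE (hypothetical helper)
# Returns 0 when VALUE consists only of digits and is greater than zero.
is_positive_integer() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -gt 0 ]
}

# queue_patch_body SIZE (hypothetical helper)
# Prints the JSON request body for the PATCH request, or returns 1
# when SIZE is not a positive integer.
queue_patch_body() {
    is_positive_integer "$1" || return 1
    printf '{ "sendQueueSize": %s }\n' "$1"
}

# set_queue_size SIZE (hypothetical helper)
# Sends the PATCH request. Requires the TOKEN, TENANT, API_URL, and
# DATASET_NAME environment variables described in the steps above.
set_queue_size() {
    for name in TOKEN TENANT API_URL DATASET_NAME; do
        eval "value=\${$name:-}"
        [ -n "$value" ] || { echo "$name is not set" >&2; return 1; }
    done
    body=$(queue_patch_body "$1") || {
        echo "queue size must be a positive integer" >&2
        return 1
    }
    curl --fail --location --request PATCH \
        "$API_URL/$TENANT/search/v3alpha1/datasets/$DATASET_NAME" \
        --header "Authorization: Bearer $TOKEN" \
        --header 'Content-Type: application/json' \
        --data-raw "$body"
}

# Example usage: set_queue_size 20000
```

Validating the queue size first avoids sending a malformed request body. After the request succeeds, you still need to refresh your pipelines.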
Cause: You are sending data to an index that doesn't exist yet
If you've specified a destination index that doesn't exist in the Splunk Cloud Platform, then your data is sent to the index that you specified in the lastchanceindex setting in your indexes.conf file.
Cause: You are sending data to an index that you don't have permission to view
Your permissions can vary depending on the index privileges. Perform the following steps to update your permissions so you can view the index that data is being sent to.
- Work with your Splunk administrator to add the index to the list of indexes that your role has permission to search. See Create and manage roles with Splunk Web in the Securing Splunk Cloud Platform manual.
- Refresh the connection between your cloud tenant and your Splunk Cloud Platform deployment. See Make more indexes available to the tenant.
My data is not being processed as expected
When you try to preview a pipeline, the preview results area displays a "No results" message or data that looks incorrect.
Alternatively, when you view the data that was sent from a pipeline to a destination, you notice that the data looks incorrect.
Cause
Reasons why a pipeline might not process data as expected include, but are not limited to, the following:
- The inbound stream of data is not being broken into events correctly. Data must be pre-processed into distinct events before being processed by a pipeline.
- The pipeline is not configured correctly.
Solution
First, make sure that event breaking and merging has been configured correctly for the source type of the data that you want to process.
- Navigate to the Source types page.
- Look for a source type with a name that matches the value of the sourcetype field in the data that you want to process.
- If the source type exists, select it to view its configuration details. Confirm that the event breaking and merging behavior is configured correctly for the data that you want to process.
- If the source type does not exist, then add it to the Edge Processor service.
For more information about the configuration settings for source types, see Add source types for Edge Processors.
- Navigate to the Pipelines page.
- On the Pipelines page, in the row that lists the pipeline you want to verify, select the Actions icon and then select Edit.
- Select the Data tab, and confirm that the $source option is set to the source type from step 2.
- If the configuration of the source type was updated after you applied your pipeline to your Edge Processor, then you need to refresh your pipeline to make sure that it uses the latest source type configuration. See Refresh a pipeline for more information.
If the problem persists after you've verified the source type configuration, then complete the following steps to verify that the processing commands in your pipeline are configured correctly.
- If you don't already have your pipeline open for editing, do the following:
- On the Actions tab, select the Edit icon in the Use data from $source action.
- Select View and edit sample data.
- Enter or upload sample data that matches the inbound data that you want this pipeline to process, and then select Apply. You can use text strings that represent raw data or CSV values that represent parsed, field-extracted data. See Getting sample data for previewing data transformations for more information.
- To generate a preview of what your data looks like after being processed by the pipeline, select the Preview pipeline icon.
- Verify that the preview results match how you want the pipeline to process your data. If the results do not match, or the preview cannot be generated, then make sure that the SPL2 statement of your pipeline is written correctly and contains only supported SPL2 commands. See Edge Processor pipeline syntax for more information.
When I send data to the Splunk platform, chunks of data from different events are intertwined together after indexing
When you search your Splunk platform deployment for data that you routed through an Edge Processor, the search returns events that incorrectly contain chunks of data from different events. This problem occurs even though your pipelines are configured correctly and the pipeline previews show data that looks correct.
Cause
This problem occurs if the host_segment attribute is configured in the inputs.conf file for multiple forwarders, and the host_segment settings are causing those forwarders to use the same host value. Typically, data from different forwarders is associated with different host values.
If your Edge Processors are routing data from multiple forwarders and the data from those forwarders is associated with the same host, source, and sourcetype values, then indexers treat that data as pieces of the same event and the data is intertwined as a result.
Solution
For each forwarder that is sending data to an Edge Processor, open the inputs.conf file and check the host_segment setting. Make sure that none of your forwarders have been configured to send data using the same host value.
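To find the host_segment settings on a forwarder quickly, you can search its inputs.conf files. The following is a minimal sketch that assumes a grep implementation supporting the -H option, such as GNU or BSD grep. The list_host_segments function name is hypothetical, and the path in the usage comment is only one possible location, because apps on the forwarder can have their own inputs.conf files.

```shell
#!/bin/sh
# list_host_segments FILE... (hypothetical helper)
# Prints "file:line_number:setting" for every host_segment setting that is
# not commented out in the given inputs.conf files.
list_host_segments() {
    grep -H -n -E '^[[:space:]]*host_segment[[:space:]]*=' "$@"
}

# Example usage (one possible location of inputs.conf):
#   list_host_segments "$SPLUNK_HOME/etc/system/local/inputs.conf"
```

Compare the results with the monitored paths on each forwarder to see whether two forwarders would derive the same host value.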
For more information, see the following pages:
- Set the event host with the host_segment attribute in the Splunk Cloud Platform Getting Data In manual.
- inputs.conf in the Splunk Enterprise Admin Manual.
An Edge Processor fails to send data through HEC and logs a 403 error
Your Edge Processor has a pipeline that sends data to a Splunk platform HEC destination, but that data is not arriving in the Splunk platform as expected. When you view the logs in the <install_directory>/var/log/edge.log file on the host machine of the Edge Processor, you see an error message containing the HTTP 403 code. For example:
"error":"HTTP 403 \"Forbidden\""
For information about other error codes and messages returned by HEC endpoints, see Possible error codes in the Getting Data In manual.
Cause
The HEC token that the Edge Processor is using to send data to the Splunk platform is invalid. Reasons why a HEC token might be invalid include, but are not limited to, the following:
- The token is turned off in the Splunk platform.
- The token was entered incorrectly in the original HTTP request or in the Splunk platform HEC destination settings.
Solution
First, confirm which HEC token your Edge Processor is using to send the data:
- If the data was originally transmitted to the Edge Processor through an HTTP request that specifies a HEC token in the Authorization header, then this token is used when the Edge Processor sends the data to the Splunk platform.
- Otherwise, the HEC token specified in the Splunk platform HEC destination is used.
Then, confirm the status of the HEC token in the Splunk platform deployment:
- Log in to the Splunk platform deployment where the HEC token is configured.
- In Splunk Web, select Settings, then Data inputs.
- Select HTTP Event Collector.
- Confirm that the HTTP Event Collector page lists your HEC token, and that the status of the token is Enabled.
- If the status of the token is Disabled, then turn it on by selecting Enable in the Actions column.
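You can also test a HEC token from the command line by sending one small event directly to the HEC endpoint of your Splunk platform deployment. The following is a hedged sketch: the probe_hec_token and explain_hec_status function names are hypothetical, the URL in the usage comment is a placeholder, and a successful probe indexes one test event in your deployment.

```shell
#!/bin/sh
# explain_hec_status CODE (hypothetical helper)
# Maps an HTTP status code returned by a HEC endpoint to a short hint.
explain_hec_status() {
    case "$1" in
        200) echo "token accepted" ;;
        403) echo "token invalid or disabled" ;;
        *)   echo "unexpected HTTP status $1" ;;
    esac
}

# probe_hec_token URL TOKEN (hypothetical helper)
# Sends one small test event to the HEC endpoint at URL and prints the
# HTTP status code of the response.
probe_hec_token() {
    curl --silent --output /dev/null --write-out '%{http_code}' \
        --header "Authorization: Splunk $2" \
        --data '{"event": "edge processor HEC token check"}' \
        "$1"
}

# Example usage (placeholder host, port, and token):
#   explain_hec_status "$(probe_hec_token 'https://<hec_host>:<hec_port>/services/collector/event' '<hec_token>')"
```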
An Edge Processor fails to send data through HEC and logs an "Incorrect index" error
Your Edge Processor has a pipeline that sends data to a Splunk platform HEC destination, but that data is not arriving in the Splunk platform as expected. When you view the logs in the <install_directory>/var/log/edge.log file on the host machine of the Edge Processor, you see an error message containing the phrase Incorrect index.
For information about other error codes and messages returned by HEC endpoints, see Possible error codes in the Getting Data In manual.
Cause
The HEC token that the Edge Processor is using to send data to the Splunk platform doesn't have access to the destination index specified for the data.
Solution
First, confirm which HEC token your Edge Processor is using to send the data:
- If the data was originally transmitted to the Edge Processor through an HTTP request that specifies a HEC token in the Authorization header, then this token is used when the Edge Processor sends the data to the Splunk platform.
- Otherwise, the HEC token specified in the Splunk platform HEC destination is used.
Then, update the index permission settings on the HEC token:
- Log in to the Splunk platform deployment where the HEC token is configured.
- In Splunk Web, select Settings, then Data inputs.
- Select HTTP Event Collector.
- Select the token that your Edge Processor is using to send data.
- In the Select Allowed Indexes control, select remove all for the Selected indexes pane. When no indexes are selected, the HEC token allows data to be sent to any index in the Splunk platform deployment.
- Select Save.
Destinations associated with the connected Splunk Cloud Platform deployment are not working as expected
When you try to send data to an Index destination or a Splunk platform S2S destination that has the Tenant paired property in the Kind field, you encounter errors and the Edge Processor fails to send data to that destination.
Cause
Destinations that have the Index value or the Tenant paired property in the Kind field are available to Edge Processors through a connection named scpbridge. Reasons why these destinations might not work as expected include, but are not limited to, the following:
- The scpbridge connection settings are incorrect. This scenario can occur if the credentials of the service account have changed since the last time the connection was updated, or if the Splunk Cloud Platform deployment has been updated in a way that changes its connection information.
- An index that was previously available as part of this connection has been deleted or changed in the Splunk Cloud Platform deployment.
For more information about the scpbridge connection, see First-time setup instructions for the Edge Processor solution and Send data from Edge Processors to the Splunk Cloud Platform deployment connected to your tenant.
Solution
To verify or update your scpbridge connection settings, complete the following steps.
- In your cloud tenant, select the Settings icon and then select Manage connection.
- Confirm that the Hostname, Management port, and Service account username values are correct for the Splunk Cloud Platform deployment that you want your cloud tenant to be connected to.
- Confirm that the password for the service account hasn't been changed since the scpbridge connection was last updated.
- If you need to update any connection settings, do the following:
If you are unable to send data from an Edge Processor to an index that is associated with the scpbridge connection, make sure that the index is still available in the connected Splunk Cloud Platform deployment and accessible by the scpbridge connection.
- Log in to the connected Splunk Cloud Platform deployment using your admin credentials.
- Confirm that the index has not been deleted from the deployment.
- Confirm that the service account used by the scpbridge connection has permission to access the index:
- In the Settings menu, in the Users and authentication section, select Roles.
- In the row that lists the role used by your service account, select Edit > Edit.
This role and service account were created during the initial setup of the Edge Processor solution. See First-time setup instructions for the Edge Processor solution for more information.
- On the 3. Indexes tab, make sure that the Included check box is selected for your index. If that check box is not already selected, then select it and select Save.
- In your cloud tenant, refresh the connection to your Splunk Cloud Platform deployment:
When I try to delete the "scpbridge" connection, an error occurs
In the View connection dialog box that displays details about the scpbridge connection, when you select the Delete icon, the Edge Processor service returns an error message indicating that the connection could not be deleted.
Cause
This error message appears because the scpbridge connection cannot be deleted after it is created. The Edge Processor solution uses this connection to store and read logs and metrics from Edge Processors, and cannot operate correctly without this connection.
For more information about the scpbridge connection, see First-time setup instructions for the Edge Processor solution and Send data from Edge Processors to the Splunk Cloud Platform deployment connected to your tenant.
Solution
You cannot delete the scpbridge connection. However, if necessary, you can update the connection settings to connect the Edge Processor solution to a different Splunk Cloud Platform deployment.
To update the scpbridge connection settings, do the following:
- In your cloud tenant, select the Settings icon and then select Manage connection.
- In the View connection dialog box, select the Edit icon.
- Update your connection settings as needed and then select Update connection.
Syslog data fails to reach an Edge Processor
Your Edge Processor is not receiving syslog data from your syslog data source. When you view the logs in the <install_directory>/var/log/edge.log file on the host machine of the Edge Processor, you see error messages from "input/tcp", "input/udp", or "input/syslog".
Cause
This problem occurs when you use the UDP transport protocol to send data. This is a known issue with the UDP transport protocol that is not unique to the Edge Processor solution. UDP does not provide any data guarantees, so when you try to send syslog data to an Edge Processor through UDP, the data might not be delivered successfully.
Solution
If possible, send your syslog data through the TCP transport protocol to validate the connection. If you must use UDP, keep sending syslog data until the data reaches the Edge Processor. See Configure your device to send syslog data to an Edge Processor for instructions.
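To validate the connection from a Linux machine, you can send a single test message over TCP or UDP. The following is a minimal sketch that assumes the logger command from util-linux is installed, which provides the --server, --port, --tcp, and --udp options. The send_test_syslog function name and the host and port in the usage comment are hypothetical.

```shell
#!/bin/sh
# send_test_syslog PROTOCOL HOST PORT [MESSAGE] (hypothetical helper)
# Sends one syslog message to HOST:PORT over "tcp" or "udp".
# Returns 2 when PROTOCOL is neither "tcp" nor "udp".
# Assumption: the util-linux version of logger is installed.
send_test_syslog() {
    case "$1" in
        tcp) transport_flag="--tcp" ;;
        udp) transport_flag="--udp" ;;
        *)
            echo "usage: send_test_syslog tcp|udp HOST PORT [MESSAGE]" >&2
            return 2
            ;;
    esac
    logger --server "$2" --port "$3" "$transport_flag" \
        "${4:-edge processor syslog connectivity test}"
}

# Example usage (hypothetical Edge Processor host and syslog port):
#   send_test_syslog tcp edge-host.example.com 514
```

If the TCP test message arrives but UDP messages do not, the connection itself works and the loss is likely related to the UDP transport.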
This documentation applies to the following versions of Splunk Cloud Platform™: 9.0.2209, 9.0.2303, 9.0.2305 (latest FedRAMP release)