Splunk® Common Information Model Add-on

Common Information Model Add-on Manual

This documentation does not apply to the most recent version of Splunk® Common Information Model Add-on. For documentation on the most recent version, go to the latest release.

Troubleshoot adaptive response actions in search head cluster deployments on Splunk Cloud Platform

Issue

The adaptive response framework displays error messages on Splunk Cloud Platform (SCP) search head cluster (SHC) deployments when using Common Information Model (CIM) Add-on versions 5.0.2 and earlier. Errors occur on Splunk Cloud Platform deployments that use the CIM Add-on and on Splunk Enterprise Security deployments that bundle the CIM Add-on.

If you are a Splunk Cloud Platform customer, you can configure your Splunk Cloud Platform Enterprise Security search head with an API key, which allows you to authenticate from the KV Store collection and Common Action Model (CAM) queue. The CAM adaptive response relay worker is installed on-prem and configured to communicate with Splunk Cloud Platform using the Common Information Model. For more information, see Configure your Splunk Cloud Platform ES search head with an API key.

The on-prem CAM relay worker polls the Splunk Cloud Platform CAM queue every 60 seconds to check whether an alert action exists in the queue. If an alert action exists in the CAM queue, the CAM relay worker runs the alert action. The adaptive response framework displays "500 Server Error" messages when the on-prem CAM relay worker connects to Splunk Cloud Platform. For example:

2022-07-15 09:52:59,874+0000 ERROR pid=16227 tid=MainThread file=relaymodaction.py:run:328 | Failed to fetch results: 500 Server Error: Internal Server Error for url: https://customer-gsoc.splunkcloud.com:8089/services/alerts/modaction_queue/peek/LOG-HF09.mycustomer.com@@cff33f3c137b6af7faecc825381fdeb73841964d

Adaptive response action errors cause a delay between the time when the alert is sent to the queue and the time when the on-prem CAM relay worker dequeues the alert. For example, if an on-prem CAM relay worker tries to connect to Splunk Cloud Platform every 60 seconds and there is an 18 minute delay, this implies that the CAM relay worker connected to Splunk Cloud Platform successfully only after 18 attempts.
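
The arithmetic behind this estimate can be sketched in Python. The function name and its parameters are illustrative only and are not part of the CIM Add-on; the sketch assumes one connection attempt per polling interval.

```python
def attempts_for_delay(delay_seconds: int, poll_interval_seconds: int = 60) -> int:
    """Return how many polling attempts fit into an observed dequeue delay."""
    if poll_interval_seconds <= 0:
        raise ValueError("poll_interval_seconds must be positive")
    # One attempt happens per polling interval, so the number of attempts
    # is the delay divided by the interval, rounded down.
    return delay_seconds // poll_interval_seconds


# An 18 minute delay with the 60 second polling interval described above.
print(attempts_for_delay(18 * 60))  # 18
```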

The following architectural diagram depicts the process workflow for adaptive response actions in a search head cluster deployment on Splunk Cloud Platform:

[Figure: CIM adaptive response active components and process flow]

Cause

The connection between the modular action relay heavy forwarder and the Cloud stack causes the adaptive response framework failures within a search head cluster Cloud environment. When configuring the modular action relay, the remote search head URI is set using the format protocol://servername:port, which was initially intended to be the URL of a single search head.
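
As a minimal sketch of the protocol://servername:port format, the following Python snippet splits such a URI into its three parts with the standard library. The helper name parse_remote_uri and the example URI (the stack host name from this topic combined with port 8089) are assumptions for illustration, not values taken from the add-on's configuration.

```python
from typing import Tuple
from urllib.parse import urlsplit


def parse_remote_uri(uri: str) -> Tuple[str, str, int]:
    """Split a protocol://servername:port URI into (protocol, servername, port)."""
    parts = urlsplit(uri)
    if not parts.scheme or not parts.hostname or parts.port is None:
        raise ValueError(f"expected protocol://servername:port, got {uri!r}")
    return parts.scheme, parts.hostname, parts.port


# Hypothetical remote search head URI used only for illustration.
print(parse_remote_uri("https://important-impala-mym.stg.splunkcloud.com:8089"))
```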

In a search head cluster environment, this connection setting cannot be assigned to a static member within the search head cluster because all search head cluster members can generate adaptive response actions at any time. If the remote URI is set to a single search head within the search head cluster, the relay fails because it can process only the actions that are related to the search results on that static member of the cluster.

Search head cluster environments on Splunk Cloud Platform provide an alternative to designating a static search head. All cloud stacks are accessible using a load-balanced stack URL. Requests to this URL can be redirected to any member within the search head cluster. Typically, this stack URL is assigned as the remote search head URI on the modular action relay. When the URI is set to this generic stack URL, the modular action relay sends its requests through the load balancer. If the load balancer redirects the request to a member of the search head cluster that did not initiate the adaptive response action, the fetch request for search results fails.
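
The following Python sketch illustrates why fetch requests succeed only intermittently. It assumes, purely for illustration, that the load balancer routes requests round-robin across three members named sh1, sh2, and sh3; the actual routing policy and member names of a Cloud stack are not described in this topic.

```python
from itertools import cycle, islice
from typing import List


def fetch_outcomes(members: List[str], origin: str, attempts: int) -> List[bool]:
    """For each attempt, report whether the request reached the originating member.

    Assumption: the load balancer cycles through the members in order.
    A fetch succeeds only on the member that holds the search results.
    """
    routed = islice(cycle(members), attempts)
    return [member == origin for member in routed]


# Hypothetical cluster: the action originated on sh3.
print(fetch_outcomes(["sh1", "sh2", "sh3"], origin="sh3", attempts=6))
# [False, False, True, False, False, True]
```

Under this assumption, only the requests that happen to land on the originating member succeed, which matches the intermittent failures described above.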

Solution

Ensure that requests from the modular action relay's heavy forwarder are directed to the member of the search head cluster that initiated the adaptive response action. The search head that initiates the adaptive response action has the search results related to that action.

Adaptive response actions are created using searches that use the following format:
... | sendalert <ar-action-command>
These adaptive response actions are queued to the CAM queue and KV Store collection. Each entry contains a payload of an adaptive response action.

The following is an example of the payload for an adaptive response action:

{
        "app": "search",
        "owner": "admin",
        "results_file": "/opt/splunk/var/run/splunk/dispatch/scheduler__admin__search__RMD510d9054342d784cd_at_1664755380_283_E007D213-8F37-44C9-9663-8393A9765418/sendalert_temp_results.csv.gz",
        "results_link": "https://important-impala-mym.stg.splunkcloud.com:443/app/search/@go?sid=scheduler__admin__search__RMD510d9054342d784cd_at_1664755380_283_E007D213-8F37-44C9-9663-8393A9765418",
        "search_uri": "/servicesNS/admin/search/saved/searches/danny-2",
        "server_host": "sh-i-0a554cea1f83c1c7e",
        "server_uri": "https://127.0.0.1:8089",
        "session_key": "KEqwK4a44mUOAQk_apYg3pH4ePQvgRQDK9dWeTGr3K69HWqLWIhkR8RmAVsphDt04AyV9W^HnjUsy5hHV5Zq1H28fLyM6r5Zbq8EkmMOFO^25uxR_9e5rDfra1tFQMyloEu76l7sCKs0IlVkp7YNmzmA0qHWuaoa3f3pXkDTgtImLzURXgJTnl5qYh3Js6XA3sYYsvw_qEfGQGL8DP_rfkEuIV9C8EGwAmwTYnL3pC",
        "sid": "scheduler__admin__search__RMD510d9054342d784cd_at_1664755380_283_E007D213-8F37-44C9-9663-8393A9765418",
        "search_name": "danny-2",
        "configuration": {
            "_cam": "{\n    \"category\":          [\"Information Gathering\"],\n    \"task\":              [\"scan\"],\n    \"subject\":           [\"device\"],\n    \"technology\":        [{\"vendor\": \"Operating System\", \"product\": \"Utility\"}],\n    \"supports_adhoc\":    true,\n    \"supports_cloud\":    true,\n    \"supports_workers\":  true,\n    \"field_name_params\": [\"param.host_field\"],\n    \"required_params\":   [\"param.host_field\"]\n}",
            "_cam_workers": "[\"hf1\"]",
            "host_field": "src",
            "index": "main",
            "max_results": "5",
            "verbose": "0"
        }
}

In this example, consider the following fields:

  • results_link
  • server_host

The modular action relay uses the URL in the results_link field directly to retrieve the search results related to the adaptive response action. In search head cluster environments on Splunk Cloud Platform, the URL in the results_link field typically points to the Cloud stack's generic URL, such as https://important-impala-mym.stg.splunkcloud.com.

The server_host field contains the name of the search head on which the adaptive response action originates, such as sh-i-0a554cea1f83c1c7e.

The URL in the results_link field shares the same domain name as the URI for the modular action relay's remote search head.
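
As a minimal sketch, the following Python snippet reads these two fields from a payload and extracts the stack host name from results_link. The payload here is a trimmed copy of the example above that keeps only the two fields of interest; the variable names are illustrative.

```python
import json
from urllib.parse import urlsplit

# Trimmed copy of the example payload, keeping only the two fields discussed.
payload_json = """
{
    "results_link": "https://important-impala-mym.stg.splunkcloud.com:443/app/search/@go?sid=scheduler__admin__search__RMD510d9054342d784cd_at_1664755380_283_E007D213-8F37-44C9-9663-8393A9765418",
    "server_host": "sh-i-0a554cea1f83c1c7e"
}
"""

payload = json.loads(payload_json)

# Host name of the generic, load-balanced stack URL.
stack_host = urlsplit(payload["results_link"]).hostname
# Search head cluster member on which the action originated.
origin_member = payload["server_host"]

print(stack_host)
print(origin_member)
```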

To ensure that requests from the modular action relay's heavy forwarder are directed to the appropriate member in the search head cluster, the URL for the search head must be a combination of the server_host and the results_link fields. This URL is included in the Splunk_SA_CIM/bin/relaymodaction.py file.

For example:

https://sh-i-0a554cea1f83c1c7e.important-impala-mym.stg.splunkcloud.com:443/...
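
One way to build such a URL is sketched below in Python. This is not the code from the patched relaymodaction.py file; the function name member_results_link and the shortened sid value example_sid are hypothetical and serve only to show how the two fields combine.

```python
from urllib.parse import urlsplit, urlunsplit


def member_results_link(results_link: str, server_host: str) -> str:
    """Prefix the stack host name in results_link with the originating member."""
    parts = urlsplit(results_link)
    host = parts.hostname
    if host is None:
        raise ValueError(f"results_link has no host name: {results_link!r}")
    # Leave the host alone if it already targets the member.
    if not host.startswith(server_host + "."):
        host = f"{server_host}.{host}"
    # Keep the original port, if one was given.
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# "example_sid" is a placeholder, not a real search ID.
link = "https://important-impala-mym.stg.splunkcloud.com:443/app/search/@go?sid=example_sid"
print(member_results_link(link, "sh-i-0a554cea1f83c1c7e"))
```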

On the remote heavy forwarder, update the Splunk_SA_CIM/bin/relaymodaction.py file within the Common Information Model Add-on by deploying a patch that expects the domain name within the URL of the results_link field to be the same as the domain name used in the remote search head URI setting for the relay modular action.
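
The domain name expectation described here can be illustrated with a short Python check. This is a sketch of the condition only, not the patch itself; the function name same_domain and the port 8089 in the example URI are assumptions.

```python
from urllib.parse import urlsplit


def same_domain(results_link: str, remote_uri: str) -> bool:
    """Return True if results_link and the remote search head URI share a host name."""
    link_host = urlsplit(results_link).hostname
    remote_host = urlsplit(remote_uri).hostname
    return link_host is not None and link_host == remote_host


# "example_sid" is a placeholder, not a real search ID.
results_link = "https://important-impala-mym.stg.splunkcloud.com:443/app/search/@go?sid=example_sid"
print(same_domain(results_link, "https://important-impala-mym.stg.splunkcloud.com:8089"))
```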


Deploy the patch

The example in these steps reproduces an environment that uses the default adaptive response command set, such as the ping command.
See also: Set up an Adaptive Response relay from a Splunk Cloud Platform Enterprise Security search head to an on-premises device

Prerequisite

  • Ensure that the modular action relay is disabled on the heavy forwarder.

Follow these steps to deploy the patch on the remote heavy forwarder:

  1. In the heavy forwarder's file system, add the patch file: Splunk_SA_CIM/bin/relaymodaction.py
  2. Check the CAM queue using the Lookup Editor to ensure that unprocessed adaptive response actions are available.
  3. Enable the modular action relay to restart processing.
  4. Run the following search on a member in the search head cluster to check the processing status of the modular action relay:

    index=_internal source="/opt/splunk/var/log/splunk/python_modular_input.log"

    This displays log entries in the following format:

    <timestamp> INFO pid=3953 tid=MainThread file=relaymodaction.py:fetch_results:172 | Successfully fetched results_file content for action with key hf1@@cb57310f74a54f72fb695c5962e45801998b88ba for worker hf1

    Each successful fetch entry is followed by a successful dequeue entry:

    <timestamp> pid=3953 tid=MainThread file=relaymodaction.py:dequeue:200 | Successfully dequeued action with key hf1@@cb57310f74a54f72fb695c5962e45801998b88ba for worker hf1

    After the modular action relay runs successfully a few times, the number of entries for adaptive response actions in the CAM queue should decrease, and the queue should be empty when the runs are complete.
  5. Run the following search on each member of the search head cluster using the Search app to generate adaptive response actions:

    | makeresults count=1 | eval src="10.20.30.40",user="SPLUNKTEST" | sendalert ping param.index=main param._cam_workers="[\"hf1\"]" param.max_

    Alternatively, you can convert the search into a scheduled saved search to automatically generate results.
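
To cross-check the fetch and dequeue log entries from step 4 outside of Splunk, a small script can pair them by action key. The following Python sketch assumes the message text matches the two sample entries shown in step 4; the function name pending_dequeues is hypothetical, and the timestamp placeholders are copied from the samples.

```python
import re
from typing import Iterable, Set

# Matches the message text of the two sample log entries from step 4.
KEY_PATTERN = re.compile(
    r"Successfully (?P<event>fetched results_file content for|dequeued) "
    r"action with key (?P<key>\S+) for worker (?P<worker>\S+)"
)


def pending_dequeues(log_lines: Iterable[str]) -> Set[str]:
    """Return action keys that were fetched but have no matching dequeue entry."""
    fetched: Set[str] = set()
    dequeued: Set[str] = set()
    for line in log_lines:
        match = KEY_PATTERN.search(line)
        if match is None:
            continue  # Not a fetch or dequeue success entry.
        target = dequeued if match.group("event") == "dequeued" else fetched
        target.add(match.group("key"))
    return fetched - dequeued


sample = [
    "<timestamp> INFO pid=3953 tid=MainThread file=relaymodaction.py:fetch_results:172 | Successfully fetched results_file content for action with key hf1@@cb57310f74a54f72fb695c5962e45801998b88ba for worker hf1",
    "<timestamp> pid=3953 tid=MainThread file=relaymodaction.py:dequeue:200 | Successfully dequeued action with key hf1@@cb57310f74a54f72fb695c5962e45801998b88ba for worker hf1",
]
print(pending_dequeues(sample))
```

An empty result means that every fetched action in the supplied lines was also dequeued.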

Last modified on 05 April, 2023
 

This documentation applies to the following versions of Splunk® Common Information Model Add-on: 5.0.2, 5.1.0

