Splunk® SOAR (On-premises)

Administer Splunk SOAR (On-premises)

This documentation does not apply to the most recent version of Splunk® SOAR (On-premises). For documentation on the most recent version, go to the latest release.

Add or remove a cluster node from Splunk SOAR (On-premises)

A Splunk SOAR (On-premises) cluster can have nodes added or removed after the cluster has been created.

Splunk SOAR (On-premises) does not have the ability to automatically scale, or automatically add or remove cluster nodes through external systems such as Kubernetes, AWS, or Azure.

Adding cluster nodes

Adding a node to a Splunk SOAR (On-premises) cluster involves building an instance of Splunk SOAR (On-premises) and using the make_cluster_node command on that instance to add it to the cluster.

For more information, see Install and Upgrade Splunk SOAR (On-premises).

Removing Splunk SOAR (On-premises) cluster nodes

You might need to remove a node from a Splunk SOAR (On-premises) cluster, for example to reduce your cluster size, to decommission or replace hardware, or as part of disaster recovery.

Splunk SOAR (On-premises) releases 5.0.1 through 5.2.1 require you to work with Splunk Support to remove cluster nodes from your Splunk SOAR (On-premises) cluster. Removing cluster nodes is irreversible.

Identify the cluster nodes to remove from your cluster

Each node you want to remove must meet the following requirements before being removed from your cluster.

  • The node to be removed has already been removed from your load balancer configuration.
  • The node to be removed is still listed in the cluster_node table of the Splunk SOAR PostgreSQL database.
  • One of the following is true for the node to be removed:
    • All Splunk SOAR services on the node have been permanently stopped.
    • The node has been destroyed.

Before you can remove cluster nodes from your Splunk SOAR (On-premises) cluster, you need to know which nodes are identified as Consul "server" or "client" nodes and make sure that all RabbitMQ nodes are set to "disc" mode.

  • If you are removing multiple nodes from your cluster at one time, it is best to remove nodes listed as Consul clients before removing nodes listed as Consul servers.
    • Consul clients do not vote in the leader elections, making them better suited to being removed first.
    • Splunk SOAR (On-premises) clusters need a minimum of three Consul server nodes to elect a leader. A cluster with fewer than three Consul server nodes cannot elect a leader until the minimum of three is restored.
  • In Splunk SOAR (On-premises) clusters larger than three nodes, some RabbitMQ nodes will run in "RAM" mode. You must set all RabbitMQ cluster nodes to "disc" mode before proceeding. See The role of RabbitMQ in An overview of the Splunk SOAR (On-premises) clustering feature.
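
The three-server minimum described above can be checked before you remove anything. The following is a minimal sketch, not part of the product. It assumes that phenv consul members prints the usual Consul columns (Node, Address, Status, Type, and so on). The function names and the sample output, including the fourth client node, are hypothetical.

```shell
# Count the Consul server nodes reported as alive, reading the output of
# `phenv consul members` on standard input.
# Assumption: column 3 is Status and column 4 is Type.
count_alive_servers() {
    awk 'NR > 1 && $3 == "alive" && $4 == "server" { n++ } END { print n + 0 }'
}

# Succeed only if the cluster would still have at least three Consul
# servers after the removal.
# $1 = alive servers now, $2 = servers you plan to remove
enough_servers_after_removal() {
    [ $(( $1 - $2 )) -ge 3 ]
}

# Hypothetical sample output, for illustration only.
sample_members='Node  Address  Status  Type  Build  Protocol  DC  Segment
453b9bdc-624c-4425-b5ca-00f892d8a365  10.1.66.157:8301  alive  server  1.8.4  2  dc1  <all>
37451330-8b00-4712-bf38-62dcc64e4509  10.1.66.246:8301  alive  server  1.8.4  2  dc1  <all>
77dc18a1-e8a9-4564-aa96-72337d9cbc1e  10.1.65.191:8301  alive  server  1.8.4  2  dc1  <all>
hypothetical-client-node  10.1.66.10:8301  alive  client  1.8.4  2  dc1  <default>'

servers=$(printf '%s\n' "$sample_members" | count_alive_servers)
echo "alive Consul servers: $servers"
if enough_servers_after_removal "$servers" 1; then
    echo "removing 1 server keeps the minimum"
else
    echo "removing 1 server would leave fewer than 3"
fi
```

On a real cluster you would pipe the live command into the function instead of the sample text, for example phenv consul members | count_alive_servers.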

Do these steps to identify the cluster node or nodes you need to remove from your cluster.

  1. Using SSH, connect to any node in your Splunk SOAR (On-premises) cluster.
  2. Determine which nodes are Consul "servers" or Consul "clients."
    phenv consul members
  3. Get and record the IP addresses and GUIDs of your Splunk SOAR (On-premises) cluster nodes. You will need both the IP address and the GUID of each node in later steps of the node-removal process.
    phenv python -m manage dbshell -- -c "select host, guid from cluster_node;"

    Example

    [phantom@localhost ~]$ phenv python -m manage dbshell -- -c "select host, guid from cluster_node;"
        host     |                 guid
    -------------+--------------------------------------
     10.1.66.157 | 453b9bdc-624c-4425-b5ca-00f892d8a365
     10.1.66.246 | 37451330-8b00-4712-bf38-62dcc64e4509
     10.1.65.191 | 77dc18a1-e8a9-4564-aa96-72337d9cbc1e
    (3 rows)
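
Because later steps need the GUID that belongs to a particular IP address, it can help to look it up from the table output rather than copy it by hand. The following is a minimal sketch under the assumption that the output has the two-column layout shown in the example above. The function name guid_for_host is hypothetical.

```shell
# Print the GUID recorded for one host IP address, reading the table output
# of the "select host, guid from cluster_node;" query on standard input.
# Exits non-zero if the IP address is not listed.
guid_for_host() {
    awk -F'|' -v ip="$1" '
        NF == 2 {
            # Strip the padding around both columns.
            gsub(/[ \t]/, "", $1)
            gsub(/[ \t]/, "", $2)
            if ($1 == ip) { print $2; found = 1 }
        }
        END { exit (found ? 0 : 1) }
    '
}

# The example output from this topic, reused as sample input.
sample_rows='    host     |                 guid
-------------+--------------------------------------
 10.1.66.157 | 453b9bdc-624c-4425-b5ca-00f892d8a365
 10.1.66.246 | 37451330-8b00-4712-bf38-62dcc64e4509
 10.1.65.191 | 77dc18a1-e8a9-4564-aa96-72337d9cbc1e
(3 rows)'

printf '%s\n' "$sample_rows" | guid_for_host 10.1.66.246
```

The non-zero exit status for an unknown IP address lets a wrapper script stop before it runs any delete statement with an empty GUID.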
    

Procedure for removing a Splunk SOAR (On-premises) node

To remove a cluster node, follow these steps.

  1. Obtain the IP address and the GUID of the cluster node you want to remove from your Splunk SOAR (On-premises) cluster.
  2. Prevent the cluster from routing ingestion and automation actions to the cluster node you want to remove. If the cluster node has already been destroyed, skip this step.
    1. Log in to the Splunk SOAR (On-premises) web-based user interface as a user with the administrator role.
    2. From the Home menu, select Administration, then Product Settings, then Clustering.
    3. Locate the cluster node you want to remove in the list of nodes. Set the Enabled toggle switch for that node from On to Off. If the cluster node already displays Offline or is already set to Off, skip this step.
  3. Using SSH, connect to the cluster node you want to remove. If the cluster node has already been destroyed, skip this step.
  4. From the command line, stop SOAR services on the cluster node. If the cluster node has already been destroyed, skip this step.
     <$PHANTOM_HOME>/bin/stop_phantom.sh 
  5. SSH to a Splunk SOAR (On-premises) cluster node that will remain in your cluster.
  6. Delete records from the container_node_affinity table for the cluster node you want to remove.
    <$PHANTOM_HOME>/bin/phenv python -m manage dbshell -- -c "delete from container_node_affinity where node_affinity='<guid>';"
  7. Delete records from the asset_node_affinity table for the cluster node you want to remove.
    <$PHANTOM_HOME>/bin/phenv python -m manage dbshell -- -c "delete from asset_node_affinity where node_affinity='<guid>';"
  8. Delete records from the system_health table for the cluster node you want to remove.
    <$PHANTOM_HOME>/bin/phenv python -m manage dbshell -- -c "delete from system_health where node='<guid>';"
  9. Remove the node from the RabbitMQ cluster.
    <$PHANTOM_HOME>/bin/phenv rabbitmqctl forget_cluster_node rabbit@<node_ip>
  10. Remove the node from the Consul cluster.
    <$PHANTOM_HOME>/bin/phenv consul force-leave <guid> 
  11. Remove the cluster node from your load balancer's configuration. For steps on removing a server from your load balancer's configuration, see the documentation for your load balancer.
  12. Remove the node from the cluster_node table.
    <$PHANTOM_HOME>/bin/phenv python -m manage dbshell -- -c "delete from cluster_node where guid='<guid>';"
  13. Destroy or otherwise deprovision the cluster node that has been removed.

    Splunk SOAR (On-premises) must not be restarted on that deprovisioned cluster node. Restarting Splunk SOAR (On-premises) on the deprovisioned node can interfere with the functioning of the other cluster nodes.

  14. Repeat these steps for each cluster node you want to remove from your Splunk SOAR (On-premises) cluster.
  15. On all remaining cluster nodes, edit the file <$PHANTOM_HOME>/etc/consul/config.json to remove all references to the removed nodes from the retry_join block.
  16. Verify cluster membership is as expected.
    <$PHANTOM_HOME>/bin/phenv show_cluster_state

    The management command phenv show_cluster_state continues to show Consul-related information for recently removed cluster nodes for up to 72 hours after their removal, until Consul purges its references to those nodes. This is normal and expected.
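
Because removal is irreversible, it can be useful to print the commands for one node and review them before running anything. The following is a minimal dry-run sketch, not a supported tool. It only prints the commands from steps 6 through 10 and step 12 and runs none of them. The function names are hypothetical, the <$PHANTOM_HOME> placeholder is kept as written in the procedure, and the sketch assumes the GUID is passed to SQL as a single-quoted string literal.

```shell
# Succeed if the argument looks like a GUID (8-4-4-4-12 lowercase hex digits).
is_guid() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
}

# Print, but do not run, the cleanup commands for one node.
# $1 = node IP address, $2 = node GUID
print_removal_commands() {
    ip=$1
    guid=$2
    if ! is_guid "$guid"; then
        echo "not a valid GUID: $guid" >&2
        return 1
    fi
    # Placeholder from the procedure; substitute your own install path.
    bin='<$PHANTOM_HOME>/bin'
    # Steps 6 to 8: one delete per table, written as table:column pairs.
    for spec in container_node_affinity:node_affinity \
                asset_node_affinity:node_affinity \
                system_health:node; do
        table=${spec%%:*}
        column=${spec#*:}
        printf '%s/phenv python -m manage dbshell -- -c "delete from %s where %s='\''%s'\'';"\n' \
            "$bin" "$table" "$column" "$guid"
    done
    # Steps 9 and 10: RabbitMQ and Consul membership.
    printf '%s/phenv rabbitmqctl forget_cluster_node rabbit@%s\n' "$bin" "$ip"
    printf '%s/phenv consul force-leave %s\n' "$bin" "$guid"
    # Step 12: run only after the load balancer change in step 11.
    printf '%s/phenv python -m manage dbshell -- -c "delete from cluster_node where guid='\''%s'\'';"\n' \
        "$bin" "$guid"
}

print_removal_commands 10.1.65.191 77dc18a1-e8a9-4564-aa96-72337d9cbc1e
```

Rejecting anything that is not GUID-shaped before building the delete statements keeps a mistyped or empty value from reaching the database shell. The load balancer change in step 11 and the retry_join edit in step 15 are left out because they depend on your environment.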

Last modified on 31 July, 2023

This documentation applies to the following versions of Splunk® SOAR (On-premises): 5.2.1

