Run playbooks in parallel with vertical scaling

Splunk Phantom supports vertical scaling for playbook execution.

The DECIDED daemon now spawns a number of runners when started. Each runner is a dedicated instance of either the Python 2 or Python 3 environment. The default is one runner for Python 2 and one runner for Python 3. If multiple runners are specified in Main Menu > Administration > Administration Settings > Playbook Execution, DECIDED spawns that many runners for each Python version when it starts.

Playbooks and custom functions are normally run by a single runner for their Python version in a single queue. A playbook or custom function must be completed before the runner executes the next playbook or custom function in sequence. By defining additional runners, playbooks and custom functions can be assigned to different runners to increase the number of playbooks which are executed at once.

When a playbook run starts, the DECIDED daemon assigns the playbook to an available runner for the correct Python version, as follows:

Runners that match a playbook's Python version are checked for availability.
Playbooks are assigned to matching, available runners in a round-robin fashion.
All the blocks of a playbook and child playbooks are run by the same runner.
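
The assignment rules above can be modeled with a short simulation. This is only an illustrative sketch of round-robin assignment per Python version. The Dispatcher class, its method names, and the runner labels are hypothetical and are not how DECIDED is implemented internally.

```python
class Dispatcher:
    """Toy model of per-version round-robin assignment (hypothetical, not DECIDED itself)."""

    def __init__(self, runners_per_version):
        # Example: {2: 1, 3: 2} means one Python 2 runner and two Python 3 runners.
        self.runners = {
            version: [f"py{version}-runner-{i}" for i in range(n)]
            for version, n in runners_per_version.items()
        }
        self.next_index = {version: 0 for version in runners_per_version}
        self.busy = set()

    def assign(self, python_version):
        """Return the next available runner for this Python version, or None if all are busy."""
        pool = self.runners[python_version]
        start = self.next_index[python_version]
        for offset in range(len(pool)):
            idx = (start + offset) % len(pool)
            runner = pool[idx]
            if runner not in self.busy:
                self.busy.add(runner)
                # Advance the pointer so the next playbook starts at the following runner.
                self.next_index[python_version] = (idx + 1) % len(pool)
                return runner
        return None  # No matching runner is free, so the playbook stays queued.

    def release(self, runner):
        """Mark a runner as available again once its playbook has finished."""
        self.busy.discard(runner)
```

In this model, a playbook and its child playbooks would all stay on the runner returned by a single call to assign, which mirrors the last rule in the list.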

Differences in behavior from a single runner vs. multiple Python runners

Some Splunk Phantom behaviors are different when multiple Python runners are enabled.

The playbook API save_data may return incorrect results when playbooks are run in parallel if the key:value pairs are not unique across playbook runs. Use the save_object() API instead of the save_data() API.
The playbook API save_object may return incorrect results if the same playbook is run against the same container multiple times simultaneously. Use the optional playbook_name and container_id parameters with save_object to make sure that saved objects are unique across multiple runs of the same playbook.
When the number of playbook runners is increased, some deployments may reach the maximum size defined for decided.log more quickly than with the default number of runners. Splunk Phantom administrators may want to increase the logrotate settings on their deployments to values higher than rotate 10 and size 50M.
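
To see why scoping saved objects matters, the following sketch uses an in-memory stand-in for a saved-object store. The ObjectStore class is hypothetical and is not the phantom module; only the parameter names playbook_name and container_id are taken from the text above. It shows how an unscoped key is overwritten by a second run, while scoped keys stay separate.

```python
class ObjectStore:
    """In-memory stand-in for a saved-object store (hypothetical; not the phantom module)."""

    def __init__(self):
        self._data = {}

    def save_object(self, key, value, playbook_name=None, container_id=None):
        # The scope parameters become part of the storage key.
        self._data[(key, playbook_name, container_id)] = value

    def get_object(self, key, playbook_name=None, container_id=None):
        return self._data.get((key, playbook_name, container_id))


store = ObjectStore()

# Unscoped: two parallel runs share one slot, so the second write wins.
store.save_object("verdict", "malicious")
store.save_object("verdict", "benign")
unscoped = store.get_object("verdict")

# Scoped: each run writes to its own slot.
store.save_object("verdict", "malicious", playbook_name="triage", container_id=101)
store.save_object("verdict", "benign", playbook_name="triage", container_id=102)
scoped_101 = store.get_object("verdict", playbook_name="triage", container_id=101)
scoped_102 = store.get_object("verdict", playbook_name="triage", container_id=102)
```

The playbook name "triage" and the container IDs are made-up values for illustration.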

Use local variables instead of global variables

When creating or editing playbook source code, use local variables rather than global ones. Values stored in a global variable may be modified by another instance of the playbook, a child playbook, or another process that uses the same variable, resulting in unexpected or incorrect results.

If you need to pass data between functions or playbooks, it is better to use function outputs, or to persist data using the save_object and the related get_object and clear_object APIs.
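
As a minimal illustration in plain Python, with no Splunk Phantom APIs involved, the sketch below contrasts a function that accumulates values in a module-level variable with one that keeps its state local and returns it as output. The function names and the shape of the artifact dictionaries are assumptions made for the example.

```python
# Risky pattern: module-level state shared by everything that runs in the same process.
_shared_ips = []

def collect_ips_global(artifacts):
    for artifact in artifacts:
        _shared_ips.append(artifact["ip"])
    return _shared_ips

# Preferred pattern: state is local and handed back as the function output.
def collect_ips_local(artifacts):
    ips = [artifact["ip"] for artifact in artifacts]
    return ips

run_a = [{"ip": "10.0.0.1"}]
run_b = [{"ip": "10.0.0.2"}]

collect_ips_global(run_a)
leaked = list(collect_ips_global(run_b))  # Also contains the value left behind by run_a.

clean = collect_ips_local(run_b)  # Contains only the values from run_b.
```

The second call to the global version sees data from the first call, which is the kind of cross-run interference the guidance above is meant to avoid.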

DECIDED settings for vertical scaling

Configure DECIDED for vertical scaling by changing the settings in Main Menu > Administration > Administration Settings > Playbook Execution.

Changing these settings restarts DECIDED and causes any running playbooks to fail. As a best practice, change these settings only when no playbooks are active.

Code Block Execution Time Out (in minutes): Set the number of minutes a playbook code block or custom function can run before DECIDED stops the runner and spawns a replacement runner. Any playbook that hits this limit without completing is marked as failed.

Number of Python 2 Runners: Set the number of Python 2 runners. The default and the minimum is one runner. The maximum is 10. This number of runners is created when DECIDED starts, and the runners remain active even if no playbooks or custom functions are assigned to them.

Number of Python 3 Runners: Set the number of Python 3 runners. The default and the minimum is one runner. The maximum is 10. This number of runners is created when DECIDED starts, and the runners remain active even if no playbooks or custom functions are assigned to them.

These settings can also be modified by using the REST API. See REST System Settings.

When to add more Python runners

By default, Splunk Phantom starts with one runner for Python 2 and one runner for Python 3, and is designed to support up to 10 runners for each Python version. Because every deployment is unique and many factors influence performance, there are no hard rules for when to increase the number of runners for your deployment, or by how many.

When deciding whether or not to add more runners, some factors that influence performance are:

Number and kind of actions performed in your playbooks.
Number of child playbooks or custom functions executed by playbooks.
Actions that require responses from assets or external services.
Available CPU resources.

If your Splunk Phantom deployment is queuing playbooks to run, and your hardware or virtual machine still has unused CPU capacity (such as idle cores or low core usage percentages), consider increasing the number of playbook runners.

Increase the number of runners for the Python version of your playbooks. For example, if most of your queued playbooks are in Python 2, increase the number of Python 2 runners.
Increase the number of runners by one, or by one for each Python environment, and measure performance before adding more runners. Repeat this until you achieve the desired performance gains, reach the maximum number of runners for each Python environment, or encounter resource limits.
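
The incremental approach above can be expressed as a small decision helper. This is a sketch under stated assumptions, not a Splunk Phantom tool: the function name, its inputs, and the 25 percent idle CPU threshold are hypothetical, and you would need to gather the queue depth and CPU figures from your own monitoring.

```python
MAX_RUNNERS = 10  # Maximum per Python version, as described in the settings above.

def recommend_runner_count(current, queued_playbooks, cpu_idle_percent,
                           min_idle_percent=25.0):
    """Suggest the next runner count for one Python version.

    min_idle_percent is a hypothetical threshold; tune it for your deployment.
    """
    if not 1 <= current <= MAX_RUNNERS:
        raise ValueError("current must be between 1 and 10")
    has_backlog = queued_playbooks > 0
    has_headroom = cpu_idle_percent >= min_idle_percent
    if has_backlog and has_headroom and current < MAX_RUNNERS:
        return current + 1  # Add one runner, then measure again before adding more.
    return current
```

The helper never adds more than one runner per call, which matches the advice to measure performance after each change.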

When you increase the number of Python runners, you can see a decrease in the length of time it takes to complete a playbook. Many deployments can expect to see gains by adding between one and four more of each type of Python runner, with gains from adding additional Python runners tapering off after a total of five of each Python runner type.

Not all playbooks and deployments are the same. Your results may vary based on the number of playbooks, the kinds of actions or processing each playbook is doing, the number of CPU cores available to Splunk Phantom, and other factors.
