Run playbooks in parallel with vertical scaling

Splunk SOAR (On-premises) supports vertical scaling for playbook execution.

The DECIDED daemon now spawns a number of runners when started. Each runner is a dedicated instance of the Python 3 environment. The default is four runners.

When you upgrade Splunk SOAR (On-premises) from a lower release, the number of runners you have set is not changed.

When you set multiple runners in Main Menu > Administration > Administration Settings > Playbook Execution, DECIDED spawns that many runners for each Python version when it starts.

Playbooks and custom functions are normally run by a single runner in a single queue. The runner must complete a playbook or custom function before it executes the next playbook or custom function in sequence. By defining additional runners, playbooks and custom functions can be assigned to different runners, which increases the number of playbooks that are executed at the same time.

When a playbook run is started, the DECIDED daemon assigns the playbook to an available runner as follows:

Playbooks are assigned to available runners in a round-robin fashion.
All the blocks of a playbook and child playbooks are run by the same runner.
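The two assignment rules above can be pictured with a small toy model. This is only an illustrative sketch, not DECIDED's actual implementation: the class name, the run IDs, and the runner IDs are all hypothetical, and the model ignores whether a runner is currently busy.

```python
from itertools import cycle


class RoundRobinDispatcher:
    """Toy model of round-robin assignment; not DECIDED's real code."""

    def __init__(self, runner_count=4):
        # Runner ids 0..runner_count-1, visited in a repeating cycle.
        self._next_runner = cycle(range(runner_count))
        self.assignments = {}  # playbook run id -> runner id

    def assign(self, run_id, parent_run_id=None):
        if parent_run_id is not None:
            # A child playbook stays on its parent's runner.
            runner = self.assignments[parent_run_id]
        else:
            # A top-level playbook takes the next runner in the cycle.
            runner = next(self._next_runner)
        self.assignments[run_id] = runner
        return runner


dispatcher = RoundRobinDispatcher(runner_count=4)
top_level = [dispatcher.assign(run_id) for run_id in range(1, 7)]
child = dispatcher.assign(run_id=100, parent_run_id=2)
print(top_level, child)
```

With four runners, the fifth top-level playbook wraps around to the first runner, and the child of run 2 lands on the same runner as run 2.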

Differences in behavior from a single runner vs. multiple Python runners

Some behaviors are different when multiple Python runners are enabled.

The playbook API save_data may return incorrect results when playbooks are run in parallel if the key:value pairs are not unique across playbook runs. Use the save_object() API instead of the save_data() API.
The playbook API save_object may return incorrect results if the same playbook is run against the same container multiple times. Use the optional playbook_name and container_id parameters with save_object to make sure that saved objects are unique across multiple runs of the same playbook. If you need to save information specifically about the playbook run, use the save_run_data() and get_run_data() APIs.
When the number of playbook runners is increased, some deployments may reach the maximum size defined for decided.log more quickly than with the default number of runners. Administrators may want to increase the settings for logrotate on their deployments to values higher than rotate 10 and size 50M.

Use local variables instead of global variables

When creating or editing playbook source code, use local variables rather than global variables. Values stored in a global variable may be modified by another instance of the playbook, a child playbook, or another process that uses the same variable, resulting in unexpected or incorrect results.
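The following plain Python sketch illustrates the difference. It is not taken from a real playbook: the function names and the artifact dictionaries are hypothetical, and it only models two calls that happen to share one module-level variable.

```python
# Module-level (global) state: shared by every call that runs in this process.
seen_ips = []


def collect_ips_global(artifacts):
    # Appends to the shared list, so earlier callers' data is still in it.
    for artifact in artifacts:
        seen_ips.append(artifact["ip"])
    return seen_ips


def collect_ips_local(artifacts):
    # Local state: created fresh for each call, invisible to other calls.
    ips = []
    for artifact in artifacts:
        ips.append(artifact["ip"])
    return ips


run_a = [{"ip": "10.0.0.1"}]
run_b = [{"ip": "10.0.0.2"}]

collect_ips_global(run_a)
leaky = list(collect_ips_global(run_b))  # also contains run_a's address
clean = collect_ips_local(run_b)         # contains only run_b's address
```

The global version returns data left over from the earlier call, while the local version returns only what the current call collected.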

Exchange data between playbooks or blocks when you have multiple Python runners

When you have multiple Python runners, you must exercise greater care around data consistency in your playbooks. When there is only a single Python runner, playbooks run in series so no other playbook will attempt to access or modify information during the playbook run. When multiple Python runners are operating, multiple playbooks could be using the same variables or objects, so data consistency is more difficult to guarantee. Using local variables, the correct APIs for passing data between playbook runs, and implementing a locking method helps ensure your data is consistent and correct.

You must implement a locking solution, such as NamedAtomicLock, to ensure data consistency. See NamedAtomicLock on pypi.org for details. Use locking carefully. Incorrectly implementing locking can lead to problems or hang a playbook.
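One way to reduce the risk of a hung playbook is to always acquire the lock with a timeout and always release it in a finally clause. The helper below is a minimal sketch of that pattern, not an official recipe. The helper name held is hypothetical, and threading.Lock is used only as a stand-in so the example runs anywhere. The sketch assumes that the lock object you pass offers acquire(timeout=...) returning a boolean and release(), which is how NamedAtomicLock appears to behave. Confirm this on pypi.org before relying on it.

```python
import threading
from contextlib import contextmanager


@contextmanager
def held(lock, timeout_seconds=10):
    """Hold `lock` for the duration of a with-block, or fail fast."""
    # A bounded wait avoids hanging the playbook if the lock is never freed.
    if not lock.acquire(timeout=timeout_seconds):
        raise TimeoutError("could not acquire lock within timeout")
    try:
        yield
    finally:
        # Always release, even if the protected code raises.
        lock.release()


# Stand-in lock so the sketch runs anywhere. In a playbook you would pass a
# cross-process lock instead (assumption: NamedAtomicLock("my_lock_name")).
demo_lock = threading.Lock()
shared = {"count": 0}

with held(demo_lock, timeout_seconds=2):
    shared["count"] += 1

# After the with-block the lock is free again, so this succeeds immediately.
lock_is_free = demo_lock.acquire(blocking=False)
if lock_is_free:
    demo_lock.release()
```

Keep the code inside the with-block as short as possible, because every other playbook run that needs the same lock waits while it is held.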

To share data within the same playbook run, use the save_run_data and get_run_data APIs to exchange information about specific keys. In this case, locking is not required. See get_run_data and save_run_data in the Python Playbook API Reference for Splunk SOAR (On-premises).
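As a rough sketch, two small wrappers can store and retrieve structured data for the current playbook run. The wrapper names are hypothetical. The sketch assumes that save_run_data accepts key and value keyword arguments with a string value, and that get_run_data accepts a key keyword argument. Check the API reference for the exact signatures. The phantom module is passed in as a parameter so that the wrappers can be exercised outside a playbook.

```python
import json


def remember_for_this_run(phantom, key, data):
    # Assumption: save_run_data stores a string, so serialize to JSON first.
    phantom.save_run_data(key=key, value=json.dumps(data))


def recall_for_this_run(phantom, key, default=None):
    # Assumption: get_run_data returns the saved string, or an empty or
    # None value when nothing was saved under this key.
    raw = phantom.get_run_data(key=key)
    return json.loads(raw) if raw else default
```

Inside a playbook block, you would pass the imported phantom module, for example remember_for_this_run(phantom, "verdict", {"malicious": True}) in one block and recall_for_this_run(phantom, "verdict") in a later block of the same run.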

To share data between playbook runs with the same playbook name, between playbook runs that operate on the same container, or across all playbook runs, use the clear_object, get_object, and save_object APIs. In each of these cases, locking is required. See get_object, save_object, and clear_object in the Python Playbook API Reference for Splunk SOAR (On-premises).
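The sketch below shows why locking matters here: reading a saved object, changing it, and saving it back is a read-modify-write sequence that two runners could otherwise interleave. The function name and the counter layout are hypothetical. The sketch assumes that get_object returns a list of records with the saved dictionary under a "value" field, that save_object accepts a dictionary value, and that both accept key, container_id, and playbook_name keyword arguments. Verify these details in the API reference. The lock argument is any object with acquire(timeout=...) and release().

```python
def increment_shared_counter(phantom, lock, key, container_id, playbook_name,
                             timeout_seconds=10):
    """Read-modify-write a saved object while holding a lock."""
    # Bounded wait so a stuck lock fails the block instead of hanging it.
    if not lock.acquire(timeout=timeout_seconds):
        raise TimeoutError("could not acquire lock within timeout")
    try:
        records = phantom.get_object(key=key, container_id=container_id,
                                     playbook_name=playbook_name)
        # Assumption: each record holds the saved dictionary under "value".
        current = records[0]["value"]["count"] if records else 0
        phantom.save_object(key=key, value={"count": current + 1},
                            container_id=container_id,
                            playbook_name=playbook_name)
        return current + 1
    finally:
        # Release even if the API calls raise.
        lock.release()
```

Scoping the object with both container_id and playbook_name keeps counters for different containers or playbooks from overwriting each other.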

If all you want to do is read data from a container in a playbook, then no locking is required.

When exchanging data between playbook runs, remember that the order in which playbooks are run is not guaranteed. If you need to ensure results from a specific playbook are available, call it as a dependent playbook.

DECIDED settings for vertical scaling

You configure DECIDED for vertical scaling by changing the settings in Main Menu > Administration > Administration Settings > Playbook Execution.

Changing these settings restarts DECIDED and causes any running playbooks to fail. It is better to change these settings only when there are no active playbooks.

Code Block Execution Time Out (in minutes): Set the number of minutes a playbook code block or custom function can run before DECIDED stops the runner and spawns a replacement runner. Any playbook that hits this limit without completing is marked as failed.

Number of Python 3 Runners: Set the number of Python 3 runners. For Splunk SOAR (On-premises), the default is 4 runners. The minimum is 1 runner. The maximum is 10 runners. This number of runners is created when DECIDED starts, and the runners remain active even if no playbooks or custom functions are assigned to them.

These settings can also be modified by using the REST API. See REST System Settings.

When to add more Python runners

By default, Splunk SOAR (On-premises) starts with four runners for Python 3 and is designed to support up to 10 runners. Because every deployment is unique, and the factors that influence performance are varied, there are no hard rules for when to increase the number of runners for your deployment, or by how many.

When deciding whether or not to add more runners, some factors that influence performance are:

Number and kind of actions performed in your playbooks.
Number of child playbooks or custom functions executed by playbooks.
Actions that require responses from assets or external services.
Available CPU resources.

If your deployment is queuing playbooks to run, and your hardware or virtual machine still has unused CPU capacity, such as idle cores or low core usage percentages, consider increasing the number of playbook runners.

Increase the number of runners by one and measure performance before adding additional runners. Repeat this until you either achieve the performance gains desired, reach the maximum number of runners, or encounter resource limits.
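The procedure above can be written down as a simple loop. This is only a sketch of the decision logic, not a supported tool. The function name, the min_gain threshold, and the measure_throughput callback are all hypothetical: the callback stands for whatever you do to apply a runner count and measure the result. The sketch does not model resource limits, and the sample numbers are made up for illustration.

```python
def tune_runner_count(measure_throughput, start=4, maximum=10, min_gain=0.05):
    """Add one runner at a time until gains fall below min_gain or the cap.

    measure_throughput(n) is a hypothetical callback: it should apply n
    runners, let the system run, and return playbooks completed per minute.
    """
    best = start
    best_throughput = measure_throughput(best)
    while best < maximum:
        candidate = best + 1
        throughput = measure_throughput(candidate)
        if throughput < best_throughput * (1 + min_gain):
            break  # the extra runner did not improve throughput enough
        best, best_throughput = candidate, throughput
    return best


# Made-up measurements for illustration only: gains flatten after 6 runners.
sample = {4: 100, 5: 120, 6: 135, 7: 137, 8: 137, 9: 136, 10: 136}
chosen = tune_runner_count(sample.get)
```

Because changing the number of runners restarts DECIDED, take each measurement in a window when no playbooks are active.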

When you increase the number of Python runners you can see a decrease in the length of time it takes to complete a playbook. Many deployments can expect to see gains by adding between one and four more runners, with gains from adding additional Python runners tapering off after a total of five runners.

Not all playbooks and deployments are the same. Your results may vary based on the number of playbooks, the kinds of actions or processing each playbook is doing, the number of CPU cores available to Splunk SOAR (On-premises), and other factors.
