Run playbooks in parallel with vertical scaling
Splunk Phantom supports vertical scaling for playbook execution.
The DECIDED daemon spawns a configurable number of runners when it starts. Each runner is a dedicated instance of either the Python 2 or Python 3 environment. On a new installation of Splunk Phantom 4.10.4 or later, the default is one runner for Python 2 and four runners for Python 3.
When you upgrade Splunk Phantom from an earlier release, the number of runners you have set for each Python version is not changed.
If multiple runners are specified in Main Menu > Administration > Administration Settings > Playbook Execution, DECIDED spawns that many runners for each Python version when it starts.
By default, playbooks and custom functions are run by a single runner for their Python version, in a single queue. A playbook or custom function must complete before the runner executes the next one in the queue. Defining additional runners lets playbooks and custom functions be assigned to different runners, increasing the number of playbooks that are executed at once.
When a playbook run is started, the DECIDED daemon assigns the playbook to an available runner for the correct Python version. Playbooks are assigned to an available runner as follows:
- Runners that match a playbook's Python version are checked for availability.
- Playbooks are assigned to matching, available runners in a round-robin fashion.
- All the blocks of a playbook and its child playbooks are run by the same runner.
Differences in behavior between a single runner and multiple Python runners
Some behaviors are different when multiple Python runners are enabled.
- The playbook API save_data may return incorrect results when playbooks are run in parallel if the key:value pairs are not unique across playbook runs. Use the save_object API instead of the save_data API.
- The playbook API save_object may return incorrect results if the same playbook is run against the same container multiple times. Use the optional playbook_name and container_id parameters with save_object to make sure that saved objects are unique across multiple runs of the same playbook, as shown in the sketch after this list. If you need to save information that is specific to a single playbook run, use the save_run_data and get_run_data APIs.
- When the number of playbook runners is increased, some deployments may reach the maximum size defined for decided.log more quickly than with the default number of runners. Administrators may want to increase the logrotate settings on their deployments to values higher than rotate 10 and size 50M.
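For example, the following sketch scopes saved objects to a playbook and container so that parallel runs do not collide. It assumes the standard import phantom.rules as phantom playbook import; the key, value, and playbook name are hypothetical, and the exact keyword arguments for save_object and get_object are documented in the Python Playbook API Reference, so confirm them for your release.

```python
import phantom.rules as phantom


def on_start(container):
    # Scope the object to this playbook and this container so that parallel
    # runs of other playbooks, or runs against other containers, cannot
    # overwrite it.
    phantom.save_object(
        key="triage_state",                 # hypothetical key
        value={"verdict": "suspicious"},    # hypothetical value
        playbook_name="investigate_url",    # hypothetical playbook name
        container_id=container["id"],
    )

    # Read the object back later using the same scoping parameters.
    saved = phantom.get_object(
        key="triage_state",
        playbook_name="investigate_url",
        container_id=container["id"],
    )
    phantom.debug(saved)
```

Because the object is keyed by both playbook name and container, two parallel runs against different containers read and write different objects.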
Use local variables instead of global variables
When creating or editing playbook source code, use local variables rather than global variables. Values stored in a global variable can be modified by another instance of the playbook, a child playbook, or another process that uses the same variable, which leads to unexpected or incorrect results.
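As a minimal illustration with hypothetical function and variable names, keep per-run state in variables that are local to the function that needs it rather than at module level:

```python
import phantom.rules as phantom

# Risky: module-level (global) state. Another instance of this playbook, a
# child playbook, or another process using the same name can change it and
# produce unexpected results.
last_verdict = "unknown"


def classify_container(container):
    # Safer: a local variable is private to this call, so parallel playbook
    # runs cannot change it while this block is executing.
    verdict = "unknown"
    if container.get("severity") == "high":
        verdict = "suspicious"
    phantom.debug("verdict for container {}: {}".format(container["id"], verdict))
    return verdict
```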
Exchange data between playbooks or blocks when you have multiple Python runners
When you have multiple Python runners, you must exercise greater care around data consistency in your playbooks. With a single Python runner, playbooks run in series, so no other playbook attempts to access or modify information during a playbook run. With multiple Python runners, several playbooks can use the same variables or objects at the same time, so data consistency is harder to guarantee. Using local variables, using the correct APIs for passing data between playbook runs, and implementing a locking method help ensure that your data is consistent and correct.
You must implement a locking solution, such as NamedAtomicLock, to ensure data consistency. See NamedAtomicLock on pypi.org for details. Use locking carefully. Incorrectly implementing locking can lead to problems or hang a playbook.
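The following sketch shows the general pattern, assuming the NamedAtomicLock package is installed on the system that runs your playbook code. The lock name, the object key, and the shape of the get_object return value are assumptions, so check the package documentation and the Python Playbook API Reference before adapting it. It wraps a get_object/save_object read-modify-write, which is the cross-run case described in the list that follows.

```python
import phantom.rules as phantom
from NamedAtomicLock import NamedAtomicLock


def increment_shared_counter(container):
    # Every runner that touches the shared object must acquire the same
    # named lock first. The lock name is a hypothetical example.
    lock = NamedAtomicLock("shared_counter_lock")

    if not lock.acquire(timeout=30):
        phantom.debug("could not acquire lock, skipping the update")
        return

    try:
        # A read-modify-write of a shared object is only safe inside the lock.
        # The return shape of get_object is assumed; check the reference for
        # your release.
        existing = phantom.get_object(key="shared_counter", container_id=container["id"])
        count = existing[0]["value"]["count"] if existing else 0
        phantom.save_object(
            key="shared_counter",
            value={"count": count + 1},
            container_id=container["id"],
        )
    finally:
        # Always release the lock, otherwise other runners wait for it to expire.
        lock.release()
```

Releasing the lock in a finally block keeps other runners from waiting for the lock to expire if the update raises an exception.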
- If you want to share data within the same playbook run, use the save_run_data and get_run_data APIs to exchange information about specific keys, as shown in the sketch after this list. In this case, locking is not required. See get_run_data and save_run_data in the Python Playbook API Reference for Splunk Phantom.
- If you want to share data between playbook runs with the same playbook name, between playbook runs that operate on the same container, or across all playbook runs, then use the clear_object, get_object, and save_object APIs. In each of these cases, locking is required. See get_object, save_object, and clear_object in the Python Playbook API Reference for Splunk Phantom.
- If all you want to do is read data from a container in a playbook, then no locking is required.
- When exchanging data between playbook runs, remember that the order in which playbooks are run is not guaranteed. If you need to ensure results from a specific playbook are available, call it as a dependent playbook.
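For the first case in the list above, here is a sketch of passing a value from one block to a later block in the same playbook run with save_run_data and get_run_data. The key and values are hypothetical, and the value is serialized with json on the assumption that run data is stored as a string. No locking is needed because the data is scoped to this run.

```python
import json

import phantom.rules as phantom


def block_one(container):
    # Stash a result for a later block in this same playbook run. The key and
    # values are hypothetical examples.
    enrichment = {"ip": "203.0.113.10", "score": 85}
    phantom.save_run_data(key="enrichment_result", value=json.dumps(enrichment))


def block_two(container):
    # Retrieve the value saved earlier in this run. Other runs of this playbook
    # keep their own run data, so no locking is required.
    raw = phantom.get_run_data(key="enrichment_result")
    enrichment = json.loads(raw) if raw else {}
    phantom.debug("score from block_one: {}".format(enrichment.get("score")))
```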
DECIDED settings for vertical scaling
You configure DECIDED for vertical scaling by changing the settings in Main Menu > Administration > Administration Settings > Playbook Execution.
Changing these settings restarts DECIDED and causes any running playbooks to fail. It is better to change these settings only when there are no active playbooks.
- Code Block Execution Time Out (in minutes): Set the number of minutes a playbook code block or custom function can run before DECIDED stops the runner and spawns a replacement runner. Any playbook that hits this limit without completing is marked as failed.
- Number of Python 2 Runners: Set the number of Python 2 runners. The default and the minimum is one runner. The maximum is 10. This number of runners is created when DECIDED starts and remains active even if no playbooks or custom functions are assigned to them.
- Number of Python 3 Runners: Set the number of Python 3 runners. For release 4.10.4, the default is four runners. The minimum is one runner. The maximum is 10. This number of runners is created when DECIDED starts and remains active even if no playbooks or custom functions are assigned to them.
These settings can also be modified by using the REST API. See REST System Settings.
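For illustration only, here is a sketch of updating these values over REST with the Python requests library, assuming a system_settings REST endpoint as described in REST System Settings and the standard ph-auth-token authentication header. The payload field names shown here are hypothetical placeholders; confirm the exact endpoint and keys in REST System Settings before use, and remember that changing these settings restarts DECIDED.

```python
import requests

PHANTOM_HOST = "https://phantom.example.com"   # hypothetical hostname
AUTH_TOKEN = "<automation user token>"         # placeholder

# The field names in this payload are hypothetical placeholders. See the
# REST System Settings documentation for the exact keys your release expects.
payload = {
    "playbook_execution": {
        "python2_runners": 1,
        "python3_runners": 6,
    }
}

response = requests.post(
    "{}/rest/system_settings".format(PHANTOM_HOST),
    json=payload,
    headers={"ph-auth-token": AUTH_TOKEN},
    verify=False,  # self-signed certificates are common on test deployments
)
response.raise_for_status()
print(response.json())
```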
When to add more Python runners
By default, Splunk Phantom starts with a single runner for Python 2 and four runners for Python 3, and is designed to support up to 10 runners for each Python version. Because every deployment is unique and the factors that influence performance vary, there are no hard rules for when, or by how many, to increase the number of runners for your deployment.
When deciding whether or not to add more runners, some factors that influence performance are:
- Number and kind of actions performed in your playbooks.
- Number of child playbooks or custom functions executed by playbooks.
- Actions that require responses from assets or external services.
- Available CPU resources.
If your deployment is queuing playbooks to run, and your hardware or virtual machine still has unused CPU capacity (such as idle cores, or low core usage percentages) you should consider increasing the number of playbook runners.
- Increase the number of runners for the Python version of your playbooks. For example, if most of your queued playbooks are in Python 2, increase the number of Python 2 runners.
- Increase the number of runners by one, or one for each Python environment, and measure performance before adding more runners. Repeat this until you achieve the desired performance gains, reach the maximum number of runners for each Python environment, or encounter resource limits.
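One way to check whether the host still has unused CPU capacity before adding runners is to sample per-core utilization, for example with the third-party psutil package. This is a generic sketch, not part of Splunk Phantom, and the 25 percent threshold is an arbitrary example.

```python
import psutil

# Sample per-core utilization over a short interval. Consistently idle cores
# while playbooks are queuing suggest there is headroom for more runners.
per_core = psutil.cpu_percent(interval=5, percpu=True)
idle_cores = sum(1 for pct in per_core if pct < 25)  # 25% is an arbitrary example threshold

print("per-core utilization: {}".format(per_core))
print("cores under 25% load: {} of {}".format(idle_cores, len(per_core)))
```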
When you increase the number of Python runners, you can see a decrease in the time it takes to complete a playbook. Many deployments can expect to see gains by adding between one and four more of each type of Python runner, with gains tapering off after a total of five runners of each Python type.
Not all playbooks and deployments are the same. Your results may vary based on the number of playbooks, the kinds of actions or processing each playbook performs, the number of CPU cores available to Splunk Phantom, and other factors.