Data access automation API
Splunk Phantom's Automation API allows security operations teams to develop detailed and precise automation strategies. Playbooks can serve many purposes, ranging from automating minimal investigative tasks that can speed up analysis to large-scale response to a security breach. The following APIs are supported to leverage the capabilities of the platform using Playbooks.
collect
phantom.collect(container,  # this can be a container or an action results object
                datapath,
                scope='new',
                limit=100,
                none_if_first=False)
This API allows users to collect or gather any information from the associated Artifacts of a Container or action results that you get in the action callback or via the get_action_results() API.
For example, to obtain a listing of all IP addresses or all file hashes across all Artifacts, call this API with the appropriate datapath into the Artifact JSON. To extract every 'country_iso_code' from the action results of the 'geolocate ip' action, call this API and pass in the 'results' object. You can specify a single datapath as a string, or several datapaths as a list of datapath strings.
Parameter | Description |
---|---|
container | The container that is available to the user in on_start() or any action callback or in on_finish() or this can be a results object that you get in the action callback or via the get_action_results() API. |
datapath | The path of the element in the JSON schema, used to retrieve it from the artifacts of a container or from the action results object. Datapaths for a container point into the Artifact JSON; datapaths for action results are exactly as specified in the 'Action Output' section of each app's action. If you specify a list of datapaths for extracting data from action results, the result is a table in which each column corresponds to the respective datapath. If you specify a single datapath as a string, Phantom simplifies the result and returns just the data for that one column. |
scope | This OPTIONAL parameter (default = 'new') defines the time window of artifacts from which data is collected. 'new' collects only from artifacts added since the playbook last ran on that container; 'all' collects from all artifacts belonging to the container. An active playbook runs on a container after it has been created and every time new artifacts are added to it, so 'scope' is especially useful when you want each playbook run to process only the 'new' artifacts that have been added to the container. However, every time you modify the playbook, it is considered a new playbook: its first execution starts with all artifacts in the container up to that point in time, and after that 'new' collects only what has been added since the previous execution. See the following two parameters to further control the behavior of this API. |
limit | This OPTIONAL parameter enforces the maximum number of artifacts that can be retrieved in this call. Default is 100. |
none_if_first | When the collect call is executed from a playbook for the first time on a container, even with scope='new', it will collect all the artifacts since the container was created. This parameter allows you to change the behavior of the collect call executed for the first time from this playbook on a container. Use this parameter to specify whether you would like the playbook to collect all artifacts since the container was created, or only those added since the first time the playbook was executed on the container. If you would like the playbook to not get any existing artifacts the first time it is run on the container, specify True for this parameter. Then, on subsequent runs, it will only get the artifacts added since the first run. |
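Because the shape of the return value depends on whether datapath is a string or a list, code that accepts either form may want to normalise the result first. The following is a minimal sketch in plain Python rather than a Phantom API: as_rows is a hypothetical helper, the sample datapaths and values are illustrative, and it assumes that a single datapath yields a flat list of values while a list of datapaths yields one row per result, as described in the table above.

```python
def as_rows(collected, datapath):
    """Return collect()-style output as a list of rows.

    Assumption: a string datapath yields a flat list of values, and a
    list of datapaths yields one row (list) per result.
    """
    if collected is None:
        return []
    if isinstance(datapath, str):
        # Single column: wrap each value so callers can always index row[0].
        return [[value] for value in collected]
    return [list(row) for row in collected]


# Sample values standing in for what phantom.collect() might return.
single = as_rows(["FR", "AU"], "action_result.data.*.country_iso_code")
table = as_rows(
    [["2.2.2.2", "FR"], ["1.1.1.1", "AU"]],
    ["action_result.parameter.ip", "action_result.data.*.country_iso_code"],
)
```

With this in place, downstream code can index row[0], row[1], and so on without checking which form of datapath was used.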
collect2
phantom.collect2(container=None, action_results=None, action_name=None, datapath=None, filter_artifacts=None, tags=None, scope='new', limit=100, trace=False)
This is an extension of the phantom.collect() API. It adds the filter_artifacts parameter, a list of artifacts whose values will be returned.
Parameter | Required? | Description |
---|---|---|
container | Required | This is the container dictionary object that is passed to the playbook across various functions. |
action_results | Optional, unless action_name is not provided. | These are the action results passed into any callback function, or a subset of action results that had been filtered from a phantom.condition() call. |
action_name | Optional, unless action_results is not provided. | This is the custom 'name' specified for the action in the phantom.act() API. This allows action results to be returned based on the action name. |
datapath | Required | A list of datapaths. A datapath is the path of the element in the JSON schema to be able to access/retrieve it from associated action results or artifacts. Please refer to the phantom.collect() API for examples. |
filter_artifacts | Optional | These are IDs of artifacts that were returned from a phantom.condition() call. |
tags | Optional | A list of tags used to further filter artifacts. |
scope | Optional | Scope of artifacts to retrieve, defaults to 'new'. Please refer to the phantom.collect() API for more details. |
limit | Optional | Maximum number of results to be returned. Please refer to phantom.collect() API documentation. |
trace | Optional | Set this parameter to 'True' for verbose debugging of the API call. |
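A common use of collect2 output is to turn each returned row into one parameter dictionary for phantom.act(). The sketch below is plain Python operating on sample rows, not a call into the platform. It assumes each row is a list whose order matches the requested datapaths, and that an artifact ID can be carried in a 'context' dictionary; build_ip_parameters, the 'artifact_id' key, and the sample datapaths are illustrative choices rather than documented names.

```python
def build_ip_parameters(rows):
    """Build phantom.act()-style parameters from collect2()-style rows.

    Assumption: each row is [source_address, artifact_id], matching the
    order of the datapaths that were requested.
    """
    parameters = []
    for source_address, artifact_id in rows:
        if source_address:  # skip artifacts without the requested field
            parameters.append({
                "ip": source_address,
                "context": {"artifact_id": artifact_id},
            })
    return parameters


# Sample rows standing in for something like:
#   phantom.collect2(container=container,
#                    datapath=["artifact:*.cef.sourceAddress", "artifact:*.id"])
sample_rows = [["1.1.1.1", 101], [None, 102], ["8.8.8.8", 103]]
parameters = build_ip_parameters(sample_rows)
```

Skipping rows with an empty first column avoids sending blank parameters to the action when some artifacts lack the requested field.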
collect_from_contains
phantom.collect_from_contains(container=None, action_results=None, contains=None, tags=None, scope=None, filter_artifacts=None, include_params=True, limit=None, trace=False)
This API is meant to function similarly to collect, but instead of needing to know the datapaths for the values you want, you instead provide a contains value. This will return a flat list of all the unique values which match at least one contains in the list. Returns None on failure.
Parameter | Required? | Description |
---|---|---|
container | Optional, unless action_results is not provided. | The container. Passing this searches for the contains in the CEF values of that container. |
action_results | Optional, unless container is not provided. | Action result, like what is passed to a callback from phantom.act() (as 'results'). Searches for values matching the contains in this action result. |
contains | Required | A list of contains to filter by. |
tags | Optional | A list of tags used to further filter artifacts. |
filter_artifacts | Optional | These are IDs of artifacts that were returned from a phantom.condition() call. |
include_params | Optional | If set to False, ignore values with matching contains if they are a parameter to an action. This value is only used if action_results is passed in. |
scope | Optional | Scope of artifacts to retrieve, defaults to 'new'. Please refer to the phantom.collect() API for more details. This value is only used if a container is provided. |
limit | Optional | Maximum number of artifacts to match. This value is only used if a container is provided. |
trace | Optional | Set this parameter to 'True' for verbose debugging of the API call. |
import phantom.rules as phantom

def geolocate_ip(action, success, container, results, handle):
    # We have already created various artifacts for this event
    collected_ips = phantom.collect_from_contains(container=container, contains=["ip"])
    # [ "8.8.8.8", "8.8.4.4", "1.1.1.1", ... ]
    parameters = []
    for ip in collected_ips:
        parameters.append({ 'ip': ip })
    phantom.act("geolocate ip", parameters=parameters, app={ "name": "MaxMind" }, name="geolocate_ip")
    return

def collect_from_action_result(results):
    return phantom.collect_from_contains(action_results=results, contains=["url", "domain"])
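Since collect_from_contains is documented to return None on failure, a loop over its result can raise a TypeError. One way to guard against that is sketched below in plain Python; parameters_from_values is a hypothetical helper name, and the sample values stand in for a real return value.

```python
def parameters_from_values(values, key):
    """Turn a collect_from_contains()-style result into action parameters.

    `values` may be None (the documented failure value) or a flat list.
    """
    if not values:
        return []
    return [{key: value} for value in values]


ok = parameters_from_values(["8.8.8.8", "8.8.4.4"], "ip")
failed = parameters_from_values(None, "ip")
```

An empty parameter list can then be checked before calling phantom.act(), so that no action is launched when nothing was collected.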
get_action_results
phantom.get_action_results(action=None, action_run_id = 0, app_run_id = 0, result_data=True, action_name=None, playbook_run_id=0, flatten=True)
This API retrieves action results at any time, using either the action JSON that was passed to the action callback or the action_run_id contained in that action JSON. The phantom.get_summary() API also returns one or more app_run_id values that can be passed in as the optional app_run_id parameter.
Parameter | Required? | Description |
---|---|---|
action | Optional, unless action_run_id and app_run_id are not provided. | Action JSON object provided in the action callback. Using this provides the action results from the action that completed and triggered the callback function. |
action_run_id | Optional, unless action and app_run_id are not provided. | ID of the action run. Using this allows you to obtain action results from any completed action runs from the current playbook. 'action_run_id' can be obtained from the above noted action JSON object or by calling the phantom.get_summary() API, which enumerates all the actions that were executed in the playbook. |
app_run_id | Optional, unless action and action_run_id are not provided. | ID of the app run. This 'app_run_id' can be obtained by calling the phantom.get_summary() API, which enumerates all the actions that were executed in the playbook. |
result_data | Optional | This is a boolean parameter, default is True. If the user does NOT need to obtain the full action results (which in some cases can be a lot of data) but just summary information, this parameter should be specified as False. |
action_name | Optional | This is the unique name provided to an action execution via the phantom.act() parameter 'name'. |
playbook_run_id | Optional | This is the playbook run id that uniquely identifies the playbook execution instance. Default value of 0 implies the current playbook execution instance. |
flatten | Optional | boolean. Default=True. An action can be executed on more than one asset and for many sets of parameters. Flattening provides a result dictionary object for each combination of asset and parameter even if many parameters were used in a single action. Setting this variable to False generates results as provided in action callbacks or when viewing the action results in Investigation widgets. |
A single phantom.act() API call can be executed on multiple sets of parameters on more than one asset. Each instance of a phantom.act() call is identified by a unique 'action_run_id'. One action execution on each asset results in a corresponding app execution, each of which is identified by a unique 'app_run_id'. Parameters of an action execution via each app (on their respective asset) can be part of the same app run.
import phantom.rules as phantom
import json

def collect_params(container, datapath, key_name):
    params = []
    items = set(phantom.collect(container, datapath, scope='all'))
    for item in items:
        params.append({key_name: item})
    return params

def on_start(container):
    parameters = collect_params(container, 'artifacts:*.cef.sourceAddress', 'ip')
    phantom.act('geolocate ip', parameters=parameters, name='my_geolocate_ip')
    return

def on_finish(container, summary):
    summary_json = phantom.get_summary()
    if 'result' in summary_json:
        for action_result in summary_json['result']:
            if 'action_run_id' in action_result:
                action_results = phantom.get_action_results(
                    action_run_id=action_result['action_run_id'],
                    result_data=False, flatten=False)
                phantom.debug(action_results)
    return
The return value of this API is a list of JSON dictionaries, one per app run (an app run is created for each asset the action was run on), each of which contains an 'action_results' list.
NOTE that the action_result JSON object shown below is generated with the parameters result_data=False and flatten=False sent to the get_action_results() API in the playbook shown above. If result_data were specified as True, the dictionaries in the 'action_results' list would also include 'data', which holds the full action result information. Setting the flatten parameter to True generates the same data, but the nested 'action_results' lists are reorganized into a flatter hierarchy with a list of higher-level objects. This is primarily for backward compatibility.
[
    {
        "asset_id": 237,
        "status": "success",
        "name": "my_geolocate_ip",
        "app": "MaxMind",
        "action_results": [
            {
                "status": "success",
                "message": "Country: France",
                "parameter": { "ip": "2.2.2.2", "context": {...} },
                "summary": { "country": "France" }
            },
            {
                "status": "success",
                "message": "Country: Australia",
                "parameter": { "ip": "1.1.1.1", "context": {...} },
                "summary": { "country": "Australia" }
            }
        ],
        "app_id": 42,
        "app_run_id": 1076,
        "asset": "maxmind",
        "action": "geolocate ip",
        "message": "'my_geolocate_ip' on asset 'maxmind': 2 actions succeeded... ",
        "summary": { ...},
        "action_run_id": 1083
    }
]
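The nested, non-flattened shape shown above can be walked with two loops: one over app runs and one over the per-parameter entries in 'action_results'. The following sketch is plain Python over a trimmed copy of that sample; summarize_app_runs is a hypothetical helper name, and the keys it reads are only those visible in the sample output.

```python
def summarize_app_runs(app_runs):
    """Yield (asset, ip, country) for each successful per-parameter result.

    Expects the non-flattened shape: a list with one dictionary per app
    run, each holding a nested 'action_results' list.
    """
    for app_run in app_runs:
        for result in app_run.get("action_results", []):
            if result.get("status") != "success":
                continue
            yield (
                app_run.get("asset"),
                result.get("parameter", {}).get("ip"),
                result.get("summary", {}).get("country"),
            )


# Trimmed copy of the sample output above.
sample = [{
    "asset": "maxmind",
    "action_results": [
        {"status": "success", "parameter": {"ip": "2.2.2.2"},
         "summary": {"country": "France"}},
        {"status": "success", "parameter": {"ip": "1.1.1.1"},
         "summary": {"country": "Australia"}},
    ],
}]
rows = list(summarize_app_runs(sample))
```

Using .get() with defaults keeps the helper tolerant of entries that lack a 'summary' or 'parameter' key.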
get_extra_data
phantom.get_extra_data(action, action_run_id, app_run_id)
This API retrieves the extra data gathered during an action execution. In some cases the action result is too large to move around in the UI or to show to users all the time. App authors can therefore choose to store larger amounts of data in "extra_data", which playbooks can then retrieve on demand via this API. You can specify action, action_run_id, or app_run_id as a key to obtain the data.
Parameter | Required? | Description |
---|---|---|
action | Optional | Action JSON object provided in the action callback. Using this provides the extra data from the action that completed and triggered the callback function. |
action_run_id | Optional | ID of the action run. Using this allows you to obtain extra data from any completed action runs from the current playbook. 'action_run_id' can be obtained from the above noted action json object or by calling the phantom.get_summary() API which enumerates all the actions that were executed in the playbook. |
app_run_id | Optional | ID of the app run. This 'app_run_id' can be obtained by calling the phantom.get_summary() API which enumerates all the actions that were executed in the playbook. |
import phantom.rules as phantom
import json

def domain_reputation_cb(action, success, container, results, handle):
    if not success:
        return
    extra_data = phantom.get_extra_data(action)
    phantom.debug("Testing extra data: ")
    phantom.debug(extra_data)
    return

def on_start(container):
    phantom.act('domain reputation', parameters=[{ "domain" : "bjtuangouwang.com" }], assets=["passivetotal"], callback=domain_reputation_cb)
    return

def on_finish(container, summary):
    phantom.debug("Summary: " + summary)
    return
NOTE: At least one parameter MUST be specified. The return value of this API is a list of JSON dictionaries that has the action results along with extra data.
[
    {
        "asset_id": 7,
        "extra_data": [
            {
                "status": "success",
                "extra_data": [{...}],
                "parameter": {}
            }
        ],
        "asset": "passivetotal"
    }
]
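Note that 'extra_data' appears at two levels in the return value shown above: once per asset, and once inside each entry of that list. A minimal sketch for flattening the structure follows, in plain Python; flatten_extra_data is a hypothetical helper, and the payload in the sample is made up to stand in for the elided {...}.

```python
def flatten_extra_data(response):
    """Return a flat list of (asset, status, item) tuples.

    Assumes the nested shape shown above: each top-level entry has an
    'extra_data' list whose elements carry their own 'extra_data' list.
    """
    flat = []
    for entry in response or []:
        for run in entry.get("extra_data", []):
            for item in run.get("extra_data", []):
                flat.append((entry.get("asset"), run.get("status"), item))
    return flat


# Sample with an invented payload in place of the elided {...}.
sample = [{
    "asset_id": 7,
    "asset": "passivetotal",
    "extra_data": [
        {"status": "success", "extra_data": [{"records": 3}], "parameter": {}},
    ],
}]
items = flatten_extra_data(sample)
```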
get_filtered_data
phantom.get_filtered_data(name=None)
This API allows users to retrieve the filtered data that was saved via phantom.condition(). In the API phantom.condition(), if the 'name' was specified, the filtered data (filtered action results and/or filtered artifact IDs) is saved under the specified key and the same key can then be used to later retrieve the data.
Parameter | Required? | Description |
---|---|---|
name | Required | This parameter is the same name that was used in the name parameter of phantom.condition() to save the filtered action results and filtered artifacts |
The return value of this API is a tuple, filtered_action_results and filtered_artifacts.
import phantom.rules as phantom
import json
from datetime import datetime, timedelta

...

def filter_1(action=None, success=None, container=None, results=None, handle=None, filtered_artifacts=None, filtered_results=None):

    # collect filtered artifact ids for 'if' condition 1
    matched_artifacts_1, matched_results_1 = phantom.condition(
        container=container,
        action_results=results,
        conditions=[
            ["geolocate_ip_1:action_result.data.*.country_iso_code", "!=", "UK"],
            ["artifact:*.cef.bytesIn", "!=", 99],
        ],
        logical_operator='or',
        name="filter_1:condition_1")

...

def on_finish(container, summary):
    filtered_results, filtered_artifacts = phantom.get_filtered_data(name="filter_1:condition_1")
get_format_data
phantom.get_format_data(name=None)
This is an API supported for the purposes of retrieving data saved via the phantom.format() API. If the user had specified the 'name' parameter value in the phantom.format() API, the name can be used to retrieve the data to be used later. For sample usage, please refer to the phantom.format() API documentation.
get_raw_data
phantom.get_raw_data(container)
This API lets the user retrieve container raw data as it exists at the source. This allows users to access and automate on raw data in cases where there is information that was not parsed into artifacts.
Parameter | Description |
---|---|
container | This is the JSON container object as available in the on_start(), all callbacks, or on_finish() functions. |
import phantom.rules as phantom
import json

def on_start(container):
    raw_data = phantom.get_raw_data(container)
    phantom.debug(raw_data)
    return

def on_finish(container, summary):
    return
get_raw_data pulls raw data from container["data"], which is often used to store raw emails and the raw data that ticketing tools return from on_poll. When pulling data, get_raw_data specifically uses the ["data"] section of the container.
This is shown in the following example, run in a custom block on a container that does not leverage container['data']:
phantom.debug(phantom.get_raw_data(container))
phantom.update(container, {"data": {"this": "is a test"}})
phantom.debug(phantom.get_raw_data(container))
The output:
Wed May 13 2020 11:08:38 GMT-0600 (Mountain Daylight Time): phantom.get_raw_data(): called for playbook run '39792' and container id: '9420'
Wed May 13 2020 11:08:38 GMT-0600 (Mountain Daylight Time): {}
Wed May 13 2020 11:08:39 GMT-0600 (Mountain Daylight Time): successfully updated container(id: 9420)
Wed May 13 2020 11:08:39 GMT-0600 (Mountain Daylight Time): phantom.get_raw_data(): called for playbook run '39792' and container id: '9420'
Wed May 13 2020 11:08:39 GMT-0600 (Mountain Daylight Time): {"this": "is a test"}
get_apps
phantom.get_apps(action, asset, app_type)
This is an API supported for the purposes of letting the user enumerate all the apps installed on the system for each of the actions. The call returns a flat listing of all actions and apps with matching criteria.
Parameter | Description |
---|---|
action | The name of the action like 'block ip'. Allows users to retrieve information about assets that support the action 'block ip'. |
asset | The Asset name that allows users to retrieve only those apps that match the specified asset. |
app_type | Allows users to retrieve only those apps that match the specified type of the app. Types are like 'reputation', 'information', etc. |
def on_start(container):
    apps = []

    apps = phantom.get_apps()
    phantom.debug(apps)

    apps = phantom.get_apps(action='file reputation')
    phantom.debug(apps)

    apps = phantom.get_apps(asset='my_smtp_asset')
    phantom.debug(apps)

    apps = phantom.get_apps(app_type='information')
    phantom.debug(apps)
    return
All of these parameters are optional. If the user does not specify any parameter, all the configured apps in the system are retrieved.
The return value of this API is a list of JSON dictionaries that have the following schema:
[
    {
        "asset_disabled": false,
        "product_version_match": true,
        "app_type": "sandbox",
        "product_vendor": "Cuckoo",
        "product_name": "Cuckoo",
        "app_match_product_version": ".*",
        "asset_name": "cuckoo",
        "ap_name": "Cuckoo",
        "action": "detonate file",
        "app_version": "1.2.8",
        "asset_product_version": "",
        "asset_type": "sandbox"
    },
    ...
]
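Because the listing is flat, with one entry per action and asset combination, it is often useful to group it before deciding where to run an action. The sketch below is plain Python over sample entries that use keys from the schema above; usable_apps_by_action is a hypothetical helper, the second sample entry is invented, and treating 'asset_disabled' and 'product_version_match' as usability filters is an assumption about how those fields are meant to be read.

```python
from collections import defaultdict


def usable_apps_by_action(apps):
    """Group asset names by action, skipping disabled or mismatched assets."""
    grouped = defaultdict(list)
    for app in apps:
        if app.get("asset_disabled") or not app.get("product_version_match"):
            continue
        grouped[app["action"]].append(app["asset_name"])
    return dict(grouped)


# Sample entries using keys from the schema above; the second is invented.
sample = [
    {"asset_disabled": False, "product_version_match": True,
     "asset_name": "cuckoo", "action": "detonate file"},
    {"asset_disabled": True, "product_version_match": True,
     "asset_name": "cuckoo_old", "action": "detonate file"},
]
grouped = usable_apps_by_action(sample)
```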
get_assets
phantom.get_assets(action=None, tags=None, types=None)
As explained in the phantom.act() API, users can either specify a specific asset on which the action has to be executed or not specify an asset and the action will be executed on all possible assets.
This API gives users programmatic access to the assets set up in the system.
Parameter | Description |
---|---|
action | The name of the action like 'block ip'. Allows users to retrieve information about assets that support the action 'block ip'. |
tags | A list of 'tags' that allows users to retrieve only those assets that have been tagged with the specified keyword. |
types | A list of 'types' of assets that must be used to retrieve the specific assets |
def on_start(container):
    assets = phantom.get_assets()
    phantom.debug(assets)

    assets = phantom.get_assets(action='file reputation')
    phantom.debug(assets)

    assets = phantom.get_assets(types=['reputation service'])
    phantom.debug(assets)
    return
Since all of these parameters are optional, if the user does not specify any parameter, all the configured assets in the system are retrieved.
The return value of this API is a list of JSON dictionaries that have the following schema:
[
    {
        "description": "VirusTotal",
        "tags": [],
        "product_vendor": "VirusTotal",
        "product_version": "Private 2.0",
        "product_name": "VirusTotal",
        "disabled": true,
        "version": 1,
        "type": "reputation service",
        "id": 11,
        "name": "virustotal_private"
    },
    ...
]
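The sample entry above is marked "disabled": true, so a playbook that picks assets programmatically may want to filter on that flag before passing names to the assets parameter of phantom.act(). A minimal sketch in plain Python follows; enabled_asset_names is a hypothetical helper, and the second sample asset is invented for illustration.

```python
def enabled_asset_names(assets, asset_type=None):
    """Return names of assets that are not disabled, optionally by type."""
    return [
        asset["name"]
        for asset in assets
        if not asset.get("disabled")
        and (asset_type is None or asset.get("type") == asset_type)
    ]


# First entry mirrors the sample above; the second is invented.
sample = [
    {"name": "virustotal_private", "type": "reputation service", "disabled": True},
    {"name": "virustotal_public", "type": "reputation service", "disabled": False},
]
names = enabled_asset_names(sample, asset_type="reputation service")
```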
get_container
json_object = phantom.get_container(container_id)
This API is used to retrieve the JSON for a container (as a Python object). See the Containers REST documentation for more details.
Parameter | Required? | Description |
---|---|---|
container_id | Required | The ID of the container to fetch. |
Example usage:
def on_start(container):
    cdata = phantom.get_container(container['id'])
    phantom.debug('Container Data: {}'.format(cdata))
    return
get_parent_handle
json_object = phantom.get_parent_handle()
This API is used to retrieve the 'handle' that has been set in the 'phantom.playbook()' API call (synchronous mode) in the parent / caller playbook. This API can be called from anywhere (on_start(), on_finish() or any other function) in the child playbook.
This API only works when the parent calls the child playbook in synchronous mode. See phantom.playbook() API for more details on how to call the playbooks in synchronous mode.
Example usage:
In the PARENT playbook...
some_handle = "some_handle from parent pb"
# 'some_handle' is now passed to the child playbook via the handle parameter.
playbook_run_id = phantom.playbook("local/child_pb", container=container, name="playbook_local_child_pb_1", callback=decision_1, handle=some_handle)
In the CHILD playbook...
def on_start(container):
    handle_from_parent = phantom.get_parent_handle()  # this call can be done from any function of the child playbook
    phantom.debug("handle sent by parent playbook: {}".format(handle_from_parent))
    return
get_playbook_info
phantom.get_playbook_info()
This is an API to retrieve the current playbook's information such as id, run id, name, repo, and parent_playbook_run_id (if this playbook was executed from another playbook) and the running playbook's effective user id.
The return value of this API is a list containing a single dictionary.
[{
    'parent_playbook_run_id': '0',
    'name': 'test_plabook',
    'run_id': '37',
    'scope_artifacts': [],
    'scope': 'new',
    'id': '562',
    'repo_name': 'local',
    'effective_user_id': 5
}]
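Two details of the sample above are easy to miss: the result is a list, so the dictionary is at index 0, and the ID fields are strings rather than integers. The sketch below shows one way to read it in plain Python. describe_playbook is a hypothetical helper, and treating a 'parent_playbook_run_id' of '0' as "no parent" is an assumption based on the sample.

```python
def describe_playbook(info):
    """Summarise a get_playbook_info()-style result.

    The sample shows IDs as strings, so the parent run ID is compared
    as a string rather than assumed to be an integer.
    """
    current = info[0]
    return {
        "full_name": "{}/{}".format(current["repo_name"], current["name"]),
        "is_child": str(current.get("parent_playbook_run_id", "0")) != "0",
    }


# Copy of the sample return value above.
info = [{
    'parent_playbook_run_id': '0', 'name': 'test_plabook', 'run_id': '37',
    'scope_artifacts': [], 'scope': 'new', 'id': '562',
    'repo_name': 'local', 'effective_user_id': 5,
}]
description = describe_playbook(info)
```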
get_summary
phantom.get_summary()
This is an API supported for the purposes of letting the user retrieve the summary of the playbook execution in a json format.
import phantom.rules as phantom import json def on_start(container): phantom.act('geolocate ip', parameters=[{ "ip" : "1.1.1.1" }]) return def on_finish(container, summary): summary_json = phantom.get_summary() phantom.debug(summary_json) return
The return value of this API is a JSON representation of the playbook execution:
{
    "status": "success",
    "message": "",
    "result": [
        {
            "status": "success",
            "close_time": "2016-02-11T06:45:22.005343+00:00",
            "app_runs": [
                {
                    "asset_id": 40,
                    "status": "success",
                    "app": "MaxMind",
                    "app_id": 27,
                    "app_run_id": 224,
                    "asset": "maxmind",
                    "action": "geolocate ip",
                    "summary": "Country: Australia",
                    "parameter": "{\"ip\": \"1.1.1.1\"}",
                    "action_run_id": 104
                }
            ],
            "create_time": "2016-02-11T06:45:20.917+00:00",
            "action": "geolocate ip",
            "message": "1 action succeeded",
            "type": "investigate",
            "id": 104
        }
    ],
    "playbook_run_id": 167
}
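In the sample above, each app run's 'parameter' field is itself a JSON-encoded string, so it needs a second decoding step before its keys can be read. The following sketch extracts the run IDs and decoded parameters in plain Python. list_app_runs is a hypothetical helper, and it assumes the summary is already available as a Python dictionary with the shape shown.

```python
import json


def list_app_runs(summary):
    """Return one dictionary per app run with its decoded parameters."""
    runs = []
    for action in summary.get("result", []):
        for app_run in action.get("app_runs", []):
            parameter = app_run.get("parameter")
            if isinstance(parameter, str):
                # In the sample, 'parameter' is a JSON-encoded string.
                parameter = json.loads(parameter)
            runs.append({
                "action_run_id": app_run.get("action_run_id"),
                "app_run_id": app_run.get("app_run_id"),
                "parameter": parameter,
            })
    return runs


# Trimmed copy of the sample summary above.
summary = {
    "status": "success",
    "result": [{
        "id": 104,
        "app_runs": [{
            "app_run_id": 224,
            "action_run_id": 104,
            "parameter": "{\"ip\": \"1.1.1.1\"}",
        }],
    }],
    "playbook_run_id": 167,
}
runs = list_app_runs(summary)
```

The returned action_run_id and app_run_id values are the ones that get_action_results() and get_extra_data() accept as keys.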
parse_errors, parse_success, parse_results
phantom.parse_errors(action_results)
phantom.print_errors(action_results)
phantom.parse_success(action_results)
phantom.parse_results(action_results)
These APIs parse action_results. They allow users to pass the action_results from a callback directly into these helper routines for convenient access to the data.
API | Description |
---|---|
parse_errors() | Collects all the errors and returns them to the user per asset and per parameter. |
print_errors() | A convenient way to quickly dump any errors present in the action_results. |
parse_success() | Processes the action_results and removes any records that had errors, so that the user can conveniently and confidently access only the results of successful actions on the respective asset and parameter. |
parse_results() | Processes the action_results and transforms the contents to be organized by success and failed categories. |
NOTE: Please review the phantom.collect() API before using these convenience APIs, as they have very limited use scenarios.
set_parent_handle
phantom.set_parent_handle()
This API is used to set the 'handle' from the synchronously called / executed child playbook that is then accessed in the parent playbook via the handle parameter of the callback function.
This API only works when the parent calls the child playbook in synchronous mode. See phantom.playbook() API for more details on how to call the playbooks in synchronous mode.
NOTE: The last call to set_parent_handle will overwrite the handle being sent to the callback function. So in a parent playbook, if there is a join block where two child playbooks called synchronously are joining to a callback, the value of handle in the callback depends on which child playbook called the set_parent_handle last.
Example usage:
In the CHILD playbook...
some_handle = "some_handle from child pb"
phantom.set_parent_handle(some_handle)
In the PARENT playbook...
def playbook_callback(..., handle=None, ...):
    phantom.debug("handle sent by child playbook: {}".format(handle))
    return
This documentation applies to the following versions of Splunk® Phantom (Legacy): 4.8