Build an observable-query intelligence source integration

This language-agnostic engineering guide is for developers who want to integrate observable-query-style intelligence services with the Splunk Intelligence Management platform and who will host their own code in their own environment.

Architecture notes

A 15-minute delay is required between runs of this script. Splunk Intelligence Management analyzes API traffic daily. API credentials whose activity does not show 15-minute breaks between script runs will be deactivated.
Only one process may use a given set of Splunk Intelligence Management API credentials at any given time. If the same set of credentials is used in multiple applications, processes, or threads, they compete with each other for the limited number of per-minute API calls allowed to that credential set.
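
The pacing rule above can be sketched as a small helper. This is a minimal illustration, not part of any Splunk Intelligence Management SDK: the names seconds_until_next_run and run_forever are hypothetical, and measuring the gap from the end of the previous run is an assumption about how the 15-minute break is counted.

```python
import time
from typing import Optional

MIN_GAP_SECONDS = 15 * 60  # the 15-minute break the guide requires between runs


def seconds_until_next_run(last_run_end: Optional[float], now: float) -> float:
    """Return how long to sleep before the next run may start (0.0 means start now)."""
    if last_run_end is None:  # first run ever: nothing to wait for
        return 0.0
    elapsed = now - last_run_end
    return max(0.0, MIN_GAP_SECONDS - elapsed)


def run_forever(run_once) -> None:
    """Call run_once() repeatedly from ONE process, pausing between runs.

    Assumption: the gap is measured from the end of the previous run.
    """
    last_run_end = None
    while True:
        time.sleep(seconds_until_next_run(last_run_end, time.time()))
        run_once()
        last_run_end = time.time()
```

Running the loop in a single process with a single credential set also keeps the integration within the one-process-per-credential-set rule.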

Configuration

Each customer using this integration submits the following properties to the intel source provider:

Property

Description

trustar_service_user_api_key
trustar_service_user_api_secret

The Splunk Intelligence Management API key and API secret for a Splunk Intelligence Management service account.

Recommendations:

Create a separate Splunk Intelligence Management account for this integration.
Restrict the service account's enclave access to view for source Enclaves and submit or full for the destination Enclave.

source_enclaves

One or more Enclave IDs that contain data the app submits to Splunk Intelligence Management. This data can be logs, cases, emails, or whatever source data your organization generates that you want to use.

Suggestions for Enclave names

Splunk ES Threat Activity Notable Events enclave

IBM QRadar Offenses enclave
ServiceNow Cases enclave
IBM Resilient Tickets enclave
Jira Cases enclave
Phishing Emails enclave

destination_enclave

A single Enclave ID where the app will store enriched data (Reports and/or Indicators).

Descriptive client_type, client_version, and client_metatag are required. API credentials with activity that does not show meaningful, descriptive values in those fields will be deactivated.
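
One way to hold these properties in code is a small immutable record with a local sanity check. This is a sketch only: the class name IntegrationConfig and the problems method are hypothetical, and the check covers only the shape of the values. Confirming actual enclave access still requires calls to the platform.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IntegrationConfig:
    """Properties from the Configuration section (field names follow the guide)."""
    trustar_service_user_api_key: str
    trustar_service_user_api_secret: str
    source_enclaves: List[str]
    destination_enclave: str
    client_type: str
    client_version: str
    client_metatag: str

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means the values look usable."""
        found = []
        required_text = (
            "trustar_service_user_api_key", "trustar_service_user_api_secret",
            "destination_enclave", "client_type", "client_version", "client_metatag",
        )
        for name in required_text:
            if not getattr(self, name).strip():
                found.append(f"{name} must not be empty")
        if not self.source_enclaves:
            found.append("source_enclaves must list at least one Enclave ID")
        return found
```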

Pseudocode

Load configs + checkpoint from persistent storage.
Validate configs (proper access to necessary enclaves)
Check the source enclaves for new reports using explicit from/to times (endpoint: search reports). The search reports endpoint returns reports ordered from newest to oldest by "lastUpdated" timestamp.
Splunk Intelligence Management is an asynchronous platform. The "to time" should always be 15 minutes before the present moment. If the "to time" is closer to the present than 15 minutes, you risk missing data.
Aggregate all the report-shells into a list/array.
Re-order the array from oldest to newest based on the report's "lastUpdated" timestamp (easier for checkpointing)
declare consecutive_failures = 0
For each report shell (oldest to newest):
- Get the observables Splunk Intelligence Management found in the report (endpoint: get indicators for report).
- For all observables in the report:
  - Extract enrichment about the observable from all of the Intel Source's endpoints appropriate for that observable type.
  - Transform the endpoints' enrichment to a single dictionary (enrichment dict; see the report body enrichment dict format section for formatting).
  - Build the Splunk Intelligence Management report (API Docs, SDK Class) object, placing the enrichment dict in the Report's "body" attribute.
  - Upsert the Report (see the Report Upsert Algorithm section) to Splunk Intelligence Management.
  - Add attributes to the observable as Splunk Intelligence Management Observable Tags:
    - Transform the endpoints' enrichment to a Splunk Intelligence Management Indicator + Indicator Tags.
    - Submit the observable as an individual Splunk Intelligence Management Indicator with Tags (endpoint: submit indicators).
      The add tag to indicator endpoint only allows adding one tag per API call. submit indicators allows adding many tags in 1 API call.
- Update the persistent-storage checkpoint with the value of the report's "lastUpdated" attribute.
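
The time-window and ordering steps in the pseudocode above can be sketched as follows. The helper names search_window and oldest_first are hypothetical, and the timestamp key is passed in as a parameter because the exact field name in the API response is not shown in this guide.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple

SETTLE_DELAY = timedelta(minutes=15)  # "to time" must trail the present by 15 minutes


def search_window(checkpoint: datetime, now: datetime) -> Optional[Tuple[datetime, datetime]]:
    """Return (from_time, to_time) for the next search, or None if it is too early."""
    to_time = now - SETTLE_DELAY
    if to_time <= checkpoint:  # nothing old enough to be searched safely yet
        return None
    return checkpoint, to_time


def oldest_first(report_shells: List[dict], timestamp_key: str) -> List[dict]:
    """Re-order newest-first search results so checkpointing can advance monotonically."""
    return sorted(report_shells, key=lambda shell: shell[timestamp_key])


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(search_window(now - timedelta(hours=1), now))
```

Processing oldest to newest means that if the run stops partway through, the checkpoint still marks a point before which every report has been handled.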

Report Upsert Algorithm

Check to see if a report about this observable already exists (endpoint: get report status), using the external ID described below.
If the status is not UNKNOWN, SUBMISSION_SUCCESS, or SUBMISSION_FAILURE (for example, SUBMISSION_PROCESSING):
- re-check every 5 seconds until it is one of those 3 statuses. Time this wait loop out after 5 minutes. [case: Splunk Intelligence Management is down for maintenance]
If UNKNOWN: submit new report (endpoint: submit report)
elif SUBMISSION_SUCCESS: update (overwrite the old one completely with the new) (endpoint: update report)
elif SUBMISSION_FAILURE:
- try to get full report (endpoint: get report details) (using external ID)
- if 200: update the report (endpoint: update report) [the report was once submitted & processed successfully, but most-recent update attempt failed]
- if 404: submit the report as new report (endpoint: submit report) [all attempts to submit this report have failed]
Wait 5 sec before checking status (endpoint: get report status)
Check status + wait until submit / update is finished processing (either SUBMISSION_SUCCESS or SUBMISSION_FAILURE) (5 minute timeout)
- if timeout: Raise exception, terminate application. [Splunk Intelligence Management is down for maintenance]
if SUBMISSION_SUCCESS:
- consecutive_failures = 0
- log, continue.
elif SUBMISSION_FAILURE:
- if failure reason == too many observables or > 2MB:
  - shrink or remove the "related_observables" k/v pair.
  - upsert again. (careful, protect against infinite recursion.)
  - wait 5 sec.
  - check status, wait until SUBMISSION_SUCCESS or SUBMISSION_FAILURE.
  - if SUBMISSION_FAILURE:
    - log it.
    - transform the Splunk Intelligence Management Report object to a JSON string.
    - dump the string to a file for human review later (bucket storage is ideal, so an unattended script doesn't fill up a host's hard drive).
    - consecutive_failures += 1
    - if consecutive_failures >= NUMBER_OF_CONSECUTIVE_PROCESSING_FAILURES_THAT_YOU_THINK_WARRANTS_HUMAN_INTERVENTION:
      - send your engineering team a Slack message.
      - automatically create a Jira ticket for the engineering team.
      - raise Exception, terminate the process.
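
The polling and branching parts of the algorithm above might look like the sketch below. The status strings come from this guide; everything else (wait_for_status, choose_action, and the injected get_status and report_exists callables that stand in for the get report status and get report details endpoints) is hypothetical and does not represent a real SDK interface.

```python
import time
from typing import Callable, Set

FINISHED = {"SUBMISSION_SUCCESS", "SUBMISSION_FAILURE"}


def wait_for_status(get_status: Callable[[], str], wanted: Set[str],
                    poll_seconds: float = 5.0, timeout_seconds: float = 300.0,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic) -> str:
    """Poll until the status is one of `wanted`; raise TimeoutError after the timeout."""
    deadline = clock() + timeout_seconds
    while True:
        status = get_status()
        if status in wanted:
            return status
        if clock() >= deadline:
            raise TimeoutError("status did not settle; the platform may be down for maintenance")
        sleep(poll_seconds)


def choose_action(status: str, report_exists: Callable[[], bool]) -> str:
    """Map a settled status to 'submit' or 'update', following the branches in the guide."""
    if status == "UNKNOWN":
        return "submit"
    if status == "SUBMISSION_SUCCESS":
        return "update"
    if status == "SUBMISSION_FAILURE":
        # 200 from get report details -> update; 404 -> submit as a new report
        return "update" if report_exists() else "submit"
    raise ValueError(f"status has not settled: {status}")
```

Passing sleep and clock as parameters keeps the loop testable without real waiting; production code would simply use the defaults.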

Splunk Intelligence Management report external ID format

Your integration needs to provide an External ID that uniquely identifies each report it submits to Splunk Intelligence Management. The External ID is what makes it possible for your app to update existing Splunk Intelligence Management reports on the platform.

Formatting Rules

The External ID must contain only characters valid in URL strings. This is because some Splunk Intelligence Management API endpoint calls will use the External ID as a URL string parameter.

Recommended algorithm:

Concatenate strings: observable value + destination enclave ID
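Base64-encode the result of the concatenation using the URL-safe alphabet. Standard Base64 can emit "+", "/", and "=", which are not safe in every part of a URL, so use the variant that substitutes "-" and "_".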
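
A minimal sketch of the recommended algorithm using Python's standard library. The function name make_external_id is hypothetical. The sketch uses the URL-safe Base64 alphabet and, as an additional assumption, strips the "=" padding because the ID is only ever used as an opaque key and never decoded.

```python
import base64


def make_external_id(observable_value: str, destination_enclave_id: str) -> str:
    """Concatenate observable value + destination enclave ID, then URL-safe Base64-encode."""
    raw = (observable_value + destination_enclave_id).encode("utf-8")
    encoded = base64.urlsafe_b64encode(raw).decode("ascii")
    return encoded.rstrip("=")  # assumption: padding is dropped; the ID is never decoded
```

The same inputs always produce the same ID, which is what lets a later run find and update the report it created earlier.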

Splunk Intelligence Management report body enrichment dict format

{ subject_observable:  { type: __________, 
                         value: _________,   
                         attribute_one:  _______, 
                         a_second_attribute: ______, 
                         a_third_attribute:  ______, 
                         ….etc….                       },
 
   related_observables:  [ { type: _____________,  
                             value:  ____________,
                             ……etc…..                 },   
                           { type: __________,  
                             value: _________, 
                             some_attribute:  _______,   
                             another_attribute: ______    } ]  }
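
The format above, together with the "shrink or remove related_observables" step of the upsert algorithm, can be sketched as follows. The helper names, the example type strings, and the reading of "2MB" as 2 * 1024 * 1024 bytes of UTF-8 JSON are all assumptions made for illustration.

```python
import json
from typing import Dict, List

MAX_BODY_BYTES = 2 * 1024 * 1024  # assumption: "2MB" measured on the UTF-8 JSON string


def build_enrichment(subject: Dict, related: List[Dict]) -> Dict:
    """Assemble the enrichment dict with the two top-level keys the guide describes."""
    return {"subject_observable": subject, "related_observables": list(related)}


def body_size(enrichment: Dict) -> int:
    return len(json.dumps(enrichment).encode("utf-8"))


def shrink_to_fit(enrichment: Dict, max_bytes: int = MAX_BODY_BYTES) -> Dict:
    """Halve related_observables until the body fits; drop the key if none remain."""
    trimmed = dict(enrichment)
    related = list(trimmed.get("related_observables", []))
    while related and body_size({**trimmed, "related_observables": related}) > max_bytes:
        related = related[: len(related) // 2]
    if related:
        trimmed["related_observables"] = related
    else:
        trimmed.pop("related_observables", None)
    return trimmed


if __name__ == "__main__":
    example = build_enrichment(
        {"type": "IP", "value": "1.2.3.4", "attribute_one": "example"},
        [{"type": "URL", "value": "http://example.com/a"}],
    )
    print(json.dumps(shrink_to_fit(example), indent=2))
```

Because each pass halves the list, the loop always terminates, which guards against the infinite recursion the upsert algorithm warns about.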
