Understanding Splunk UBA data cubes
A data cube in Splunk UBA is a database table containing aggregated events based on specific attributes. Batch models in Splunk UBA use the data aggregated in data cubes to generate content such as anomalies and threats.
In this example, Splunk UBA ingests events from the Splunk platform using the Splunk Direct and Splunk Raw Events connectors. See Use connectors to add data from the Splunk platform to Splunk UBA in Get Data into Splunk User Behavior Analytics for more information about the connectors and when to use each one. Splunk UBA recognizes each event and tags it as belonging to a specific view. For example, an event from data source DS1 is tagged with _AD because Splunk UBA recognizes it as Active Directory (AD) data.
Data cubes subscribe to one or more views and can track any number of attributes from those views. For example, if a view has 20 attributes, a data cube may track only 5 of the attributes of that view. In our example, we see that Data Cube 1 is tracking three attributes from the AD view and one attribute from the HTTP view.
Either the userId or the deviceId attribute must be tracked in order for Splunk UBA to generate anomalies. Whichever attribute you choose is called the default entity ID. Without one of them, anomalies are not raised even if the other attributes indicate otherwise.
Splunk UBA batch models use the data stored in a cube to generate content. Because each cube stores different types of data, a model must choose the appropriate cube in order to have the correct data for its content.
Example cube and descriptions
Consider the example data cube below. The cube tracks three attributes, which serve as the table columns: userId, processPath, and processName. These attributes are called Dimensions. A custom cube can have a maximum of six dimensions.
The cube can track unique combinations of these dimensions per hour, day, month, or year. A function can count the total number of occurrences for each combination per hour, day, month, or year. This function is called a Measure. A custom cube can have a maximum of three measures.
As events are ingested into Splunk UBA, the desired attributes from each event populate the rows of the cube based on the refresh interval. For example, the cube may be updated every day, as shown in the following example:
day | userId | processPath | processName | COUNT
----+--------+-------------+-------------+------
1   | user1  | /path1      | exe1        | 753
1   | user2  | /path2      | exe2        | 753
2   | user1  | /path1      | exe1        | 303
2   | user2  | /path2      | exe2        | 300
2   | user1  | /path2      | exe1        | 3
3   | user1  | /path1      | exe1        | 441
3   | user2  | /path2      | exe2        | 450
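The way dimensions and a COUNT measure populate the rows of a cube can be sketched in a few lines of Python. This is only an illustration of the aggregation idea, not how Splunk UBA implements cubes: the `build_cube` function, the event dictionaries, and the `MAX_DIMENSIONS` constant are hypothetical names, and the limit of six comes from the custom cube limit stated above.

```python
from collections import Counter

# Dimension names taken from the example cube; the limit of six
# dimensions is the custom cube limit stated in the text.
DIMENSIONS = ("userId", "processPath", "processName")
MAX_DIMENSIONS = 6


def build_cube(events, dimensions=DIMENSIONS):
    """Count events per (day, dimension values) combination.

    Hypothetical helper: each event is a dict with a "day" key plus
    one key per dimension. The returned Counter plays the role of
    the COUNT measure.
    """
    if len(dimensions) > MAX_DIMENSIONS:
        raise ValueError("a custom cube can have at most six dimensions")
    cube = Counter()
    for event in events:
        key = (event["day"],) + tuple(event[d] for d in dimensions)
        cube[key] += 1
    return cube


# Four made-up events; two of them share the same combination.
events = [
    {"day": 1, "userId": "user1", "processPath": "/path1", "processName": "exe1"},
    {"day": 1, "userId": "user1", "processPath": "/path1", "processName": "exe1"},
    {"day": 1, "userId": "user2", "processPath": "/path2", "processName": "exe2"},
    {"day": 2, "userId": "user1", "processPath": "/path2", "processName": "exe1"},
]
cube = build_cube(events)
```

Each distinct combination of day and dimension values becomes one row, and repeated events only increase that row's count.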
In this example, we can see the following behavior over three days of data:
- It is normal for user1 to run exe1 from /path1.
- It is normal for user2 to run exe2 from /path2.
On day 2, the event where user1 runs exe1 from /path2 is not normal behavior and is considered rare, so this event may cause an anomaly to be raised, depending on the threshold configured in the model that consumes data from this cube. This particular pattern occurred 3 times out of roughly 3000 events. If the model sets the rarity threshold to 1 out of 1000, then an anomaly is raised for user1.
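The rarity arithmetic can be checked with a short Python sketch over the rows of the example cube. The `rare_patterns` function is hypothetical, and two details are assumptions rather than documented Splunk UBA behavior: the denominator is the total COUNT across the whole cube (3003 here, which the text rounds to 3000), and a pattern counts as rare when its share is at or below the threshold.

```python
from collections import Counter
from fractions import Fraction

# Rows of the example cube: (day, userId, processPath, processName, COUNT)
ROWS = [
    (1, "user1", "/path1", "exe1", 753),
    (1, "user2", "/path2", "exe2", 753),
    (2, "user1", "/path1", "exe1", 303),
    (2, "user2", "/path2", "exe2", 300),
    (2, "user1", "/path2", "exe1", 3),
    (3, "user1", "/path1", "exe1", 441),
    (3, "user2", "/path2", "exe2", 450),
]


def rare_patterns(rows, threshold=Fraction(1, 1000)):
    """Return patterns whose share of all events is at or below threshold.

    Assumption: rarity is measured against the total count in the cube.
    A real model may define rarity differently.
    """
    totals = Counter()
    for _day, user, path, name, count in rows:
        totals[(user, path, name)] += count
    grand_total = sum(totals.values())
    return {
        pattern: Fraction(count, grand_total)
        for pattern, count in totals.items()
        if Fraction(count, grand_total) <= threshold
    }


rare = rare_patterns(ROWS)
```

With these rows, only the combination of user1, /path2, and exe1 falls at or below the 1 in 1000 threshold, with a share of 3 in 3003.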
Tracking more attributes can improve content generation because Splunk UBA can analyze events for more specific deviations or malicious behavior. Let's add an attribute to the cube so that deviceId is tracked along with the other attributes.
day | userId | deviceId | processPath | processName | COUNT
----+--------+----------+-------------+-------------+------
1   | user1  | device1  | /path1      | exe1        | 753
1   | user2  | device2  | /path2      | exe2        | 753
2   | user1  | device1  | /path1      | exe1        | 303
2   | user2  | device2  | /path2      | exe2        | 300
2   | user1  | device1  | /path2      | exe1        | 1
2   | user1  | device2  | /path2      | exe1        | 1
2   | user1  | device3  | /path2      | exe1        | 1
3   | user1  | device1  | /path1      | exe1        | 441
3   | user2  | device2  | /path2      | exe2        | 450
Using the same data with the additional deviceId dimension, we see the following behavior patterns:
- It is normal for user1 to run exe1 from /path1 on device1.
- It is normal for user2 to run exe2 from /path2 on device2.
On day 2, there are now three separate entries for user1 running exe1 from /path2, on three different devices. In this case, three separate anomalies are raised.
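To see how the extra dimension splits one rare row into three, the following Python sketch takes the day 2 rows of the second cube and either rolls them up over deviceId or lists the devices behind a pattern. The function names are hypothetical and the code only illustrates the effect of adding a dimension, not the Splunk UBA implementation.

```python
from collections import defaultdict

# Day 2 rows of the second cube:
# (userId, deviceId, processPath, processName, COUNT)
DAY2_ROWS = [
    ("user1", "device1", "/path1", "exe1", 303),
    ("user2", "device2", "/path2", "exe2", 300),
    ("user1", "device1", "/path2", "exe1", 1),
    ("user1", "device2", "/path2", "exe1", 1),
    ("user1", "device3", "/path2", "exe1", 1),
]


def roll_up_devices(rows):
    """Drop the deviceId dimension by summing counts over it."""
    totals = defaultdict(int)
    for user, _device, path, name, count in rows:
        totals[(user, path, name)] += count
    return dict(totals)


def devices_for(rows, user, path, name):
    """List the devices on which a given pattern was observed."""
    return sorted(d for u, d, p, n, _c in rows if (u, p, n) == (user, path, name))


rolled_up = roll_up_devices(DAY2_ROWS)
devices = devices_for(DAY2_ROWS, "user1", "/path2", "exe1")
```

Rolling up recovers the single row with a count of 3 from the first cube, while the device listing shows the three distinct rows that the four-dimension cube stores instead.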
By default, Splunk UBA data cubes retain 30 days' worth of data. You can configure a longer retention period if desired, provided you allocate enough disk space to store the data.
Examine existing cubes to get more information about Splunk UBA data views
Each attribute in a view has a unique attribute key associated with it. Splunk UBA uses attribute keys to extract the corresponding value of an attribute from the data event's view or from the data event directly.
You must provide specific attribute keys when you create a custom cube, so examining the content of existing cubes is a necessary prerequisite to building custom cubes. Use the following syntax formats to specify attribute keys:
|Syntax||Description and example|
|view.<viewname>.<object/method>||Use this format to extract the value of the |
|view.*.<object/method>||Use this format to extract the value of the |
|event.<objectname>||Use this format to extract the value of the |
|event.attribute#<attributename>||Use this format to extract the value of the |
Use the view formats to aggregate the values from the following types of events:
- Events from CIM-compliant data sources ingested using Splunk Direct. See Add CIM-compliant data from the Splunk platform to Splunk UBA in Get Data into Splunk User Behavior Analytics.
- Events from data sources ingested using Splunk Raw Events and Splunk UBA's native parsers. See Add raw events from the Splunk platform to Splunk UBA in Get Data into Splunk User Behavior Analytics.
Use the event formats to aggregate the values from events whose data sources are configured as generic data types (uba_source_type="generic"). See Add custom data to Splunk UBA using the generic data source in Get Data into Splunk User Behavior Analytics.
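The four syntax formats can be told apart mechanically. The Python sketch below classifies an attribute key string by the format it follows. It is a hypothetical helper for reasoning about keys, not part of Splunk UBA: the labels and the `classify_key` function are made up, and the regular expressions assume that view names contain no dots.

```python
import re

# One pattern per syntax format from the table. Order matters:
# the more specific patterns are tried first.
KEY_FORMATS = [
    ("view_any", re.compile(r"^view\.\*\.(?P<name>.+)$")),
    ("view_named", re.compile(r"^view\.(?P<view>[^.*]+)\.(?P<name>.+)$")),
    ("event_attribute", re.compile(r"^event\.attribute#(?P<name>.+)$")),
    ("event_object", re.compile(r"^event\.(?P<name>.+)$")),
]


def classify_key(key):
    """Return (format label, captured parts) for an attribute key."""
    for label, pattern in KEY_FORMATS:
        match = pattern.match(key)
        if match:
            return label, match.groupdict()
    raise ValueError(f"not a recognized attribute key format: {key!r}")


named = classify_key("view.ad.eventId")
wildcard = classify_key("view.*.source")
generic = classify_key("event.attribute#exampleDSfield")
```

The key `event.attribute#exampleDSfield` is built here by substituting a field name into the `event.attribute#<attributename>` format, so treat it as an illustrative value.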
For example, examine a raw event from a Windows event log in multiline format:
09/12/2019 04:42:00 PM
LogName=Security
SourceName=Microsoft Windows security auditing.
EventCode=4672
EventType=0
Type=Information
ComputerName=acmel-lpt0399177
TaskCategory=Special Logon
OpCode=Info
RecordNumber=4660700202
Keywords=Audit Success
Message=Special privileges assigned to new logon.
Subject:
    Security ID: acme\carbanak
    Account Name: carbanak
    Account Domain: acme
    Logon ID: 0x95455c63
Privileges: SeTcbPrivilege
    SeDebugPrivilege
    SeLoadDriverPrivilege
    SeSecurityPrivilege
    SeBackupPrivilege
To extract the value of the EventCode field, store it in the cube, and include the value in Splunk UBA content, begin by examining the windows events cube in Splunk UBA:
- Make sure you are logged in to Splunk UBA as a user with content developer privileges.
- Select System > Cubes.
- In the URL, add ?system immediately following the host name or IP address. For example:
- Select the windowsevents cube to view its details.
At the top of the page, there is an attribute with the name eventId, and its attribute key is view.ad.eventId. This is the attribute key that you must provide when you create a new cube and want to store the value of this field in the cube. Note that the names do not always coincide: the field is named EventCode in the raw event but is transformed to eventId in Splunk UBA.
Store generic events in Splunk UBA data cubes
Generic events are events from data sources for which Splunk UBA does not have any existing parsing logic. For example, you can ingest credit card transaction logs that are not CIM-compliant and that Splunk UBA does not parse by default. Events from this type of data source can be ingested into Splunk UBA as generic events and aggregated in data cubes. Splunk UBA can generate anomalies based on generic events stored in the data cubes as long as either userId or deviceId is also present in the cube.
Suppose you have a data source called exampleDS and the field you want to store in the cube is called exampleDSfield. Use the following syntax to store the value of this field in the cube:
Use the view. methods described in Examine existing cubes to get more information about Splunk UBA data views only when Splunk UBA can map the attribute to one of the entities defined in the following table. These attributes contain identity resolution logic used for generating entities in Splunk UBA.
|Entity you want to track in Splunk UBA||Your event contains this field||Mapped entity in Splunk UBA||What can be extracted by Splunk UBA|
For example, the parsed token view.*.source populates the source field in the cube with the device name as one of the dimensions. You can also extract additional device information such as the device ID, type, scope, and port.
Because the exampleDSfield attribute is not listed in the table, Splunk UBA can't map it to an entity. You must use the event.attribute format to store the value of exampleDSfield in the cube.
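The choice between the two kinds of format can be written as a small decision rule. The Python sketch below is a hypothetical helper, not a Splunk UBA API: `attribute_key_for` and `MAPPED_FIELDS` are made-up names, and the set of mapped fields contains only `source` because that is the only mapped field named in this section. Consult the table of mapped entities for the full list.

```python
# Assumption: fields that Splunk UBA can map to an entity. Only
# "source" is named in this section; the real list is in the table
# of mapped entities.
MAPPED_FIELDS = {"source"}


def attribute_key_for(field, mapped_fields=MAPPED_FIELDS):
    """Pick an attribute key format for a field of a generic event.

    Mapped fields use the view.*.<object/method> format; every other
    field falls back to event.attribute#<attributename>.
    """
    if field in mapped_fields:
        return f"view.*.{field}"
    return f"event.attribute#{field}"


mapped_key = attribute_key_for("source")
generic_key = attribute_key_for("exampleDSfield")
```

Under these assumptions, `source` resolves to `view.*.source`, and `exampleDSfield` resolves to a key in the event.attribute format.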
This documentation applies to the following versions of Splunk® User Behavior Analytics: 5.0.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 184.108.40.206, 5.0.5, 220.127.116.11