Splunk® User Behavior Analytics

Get Data into Splunk User Behavior Analytics


Get HR data into Splunk UBA

Add HR data to Splunk UBA by performing the following tasks:

  1. Verify that the HR data filter is enabled
  2. Set the HR data cache capacity
  3. Prepare the HR data in Active Directory
  4. Gather HR data from Active Directory
  5. Use the HR data to classify account types and user accounts
  6. Add HR data from Splunk Enterprise

As an option, you can add HR data from a CSV file to Splunk UBA for testing and validation. See Add HR data from a CSV file.

Verify that the HR data filter is enabled

Before adding HR data to Splunk UBA, verify that the HR data filter is enabled.

Do not disable this filter unless directed to do so by Splunk Customer Support.

Set the HR data cache capacity

For each HR data account, Splunk UBA uses loginId, domainLoginId, and email to perform identity resolution and stores an entry for each unique value in the HR cache. Set the value of identity.resolution.hrcache.capacity to three times the number of HR accounts being monitored by Splunk UBA to avoid potential performance issues. For example, if your Splunk UBA deployment is monitoring 250,000 accounts, set this property to 750,000:

  1. Log in to the Splunk UBA management node as the caspida user.
  2. Add or edit the identity.resolution.hrcache.capacity property in the /etc/caspida/local/conf/uba-site.properties file:
    identity.resolution.hrcache.capacity = 750000
  3. Save and exit the file.
  4. In distributed deployments, synchronize the cluster:
    /opt/caspida/bin/Caspida sync-cluster /etc/caspida/local/conf
  5. Run the following command to verify that the number of lookup keys is less than the cache capacity:
    /opt/caspida/bin/irscan -H -m


    Below is a sample output for this command:

    Number of lookup keys     :       0
    Cache capacity            :  300000

    Account Lookup attributes : [domainLoginId, loginId, email]
        (partial lookup)      : []
    User Lookup attributes    : []
        (partial lookup)      : []

The default value of identity.resolution.hrcache.capacity is 300,000.
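
The sizing rule above can be sketched in a few lines of Python. This only illustrates the arithmetic, assuming three lookup keys (loginId, domainLoginId, and email) per account as described in this section. The function names are hypothetical and are not part of Splunk UBA.

```python
# Illustrative sketch only: these helpers are hypothetical, not part of Splunk UBA.

KEYS_PER_ACCOUNT = 3  # loginId, domainLoginId, and email


def recommended_hr_cache_capacity(account_count: int) -> int:
    """Return three times the number of monitored HR accounts."""
    if account_count < 0:
        raise ValueError("account_count must be non-negative")
    return account_count * KEYS_PER_ACCOUNT


def cache_has_headroom(lookup_keys: int, capacity: int) -> bool:
    """True when the number of lookup keys is below the cache capacity."""
    return lookup_keys < capacity


capacity = recommended_hr_cache_capacity(250_000)
print(f"identity.resolution.hrcache.capacity = {capacity}")
```

The second helper mirrors the check in step 5: compare the "Number of lookup keys" value reported by irscan against the "Cache capacity" value.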

Prepare the HR data in Active Directory

Ensure that existing Active Directory (AD) data is as complete and consistent as possible.

Splunk UBA dashboards provide rich reporting and filtering capabilities based on user and account attributes, such as a user's department, location, or status. Properly configured HR data is critical in making sure this information is available for analysts and hunters.

Fill out as many fixed fields as possible, such as a user's first and last name, department, and location. This information is used for reporting and incident review. A user's manager is also useful information in situations where review and escalation are required.

Variable fields such as employeeID or sAMAccountName are used to link multiple accounts together to a single unique ID. Provide as much of this information as possible. See Why Splunk UBA requires HR data.

Comprehensive Active Directory data helps Splunk UBA provide the most accurate and meaningful results when exploring events in your environment.

Gather HR data from Active Directory

Pull the HR data from AD into Splunk Enterprise and save the data to a lookup file.

Use SPL to obtain the HR data

On the Splunk search head, run the following search to obtain HR data from Active Directory, format the results into a table, and save it to a lookup file:

| ldapsearch domain=default search="(&(objectCategory=person)(objectClass=user)(sAMAccountName=*))" attrs="accountExpires,c,company,dc,department,displayName,distinguishedName,employeeID,employeeNumber,givenName,initials,l,mail,manager,memberOf,mobile,postalCode,sAMAccountName,sn,st,status,streetAddress,telephoneNumber,title,userAccountControl,whenChanged,whenCreated"
| table accountExpires,c,company,dc,department,displayName,distinguishedName,employeeID,employeeNumber,givenName,initials,l,mail,manager,memberOf,mobile,postalCode,sAMAccountName,sn,st,status,streetAddress,telephoneNumber,title,userAccountControl,whenChanged,whenCreated
| outputlookup mycompany_domain.csv

Replace domain=default with the short domain name (not the FQDN) from the SA-ldapsearch configuration on the search head.

The outputlookup command creates the lookup file.

The table command formats the results so that they can be recognized by Splunk UBA. The fields listed in this command are used by the data models in Splunk UBA to generate anomalies and threats. If you get your HR data from a source other than Active Directory or from a CSV file, you must format the results to match this table command.

Splunk UBA already maps certain fields by default, as shown in the following table. To see all available field names, open /opt/caspida/conf/attribution/User.json and /opt/caspida/conf/attribution/Account.json.

See Add custom attributes to your HR data for information about adding your own HR data fields.

UBA field name | AD/LDAP field name | Description | Required | Example
AD Groups | memberOf, groups | List of AD groups that the user is a member of. If there are no groups, leave the value blank. | No | pony_instructors, pony_riders
City | l, city | City (location) of the user. | No | San Francisco
Country | co, country | Country code of the user. | No | USA
Departing User | departingUser | Whether or not the user has decided to leave the company. | No | true, false
Display Name | displayName | The user's full name or a service account name. If this field is empty, the display name is created by using the values in the first name, middle name, and last name fields. | No | Shruti Michelle Buttercup
Domain and LoginId | domainLoginId | The user's domain + login ID. Supported formats: adDomain\loginId; adDomain\\loginId; adDomain/loginId; loginId\adDomain; loginId@dnsDomain; adDomain\loginId@dnsDomain | No | domain1/smbuttercup
Email Address | mail, userPrincipalName, email | User's email address. In some cases, you may find this stored in the userPrincipalName field. | Yes | smbuttercup@example.com
Employee Type | employeeType, userType | The type of employee. | No | Contractor
Expiration Time | accountExpires | Valid formats: Windows FileTime; yyyy-MM-dd'T'HH:mm:ss; %Y-%m-%dT%H:%M:%S.%QZ; MM/dd/yyyy; yyyyMMddHHmmss.S'Z'; yyyyMMdd | No | 07/09/2014
First Name | preferredName, givenName, firstname | The user's first name. This value is used to compute the display name field if the display name field is empty. | Yes | Shruti
High Risk User | highRiskUser | Whether or not the user is identified as a high risk user, such as an executive. | No | true, false
Hire Date | hireDate | Date the user was hired. Valid formats: MM/dd/yyyy; yyyyMMddHHmmss.S'Z'; yyyyMMdd | No | 07/09/2014
Last Logon | lastLogon | Last time the user logged on. Valid formats: Windows FileTime; yyyy-MM-dd'T'HH:mm:ss; %Y-%m-%dT%H:%M:%S.%QZ; MM/dd/yyyy; yyyyMMddHHmmss.S'Z'; yyyyMMdd | No | 07/09/2014
Last Logon Timestamp | lastLogonTimestamp | Valid formats: Windows FileTime; yyyy-MM-dd'T'HH:mm:ss; %Y-%m-%dT%H:%M:%S.%QZ; MM/dd/yyyy; yyyyMMddHHmmss.S'Z'; yyyyMMdd | No | 07/09/2014
Last Name | sn, lastname | The user's last name. This value is used to compute the display name field if the display name field is empty. | Yes | Buttercup
Login Id | sAMAccountName, loginId | Login ID or username of an account associated with the user. | Yes | smbuttercup
Manager | manager, manageremployeeId | Name or ID of the user's manager. | No | Charlotte Arachnia
Middle Name | initials, MiddleName | The user's middle name. This value is used to compute the display name field if the display name field is empty. | Yes | Michelle
On Performance Improvement Plan | onPerformanceImprovementPlan, onPIP | Whether or not the user is on a performance improvement plan. | No | true, false
OU | department, ou | Organizational unit (department) or business unit of the user. | Yes | Pony Instructors
Phone | telephoneNumber, phone | Phone number of the user. | No | 123-456-7890
State | st, state | State where the user resides. | No | CA
Termination Date | terminationDate | The user's last day of employment. Valid formats: MM/dd/yyyy; yyyyMMddHHmmss.S'Z'; yyyyMMdd | No | 07/10/2014
Title | title | The user's title. | No | Senior pony instructor
Traveling | traveling, travelling | Whether or not the user is traveling. | No | true, false
User Account Control Code | userAccountControl, UAC | User account control code from AD. Use UAC when the value in your HR data is an ENUM value such as NORMAL_ACCOUNT. If a UAC value is not available, Splunk UBA calculates the UAC using the value of the userAccountControl, such as 512 for a NORMAL_ACCOUNT. | No | 66050, ACCOUNT_DISABLED
Status | status | Active or inactive status of the user from the HR system. | No | Active/InActive
ZIP Code | postalCode, zip | ZIP code of the user. | No | 94107
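
Two of the encodings in the table are easier to follow with a worked example: the Windows FileTime format and the numeric userAccountControl value. The following Python sketch shows how each decodes, using the Active Directory definitions (FileTime counts 100-nanosecond intervals since January 1, 1601 UTC; userAccountControl is a bit field). It is only an illustration for checking values in your HR data by hand. The function names are hypothetical, the flag list is a small subset, and none of this represents how Splunk UBA parses these fields internally.

```python
from datetime import datetime, timedelta, timezone

# Windows FileTime counts 100-nanosecond intervals since 1601-01-01 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

# In Active Directory, these accountExpires values mean "never expires".
NEVER_EXPIRES = (0, 0x7FFFFFFFFFFFFFFF)

# A small subset of the userAccountControl bit flags defined by Active Directory.
UAC_FLAGS = {
    0x0002: "ACCOUNTDISABLE",
    0x0200: "NORMAL_ACCOUNT",
    0x1000: "WORKSTATION_TRUST_ACCOUNT",
    0x10000: "DONT_EXPIRE_PASSWORD",
}


def filetime_to_datetime(filetime: int):
    """Convert a FileTime integer to a UTC datetime, or None for "never"."""
    if filetime in NEVER_EXPIRES:
        return None
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def decode_uac(value: int) -> list:
    """Return the names of the known flags that are set in a numeric UAC value."""
    return [name for bit, name in sorted(UAC_FLAGS.items()) if value & bit]


print(decode_uac(512))
print(decode_uac(66050))
print(filetime_to_datetime(116444736000000000))
```

The value 66050 from the Example column decodes to a disabled normal account whose password does not expire, which is why it appears alongside ACCOUNT_DISABLED.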

Scheduling LDAP searches to obtain HR data from Splunk Enterprise

The LDAP search to obtain HR data can affect performance on the search head. After the lookup file is created, use it to write an SPL search that identifies and classifies all accounts in your HR data and associates a single user with each of them. Do not attempt to do this using live searches on Splunk Enterprise.

In environments with multiple AD domains, make a separate search and lookup file for each domain. Use multiple LDAP searches scheduled at different times to help reduce the load on the search heads.

Splunk UBA refreshes HR data each morning at 2:00 AM.

  • Schedule your daily LDAP searches to complete before 2:00 AM, but also at a time when the search will not compete for critical system resources.
  • Schedule your LDAP search to time out after 7 days. Some LDAP searches can take several hours to complete, so setting a large timeout window ensures that the search is able to complete.

Use the HR data to classify account types and user accounts

Perform the following tasks to write and modify a search to identify and classify all account types and user accounts. Each section may contain example commands, or commands you must cut and paste without modifications.

  1. Import the CSV file containing the HR data
  2. Exclude system accounts from your HR data
  3. Perform transformations required by Splunk UBA
  4. Find and classify all account types
  5. Determine user status to properly identify terminated accounts
  6. Identify domain and login ID
  7. Create a proper display name for each user
  8. Link related accounts to a single user

See Example SPL to get HR data into Splunk UBA for a complete search example.

Import the CSV containing the HR data

Begin by importing the CSV file containing the HR data. For example, use the following command to import a CSV file called mycompany_domain.csv:

| inputlookup mycompany_domain.csv

If you have multiple domains, add the lookup file for each additional domain by using inputlookup with append=t. For example, the following command imports a CSV file called mycompany_domain_1.csv and appends the CSV file mycompany_domain_2.csv:

| inputlookup mycompany_domain_1.csv | inputlookup append=t mycompany_domain_2.csv

Exclude system accounts from your HR data

Remove system accounts to reduce your licensing requirements and load on Splunk UBA. Cut and paste either of the following commands without making any changes:

| search sAMAccountName!="*$" sAMAccountName!="$*"

| search userAccountControl!=*WORKSTATION_TRUST_ACCOUNT*
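
To make the effect of the first filter concrete, here is a minimal Python sketch of the same wildcard logic: an account is dropped when its sAMAccountName starts or ends with a dollar sign. The function name and the sample account names are hypothetical, chosen only for illustration.

```python
def is_system_account(sam_account_name: str) -> bool:
    """Mirror sAMAccountName!="*$" sAMAccountName!="$*": flag names that
    start or end with a dollar sign."""
    return sam_account_name.startswith("$") or sam_account_name.endswith("$")


# Hypothetical sample data.
accounts = ["smbuttercup", "WORKSTATION01$", "$legacy", "99smbuttercup"]
kept = [name for name in accounts if not is_system_account(name)]
print(kept)
```

Only the two human-owned accounts remain in the result; the two names containing a leading or trailing dollar sign are excluded.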

Perform transformations required by Splunk UBA

The following replace, rename, and rex commands are required for transforming some data to a format recognized by Splunk UBA. Cut and paste all of the following commands and do not change or remove any commands.

| replace "NULL" with ""
| rename c as co
| rename userAccountControl as UAC
| rex field=manager "CN=(?<manager>.*?),OU="
| rex mode=sed field=manager "s/\\\//g"
| rex max_match=0 field=memberOf "CN=(?<memberOf>.*?),OU="
| eval memberOf=mvjoin(memberOf,",")
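
The rex commands are the least obvious part of these transformations. The following Python sketch reproduces what they appear to do, under two assumptions: that the sed expression is intended to strip backslash escapes from the manager name, and that memberOf arrives as a list of distinguished names. The helper names and sample values are hypothetical.

```python
import re

# Same idea as the rex pattern "CN=(?<field>.*?),OU=": capture the common
# name that precedes the first ",OU=" in a distinguished name.
CN_BEFORE_OU = re.compile(r"CN=(.*?),OU=")


def clean_manager(manager_dn: str) -> str:
    """Reduce a manager DN to the common name and drop backslash escapes."""
    match = CN_BEFORE_OU.search(manager_dn)
    name = match.group(1) if match else manager_dn
    return name.replace("\\", "")


def group_names(member_of: list) -> str:
    """Extract the CN of every group DN and join them with commas."""
    names = []
    for dn in member_of:
        names.extend(CN_BEFORE_OU.findall(dn))
    return ",".join(names)


# Hypothetical sample values.
manager = clean_manager("CN=Arachnia\\, Charlotte,OU=Staff,DC=corp,DC=example")
groups = group_names([
    "CN=pony_instructors,OU=Groups,DC=corp,DC=example",
    "CN=pony_riders,OU=Groups,DC=corp,DC=example",
])
print(manager)
print(groups)
```

Note that the pattern only matches when the common name is followed directly by an OU component, which is also true of the SPL shown above.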

Find and classify all account types

Use the match function inside an eval case statement as needed to classify all of the account types in hrAccountType. By default, Normal, Service, and Admin accounts are defined in Splunk UBA, but you may have additional account types in your environment.

The following example identifies all accounts beginning with 99 followed by a zero or a lowercase letter as Admin accounts, and classifies every other account as Normal:

| eval hrAccountType=case( (match(lower(sAMAccountName),"^99[0a-z].*")),"Admin", (1==1), "Normal" )

Examine the remaining accounts to see if there are additional Admin accounts not categorized with this command. Write additional match statements as needed until all Admin accounts are accounted for. Then, write commands to find all the service accounts. The following example contains the previous search for Admin accounts, along with additional commands to identify Service accounts:

| eval hrAccountType=case(
    (match(lower(sAMAccountName),"^99[0a-z].*")),"Admin",
    (match(lower(distinguishedName),".*ou=.*service accounts.*")), "Service",
    (match(lower(distinguishedName),".*cn=managed service accounts.*")), "Service",
    (1==1), "Normal" )

You can have as many match statements as needed to classify all accounts in your system. For example, your normal email convention may be <first initial><last name>@mycompany.com but due to a conflict, you have some users who are <first name><last initial>@mycompany.com. This situation requires separate match statements to properly identify all accounts.

You may even need to have a single match statement to identify one single account.

Continue with this process until only human accounts remain classified as Normal accounts.
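
The classification logic is an ordered list of rules in which the first match wins and everything left over is Normal. As a rough illustration, the following Python sketch applies the same three patterns used in the examples above. The rule table layout, the function name, and the sample accounts are assumptions made for this sketch.

```python
import re

# (field, pattern, account type), evaluated in order like the case() arguments.
RULES = [
    ("sAMAccountName", r"^99[0a-z].*", "Admin"),
    ("distinguishedName", r".*ou=.*service accounts.*", "Service"),
    ("distinguishedName", r".*cn=managed service accounts.*", "Service"),
]


def classify_account(account: dict) -> str:
    """Return the type of the first matching rule, or "Normal" by default."""
    for field, pattern, account_type in RULES:
        value = account.get(field, "").lower()
        if re.search(pattern, value):
            return account_type
    return "Normal"


# Hypothetical sample accounts.
admin = {"sAMAccountName": "99SMButtercup",
         "distinguishedName": "CN=99SMButtercup,OU=Admins,DC=corp"}
service = {"sAMAccountName": "svc_backup",
           "distinguishedName": "CN=svc_backup,OU=Service Accounts,DC=corp"}
person = {"sAMAccountName": "smbuttercup",
          "distinguishedName": "CN=Shruti Buttercup,OU=Staff,DC=corp"}

for account in (admin, service, person):
    print(account["sAMAccountName"], classify_account(account))
```

Running rules against a sample of your own accounts in this way can help reveal which accounts still fall through to Normal before you finalize the SPL.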

Determine user status to properly identify terminated accounts

Splunk UBA uses the status property to determine whether a human user is active or inactive. Any user who no longer has an employment relationship with the organization is considered terminated and has a status of inactive. Splunk UBA detections can raise anomalies for activities involving inactive users, such as scripts or automated tasks that are run with the inactive user's credentials.

Any user employed or contracted by the organization is considered active, regardless of how often they are in the office, their physical location, or whether they are traveling or on any type of extended leave.

Do not confuse the status of a human user with userAccountControl or UAC, which reflect the status of the user's accounts. The UAC for Admin, Service or System account types should not be used to set a user's status because these account types are not managed by HR systems. The IT departments of many organizations set the userAccountControl of a user's account to ACCOUNTDISABLE in cases where the employee is not permanently terminated and still has an employment contract with the organization. The following examples show how userAccountControl can be set to ACCOUNTDISABLE for a user who is not permanently terminated:

  • A user is on legal-hold pending an internal forensics investigation.
  • A user forgot their password upon returning from a long weekend or holiday.
  • A user is on PTO and the organization has set the user's accounts to ACCOUNTDISABLE.

In some cases, the UAC for Normal accounts can be an indicator that the user is terminated, but it's not the only condition. For example, the UAC for a user's Normal account is set to ACCOUNTDISABLE, but the user may or may not have other accounts that are not set to ACCOUNTDISABLE.

When onboarding HR data, you must accurately write your SPL to correctly identify terminated users and set their status=InActive. You must identify the way your organization is marking permanently terminated users, and write an SPL transformation so that only terminated users have a status of InActive. Incorrectly mapping active, disabled, or suspended accounts with status=InActive will result in a large number of false anomalies related to terminated user activity, when the user is in fact not terminated.
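
How terminated users are marked differs between organizations, so any concrete rule is an assumption. As one possible shape, the following Python sketch assumes that your HR system supplies a terminationDate in MM/dd/yyyy format (the user's last day of employment) and derives the status from that date alone. The userAccountControl value is deliberately not an input, because a disabled account does not by itself mean the user is terminated. The function name is hypothetical.

```python
from datetime import date, datetime


def user_status(termination_date: str, today: date) -> str:
    """Return "InActive" only when a recorded last day of employment has passed.

    Assumes termination_date is empty for current employees and uses the
    MM/dd/yyyy format otherwise.
    """
    if not termination_date:
        return "Active"
    last_day = datetime.strptime(termination_date, "%m/%d/%Y").date()
    return "InActive" if last_day < today else "Active"


print(user_status("", date(2014, 7, 11)))
print(user_status("07/10/2014", date(2014, 7, 10)))
print(user_status("07/10/2014", date(2014, 7, 11)))
```

If your organization signals termination differently, for example through a dedicated HR status code, replace the date comparison with that condition and express the same rule in your SPL.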

Identify domain and login ID

Use a search such as the one in the following example to properly identify an account's domain and login ID:

| eval domainLoginId=dc + "\\\\" + sAMAccountName

This statement would generate an example such as Corp\\userAccount. Splunk UBA resolves this to proper AD multiline format, such as Corp\userAccount.

Create a proper display name for each user

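Create a proper display name for each user, consisting of the first, middle, and last name. Below is an example command. It assumes that your HR data contains a middleName field; the LDAP search shown earlier returns initials rather than middleName, so substitute the field that holds the middle name in your data: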

| eval displayName=givenName." ".middleName." ".sn

Valid display names must use spaces to separate the first, middle, and last name, otherwise some models in Splunk UBA will not work properly.

Link related accounts to a single user

Use sAMAccountName to link related Admin accounts and Normal user accounts to a single employeeID. Below is an example command:

| eval employeeID=sAMAccountName | rex field=employeeID "^99(?<employeeID>.*?)$"

See Why Splunk UBA requires HR data for more information about why all accounts must be associated with a single employeeID representing a human user.
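
The example command relies on a naming convention in which an Admin account is the user's normal login ID prefixed with 99. A minimal Python sketch of the same rule follows, showing that both accounts resolve to one ID. The function name and the account names are hypothetical.

```python
import re

# Same pattern as the rex command: capture everything after a leading "99".
ADMIN_PREFIX = re.compile(r"^99(.*?)$")


def employee_id(sam_account_name: str) -> str:
    """Strip a leading "99" so Admin and Normal accounts share one ID."""
    match = ADMIN_PREFIX.match(sam_account_name)
    return match.group(1) if match else sam_account_name


print(employee_id("smbuttercup"))
print(employee_id("99smbuttercup"))
```

If your environment uses a different convention for Admin accounts, such as a suffix, the pattern needs to change accordingly in both the sketch and the SPL.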

Example SPL to get HR data into Splunk UBA

This example SPL inputs HR data from two domains and contains match statements to cover a variety of account types such as Service, SharedMailbox, Vendor, TestAccount, FaxAccount, and ExchangeSystemAccount. Accounts that are not identified by any of these statements are classified as Normal accounts.

The table command formats the results so that they can be recognized by Splunk UBA.

| inputlookup mycompany_domain_1.csv
| inputlookup append=t mycompany_domain_2.csv
| search sAMAccountName!="*$" sAMAccountName!="$*"
| replace "NULL" with ""
| rename c as co
| rename userAccountControl as UAC
| rex field=manager "CN=(?<manager>.*?),OU="
| rex mode=sed field=manager "s/\\\//g"
| eval hrAccountType=case(
    (match(lower(sAMAccountName),"^99[0a-z].*")),"Admin",
    (match(lower(distinguishedName),".*ou=.*service accounts.*")), "Service",
    (match(lower(distinguishedName),".*cn=managed service accounts.*")), "Service",
    (match(lower(distinguishedName), ".*ou=shared mailboxes.*")), "SharedMailbox",
    (match(lower(distinguishedName), ".*ou=vendor.*")), "Vendor",
    (match(lower(distinguishedName), ".*ou=testbank.*")), "TestAccount",
    (match(lower(sAMAccountName),".*fax.*")), "FaxAccount",
    (match(lower(sAMAccountName), "^healthmailbox.*")), "ExchangeSystemAccount",
    (match(lower(sAMAccountName), "^sm_.*")), "ExchangeSystemAccount",
    (match(lower(sAMAccountName), "^systemmailbox.*")), "ExchangeSystemAccount",
    (1==1), "Normal" )
| eval domainLoginId=dc + "\\\\" + sAMAccountName
| eval displayName=givenName." ".middleName." ".sn
| eval employeeID=sAMAccountName
| rex field=employeeID "^99(?<employeeID>.*?)$"
| table status, employeeID, domainLoginId, hrAccountType, sAMAccountName, displayName, distinguishedName, givenName, initials, sn, title, mail, company, department, streetAddress, l, st, postalCode, co, telephoneNumber, mobile, manager, UAC, whenCreated, whenChanged, accountExpires, memberOf

Add HR data from Splunk Enterprise

Use the finalized search from Use the HR data to classify account types and user accounts to pull live HR data from Splunk Enterprise into Splunk UBA.

  1. From the Splunk UBA menu, select Manage > Data Sources.
  2. Click New Data Source.
  3. Select the Splunk HR Data data source and click Next.
  4. Type a connection Name, such as SplunkHR. The data source name must be alphanumeric with no spaces.
  5. Add the URL of your Splunk search head and management port.
    For example, https://10.10.123.45:8089.
  6. Type a user name and password for Splunk Enterprise. The user account must have the admin_all_objects capability.
  7. Click Next.
  8. Type a Query to query the HR data from Splunk Enterprise. Use the complete query you configured in Use the HR data to classify account types and user accounts. The query output can be in JSON or CSV format.
  9. Select a Frequency for how often you want Splunk UBA to run the search and retrieve the HR data, such as Daily.
    • Select Hourly to run a search now, then again every hour from now. The first search will run when you click OK at the end of this procedure.
    • Select Daily to run the first search at the next available 2 AM, then again each morning at 2 AM. For example, if you create the search at 9 AM on Monday May 1, the first search will run at 2 AM on Tuesday May 2. Subsequent searches will run each morning at 2 AM. If you want to run a search before the scheduled time, click OK at the end of this procedure, then click on the data source and click Start at the top of the data source page.
    • Select Weekly to run the first search at 2 AM one week from when you create the search, then again each week at 2 AM. For example, if you create the search at 9 AM on Monday May 1, the first search will run at 2 AM on Monday May 8. Subsequent searches will run each Monday at 2 AM. If you want to run a search before the scheduled time, click OK at the end of this procedure, then click on the data source and click Start at the top of the data source page.
    • Select One Time to run the search this one time only. The search will run when you click OK at the end of this procedure.
  10. Click OK to add the data source.
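
The Daily and Weekly timing rules in step 9 can be checked with a short calculation. The following Python sketch reproduces the two worked examples from the list. It only restates the rules as described in this topic and is not Splunk UBA's scheduler. The function names are hypothetical, and the year 2023 is an arbitrary choice in which May 1 falls on a Monday.

```python
from datetime import datetime, timedelta

RUN_HOUR = 2  # the steps above describe runs at 2 AM


def _at_run_hour(moment: datetime) -> datetime:
    """Return the same calendar day at exactly 2:00 AM."""
    return moment.replace(hour=RUN_HOUR, minute=0, second=0, microsecond=0)


def first_daily_run(created: datetime) -> datetime:
    """Next available 2 AM after the data source is created."""
    candidate = _at_run_hour(created)
    if candidate <= created:
        candidate += timedelta(days=1)
    return candidate


def first_weekly_run(created: datetime) -> datetime:
    """2 AM on the day one week after the data source is created."""
    return _at_run_hour(created + timedelta(days=7))


created = datetime(2023, 5, 1, 9, 0)  # a Monday at 9 AM
print(first_daily_run(created))
print(first_weekly_run(created))
```

When planning the LDAP searches that feed the lookup files, these first-run times indicate when the lookups need to be complete.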

Add HR data from a CSV file

As an option, you can export your HR data to a CSV file, and then add the HR data in the CSV file to Splunk UBA. This is useful for testing and verifying HR data ingestion using a smaller set of data. See Validate HR data configuration before adding other data sources.

Add HR data to Splunk UBA from a CSV file by performing the following tasks:

  1. Export your HR data into a CSV file with headers that correspond to the fields in the table in Use SPL to obtain the HR data.
  2. From the Splunk UBA menu, select Manage > Data Sources.
  3. Click New Data Source.
  4. Select HR File as the type of data source and click Next.
  5. Choose the .csv file to upload.
  6. Click OK to add the data source.
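
Before uploading a CSV file, it can help to confirm that its header row covers every field marked as required in the field mapping table. The following Python sketch performs that check, treating any one of the listed AD/LDAP names as satisfying a required field. The function name is hypothetical, and exact, case-sensitive header matching is an assumption rather than documented Splunk UBA behavior.

```python
import csv
import io

# Fields marked "Yes" in the Required column of the field mapping table,
# with the AD/LDAP names accepted for each.
REQUIRED_FIELD_ALIASES = {
    "Email Address": ("mail", "userPrincipalName", "email"),
    "First Name": ("preferredName", "givenName", "firstname"),
    "Last Name": ("sn", "lastname"),
    "Login Id": ("sAMAccountName", "loginId"),
    "Middle Name": ("initials", "MiddleName"),
    "OU": ("department", "ou"),
}


def missing_required_fields(csv_text: str) -> list:
    """Return the required UBA fields that no CSV header column covers."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    present = {name.strip() for name in header}
    return [field for field, aliases in REQUIRED_FIELD_ALIASES.items()
            if not present.intersection(aliases)]


# Hypothetical sample file contents.
sample = (
    "sAMAccountName,givenName,initials,sn,mail,department\n"
    "smbuttercup,Shruti,M,Buttercup,smbuttercup@example.com,Pony Instructors\n"
)
print(missing_required_fields(sample))
print(missing_required_fields("sAMAccountName,givenName,sn\n"))
```

An empty list means every required field has a matching column; otherwise the list names the fields to add before uploading.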
Last modified on 04 March, 2021

This documentation applies to the following versions of Splunk® User Behavior Analytics: 5.0.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.4.1, 5.0.5

