Identity Correlation

Identity correlation uses an identity list to provide information used to correlate identities (individuals) with events and assets.

What does identity correlation do?

Identity correlation provides additional information about the users involved in events. This information can be used to correlate multiple events to a single person, identify the owner of a host, determine whether an individual is subject to special attention, and more. Specifically, identity correlation provides the following:

Prioritization

The same type of access event initiated by two different users may not deserve the same level of attention; a medium severity pattern of behavior from a graphic designer's account is less urgent than the same pattern from a database administrator's account. Identity correlation allows an urgency to be computed based on the priority of individuals so that higher urgency is assigned to high priority individuals. The urgency is assigned to the notable events that are viewable on the Incident Review dashboard. Those notable events are generated based on significant patterns of behavior and urgency levels.
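
The idea of combining an identity's priority with an event's severity can be sketched as follows. This is only an illustration: the four level names are taken from the priority field of the identity list, but the assumption that severity uses the same scale, the function name compute_urgency, and the averaging rule are all hypothetical and are not the calculation the app itself performs.

```python
# Levels in ascending order. Reusing the same scale for severity is an assumption.
LEVELS = ["low", "medium", "high", "critical"]


def compute_urgency(priority: str, severity: str) -> str:
    """Combine identity priority and event severity into an urgency level.

    Hypothetical rule: average the two ranks and round up, so that a
    higher-priority identity raises the urgency of the same event.
    """
    p = LEVELS.index(priority)
    s = LEVELS.index(severity)
    return LEVELS[(p + s + 1) // 2]


# Same medium severity event, two different identities.
designer_urgency = compute_urgency("low", "medium")
dba_urgency = compute_urgency("critical", "medium")
```

Under this hypothetical rule, the medium severity event from the low priority account stays at medium urgency, while the same event from the critical priority account is raised to high.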

Categorization

Identity correlation allows information about the individuals involved to be added to events. For example, identity correlation lookups can be used in a correlation search to indicate whether the user is subject to a compliance regulation.

Normalization

Identity correlation allows accounts to be normalized so that you can determine whether two events relate to the same individual. For example, one person may have different account names for the email, ERP, and ITIL systems that they use frequently. Identity management helps the security analyst determine that access events for all three systems are from the same person.

Why is identity correlation important?

Identity correlation provides additional fields as part of an event to automatically designate the identity associated with the event. This makes it possible to find events for an identity and to find identities for an event. Where possible, identity correlation uses session information to help match events to a session, and a session to an identity.

What is an identity?

An identity represents a user of the system. An identity can have multiple identity-related fields in the field context menu. Identity correlation will consider each of these fields in order to match the identity with the respective events.

Additionally, identities include information that describes the priority and business unit of the identity. Identities can also be put into categories to define the role of the identity or the functional area to which it belongs.

Fields in the Identities list:

Column	Description
Identity (key)	Pipe delimited list of usernames representing the identity
prefix	Prefix of the identity (for example, Dr.)
nick	Nickname of the identity
first	First name of the identity
middle	Middle name or initial of the identity
last	Last name of the identity
suffix	Suffix of the identity (for example, Jr.)
email	Email address of the identity
phone	Telephone number of the identity
phone2	Secondary telephone number of the identity
managedBy	Username representing the manager of the identity
priority	Priority of the identity (low, medium, high, critical). For example, "CEO" might be considered a critical priority.
bunit	Business unit of the identity
category	Category of the identity
watchlist	Identity on watchlist?
startDate	Start/Hire date of the identity
endDate	End/Termination date of the identity

In the Identity Center dashboard, these identities and events are aggregated by priority, by business unit, and by category. A list of identities associated with events is displayed in a table at the bottom of the dashboard. See "Identity Center dashboard" in this manual for more details.
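
To make the field layout concrete, here is a minimal sketch of reading an identity list with Python's standard csv module and splitting the pipe-delimited key into individual user names. The column names follow the field table above, with the key column assumed to be named identity. The sample row, the email address, and the helper name load_identities are hypothetical.

```python
import csv
import io

# Hypothetical sample data. Column names follow the field table above.
SAMPLE = """identity,prefix,nick,first,middle,last,suffix,email,phone,phone2,managedBy,priority,bunit,category,watchlist,startDate,endDate
jhsmith|james_smith|team.leader12,,Jim,James,H,Smith,,jsmith@example.com,,,mjones,high,engineering,pci,false,,
"""


def load_identities(text):
    """Parse identity rows and add a 'usernames' list to each row."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # The key column holds a pipe-delimited list of user names.
        row["usernames"] = [u.strip() for u in row["identity"].split("|") if u.strip()]
        rows.append(row)
    return rows


identities = load_identities(SAMPLE)
```

Each parsed row keeps the descriptive fields (priority, bunit, category, and so on) alongside the expanded list of user names.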

How identity correlation works

Identity correlation is used to reconcile and validate the ownership of different user names that reside on systems and applications throughout the organization, and to permanently link ownership of those user names to particular individuals.

Identities (people) can operate under many different user names. For example, a person could log in as:

jhsmith
james_smith
js842@gmail.com
team.leader12

An identity search includes all of the user names for a given identity. To find events associated with an identity, Splunk searches for user names and then searches for session information to create a list of events.
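
The user-name part of this lookup can be sketched in a few lines of Python. This is a simplified illustration, not how Splunk implements the search: session information is left out, and the identity name, the sample events, and both function names are hypothetical.

```python
# Hypothetical identity list: each identity maps to all of its user names.
IDENTITIES = {
    "James Smith": {"jhsmith", "james_smith", "js842@gmail.com", "team.leader12"},
}

# Hypothetical events, each carrying a user field.
EVENTS = [
    {"user": "jhsmith", "action": "login", "system": "email"},
    {"user": "team.leader12", "action": "login", "system": "erp"},
    {"user": "adoe", "action": "login", "system": "erp"},
]


def events_for_identity(name, identities, events):
    """Return every event whose user field is one of the identity's user names."""
    usernames = identities.get(name, set())
    return [e for e in events if e.get("user") in usernames]


def identity_for_event(event, identities):
    """Return the identity that owns the event's user name, or None."""
    for name, usernames in identities.items():
        if event.get("user") in usernames:
            return name
    return None


matched = events_for_identity("James Smith", IDENTITIES, EVENTS)
```

Here the two logins under different user names resolve to the same identity, while the event for the unknown user adoe matches no identity.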

Find the event(s) associated with an identity

To associate an event with an identity, the event is searched for a user field, a session, or an identity field. When an identity is found, identity matching may take place to see if there are other associated user names.

Find the identity associated with events

Identity matching

Identity matching is done by comparing user names associated with an event with identities in the Identities table. Splunk identity match configurations are specified using the identities.csv file. An identity table might look like this:

Customizing Identities

To view the current identity list, create a new identity list, or make immediate modifications to an existing list, go to Configure > Identities > Edit. Edit the content, or paste the new values in.

Note: The editor does not validate input. Be very careful with your edits.

Alternatively, the new file may be installed at: $SPLUNK_HOME/etc/apps/SA-IdentityManagement/lookups/identities.csv.

Note: The CSV file must use UNIX line endings. The popular dos2unix utility may be used to correct line endings in a file produced on Windows or OS X.
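
If dos2unix is not available, the same normalization can be done with a short Python script. This is a minimal sketch; the function names are hypothetical, and it rewrites the file in place, so keep a backup of the original.

```python
from pathlib import Path


def to_unix_line_endings(data: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to LF."""
    # Replace Windows CRLF first, then any remaining bare CR.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(path: str) -> None:
    """Rewrite the file at 'path' in place with UNIX line endings."""
    p = Path(path)
    p.write_bytes(to_unix_line_endings(p.read_bytes()))
```

For example, convert_file("identities.csv") would normalize a copy of the list before it is installed.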

Update the list periodically in order to ensure that the Splunk App for PCI Compliance has reasonably up-to-date information. Generally it is recommended that the list be updated at least every quarter.

Note: Splunk automatically loads the identity list at search time and does not need to be restarted.

Scripted inputs

If the identity information is stored in a database, a scripted input can be configured to populate the list automatically. Automatic identity updates can be done using a combination of scripted inputs and custom search commands (written in Python). The implementation details depend on the technology that contains the identity information and are beyond the scope of this document.
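
As a rough idea of what the export half of such a script might look like, the sketch below reads identity rows from a database and writes them as CSV with UNIX line endings. Everything specific here is an assumption for illustration: an in-memory SQLite database stands in for the real identity store, the employees table and its columns are hypothetical, and the key column is assumed to be named identity. Wiring the script into Splunk as a scripted input is not shown.

```python
import csv
import io
import sqlite3

# Column names follow the identity list fields described earlier.
FIELDS = [
    "identity", "prefix", "nick", "first", "middle", "last", "suffix",
    "email", "phone", "phone2", "managedBy", "priority", "bunit",
    "category", "watchlist", "startDate", "endDate",
]


def export_identities(conn, out):
    """Write identity rows from the (hypothetical) employees table as CSV."""
    conn.row_factory = sqlite3.Row
    # lineterminator="\n" produces UNIX line endings; missing columns stay empty.
    writer = csv.DictWriter(out, fieldnames=FIELDS, restval="", lineterminator="\n")
    writer.writeheader()
    query = (
        "SELECT usernames AS identity, first, last, email, priority "
        "FROM employees ORDER BY last, first"
    )
    for row in conn.execute(query):
        writer.writerow(dict(row))


# Stand-in database with one hypothetical record.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees "
    "(usernames TEXT, first TEXT, last TEXT, email TEXT, priority TEXT)"
)
conn.execute(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?)",
    ("jhsmith|james_smith", "James", "Smith", "jsmith@example.com", "high"),
)

buf = io.StringIO()
export_identities(conn, buf)
```

In a real deployment the output would be written to a file rather than an in-memory buffer, and the query would be adapted to the actual identity source.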
