Splunk® User Behavior Analytics Kafka Ingestion App

Configure Kafka data ingestion

Kafka ingestion works by issuing a series of micro-batch queries against live data from Splunk Enterprise. Each query covers a time range that begins where the previous one ends, so the ranges are consecutive. You do not need to run real-time indexed searches on Splunk Enterprise. See How data gets in to Splunk UBA in Get Data into Splunk User Behavior Analytics for information about ingesting data sources without Kafka.

Some data sources are known to have a lag when ingested into the Splunk platform, such as batch files that are ingested periodically. In such cases, you can adjust the Kafka ingestion properties to make sure that the data is still ingested by Splunk UBA. Perform the following steps to configure the Kafka ingestion properties:

  1. Modify or add the properties in the /etc/caspida/local/conf/uba-site.properties file. See the following table for the property names, descriptions, and default values.
  2. Synchronize the cluster and restart Splunk UBA to make the configuration changes take effect.

These properties apply globally to all data sources sent to Kafka for ingestion, including any data sources that you may have configured earlier with different properties.
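
Before synchronizing the cluster, it can help to sanity-check the values you plan to add. The following is a minimal Python sketch, not part of Splunk UBA: the function names are hypothetical, the parser assumes simple key = value lines, and the upper bounds are the ones stated in the property table in this topic.

```python
# Upper bounds stated in the property table of this topic, in seconds.
LIMITS = {
    "splunk.kafka.ingestion.search.delay.seconds": 10800,   # 3 hours
    "splunk.kafka.ingestion.search.interval.seconds": 240,  # 4 minutes
}


def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_limits(props):
    """Return a list of problems; an empty list means no limit is exceeded."""
    problems = []
    for key, limit in LIMITS.items():
        if key in props and int(props[key]) > limit:
            problems.append(f"{key} = {props[key]} exceeds {limit}")
    return problems


# Illustrative input only; the interval here is deliberately too large.
sample = """
splunk.kafka.ingestion.search.delay.seconds = 120
splunk.kafka.ingestion.search.interval.seconds = 300
"""
print(check_limits(parse_properties(sample)))
# ['splunk.kafka.ingestion.search.interval.seconds = 300 exceeds 240']
```

The sketch checks only the two global properties that have a documented upper bound.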

Property Description
splunk.kafka.ingestion.search.delay.seconds
The point in time where Splunk UBA begins Kafka ingestion, expressed as a number of seconds before the start of the current minute. The default is 180 seconds (3 minutes). For example, if Kafka ingestion is enabled at 10 seconds past 1:02 PM, then the beginning of the minute is 1:02 PM. Specifying a delay of 120 seconds means that the first batch query begins processing events at 1:00 PM. The query runs on the events within the interval of time defined by splunk.kafka.ingestion.search.interval.seconds.


Do not configure this property to exceed 10800 seconds (3 hours).

You can configure the data ingestion start time for any individual data source by adding the data source name to the end of the property. For example, to configure a delay of 120 seconds for a data source named exampledatasource, use the following property and value setting:

splunk.micro.batching.search.delay.seconds.exampledatasource = 120

Setting this property for an individual data source overrides the value of the splunk.kafka.ingestion.search.delay.seconds property.
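
The start-time arithmetic in the example can be sketched in Python as follows. This is only an illustration of the rule as described, not Splunk UBA code: the function name is hypothetical and the calendar date is arbitrary.

```python
from datetime import datetime, timedelta


def first_query_start(enabled_at, delay_seconds=180):
    """Truncate to the start of the current minute, then step back by the delay."""
    minute_start = enabled_at.replace(second=0, microsecond=0)
    return minute_start - timedelta(seconds=delay_seconds)


# Kafka ingestion enabled at 10 seconds past 1:02 PM (arbitrary date).
enabled = datetime(2024, 1, 1, 13, 2, 10)
print(first_query_start(enabled, 120))  # 2024-01-01 13:00:00
print(first_query_start(enabled))       # 2024-01-01 12:59:00 (default delay)
```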

splunk.kafka.ingestion.search.interval.seconds
The length of time, in seconds, covered by each batch query. The default is 60 seconds, meaning that each query searches for 60 seconds worth of events, starting from the time defined by splunk.kafka.ingestion.search.delay.seconds.


Do not configure the interval to exceed 240 seconds (4 minutes).

You can configure the query interval for any individual data source by adding the data source name to the end of the property. For example, to configure an interval of 120 seconds for a data source named exampledatasource, use the following property and value setting:

splunk.micro.batching.search.interval.seconds.exampledatasource = 120

Setting this property for an individual data source overrides the value of the splunk.kafka.ingestion.search.interval.seconds property.
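
To see how the interval produces consecutive query ranges, consider the following Python sketch. It assumes each range starts exactly where the previous one ends and treats the end of a range as exclusive. The function name, the count parameter, and the date are hypothetical, and the actual searches Splunk UBA issues are not shown.

```python
from datetime import datetime, timedelta


def batch_windows(start, interval_seconds=60, count=3):
    """Yield (earliest, latest) pairs; each window starts where the previous ended."""
    step = timedelta(seconds=interval_seconds)
    for i in range(count):
        earliest = start + i * step
        yield earliest, earliest + step


# First query starts at 1:00 PM (arbitrary date), default 60-second interval.
start = datetime(2024, 1, 1, 13, 0, 0)
for earliest, latest in batch_windows(start):
    print(earliest.strftime("%H:%M:%S"), "to", latest.strftime("%H:%M:%S"))
# 13:00:00 to 13:01:00
# 13:01:00 to 13:02:00
# 13:02:00 to 13:03:00
```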

splunk.kafka.ingestion.search.max.lag.seconds
The lag, or amount of time between the end time of the most recent batch query and the time Kafka ingestion starts. The default is 3600 seconds (1 hour). For example, if the first batch query ends at 1:00 PM and 59 seconds, and Kafka ingestion starts at 1:02 PM and 10 seconds, then the lag at that time is 1 minute and 11 seconds. If the lag exceeds the configured splunk.kafka.ingestion.search.max.lag.seconds, Splunk UBA shows an alert in the health monitor.
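
The lag arithmetic in the example can be checked with a short Python sketch. The helper names are hypothetical and the date is arbitrary. How Splunk UBA measures lag internally and raises the health monitor alert is not shown here.

```python
from datetime import datetime


def lag_seconds(last_query_end, ingestion_start):
    """Seconds between the end of the most recent batch query and ingestion start."""
    return int((ingestion_start - last_query_end).total_seconds())


def exceeds_max_lag(lag, max_lag_seconds=3600):
    """True when the lag is greater than the configured maximum (default 1 hour)."""
    return lag > max_lag_seconds


last_end = datetime(2024, 1, 1, 13, 0, 59)  # query ends at 1:00:59 PM
started = datetime(2024, 1, 1, 13, 2, 10)   # ingestion starts at 1:02:10 PM
lag = lag_seconds(last_end, started)
print(lag, exceeds_max_lag(lag))  # 71 False
```

A lag of 71 seconds is 1 minute and 11 seconds, well under the default maximum of 3600 seconds.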

The respective time ranges of these properties are shown in the following diagram.

This graphic shows a timeline describing the Kafka ingestion properties. There are four segments on the timeline, with the following labels, from left to right: The first segment is labeled 1:00:00 PM - 1:00:59 PM, the second segment is labeled 1:01:00 PM - 1:01:59 PM, about one-third of the way into the third segment there is a label reading "Kafka ingestion started at 1:02:10 PM", and the fourth segment is labeled 1:03:00 PM - 1:03:59 PM. The configurable properties, and how they are related to each segment in the timeline, are described in the table immediately preceding this graphic.

Last modified on 12 August, 2022

This documentation applies to the following versions of Splunk® User Behavior Analytics Kafka Ingestion App: 1.3, 1.4, 1.4.1, 1.4.2, 1.4.3

