Data ingestion parameters for Splunk Connect for Kafka
Use the following parameters to specify the types of data that you want to ingest into your Splunk platform deployment.
Required parameters
Parameter Name | Description |
---|---|
name
|
Connector name. A consumer group with this name will be created with tasks to be distributed evenly across the connector cluster nodes. |
connector.class
|
The Java class used to perform connector jobs. Keep the default value com.splunk.kafka.connect.SplunkSinkConnector unless you modify the connector.
|
tasks.max
|
The number of tasks generated to handle data collection jobs in parallel. The tasks will be spread evenly across all Splunk Connect for Kafka connector nodes. |
splunk.hec.uri
|
Splunk HTTP Event Collector (HEC) URIs. Either a comma-separated list of the fully qualified domain names (FQDNs) or IP addresses of all Splunk indexers, or the address of a load balancer. The connector load balances across the listed indexers using round robin. For example, https://hec1.splunk.com:8088, https://hec2.splunk.com:8088, https://hec3.splunk.com:8088. |
splunk.hec.token
|
Splunk HEC token. |
topics or topics.regex
|
For topics: A comma-separated list of Kafka topics for Splunk to consume. For example, prod-topic1,prod-topic2,prod-topic3.
For
|
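As a rough illustration of how the required parameters fit together, the following Python sketch assembles them into the name-plus-config JSON payload that the Kafka Connect REST API accepts when a connector is created. The connector name, token placeholder, and task count are assumptions made for the example; the HEC URIs and topic names reuse the sample values from the table.

```python
import json


def build_required_config(name, hec_uris, hec_token, topics, tasks_max=1):
    """Assemble the required Splunk Connect for Kafka parameters.

    Values are written as strings, and lists are joined with commas,
    matching the comma-separated formats described in the table.
    """
    if not hec_uris:
        raise ValueError("at least one HEC URI is required")
    if not topics:
        raise ValueError("at least one topic is required")
    return {
        "name": name,
        "config": {
            "connector.class": "com.splunk.kafka.connect.SplunkSinkConnector",
            "tasks.max": str(tasks_max),
            "splunk.hec.uri": ",".join(hec_uris),
            "splunk.hec.token": hec_token,
            "topics": ",".join(topics),
        },
    }


payload = build_required_config(
    name="splunk-sink-example",      # hypothetical connector name
    hec_uris=[
        "https://hec1.splunk.com:8088",
        "https://hec2.splunk.com:8088",
        "https://hec3.splunk.com:8088",
    ],
    hec_token="<your-hec-token>",    # placeholder, not a real token
    topics=["prod-topic1", "prod-topic2", "prod-topic3"],
    tasks_max=3,                     # assumed value for illustration
)
print(json.dumps(payload, indent=2))
```

The printed JSON could then be sent with an HTTP POST to the /connectors endpoint of a Kafka Connect worker.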
Header parameters
Parameter Name | Description |
---|---|
splunk.header.support
|
Set to true to parse Kafka headers so that their metadata can be used in generated Splunk software events. By default, this setting is set to false. Requires Kafka Connect version 1.1 or later. |
splunk.header.custom
|
Header name. Applicable when splunk.header.support is set to true. To configure multiple custom headers, separate them with commas. For example, custom_header_1,custom_header_2,custom_header_3. This setting looks for Kafka record headers with these names and adds them to each event if present. By default, it is set to "".
|
splunk.header.index
|
Header name. Applicable when splunk.header.support is set to true. This setting specifies the header to be used for the Splunk platform index. By default, it is set to splunk.header.index.
|
splunk.header.source
|
Header name. Applicable when splunk.header.support is set to true. This setting specifies the header to be used for the Splunk platform source. By default, it is set to splunk.header.source.
|
splunk.header.sourcetype
|
Header name. Applicable when splunk.header.support is set to true. This setting specifies the header to be used for the Splunk software sourcetype. By default, it is set to splunk.header.sourcetype.
|
splunk.header.host
|
Header name. Applicable when splunk.header.support is set to true. This setting specifies the header to be used for the Splunk software host. By default, it is set to splunk.header.host.
|
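To make the header lookup described above more concrete, the following Python sketch models how record headers might be turned into event metadata under these settings. It is an illustration of the documented behavior only, not the connector's actual code; the function name resolve_metadata and the sample header values are assumptions.

```python
# Defaults as listed in the header parameters table.
HEADER_DEFAULTS = {
    "splunk.header.support": "false",
    "splunk.header.custom": "",
    "splunk.header.index": "splunk.header.index",
    "splunk.header.source": "splunk.header.source",
    "splunk.header.sourcetype": "splunk.header.sourcetype",
    "splunk.header.host": "splunk.header.host",
}


def resolve_metadata(record_headers, settings):
    """Return the metadata a record's headers would contribute (illustrative)."""
    merged = {**HEADER_DEFAULTS, **settings}
    if merged["splunk.header.support"] != "true":
        return {}  # header parsing is off by default

    metadata = {}
    # Each setting names the header that carries the corresponding field.
    for field in ("index", "source", "sourcetype", "host"):
        header_name = merged[f"splunk.header.{field}"]
        if header_name in record_headers:
            metadata[field] = record_headers[header_name]

    # Custom headers are a comma-separated list of header names.
    custom = [h.strip() for h in merged["splunk.header.custom"].split(",") if h.strip()]
    for header_name in custom:
        if header_name in record_headers:
            metadata[header_name] = record_headers[header_name]
    return metadata


settings = {
    "splunk.header.support": "true",
    "splunk.header.custom": "custom_header_1,custom_header_2",
}
headers = {  # hypothetical Kafka record headers
    "splunk.header.index": "prod-index1",
    "custom_header_1": "abc",
    "unrelated": "x",
}
print(resolve_metadata(headers, settings))
```

Headers that are not named by any setting, such as "unrelated" here, are ignored in this sketch.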
Optional parameters
Parameter Name | Description |
---|---|
kerberos.user.principle
|
The Kerberos user principal that the connector uses to authenticate with Kerberos. |
kerberos.keytab.path
|
The path to the keytab file used for authentication with Kerberos. |
splunk.indexes
|
Target Splunk indexes to send data to. This can be a list of indexes in the same sequence and order as the topics, which makes it possible to send data from different Kafka topics to different Splunk platform indexes. For example, prod-topic1, prod-topic2, and prod-topic3 can be sent to index prod-index1, prod-index2, and prod-index3. If you want to index all data from multiple topics to the main index, then "main" can be specified. If you leave this setting unconfigured, data will route to the default index configured against the HEC token. Verify that the indexes configured here are in the index list of HEC tokens, otherwise Splunk HEC will drop the data. By default, this setting is empty. |
splunk.sources
|
Splunk event source metadata for the Kafka topic data. The same configuration rules as indexes can be applied. If left unconfigured, the default source binds to the HEC token. By default, this setting is empty. |
splunk.sourcetypes
|
Splunk event sourcetype metadata for the Kafka topic data. The same configuration rules as indexes can be applied here. If left unconfigured, the default sourcetype binds to the HEC token. By default, this setting is empty. |
splunk.hec.backoff.threshhold.seconds
|
The amount of time Splunk Connect for Kafka waits before attempting to resend after errors from a HEC endpoint. |
splunk.flush.window
|
The interval, in seconds, at which the events from Kafka Connect will be flushed to your Splunk platform instance. By default, this is set to 30.
|
splunk.hec.ssl.validate.certs
|
Enables or disables HTTPS certificate validation. Valid settings are true or false. By default, this is set to true.
|
splunk.hec.http.keepalive
|
Enables or disables HTTPS connection keep-alive. Valid settings are true or false. By default, this is set to true.
|
splunk.hec.max.http.connection.per.channel
|
Controls how many HTTP connections will be created and cached in the HTTP pool for one HEC channel. By default, this is set to 2. |
splunk.hec.max.outstanding.events
|
Maximum number of unacknowledged events kept in memory by the connector. Reaching this limit triggers a back-pressure event to slow down collection. |
splunk.hec.max.retries
|
The number of times the connector attempts to resend a failed batch before dropping its events completely. Dropping events results in data loss. Default is -1, which retries indefinitely.
|
splunk.hec.lb.poll.interval
|
Specify this parameter (in seconds) to control the polling interval. Increase it to poll less often, or decrease it to poll more frequently. Default is 120.
|
splunk.hec.enable.compression
|
Enables or disables gzip compression. Valid settings are true or false. Default is false.
|
splunk.hec.total.channels
|
Controls the total channels created to perform HEC event POSTs. By default, this is set to 2. |
splunk.hec.max.batch.size
|
Maximum batch size when posting events to Splunk. The size is the actual number of Kafka events, and not byte size. By default, this is set to 500. |
splunk.hec.threads
|
Controls how many threads are spawned to do data injection via HEC in a single connector task. By default, this is set to 1. |
splunk.hec.socket.timeout
|
Internal TCP socket timeout when connecting to Splunk. By default, this is set to 60 seconds. |
splunk.hec.json.event.formatted
|
Set to true for events that are already in HEC format. Valid settings are true or false.
|
splunk.hec.ssl.trust.store.path
|
Location of the Java KeyStore. Default setting is "".
|
splunk.hec.ssl.trust.store.password
|
Password for the Java KeyStore. Default setting is "".
|
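The positional relationship between topics and splunk.indexes can be sketched in Python as follows. This is a minimal model of the rules stated in the table (one index per topic in the same order, a single index for all topics, or empty to fall back to the HEC token default); the function name map_topics_to_indexes is hypothetical, and raising an error on a length mismatch is an assumption of this sketch rather than documented connector behavior.

```python
def map_topics_to_indexes(topics, indexes=""):
    """Pair each topic with its target index, following the table's rules.

    A value of None means "use the default index configured for the HEC token".
    """
    topic_list = [t.strip() for t in topics.split(",") if t.strip()]
    index_list = [i.strip() for i in indexes.split(",") if i.strip()]

    if not index_list:
        # Setting left empty: data routes to the HEC token's default index.
        return {topic: None for topic in topic_list}
    if len(index_list) == 1:
        # A single index (for example "main") applies to every topic.
        return {topic: index_list[0] for topic in topic_list}
    if len(index_list) != len(topic_list):
        # Assumption: treat a mismatched list as a configuration error.
        raise ValueError("splunk.indexes must match topics in number and order")
    return dict(zip(topic_list, index_list))


mapping = map_topics_to_indexes(
    "prod-topic1,prod-topic2,prod-topic3",
    "prod-index1,prod-index2,prod-index3",
)
for topic, index in mapping.items():
    print(f"{topic} -> {index}")
```

According to the table, the same ordering rules apply to splunk.sources and splunk.sourcetypes, so a check of this kind could be reused for those settings.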
Acknowledgment parameters (optional)
Enable HTTP Event Collector (HEC) token acknowledgments to avoid data loss. Without HEC token acknowledgment, data loss may occur, especially in the case of a system restart or crash.
Parameter Name | Description |
---|---|
splunk.hec.ack.enabled
|
Valid settings are true or false. When set to true, the Splunk Connect for Kafka connector will poll event acknowledgments (ACKs) for POST events before check-pointing the Kafka offsets. This is used to prevent data loss, as this setting implements guaranteed delivery. By default, this setting is set to true. If this setting is set to |
splunk.hec.ack.poll.interval
|
This setting is only applicable when splunk.hec.ack.enabled is set to true. Internally, it controls the event ACKs polling interval. By default, this setting is set to 10 seconds.
|
splunk.hec.ack.poll.threads
|
This setting is used for performance tuning and is only applicable when splunk.hec.ack.enabled is set to true. It controls how many threads are spawned to poll event ACKs. By default, this is set to 1. For large Splunk indexer clusters (for example, 100 indexers), increase this number, for example to 4 threads, to speed up ACK polling. |
splunk.hec.event.timeout
|
This setting is applicable when splunk.hec.ack.enabled is set to true. After events are POSTed to Splunk, this setting determines how long the connector waits for them to be ACKed before timing out and resending. By default, this setting is set to 300 seconds.
|
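The dependency of the polling and timeout settings on splunk.hec.ack.enabled can be sketched in Python as follows. The defaults come from the table; the function name effective_ack_settings and the choice to omit the dependent settings when acknowledgment is disabled are assumptions of this sketch.

```python
# Defaults as listed in the acknowledgment parameters table.
ACK_DEFAULTS = {
    "splunk.hec.ack.enabled": "true",
    "splunk.hec.ack.poll.interval": "10",   # seconds
    "splunk.hec.ack.poll.threads": "1",
    "splunk.hec.event.timeout": "300",      # seconds
}


def effective_ack_settings(overrides=None):
    """Merge overrides with defaults and keep only the settings that apply."""
    merged = {**ACK_DEFAULTS, **(overrides or {})}
    enabled = merged["splunk.hec.ack.enabled"]
    if enabled not in ("true", "false"):
        raise ValueError("splunk.hec.ack.enabled must be 'true' or 'false'")
    if enabled == "false":
        # The poll and timeout settings only apply when ACKs are enabled.
        return {"splunk.hec.ack.enabled": "false"}
    return merged


# Example: a large indexer cluster with more ACK polling threads.
tuned = effective_ack_settings({"splunk.hec.ack.poll.threads": "4"})
print(tuned)
```

Passing {"splunk.hec.ack.enabled": "false"} returns only that one key, which mirrors the "only applicable when" wording in the table.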
Endpoint parameters (optional)
Parameter Name | Description |
---|---|
splunk.hec.raw
|
Set to true for Splunk software to ingest data using the HEC /raw endpoint. Default is false, which will use the /event endpoint.
|
splunk.hec.raw.line.breaker
|
Only applicable to the HEC /raw endpoint. This setting specifies a custom line breaker to help Splunk separate the events correctly. For more on the HTTP Event Collector (HEC), see Set up and use HTTP Event Collector in Splunk Web in the Getting Data In manual. |
splunk.hec.json.event.enrichment
|
Only applicable to the HEC /event endpoint. This setting is used to enrich raw data with extra indexed metadata fields. It contains a list of key-value pairs separated by ",". The configured enrichment metadata will be indexed along with raw event data by Splunk software. Data enrichment for the HEC /event endpoint is only available in Splunk Enterprise 6.5 and later. By default, this setting is empty. |
splunk.hec.track.data
|
Valid settings are true or false. When set to true, data loss and data injection latency metadata will be indexed along with raw data. This setting only works in conjunction with the HEC /event endpoint (splunk.hec.raw: false). By default, this setting is set to false.
|
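Because several endpoint settings apply to only one of the two HEC endpoints, a small Python sketch can show which combinations are consistent. The function name endpoint_settings, the key=value form used for enrichment pairs, and the sample line breaker and enrichment values are all assumptions made for illustration; the table only states that enrichment is a comma-separated list of key-value pairs.

```python
def endpoint_settings(raw=False, line_breaker=None, enrichment=None, track_data=False):
    """Build endpoint settings, rejecting options that do not match the endpoint."""
    if raw:
        # /raw endpoint: enrichment and data tracking are /event-only features.
        if enrichment or track_data:
            raise ValueError("enrichment and track.data apply only to the /event endpoint")
        settings = {"splunk.hec.raw": "true"}
        if line_breaker is not None:
            settings["splunk.hec.raw.line.breaker"] = line_breaker
        return settings

    # /event endpoint (the default): a custom line breaker does not apply.
    if line_breaker is not None:
        raise ValueError("the line breaker applies only to the /raw endpoint")
    settings = {
        "splunk.hec.raw": "false",
        "splunk.hec.track.data": "true" if track_data else "false",
    }
    if enrichment:
        # Assumed format: key=value pairs joined with commas.
        settings["splunk.hec.json.event.enrichment"] = ",".join(
            f"{key}={value}" for key, value in enrichment.items()
        )
    return settings


event_cfg = endpoint_settings(enrichment={"org": "fin", "region": "east"}, track_data=True)
raw_cfg = endpoint_settings(raw=True, line_breaker="####")  # hypothetical breaker
print(event_cfg)
print(raw_cfg)
```

Validating the combination before submitting the configuration is a design choice of this sketch, meant to surface settings that would otherwise have no effect on the chosen endpoint.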
This documentation applies to the following versions of Splunk® Connect for Kafka: 2.0.5, 2.0.6, 2.0.7