About HTTP Event Collector Indexer Acknowledgment
HTTP Event Collector (HEC) supports indexer acknowledgment in Splunk Enterprise only.
Splunk Cloud Platform supports the HTTP Event Collector Indexer Acknowledgment for AWS Kinesis Firehose.
While similar in purpose and identical in name, indexer acknowledgment in HEC is not the same as the indexer acknowledgment capability for forwarding. For information about indexer acknowledgment for forwarding, see Protect against loss of in-flight data in the Splunk Enterprise Forwarding Data manual. Splunk Cloud Platform does not support forwarding-based indexer acknowledgment either.
Why indexer acknowledgment
By default, when HEC receives an event successfully, it immediately sends an HTTP Status 200 code to the sender of the data. However, this only means that the event data appears to be valid, and HEC sends the status message before the event data enters the processing pipeline. During processing, there are several places where, because of an outage or a system failure, events can be lost before Splunk Enterprise indexes them. While HEC has precautions in place to prevent data loss, it's impossible to completely prevent such an occurrence, especially in the event of a network failure or hardware crash. This is where indexer acknowledgment comes in.
How indexer acknowledgment works
You can enable indexer acknowledgment on a per-token basis. The indexer acknowledgment process is similar to the following package tracking scenario:
The shipping company issues a tracking number upon shipment of a package. The shipping company updates the status for the tracking number once it's delivered, and then at your convenience, you check whether the package arrived successfully by using the tracking number to retrieve the status.
The following diagram illustrates the indexer acknowledgment process in order from top to bottom. The paragraphs that follow refer to each step by number:
Each time a client sends a request to the HEC endpoint using a token with indexer acknowledgment enabled (1), HEC returns an acknowledgment identifier to the client (2). The response body is a JavaScript Object Notation (JSON) object with the acknowledgment identifier, such as the following:
{"ackID":"2"}
The client can then query HEC with the identifier to verify whether all the events in the request that corresponds to that identifier have been indexed (3). The client sends the query to a special endpoint (/services/collector/ack), and the query contains JSON-formatted data like the following, where the only key, "acks", is set to an array of the ackIDs whose status you are querying:
{"acks":[0,1,2]}
Next, HEC responds with the status information to the client (4). The body of the reply contains the status of each of the requests that the client queried. A true status only indicates that the event that corresponds to that ackID was replicated at the desired replication factor. A true status does not guarantee that the event was indexed, because the parsing pipeline might drop events that can't be parsed. A false status indicates that there is no status information for that ackID, or that the corresponding event has not been indexed. The corresponding event might not have been indexed yet, the ackID might not have been found, or some other problem might have occurred. For example:
{"acks": {"0": true, "1": false, "2": true}}
Because a false status could indicate any number of problems, only query an ackID during the timeframe in which the request could reasonably be expected to be in transit.
Once a client retrieves a true status for an ackID, HEC deletes that ackID status information. If you query the same ackID again, HEC will always return false for that ackID because its status information can no longer be found. For that reason, avoid querying an ackID again after its status returns as true.
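The bookkeeping that this process implies on the client side can be sketched as follows. This is a minimal illustration, not part of HEC: the function names record_ack_id and apply_ack_statuses are hypothetical, and the JSON shapes are taken from the examples above. The sketch stops tracking an ackID as soon as it returns true, so the client never queries it again.

```python
import json


def record_ack_id(pending: set, send_response_text: str) -> int:
    """Parse the body HEC returns after a send, e.g. {"ackID":"2"}, and track the ackID."""
    ack_id = int(json.loads(send_response_text)["ackID"])
    pending.add(ack_id)
    return ack_id


def apply_ack_statuses(pending: set, ack_response_text: str) -> set:
    """Parse an ack query response, e.g. {"acks": {"0": true, "1": false}}.

    ackIDs that report true are removed from `pending`, because HEC deletes
    their status after the first true reply and later queries return false.
    """
    statuses = json.loads(ack_response_text)["acks"]
    confirmed = {int(ack_id) for ack_id, ok in statuses.items() if ok}
    pending -= confirmed
    return confirmed
```

An ackID that stays in the pending set after a query is either still in transit or unknown to HEC, so the client decides based on elapsed time whether to keep polling.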
Enable indexer acknowledgment for HEC in Splunk Enterprise
You can enable indexer acknowledgment in Splunk Web or in the inputs.conf configuration file.
Splunk Cloud Platform does not support indexer acknowledgment in HEC.
Enable indexer acknowledgment for HEC using Splunk Web
When you create a HEC token in Splunk Web, select the checkbox on the first screen labeled Enable indexer acknowledgment. Then continue with the token creation process.
For information on creating HEC tokens in Splunk Web, see Set up and use HTTP Event Collector in Splunk Web.
Enable indexer acknowledgment using the inputs.conf configuration file
You can enable indexer acknowledgment for existing tokens on Splunk Enterprise by editing the inputs.conf configuration file in the HEC app.
- Open the inputs.conf file, which is at the following path:
- In *nix: $SPLUNK_HOME/etc/apps/splunk_httpinput/local/inputs.conf
- In Windows: %SPLUNK_HOME%\etc\apps\splunk_httpinput\local\inputs.conf
- Within the stanza that corresponds to the token for which you want to enable indexer acknowledgment, add the following line:
useACK=true
- Save and close the file.
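If you prefer to script this edit, the steps above can be sketched as follows. This is a hedged example under stated assumptions: the helper name enable_ack and the stanza name http://my_token are hypothetical, and Python's configparser is used only as a generic INI-style reader, so it drops comments when it rewrites the text. Back up the file and review the result before relying on it.

```python
import configparser
import io


def enable_ack(conf_text: str, stanza: str) -> str:
    """Return inputs.conf text with useACK=true added to the given token stanza."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve key case such as useACK
    parser.read_string(conf_text)
    if not parser.has_section(stanza):
        raise KeyError(f"stanza not found: {stanza}")
    parser.set(stanza, "useACK", "true")
    out = io.StringIO()
    parser.write(out)  # note: comments in the original text are not preserved
    return out.getvalue()
```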
About channels and sending data
Sending events to HEC with indexer acknowledgment active is similar to sending them with the setting off. There is one crucial difference: when you have indexer acknowledgment turned on, you must specify a channel when you send events.
The concept of a channel was introduced in HEC primarily to prevent a fast client from impeding the performance of a slow client. When you assign one channel per client, because channels are treated equally on Splunk Enterprise, one client can't affect another.
You must include a matching channel identifier both when sending data to HEC in an HTTP request and when requesting acknowledgment that events contained in the request have been indexed. If you don't, you will receive the error message, "Data channel is missing." Each request that includes a token for which indexer acknowledgment has been enabled must include a channel identifier, as shown in the following example cURL statement, where <data> represents the event data portion of the request:
curl https://mysplunk.com/services/collector -H "X-Splunk-Request-Channel: FE0ECFAD-13D5-401B-847D-77833BD77131" -H "Authorization: Splunk BD274822-96AA-4DA6-90EC-18940FB2414C" -d '<data>' -v
Alternatively, you can send the channel identifier as a URL query parameter named channel instead of in the X-Splunk-Request-Channel header field, as shown here:
curl "https://mysplunk.com/services/collector?channel=FE0ECFAD-13D5-401B-847D-77833BD77131" -H "Authorization: Splunk BD274822-96AA-4DA6-90EC-18940FB2414C" -d '<data>' -v
Indexer acknowledgment also works with raw JSON data. In that case, the endpoint to use in requests is /services/collector/raw. For more information, see Format events for HTTP Event Collector.
Channels are designed so that you assign a unique channel to each client that sends data to HEC. Each channel has a channel identifier (ID), which must be a Globally Unique Identifier (GUID) but can be randomly generated. You assign channel IDs simply by including them in requests as shown in the examples above. When Splunk Enterprise sees a new channel identifier, it creates a new channel.
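A client-side equivalent of the cURL examples might look like the following sketch. It assumes a hypothetical helper named build_event_request and a JSON body of the form {"event": ...}; check Format events for HTTP Event Collector for the payload your deployment expects. The sketch only builds the request and does not send it.

```python
import json
import urllib.request
import uuid


def build_event_request(base_url: str, token: str, channel: str, event) -> urllib.request.Request:
    """Build a POST to /services/collector that carries the channel identifier."""
    body = json.dumps({"event": event}).encode("utf-8")  # assumed payload envelope
    return urllib.request.Request(
        url=f"{base_url}/services/collector",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Splunk {token}",
            "X-Splunk-Request-Channel": channel,
        },
    )


# Each client generates one GUID and reuses it for every request it sends.
channel_id = str(uuid.uuid4())
```

To send the request, pass it to urllib.request.urlopen and read the ackID from the JSON body of the response.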
Query for indexing status
Once you enable indexer acknowledgment for a token, every request that a client sends to HEC using that token returns the following acknowledgment identifier (ackID) contained in a simple JSON object to the sender, where <int> represents a unique integer identifier that corresponds to the request:
{"ackID":"<int>"}
To verify that the indexer has indexed the events contained in the request, query the following endpoint, where <host> and <port> represent the hostname and port number of your Splunk platform instance, respectively:
https://<host>:<port>/services/collector/ack
In the body of the query, send a JSON object whose "acks" key lists the ackIDs to check, such as the following:
{"acks":[0,1,2]}
Following is an example cURL statement that queries Splunk Enterprise for the indexing status of the events contained in the requests with the identifiers "0", "1", "2", and "3":
curl "https://<host>:<port>/services/collector/ack?channel=FE0ECFAD-13D5-401B-847D-77833BD77131" -H "Authorization: Splunk BD274822-96AA-4DA6-90EC-18940FB2414C" -d '{"acks":[0,1,2,3]}'
Both the data channel ID (?channel=FE0ECFAD-13D5-401B-847D-77833BD77131) and the auth header ("Authorization: Splunk BD274822-96AA-4DA6-90EC-18940FB2414C") are required in this query. For more information, see the previous section About channels and sending data.
The body of the reply contains the status of each request whose status you queried. The following example response indicates that the requests with the ackIDs "0" and "2" were successfully indexed, but the requests with the ackIDs "1" and "3" were not successfully indexed:
{"acks": {"0": true, "1": false, "2": true, "3": false}}
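The pieces of the status query can be assembled programmatically, as in the following sketch. The helper name build_ack_query is hypothetical, and the host and port in any call are placeholders for your own instance. The function returns the URL, headers, and body without sending anything, so you can pass them to whichever HTTP client you use.

```python
import json
from urllib.parse import urlencode


def build_ack_query(host: str, port: int, channel: str, token: str, ack_ids) -> tuple:
    """Return (url, headers, body) for a POST to /services/collector/ack."""
    # The channel must match the one used when the events were sent.
    query = urlencode({"channel": channel})
    url = f"https://{host}:{port}/services/collector/ack?{query}"
    headers = {"Authorization": f"Splunk {token}"}
    body = json.dumps({"acks": sorted(ack_ids)})
    return url, headers, body
```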
Channel limits and indexing status expiration
Splunk Enterprise caches acknowledgment IDs and their corresponding status information in memory. To keep the server from running out of memory, and to protect it against malicious or misbehaving clients, HEC enforces several limits.
To prevent channels from being overloaded, and to prevent an excessive number of channels from being created, use the following settings in the http_input stanza of the limits.conf configuration file:
Setting | Value type | Default value | Description |
---|---|---|---|
max_number_of_acked_requests_pending_query_per_ack_channel | integer | 1000000 | Specifies the maximum number of ackIDs and their corresponding status information that are waiting to be queried in each channel. If a client makes many requests with indexer acknowledgment enabled, this setting prevents the client's channel from becoming full of ackIDs and status information and the client from receiving a server busy error. |
max_number_of_ack_channel | integer | 1000000 | Specifies the maximum number of channels that clients can acquire for this Splunk server instance. If a single client tries to acquire more than this number of channels, the request fails with a server busy error. This setting is used to prevent a client from acquiring too many channels. |
max_number_of_acked_requests_pending_query | integer | 10000000 | Specifies the maximum number of ackIDs and their corresponding status information in all channels. |
To reduce the likelihood of reaching these limits, Splunk Enterprise can clean up channels that are idle for a period of time and release the memory for those channels. Control this behavior with the following settings, which are set at the global ([http] stanza) level in the inputs.conf configuration file:
- In *nix: $SPLUNK_HOME/etc/apps/splunk_httpinput/local/inputs.conf
- In Windows: %SPLUNK_HOME%\etc\apps\splunk_httpinput\local\inputs.conf
Parameter | Value type | Default value | Description |
---|---|---|---|
ackIdleCleanup | boolean | false | When set to true, this parameter causes the server to remove channels that are idle for the number of seconds set in the maxIdleTime setting. |
maxIdleTime | integer | 600 | Specifies the maximum number of seconds that channels can be idle before they are removed. |
Indexer acknowledgment client behavior
To ensure data is successfully ingested into the Splunk platform, configure your clients with the ability to act on response codes that the HEC endpoint returns. If the client can't take an action based on the resulting response code, data loss might occur. For more information, see Possible error codes.
Follow these guidelines to ensure that clients that connect to HEC don't exhibit malicious behavior or end up hitting the limits described earlier in this topic. An indexer acknowledgment client must:
- Create its own GUID to use as its channel identifier.
- Send requests using only that channel.
- Save each acknowledgment identifier (ackID) that HEC returns after a request.
- Continually poll the /services/collector/ack REST endpoint at an interval (for instance, every 10 seconds) to confirm that acknowledgment status arrives in a timely manner. Because Splunk Enterprise deletes status information after clients retrieve it, this releases memory on the server.
- Resend any event data for which HEC has not sent an acknowledgment within a certain amount of time, for example, 5 minutes. It is safe to assume that, by that time, the event data has been lost. When you resend the event data, a good practice is to add some additional data in the event that indicates it might be duplicate data. It's possible the event was previously indexed but the status expired due to the cleanup of the channel, or the HEC status cache might have been cleared.
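The guidelines above can be combined into a small client loop, sketched below under several assumptions. The class name AckClient and its send and query callables are hypothetical stand-ins for your own HTTP code: send posts a payload on a channel and returns the ackID, and query returns a mapping of ackID to status. The 300-second resend window mirrors the 5-minute example and is not a recommended value.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AckClient:
    send: Callable[[str, str], int]          # (channel, payload) -> ackID
    query: Callable[[str, list], dict]       # (channel, ackIDs) -> {ackID: bool}
    resend_after: float = 300.0              # seconds to wait before resending
    clock: Callable[[], float] = time.monotonic
    # One self-generated GUID per client, used for every request.
    channel: str = field(default_factory=lambda: str(uuid.uuid4()))
    pending: dict = field(default_factory=dict)  # ackID -> (payload, sent_at)

    def submit(self, payload: str) -> int:
        """Send a payload and save the ackID that comes back."""
        ack_id = self.send(self.channel, payload)
        self.pending[ack_id] = (payload, self.clock())
        return ack_id

    def poll_once(self) -> list:
        """Query pending ackIDs, drop confirmed ones, and resend overdue payloads."""
        if not self.pending:
            return []
        statuses = self.query(self.channel, sorted(self.pending))
        confirmed = [a for a in sorted(self.pending) if statuses.get(a)]
        for ack_id in confirmed:
            del self.pending[ack_id]  # never query an ackID again after true
        now = self.clock()
        overdue = [a for a, (_, sent_at) in self.pending.items()
                   if now - sent_at >= self.resend_after]
        for ack_id in overdue:
            payload, _ = self.pending.pop(ack_id)
            # A resent payload might duplicate an event that was already indexed.
            self.submit(payload)
        return confirmed
```

Call poll_once on a timer, for instance every 10 seconds. Injecting send, query, and clock keeps the loop testable without a running Splunk instance; marking resent payloads as possible duplicates is left to the caller.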
This documentation applies to the following versions of Splunk® Enterprise: 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.1.9, 7.1.10, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 7.3.6, 7.3.7, 7.3.8, 7.3.9, 8.0.0, 8.0.1, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.6, 8.0.7, 8.0.8, 8.0.9, 8.0.10, 8.1.0, 8.1.1, 8.1.2, 8.1.3, 8.1.4, 8.1.5, 8.1.6, 8.1.7, 8.1.8, 8.1.9, 8.1.10, 8.1.11, 8.1.12, 8.1.13, 8.1.14, 8.2.0, 8.2.1, 8.2.2, 8.2.3, 8.2.4, 8.2.5, 8.2.6, 8.2.7, 8.2.8, 8.2.9, 8.2.10, 8.2.11, 8.2.12, 9.0.0, 9.0.1, 9.0.2, 9.0.3, 9.0.4, 9.0.5, 9.0.6, 9.0.7, 9.0.8, 9.0.9, 9.0.10, 9.1.0, 9.1.1, 9.1.2, 9.1.3, 9.1.4, 9.1.5, 9.2.0, 9.2.1, 9.2.2, 9.3.0