Splunk Cloud Platform

Getting Data In

Troubleshoot HTTP Event Collector

You can troubleshoot HTTP Event Collector (HEC) by viewing error logs. You can also set up logging using configuration files, investigate instance performance with dashboards included in the Monitoring Console, and detect other scaling problems.

Logging

HTTP Event Collector saves usage data about itself to log files. You can search these usage metrics using Splunk Cloud Platform or Splunk Enterprise to explore usage trends system-wide, per token, per source type, and more, as well as to evaluate HEC performance. Metrics are logged whenever HEC is active. HEC is disabled by default, so it does not log data until you enable it.

You can also view HEC error logs in the splunkd.log log file on Splunk Enterprise. See Enable debug logging in the Troubleshooting Manual for how to enable debugging on your Splunk Enterprise instance.

Log file location and management

Splunk Enterprise writes HTTP Event Collector metrics to the $SPLUNK_HOME/var/log/introspection/http_event_collector_metrics.log file.

The Splunk platform creates a new http_event_collector_metrics.log file when you log out of and back in to Splunk Cloud Platform or when you start your Splunk Enterprise instance. It renames any existing file with that name.

You configure the logging frequency of HTTP Event Collector metrics in the limits.conf configuration file. The default interval is 60 seconds. HEC continues logging system-level metrics even when there is no data input activity. When there is no activity, expect about 200 kilobytes (KB) of metrics log data every 24 hours. The maximum size of a metrics log file is 25 megabytes (MB). When a log file reaches that limit, the Splunk platform renames it and creates a new file. The Splunk platform stores up to five metrics log files at a time.

The props.conf configuration file defines parameters for reading and indexing the metrics log file.

Searching HTTP Event Collector metrics data

The Splunk platform puts HEC metrics data into the _introspection index. To search the accumulated HEC metrics with the Splunk platform, use the following search command:

index="_introspection" token

Metrics log data format

The Splunk platform records HEC metrics data to the log in JSON format. This means that the log is both human-readable and consistent with other Splunk Cloud Platform or Splunk Enterprise log formats. The log contains both input summary metrics entries (series = http_event_collector) and per-token metrics entries (series = http_event_collector_token), as shown in the following example:

{  
   "datetime":"09-01-2016 19:21:19.014 -0700",
   "log_level":"INFO",
   "component":"HttpEventCollector",
   "data":{  
      "series":"http_event_collector",
      "transport":"http",
      "format":"json",
      "total_bytes_received":0,
      "total_bytes_indexed":0,
      "num_of_requests":0,
      "num_of_events":0,
      "num_of_errors":0,
      "num_of_parser_errors":0,
      "num_of_auth_failures":0,
      "num_of_requests_to_disabled_token":0,
      "num_of_requests_to_incorrect_url":0,
      "num_of_requests_in_mint_format":0,
      "num_of_ack_requests":0,
      "num_of_requests_acked":0,
      "num_of_requests_waiting_ack":0
   }
}

{  
   "datetime":"08-22-2016 12:38:04.854 -0700",
   "log_level":"INFO",
   "component":"HttpEventCollector",
   "data":{  
      "token_name":"test",
      "series":"http_event_collector_token",
      "transport":"http",
      "format":"json",
      "total_bytes_received":57000,
      "total_bytes_indexed":44000,
      "num_of_requests":1000,
      "num_of_events":1000,
      "num_of_errors":0,
      "num_of_parser_errors":0,
      "num_of_requests_to_disabled_token":0,
      "num_of_requests_in_mint_format":0
   }
}
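
As a minimal sketch of working with entries like the preceding ones outside of a search, the following Python separates summary entries from per-token entries by the data.series field and computes the average received bytes per event. It assumes that each entry is stored as one JSON object per line. The function names and the abridged sample strings are illustrative and are not part of the Splunk platform.

```python
import json

# Two entries abridged from the sample log output in this topic.
SAMPLE_LINES = [
    '{"datetime":"09-01-2016 19:21:19.014 -0700","log_level":"INFO",'
    '"component":"HttpEventCollector","data":{"series":"http_event_collector",'
    '"total_bytes_received":0,"num_of_requests":0,"num_of_events":0}}',
    '{"datetime":"08-22-2016 12:38:04.854 -0700","log_level":"INFO",'
    '"component":"HttpEventCollector","data":{"token_name":"test",'
    '"series":"http_event_collector_token","total_bytes_received":57000,'
    '"total_bytes_indexed":44000,"num_of_requests":1000,"num_of_events":1000}}',
]


def split_by_series(lines):
    """Group parsed entries into summary and per-token lists by data.series."""
    summary, per_token = [], []
    for line in lines:
        entry = json.loads(line)
        series = entry["data"]["series"]
        if series == "http_event_collector":
            summary.append(entry)
        elif series == "http_event_collector_token":
            per_token.append(entry)
    return summary, per_token


def bytes_per_event(entry):
    """Average received bytes per event, or None when no events were logged."""
    data = entry["data"]
    events = data.get("num_of_events", 0)
    return data["total_bytes_received"] / events if events else None


summary, per_token = split_by_series(SAMPLE_LINES)
for entry in per_token:
    print(entry["data"]["token_name"], bytes_per_event(entry))
```

Returning None for entries with zero events avoids a division by zero on idle summary entries, which the platform logs even when there is no input activity.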

HEC summary metrics

The Splunk platform accumulates system-wide summary metrics even if there is no input activity. These metrics are identified by "series":"http_event_collector".

See the following table for a description of the fields for HEC summary metrics:

Field | Description | Value
component | HTTP Event Collector metrics data identifier. | HttpEventCollector
data:format | HTTP Event Collector data format. | json
data:num_of_auth_failures | Total number of authentication failures due to invalid token. | unsigned integer
data:num_of_errors | Total number of per-token errors, which include the following: bad data format, no authorization, bad authorization, and connectivity problems. | unsigned integer
data:num_of_events | Total number of per-token events received by the HTTP Event Collector endpoint. | unsigned integer
data:num_of_parser_errors | Total number of per-token parser errors due to incorrectly formatted event data. | unsigned integer
data:num_of_requests | Total number of valid per-token individual HTTP or HTTPS requests received by an HTTP Event Collector endpoint. Each request can have one or more data events. | unsigned integer
data:num_of_ack_requests | Total number of HEC request indexer status queries received. | unsigned integer
data:num_of_requests_acked | Total number of HEC requests that Splunk successfully indexed and acknowledged. | unsigned integer
data:num_of_requests_waiting_ack | Total number of HEC requests received with indexer acknowledgements enabled. | unsigned integer
data:num_of_requests_to_incorrect_url | Total number of requests to an incorrect URL. | unsigned integer
data:num_of_requests_in_mint_format | Total number of requests from Splunk MINT. | unsigned integer
data:num_of_requests_to_disabled_token | Total number of per-token requests to a disabled token. | unsigned integer
data:series | Metrics data type. | http_event_collector
data:total_bytes_indexed | Total amount of per-token data sent to the indexer. | unsigned integer
data:total_bytes_received | Total amount of per-token data received by calling the receive/token endpoint. | unsigned integer
data:transport | Data transport protocol for HTTP Event Collector data. | http
datetime | Date and time associated with the data. Takes the following format: MM-DD-YYYY HH:MM:SS.SSS +/-GMTDELTA | string
log_level | Log severity level. | INFO

Per-token metrics

In contrast to the system-wide summary metrics, the Splunk platform accumulates per-token metrics only when HEC is active. These metrics are identified by "series":"http_event_collector_token".

The [http_input] stanza in the limits.conf configuration file defines the logging interval and maximum number of tokens logged for these metrics.

See the following table for a description of the fields for per-token metrics:

Field | Description | Value
component | HTTP Event Collector metrics data identifier. | HttpEventCollector
data:format | HTTP Event Collector data format. Always JSON format for metrics logging. | json
data:num_of_errors | Number of errors, which include the following: bad data format, no authorization, bad authorization, and connectivity problems. | unsigned integer
data:num_of_events | Number of events received by the HTTP Event Collector endpoint. | unsigned integer
data:num_of_parser_errors | Number of parser errors due to incorrectly formatted event data. | unsigned integer
data:num_of_requests | Number of valid individual HTTP or HTTPS requests received by an HTTP Event Collector endpoint. Each request can have one or more data events. | unsigned integer
data:num_of_requests_in_mint_format | Total number of requests from Splunk MINT. | unsigned integer
data:num_of_requests_to_disabled_token | Number of requests to a disabled token. | unsigned integer
data:series | Metrics data type. | http_event_collector_token
data:token_name | Token name. | string
data:total_bytes_indexed | Total amount of data sent to the indexer. | unsigned integer
data:total_bytes_received | Total amount of data received by calling the receive/token endpoint. | unsigned integer
data:transport | Data transport protocol for HTTP Event Collector data. | http
datetime | Date and time associated with the data. Takes the following format: MM-DD-YYYY HH:MM:SS.SSS +/-GMTDELTA | string
log_level | Log severity level. | INFO

Logging with configuration files

The limits.conf and props.conf files control metrics data logging and indexing behavior.

limits.conf

The [http_input] stanza in the $SPLUNK_HOME/etc/system/default/limits.conf file controls HTTP Event Collector metrics data logging.

For information about all HTTP Event Collector-related parameters, including those not related to metrics, see the [http_input] stanza documentation on limits.conf in the Splunk Enterprise Admin Manual.

The limits.conf file takes the following parameters for metrics logging:

Parameter | Default value | Description
max_number_of_tokens | 10000 | An unsigned integer that represents the maximum number of tokens reported by HTTP Event Collector metrics.
metrics_report_interval | 60 | An unsigned integer that represents the number of seconds in an HTTP Event Collector metrics report interval.
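
To illustrate how these two parameters and their defaults fit together, the following Python sketch reads an [http_input] stanza from configuration text and falls back to the documented defaults for any parameter that is not set. It is only an illustration: it does not reproduce how the Splunk platform layers configuration files, and the override value of 30 and the function name are hypothetical.

```python
import configparser

# Defaults taken from the parameter table in this topic.
DEFAULTS = {"max_number_of_tokens": 10000, "metrics_report_interval": 60}

# Hypothetical override text; the value 30 is illustrative only.
SAMPLE_CONF = """
[http_input]
metrics_report_interval = 30
"""


def effective_metrics_settings(conf_text):
    """Return the [http_input] metrics settings, falling back to the defaults."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    settings = dict(DEFAULTS)
    if parser.has_section("http_input"):
        for key in DEFAULTS:
            if parser.has_option("http_input", key):
                settings[key] = parser.getint("http_input", key)
    return settings


settings = effective_metrics_settings(SAMPLE_CONF)
# Number of metrics reports written per day at the effective interval.
reports_per_day = 24 * 60 * 60 // settings["metrics_report_interval"]
print(settings, reports_per_day)
```

With the default interval of 60 seconds, the same arithmetic gives 1440 reports per day.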

props.conf

The [http_event_collector_metrics] stanza in the $SPLUNK_HOME/etc/system/default/props.conf file controls reading and indexing the HTTP Event Collector log files.

See the following example:

[source::.../http_event_collector_metrics.log(.\d+)?]
sourcetype = http_event_collector_metrics

...

[http_event_collector_metrics]
SHOULD_LINEMERGE = false
TIMESTAMP_FIELDS = datetime
TIME_FORMAT = %m-%d-%Y %H:%M:%S.%l %z
INDEXED_EXTRACTIONS = json
KV_MODE = none
JSON_TRIM_BRACES_IN_ARRAY_NAMES = true

The props.conf file takes the following parameters for metrics logging:

Parameter | Default | Description
SHOULD_LINEMERGE | false | Whether to combine multiple lines of data into a single event. Setting to false treats each line of the metrics log as a separate event.
TIMESTAMP_FIELDS | datetime | Log entry time field name.
TIME_FORMAT | %m-%d-%Y %H:%M:%S.%l %z | Log entry time field format.
INDEXED_EXTRACTIONS | json | Metrics log format. Always in JSON format for metrics logging.
KV_MODE | none | Search-time key-value extraction mode. Setting to none turns off search-time key-value extraction. Always none for metrics logging.
JSON_TRIM_BRACES_IN_ARRAY_NAMES | true | Whether to trim brace characters from JSON array names.
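
If you process the metrics log outside of the Splunk platform, you need an equivalent of the TIME_FORMAT value for the datetime field. The following Python sketch parses a datetime value taken from the sample log entries in this topic. The Python format string is an assumption: it uses %f for the fractional seconds that appear in the sample values, because Python has no %l directive. The function name is illustrative.

```python
from datetime import datetime

# Python counterpart of the TIME_FORMAT value. %f stands in for the
# fractional-seconds part (an assumption about what %l denotes here).
PY_TIME_FORMAT = "%m-%d-%Y %H:%M:%S.%f %z"


def parse_metrics_datetime(value):
    """Parse a datetime field such as '09-01-2016 19:21:19.014 -0700'."""
    return datetime.strptime(value, PY_TIME_FORMAT)


parsed = parse_metrics_datetime("09-01-2016 19:21:19.014 -0700")
print(parsed.isoformat())
```

The result is a timezone-aware datetime, so entries logged with different UTC offsets can be compared directly.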

Possible error codes

The following status codes have particular meaning for all HTTP Event Collector endpoints:

Status code | HTTP status code ID | HTTP status code | Status message
0 | 200 | OK | Success
1 | 403 | Forbidden | Token disabled
2 | 401 | Unauthorized | Token is required
3 | 401 | Unauthorized | Invalid authorization
4 | 403 | Forbidden | Invalid token
5 | 400 | Bad Request | No data
6 | 400 | Bad Request | Invalid data format
7 | 400 | Bad Request | Incorrect index
8 | 500 | Internal Error | Internal server error
9 | 503 | Service Unavailable | Server is busy
10 | 400 | Bad Request | Data channel is missing
11 | 400 | Bad Request | Invalid data channel
12 | 400 | Bad Request | Event field is required
13 | 400 | Bad Request | Event field cannot be blank
14 | 400 | Bad Request | ACK is disabled
15 | 400 | Bad Request | Error in handling indexed fields
16 | 400 | Bad Request | Query string authorization is not enabled

To ensure data is successfully ingested into the Splunk platform, configure your clients with the ability to act on response codes returned by the HEC endpoint. If the client can't take an action based on the resulting response code, data loss might occur.
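
One way a client might act on these codes is sketched in the following Python. It assumes that the HEC response body is a JSON object with a numeric code field, such as {"text":"Success","code":0}, and it falls back to the HTTP status when the body cannot be parsed. The grouping of codes into retry, authentication, and request problems is an illustrative policy, not a Splunk recommendation, and the function name is hypothetical.

```python
import json

# Illustrative policy only: how a client might group HEC status codes.
RETRYABLE_CODES = {8, 9}      # Internal server error, Server is busy
AUTH_CODES = {1, 2, 3, 4}     # Token disabled, required, or invalid


def classify_hec_response(http_status, body_text):
    """Return 'ok', 'retry', 'fix_auth', or 'fix_request' for a HEC response."""
    try:
        code = json.loads(body_text).get("code")
    except (ValueError, TypeError, AttributeError):
        code = None  # Body missing or not a JSON object; use the HTTP status.
    if code == 0 or (code is None and http_status == 200):
        return "ok"
    if code in RETRYABLE_CODES or (code is None and http_status >= 500):
        return "retry"
    if code in AUTH_CODES or (code is None and http_status in (401, 403)):
        return "fix_auth"
    # Remaining codes are 400-level request problems that a retry cannot fix.
    return "fix_request"


print(classify_hec_response(503, '{"text":"Server is busy","code":9}'))
```

A client that gets "retry" would typically keep the batch and resend it after a delay, while "fix_auth" and "fix_request" indicate that resending the same request unchanged is unlikely to succeed.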

Investigate instance performance with the Monitoring Console

The Monitoring Console provides a pre-built dashboard that you can use to monitor HTTP Event Collector and investigate your instance performance. See Indexing: Inputs: HTTP Event Collector in the Monitoring Splunk Enterprise manual.

Detect scaling problems

If you experience performance slowdowns or want to speed up your HTTP Event Collector deployment, consider the following factors, which can affect performance.

HTTP and HTTPS

Sending data over HTTP results in a significant performance improvement compared to sending data over HTTPS.

Batching

Batching multiple events into a single request can speed up data transmission. Because the request metadata applies to all events in the request, less data is sent overall. For more information about how event data is packaged, see Format events for HTTP Event Collector.
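
The following Python sketch shows one way a client could group events into batches before sending them. It assumes the batch format in which JSON event objects are placed back to back in a single request body, and the function names and batch size are illustrative. It only builds request bodies and does not send anything.

```python
import json


def build_batch_body(events):
    """Serialize events as back-to-back JSON objects for one request body."""
    return "".join(
        json.dumps({"event": event}, separators=(",", ":")) for event in events
    )


def chunked(items, size):
    """Yield consecutive slices of items with at most size elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


events = ["login ok", "login failed", "logout"]
# Two events per request is an arbitrary illustrative batch size.
bodies = [build_batch_body(chunk) for chunk in chunked(events, 2)]
print(bodies)
```

Three events sent this way need two requests instead of three, so the per-request overhead is paid less often.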

HTTP Keep-alive

Setting keep-alive on your connection can improve performance. As long as the client sending the data supports HTTP/1.1 and is set up to use HTTP persistent connections, you can optimize performance with keep-alive.
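
As a rough sketch of connection reuse on the client side, the following Python sends several request bodies over one connection object from the standard http.client module, which speaks HTTP/1.1 and keeps the connection open between requests when the server allows it. The /services/collector/event path, the "Splunk <token>" authorization header, and port 8088 are assumptions about a typical HEC setup, and the function names are hypothetical.

```python
import http.client


def open_connection(host, port=8088):
    """Create a connection object; no network traffic occurs until a request."""
    return http.client.HTTPSConnection(host, port, timeout=10)


def send_bodies(conn, token, bodies, path="/services/collector/event"):
    """POST each body over the same connection and return the HTTP statuses."""
    headers = {
        "Authorization": f"Splunk {token}",
        "Content-Type": "application/json",
        "Connection": "keep-alive",
    }
    statuses = []
    for body in bodies:
        conn.request("POST", path, body=body.encode("utf-8"), headers=headers)
        response = conn.getresponse()
        response.read()  # Drain the response so the connection can be reused.
        statuses.append(response.status)
    return statuses
```

A caller would create one connection with open_connection, pass it to send_bodies for every batch, and close it when finished, instead of opening a new connection per request.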

Persistent queues

Persistent queuing slows performance because it writes data from an input queue to disk. For more information, see Use persistent queues to help prevent data loss.

Last modified on 08 October, 2024

This documentation applies to the following versions of Splunk Cloud Platform: 8.2.2112, 8.2.2201, 8.2.2202, 8.2.2203, 9.0.2205, 9.0.2208, 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403, 9.2.2406 (latest FedRAMP release), 9.3.2408

