Get metrics in from other sources

If you are gathering metrics from a source that is not natively supported, you can still add this metrics data to a metrics index.

Get metrics in from files in CSV format

The Splunk software accepts two CSV formats as inputs for metrics data. The format you use depends on how you want the Splunk software to index the information in the CSV file: so that each metric data point contains multiple measurements, or so that each metric data point contains only one measurement.

It is more efficient to use metric data points that can contain multiple measurements. When you index metrics data this way you reduce your data storage costs and can benefit from improved search performance.

Set metrics CSV source types and data inputs

Create a data input to add your CSV data to a metrics index. If your metrics data is in CSV format, you have two pre-trained source type options for metric conversion: csv and metrics_csv.

Source type	Used for	CSV formatting information
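`csv`	CSV data that is formatted for multiple-measurement metric data points	See Format a CSV file for multiple-measurement metric data points
`metrics_csv`	CSV data that is formatted for single-measurement metric data points	See Format a CSV file for single-measurement metric data points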

The index you select for the CSV metrics data input must be a metrics index.

Here is an example of a universal forwarder inputs.conf configuration that monitors CSV data for multiple-measurement metric data points and sends that data to the metrics indexer.

#inputs.conf

[monitor:///opt/metrics_data]
index = metrics
sourcetype = csv

A universal forwarder configuration that monitors CSV data for single-measurement metric data points would be exactly the same, except it would have sourcetype = metrics_csv.

You should also have the following indexes.conf configuration on the metrics indexer:

#indexes.conf

[metrics]
homePath = $SPLUNK_DB/metrics/db
coldPath = $SPLUNK_DB/metrics/colddb
thawedPath = $SPLUNK_DB/metrics/thaweddb
datatype = metric
maxTotalDataSizeMB = 512000

See Monitor files and directories in the Getting Data In manual, and Create metrics indexes in the Managing Indexers and Clusters of Indexers manual.

Format a CSV file for multiple-measurement metric data points

When you format a CSV file for multiple-measurement metric data points, the first column header is _time, the metric timestamp. It is a required field.

This is followed by one or more column headers for each metric measurement. Each measurement column header follows this syntax: metric_name:<metric_name>.

The Splunk software considers additional columns that are not a timestamp or a measurement to be dimensions.

Each row of the CSV table is a separate metric data point.

Field name	Required	Description	Example value
`_time`	Yes	The metric timestamp. The format is epoch time (seconds elapsed since 1/1/1970), optionally with millisecond precision after the decimal point.	1504907933.000
`metric_name:<metric_name>`	Yes	A measurement for a specific metric, as specified by `<metric_name>`, such as `metric_name:os.cpu.idle` or `metric_name:max.size.kpbs`. The value is always numeric.	13.34
dimensions	No	All other fields are treated as dimensions.	For a dimension named `ip`, a value of `192.0.2.1`.

Here is an example of a CSV file that is formatted for multiple-measurement metric data points. The first column is _time, the metric timestamp. The middle three columns are measurements. The last two columns are dimensions.

"_time","metric_name:cpu.usr","metric_name:cpu.sys","metric_name:cpu.idle","dc","host"
"1562020701",11.12,12.23,13.34,"east","east.splunk.com"
"1562020702",21.12,22.33,23.34,"west","west.splunk.com"

This CSV file example contains the same information as the example CSV file for single-measurement metric data points in the following section. However, because it uses two data points for this information instead of six, it will take up less space on disk when it is indexed.
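As an illustration of this layout, here is a minimal Python sketch that writes metric data points in the multiple-measurement format using only the standard `csv` module. The function name `write_multi_measurement_csv` and the shape of the `points` records are assumptions made for this sketch. They are not part of any Splunk tooling.

```python
import csv
import io


def write_multi_measurement_csv(points, metric_names, dimension_names, out):
    """Write one CSV row per metric data point, one column per measurement.

    `points` is a hypothetical in-memory structure used only in this sketch.
    """
    header = (
        ["_time"]
        + [f"metric_name:{name}" for name in metric_names]
        + list(dimension_names)
    )
    # QUOTE_NONNUMERIC quotes strings and leaves numeric values bare,
    # which matches the quoting used in the example file above.
    writer = csv.writer(out, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerow(header)
    for point in points:
        row = [str(point["time"])]
        row += [point["measurements"][name] for name in metric_names]
        row += [point["dimensions"][name] for name in dimension_names]
        writer.writerow(row)


points = [
    {
        "time": 1562020701,
        "measurements": {"cpu.usr": 11.12, "cpu.sys": 12.23, "cpu.idle": 13.34},
        "dimensions": {"dc": "east", "host": "east.splunk.com"},
    },
    {
        "time": 1562020702,
        "measurements": {"cpu.usr": 21.12, "cpu.sys": 22.33, "cpu.idle": 23.34},
        "dimensions": {"dc": "west", "host": "west.splunk.com"},
    },
]

buffer = io.StringIO()
write_multi_measurement_csv(
    points, ["cpu.usr", "cpu.sys", "cpu.idle"], ["dc", "host"], buffer
)
csv_text = buffer.getvalue()
print(csv_text, end="")
```

In practice you would pass an open file in a monitored directory instead of the in-memory buffer.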

Format a CSV file for single-measurement metric data points

When you format a CSV file for single-measurement metric data points, the first three columns are fields that are required for single-measurement metric data points:

metric_timestamp
metric_name
_value

All additional columns are considered to be dimensions.

During the ingestion and indexing process, the metric_name and _value fields are merged into the metric_name:<metric_name>=<numeric_value> format.

Field name	Required	Description	Example value
`metric_timestamp`	Yes	The metric timestamp. The format is epoch time (seconds elapsed since 1/1/1970), optionally with millisecond precision after the decimal point.	1504907933.000
`metric_name`	Yes	The metric name using dotted-string notation.	os.cpu.percent
`_value`	Yes	The numerical value associated with the `metric_name`.	42.12345
dimensions	No	All other fields are treated as dimensions.	For a dimension named `ip`, a value of `192.0.2.1`.

Here is an example of a CSV file that is formatted for single-measurement metric data points. The first three columns of the table are the fields that are required for single-measurement metric data points. All additional columns are dimensions. This CSV file has dc and host as dimensions.

"metric_timestamp","metric_name","_value","dc","host"
"1562020701","cpu.usr",11.12,"east","east.splunk.com"
"1562020701","cpu.sys",12.23,"east","east.splunk.com"
"1562020701","cpu.idle",13.34,"east","east.splunk.com"
"1562020702","cpu.usr",21.12,"west","west.splunk.com"
"1562020702","cpu.sys",22.33,"west","west.splunk.com"
"1562020702","cpu.idle",23.34,"west","west.splunk.com"

If you compare this example to the example for multiple-measurement metric data points, you can see why the single-measurement format takes up more space on disk. This table contains the same information as the multiple-measurement table, but it uses six data points where the multiple-measurement table uses only two.
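To make the two-versus-six comparison concrete, the following Python sketch expands multiple-measurement CSV text into single-measurement rows. It is an illustration only: the function name `to_single_measurement_rows` is hypothetical, and it assumes that every column other than `_time` and the `metric_name:` columns is a dimension, as described earlier.

```python
import csv
import io

PREFIX = "metric_name:"


def to_single_measurement_rows(multi_csv_text):
    """Expand each multiple-measurement row into one row per measurement."""
    reader = csv.DictReader(io.StringIO(multi_csv_text))
    fieldnames = reader.fieldnames or []
    measurement_cols = [c for c in fieldnames if c.startswith(PREFIX)]
    dimension_cols = [
        c for c in fieldnames if c != "_time" and c not in measurement_cols
    ]
    rows = []
    for record in reader:
        for col in measurement_cols:
            row = {
                "metric_timestamp": record["_time"],
                "metric_name": col[len(PREFIX):],
                "_value": float(record[col]),
            }
            # Dimensions are repeated on every single-measurement row.
            row.update({d: record[d] for d in dimension_cols})
            rows.append(row)
    return rows


multi = (
    '"_time","metric_name:cpu.usr","metric_name:cpu.sys","metric_name:cpu.idle","dc","host"\n'
    '"1562020701",11.12,12.23,13.34,"east","east.splunk.com"\n'
    '"1562020702",21.12,22.33,23.34,"west","west.splunk.com"\n'
)

single_rows = to_single_measurement_rows(multi)
print(len(single_rows))  # 2 rows x 3 measurements = 6
```

Each output row repeats the timestamp and every dimension, which is where the extra storage in the single-measurement format comes from.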

Get metrics in from clients over TCP/UDP

You can add metrics data from a client that is not natively supported to a metrics index by manually configuring a source type for your data, then defining regular expressions to specify how the Splunk software should extract the required metrics fields. See Metrics data format.

For example, let's say you are using Graphite. The Graphite plaintext protocol format is:

<metric path> <metric value> <metric timestamp>

A sample metric might be:

510fcbb8f755.sda2.diskio.read_time 250 1487747370

To index these metrics, edit Splunk configuration files to manually specify how to extract fields.

Configure field extraction by editing configuration files

Define a custom source type for your metrics data.
1. In a text editor, open the props.conf configuration file from the local directory for the location you want to use, such as the Search & Reporting app ($SPLUNK_HOME/etc/apps/search/local/) or the system ($SPLUNK_HOME/etc/system/local). If a props.conf file does not exist in this location, create a text file and save it to that location.
2. Append a stanza to the props.conf file as follows:
```
# props.conf

[<metrics_sourcetype_name>]
TIME_PREFIX = <regular expression>
TIME_FORMAT = <strptime-style format>
TRANSFORMS-<class> = <transform_stanza_name>
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
pulldown_type = 1
category = Metrics
```
Define a regular expression for each metrics field to extract.
1. In a text editor, open the transforms.conf configuration file from the local directory for the location you want to use, such as the Search & Reporting app ($SPLUNK_HOME/etc/apps/search/local/) or the system ($SPLUNK_HOME/etc/system/local). If a transforms.conf file does not exist in this location, create a text file and save it to that location.
2. Append a stanza for each regular expression as follows:
```
# transforms.conf

[<transform_stanza_name>]
REGEX = <regular expression>
FORMAT = <string>
WRITE_META = true
```
Create a data input for this source type as described in Set up a data input for StatsD data, and select your custom source type.

For more about editing these configuration files, see About configuration files, props.conf, and transforms.conf in the Admin Manual.

Example of configuring field extraction

This example shows how to create a custom source type and regular expressions to extract fields from Graphite metrics data.

# props.conf.example

[graphite_plaintext]
TIME_PREFIX = \s(\d{0,10})$
TIME_FORMAT = %s
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
pulldown_type = 1
TRANSFORMS-graphite-host = graphite_host
TRANSFORMS-graphite-metricname = graphite_metric_name
TRANSFORMS-graphite-metricvalue = graphite_metric_value
category = Metrics

# transforms.conf.example

[graphite_host]
REGEX = ^(\S[^\.]+)
FORMAT = host::$1
DEST_KEY = MetaData:Host

[graphite_metric_name]
REGEX = \.(\S+)
FORMAT = metric_name::graphite.$1
WRITE_META = true

[graphite_metric_value]
REGEX = \w+\s+(\d+\.?\d*)\s+
FORMAT = _value::$1
WRITE_META = true
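
Before deploying extraction rules like these, it can help to try the patterns against a sample line. The following Python sketch does that with the standard `re` module. It rests on some assumptions: Python's regular expression engine is not the one the Splunk software uses, although these simple patterns behave the same way in both, and the value pattern in the sketch is written to accept single-digit and decimal values. The variable names are hypothetical.

```python
import re

# Sample line in the Graphite plaintext protocol:
# <metric path> <metric value> <metric timestamp>
sample = "510fcbb8f755.sda2.diskio.read_time 250 1487747370"

# Patterns modeled on the transforms.conf stanzas in this example.
HOST_RE = re.compile(r"^(\S[^\.]+)")         # text before the first dot
METRIC_NAME_RE = re.compile(r"\.(\S+)")      # rest of the metric path
VALUE_RE = re.compile(r"\w+\s+(\d+(?:\.\d+)?)\s+")  # number between path and timestamp

host = HOST_RE.search(sample).group(1)
# FORMAT = metric_name::graphite.$1 prepends "graphite." to the capture.
metric_name = "graphite." + METRIC_NAME_RE.search(sample).group(1)
value = float(VALUE_RE.search(sample).group(1))

print(host)
print(metric_name)
print(value)
```

For the sample line, the sketch yields a host of `510fcbb8f755`, a metric name of `graphite.sda2.diskio.read_time`, and a value of `250.0`.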

Get metrics in from clients over HTTP or HTTPS

If you want to send metrics data in JSON format from a client that is not natively supported to a metrics index over HTTP or HTTPS, use the HTTP Event Collector (HEC) and the /collector REST API endpoint.

Create a data input and token for HEC

In Splunk Web, click Settings > Data Inputs.
Under Local Inputs, click HTTP Event Collector.
Verify that HEC is enabled.
1. Click Global Settings.
2. For All Tokens, click Enabled if this button is not already selected.
3. Click Save.
Configure an HEC token for sending data by clicking New Token.
On the Select Source page, for Name, enter a token name, for example "Metrics token".
Leave the other options blank or unselected.
Click Next.
On the Input Settings page, for Source type, click New.
In Source Type, type a name for your new source type.
For Source Type Category, select Metrics.
Optionally, in Source Type Description type a description.
Next to Default Index, select your metrics index, or click Create a new index to create one.
If you choose to create an index, in the New Index dialog box:
1. Enter an Index Name.
2. For Index Data Type, click Metrics.
3. Configure additional index properties as needed.
4. Click Save.
Click Review, and then click Submit.
Copy the Token Value that is displayed. This HEC token is required for sending data.

See Set up and use HTTP Event Collector in Splunk Web in Getting Data In.

Send data to a metrics index over HTTP

Use the /collector REST API endpoint and your HEC token to send data directly to a metrics index as follows:

curl http://<splunk_host>:<HTTP_port>/services/collector -H 'Authorization: Splunk <HEC_token>' -d "<metrics_data>"

You need to provide the following values:

Splunk host machine (IP address, host name, or load balancer name)
HTTP port number
HEC token value
Metrics data, which may include an event field.

For more information about HEC, see Set up and use HTTP Event Collector in Splunk Web and Format events for HTTP Event Collector in Getting Data In.

For the /collector endpoint reference, see /collector in the REST API Reference Manual.

Example of sending metrics using HEC

The following example shows a command that sends a metric data point to a metrics index, with the following values:

Splunk host machine: "localhost"
HTTP port number: "8088"
HEC token value: "b0221cd8-c4b4-465a-9a3c-273e3a75aa29"

curl https://localhost:8088/services/collector \
-H "Authorization: Splunk b0221cd8-c4b4-465a-9a3c-273e3a75aa29" \
-d '{"time": 1486683865.000,"event":"metric","source":"disk","host":"host_1.splunk.com","fields":{"region":"us-west-1","datacenter":"dc1","rack":"63","os":"Ubuntu16.10","arch":"x64","team":"LON","service":"6","service_version":"0","service_environment":"test","path":"/dev/sda1","fstype":"ext3","metric_name:cpu.usr": 11.12,"metric_name:cpu.sys": 12.23, "metric_name:cpu.idle": 13.34}}'

The measurements for this metric data point appear at the end of the JSON blob. They follow a multiple-metric format that uses the "metric_name:<metric_name>":<numeric_value> syntax.
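
If you send metrics from a script rather than from the command line, the same request can be assembled with Python's standard library. The following is a minimal sketch, not a supported client: the function name `build_hec_request` is hypothetical, the host, port, and token are the placeholder values from the example above, and sending is left to the caller so that the sketch can run without a Splunk instance.

```python
import json
import urllib.request


def build_hec_request(host, port, token, data_point, scheme="https"):
    """Build (but do not send) a POST request for the /collector endpoint."""
    url = f"{scheme}://{host}:{port}/services/collector"
    body = json.dumps(data_point).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Authorization": f"Splunk {token}"},
    )


data_point = {
    "time": 1486683865.000,
    "event": "metric",
    "source": "disk",
    "host": "host_1.splunk.com",
    "fields": {
        "region": "us-west-1",
        "path": "/dev/sda1",
        "metric_name:cpu.usr": 11.12,
        "metric_name:cpu.sys": 12.23,
        "metric_name:cpu.idle": 13.34,
    },
}

request = build_hec_request(
    "localhost", 8088, "b0221cd8-c4b4-465a-9a3c-273e3a75aa29", data_point
)
print(request.full_url)

# To send the data point, pass the request to urllib.request.urlopen:
#   with urllib.request.urlopen(request) as response:
#       print(response.status, response.read().decode("utf-8"))
```

The sketch includes only two of the dimensions from the curl example to stay short. Any additional key in `fields` that does not start with `metric_name:` is sent as a dimension.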

The multiple-metric JSON format

Versions of the Splunk platform earlier than 8.0.0 used a JSON format that supported only one metric measurement per JSON object. This resulted in metric data points that could contain only one measurement each.

Starting with version 8.0.0, the Splunk platform supports a JSON format that allows each JSON object to contain measurements for multiple metrics. These JSON objects generate multiple-measurement metric data points. Multiple-measurement metric data points take up less space on disk and can improve search performance.

Here is an example of a JSON object in the multiple-metric format.

{
  "time": 1486683865,
  "event": "metric",
  "source": "metrics",
  "sourcetype": "perflog",
  "host": "host_1.splunk.com",
  "fields": {
    "region": "us-west-1",
    "datacenter": "dc2",
    "rack": "63",
    "os": "Ubuntu16.10",
    "arch": "x64",
    "team": "LON",
    "service": "6",
    "service_version": "0",
    "service_environment": "test",
    "path": "/dev/sda1",
    "fstype": "ext3",
    "metric_name:cpu.usr": 11.12,
    "metric_name:cpu.sys": 12.23,
    "metric_name:cpu.idle": 13.34
  }
}
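
A JSON object in this format can be assembled from separate sets of dimensions and measurements. The following Python sketch shows one way to do that. The function name `build_metric_payload` and its parameters are assumptions made for illustration, and the numeric check reflects only the rule that measurement values are numeric.

```python
import json


def build_metric_payload(timestamp, host, source, sourcetype,
                         dimensions, measurements):
    """Return a dict in the multiple-metric JSON format."""
    for name, value in measurements.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"measurement {name!r} must be numeric")
    fields = dict(dimensions)
    # Each measurement becomes a "metric_name:<metric_name>" key.
    fields.update({f"metric_name:{n}": v for n, v in measurements.items()})
    return {
        "time": timestamp,
        "event": "metric",
        "source": source,
        "sourcetype": sourcetype,
        "host": host,
        "fields": fields,
    }


payload = build_metric_payload(
    timestamp=1486683865,
    host="host_1.splunk.com",
    source="metrics",
    sourcetype="perflog",
    dimensions={"region": "us-west-1", "datacenter": "dc2", "rack": "63"},
    measurements={"cpu.usr": 11.12, "cpu.sys": 12.23, "cpu.idle": 13.34},
)
print(json.dumps(payload, indent=2))
```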
