Naming conventions for metrics and dimensions 🔗
Read this document to learn about naming conventions and recommendations for custom metrics and dimensions in Splunk Observability Cloud.
All metrics and MTS generated by Splunk Observability Cloud start with the prefix
Types of data in Observability Cloud 🔗
Imported data 🔗
When you use an existing data collection integration such as the collectd agent or the AWS CloudWatch integration, the integration defines metric, dimension, and event names for you. To learn more, see Metric name standards.
To make it easier for you to find and work with metrics coming in from different sources, Splunk Infrastructure Monitoring pulls, transforms, and returns the data in a unified format called virtual metrics. See Virtual metrics in Splunk Infrastructure Monitoring for more information.
Custom data 🔗
When you send custom metrics, dimensions, or events (key-value pairs you send to mark specific events such as a release) to Splunk Infrastructure Monitoring, you choose your own names.
Send custom data to Observability Cloud 🔗
To learn how to send custom metrics to Observability Cloud using our API, see the developer portal.
If you’re using the OpenTelemetry Collector, you can create a receiver to Send custom metrics to Splunk Observability Cloud.
Modify naming schemes you sent to other metric systems 🔗
If you’re working with metrics that you had previously sent to other metric systems, such as Graphite, modify the naming scheme to leverage the full feature set of Splunk Observability Cloud.
Metric name standards 🔗
Metrics are distinct numeric measurements generated by system infrastructure, application instrumentation, or other hardware or software, which change over time. For example:
Count of GET requests received
Percent of total memory in use
Network response time in milliseconds
Read more on metrics in Metrics, data points, and metric time series in Splunk Observability Cloud.
Use descriptive names 🔗
Metric names can have up to 256 characters. Use names that help you identify what the metric is related to.
If you apply a calculation to the metric before you send it, use the calculation as part of the description. For example, if you calculate the ninety-fifth percentile of measurements and send the result in a metric, use p95 as part of the metric name.
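As a minimal sketch of this recommendation, the following Python snippet computes a ninety-fifth percentile before sending and puts p95 in the metric name. The metric name web.http.latency.p95, the sample values, and the dictionary layout are hypothetical and only illustrate the naming idea.

```python
import statistics


def p95(values):
    """Return the 95th percentile of a list of numeric measurements."""
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    # method="inclusive" keeps the result within the observed min and max.
    return statistics.quantiles(values, n=100, method="inclusive")[94]


# Hypothetical latency samples in milliseconds.
samples_ms = [12, 15, 11, 48, 13, 14, 90, 16, 12, 13]

# The calculation applied before sending (p95) is part of the metric name.
datapoint = {"metric": "web.http.latency.p95", "value": p95(samples_ms)}
```

Anyone who reads the metric name can tell that the value is already a percentile and not a raw measurement.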
On the other hand, some information is better suited for dimensions than for metric names, such as the description of the hardware or software being measured. For example, don’t put production1 in a metric name to indicate that the measurement is for a particular host. To learn more, see Type of information suitable for dimensions.
Use metric names to indicate metric types 🔗
Follow these best practices to use names to indicate different metric types:
Give each metric its own name.
When you define your own metric, give each metric a name that includes a reference to the metric type.
Avoid assigning custom metric names that include dimensions. For example, if you have 100 server instances and you want to create a custom metric that tracks the number of disk writes for each one, differentiate between the instances with a dimension.
Create metric names using a hierarchical structure 🔗
Start at the highest level, then add more specific values as you proceed.
Start with a domain or namespace that the metric belongs to, such as analytics or web.
Next, add the entity that the metric measures, such as jobs or http.
At your discretion, add intermediate names, such as errors.
Finish with a unit of measurement.
For example, the SignalFlow analytics service reports the following metrics:
analytics.jobs.total: Gauge metric that periodically measures the current number of executing jobs
analytics.thrift.execute.count: Counter metric that’s incremented each time a new job starts
analytics.thrift.execute.time: Gauge metric that measures the time needed to process a job execution request
analytics.jobs_by_state: Counter metric with a dimension key called state, incremented each time a job reaches a particular state
In this example, all of these metrics have a dimension key called hostname with values such as analytics-1, analytics-2, and so forth. These metrics also have a customer dimension key with values org-x, org-y, and so on. The dimensions provide an infrastructure-focused or a customer-focused view of the analytics service usage. For more information on gauge metrics, see Identify metric types.
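The hierarchical structure described in this section can be sketched as a small helper that joins the parts from most general to most specific. The function build_metric_name and its parameters are hypothetical. The period delimiter is one of the delimiters this document recommends.

```python
def build_metric_name(domain, entity, *intermediate, unit):
    """Join name parts from the highest level down to the unit of measurement."""
    parts = [domain, entity, *intermediate, unit]
    for part in parts:
        # Each part is a single level, so it can't be empty or contain the delimiter.
        if not part or "." in part:
            raise ValueError(f"invalid metric name part: {part!r}")
    return ".".join(parts)


# Domain, then entity, then optional intermediate names, then the unit.
jobs_total = build_metric_name("analytics", "jobs", unit="total")
execute_count = build_metric_name("analytics", "thrift", "execute", unit="count")
http_errors = build_metric_name("web", "http", "error", unit="count")
```

Making unit a keyword-only argument makes it harder to forget the final part of the name.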
Dimension names and value standards 🔗
Dimensions are arbitrary key-value pairs you associate with metrics. While metrics identify a measurement, dimensions identify a specific aspect of the system that’s generating the measurement or characterize the measurement. Use dimensions to:
Classify different streams of data points for a metric.
Simplify filtering and aggregation. For example, SignalFlow lets you filter and aggregate data streams by one or more dimensions.
Dimensions can be numeric or nonnumeric. Some dimensions, such as host name and value, come from a system you’re monitoring. You can also create your own dimensions.
Dimension name requirements 🔗
Dimension names have the following requirements:
UTF-8 string, maximum length of 128 characters (512 bytes).
Dimension values can have a maximum length of 256 characters.
Must start with an uppercase or lowercase letter. The rest of the name can contain letters, numbers, underscores (_), hyphens (-), and periods (.).
Must not start with the underscore character (_).
Must not start with the prefix
sf_, except for dimensions defined by Observability Cloud such as
Must not start with the prefix
Dimension values are UTF-8 strings with a maximum length of 256 UTF-8 characters (1024 bytes). Numbers are represented as numeric strings.
You can have up to 36 dimensions per MTS. If this limit is exceeded, the data point is dropped, and a message is logged.
To ensure readability, keep names and values to 40 characters or fewer.
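The requirements in this section can be checked on the client before you send data. The following is a minimal sketch and not an official validator: the function dimension_problems is hypothetical, it treats "letter" as an ASCII letter (an assumption), it rejects every sf_ prefix because custom dimensions can't use it, and it only covers the rules that are fully stated in this section.

```python
import re

MAX_NAME_CHARS = 128
MAX_VALUE_CHARS = 256
MAX_DIMENSIONS_PER_MTS = 36

# Starts with a letter, then letters, numbers, underscores, hyphens, or periods.
# Assumption: "letter" means an ASCII letter.
_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_.\-]*")


def dimension_problems(dimensions):
    """Return a list of human-readable problems found in a dimensions dict."""
    problems = []
    if len(dimensions) > MAX_DIMENSIONS_PER_MTS:
        problems.append(f"more than {MAX_DIMENSIONS_PER_MTS} dimensions")
    for name, value in dimensions.items():
        if len(name) > MAX_NAME_CHARS:
            problems.append(f"{name!r}: name longer than {MAX_NAME_CHARS} characters")
        if not _NAME_PATTERN.fullmatch(name):
            problems.append(f"{name!r}: must start with a letter and use only allowed characters")
        if name.startswith("sf_"):
            problems.append(f"{name!r}: the sf_ prefix is reserved")
        # Numbers are represented as numeric strings, so measure the string form.
        if len(str(value)) > MAX_VALUE_CHARS:
            problems.append(f"{name!r}: value longer than {MAX_VALUE_CHARS} characters")
    return problems
```

An empty list means that the dimensions pass these checks. For example, dimension_problems({"host": "analytics-1"}) returns an empty list, while a name such as _hidden or sf_custom produces at least one problem.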
Considerations for metric and dimension names in your organization 🔗
Create consistent names for your organization:
Use a single consistent delimiter in metric names. Using a single consistent delimiter in metric names helps you search with wildcards. Use periods or underscores as delimiters. Don’t use colons or slashes.
Avoid changing metric and dimension names. If you change a name, you have to update the charts and detectors that use the old name. Infrastructure Monitoring doesn’t do this automatically.
Since you’re not the only person using the metric or dimension, use names that are easy to identify and understand. Follow established conventions. To find out the conventions in your organization, browse your metrics using the Metric Finder.
Guidelines for working with low and high cardinality data 🔗
Send low-cardinality data only in metric names or dimension key names. Low-cardinality data has a small number of distinct values. For example, the metric name web.http.error.count for a gauge metric that reports the number of HTTP request errors has a single value. This name is also readable and self-explanatory. For more information on gauge metrics, see Identify metric types.
High-cardinality data has a large number of distinct values. For example, timestamps are high-cardinality data. Send high-cardinality data only in dimension values. If you send high-cardinality data in metric names, Infrastructure Monitoring might not ingest the data. Infrastructure Monitoring rejects metrics with names that contain timestamps. High-cardinality data does have legitimate uses. For example, in containerized environments, container_id is usually a high-cardinality field. If you include container_id in a metric name such as system.cpu.utilization.<container_id>, instead of having one MTS, you have as many MTS as you have containers.
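To make the container_id case concrete, the following sketch counts the distinct metric names produced by each approach. The container IDs and the tuple layout are hypothetical. Only the metric name system.cpu.utilization and the container_id field come from this section.

```python
# Hypothetical container IDs for 500 running containers.
container_ids = [f"c{i:04d}" for i in range(500)]

# Anti-pattern: the high-cardinality value is embedded in the metric name,
# so every container creates a different metric name.
names_embedded = {f"system.cpu.utilization.{cid}" for cid in container_ids}

# Recommended: one metric name, with the high-cardinality value in a dimension.
series = [("system.cpu.utilization", {"container_id": cid}) for cid in container_ids]
names_with_dimension = {name for name, _dimensions in series}

embedded_count = len(names_embedded)         # one metric name per container
dimension_count = len(names_with_dimension)  # a single metric name
```

With the dimension approach, one metric name covers every container, and container_id stays available for filtering and aggregation.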
When to use metrics or dimensions 🔗
Use metrics when tracking different metric types 🔗
In Infrastructure Monitoring, all metrics belong to a specific metric type, with a specific default rollup. To learn more about metric types, see Metric types.
To track a measurable value using two different metric types, use two metrics instead of one metric with two dimensions.
For example, suppose you have a
network_latency measurement that you want to send as two different metric types: a gauge metric (the average network latency in milliseconds) and a counter metric (the total number of latency values sent in an interval). In this case, send the measurement using two different metric names, such as
network_latency.count, instead of one metric name with two dimensions
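The following sketch shows one way to send the same measurement as two metrics of different types. The function build_latency_payload is hypothetical, and so is the name network_latency.average, which stands in for the gauge metric. The layout, with data points grouped by metric type, is an assumption to verify against the API reference before you use it.

```python
def build_latency_payload(latencies_ms, dimensions):
    """Build one gauge and one counter data point from the same measurements."""
    average_ms = sum(latencies_ms) / len(latencies_ms)
    return {
        # Gauge: the average network latency in milliseconds.
        # The name network_latency.average is hypothetical.
        "gauge": [
            {"metric": "network_latency.average", "value": average_ms, "dimensions": dimensions}
        ],
        # Counter: the total number of latency values sent in the interval.
        "counter": [
            {"metric": "network_latency.count", "value": len(latencies_ms), "dimensions": dimensions}
        ],
    }


payload = build_latency_payload([10, 20, 30], {"host": "analytics-1"})
```

Each metric name maps to a single metric type, so each one keeps its own default rollup.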
Type of information suitable for dimensions 🔗
The following are examples of the types of information you can add to dimensions:
Categories rather than measurements: If doing an arithmetic operation on dimension values results in something meaningful, the data is a measurement and doesn’t belong in a dimension.
Metadata for filtering, grouping, or aggregating.
Name of entity being measured: For example
Metadata with large number of possible values: Use one dimension key for many different dimension values.
Nonnumeric values: Numeric dimension values are usually labels rather than measurements.
Example: Custom metrics and dimensions to measure HTTP errors 🔗
Let’s imagine you want to track the following data to oversee HTTP errors:
Number of errors
HTTP response code for each error
Host that reported the error
Service (app) that returned the error
Suppose you identify your data with a long metric name instead of a metric name and a dimension. For example, web.http.myhost.checkout.error.500.count might be a long metric name that represents the number of HTTP response code 500 errors reported by the host named myhost for the service checkout.
If you use web.http.myhost.checkout.error.500.count, you might encounter the following issues:
To visualize this data in a Splunk Infrastructure Monitoring chart, you have to run a wildcard query with the syntax
To sum up the errors by host, service, or error type, you have to change the query.
You can’t use filters or dashboard variables in your chart.
You have to define a separate metric name to track HTTP 400 errors, or errors reported by other hosts, or errors reported by other services.
Instead, use dimensions to track the same data:
Define a metric name that describes the measurement you want, which is the number of HTTP errors: web.http.error.count. The metric name includes the following:
web: Your name for a family of metrics for web measurements
http.error: Your name for the protocol you’re measuring (http) and an aspect of the protocol (error)
count: The unit of measure
Define dimensions that categorize the errors. The dimensions include the following:
host: The host that reported the error
service: The service that returned the error
error_type: The HTTP response code for the error
This way, to visualize the error data using a chart, you can search for “error count” to locate the metric by name. When you create the chart, you can filter and aggregate incoming metric time series by host, service, error_type, or all three. You can add a dashboard filter so that when you view the chart in a specific dashboard, you don’t have to add the filter to the chart itself.
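The benefit of the dimension-based design can be sketched in plain Python. The data points below are hypothetical, and the function sum_by is an illustrative stand-in for the filtering and aggregation that a chart performs. It is not part of any Splunk library.

```python
from collections import defaultdict

# Hypothetical data points that share one metric name and differ only in dimensions.
datapoints = [
    {"metric": "web.http.error.count", "value": 3,
     "dimensions": {"host": "myhost", "service": "checkout", "error_type": "500"}},
    {"metric": "web.http.error.count", "value": 2,
     "dimensions": {"host": "myhost", "service": "checkout", "error_type": "400"}},
    {"metric": "web.http.error.count", "value": 1,
     "dimensions": {"host": "otherhost", "service": "checkout", "error_type": "500"}},
    {"metric": "web.http.error.count", "value": 4,
     "dimensions": {"host": "otherhost", "service": "search", "error_type": "500"}},
]


def sum_by(points, metric, key, **filters):
    """Sum values for one metric, grouped by a dimension key, after optional filters."""
    totals = defaultdict(int)
    for point in points:
        dims = point["dimensions"]
        if point["metric"] != metric:
            continue
        # Keep only the points whose dimensions match every requested filter.
        if any(dims.get(name) != wanted for name, wanted in filters.items()):
            continue
        totals[dims[key]] += point["value"]
    return dict(totals)


errors_by_host = sum_by(datapoints, "web.http.error.count", "host")
errors_by_type = sum_by(datapoints, "web.http.error.count", "error_type")
checkout_by_type = sum_by(datapoints, "web.http.error.count", "error_type", service="checkout")
```

The same metric name answers all three questions. Only the grouping key and the filters change, which is what the long metric name can’t offer without a new wildcard query for each view.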