Guidance for metric and dimension names 🔗
Splunk Observability Cloud has two main categories of data names: existing names and custom names.
Existing names 🔗
When you use an existing data collection integration such as the collectd agent or the AWS CloudWatch integration, the integration defines metric, dimension, and event names for you. To learn more, see Metric name standards.
Custom names 🔗
When you send custom metrics, dimensions, or events to Splunk Infrastructure Monitoring, you choose your own names. To learn more about custom event names, see Guidance for custom event names.
Modify naming schemes you sent to other metrics systems 🔗
Splunk Infrastructure Monitoring lets you associate arbitrary key-value pairs called dimensions with metrics. Dimensions let you represent multi-dimensional data without overloading your metric name with metadata.
If you send metrics that you previously sent to other metrics systems such as Graphite or New Relic, then modify the naming scheme to leverage the full feature set of Splunk Observability Cloud.
If you already have metrics with period-separated names, use Splunk OTel to parse dimensions from metric names. To learn more, see Example: Custom metric name and dimensions.
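The idea of pulling metadata out of a period-separated name can be sketched in Python. This is only an illustration of the concept, not how Splunk OTel implements it. The function name `parse_legacy_name`, the legacy name layout, and the position-to-key mapping are all hypothetical.

```python
def parse_legacy_name(legacy_name, dimension_positions):
    """Split a period-separated name into a metric name and dimensions.

    dimension_positions maps a segment index to the dimension key that
    the segment should become. All other segments stay in the metric name.
    """
    segments = legacy_name.split(".")
    dimensions = {}
    name_parts = []
    for index, segment in enumerate(segments):
        key = dimension_positions.get(index)
        if key is None:
            name_parts.append(segment)
        else:
            dimensions[key] = segment
    return ".".join(name_parts), dimensions


# Hypothetical legacy name with the host and service embedded in it.
metric, dims = parse_legacy_name(
    "web.myhost.checkout.http.error.count",
    {1: "host", 2: "service"},
)
```

Under these assumptions, the host and service segments move into dimensions and the remaining segments form a shorter, reusable metric name.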
Metric name standards 🔗
Metrics are distinct numeric measurements that change over time. Metrics are generated by system infrastructure, application instrumentation, or other hardware or software. The following are some examples of metrics:
Count of GET requests received
Percent of total memory in use
Network response time in milliseconds
If you apply a calculation to the metric before you send it, you can also use the calculation as part of the description. For example, if you calculate the ninety-fifth percentile of measurements and send the result in a metric, use p95 as part of the metric name. This table lists the type of information you can apply to a metric name:
On the other hand, some information is better to include in a dimension instead of a metric name, such as a description of the hardware or software being measured. For example, don’t use production1 to indicate that the measurement is for a particular host. To learn more, see Dimension name and value standards.
Create metric names using a hierarchical left to right structure 🔗
Start at the highest level, then add more specific values as you proceed. In the example that follows, all of the metrics have a dimension key called hostname with values such as analytics-1, analytics-2, and so forth. These metrics also have a customer dimension key with values org-x, org-y, and so on. The dimensions provide an infrastructure-focused or a customer-focused view of the analytics service usage. For more information on gauge metrics, see Identify metric types.
Start with a domain or namespace that the metric belongs to, such as analytics or web.
Next, add the entity that the metric measures, such as jobs or http.
At your discretion, add intermediate names, such as errors.
Finish with a unit of measurement. For example, the SignalFlow analytics service reports the following metrics:
analytics.jobs.total: Gauge metric that periodically measures the current number of executing jobs
analytics.thrift.execute.count: Counter metric that’s incremented each time a new job starts
analytics.thrift.execute.time: Gauge metric that measures the time needed to process a job execution request
analytics.jobs_by_state: Counter metric with a dimension key called state, incremented each time a job reaches a particular state.
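The left-to-right structure described above can be sketched as a small helper. The function `build_metric_name` and its parameters are hypothetical and exist only to illustrate the ordering of domain, entity, intermediate names, and unit.

```python
def build_metric_name(domain, entity, unit, intermediate=()):
    """Join name parts from most general to most specific."""
    parts = [domain, entity, *intermediate, unit]
    for part in parts:
        # Each part is a single level, so it must be non-empty and
        # must not contain the delimiter itself.
        if not part or "." in part:
            raise ValueError(f"invalid name part: {part!r}")
    return ".".join(parts)


jobs_total = build_metric_name("analytics", "jobs", "total")
execute_count = build_metric_name(
    "analytics", "thrift", "count", intermediate=["execute"]
)
```

With these inputs, the helper reproduces two of the names from the list above: analytics.jobs.total and analytics.thrift.execute.count.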
Use different metric names to indicate metric types 🔗
Use different metric names to indicate different metric types. When you create metric names, follow these best practices:
Give each metric its own name
When you define your own metric, give each metric a name that includes a reference to the metric type.
Avoid assigning custom metric names that include dimensions. For example, if you have 100 server instances and you want to create a custom metric that tracks the number of disk writes for each one, differentiate between the instances with a dimension.
Metric types and rollups 🔗
In Infrastructure Monitoring, all metrics have a single metric type, with a specific default rollup. A rollup is a statistical function that takes all the data points for an MTS over a time period and outputs a single data point. Observability Cloud applies rollups after it retrieves the data points from storage but before it applies analytics functions. For more information on rollups, see Rollups in Data resolution and rollups in charts.
The following list shows the types and their default rollups:
Gauge metric: Average (mean() SignalFlow function)
Counter metric: Sum (sum() SignalFlow function)
Cumulative counter: Delta (delta() SignalFlow function). This measures the change in the value of the metric from the previous data point.
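To make the three default rollups concrete, the following Python sketch applies each one to a small window of data points. It is an approximation for illustration, not the SignalFlow implementation. In particular, modeling the delta rollup as the last value in the window minus the last value before the window is an assumption, and all function names are hypothetical.

```python
def average_rollup(window):
    """Gauge default: mean of the data points in the window."""
    return sum(window) / len(window)


def sum_rollup(window):
    """Counter default: total of the data points in the window."""
    return sum(window)


def delta_rollup(previous_value, window):
    """Cumulative counter default, modeled here as the change since
    the last data point before the window (an assumption)."""
    return window[-1] - previous_value


gauge_result = average_rollup([40, 50, 60])
counter_result = sum_rollup([3, 0, 7])
cumulative_result = delta_rollup(1000, [1004, 1009, 1015])
```

In each case, several raw data points collapse into a single data point for the time period.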
To track a measurable value using two different metric types, use two metrics instead of one metric with two dimensions. For example, suppose you have a network_latency measurement that you want to send as two different types:
Gauge metric: Average network latency in milliseconds
Counter metric: Total number of latency values sent in an interval
Send the measurement using two different metric names, such as network_latency.average and network_latency.count, instead of one metric name with two dimensions type:average and type:count.
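A minimal sketch of sending one measurement as two metrics follows. The payload groups data points by metric type; treat its exact field names as assumptions rather than a confirmed API shape. The name network_latency.average is assumed here as the gauge counterpart to network_latency.count, and the sample values and host are hypothetical.

```python
# Hypothetical latency samples collected during one reporting interval.
latencies_ms = [12.0, 18.0, 30.0]
dimensions = {"host": "analytics-1"}

payload = {
    # Gauge: the average latency over the interval.
    "gauge": [
        {
            "metric": "network_latency.average",
            "value": sum(latencies_ms) / len(latencies_ms),
            "dimensions": dimensions,
        }
    ],
    # Counter: how many latency values were observed in the interval.
    "counter": [
        {
            "metric": "network_latency.count",
            "value": len(latencies_ms),
            "dimensions": dimensions,
        }
    ],
}
```

Because each metric name has a single type, each one gets the default rollup that suits it.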
Dimension name and value standards 🔗
Dimensions are arbitrary key-value pairs you associate with metrics. They can be numeric or non-numeric. Some dimensions, such as host name and value, come from a system you’re monitoring. You can also create your own dimensions. Metrics identify a measurement, whereas dimensions identify a specific aspect of the system that’s generating the measurement or characterize the measurement.
Dimension names have the following requirements:
UTF-8 string, maximum length of 128 characters (512 bytes).
Must start with an uppercase or lowercase letter. The rest of the name can contain letters, numbers, underscores (_) and hyphens (-).
Must not start with the underscore character (_).
Must not start with the prefix sf_, except for dimensions defined by Observability Cloud.
Must not start with the prefix
Dimension values are UTF-8 strings with a maximum length of 256 UTF-8 characters (1024 bytes). Numbers are represented as numeric strings.
You can have up to 36 dimensions per MTS.
To ensure readability, keep names and values to 40 characters or less.
Length limits for metric name, dimension name, and dimension value 🔗
Metric and dimension length specifications:
Metric names up to 256 characters
Dimension names up to 128 characters
Dimension values up to 256 characters
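The naming requirements and length limits above can be checked before sending data. The following is a sketch under stated assumptions, not an official validator: the function `validate_datapoint` is hypothetical, the pattern accepts ASCII letters only (the source says only "letters"), and lengths are measured in characters.

```python
import re

# ASCII-only letters are an assumption made for this sketch.
DIMENSION_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def validate_datapoint(metric, dimensions):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    if len(metric) > 256:
        problems.append("metric name longer than 256 characters")
    if len(dimensions) > 36:
        problems.append("more than 36 dimensions")
    for name, value in dimensions.items():
        if len(name) > 128:
            problems.append(f"dimension name too long: {name[:20]}...")
        if not DIMENSION_NAME_PATTERN.fullmatch(name):
            problems.append(f"invalid dimension name: {name!r}")
        if name.startswith("sf_"):
            problems.append(f"reserved sf_ prefix: {name!r}")
        # Numbers are represented as numeric strings.
        if len(str(value)) > 256:
            problems.append(f"dimension value too long for {name!r}")
    return problems


ok = validate_datapoint(
    "web.http.error.count", {"host": "myhost", "error_type": "500"}
)
bad = validate_datapoint("web.http.error.count", {"_private": "x"})
```

Here `ok` is an empty list, while `bad` reports the name that starts with an underscore.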
Example: dimensions 🔗
The following are some examples of dimensions:
Benefits of dimensions 🔗
The following are some examples of benefits of dimensions:
Classification of different streams of data points for a metric.
Simplified filtering and aggregation. For example, SignalFlow lets you filter and aggregate data streams by one or more dimensions.
Types of information that are suitable for dimension values 🔗
The following are some examples of types of information that you can add to dimensions:
Categories rather than measurements: If doing an arithmetic operation on dimension values results in something meaningful, you don’t have a dimension.
Metadata for filtering, grouping, or aggregating
Name of entity being measured: For example
Metadata with large number of possible values: Use one dimension key for many different dimension values.
Non-numeric values: Numeric dimension values are usually labels rather than measurements.
Example: Custom metric name and dimensions 🔗
For example, consider the measurement of HTTP errors.
You want to track the following data:
Number of errors
HTTP response code for each error
Host that reported the error
Service (app) that returned the error
Suppose you identify your data with a long metric name instead of a metric name and a dimension. A metric name that represented the number of HTTP response code 500 errors reported by the host named myhost for the service checkout would have to be the following:
As a result of using this metric name, you’d experience the following:
To visualize this data in a Splunk Infrastructure Monitoring chart, you would have to perform a wildcard query with the syntax
To sum up the errors by host, service, or error type, you would have to change the query.
You couldn’t use filters or dashboard variables in your chart.
You would have to define a separate metric name to track HTTP 400 errors, or errors reported by other hosts, or errors reported by other services.
To leverage dimensions to track the same data, do the following:
Define a metric name that describes the measurement you want, which is the number of HTTP errors: web.http.error.count. The metric name includes the following:
web: Your name for a family of metrics for web measurements
http.error: Your name for the protocol you’re measuring (http) and an aspect of the protocol (error)
count: The unit of measure
Define dimensions that categorize the errors. The dimensions include the following:
host: The host that reported the error
service: The service that returned the error
error_type: The HTTP response code for the error
When you want to visualize the error data using a chart, you can search for “error count” to locate the metric by name. When you create the chart, you can filter and aggregate incoming metric time series by host, service, error_type, or all three. You can add a dashboard filter so that when you view the chart in a specific dashboard, you don’t have to change the chart itself.
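To see why dimensions simplify filtering and aggregation, consider this Python sketch that sums web.http.error.count by any dimension. It imitates the idea only and is not SignalFlow. The helper `sum_by`, the sample values, and the otherhost and search names are hypothetical.

```python
from collections import defaultdict

datapoints = [
    {"metric": "web.http.error.count", "value": 3,
     "dimensions": {"host": "myhost", "service": "checkout", "error_type": "500"}},
    {"metric": "web.http.error.count", "value": 2,
     "dimensions": {"host": "myhost", "service": "checkout", "error_type": "400"}},
    {"metric": "web.http.error.count", "value": 5,
     "dimensions": {"host": "otherhost", "service": "search", "error_type": "500"}},
]


def sum_by(points, metric, dimension_key, filters=None):
    """Sum values for one metric, grouped by a dimension key.

    filters is an optional dict of dimension key to required value.
    """
    totals = defaultdict(int)
    for point in points:
        if point["metric"] != metric:
            continue
        dims = point["dimensions"]
        if filters and any(dims.get(k) != v for k, v in filters.items()):
            continue
        totals[dims.get(dimension_key)] += point["value"]
    return dict(totals)


errors_by_host = sum_by(datapoints, "web.http.error.count", "host")
checkout_by_type = sum_by(
    datapoints, "web.http.error.count", "error_type",
    filters={"service": "checkout"},
)
```

The same metric name serves every grouping. Only the dimension key or the filter changes, which is what a long metric name with embedded metadata can't offer.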
Considerations for metrics and dimensions names for your organization 🔗
Keep this guidance in mind so that you can create a consistent naming process in your organization.
Use a single consistent delimiter in metric names. Using a single consistent delimiter in metric names helps you search with wildcards. Use periods or underscores as delimiters. Don’t use colons or slashes.
Avoid changing metric and dimension names. If you change a name, you have to update the charts and detectors that use the old name. Infrastructure Monitoring doesn’t do this automatically.
Remember that you’re not the only person using the metric or dimension. Use names that others in your organization can identify and understand. Follow established conventions. To find out the conventions in your organization, browse your metrics using the Metric Finder.
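The benefit of a single delimiter for wildcard searches can be illustrated with Python's shell-style matching. This is an analogy only; wildcard behavior in the product may differ. All names below except web.http.error.count and analytics.jobs.total are hypothetical.

```python
from fnmatch import fnmatchcase

metric_names = [
    "web.http.error.count",
    "web.http.request.count",
    "web_http_latency_ms",   # mixed delimiter, missed by the pattern below
    "analytics.jobs.total",
]

# One pattern finds every period-delimited metric under web.http.
matches = [name for name in metric_names if fnmatchcase(name, "web.http.*")]
```

The underscore-delimited name belongs to the same family but doesn't match, so a search with one pattern silently misses it.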
Guidelines for working with low and high cardinality data 🔗
Send low-cardinality data only in metric names or dimension key names. Low-cardinality data has a small number of distinct values. For example, the metric name web.http.error.count for a gauge metric that reports the number of HTTP request errors has a single value. This name is also readable and self-explanatory. For more information on gauge metrics, see Identify metric types.
High-cardinality data has a large number of distinct values. For example, timestamps are high-cardinality data. Only send this kind of high-cardinality data in dimension values. If you send high-cardinality data in metric names, Infrastructure Monitoring might not ingest the data. Infrastructure Monitoring specifically rejects metrics with names that contain timestamps. High-cardinality data does have legitimate uses. For example, in containerized environments, container_id is usually a high-cardinality field. If you include container_id in a metric name such as system.cpu.utilization.<container_id>, instead of having one MTS, you would have as many MTS as you have containers.
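As a rough illustration, the following Python sketch counts the distinct metric names produced by each naming scheme. The container IDs and their count are hypothetical.

```python
# Hypothetical set of 500 container IDs.
container_ids = [f"c{i:03d}" for i in range(500)]

# Scheme 1: container_id embedded in the metric name.
names_with_id = {f"system.cpu.utilization.{cid}" for cid in container_ids}

# Scheme 2: one metric name, container_id carried as a dimension value.
datapoints = [
    {"metric": "system.cpu.utilization", "dimensions": {"container_id": cid}}
    for cid in container_ids
]
names_with_dimension = {point["metric"] for point in datapoints}

name_counts = (len(names_with_id), len(names_with_dimension))
```

Embedding the ID creates one metric name per container, while the dimension-based scheme keeps a single name that you can filter or group by container_id.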
Guidance for custom event names 🔗
Custom events are collections of key-value pairs you can send to Infrastructure Monitoring to display in charts and view in event feeds. For example, you can send “release” events whenever you release new code and then correlate metric changes with releases by overlaying the release events on charts. The Metric and dimension key naming standards also apply to custom event naming.