View organization metrics for Splunk Observability Cloud 🔗
Splunk Observability Cloud provides organization metrics so you can measure your organization’s usage of the platform.
Org metrics include:
Ingest metrics: Measure the data you’re sending to Infrastructure Monitoring, such as the number of data points you’ve sent.
App usage metrics: Measure your use of application features, such as the number of dashboards in your organization.
Integration metrics: Measure your use of cloud services integrated with your organization, such as the number of calls to the AWS CloudWatch API.
Resource metrics: Measure your use of resources that you can specify limits for, such as the number of custom metric time series (MTS) you’ve created.
You’re not charged for these metrics, and they don’t count against any system limits.
Access organization metrics 🔗
If you’re an admin, you can view some of these metrics in built-in charts on the Organization Overview page. Any user can view these metrics in custom charts.
To access the Organization Overview page, follow these steps:
Log in to Observability Cloud.
On the left nav, select Settings, then select Organization Overview.
Select the tab for the metrics you want to view:
Engagement: Metrics about your users and the charts, detectors, dashboards, dashboard groups, and teams they’ve created.
APM entitlements: Metrics that track usage against your APM entitlements, useful for APM troubleshooting.
APM throttling: These charts highlight metrics that track throttling and limiting in your organization.
IMM entitlements: Metrics that track usage against your Infrastructure Monitoring entitlements, useful for IMM troubleshooting.
IMM system limits: These charts identify metrics that track usage of system limits in your organization.
IMM throttling: These charts highlight metrics that track throttling and limiting in your organization.
Cloud integrations: These charts highlight metrics that track errors and throttling, which might limit collection of telemetry from cloud provider APIs.
Interpret and work with org metrics 🔗
This section provides tips that can help you interpret and work with usage metrics.
Data limiting, data throttling, and data filtering 🔗
Besides being limited or throttled, data can also be filtered out, and you can track filtered data with certain org metric values:
Data can be automatically filtered out by certain components, such as the SignalFx exporter.
Invalid data is also filtered after it reaches the platform. For example, data points without a metric name or value are invalid and are dropped, as are spans without a trace ID or span ID.
num metric values 🔗
Some metrics report both a gross value and a num value. Compare the gross and num values of a metric to verify whether the system has limited or filtered out data.
The gross metric reports the total number of data points the system receives before any throttling or filtering takes place.
The num metric reports the total number of data points that remain after the system completes any throttling or filtering.
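For example, you can chart a gross metric and its num counterpart together to see how much data is being limited or filtered. The following SignalFlow sketch uses placeholder metric names; check the Metric Finder for the exact gross and num pair available in your organization.

    # Sketch: compare a gross metric with its num counterpart to estimate dropped data.
    # The metric names below are placeholders; substitute the pair you find in the Metric Finder.
    gross = data('sf.org.grossDatapointsReceived').sum()
    kept = data('sf.org.numDatapointsReceived').sum()
    gross.publish(label='received before throttling and filtering')
    kept.publish(label='remaining after throttling and filtering')
    (gross - kept).publish(label='dropped')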
Metrics that track system limits 🔗
These metrics track limits that Infrastructure Monitoring enforces for your organization. If you exceed these limits, your data might be dropped:
sf.org.limit.activeTimeSeries (gauge): Maximum number of active MTS, within a moving window of the past 25 hours, that your organization can have. If you exceed this limit, Infrastructure Monitoring stops accepting data points for new MTS, but continues to accept data points for existing MTS. To monitor your usage against the limit, track your number of active MTS.
sf.org.limit.containers (gauge): Maximum number of containers that can send data to your organization. This limit is higher than your contractual limit to allow for burst and overage usage. If you exceed this limit, Infrastructure Monitoring drops data points from new containers but keeps accepting data points for existing containers. To monitor your usage against the limit, use the metric sf.org.numResourcesMonitored, filtered to containers.
sf.org.limit.computationsPerMinute (gauge): Maximum number of SignalFlow computations per minute.
sf.org.limit.customMetricMaxLimit (gauge): Maximum number of active custom MTS, within a moving window of the previous 60 minutes, that can send data to your organization. If you exceed this limit, Infrastructure Monitoring drops data points for the custom MTS that exceeded the limit, but continues to accept data points for custom MTS that already existed. To see the custom metrics you’ve defined, use the Metric Finder.
To learn more about custom MTS, see About custom, bundled, and high-resolution metrics.
sf.org.limit.customMetricTimeSeries (gauge): Maximum number of active custom MTS.
sf.org.limit.detector (gauge): Maximum number of detectors that you can use for your organization. After you reach this limit, you can’t create new detectors. To monitor your usage against the limit, track the number of detectors you’ve created.
sf.org.limit.eventsPerMinute (gauge): Maximum number of incoming events per minute.
sf.org.limit.hosts (gauge): Maximum number of hosts that can send data to your organization. The limit is higher than your contractual limit to allow for burst and overage usage. If you exceed this limit, Infrastructure Monitoring drops data points from new hosts but keeps accepting data points for existing hosts. To monitor your usage against the limit, use the metric sf.org.numResourcesMonitored, filtered to hosts.
sf.org.limit.metricTimeSeriesCreatedPerMinute (gauge): Maximum rate at which you can create new MTS in your organization, measured in MTS per minute. If you exceed this rate, Infrastructure Monitoring stops accepting data points for new MTS, but continues to accept data points for existing MTS. To monitor the number of MTS you’ve created overall, use the metric sf.org.numMetricTimeSeriesCreated.
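To watch how close you are to one of these limits, you can chart the limit metric together with the matching usage metric. The following SignalFlow sketch compares MTS creation against the per-minute creation limit; sf.org.numMetricTimeSeriesCreated is assumed as the usage metric here, so confirm the exact names in the Metric Finder.

    # Sketch: chart MTS creation against the per-minute creation limit.
    # The usage metric name is an assumption; verify it in the Metric Finder.
    limit = data('sf.org.limit.metricTimeSeriesCreatedPerMinute')
    created = data('sf.org.numMetricTimeSeriesCreated', rollup='sum').sum()
    limit.publish(label='MTS creation limit per minute')
    created.publish(label='MTS created')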
Metrics that track data throttling 🔗
As explained in the previous section, certain system limits act as a “ceiling”, or a maximum number of elements allowed in Observability Cloud. But the platform also limits ingestion pace. If you exceed your rate limits, Observability Cloud might throttle, or slow down, the data you send in.
While org metrics whose names contain limit show that you’ve hit an amount limit, metrics whose names contain throttled (for example, sf.org.numThrottledMetricTimeSeriesCreateCalls) show that you’ve hit a rate limit, meaning you can’t send in more data points per minute.
To learn more, see Per product system limits.
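If you want to know as soon as throttling starts, you can build a chart or detector on one of these throttling metrics. The following SignalFlow sketch alerts when any MTS-creation calls are throttled; the zero threshold is only an illustrative assumption, so tune it to your tolerance.

    # Sketch: alert when MTS creation calls are being throttled.
    # The zero threshold is an assumption; raise it if occasional throttling is acceptable.
    throttled = data('sf.org.numThrottledMetricTimeSeriesCreateCalls', rollup='sum').sum()
    detect(when(throttled > 0)).publish('MTS creation calls are being throttled')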
Metrics for values by token 🔗
In some cases, Infrastructure Monitoring has two similar metrics:
One metric, such as sf.org.numAddDatapointCalls, represents the total across your entire organization.
A similar metric, such as sf.org.numAddDatapointCallsByToken, represents the total for each unique access token you use.
The sum of all the by-token values for a measurement might be less than the value of the organization-wide metric. For example, the sum of all sf.org.numAddDatapointCallsByToken values might be less than the value of sf.org.numAddDatapointCalls. The sums differ because Infrastructure Monitoring doesn’t use a token to retrieve data from cloud services you’ve integrated. Infrastructure Monitoring counts the data point calls for the integrated services, but it doesn’t have a way to count the calls for any specific token.
This difference in values applies to AWS CloudWatch, GCP StackDriver, and AppDynamics.
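To see which access tokens account for the most traffic, you can group the by-token metric by its token dimension. In the following SignalFlow sketch, the dimension name tokenId is an assumption; check the dimensions on the MTS in the Metric Finder before relying on it.

    # Sketch: break down data point calls by access token.
    # The dimension name 'tokenId' is an assumption; confirm it in the Metric Finder.
    data('sf.org.numAddDatapointCallsByToken', rollup='sum').sum(by=['tokenId']).publish(label='calls per token')
    # Organization-wide total for comparison; the difference comes from cloud integrations.
    data('sf.org.numAddDatapointCalls', rollup='sum').sum().publish(label='calls, all sources')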
Metrics with values for each metric type 🔗
Some metrics have a value for each metric type (counter, cumulative counter, or gauge), so you have three MTS per metric. Each MTS has a category with a value of COUNTER, CUMULATIVE_COUNTER, or GAUGE. Because you can have multiple MTS for these metrics, use the sum() SignalFlow function to see the total value.
For example, you might receive three MTS for
sf.org.numMetricTimeSeriesCreated, one for the number of MTS that
are counters, another for the number of MTS that are cumulative
counters, and a third for the number of MTS that are gauges.
Also, you can filter by a single value of
category, such as
GAUGE, to see only the metrics of that type.
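As a concrete illustration, the following SignalFlow sketch totals sf.org.numMetricTimeSeriesCreated across the three metric types and also isolates the gauge MTS.

    # Total across counter, cumulative counter, and gauge MTS.
    data('sf.org.numMetricTimeSeriesCreated').sum().publish(label='all MTS created')
    # Only the MTS whose category is GAUGE.
    data('sf.org.numMetricTimeSeriesCreated', filter=filter('category', 'GAUGE')).sum().publish(label='gauge MTS created')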
A metric that counts stopped detectors 🔗
sf.org.numDetectorsAborted monitors the number of
detectors that Infrastructure Monitoring stopped because the detector
reached a resource limit. In most cases, the detector exceeds the limit
of 250K MTS. This condition also generates the event
sf.org.abortedDetectors, which records details including the
detector ID, the reason it stopped, and the value or limit of MTS or
data points, whichever caused the detector to stop.
To learn more, see Add context to metrics using events.
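If you want to be notified when Observability Cloud stops one of your detectors, you can alert on this metric. The following SignalFlow sketch is a minimal example; the zero threshold is an illustrative assumption.

    # Sketch: alert whenever a detector is stopped for exceeding a resource limit.
    aborted = data('sf.org.numDetectorsAborted', rollup='sum').sum()
    detect(when(aborted > 0)).publish('A detector was stopped after reaching a resource limit')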
Cloud authentication error metrics 🔗
Editing a role or removing a user’s permissions for cloud services might generate authentication errors from your cloud service provider. When this happens, Observability Cloud integrations don’t work properly and can’t collect data and metadata from your services.
Observability Cloud provides metrics to track these authentication errors.
If you’re getting any of these errors, fix your roles or tokens so that Observability Cloud can retrieve your data.
You can also use these error metrics in dashboards to detect whether you’re experiencing this issue.
Child org metrics 🔗
If a parent org has associated child organizations, child org metrics are also added to Observability Cloud. They represent the same values as the equivalent parent org metrics, and you can identify them by the child element in the metric name. For example, sf.org.child.numCustomMetrics represents the number of custom metrics Observability Cloud monitors for a child org, just as sf.org.numCustomMetrics is the number of custom metrics monitored for the parent org.
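For example, you can chart the parent and child counts side by side. The following SignalFlow sketch sums custom metric counts across all child orgs; if you need a per-child breakdown, group by whatever dimension identifies the child org on these MTS (that dimension isn’t named in this topic, so check the Metric Finder).

    # Custom metrics monitored for the parent org.
    data('sf.org.numCustomMetrics').sum().publish(label='parent org custom metrics')
    # Custom metrics monitored across all child orgs.
    data('sf.org.child.numCustomMetrics').sum().publish(label='child org custom metrics')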
List of organization metrics 🔗
Use the Metric Finder to find your org metrics and browse the full set of organization metrics that Observability Cloud provides.