Monitor hosts

You can monitor host metrics with Splunk Observability Cloud. Before you can start monitoring hosts, you must sign in with your administrator credentials and export metrics from the hosts. The following resources explain how to export metrics from supported host types:

Observability Cloud provides infrastructure monitoring capabilities powered by the Splunk Distribution of OpenTelemetry Collector. If you're also exporting logs from hosts and want to learn how to view logs in Observability Cloud, see Troubleshoot with logs.

You can also export and monitor data related to hosts, as described in the following table.

| Get data in | Monitor | Description |
| --- | --- | --- |
|  |  | Connect to the cloud service provider your hosts run in, if any. |
| Collect application spans and traces | Monitor applications with Splunk APM | Collect metrics and spans from applications running on hosts. |

Monitor hosts from the Infrastructure page

View the health of your entire data center at a glance from the Infrastructure page. This page provides key performance information about each host.

Follow these steps to analyze problem hosts from the Infrastructure page:

  1. Select Navigation menu > Infrastructure and select the Hosts category.

  2. Compare hosts on the following metrics with the Color by drop-down list.

    In the heat map, colors represent the health of instances based on the metric you select. For example, a heat map that shows green and red uses green to denote healthy instances and red to denote unhealthy instances. If your heat map has multiple colors, the lighter gradient represents less activity and the darker gradient represents more activity.

    To apply visually accessible color palettes on custom dashboards and charts and throughout Infrastructure Monitoring, select Account Settings > Color Accessibility.

    | Metric | Description |
    | --- | --- |
    | cpu.utilization | Hosts with CPU utilization under 20% are green. Hosts with CPU utilization over 80% are red. |
    | memory.utilization | Hosts with memory utilization under 20% are green. Hosts with memory utilization over 80% are red. |
    | disk.summary_utilization | Hosts with disk space utilization under 20% are green. Hosts with disk space utilization over 80% are red. |
    | network.total | Relative comparison where hosts with the lowest 20% of network traffic are green and hosts with the highest 20% of network traffic are red. |
    | disk_ops.total | Relative comparison where hosts with the lowest 20% of disk operations are green and hosts with the highest 20% of disk operations are red. |

  3. Group hosts based on metadata about each host with the Group by drop-down list.

    For example, you can see hosts in groups according to the region they are running in, the operating system version, or the environment tag. Use grouping to see correlations between different parts of your infrastructure and their performance.

  4. Find outliers for your metrics with the Find Outliers setting. Specify the Scope and Strategy.

    Set the Scope to analyze outliers from across the entire visible population of instances, or only within groups defined by the dimension you grouped instances by.

    You can select one of two Strategies to find outliers, as described in the following table.

    | Strategy | Description |
    | --- | --- |
    | Deviation from Mean | Hosts whose metric value exceeds the mean by at least three standard deviations appear in red. Use this setting to find the most extreme outliers. |
    | Deviation from Median | Hosts whose metric value exceeds the median by at least three median absolute deviations appear in red. This setting does not weigh extreme outliers as heavily as the standard deviation does. |

  5. Select a specific host you want to investigate further to view all the metadata and key metrics for the host. For every host, Observability Cloud provides a default dashboard.

    Analyze all the available metadata about the cloud service the host is running in, the host itself, and any custom tags associated with the host. The default dashboard provides metric time series (MTS) for the following metrics:

    • CPU utilization

    • Memory utilization

    • Disk space utilization

    • Disk operations

    • Network I/O
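The color thresholds in step 2 and the outlier strategies in step 4 can be illustrated with a short Python sketch. This is not how Observability Cloud implements them. The function names, the "intermediate" label, the use of the population standard deviation, and the reading of Deviation from Median as "value exceeds the median by at least three median absolute deviations" are all assumptions made for illustration.

```python
from statistics import mean, median, pstdev


def color_by_threshold(value_pct, low=20.0, high=80.0):
    """Absolute coloring for a utilization metric given as a percentage."""
    if value_pct < low:
        return "green"
    if value_pct > high:
        return "red"
    # Assumption: the colors between the two thresholds are not specified.
    return "intermediate"


def outliers_by_mean(values, k=3.0):
    """Hosts whose value exceeds the mean by at least k standard deviations."""
    vals = list(values.values())
    mu = mean(vals)
    sigma = pstdev(vals)  # assumption: population standard deviation
    if sigma == 0:
        return set()
    return {host for host, v in values.items() if v - mu >= k * sigma}


def outliers_by_median(values, k=3.0):
    """Hosts whose value exceeds the median by at least k median absolute deviations."""
    vals = list(values.values())
    med = median(vals)
    mad = median([abs(v - med) for v in vals])
    if mad == 0:
        return set()
    return {host for host, v in values.items() if v - med >= k * mad}


# Hypothetical cpu.utilization readings: ten ordinary hosts and one hot host.
sample = {f"host-{i}": 10.0 + i for i in range(10)}
sample["host-hot"] = 95.0

print(color_by_threshold(sample["host-hot"]))
print(outliers_by_mean(sample))
print(outliers_by_median(sample))
```

In this sample both strategies flag only host-hot. The median-based check relies on statistics that a single extreme value barely shifts, whereas the mean and standard deviation are both pulled toward the outlier.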