Configure thresholds
All metrics listed on this page can be found in VMware's PerformanceManager or VMware VirtualMachineQuickStats, unless the metric is defined by Splunk.
cpu
PerfType | Splunk metric name (vCenter API counter name) | Entity | Threshold Value | Description |
---|---|---|---|---|
cpu | AvgUsg_Pct (cpu.usage.average) | vm | Critical > 75%, Warning > 50% | Average CPU usage of the virtual machine, in percent. |
cpu | AvgUsg_Pct (cpu.usage.average) | host | Critical > 90%, Warning > 75% | Average CPU usage of the host, in percent. |
cpu | SumRdy_ms (cpu.ready.summation) | vm | Critical > 2000, Warning > 1000 | Time, in milliseconds, that the virtual machine spent waiting for CPU time. |
cpu | SumRdy_ms (cpu.ready.summation) | host | Critical > 2000, Warning > 1000 | Time, in milliseconds, that the host waited for CPU cycles. |
cpu | AvgDemand_MHz (cpu.demand.average) | vm | Critical < 0, Warning < 0 | Amount of CPU resources that the virtual machine would use if there were no CPU limit and no contention for CPU. |
cpu | AvgDemand_MHz (cpu.demand.average) | host | Critical < 0, Warning < 0 | Aggregate amount of CPU resources that all virtual machines would use if there were no CPU limit and no contention for CPU. |
cpu | AvgUsg_MHz (cpu.usagemhz.average) | vm | Critical < 0, Warning < 0 | CPU usage, measured in megahertz. This is the amount of actively used vCPU as seen by the hypervisor, not the guest OS's version of the same metric. Less than 0 indicates that the VM is not using any CPU. |
cpu | AvgUsg_MHz (cpu.usagemhz.average) | host | Critical < 0, Warning < 0 | CPU usage, measured in megahertz, aggregated across all VMs on the host. Less than 0 indicates that none of the VMs on the host require CPU. |
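To make the Threshold Value column concrete, the following is a minimal sketch of how a sample could be compared against a Warning and a Critical bound. It is an illustration only, not the app's implementation: the `classify` function and the `OPS` table are hypothetical names, and checking the Critical bound before the Warning bound is an assumption.

```python
import operator

# Comparison operators as they appear in the Threshold Value column.
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}

def classify(value, comparator, warning, critical):
    """Return 'critical', 'warning', or 'normal' for one metric sample.

    Hypothetical helper: checks the critical bound first so that it wins
    when a sample satisfies both bounds.
    """
    compare = OPS[comparator]
    if compare(value, critical):
        return "critical"
    if compare(value, warning):
        return "warning"
    return "normal"

# AvgUsg_Pct for a vm: Critical > 75%, Warning > 50%
print(classify(80.0, ">", warning=50, critical=75))  # critical
print(classify(60.0, ">", warning=50, critical=75))  # warning
print(classify(50.0, ">", warning=50, critical=75))  # normal

# AvgDemand_MHz: Critical < 0, Warning < 0, so a non-negative sample never fires
print(classify(1200, "<", warning=0, critical=0))    # normal
```

Under this reading, the `< 0` thresholds on AvgDemand_MHz and AvgUsg_MHz do not fire for any non-negative sample.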
mem
PerfType | Splunk metric name (vCenter API counter name) | Entity | Threshold Value | Description |
---|---|---|---|---|
mem | AvgUsg_Pct (mem.usage.average) | vm | Critical >= 75%, Warning >= 90% | Average memory usage of the virtual machine, in percent. |
mem | AvgUsg_Pct (mem.usage.average) | host | Critical >= 75% Warning | Average memory usage of the host, in percent. |
mem | AvgAct_KB (mem.active.average) | vm | Critical > 95, Warning > 75 | Virtual machine memory that is actively in use. |
mem | AvgAct_KB (mem.active.average) | host | Critical > 95, Warning > 75 | Average amount of memory in the active state across all virtual machines and the vpxd services. |
mem | AvgConsum_KB (mem.consumed.average) | vm | Critical > 95, Warning > 75 | Memory consumed by the virtual machine, minus the memory saved by memory sharing. |
mem | AvgConsum_KB (mem.consumed.average) | host | Critical > 95, Warning > 75 | Average amount of memory consumed by the host. This includes all virtual machines and the overhead of the VMkernel. |
mem | AvgOvrhd_KB (mem.overhead.average) | vm | Critical > 95, Warning > 75 | Memory used by VMware to power the virtual machine. |
mem | AvgOvrhd_KB (mem.overhead.average) | host | Critical > 95, Warning > 75 | Average overhead of all virtual machines plus the overhead of vSphere. |
mem | AvgGrtd_KB (mem.granted.average) | vm | Critical > 95, Warning > 75 | Physical memory that is mapped to the virtual machine. Does not include overhead memory. |
mem | AvgGrtd_KB (mem.granted.average) | host | Critical > 95, Warning > 75 | Average memory granted to all virtual machines and vSphere. |
mem | AvgVmctl_KB (mem.vmmemctl.average) | vm | Critical > 10, Warning > 2 | Amount of physical memory that the host is reclaiming through VMware's ballooning driver. Frequent ballooning is a sign of a host under stress. |
mem | AvgVmctl_KB (mem.vmmemctl.average) | host | Critical > 10, Warning > 2 | Sum of all vmmemctl values for all powered-on virtual machines. This value may be greater than the balloon value of the host, which is a sign that the kernel is trying to get more virtual machines to release memory. |
mem | AvgSwpIn_KB (mem.swapin.average) | vm | Critical > 10, Warning > 0 | Memory that the virtual machine reads from the host's swap file. Any amount of swapping is a sign of a host under stress. |
mem | AvgSwpIn_KB (mem.swapin.average) | host | Critical > 10, Warning > 0 | Combined sum of the swap-in values for all powered-on virtual machines. |
mem | AvgSwpOut_KB (mem.swapout.average) | vm | Critical > 10, Warning > 0 | Amount of memory that the virtual machine has had to write to a swap file. |
mem | AvgSwpOut_KB (mem.swapout.average) | host | Critical > 10, Warning > 0 | Combined sum of the swap-out values for all powered-on virtual machines. |
mem | AvgSwpd_KB (mem.swapped.average) | vm | Critical > 5000, Warning > 0 | Amount of memory from a virtual machine that has been swapped by the host. This is host swapping and is always a sign that the host is under stress. Any time this threshold is triggered, the host has no available memory and cannot reclaim it through the ballooning driver. |
mem | AvgSwpUsd_KB (mem.swapused.average) | host | Critical >= 5000, Warning >= 0 | Amount of memory from all virtual machines that has been swapped by the host. This is host swapping and is always a sign that the host is under stress. Any time this threshold is triggered, the host has no available memory and cannot reclaim it through the ballooning driver. |
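The Threshold Value cells follow a loose "Critical <operator> <number> Warning <operator> <number>" notation, with optional percent signs, commas, and spacing. As a hedged illustration of that notation only (the `parse_threshold` helper and `PATTERN` are hypothetical and not part of the app), a cell could be read into a small structure like this:

```python
import re

# Matches one "Critical > 10" or "Warning >= 2%" fragment of a table cell.
# The two-character operators are listed first so ">=" is not read as ">".
PATTERN = re.compile(r"(Critical|Warning)\s*(>=|<=|>|<)\s*(-?\d+(?:\.\d+)?)\s*(%?)")

def parse_threshold(cell):
    """Parse a Threshold Value cell into {level: (operator, number, is_percent)}.

    Levels that have no operator and number in the cell are left out
    rather than guessed.
    """
    parsed = {}
    for level, op, number, pct in PATTERN.findall(cell):
        parsed[level.lower()] = (op, float(number), pct == "%")
    return parsed

print(parse_threshold("Critical > 10 Warning > 2"))
print(parse_threshold("Critical >95 Warning >75"))
print(parse_threshold("Critical >= 5000, Warning >= 0"))
```

A cell that lists a level without a bound, such as the host row for AvgUsg_Pct, yields only the levels that are fully specified.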
inv
PerfType | Splunk metric name (vCenter API counter name) | Entity | Threshold Value | Description |
---|---|---|---|---|
inv | PercentHighCPUVm | vm | Critical > 75, Warning > 50 | This is a Splunk metric. The threshold is implemented on top of VMInvCpuMaxUsg. Used on the home_proactive_monitoring dashboard to set a warning or critical level based on the percentage of VMs in a "critical" state out of the total number of VMs. This lets you color the gauges accordingly. |
inv | PercentHighMemVm | vm | Critical > 75, Warning > 50 | This is a Splunk metric. The threshold is implemented on top of VMInvMemMaxUsg. Used on the home_proactive_monitoring dashboard to set a warning or critical level based on the percentage of VMs in a "critical" state out of the total number of VMs. This lets you color the gauges accordingly. |
inv | PercentHighSumRdyVm | vm | Critical > 75, Warning > 50 | This is a Splunk metric. The threshold is implemented on top of SumRdy_ms. Used on the home_proactive_monitoring dashboard to set a warning or critical level based on the percentage of VMs in a "critical" state out of the total number of VMs. This lets you color the gauges accordingly. |
inv | VMinvCpuMaxUsg | vm | Critical > 90, Warning > 75 | This is a Splunk metric. The threshold is based on the maximum CPU that the host can give a VM, not the maximum of the reservations. If the VM is >= 100%, it is requesting more CPU than the host can allocate. |
inv | VMinvMemMaxUsg | vm | Critical > 90, Warning > 75 | This is a Splunk metric. The threshold is based on the maximum memory that the host can give a VM, not the maximum of the reservations. If the VM is >= 100%, it is requesting more memory than the host can allocate. |
inv | PercentHighBalloonHosts | Host | Critical > 75, Warning > 50 | This is a Splunk metric. The threshold is implemented on top of BalloonedMemory_MB. Used on the home_proactive_monitoring dashboard to set a warning or critical level based on the percentage of hosts in a "critical" state out of the total number of hosts. This lets you color the gauges accordingly. |
inv | PercentHighSwapHosts | Host | Critical > 75, Warning > 50 | This is a Splunk metric. The threshold is implemented on top of SwappedMemory_MB. Used on the home_proactive_monitoring dashboard to set a warning or critical level based on the percentage of hosts in a "critical" state out of the total number of hosts. This lets you color the gauges accordingly. |
inv | PercentHighCPUHosts | Host | Critical > 75, Warning > 50 | This is a Splunk metric. The threshold is implemented on top of AvgUsg_Pct. Used on the home_proactive_monitoring dashboard to set a warning or critical level based on the percentage of hosts in a "critical" state out of the total number of hosts. This lets you color the gauges accordingly. |
inv | BalloonedMemory_MB (balloonedMemory) | Host | Critical >= 10, Warning >= 2 | This metric belongs to the VMware VirtualMachineQuickStats object type. It is pulled from inventory data based on the VMs reported on the host at the time of collection. The threshold is based on the total amount of memory, in MB, that is reclaimed from all of the VMs on that host. |
inv | SwappedMemory_MB (swappedMemory) | Host | Critical > 5, Warning > 0 | This metric belongs to the VMware VirtualMachineQuickStats object type. It is pulled from inventory data based on the VMs reported on the host at the time of collection. The threshold is based on the total amount of memory, in MB, that is being swapped from all VMs on that host. |
inv | RemainingCapacity_GB | Datastore | Critical <= 50, Warning <= 100 | This is a Splunk metric. Changes state based on the remaining disk space, in gigabytes, on a datastore. |
inv | Overprovisioned_GB | Datastore | Critical > 95, Warning > 75 | This is a Splunk metric. Changes state based on how much space is over-provisioned, in gigabytes. Negative numbers represent an under-provisioned datastore. |
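The PercentHigh* metrics are described as the percentage of entities in a critical state out of the total number of entities. The sketch below shows one plausible way to compute such a value and map it to a gauge state. It is an assumption about the arithmetic, not the app's actual search logic: `percent_critical`, `gauge_state`, and the sample readings are hypothetical.

```python
def percent_critical(values, critical_bound):
    """Percentage of entities whose per-entity value exceeds critical_bound."""
    if not values:
        return 0.0  # no entities reported, so nothing is critical
    high = sum(1 for v in values if v > critical_bound)
    return 100.0 * high / len(values)

def gauge_state(percent, warning=50, critical=75):
    """Map a percentage to a gauge state using the PercentHigh* bounds."""
    if percent > critical:
        return "critical"
    if percent > warning:
        return "warning"
    return "normal"

# Hypothetical VMinvCpuMaxUsg readings for five VMs (Critical > 90).
vm_cpu_max_usg = [95, 92, 40, 97, 91]
pct = percent_critical(vm_cpu_max_usg, critical_bound=90)
print(pct, gauge_state(pct))  # 80.0 critical
```

Here four of the five VMs exceed the per-VM critical bound, so the PercentHighCPUVm value would be 80, which is above its own Critical > 75 bound.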
disk
PerfType | Splunk metric name (vCenter API counter name) | Entity | Threshold Value | Description |
---|---|---|---|---|
disk | AvgRd (disk.read.average) | vm | Critical > 95%, Warning > 75% | Average read rate, in kilobytes per second, from the attached virtual disks. |
disk | AvgRd (disk.read.average) | host | Critical > 95%, Warning > 75% | Average kilobytes read from each LUN on the host. |
disk | AvgWr (disk.write.average) | vm | Critical > 95%, Warning > 75% | Average write rate, in kilobytes per second, to the attached virtual disks. |
disk | AvgWr (disk.write.average) | host | Critical > 95%, Warning > 75% | Average kilobytes written to each LUN on the host. |
disk | AvgUsg_KBps (disk.usage.average) | vm | Critical > 95%, Warning > 75% | Average I/O rate to the virtual disk. |
disk | AvgUsg_KBps (disk.usage.average) | host | Critical > 95%, Warning > 75% | Average aggregated disk I/O for all virtual machines running on the host. |
disk | SumWr (disk.numberWrite.summation) | vm | Critical > 95%, Warning > 75% | Number of times the virtual machine wrote to its virtual disk. |
disk | SumWr (disk.numberWrite.summation) | host | Critical > 95%, Warning > 75% | Total number of writes to the target LUN. |
disk | SumRd (disk.numberRead.summation) | vm | Critical > 95%, Warning > 75% | Number of times the virtual machine read from its virtual disk. |
disk | SumRd (disk.numberRead.summation) | host | Critical > 95%, Warning > 75% | Total number of reads from the target LUN. |
disk | AvgTotLat_ms (disk.totalLatency.average) | vm | Critical > 30%, Warning > 15% | Time, in milliseconds, that the virtual machine took to process a SCSI command. |
disk | AvgTotLat_ms (disk.totalLatency.average) | host | Critical > 30%, Warning > 15% | The sum, in milliseconds, of the kernel requests to the device. |
disk | AvgQueLat_ms (disk.queueLatency.average) | vm | Critical > 5%, Warning > 1% | Time, in milliseconds, that a virtual machine's request spent in a queue state. |
disk | AvgQueLat_ms (disk.queueLatency.average) | host | Critical > 5%, Warning > 1% | The sum, in milliseconds, of the time a request spent in a queue state. |
disk | SumCmdsAbort (disk.commandsAborted.summation) | vm | Critical > 2%, Warning > 0% | Number of commands that were aborted on the virtual machine. |
disk | SumCmdsAbort (disk.commandsAborted.summation) | host | Critical > 2%, Warning > 0% | Number of commands that were aborted on the host. |
disk | SumBusResets (disk.busResets.summation) | vm | Critical > 2%, Warning > 0% | Number of SCSI-bus reset commands that were issued. |
disk | SumBusResets (disk.busResets.summation) | host | Critical > 2%, Warning > 0% | Number of SCSI-bus reset commands that were issued. |
net
PerfType | Splunk metric name (vCenter API counter name) | Entity | Threshold Value | Description |
---|---|---|---|---|
net | AvgRvcd_KBps (net.received.average) | vm | Critical > 95%, Warning > 75% | Average kilobytes received across the virtual machine's virtual NIC. |
net | AvgRvcd_KBps (net.received.average) | host | Critical > 95%, Warning > 75% | Average amount of data, in kilobytes, received across the host's physical adapter. |
net | AvgXmit_KBps (net.transmitted.average) | vm | Critical > 95%, Warning > 75% | Average kilobytes transmitted across the virtual machine's virtual NIC. |
net | AvgXmit_KBps (net.transmitted.average) | host | Critical > 95%, Warning > 75% | Average amount of data, in kilobytes, transmitted across the host's physical adapter. |
net | AvgUsg_KBps (net.usage.average) | vm | Critical > 95%, Warning > 75% | Combined transmit and receive rates across all virtual NIC instances. |
net | AvgUsg_KBps (net.usage.average) | host | Critical > 95%, Warning > 75% | Combined transmit and receive rates across all physical NIC instances. |
This documentation applies to the following versions of Splunk® App for VMware (Legacy): 2.0