1. CPU load average:
The arithmetic mean of the CPU load over the last 1, 5, and 15 minutes, based on 6-second samples. A value of 1 means that the physical CPUs are fully used, 2 means that the VMs demand twice the available physical CPU, and 0.5 means that 50% of the physical CPU is consumed.
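The averaging described above can be sketched as follows. This is only an illustration: it takes a plain arithmetic mean over the most recent 6-second samples, and the class name `LoadAverage` and the sample values are hypothetical, not part of esxtop.

```python
from collections import deque

SAMPLE_SECONDS = 6  # sampling period stated in the text


class LoadAverage:
    """Arithmetic mean of load samples over 1, 5, and 15 minute windows."""

    WINDOWS_MIN = (1, 5, 15)

    def __init__(self):
        # Keep only as many samples as the longest window needs (150).
        longest = max(self.WINDOWS_MIN) * 60 // SAMPLE_SECONDS
        self.samples = deque(maxlen=longest)

    def add(self, load):
        """Record one 6-second load sample (1.0 = physical CPUs fully used)."""
        self.samples.append(load)

    def averages(self):
        """Return the (1 min, 5 min, 15 min) means; 0.0 when no samples exist."""
        result = []
        for minutes in self.WINDOWS_MIN:
            count = minutes * 60 // SAMPLE_SECONDS
            window = list(self.samples)[-count:]
            result.append(sum(window) / len(window) if window else 0.0)
        return tuple(result)


if __name__ == "__main__":
    la = LoadAverage()
    for _ in range(40):
        la.add(0.5)   # four quiet minutes
    for _ in range(10):
        la.add(2.0)   # one overloaded minute
    print(la.averages())
```

A short burst shows up strongly in the 1-minute value and is diluted in the 5- and 15-minute values, which is why the three numbers are read together.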
2. "PCPU UTIL(%)"
The percentage of unhalted CPU cycles per PCPU, and its average over all PCPUs.
Q: What does it mean if PCPU UTIL% is high?
A: It means that you are using a lot of CPU resources. (a) If all of the PCPUs are near 100%, you may be overcommitting your CPU resources. Check the RDY% of the groups in the system to verify CPU overcommitment. Refer to RDY% below. (b) If some PCPUs stay near 100% but others do not, there might be an imbalance issue. Monitor the system for a few minutes to verify whether the same PCPUs stay at ~100% CPU. If so, check the VM CPU affinity settings.
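The two cases in the answer above can be told apart with a small check over several snapshots of per-PCPU UTIL%. This is a minimal sketch: the function names are hypothetical, and the 90% cut-off for "near 100%" is an assumption, not a documented threshold.

```python
def persistent_hot_pcpus(snapshots, high=90.0):
    """Return indexes of PCPUs whose UTIL% is >= high in every snapshot.

    snapshots: list of lists, one UTIL% value per PCPU per snapshot.
    high: assumed cut-off for "near 100%".
    """
    if not snapshots or not snapshots[0]:
        raise ValueError("need at least one non-empty snapshot")
    hot = set(range(len(snapshots[0])))
    for snap in snapshots:
        hot &= {i for i, util in enumerate(snap) if util >= high}
    return hot


def diagnose(snapshots, high=90.0):
    """Map the snapshots to the follow-up check suggested in the text."""
    hot = persistent_hot_pcpus(snapshots, high)
    if len(hot) == len(snapshots[0]):
        return "all PCPUs busy: check %RDY for overcommitment"
    if hot:
        return "same PCPUs busy: check VM CPU affinity"
    return "no persistent hot PCPUs"


if __name__ == "__main__":
    print(diagnose([[99, 20, 30, 98], [97, 25, 10, 99]]))
```

Requiring a PCPU to be hot in every snapshot reflects the advice to watch for a few minutes before suspecting affinity settings.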
3. "CORE UTIL(%)" (only displayed when hyper-threading is enabled)
The percentage of CPU cycles per core when at least one of the PCPUs in this core is unhalted, and its average over all cores. It’s the reverse of the "CORE IDLE" percentage, which is the percentage of CPU cycles when both PCPUs in this core are halted.
To get the average core utilization regardless of hyper-threading: if hyper-threading is used, read the average "CORE UTIL(%)" directly. Otherwise (hyper-threading is unavailable or disabled), a PCPU is a core, so use the average "PCPU UTIL(%)". With esxtop batch output, this amounts to checking whether the core utilization counter is present.
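The selection rule above might look like this when applied to esxtop batch output saved as CSV. The counter names `Physical Cpu(_Total)\% Core Util Time` and `Physical Cpu(_Total)\% Util Time` are assumptions about the column headers; verify them against your own batch file. The function name is hypothetical.

```python
import csv
import io


def average_core_util(batch_csv_text):
    """Return the average core utilization from the last sample row."""
    rows = [row for row in csv.reader(io.StringIO(batch_csv_text)) if row]
    if len(rows) < 2:
        raise ValueError("need a header row and at least one sample row")
    header, last = rows[0], rows[-1]

    def find(suffix):
        # Column headers carry a host prefix, so match on the suffix only.
        for index, name in enumerate(header):
            if name.endswith(suffix):
                return index
        return None

    # Assumed counter names; check them against your own batch output.
    core = find(r"Physical Cpu(_Total)\% Core Util Time")
    pcpu = find(r"Physical Cpu(_Total)\% Util Time")

    # The core counter is present only when hyper-threading is enabled.
    column = core if core is not None else pcpu
    if column is None:
        raise KeyError("no physical CPU utilization column found")
    return float(last[column])
```

The caller does not need to know whether hyper-threading is enabled; the presence of the core counter decides which column is read.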
Q: What is the difference between "PCPU UTIL(%)" and "CORE UTIL(%)"?
A: A core is utilized if either or both of the PCPUs on this core are utilized. The percentage utilization of a core is therefore not the sum of the percentage utilizations of its PCPUs.
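A toy model makes the "either or both" rule concrete. The sketch below assumes time is divided into equal slices and that each PCPU is either busy or halted in a slice; both the model and the function names are illustrative only.

```python
def utilization(busy_flags):
    """Percentage of time slices in which the flag is True."""
    return 100.0 * sum(busy_flags) / len(busy_flags)


def core_utilization(pcpu0_busy, pcpu1_busy):
    """A core is busy in a slice when at least one of its PCPUs is busy."""
    core_busy = [a or b for a, b in zip(pcpu0_busy, pcpu1_busy)]
    return utilization(core_busy)


if __name__ == "__main__":
    pcpu0 = [True, True, False, False]   # 50% utilized
    pcpu1 = [True, False, True, False]   # 50% utilized
    print(utilization(pcpu0), utilization(pcpu1))  # 50.0 50.0
    print(core_utilization(pcpu0, pcpu1))          # 75.0, not 100.0
```

The slice in which both PCPUs are busy counts once for the core, so the core figure (75%) is lower than the sum of the two PCPU figures (100%). The remaining 25% is the core idle share.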
4. "PCPU USED(%)"
Q: What is the difference between "PCPU UTIL(%)" and "PCPU USED(%)"?
A: While "PCPU UTIL(%)" indicates how much time a PCPU was busy (unhalted) during the last sampling interval, "PCPU USED(%)" shows the amount of "effective work" done by this PCPU. The value of "PCPU USED(%)" can differ from "PCPU UTIL(%)" mainly for the following two reasons:
(1) Hyper-threading
(2) Power Management
Q: Why is the average CPU usage in the vSphere client ~100%, while the average "PCPU USED(%)" in esxtop is ~50%?
A: Same as above: it is likely due to hyper-threading. The vSphere client deliberately doubles the average CPU usage when hyper-threading is used, whereas esxtop does not double the average "PCPU USED(%)". Doubling it would give the average USED% across all cores.
Q: How do I retrieve the average core USED% regardless of whether hyper-threading is used?
A: If hyper-threading is used, the USED% of a core is the sum of the USED% of the PCPUs on that core, so the average core USED% is twice the average PCPU USED%. Otherwise (hyper-threading is unavailable or disabled), a PCPU is a core, so we can just use the average "PCPU USED(%)".
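The arithmetic in this answer can be checked with a short sketch. It assumes two PCPUs per core when hyper-threading is on and that sibling PCPUs are adjacent in the input list; both are assumptions for illustration, and the function name is hypothetical.

```python
def core_used(pcpu_used, hyperthreading):
    """Return (per-core USED% list, average core USED%).

    pcpu_used: USED% per PCPU, siblings assumed adjacent (0,1), (2,3), ...
    """
    step = 2 if hyperthreading else 1
    if not pcpu_used or len(pcpu_used) % step:
        raise ValueError("PCPU count must be a non-zero multiple of %d" % step)
    # Sum sibling PCPUs to get each core's USED%.
    cores = [sum(pcpu_used[i:i + step]) for i in range(0, len(pcpu_used), step)]
    return cores, sum(cores) / len(cores)


if __name__ == "__main__":
    used = [50.0, 50.0, 30.0, 20.0]
    print(sum(used) / len(used))                # average PCPU USED%: 37.5
    print(core_used(used, hyperthreading=True)) # ([100.0, 50.0], 75.0)
```

With hyper-threading the average core USED% (75) is twice the average PCPU USED% (37.5); without it the two averages are identical.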
5. "CCPU(%)"
Percentages of total CPU time as reported by the ESX Service Console. "us" is the percentage of user time, "sy" is the percentage of system time, "id" is the percentage of idle time, and "wa" is the percentage of wait time. "cs/sec" is the number of context switches per second recorded by the ESX Service Console.
Q: What’s the difference between CCPU% and the console group stats?
A: CCPU% is measured by the COS. The "console" group CPU stats are measured by the VMkernel. The stats are related, but not the same.
6. VM %RDY
A world in a run queue is waiting for the CPU scheduler to let it run on a PCPU. %RDY is the percentage of time the world spends in this state.
The %RDY value is a sum of all vCPU %RDY for the VM. Some examples:
The max %RDY value of a 1vCPU VM is 100%
The max %RDY value of a 4vCPU VM is 400%
The recommended %RDY thresholds are:
%RDY 10 for 1 vCPU
%RDY 20 for 2 vCPU
%RDY 40 for 4 vCPU
Per vCPU of a VM:
0-4 per vCPU = green
5-9 per vCPU = yellow
10+ per vCPU = red
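Because the VM-level %RDY is a sum over vCPUs, it has to be divided by the vCPU count before it is compared with the per-vCPU bands above. A minimal sketch, with a hypothetical function name and with values between 4 and 5 treated as green:

```python
def rdy_status(vm_rdy_percent, vcpus):
    """Return (per-vCPU %RDY, colour band) for a VM-level %RDY value."""
    if vcpus < 1:
        raise ValueError("a VM has at least one vCPU")
    per_vcpu = vm_rdy_percent / vcpus
    if per_vcpu >= 10:
        band = "red"
    elif per_vcpu >= 5:
        band = "yellow"
    else:
        band = "green"
    return per_vcpu, band


if __name__ == "__main__":
    print(rdy_status(40, 4))  # (10.0, 'red')
    print(rdy_status(20, 4))  # (5.0, 'yellow')
    print(rdy_status(3, 1))   # (3.0, 'green')
```

A %RDY of 20 is at the threshold for a 2-vCPU VM but only yellow for a 4-vCPU VM, which is why the raw VM value should not be compared across VMs of different sizes.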
To determine whether the poor performance is due to a CPU constraint:
· Examine the load average on the first line of the command output.
A load average of 1.00 means that the ESXi/ESX Server machine’s physical CPUs are fully utilized, and a load average of 0.5 means that they are half utilized. A load average of 2.00 means that the system as a whole is overloaded.
· Examine the %READY field for the percentage of time that the virtual machine was ready but could not be scheduled to run on a physical CPU.
Under normal operating conditions, this value should remain under 5%. If the ready time values are high on the virtual machines that experience bad performance, then check for CPU limiting:
· Make sure that the virtual machine is not constrained by a CPU limit set on itself.
· Make sure that the virtual machine is not constrained by its resource pool.
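The checklist above can be read as a simple decision flow. The sketch below is one possible encoding, not a tool that ships with ESXi: the function name, the parameter names, and the convention that `None` means "no limit set" are all assumptions.

```python
def cpu_triage(load_average, ready_percent, vm_limit_mhz=None, pool_limit_mhz=None):
    """Return a list of findings for a poorly performing VM.

    load_average: value from the first line of the esxtop output.
    ready_percent: %READY of the VM.
    vm_limit_mhz / pool_limit_mhz: configured CPU limits, None if unlimited.
    """
    findings = []

    # Step 1: host-level load average.
    if load_average >= 2.0:
        findings.append("host overloaded")
    elif load_average >= 1.0:
        findings.append("physical CPUs fully utilized")

    # Step 2: ready time should stay under 5% in normal operation.
    if ready_percent >= 5.0:
        findings.append("high ready time")
        # Step 3: with high ready time, look for CPU limits.
        if vm_limit_mhz is not None:
            findings.append("check VM CPU limit")
        if pool_limit_mhz is not None:
            findings.append("check resource pool CPU limit")

    return findings


if __name__ == "__main__":
    print(cpu_triage(2.1, 12.0, vm_limit_mhz=1000))
```

Limits are only reported when ready time is already high, mirroring the order of the checks in the text.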
The CPU load average at the top of the screen is a quick way to determine whether the physical CPUs on that particular host are being hammered. The load average is shown for 1, 5, and 15 minutes from left to right, based on 6-second samples. The CPU load takes into account the ready time and run time for all groups on the host.