VMware HA Failover Capacity = 0

March 11, 2010 | Posted by General Zod in Tech, VMware.
I spun up my VI client this morning, and found that my HA Failover Capacity had dropped to 0 Hosts. WTF? From a resources capacity standpoint, I’ve got some serious overkill… so unless someone’s been creating a ton of VMs without my knowledge, then there must be some mistake.
So I did some investigation and discovered what I believe to be the cause of this HA Failover result. In truth, there wasn’t really a problem other than that VMware crunches the HA capacity numbers with what I like to call “lazy math”. I shall explain…
Let’s pretend you have an ESX Cluster with 4 Host servers with your HA Failover configured to 1. Each Host has the following amount of physical memory installed:
Host1: 64 GB
Host2: 64 GB
Host3: 32 GB
Host4: 64 GB
Also, within your cluster let’s say you have 26 VMs with the following “configured memory” assignments:
2 VMs with 4 GB
5 VMs with 2 GB
8 VMs with 1 GB
11 VMs with 512 MB
Note that one of your Host servers has less installed physical memory (32 GB) than the other Hosts. Also, make note that a couple of the VMs have as much as 4 GB of memory assigned to them.
VMware divides the smallest Host's physical memory by the largest VM's configured memory, and treats the result as the maximum VM capacity per Host.
Capacity per Host: 32 GB / 4 GB = 8 VMs per Host
Since you have 4 Host servers and your HA Failover is configured to 1, the HA Failover Capacity must be calculated under the assumption that one of the Hosts has failed. So VMware calculates the maximum number of VMs that your cluster can handle as…
Max HA Capacity: 8 VMs per Host x 3 Hosts = 24 VMs
As stated earlier in this example, you have 26 VMs in your cluster, which exceeds the calculated HA Capacity… so VMware reports your HA Failover Capacity as 0.
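The arithmetic above can be sketched in a few lines of Python. This mirrors the "lazy math" as described in this post, not VMware's actual implementation; all names and the 26-VM inventory are just the example figures from above.

```python
# Sketch of the slot-style HA capacity math described above.
# Illustrative only; this is not VMware's actual code.

host_memory_gb = [64, 64, 32, 64]                          # physical memory per Host
vm_memory_gb = [4] * 2 + [2] * 5 + [1] * 8 + [0.5] * 11    # 26 configured VMs
ha_failover_hosts = 1

# The "slot" is sized by the largest configured VM memory, and
# per-Host capacity is pinned to the *smallest* Host.
slot_gb = max(vm_memory_gb)                    # 4 GB
vms_per_host = min(host_memory_gb) // slot_gb  # 32 / 4 = 8

# Assume the configured number of Hosts has failed.
surviving_hosts = len(host_memory_gb) - ha_failover_hosts
max_ha_capacity = vms_per_host * surviving_hosts  # 8 x 3 = 24

print(f"VMs per Host: {vms_per_host}, Max HA Capacity: {max_ha_capacity}")
if len(vm_memory_gb) > max_ha_capacity:
    print("More VMs (26) than slots (24): HA Failover Capacity reported as 0")
```

Swap in your own Host and VM numbers to see how close your cluster sits to the threshold.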
The cluster has way more resources than are required to house these VMs. In truth, all 26 VMs could be housed on a single Host with 64 GB of physical memory. (It's just that the math VMware leverages is hinky enough to produce misleading results.)
Since there's not much you can do to alter this math (other than reducing the assigned VM memory or installing more physical memory in the Hosts), you can simply work around the issue. Edit the Settings on your Cluster, and select the "Allow VMs to be powered on even if they violate availability constraints" option under the VMware HA settings. If a Host fails on you, your environment will ignore the HA Failover calculation but will still provide HA failover for the VMs in your Cluster. Provided you have sufficient resources to accommodate your VMs in the event of a Host failure, you shouldn't have much difficulty.
I recommend that you crunch your own numbers to determine how your assigned memory stacks up against your available physical memory. Just don’t forget to deduct some of the memory from each Host (say at least 2 GB) for the Service Console management and any additional overhead requirements of each Host.
At least, that’s my suggestion… if you want more reliable data, then you should probably consider giving VMware Support a call and ask them for guidance.