vMotion: Launch Failure

Engineers are always diligent about patching, especially when it comes to the lab (wink wink). So, being the diligent engineer that I am, yesterday afternoon I set up a job in Update Manager to patch all the hosts in my cluster. I stepped away to run an errand, and when I came back my first host still had not entered maintenance mode. What gives?

The Problem

When I check the Tasks & Events for the host, I see a failure for a “Migrate virtual machine” job with a status of “A general system error occurred: Launch failure”. The error stack listed “The VM Failed to resume on the destination host during early power on”.

Why won’t you resume?!

I attempt a manual vMotion to the host it failed to migrate to, and it works. Interesting. Let’s dig a little deeper and check out the logs.

cat /vmfs/volumes/vsanDatastore/VM/vmware.log

Look at that. The VM is configured with a CPU reservation of 1000 MHz, and the cluster is unable to satisfy it (hey, it’s a lab). Let’s validate we are looking at the correct VM:

Get-Cluster vLAB01 | Get-VM | ? {$_.Name -like "*UTZ*"} | Get-VMResourceConfiguration | Select CpuReservationMhz

CpuReservationMhz
-----------------
             1000
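
Since the whole cluster is short on resources, it can help to list every reservation at once rather than checking one VM at a time. A minimal sketch, assuming an active Connect-VIServer session and the same vLAB01 cluster; the filter and the selected columns are illustrative and not part of the original troubleshooting:

```powershell
# Assumes Connect-VIServer has already been run against the vCenter.
# 'vLAB01' is the cluster name used in this post.
Get-Cluster vLAB01 | Get-VM |
    Get-VMResourceConfiguration |
    # Keep only VMs that hold a CPU or memory reservation
    Where-Object { $_.CpuReservationMhz -gt 0 -or $_.MemReservationGB -gt 0 } |
    Select-Object VM, CpuReservationMhz, MemReservationGB
```

Any VM that shows up here is a candidate for blocking a vMotion when the destination host is tight on capacity.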

The main purpose of this cluster is NSX development. As the majority of the VMs are ESGs or DLRs, they are configured with a reservation for both CPU and memory (see the NSX 6.4 Release Notes).


Reservations are not ideal in a lab environment, where resources are typically limited. The option to edit the resource settings is disabled in the vSphere Client, which is by design to ensure that the NSX Edges get enough resources to function. Now, I could edit them in the NSX UI, but that’s no fun.

The Solution

So what is the alternative? To the Postman! To start, I’ll need the ESG ID. This can be obtained from the NSX UI or with PowerNSX:

Get-NsxEdge | ? {$_.Name -like "*UTZ*"} | select id

Here I’m running a GET request against my first troublesome ESG, edge-1. As you can see from the output, line 161 is our target: the CPU reservation value. Let’s change it to “0” and see what we get:

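
For anyone who would rather script it than click through Postman, the same edit can probably be done with PowerNSX. This is a rough sketch, not the method used above: it assumes an existing Connect-NsxServer session, and it assumes the edge XML returned by Get-NsxEdge carries the CPU reservation at appliances/appliance/cpuReservation/reservation. Check that path against your own GET output before running it.

```powershell
# Assumes Connect-NsxServer has already been run.
# 'edge-1' is the edge ID from this post; substitute your own.
$edge = Get-NsxEdge -objectId 'edge-1'

# ASSUMPTION: the reservation sits at this path in the edge XML.
# Verify it against the output of your own GET before trusting it.
$nodes = $edge.SelectNodes('appliances/appliance/cpuReservation/reservation')

if ($nodes.Count -eq 0) {
    # Bail out rather than pushing an unchanged config back to NSX
    throw 'No cpuReservation/reservation element found; check the XML path.'
}

foreach ($node in $nodes) {
    # Zero the CPU reservation on every appliance of this edge
    $node.InnerText = '0'
}

# Push the modified edge configuration back to NSX Manager
$edge | Set-NsxEdge -Confirm:$false
```

Using SelectNodes instead of dotted property access means the loop also covers an HA pair, where the edge has two appliance entries.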

Validate the change was successful:

Get-Cluster vLAB01 | Get-VM | ? {$_.Name -like "*UTZ*"} | Get-VMResourceConfiguration | Select CpuReservationMhz

CpuReservationMhz
-----------------
                0

This is not an ideal solution for a production environment. You should never disable the reservations for ESGs there, as they exist to ensure proper performance. But for a lab environment where resources are limited, this should be okay.
