Quiescing Issues in VMware for Data Recovery
March 6, 2011. Posted by Chad in Backup, Tech, VMware.
Last week, we deployed the VMware Data Recovery appliance in our cluster of hosts. We have had a SAN LUN dedicated to backups for a while, and finally got around to using it. Installing the VMware Data Recovery appliance is very easy, with just a few key things to know about.
After the installation, you add a new VMDK disk that will be used for the deduplicated backups. When we created ours, it was limited to 256 GB, which was half of what we wanted. The cause was that the datastore had been created with too small a block size, and the fix is to delete the VMDK, delete the datastore itself, and recreate it. Hopefully you don’t have one large datastore for everything, since everything on it will be lost if you do. When you recreate the datastore, select the largest block size, and it will let you create backup VMDKs of up to 2 terabytes.
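The relationship between block size and maximum VMDK size can be sketched in a few lines of Python. This assumes the datastore is VMFS-3 (the post does not name the version), where block sizes of 1, 2, 4, and 8 MB cap a single file at roughly 256 GB, 512 GB, 1 TB, and 2 TB. Each real limit is slightly under the round figure, so the comparison below is strict. The function name is made up for illustration.

```python
# Assumed VMFS-3 limits: block size (MB) -> approximate max single-file size (GB).
# The real limits sit slightly below these round numbers.
VMFS3_MAX_FILE_GB = {1: 256, 2: 512, 4: 1024, 8: 2048}


def smallest_block_size_mb(vmdk_gb):
    """Return the smallest block size (MB) whose file limit exceeds vmdk_gb."""
    for block_mb in sorted(VMFS3_MAX_FILE_GB):
        # Strict comparison because the true limit is just under the round figure.
        if vmdk_gb < VMFS3_MAX_FILE_GB[block_mb]:
            return block_mb
    raise ValueError("VMDK too large for any VMFS-3 block size")


if __name__ == "__main__":
    for size_gb in (200, 500, 2000):
        print(size_gb, "GB needs a block size of at least",
              smallest_block_size_mb(size_gb), "MB")
```

Picking the largest block size up front, as described above, avoids having to repeat the delete-and-recreate cycle if the backup disk needs to grow later.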
Next, we ran backups on all our virtuals. Every one worked great except for one particular virtual. We were able to back up and restore entire Windows and Linux VMs, and also restore individual files in both guest operating systems. The Data Recovery tool simply worked great.
Except for the one system. It was a 32-bit Windows Server 2008 machine running SQL Server 2008. It hosted the vSphere database among others and was fairly active. But it just wouldn’t let you back it up with VMDR.
Here’s the error message you’d see in the console:
Creating a quiesced snapshot failed because the created snapshot operation exceeded the time limit for holding off I/O in the frozen virtual machine.
OK, lots of folks have this problem. Quiescing means having the OS tell all the apps running on it to stop writing to disk for a moment, so the Windows Volume Shadow Copy Service (VSS) can capture a consistent image of what is on the disk.
Lots of potential fixes. First step, reinstall VMware Tools in the virtual. That didn’t work for me. Then, some folks have reported success deleting a registry key and rebooting. Didn’t work for me either. There is also a whole bunch of advice about making sure the right services are stopped and started.
The key to the issue I was having is that I could take a typical snapshot of the virtual just fine. It’s only when Data Recovery tries to do it that it fails. This led me to believe the issue was outside of the virtual itself, and that all the mucking around with the OS wouldn’t help.
The fix that ended up working was the one that shouldn’t have applied. It specified that it’s only for AD servers running on Win2k8 R2. But it worked for my issue, and now my server works fine.
- Right-click the Windows 2008 virtual machine having the problem, and power it off.
- Right-click the virtual machine, and click Edit Settings.
- Click the Options tab, and select the General entry in the settings column.
- Click Configuration Parameters… The Configuration Parameters window appears.
- If disk.EnableUUID is not already listed, click Add Row. In the Name column, enter disk.EnableUUID.
- In the Value column, enter FALSE.
- Click OK and click Save.
- Power on the virtual machine.
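
For reference, configuration parameters set this way are stored in the virtual machine's .vmx file as `key = "value"` lines. The sketch below shows the resulting entry, written as a small Python helper that works on the text of a .vmx file. It is not the method used above: the function name is hypothetical, it assumes keys can be matched case-insensitively, and it should only be tried on a copy of the file with the virtual machine powered off.

```python
import re


def set_vmx_option(vmx_text, key, value):
    """Return vmx_text with a single `key = "value"` line for the given key."""
    new_line = f'{key} = "{value}"'
    # Assumption: match the key case-insensitively at the start of a line.
    pattern = re.compile(rf'^\s*{re.escape(key)}\s*=', re.IGNORECASE)
    out = []
    replaced = False
    for line in vmx_text.splitlines():
        if pattern.match(line):
            if not replaced:
                out.append(new_line)  # replace the first matching entry
                replaced = True
            # any further duplicate entries are dropped
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)  # key was absent, so append it
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = 'displayName = "sql01"\ndisk.EnableUUID = "TRUE"\n'
    print(set_vmx_option(sample, "disk.EnableUUID", "FALSE"), end="")
```

The vSphere Client steps above remain the safer route, since the client rewrites the file for you.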