VMware: Repairing orphaned ESX snapshots

Update: Consolidate Snapshots

Snapshots created via the API (NetApp SMVI, EqualLogic’s Auto-Snapshot Manager/VMware Edition, VMware VCB or VDR) occasionally get stuck. You probably have an orphaned snapshot if nothing is listed in the Snapshot Manager but you can’t change the size of a vmdk (the provisioned size is grayed out), if you happen to notice a -delta file where there shouldn’t be one, or if your scheduled monitoring tasks report an issue (you do run scheduled monitoring tasks, don’t you?). Here are some potential fixes.
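If you want a quick way to spot stray -delta files from the console, something like this rough sketch should do. The function name find_delta_files is my own invention, and it simply lists every file ending in -delta.vmdk under the directory you give it. A -delta file is only suspicious when the Snapshot Manager shows no snapshot for that VM, so treat the output as a list of things to check, not a list of problems.

```shell
#!/bin/sh
# List snapshot delta extents under a directory tree.
# find_delta_files is a hypothetical helper name, not a VMware tool.
find_delta_files() {
    # $1: directory to search (defaults to the current directory)
    find "${1:-.}" -type f -name '*-delta.vmdk' 2>/dev/null
}
```

Point it at a datastore, for example find_delta_files /vmfs/volumes/"datastore name", and compare the result against what the Snapshot Manager shows for each VM.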

Background
VMware hard drives are stored as two files, a “name”.vmdk descriptor text file and a “name”-flat.vmdk binary (or “name”-delta.vmdk for snapshots) which holds the actual blocks.

When you take a snapshot of a hard drive, a new set of files is created. The new -delta file holds any new or changed blocks, and the VM configuration is updated to point at the new descriptor file.

No Snapshot

Before a snapshot is created, the VM configuration file (“name”.vmx) contains (among many other things) a line referencing the hard drive descriptor file:

ide0:0.fileName = "KyleA.vmdk"

At this point the snapshot descriptor file (“name”.vmsd) is essentially blank.

The original “name”.vmdk hard drive descriptor file contains (among other things) two lines referencing the ID of the drive (note the ID is unique only for snapshots – all master disks are fffffffe) as well as a line referencing the binary “-flat” file for this drive.
CID=fffffffe
parentCID=ffffffff
# Extent description
RW 41943040 VMFS "KyleA-flat.vmdk"
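To pull just those lines out of a descriptor without reading the whole file, something like the following should work. It is only a sketch: show_descriptor_ids is a name I made up, and it assumes the descriptor keeps each key at the start of its own line, as in the excerpt above. Run it against the descriptor only, never the -flat or -delta binary.

```shell
#!/bin/sh
# Print the ID lines and extent lines of a VMDK descriptor file.
# show_descriptor_ids is a hypothetical helper name.
show_descriptor_ids() {
    # $1: path to a descriptor such as KyleA.vmdk
    grep -E '^(CID|parentCID|parentFileNameHint)=|^RW ' "$1"
}
```

For a master disk you should see only the CID, parentCID and RW lines. A snapshot descriptor will also show a parentFileNameHint line.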

One snapshot
When a snapshot is taken, the configuration file (“name”.vmx) gets updated with the name of the current snapshot’s descriptor file:

ide0:0.fileName = "KyleA-000001.vmdk"

The snapshot descriptor file (“name”.vmsd) gets updated with (among other things) the name of the associated .vmsn file, which holds the state of RAM, CPU and VMX.

snapshot0.filename = "KyleA-Snapshot2.vmsn"

The original “name”.vmdk is left unchanged (see above)

A snapshot binary is created as “name”-000001-delta.vmdk (note the name change from “-flat”)
The descriptor file “name”-000001.vmdk is created with a line referencing the “delta” binary file plus a new ID. The important item is the reference to the parentCID and parentFileNameHint, both referencing the master vmdk descriptor.
Snap descriptor .vmdk:
CID=fffffffe
parentCID=fffffffe
parentFileNameHint="KyleA.vmdk"
# Extent description
RW 41943040 VMFSSPARSE "KyleA-000001-delta.vmdk"

The end result is the .vmx points to the descriptor file of the VMDK to be written to. If that is a snapshot then it in turn references the next file “up” the snapshot tree.
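If you want to see the whole chain for a disk, you can follow the parentFileNameHint lines by hand, or use something like this sketch. walk_snapshot_chain is a hypothetical helper of mine. It assumes each descriptor has at most one parentFileNameHint line with the value in straight double quotes, and that a relative hint lives in the same directory as the descriptor that references it. It stops after 64 links so a looped chain can’t hang the shell.

```shell
#!/bin/sh
# Follow parentFileNameHint from one descriptor up to the base disk,
# printing each descriptor in the chain (current snapshot first).
# walk_snapshot_chain is a hypothetical helper name.
walk_snapshot_chain() {
    # $1: descriptor named in the .vmx, for example KyleA-000001.vmdk
    current=$1
    dir=$(dirname "$current")
    depth=0
    while [ "$depth" -lt 64 ]; do
        if [ ! -f "$current" ]; then
            echo "missing descriptor: $current" >&2
            return 1
        fi
        echo "$current"
        # Pull the quoted value out of the parentFileNameHint line, if any
        parent=$(sed -n 's/^parentFileNameHint="\(.*\)"$/\1/p' "$current" | head -n 1)
        # No hint means we have reached the base disk
        [ -n "$parent" ] || return 0
        case $parent in
            /*) current=$parent ;;        # absolute path in the hint
            *)  current=$dir/$parent ;;   # relative to the descriptor's directory
        esac
        depth=$((depth + 1))
    done
    echo "chain longer than 64 links, giving up" >&2
    return 1
}
```

Start from whatever file the .vmx references. If the function reports a missing descriptor, that break in the chain is usually the thing to investigate first.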

When a snapshot is committed (i.e. deleted while it is currently being written to or is directly up the tree from the current running state), the blocks in its -delta are merged into the -delta or -flat of the disk it calls “parent”, and its own descriptor and -delta are deleted. If there is a snap below it, that snap is updated to reference the deleted snapshot’s parent as its parent.

For example:
If you start with VMX->snap3->snap2->snap1->flat
then delete snap2 (assuming snap3 is the current binary)
you end up with VMX->snap3->snap1->flat

Normally you don’t have to worry about the details; however, you occasionally run into issues where snapshots can’t be removed.

Removing “hidden” snapshots
Method 1:

Use the GUI to make a snapshot, then use the Snapshot Manager to “Delete All”

Method 1a:
Power off the VM
Use the GUI to make a snapshot, then use the Snapshot Manager to “Delete All”

Method 2:
Connect to the ESX server with an SSH utility like PuTTY
Open each “name”.vmdk descriptor file and look for the line
ddb.deletable = "false" (see a walk-through on this below)
Change this to "true"
Create another snapshot, then delete them all. You can use the GUI, but now that you’re this far, the command line to create a snapshot is

vmware-cmd "name".vmx createsnapshot "test" "" 0 0

The command to remove all snapshots is

vmware-cmd "name".vmx removesnapshots

Method 2a:
Shut the VM down before doing Method 2

Walk through on finding and changing the ddb.deletable setting:
From the console of an ESX server, log in as root. On an ESXi server, enable local troubleshooting mode, then log in as root.

From the command prompt:
cd /vmfs/volumes/"datastore name"
“datastore name” is the case-sensitive name of the datastore the VM is stored on.
Use “ls” in /vmfs/volumes to get a list of all datastores if needed

cd "vm name"
“vm name” is the case-sensitive name of the VM. Use “ls” to get a list of all VMs

Use “ls” to get a list of all files

For each vmdk descriptor file you want to check, use “cat name.vmdk” to display the contents of the file
If you find one with ddb.deletable = "false", open the file with vi to edit it.

vi "name".vmdk
Arrow to the “f” in “false” and hit “x” five times until you have deleted the word “false”
Hit “i” to switch to insert mode and type “true”. The line should now read
ddb.deletable = "true"

Hit the escape key, then type “:wq” and press Enter to save your changes and exit
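If vi isn’t your thing, or you have a lot of descriptors to check, the same walk-through can be roughed out as two small shell functions. Both names (list_undeletable and set_deletable_true) are my own, not VMware commands. The sketch assumes the descriptor uses straight double quotes and that ddb.deletable sits at the start of its line. The edit keeps a copy of the original as “name”.vmdk.bak so you can put it back if something looks wrong.

```shell
#!/bin/sh
# List descriptor files in a VM directory that carry ddb.deletable = "false".
# list_undeletable is a hypothetical helper name.
list_undeletable() {
    # $1: VM directory; -flat and -delta binaries are skipped, they are not text
    for f in "$1"/*.vmdk; do
        case $f in
            *-flat.vmdk|*-delta.vmdk) continue ;;
        esac
        [ -f "$f" ] || continue
        if grep -q '^ddb\.deletable *= *"false"' "$f"; then
            echo "$f"
        fi
    done
    return 0
}

# Change ddb.deletable from "false" to "true", keeping a backup copy.
# set_deletable_true is a hypothetical helper name.
set_deletable_true() {
    # $1: descriptor file to edit
    cp -p "$1" "$1.bak" || return 1
    # Rewrite from the backup into the original file
    sed 's/^\(ddb\.deletable *= *\)"false"/\1"true"/' "$1.bak" > "$1"
}
```

Run list_undeletable on the VM’s directory first, then call set_deletable_true on each file it prints. After that, create and remove the snapshots as described in Method 2.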

Note that while you can use the GUI to create a snapshot and then commit them all, I’ve had better luck using the command line.
