Over the last couple months I have been performing upgrades from ESXi 4.1, to 5.0 Beta, to 5.0 RTM, and earlier today to 5.0 GA. With the exception of the 5.0 GA release all these upgrades were handled with VMware Update Manager (VUM). I have encountered a few errors along the way I and I felt it was worthwhile to share them.
First of all to use VUM and the ESXi 5.0 GA ISO to upgrade to your hosts to 5.0 GA you must be running ESXi 220.127.116.11 or later (per details from VUM), but not any previous release of 5.0. The odd thing is that you can do an upgrade of your pre-ESXi 5.0 GA hosts by booting with the install CD and choosing the “upgrade” option. This preserves all the settings as you would expect it to. The ESXi “depot” installer for 5.0 GA for VUM has not been released yet so I do not know if you will be able to use it to upgrade 5.0 Beta or RC to 5.0 GA (stay tuned as I have a ton on hosts running 5.0 RTM so I will test the depot install as soon as I get it!).
For further details about upgrade requirements and such visit the vSphere 5 online documentation here about that very subject. I have had nothing but success using VUM; I’ve used it for ESX 4 to ESXi 5 upgrades as well ESXi 4 toESXi 5 upgrades, even with the guest VM’s *suspended* during the upgrade (I was feeling adventurous). The onlyissue I had was with my Iomega IX-200 (Cloud Edition) that I use for iSCSI shared storage in my lab. I had no issues going from 4.X to 5.0 RTM; the datastores were available as expected after the VUM orchestrated upgrade. This morning though I went from 5.0 RTM to 5.0 GA and my datastores were not available, however the iSCSI connected devices did display.
Devices looks good, but where is my VMDK volume (only the local volume is shown)?
I’ve done a little work with SAN copied VMDK volumes before and as such have had to deal with VMDK volume resignaturing. VMware has a nice KB articlethat explains why resignaturing is needed:
“VMFS3 metadata identifies the volumes by several properties which include the LUN number and the LUN ID (UUID or Serial Number). Because the LUNs now have new UUIDs, the resulting mismatch with the metadata leads to LVM identifying the volumes as snapshots. You must resignature the VMFS3 volumes to make them visible again.“
Ignore the bit that specifies VMFS3; the KB article
hasn’t been updated and it would appear that this issue applies to VMFS5 as well. In a nutshell what is happening is that ESXi sees the “new” datastore as a snapshot and as such does not mount it as a VMDK volume.
Resignaturing the drives is quick and painless although please remember that it will affect every host that connects to the datastore that may not have been upgraded yet and/or is still accessing it. I had brought down the whole lab so I wasn’t concerned about this. The steps are as follows:
- Power off any/all VM’s that may be running on the datastore(s) you will be resignaturing.
- SSH into one of the hosts that currently has access to the datastore(s) that are having problems.
- Execute “vmkfstools -V” from the ESXi console to rescan the volumes on the host. If this fixes the problem then you are all set. Odds are you already did this via the vSphere client so you need to move on to the next step.
- Remove any VM’s from the ESXi inventory that reside on the volume(s) you will resignature.
- Verify that the volumes are seen by the host by executing “esxcfg-mpath -l | less” from the ESXi console.
- From the ESXi console execute “esxcfg-advcfg -s 1 /LVM/EnableResignature“. This will resignature ALLdatastores that were detected as snapshots during the next rescan, so hopefully you remembered to take all the precautions I specified above.
- Repeat step three to initiate the rescan and perform the resignature operation. YOU ARE NOT DONE YET! You should however be able to see the VMDK volumes on the host now at this point (they will have new datastore names that start with “snapshot-“, if not your problem goes beyond the scope of this post.
- Execute “esxcfg-advcfg -s 0 /LVM/EnableResignature” to disable the resignature-during-rescan option. If you fail to do this your datastores will be resignatured during EVERY rescan, which I am fairly certain you do not want.
- Still not done, now execute step 3 again to make sure the volumes stay mounted after the rescan. Assuming that they appeared during step 7 they should still be present after you run another rescan. If they disappear after this step it means you did something wrong in step 8 and the drives were resignatured again. Repeat step 8 again, then this step, and verify that the volumes remain.
- Browse the datastore(s) and re-add all your VM’s to your inventory. You do that by browsing to the VM folder, right clicking on the vmx file within, and selecting “Add to Inventory”.
- Rename the datastores to match what they were before. This is an optional step but if you are like me the datastore names have meaning and they are part of your overall design.
Everything is back to normal in the lab:
I must admit that it is scary when the datastores disappear. Remain calm and remember that during a CD based (boot time) install you don’t have access to the iSCSI or NFS volumes (unlike fiber channel) so you are most likely just a resignature away from fixing your problem. The fix takes less than a couple of minutes and you will be off and running with your new ESXi 5.0 GA install.
Update (10-1-2011): I encountered this issue again after updating the firmware of my Iomega IX2-200; he same fix worked to restore access to my datastore.