How to fix some (common?) VMware View problems

While I have worked at EMC for less than 3 months, I have already created and destroyed about 8000 desktops doing some VMware View and EMC VNX testing. The goal of my work is to validate the performance of View running on the VNX platform and document my findings.

Lately I have been working on some performance tuning that increases the number of desktops that View will configure/refresh/recompose at once. Page 18 of the VMware View 5 Performance and Best Practices document details a couple of changes you can make in ADAM that will speed up these actions. VMware has not responded to my request about the specifics of the values, but based on my experience pae-SVICreationRampFactor increases the number of desktops that View will deploy at once and pae-SVIRebalanceRampFactor affects refreshes and recomposes (as best I can tell). The location of these values within ADAM is displayed in the image below.

image

As VMware states in the Performance and Best Practices document, your vCenter needs to be capable of handling this additional provisioning load, as you will be doubling (and then some) the typical number of operations View performs. While I had no problems with vCenter itself, I did have issues with ESXi 5 hosts being disconnected from vCenter even after increasing the timeout from 30 to 60 seconds. I am going to attempt to gather some ESXTOP data to see if the issue is related to the service console running out of RAM or something else.

The side effect of these hosts being disconnected is that I end up with a number of desktops in a partially configured state, some of which View cannot fix. In some extreme cases you can’t remove these desktops from the View instance through the GUI, which means you must edit the View ADAM instance and View Composer database directly.

The orphaned data you may end up with includes:

  • View desktop and disk information in ADAM
  • View desktop, disk, and outstanding task information in the View Composer database
  • Computer accounts in Active Directory

Removing this information is important: it can affect your ability to deploy desktops with the affected names again, and stuck View Composer tasks can drag the performance of Composer in general to a crawl (trust me, I’ve seen it).

Thankfully VMware has created a KB article that explains how to search for these orphaned desktops and remove their information. For the purpose of this article I’m going to assume that you have orphaned desktops that you cannot access; the KB article provides additional options that cover situations where you are able to log into the desktop you wish to remove.

Warning: Before you delete entries from either database, make sure you have a current backup of the database and disable provisioning for the pool in View Manager.

Removing the virtual machine from the ADAM database

Find the virtual machine’s GUID stored in ADAM:

  • Log in to the machine hosting your VMware View Connection Server through the VMware Infrastructure Client or Microsoft RDP.
  • Open the ADAM Active Directory Service Interfaces Editor.
  • In a Windows 2003 Server, click Start > Programs > ADAM > ADAM ADSI Edit.
  • In a Windows 2008 Server, click Start > All Programs > Administrative Tools > ADSI Edit.
  • Right-click ADAM ADSI Edit and click Connect to.
  • Ensure that the Select or type a domain or server option is selected and the destination points to localhost.
  • Select Distinguished Name (DN) or naming context and type dc=vdi, dc=vmware, dc=int.
  • Run a query against OU=Servers, DC=vdi, DC=vmware, DC=int with this string: (&(objectClass=pae-VM)(pae-displayname=<Virtual Machine name>))
    • Note: Replace <Virtual Machine name> with the name of the virtual machine you are searching for. You may use * or ? as wildcards to match multiple desktops.
  • Record the cn=<GUID>.
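
If you have a lot of orphaned desktops to look up, it can help to generate the query string rather than type it each time. The following is only a sketch: the function names are my own (hypothetical), and it assumes you will paste the result into ADSI Edit or hand it to whatever LDAP client you prefer. It escapes the characters that have special meaning in an LDAP filter but leaves * alone so it can still act as a wildcard.

```python
# Search base taken from the steps above.
SEARCH_BASE = "OU=Servers,DC=vdi,DC=vmware,DC=int"


def escape_ldap_value(value, keep_wildcards=True):
    """Escape LDAP filter special characters (RFC 4515 style).

    '*' is left unescaped when keep_wildcards is True so it can be used
    for pattern matching; any other character is passed through as-is.
    """
    replacements = {"\\": r"\5c", "(": r"\28", ")": r"\29", "\0": r"\00"}
    if not keep_wildcards:
        replacements["*"] = r"\2a"
    return "".join(replacements.get(ch, ch) for ch in value)


def pae_vm_filter(vm_name):
    """Build the pae-VM query string for a desktop name or pattern."""
    return "(&(objectClass=pae-VM)(pae-displayname={}))".format(
        escape_ldap_value(vm_name)
    )


if __name__ == "__main__":
    # Hypothetical desktop naming pattern, for illustration only.
    print(SEARCH_BASE)
    print(pae_vm_filter("VDI-*"))
```
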

Take a complete backup of the ADAM and Composer databases. For more information, see Performing an end-to-end backup and restore for View Manager 3.x/4.x (1008046).

Delete the pae-VM object from the ADAM database:

  • Open the ADAM Active Directory Service Interfaces Editor.
  • In a Windows 2003 Server, click Start > Programs > ADAM > ADAM ADSI Edit.
  • In a Windows 2008 Server, click Start > All Programs > Administrative Tools > ADSI Edit.
  • Right-click ADAM ADSI Edit and click Connect to.
  • Choose Distinguished name (DN) or naming context and type dc=vdi, dc=vmware, dc=int.
  • Locate the OU=SERVERS container.
  • Locate the virtual machine’s GUID (recorded above) in the list, which can be sorted in ascending or descending order. Choose Properties and check the pae-DisplayName attribute to verify that the object corresponds to the linked clone virtual machine.
  • Delete the pae-VM object.
  • Note: Check if there are entries under OU=Desktops and OU=Applications in the ADAM database.

Removing the linked clone references from the View Composer database

In View 4.5 and later, use the SviConfig RemoveSviClone command to remove these items:

  • The linked clone database entries from the View Composer database
  • The linked clone machine account from Active Directory
  • The linked clone virtual machine from vCenter Server

Before you remove the linked clone data, make sure that the View Composer service is running. On the View Composer computer, run the SviConfig RemoveSviClone command.

For example: SviConfig -operation=RemoveSviClone -VmName=<VM name> -AdminUser=<local admin user> -AdminPassword=<local admin password> -ServerUrl=<View Composer server URL>

Where:

  • VmName - The name of the virtual machine to remove.
  • AdminUser - The name of the user who is part of the local administrator group. The default value is Administrator.
  • AdminPassword - The password of the administrator used to connect to the View Composer server.
  • ServerUrl - The View Composer server URL. The default value is https://localhost:18443/SviService/v2_0
  • The VmName and AdminPassword parameters are required. AdminUser and ServerUrl are optional.
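
When I have a batch of clones to clean up, I find it easier to build the command from a small helper than to retype it. This is a minimal sketch, not anything VMware ships: the function name is hypothetical, it only uses the four parameters described above, and it assumes SviConfig can be found from the directory you run it in.

```python
def remove_svi_clone_args(vm_name, admin_password, admin_user=None, server_url=None):
    """Return the SviConfig RemoveSviClone command as an argument list.

    VmName and AdminPassword are required; AdminUser and ServerUrl are
    optional and are left out so SviConfig falls back to its defaults.
    """
    if not vm_name or not admin_password:
        raise ValueError("VmName and AdminPassword are required")
    args = ["SviConfig", "-operation=RemoveSviClone", "-VmName=" + vm_name]
    if admin_user:
        args.append("-AdminUser=" + admin_user)
    args.append("-AdminPassword=" + admin_password)
    if server_url:
        args.append("-ServerUrl=" + server_url)
    return args


if __name__ == "__main__":
    # Hypothetical desktop name and password, for illustration only.
    print(" ".join(remove_svi_clone_args("VDI-001", "example-password")))
```

On the View Composer computer the list could be handed to something like subprocess.run(args, check=True), one call per orphaned desktop.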

Note: The location of SviConfig is:

  • In 32-bit servers – <install_drive>\Program Files\VMware\VMware View Composer
  • In 64-bit servers – <install_drive>\Program Files (x86)\VMware\VMware View Composer

In View 4.0.x and earlier, you must manually delete linked-clone data from the View Composer database.

To remove the linked clone references from the View Composer database:

  • Open SQL Manager > Databases > View Composer database > Tables.
  • Open dbo.SVI_VM_NAME table and delete the entire row where the virtual machine is referenced under column NAME.
  • Open dbo.SVI_COMPUTER_NAME table and delete the entire row where the virtual machine is referenced under column NAME.
  • Open dbo.SVI_SIM_CLONE table, find the virtual machine reference under column VM_NAME and note the ID. If you try to delete this row now, it complains about other table dependencies.
  • Open dbo.SVI_SC_PDISK_INFO table and delete the entire row where dbo.SVI_SIM_CLONE ID is referenced under column PARENT_ID.
  • Open dbo.SVI_SC_BASE_DISK_KEYS table and delete the entire row where dbo.SVI_SIM_CLONE ID is referenced under column PARENT_ID.
  • If the linked clone was in the process of being deployed when a problem occurred, there may be additional references to the clone left around in the dbo.SVI_TASK_STATE table and dbo.SVI_REQUEST table:
    • Open dbo.SVI_TASK_STATE table and find the row where dbo.SVI_SIM_CLONE ID is referenced under column SIM_CLONE_ID. Note the REQUEST_ID in that row.
    • Open dbo.SVI_REQUEST table and delete the entire row where dbo.SVI_TASK_STATE REQUEST_ID is referenced under column ID.
    • Delete the entire row from dbo.SVI_TASK_STATE table.
  • In dbo.SVI_SIM_CLONE table, delete the entire row where the virtual machine is referenced.
  • Remove the virtual machine from Active Directory Users and Computers.
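
The order of the deletes above matters, so here is a rough sketch of the same sequence in code. To keep it runnable anywhere it uses an in-memory SQLite database as a stand-in for the real SQL Server database, with a demo schema that contains only the tables and columns named in the steps (the real Composer tables have more columns and live under dbo). The function names are hypothetical; treat this as an illustration of the ordering, not a tool to point at production.

```python
import sqlite3

# Demo schema: only the tables and columns mentioned in the steps above.
DEMO_SCHEMA = """
CREATE TABLE SVI_VM_NAME (NAME TEXT);
CREATE TABLE SVI_COMPUTER_NAME (NAME TEXT);
CREATE TABLE SVI_SIM_CLONE (ID INTEGER PRIMARY KEY, VM_NAME TEXT);
CREATE TABLE SVI_SC_PDISK_INFO (PARENT_ID INTEGER);
CREATE TABLE SVI_SC_BASE_DISK_KEYS (PARENT_ID INTEGER);
CREATE TABLE SVI_TASK_STATE (SIM_CLONE_ID INTEGER, REQUEST_ID INTEGER);
CREATE TABLE SVI_REQUEST (ID INTEGER PRIMARY KEY);
"""


def make_demo_db():
    """Create an empty in-memory stand-in for the Composer database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DEMO_SCHEMA)
    return conn


def remove_clone_rows(conn, vm_name):
    """Delete one clone's rows in the order given in the steps above.

    Returns False if no SVI_SIM_CLONE row exists for vm_name.
    """
    cur = conn.cursor()
    cur.execute("DELETE FROM SVI_VM_NAME WHERE NAME = ?", (vm_name,))
    cur.execute("DELETE FROM SVI_COMPUTER_NAME WHERE NAME = ?", (vm_name,))
    row = cur.execute(
        "SELECT ID FROM SVI_SIM_CLONE WHERE VM_NAME = ?", (vm_name,)
    ).fetchone()
    if row is None:
        conn.commit()
        return False
    clone_id = row[0]
    # Rows that reference the clone ID go first.
    cur.execute("DELETE FROM SVI_SC_PDISK_INFO WHERE PARENT_ID = ?", (clone_id,))
    cur.execute("DELETE FROM SVI_SC_BASE_DISK_KEYS WHERE PARENT_ID = ?", (clone_id,))
    # Leftover task state: note the request IDs, remove the requests,
    # then remove the task state rows themselves.
    request_ids = [
        r[0]
        for r in cur.execute(
            "SELECT REQUEST_ID FROM SVI_TASK_STATE WHERE SIM_CLONE_ID = ?",
            (clone_id,),
        ).fetchall()
    ]
    for request_id in request_ids:
        cur.execute("DELETE FROM SVI_REQUEST WHERE ID = ?", (request_id,))
    cur.execute("DELETE FROM SVI_TASK_STATE WHERE SIM_CLONE_ID = ?", (clone_id,))
    # The clone row itself goes last, once nothing references it.
    cur.execute("DELETE FROM SVI_SIM_CLONE WHERE ID = ?", (clone_id,))
    conn.commit()
    return True
```
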

Deleting the virtual machine from vCenter Server

Note: If you run the SviConfig RemoveSviClone command to remove linked clone data, the virtual machine is removed from vCenter Server. You can skip this task.

To delete the virtual machine from vCenter Server:

  • Log in to vCenter Server using the vSphere Client.
  • Right-click the linked clone virtual machine and click Delete from Disk.

While this process appears difficult, it really isn’t. At the end of the day you have information in about 8 places that you need to remove, and the structure of the View Composer and ADAM databases is rather simple. As I said earlier in the article, leaving this information in place can affect View Composer performance and prevent desktops from being created, as View “sees” the names as still in use. I’ve actually made it a point to check all these database locations when I am done with a test set just to make sure that I have a healthy View instance for my next test.

If you decide to start altering the default View settings to speed up recomposes/refreshes/deployments, I recommend you make sure your ADAM and SQL backups are current and that you watch out for these issues.

- Jason

Issue with ESXi 5 upgrades (via VUM) and Iomega IX2-200 VMDK datastores

Over the last couple of months I have been performing upgrades from ESXi 4.1, to 5.0 Beta, to 5.0 RTM, and earlier today to 5.0 GA. With the exception of the 5.0 GA release, all these upgrades were handled with VMware Update Manager (VUM). I have encountered a few errors along the way and I felt it was worthwhile to share them.

First of all, to use VUM and the ESXi 5.0 GA ISO to upgrade your hosts to 5.0 GA you must be running ESXi 4.0.4.1 or later (per details from VUM), but not any previous release of 5.0. The odd thing is that you can upgrade hosts running a pre-GA release of ESXi 5.0 by booting with the install CD and choosing the “upgrade” option. This preserves all the settings as you would expect it to. The ESXi “depot” installer for 5.0 GA for VUM has not been released yet, so I do not know if you will be able to use it to upgrade 5.0 Beta or RC to 5.0 GA (stay tuned, as I have a ton of hosts running 5.0 RTM and I will test the depot install as soon as I get it!).

For further details about upgrade requirements and such, visit the vSphere 5 online documentation here about that very subject. I have had nothing but success using VUM; I’ve used it for ESX 4 to ESXi 5 upgrades as well as ESXi 4 to ESXi 5 upgrades, even with the guest VMs *suspended* during the upgrade (I was feeling adventurous). The only issue I had was with my Iomega IX2-200 (Cloud Edition) that I use for iSCSI shared storage in my lab. I had no issues going from 4.X to 5.0 RTM; the datastores were available as expected after the VUM orchestrated upgrade. This morning, though, I went from 5.0 RTM to 5.0 GA and my datastores were not available, although the iSCSI connected devices did display.

Device View:

Datastore View:

Devices look good, but where is my VMDK volume (only the local volume is shown)?

I’ve done a little work with SAN copied VMDK volumes before and as such have had to deal with VMDK volume resignaturing. VMware has a nice KB article that explains why resignaturing is needed:

VMFS3 metadata identifies the volumes by several properties which include the LUN number and the LUN ID (UUID or Serial Number). Because the LUNs now have new UUIDs, the resulting mismatch with the metadata leads to LVM identifying the volumes as snapshots. You must resignature the VMFS3 volumes to make them visible again.

Ignore the bit that specifies VMFS3; the KB article hasn’t been updated and it would appear that this issue applies to VMFS5 as well. In a nutshell, what is happening is that ESXi sees the “new” datastore as a snapshot and as such does not mount it as a VMDK volume.

Resignaturing the drives is quick and painless, although please remember that it will affect every host that connects to the datastore, including hosts that have not been upgraded yet and/or are still accessing it. I had brought down the whole lab so I wasn’t concerned about this. The steps are as follows:

  1. Power off any/all VMs that may be running on the datastore(s) you will be resignaturing.
  2. SSH into one of the hosts that currently has access to the datastore(s) that are having problems.
  3. Execute “vmkfstools -V” from the ESXi console to rescan the volumes on the host. If this fixes the problem then you are all set. Odds are you already did this via the vSphere client so you need to move on to the next step.
  4. Remove any VMs from the ESXi inventory that reside on the volume(s) you will resignature.
  5. Verify that the volumes are seen by the host by executing “esxcfg-mpath -l | less” from the ESXi console.
  6. From the ESXi console execute “esxcfg-advcfg -s 1 /LVM/EnableResignature”. This will resignature ALL datastores that were detected as snapshots during the next rescan, so hopefully you remembered to take all the precautions I specified above.
  7. Repeat step 3 to initiate the rescan and perform the resignature operation. YOU ARE NOT DONE YET! You should, however, be able to see the VMDK volumes on the host at this point (they will have new datastore names that start with “snap-”). If not, your problem goes beyond the scope of this post.
  8. Execute “esxcfg-advcfg -s 0 /LVM/EnableResignature” to disable the resignature-during-rescan option. If you fail to do this your datastores will be resignatured during EVERY rescan, which I am fairly certain you do not want.
  9. Still not done, now execute step 3 again to make sure the volumes stay mounted after the rescan. Assuming that they appeared during step 7 they should still be present after you run another rescan. If they disappear after this step it means you did something wrong in step 8 and the drives were resignatured again. Repeat step 8 again, then this step, and verify that the volumes remain.
  10. Browse the datastore(s) and re-add all your VMs to your inventory. You do that by browsing to the VM folder, right clicking on the vmx file within, and selecting “Add to Inventory”.
  11. Rename the datastores to match what they were before. This is an optional step but if you are like me the datastore names have meaning and they are part of your overall design.
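
The part of this procedure that is easiest to get wrong is forgetting to turn EnableResignature back off. Here is a small sketch of steps 5 through 9 that makes the ordering explicit. It only uses the three commands from the steps above; the function names are my own (hypothetical), and how each command actually gets executed (SSH session, local console, or just printed for review) is left to the caller through the run argument.

```python
RESCAN = ["vmkfstools", "-V"]


def set_resignature(enabled):
    """Command that turns the resignature-during-rescan option on or off."""
    return ["esxcfg-advcfg", "-s", "1" if enabled else "0", "/LVM/EnableResignature"]


def resignature_snapshot_volumes(run):
    """Run steps 5-9 in order.

    run is any callable that takes an argument list and executes it on the
    ESXi host. The option is switched back off even if the rescan fails.
    """
    run(["esxcfg-mpath", "-l"])      # step 5: confirm the devices are visible
    run(set_resignature(True))       # step 6: enable resignaturing
    try:
        run(list(RESCAN))            # step 7: rescan, which resignatures
    finally:
        run(set_resignature(False))  # step 8: always disable it again
    run(list(RESCAN))                # step 9: confirm the volumes stay mounted


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    resignature_snapshot_volumes(lambda argv: print(" ".join(argv)))
```

Running the file as-is only prints the five commands in order, which is a handy checklist to paste next to your SSH session.
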

Everything is back to normal in the lab:

I must admit that it is scary when the datastores disappear. Remain calm and remember that during a CD based (boot time) install you don’t have access to the iSCSI or NFS volumes (unlike Fibre Channel), so you are most likely just a resignature away from fixing your problem. The fix takes less than a couple of minutes and you will be off and running with your new ESXi 5.0 GA install.

Update (10-1-2011): I encountered this issue again after updating the firmware of my Iomega IX2-200; the same fix worked to restore access to my datastore.

- Jason