
How and why to replace the default VMware View Composer SSL certificate – Part 2

In Part 1 we went through all the steps needed to generate a new SSL certificate for View Composer. We were left with a file titled rui.pfx, which we need to import into our View Composer certificate store.

Step 1 – Import the certificate to the local certificate store

Open an MMC console, then from the File menu choose Add/Remove Snap-in and add the Certificates snap-in.

image

We need to manage the Computer account:

image

For the Local computer:

image

Click OK once you have added the snap-in.

Expand Personal – Certificates. You’ll see the default Composer SSL certificate there.

image

Right click on the Certificates folder and select All Tasks – Import.

image

Go through the wizard, selecting the rui.pfx file we previously copied to the server. You’ll need to change the file type filter in the browse dialog to Personal Information Exchange to see the file.

image

Click Next to move through the wizard.

The next decision is yours. If you mark the certificate as exportable you open up a potential security risk, as someone could come along and grab a full copy of the certificate. You already have a copy of the PFX file (which you will protect, right?), so let’s leave the settings at the default. Fill in the password we selected when generating the PFX file (testpassword) and click Next.

image

The destination store should already be what we want since we selected it at the beginning. If not, select Personal as shown and click Next, then Finish. You will get a dialog box indicating that the action was successful.

image

Step 2 – Activate the certificate

On the View Management Console dashboard, note that our current View Composer certificate is untrusted but accepted (I accepted it during the initial configuration, prior to replacing the certificate):

image

Stop the VMware View Composer service.

From the command line, change into the View Composer install directory. It should be Program Files (x86)\VMware\VMware View Composer.

Execute the command:

SviConfig.exe -operation=replacecertificate -delete=false

The delete=false leaves the default SSL certificate in place, so you can switch to it if you want.

Select the certificate you wish to activate. It should be obvious since it has the details you entered when generating the certificate request. We want certificate 1; press Enter to bind the certificate.

image

You should get confirmation:

image

Start the View Composer service. Check the Composer Server event logs for any issues, but assuming that you followed the directions as indicated (known valid for View 5.1), Composer should be working as expected.

Go back to the View dashboard, hit refresh, and click on the View Composer Server again. The SSL Certificate should now show as valid.

image

You now have a trusted certificate on your View Composer Server, and a usable backup of the Composer Server SSL certificate (with private key).

How and why to replace the default VMware View Composer SSL certificate – Part 1

Let’s pretend for a moment that you do not back up full virtual machines, and prefer a more minimal approach to disaster recovery. VMware View Composer is a good candidate for this, as all you really need to restore it is the Composer database (which is likely hosted elsewhere), and the RSA keystore data OR an SSL certificate that can be exported with the private key. The default View Composer (self-signed) SSL certificate cannot be exported, so you must export the RSA keystore using the process outlined here if you wish to be able to restore Composer on a new server and continue to use the existing Composer database.

You restore the RSA keystore on the new host server prior to re-installing View Composer. Composer will install a new self-signed certificate, but since the keystore data is there the existing Composer database can still be used. While that works fine, I’m more interested in using a certificate that was signed by an internal CA, as it will be trusted by the View Connection Server AND I have full control over the certificate.

Problem number 1 is that you cannot export the default Composer SSL certificate. Unlike the vCenter certificates, the View Composer certificate is stored within the Windows certificate store and is not marked as exportable. This is as good a reason as any to replace this certificate immediately after installing View Composer, or ideally to install the replacement before Composer is even installed, as you can then select it during the installation process.

Note: The procedure is the same regardless of whether or not you are using a dedicated View Composer Server OR if you installed View Composer directly on your vCenter Server.

Here sits the default View Composer SSL certificate, valid for 2 years:

image

The tools we need to perform this task include:

  • A trusted certificate authority. I’ll be using a Microsoft Root CA since that is automatically trusted by the members of my lab domain.
  • An installed and configured copy of OpenSSL for Windows, which you can get from: http://slproweb.com/products/Win32OpenSSL.html
    • Note: Don’t forget to install the Visual C++ 2008 Redistributables first.
    • Note 2: Have a Linux box? OpenSSL was ported from Linux so you can do all of this from Linux if you want.
    • Note 3: Yes you could do this directly through the Microsoft CA tool, but to do that you need a custom certificate request template that allows you to export the certificate keys (which is something I wanted to do). Learning to use OpenSSL is helpful as you can use it to generate the PFX file used to replace your vCenter and View Connection Server SSL certificates (something I will do posts on later).
How to create the certificate request

Edit the default openssl.cfg file. Assuming you performed a default installation, this file is located in C:\OpenSSL-Win32\bin.

Scroll down to the [ req ] section and remove the # to uncomment this line (if the line isn’t present, just add it at the end of the section):

req_extensions = v3_req

Scroll down to the [ v3_req ] section and add a new subjectAltName line listing the required DNS names of the View Composer server, including the FQDN:

subjectAltName = "DNS:viewcomp-01.vjason.local,DNS:viewcomp-01"
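If the Composer server answers to more than one name, the subjectAltName value can be assembled from a list rather than typed by hand. This is only a minimal Python sketch; the helper name build_san_line is hypothetical, and the host names are the ones used in this post.

```python
def build_san_line(dns_names):
    """Return an openssl.cfg subjectAltName line for the given DNS names."""
    if not dns_names:
        raise ValueError("at least one DNS name is required")
    # Each entry is prefixed with "DNS:" and the entries are comma separated.
    entries = ",".join(f"DNS:{name.strip()}" for name in dns_names)
    return f'subjectAltName = "{entries}"'


if __name__ == "__main__":
    # Paste the printed line into the [ v3_req ] section of openssl.cfg.
    print(build_san_line(["viewcomp-01.vjason.local", "viewcomp-01"]))
```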

OpenSSL was ported from Linux and still expects the config file to be in a specific Linux path. Due to this, you need to add an environment variable pointing to the openssl.cfg file. From the Windows command prompt enter the command:

set OPENSSL_CONF=C:\OpenSSL-Win32\bin\openssl.cfg

Enter the following command to generate your SSL key:

openssl.exe genrsa 2048 > rui.key

Assuming that you configured everything as expected, you should see output similar to this:

Loading 'screen' into random state - done
Generating RSA private key, 2048 bit long modulus
.........................................+++
..................................+++
e is 65537 (0x10001)

Enter the following command to generate your certificate request:

openssl req -new -nodes -out rui.csr -keyout rui.key

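You should see output similar to the following. Note that because -keyout is specified, this command generates a new private key at the default size from openssl.cfg (1024 bits here) and overwrites the rui.key created in the previous step; to reuse the 2048-bit key instead, replace -keyout rui.key with -key rui.key: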

Loading 'screen' into random state - done
Generating a 1024 bit RSA private key
............................++++++
............++++++
writing new private key to 'rui.key'
-----

Followed by the familiar cert request questions. All I am really interested in is the Common Name question; make sure you enter the FQDN for your Composer server:

You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:NC
Locality Name (eg, city) []:Research Triangle Park
Organization Name (eg, company) [Internet Widgits Pty Ltd]:vJason.com
Organizational Unit Name (eg, section) []:Lab
Common Name (e.g. server FQDN or YOUR name) []:viewcomp-01.vjason.local
Email Address []:jason@vjason.com
Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:Password123
An optional company name []:

Once complete, you will have a certificate request file named rui.csr.

Generating the new certificate

Open the Microsoft CA web interface and select Request a certificate

image

Then select advanced certificate request.

image

We will be pasting in the contents of the rui.csr file, so we need to select Submit a certificate request….. (I cut off the link; this is one of two options):

image

Paste the full contents of rui.csr in the Saved Request window and change the Certificate Template to Web Server. Click Submit when complete.

image

Answer Yes here to complete the request.

image

We need to export the certificate with the private key rather than download it, so you can close the Certificate Services web interface at this point.

Approve the certificate request (if required) in your Microsoft Root CA MMC console. I don’t have a screenshot of a pending request, but you would see it here:

image

Once approved, we need to export the certificate. Open the Issued Certificates folder and double click on the certificate you wish to export. In my lab it was the last certificate issued:

image

Click on the Details tab of the Certificate window and press the Copy to File button.

image

Click Next in the first window in the Certificate Export Wizard, then select the Base-64 encoded X.509 (.CER) radio button, and finally click Next.

image

Save the file as rui.cer and copy it into the C:\OpenSSL-Win32\bin directory on the server with OpenSSL installed. Rename the file to rui.crt, then run the following command (Note: I used testpassword as that is what VMware uses with vCenter PFX files. For View you can use anything, but remember it since you’ll need it later):

openssl pkcs12 -export -in rui.crt -inkey rui.key -name rui -passout pass:testpassword -out rui.pfx

You should see output similar to this when done:

Loading 'screen' into random state - done
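For anyone who wants to script the OpenSSL side of this procedure, the three commands can be expressed as argument lists and handed to subprocess. This is a rough sketch rather than a tested tool: the function names are hypothetical, it assumes openssl is on the PATH with OPENSSL_CONF already set, and it deliberately uses -key instead of the -keyout option shown earlier so that the request is built from the existing 2048-bit key. The req step still prompts for the certificate details, and the pkcs12 step can only run after the CA has issued rui.crt.

```python
import subprocess


def key_command(key_file="rui.key", bits=2048):
    # Generate the RSA private key (same result as "genrsa 2048 > rui.key").
    return ["openssl", "genrsa", "-out", key_file, str(bits)]


def csr_command(key_file="rui.key", csr_file="rui.csr"):
    # -key reuses the existing private key; -keyout would write a new one.
    return ["openssl", "req", "-new", "-key", key_file, "-out", csr_file]


def pfx_command(password, cert_file="rui.crt", key_file="rui.key",
                pfx_file="rui.pfx", name="rui"):
    # Bundle the signed certificate and the private key into a PKCS#12 file.
    return ["openssl", "pkcs12", "-export", "-in", cert_file,
            "-inkey", key_file, "-name", name,
            "-passout", f"pass:{password}", "-out", pfx_file]


def run(command):
    # Raises CalledProcessError if openssl exits with a non-zero status.
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands for review; call run(cmd) to actually execute one.
    for cmd in (key_command(), csr_command(), pfx_command("testpassword")):
        print(" ".join(cmd))
```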

Copy the rui.pfx file to the server that is hosting View Composer. You could install it remotely, but there are parts of this procedure that require local server access so it is just easier to copy it directly to the server.

Note: That rui.pfx file is what you want to back up. In the event you need to rebuild your View Composer Server, you will want to install that certificate first (see the second half of this post for how) and select it during the installation of View Composer.

This concludes the first half of this post. The second half will be much shorter and very simple by comparison, I promise.

VMware View 5.1 Storage Accelerator in Action

Earlier today VMware formally announced the (almost) release of VMware View 5.1. Many assumed that View 5.1 would support vSphere 5 Content Based Read Cache (also known as CBRC); they were correct. For those who have been living under a rock, vSphere 5 has the ability to cache bits of a virtual machine in RAM, where latency is measured in nanoseconds and not milliseconds. This is of particular benefit for linked clone virtual machines, where under View 5.x you have up to 1000 clones linked to a single image. Note: CBRC is referred to within VMware View as “VMware View Storage Accelerator”; this is the official term now that View 5.1 has been released.

Andre Leibovici of VMware has had a series of blog posts all about CBRC. Rather than plagiarize all his hard work, I’m going to recommend you visit his site if you want a technical introduction into how CBRC works.

Earlier this week I finished up some testing that shows exactly what CBRC does. The following graphs show IO reduction for three specific scenarios: a 2000 desktop boot storm, a logon storm, and an on-demand virus scan storm. View 5.1 allows you to enable caching for either the master replica image OR the master replica image and the persistent disks. My hosts in this case were rather close to overcommitting memory, so I chose to cache the master replica image only to minimize the amount of RAM used for the cache. Read this to understand how much RAM you will need based on your own settings.

I chose these graphs because they show the greatest benefit from CBRC. Remember that much of the EUC storage workload is writes, so I’m looking for read-heavy scenarios in order to find out what CBRC can really do.

The stats are all reads of the master replica image measured using ESXTOP. Storage stats are interesting enough, but the truth is if you see fewer reads at the ESXi host you will see fewer reads within your storage environment.

The results

So that you have some perspective, you are looking at results for an ESXi host that is running 143 Windows 7 desktops. I was actually testing 2000 desktops at once, but for simplicity’s sake I am showing the results for only one of my hosts.

image

Do I really need to explain this? Yes, there was still a (small) read spike at the 2 minute mark even with CBRC enabled, but even with that you are looking at over a 95% reduction in reads (red line) to the replica image. Although vSphere uses at most 2 GB of RAM for CBRC, that is enough here because the working set (the data that is actually read) of the master replica image is rather small during boot up.

image

This is a 90 minute logon storm. Again, the benefits of CBRC during this window are obvious. With CBRC enabled (blue line) the reads to the replica image were again reduced by over 95% on average. This would be of great benefit in environments where logon storms were a frequent occurrence.

image

Let me preface this graph by saying that you really should be using antivirus solutions that are optimized for EUC environments. This includes vShield Endpoint (with McAfee or Trend Micro plugins) or even McAfee MOVE. I’ve tested them all, and they are a huge improvement over traditional client-based AV tools. Now that I’ve gotten that out of the way, you are looking at an AV scan storm that used the McAfee command line AV client. Each AV session was initiated one right after another, a process which takes about 5-7 seconds per desktop. In this case not only was IO reduced by over 70% (blue line), with CBRC enabled the scans finished in less than a third of the time of the “no CBRC” test. AV scan storms are among the most “stressful” storage tests I do, and CBRC enabled amazing results.
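The reduction percentages quoted for these graphs are presumably simple ratios of the average reads with and without CBRC. As a small illustration of that arithmetic, here is a Python sketch; the function name is hypothetical and the sample numbers are invented for the example, not taken from my tests.

```python
def percent_reduction(baseline, observed):
    """Percentage by which `observed` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - observed) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical averages, NOT measurements from the tests in this post.
    reads_without_cbrc = 1000.0
    reads_with_cbrc = 40.0
    reduction = percent_reduction(reads_without_cbrc, reads_with_cbrc)
    print(f"{reduction:.1f}% fewer reads with CBRC enabled")
```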

The question is: does CBRC change my storage requirements? My opinion: in most cases, not really. If I were to show you steady state IO during a Login VSI user simulation test you would see maybe a percent or two reduction in IO to the replica master image, which means that you really can’t adjust your core storage design. I consider CBRC a safety valve that helps you maintain desktop performance during those periods of load that may otherwise affect desktop performance. Given that only a few GB of RAM are required, you may find CBRC a no-brainer. As always, test in a lab or with a small pilot first before deploying into production.

- Jason

Antivirus for VDI – McAfee MOVE

Antivirus for virtual desktops is not a fun topic, especially when you are trying to shoehorn as many virtual desktops per CPU core as you can onto a server. Snark from Mac users aside, just about every antivirus platform out there will impact the performance of your workstation in some way, usually CPU, RAM, or disk related.

Before I start, yes I know that there are alternatives to McAfee MOVE. McAfee MOVE just happens to be the one I tested since I have access to it and years of experience with ePolicy Orchestrator and VirusScan.

The McAfee MOVE Antivirus solution consists of multiple components, each of which plays a different role in the overall solution:

  • McAfee ePolicy Orchestrator Server (ePO) 4.6 – Enables centralized management of the McAfee software products that comprise the MOVE solution. ePO can be installed on Windows Server 2003 SP2 or newer servers, and McAfee recommends using a dedicated server when managing more than 250 clients.
  • McAfee MOVE Antivirus Offload Server – The MOVE Antivirus Offload Server manages the scanning of files from the virtual desktop environment. McAfee VirusScan 8.8 is installed on the MOVE server and is responsible for performing actual virus scans. The number of MOVE servers required is dependent on the aggregate number of CPU cores present in the hypervisors that host the virtual desktops; the actual sizing requirements are discussed later in this post. The McAfee MOVE server requires Windows Server 2008 SP2 or Windows Server 2008R2 SP1.
  • McAfee MOVE Antivirus Agent – The McAfee MOVE Agent is preinstalled on the virtual desktop master image and is responsible for enforcing the antivirus scanning policies as configured within McAfee ePolicy Orchestrator. The agent communicates with the MOVE Antivirus Server to determine if and how a file will be scanned based on the ePO policies. The McAfee MOVE Antivirus Agent supports Windows XP SP3, Windows 7, and Windows Server versions 2003 R2 SP2 and newer.
  • McAfee VirusScan 8.8 – VirusScan 8.8 is an antivirus software package used for traditional host-based virus scanning. It is installed on the McAfee MOVE Antivirus Offload server as well as the other servers that comprise the VMware View test environment.
  • McAfee ePolicy Orchestrator (ePO) Agent – The McAfee ePO agent is used to manage a number of different McAfee products. In the case of this solution, ePO is being used to manage servers and desktops running either the McAfee MOVE Antivirus Agent or McAfee VirusScan 8.8. The ePO agent communicates with the ePO server for management, reporting, and McAfee software deployment tasks. The McAfee ePO agent is preinstalled on the virtual desktop master image.

How MOVE Works

The benefit of the McAfee MOVE solution is that it offloads the scanning of files to a dedicated server, the MOVE Antivirus Offload Server. The MOVE Offload Server maintains a cache of what files have been scanned, eliminating the need to scan the files again regardless of what virtual desktop client makes the request. This differs from traditional host-based antivirus solutions which may maintain a similar cache of scanned files, but only for the benefit of the individual host and not other hosts. I created the below diagram to explain how the different components of the McAfee MOVE solution interact with one another.

image

McAfee MOVE architecture

The virtual desktop client runs the McAfee MOVE client and the ePO agent. The ePO agent enables remote management of the MOVE client by the ePO server, while the MOVE agent is responsible for identifying files that need to be scanned and requesting the scan from the MOVE Antivirus Offload Server.

The McAfee MOVE Antivirus Offload Server runs the MOVE Server software, VirusScan 8.8, and the ePO agent. The MOVE Antivirus Offload Server is responsible for answering file scanning requests from the MOVE clients, determining if the file has been scanned before, and performing the virus scan operations if required. The ePO agent is used for remote management of the VirusScan 8.8 antivirus platform.

The ePO server runs the ePolicy Orchestrator software, which is the management platform for the components that comprise the McAfee MOVE solution. The policies configured within ePO control the parameters within which MOVE operates, both in terms of the configuration of the product itself and policies that govern how and when files are scanned.
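To make the benefit of a shared cache concrete, here is a toy Python model of an offload server that remembers scan verdicts for every desktop. It is only a sketch of the idea described above: the class and function names are hypothetical, keying the cache on a SHA-256 hash of the file content is my assumption rather than a documented detail of MOVE, and the scanner is a stand-in.

```python
import hashlib


class OffloadScanCache:
    """Toy model of a scan-result cache shared by every desktop."""

    def __init__(self, scanner):
        self._scanner = scanner      # callable: bytes -> bool (True = clean)
        self._results = {}           # content hash -> cached verdict
        self.scans_performed = 0

    def check(self, content):
        # Key on a hash of the content so identical files on different
        # desktops map to the same cache entry (an assumption of this sketch).
        key = hashlib.sha256(content).hexdigest()
        if key not in self._results:
            self._results[key] = self._scanner(content)
            self.scans_performed += 1
        return self._results[key]


def fake_scanner(content):
    # Stand-in for the real antivirus engine; flags a marker string.
    return b"EICAR" not in content


if __name__ == "__main__":
    cache = OffloadScanCache(fake_scanner)
    for desktop in ("desktop-001", "desktop-002", "desktop-003"):
        verdict = cache.check(b"contents of a file on the master image")
        print(desktop, "clean" if verdict else "infected")
    print("scans performed:", cache.scans_performed)
```

With a traditional per-host cache, each desktop would pay for its own first scan of the same file; in this model the cost is paid once for the whole pool.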

McAfee MOVE Sizing

One concern when deploying McAfee MOVE is the number of MOVE Antivirus Offload Servers that will be required. The number of servers required is dependent on the aggregate number of CPU cores, including hyper-threading, present in the hypervisors that host the virtual desktops. McAfee recommends a specific configuration for each MOVE Antivirus Offload Server:

  • Windows Server 2008 SP2 or Windows Server 2008R2 SP1
  • 4 vCPUs
  • 4 GB of RAM

McAfee recommends leveraging Microsoft network load balancing (NLB) services to distribute the scanning workload across the MOVE Antivirus Offload Servers. NLB enables the creation of a single virtual IP that is used in place of the dedicated IP’s associated with the individual MOVE servers. This single IP distributes traffic to multiple McAfee MOVE servers based on the NLB settings and whether or not the server can be reached. The process for configuring Microsoft Windows NLB for Windows Server 2008 (and newer) is described in the Microsoft TechNet article Network Load Balancing Deployment Guide.

The McAfee MOVE Antivirus 2.0 Deployment Guide recommends one MOVE Antivirus Offload Server for every 40 vCPUs in the hypervisor cluster, including those created by enabling CPU hyper-threading. If the MOVE Antivirus Offload Servers will be installed on the same hypervisors that host the virtual desktops, ten percent of the vCPUs within the hypervisor cluster must be allocated for their use. This means that the hypervisors that host the MOVE Antivirus Offload Servers will be able to host fewer virtual desktops than may have otherwise been planned for. A minimum of two MOVE Antivirus Offload Servers is recommended at all times for redundancy, regardless of whether or not the sizing calculations for the hypervisor cluster require it. The table below details how the number of MOVE Antivirus Offload Servers required increases as the number of vCPUs in the hypervisor cluster increases:

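Hypervisors per cluster | Cores per cluster | vCPU per cluster (hyper-threading) | vCPU required for offload scan servers (10% of vCPU) | MOVE Offload Servers required
------------------------|-------------------|------------------------------------|------------------------------------------------------|------------------------------
2                       | 16                | 32                                 | 3.2                                                  | 2
8                       | 64                | 128                                | 12                                                   | 3
10                      | 80                | 160                                | 16                                                   | 4
20                      | 160               | 320                                | 32                                                   | 8
35                      | 280               | 560                                | 56                                                   | 14

MOVE Offload Server sizing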

These figures should be applied on a per-hypervisor cluster basis; if more clusters are created additional McAfee MOVE Antivirus Offload Servers should be deployed and dedicated to the new cluster.
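The sizing rule can be turned into a quick calculation. The Python sketch below is my own reading of the guidance, not a McAfee tool: the function name is hypothetical, and it assumes fractional servers are rounded down, which is what the 128 vCPU row of the table implies (round up instead if you prefer a more conservative design). Note that ten percent of 128 is 12.8, where the table lists 12.

```python
VCPUS_PER_OFFLOAD_SERVER_RATIO = 40  # one offload server per 40 cluster vCPUs
MINIMUM_SERVERS = 2                  # redundancy floor, regardless of size


def offload_servers_required(cluster_vcpus):
    """Return (vCPUs reserved for scanning, number of offload servers)."""
    reserved_vcpus = cluster_vcpus / 10  # ten percent of the cluster vCPUs
    # Assumption: fractional servers are rounded down, matching the table.
    servers = cluster_vcpus // VCPUS_PER_OFFLOAD_SERVER_RATIO
    return reserved_vcpus, max(MINIMUM_SERVERS, servers)


if __name__ == "__main__":
    for vcpus in (32, 128, 160, 320, 560):
        reserved, servers = offload_servers_required(vcpus)
        print(f"{vcpus} vCPUs: reserve {reserved:g} vCPUs, {servers} offload servers")
```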

Installing McAfee MOVE

The MOVE Agent and ePO agents are installed on the master desktop image prior to the deployment of the virtual desktops. Both components can be installed after the virtual desktops have been deployed, although the impact this will have on the growth of linked clone persistent disks (if applicable) should be considered.

Once the installation of the MOVE and ePO agents has been completed on the virtual desktop master image, additional steps are required to prepare the image for deployment. The following steps should be performed prior to any redeployment of the virtual desktop master image, or if the McAfee Framework service has been started prior to the shutdown of the virtual desktop in preparation for deployment:

  1. Stop the McAfee Framework service.
  2. Delete the AgentGUID registry value from the key that corresponds to the virtual desktop operating system:
    1. 32-bit Windows operating systems: HKEY_LOCAL_MACHINE\SOFTWARE\Network Associates\ePolicy Orchestrator\Agent
    2. 64-bit Windows operating systems: HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Network Associates\ePolicy Orchestrator\Agent
  3. Power down the workstation and deploy as necessary.

The next time the agent service is started the virtual desktop will generate a new AgentGUID value which will ensure it is able to be managed by McAfee ePolicy Orchestrator.
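The registry step is easy to forget when preparing a master image, so it may be worth scripting. The following Python sketch uses the standard winreg module and is an illustration only: the function names are hypothetical, it assumes the McAfee Framework service has already been stopped, and it needs to run from an elevated prompt on the master image.

```python
import platform
import sys

AGENT_KEY_32 = r"SOFTWARE\Network Associates\ePolicy Orchestrator\Agent"
AGENT_KEY_64 = r"SOFTWARE\Wow6432Node\Network Associates\ePolicy Orchestrator\Agent"


def agent_key_path(is_64bit_windows):
    """Pick the key (under HKEY_LOCAL_MACHINE) that holds AgentGUID."""
    return AGENT_KEY_64 if is_64bit_windows else AGENT_KEY_32


def delete_agent_guid(is_64bit_windows):
    """Delete AgentGUID; returns False if the key or value was absent."""
    import winreg  # Windows-only module, imported here so the file loads anywhere

    path = agent_key_path(is_64bit_windows)
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path, 0,
                            winreg.KEY_SET_VALUE) as key:
            winreg.DeleteValue(key, "AgentGUID")
    except FileNotFoundError:
        return False
    return True


if __name__ == "__main__" and sys.platform == "win32":
    # Run after stopping the McAfee Framework service, before shutdown.
    removed = delete_agent_guid(platform.machine().endswith("64"))
    print("AgentGUID removed" if removed else "AgentGUID was not present")
```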

VMware DRS Rules – MOVE Offload Servers

McAfee recommends that the VMware Distributed Resource Scheduler (DRS) be disabled for the virtual MOVE Antivirus Offload Server guests as scanning activities would be interrupted if a DRS-initiated vMotion were to occur. To accomplish this but still leave DRS enabled for the virtual desktops, a DRS rule was created for each MOVE Antivirus Offload Server that binds the server to a specific hypervisor. To create the DRS rules you must first create virtual machine and host DRS groups; the image below shows the DRS groups as they appear in the DRS Groups Manager tab after they are created. In order to bind a specific virtual server to a specific hypervisor you must create an individual DRS group for each hypervisor and each virtual server. These rules and groups are created on a per-cluster basis.

image

DRS Groups Manager – DRS Rules

Once the DRS groups have been configured you can then create the DRS rules that will bind the MOVE Antivirus Offload Servers to a specific hypervisor. The image below displays a completed DRS rule that binds VDI-MOVE-01, a MOVE Antivirus Offload Server, to hypervisor vJason1. The option Should run on hosts in group is selected rather than Must run on hosts in group to ensure that VMware High Availability (HA) can power on the MOVE Antivirus Offload Server on another host if an HA event involving the hypervisor hosting it occurs. You must create a DRS rule for each MOVE Antivirus Offload Server within the cluster.

image

DRS Rules
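Because every offload server needs its own VM group, host group, and rule, it helps to lay the plan out before clicking through the vSphere Client. The Python sketch below only models that plan as data; it does not call any vSphere API, the class and naming scheme are hypothetical, and the host name vJason2 is invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrsAffinityRule:
    name: str
    vm_group: str      # DRS group containing one offload server
    host_group: str    # DRS group containing one hypervisor
    mandatory: bool    # False = "Should run on", True = "Must run on"


def plan_rules(placement):
    """Build one VM group, one host group, and one rule per offload server."""
    rules = []
    for vm, host in sorted(placement.items()):
        rules.append(DrsAffinityRule(
            name=f"{vm}-on-{host}",
            vm_group=f"VMs-{vm}",
            host_group=f"Hosts-{host}",
            mandatory=False,  # "Should" keeps HA free to restart the VM elsewhere
        ))
    return rules


if __name__ == "__main__":
    # vJason2 is a hypothetical second host used only for illustration.
    for rule in plan_rules({"VDI-MOVE-01": "vJason1", "VDI-MOVE-02": "vJason2"}):
        kind = "Must" if rule.mandatory else "Should"
        print(f"{rule.name}: {rule.vm_group} {kind} run on {rule.host_group}")
```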

MOVE Antivirus Offload Servers

The MOVE Antivirus Offload Server software and VirusScan 8.8 were deployed on servers running Windows Server 2008R2 SP1. The MOVE Antivirus Offload Servers were added to a Microsoft network load balancing (NLB) cluster, per the recommendations from McAfee. The figure below shows the Network Load Balancing Manager interface for the MOVE Antivirus Offload Server NLB cluster. That cluster contains two member servers, VDI-MOVE-01 and VDI-MOVE-02. The virtual IP for the NLB cluster, 172.16.0.20 in the example provided, is what the MOVE clients will use when contacting the MOVE Antivirus Offload Servers.

image

NLB Cluster containing McAfee MOVE Offload Servers

McAfee ePolicy Orchestrator Configuration

McAfee ePolicy Orchestrator was used to provide a central point of management and reporting for the virtual desktops within the test environment. The figure below shows the System Tree, which provides a hierarchical view of the clients that are being managed by the ePO server.

image

ePO System Tree View

ePO clients are placed into different groups within the system tree based on default placement rules, automated placement rules, or manually by the ePO administrator. For the purpose of the testing, ePO was configured to place the virtual desktop computers in the appropriate group based on what organizational unit (OU) they reside in within Active Directory. The figure below shows the Synchronization Settings for the ePO group Pool A.

image

ePO Group Synchronization Settings

ePO is configured to synchronize the ePO group with the computer accounts found in the organizational unit Pool A, which is located in the parent organizational unit Desktops. The Pool A desktop computer accounts were placed in that organizational unit by VMware View when desktop Pool A was created. The virtual desktops are placed in different groups in case an additional hypervisor cluster is added; a new cluster would use different MOVE Antivirus Offload Servers and require a unique MOVE ePO policy. The image below shows the Assigned Policies tab for the group Pool A. The policies shown are those related to the MOVE client that are assigned to the Pool A ePO group.

image

ePO Assigned Policies for Pool A

ePO policies control the configuration of McAfee products that support ePO, which includes the MOVE agent. To configure the MOVE Agent on the virtual desktops, the policy entries shown in the next two images were configured.

image

MOVE Agent Policy – General Settings

The highlighted value displayed on the policy General tab is the virtual IP address of the MOVE Antivirus Offload Server NLB cluster shown earlier. The IP address must be used; the MOVE Agent does not support the use of DNS names when identifying what MOVE Antivirus Offload Server to use.

The second part of the policy that needed to be updated was the Scan Items tab, which is shown below.

image

MOVE Agent Policy – Scan Items

VMware KB Article 1027713, the VMware technical note Anti-Virus Practices for VMware View, and the McAfee MOVE Antivirus 2.0.0 Deployment Guide contain information about files and processes that should be excluded from antivirus scanning. These recommendations were made because the scanning of these files prevented various aspects of the virtual desktops, including the antivirus software, from working correctly. These recommendations were incorporated into the path and process exclusion settings in the McAfee MOVE agent policy. The list of items excluded from scanning includes:

Processes

  • Pcoip_server_win32.exe
  • UserProfileManager.exe
  • Winlogon.exe
  • Wsnm.exe
  • Wsnm_jms.exe
  • Wssm.exe

Paths

  • McAfee\Common Framework
  • Pagefile.sys
  • %systemroot%\System32\Spool (replace %systemroot% with actual Windows directory)
  • %systemroot%\SoftwareDistribution\Datastore (replace %systemroot% with actual Windows directory)
  • %allusersprofile%\NTUser.pol
  • %systemroot%\system32\GroupPolicy\registry.pol (replace %systemroot% with actual Windows directory)
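Several of the path exclusions above have to be entered with the real Windows directory in place of %systemroot%. If you maintain the list in a script or spreadsheet, the substitution can be done mechanically. This is a small Python sketch with a hypothetical helper name; the default of C:\Windows is an assumption, so pass the actual directory used by your master image.

```python
import re

EXCLUDED_PATHS = [
    r"%systemroot%\System32\Spool",
    r"%systemroot%\SoftwareDistribution\Datastore",
    r"%systemroot%\system32\GroupPolicy\registry.pol",
]


def expand_systemroot(path, windows_dir=r"C:\Windows"):
    """Replace a leading %systemroot% (any case) with the Windows directory."""
    # A function is used as the replacement so backslashes in windows_dir
    # are inserted literally rather than parsed as regex escapes.
    return re.sub(r"^%systemroot%", lambda _match: windows_dir, path,
                  flags=re.IGNORECASE)


if __name__ == "__main__":
    for entry in EXCLUDED_PATHS:
        print(expand_systemroot(entry))
```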

Once the policies are configured and associated with the appropriate system tree group, the clients should begin to report into the ePO server as shown below.

image

ePO – Pool A Systems

The Managed State and Last Communication columns indicate if a client is being managed by ePO and when the last time was that client communicated with the ePO server.

McAfee MOVE – Test Results

The McAfee MOVE solution was tested by deploying desktops both with and without the MOVE Agent installed on the master image. Once the desktops were deployed and the virtual desktops all appeared as “managed” in the ePO console, a popular VDI workload generator was used to simulate a user logon storm and steady state workload. The virtual desktops were logged in sequentially over the course of one hour, and the test workload ran for one full hour after the last desktop was logged in and a steady state user load was achieved. Both tests used identical settings; the only difference was whether or not the MOVE agent was installed on the virtual desktops. Three metrics are displayed: storage processor IOPS, ESXi % Processor Time, and ESXi GAVG.

- Storage Processor IOPS

The graph below provides a comparison of the total number of IOPS of both storage processors observed during the tests. The results of both tests are shown.

image

McAfee MOVE – Storage Processor IOPS Comparison

There was no significant difference between the storage processor IOPS observed during the two tests. There was a small increase in IOPS during the logon storm phase of the test associated with the MOVE Antivirus Offload Server needing to scan a number of files for the first time. By the time that the logon storm had completed the MOVE Antivirus Offload Server had cached the scan results for these files, and scanning was not required again on subsequent desktops. This is evident in the IOPS observed during the steady state phase, which varied by less than two percent.

- ESXi – % Processor Time

The image below displays the average ESXi CPU load that was observed during the tests.

image

McAfee MOVE – ESXi CPU Load

The CPU load results were similar for both tests. A slightly higher CPU load was observed during the first half of the login storm, which can be attributed to the increased antivirus scanning that was occurring during that time period as the antivirus cache was established. As the MOVE Antivirus Offload Server built a cache of files that had been scanned, the number of scans that were required decreased along with the ESXi server CPU load. The CPU load observed during the steady state phase was similar between both tests.

- ESXi – GAVG (disk response time observed at the hypervisor level)

The next figure displays the average ESXi disk response time, also referred to as the GAVG, observed during the tests. The desktops were deployed as linked clones so the response time for the replica LUN and the linked clone LUN are displayed.

image

The disk response times observed during both tests were similar for the replica and linked clone LUNs during both the logon storm and steady state phases of the test.

Results

McAfee MOVE provided file level antivirus protection with very little noticeable impact to the virtual desktop. I expected the performance numbers to stabilize as the MOVE cache warmed up, and based on the metrics provided it is obvious that they did. All in all I was pleased with the performance I saw and I would recommend that anyone interested in antivirus designed for VDI look at MOVE and see if it meets their needs. If you are already using ePO you can have MOVE up and running in less than an afternoon.

The McAfee MOVE agent installed on the virtual desktops required less than 29 MB of space, and the related services utilized approximately 22 MB of memory and no processor time at idle. When compared to the disk, memory, and CPU utilization of the traditional McAfee VirusScan client as observed during my tests, the McAfee MOVE agent used 75 percent less disk space and 60 percent less memory. This does not include the impact of the VirusScan on-access scanner, which was observed utilizing up to 25 percent of CPU time and 220 MB of RAM at random intervals. Since the MOVE agent offloads this activity to the MOVE Antivirus Offload Server, the impact on the desktops is drastically reduced.
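
As a rough back-of-the-envelope check, the percentages above imply a footprint for the traditional VirusScan client. The sketch below simply inverts the stated reductions; the function name is hypothetical, and because the source figures are themselves approximate ("less than 29 MB", "approximately 22 MB"), the results are estimates rather than measured values.

```python
def implied_baseline_mb(reduced_mb: float, percent_less: float) -> float:
    """Return the baseline size implied by a value that is `percent_less` percent smaller."""
    if not 0 <= percent_less < 100:
        raise ValueError("percent_less must be in the range [0, 100)")
    return reduced_mb * 100 / (100 - percent_less)


# MOVE agent figures quoted above: ~29 MB disk (75% less), ~22 MB memory (60% less)
virusscan_disk_mb = implied_baseline_mb(29, 75)
virusscan_memory_mb = implied_baseline_mb(22, 60)

print(f"Implied VirusScan disk footprint: about {virusscan_disk_mb:.0f} MB")
print(f"Implied VirusScan memory footprint: about {virusscan_memory_mb:.0f} MB")
```

Under those assumptions the traditional client works out to roughly 116 MB of disk and 55 MB of memory, before the on-access scanner is counted.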

Whether you look into MOVE or a competing product, it is worth your time to look at “new generation” antivirus solutions for your VDI deployments.

Additional References

VMware

· VMware View Architecture Planning

· VMware View Installation

· VMware View Administration

· VMware View Security

· VMware View Upgrades

· VMware View Integration

· VMware View Windows XP Deployment Guide

· VMware View Optimization Guide for Windows 7

· vSphere Installation and Setup Guide

· Anti-Virus Practices for VMware View

· VMware KB Article 1027713

McAfee

· McAfee MOVE Antivirus 2.0.0 Product Guide

· McAfee MOVE Antivirus 2.0.0 Software Release Notes

· McAfee MOVE Antivirus 2.0.0 Deployment Guide

How to fix some (common?) VMware View problems

While I have worked at EMC for less than 3 months, I have already created and destroyed about 8000 desktops doing some VMware View – EMC VNX testing. The goal of my work is to validate the performance of View running on the VNX platform and document my findings.

Lately I have been working with some performance tuning that increases the number of desktops that View will configure/refresh/recompose at once. Page 18 of the VMware View 5 Performance and Best Practices document details a couple of changes you can make in ADAM that will speed up these actions. VMware has not responded to my request about the specifics of the values, but based on my experience pae-SVICreationRampFactor increases the number of desktops that View will deploy at once and pae-SVIRebalanceRampFactor affects refreshes and recomposes (as best I can tell). The location of these values within ADAM is displayed in the image below.

image

As VMware states in the Performance and Best Practices document, your vCenter needs to be capable of handling this additional provisioning load, as you will be doubling (and then some) the typical number of operations View performs. While I had no problems with vCenter itself, I did have issues with ESXi 5 hosts being disconnected from vCenter even after increasing the timeout from 30 to 60 seconds. I am going to attempt to gather some ESXTOP data to see if the issue is related to the service console running out of RAM or something else.

The side effect of these hosts being disconnected is that I end up with a number of desktops in a partially configured state, some of which View cannot fix. In some extreme cases you can’t remove these desktops from the View instance through the GUI, which means you must edit the View ADAM instance and View Composer database directly.

The orphaned data you may end up with includes:

  • View desktop and disk information in ADAM
  • View desktop, disk, and outstanding task information in the View Composer database
  • Computer accounts in Active Directory

Removing this information is important because it can affect your ability to deploy desktops with the affected names again, and stuck View Composer tasks can drag the performance of Composer in general to a crawl (trust me, I’ve seen it).

Thankfully VMware has created a KB article that explains how to search for these orphaned desktops and remove their information. For the purpose of this article I’m going to assume that you have orphaned desktops that you cannot access; the KB article provides additional options that cover situations where you are able to log into the desktop you wish to remove.

Warning: Before you delete entries from either database, make sure you have a current backup of the database and disable provisioning for the pool in View Manager.

Removing the virtual machine from the ADAM database

Find the virtual machine’s GUID stored in ADAM:

  • Log in to the machine hosting your VMware View Connection Server through the VMware Infrastructure Client or Microsoft RDP.
  • Open the ADAM Active Directory Service Interfaces Editor.
  • In a Windows 2003 Server, click Start > Programs > ADAM > ADAM ADSI Edit.
  • In a Windows 2008 Server, click Start > All Programs > Administrative Tools > ADSI Edit.
  • Right-click ADAM ADSI Edit and click Connect to.
  • Ensure that the Select or type a domain or server option is selected and the destination points to localhost.
  • Select Distinguished Name (DN) or naming context and type dc=vdi, dc=vmware, dc=int.
  • Run a query against OU=Servers, DC=vdi, DC=vmware, DC=int with this string: (&(objectClass=pae-VM)(pae-displayname=<Virtual Machine Name>))
    • Note: Replace <Virtual Machine Name> with the name of the virtual machine you are searching for. You may use * or ? as wildcards to match multiple desktops.
  • Record the cn=<GUID>.
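
If you are cleaning up many desktops, it can help to generate the query string rather than type it each time. The following is a minimal sketch, not something from the KB article: the helper name build_pae_vm_filter is hypothetical, and it only builds the filter text shown above, escaping the characters that are special inside LDAP search filters so that an unusual desktop name cannot break the query.

```python
def build_pae_vm_filter(vm_name: str, allow_wildcards: bool = False) -> str:
    """Build the ADAM query string for a pae-VM object by display name.

    Hypothetical helper: escapes characters that are special in LDAP search
    filters. With allow_wildcards=True, '*' is left alone so it can act as
    a wildcard in the query.
    """
    escapes = {"\\": r"\5c", "(": r"\28", ")": r"\29", "\x00": r"\00"}
    if not allow_wildcards:
        escapes["*"] = r"\2a"
    escaped = "".join(escapes.get(ch, ch) for ch in vm_name)
    return f"(&(objectClass=pae-VM)(pae-displayname={escaped}))"


# A single desktop, then every desktop whose name starts with "VDI-"
print(build_pae_vm_filter("VDI-001"))
print(build_pae_vm_filter("VDI-*", allow_wildcards=True))
```

The output is pasted into the ADSI Edit query exactly as described in the steps above; nothing here talks to ADAM directly.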

Take a complete backup of ADAM and Composer database. For more information, see Performing an end-to-end backup and restore for View Manager 3.x/4.x (1008046).

Delete the pae-VM object from the ADAM database:

  • Open the ADAM Active Directory Service Interfaces Editor.
  • In a Windows 2003 Server, click Start > Programs > ADAM > ADAM ADSI Edit.
  • In a Windows 2008 Server, click Start > All Programs > Administrative Tools > ADSI Edit.
  • Right-click ADAM ADSI Edit and click Connect to.
  • Choose Distinguished name (DN) or naming context and type dc=vdi, dc=vmware, dc=int.
  • Locate the OU=SERVERS container.
  • Locate the corresponding virtual machine’s GUID (from above) in the list which can be sorted in ascending or descending order, choose Properties and check the pae-DisplayName Attribute to verify the corresponding linked clone virtual machine object.
  • Delete the pae-VM object.
  • Note: Check if there are entries under OU=Desktops and OU=Applications in the ADAM database.

Removing the linked clone references from the View Composer database

In View 4.5 and later, use the SviConfig RemoveSviClone command to remove these items:

  • The linked clone database entries from the View Composer database
  • The linked clone machine account from Active Directory
  • The linked clone virtual machine from vCenter Server

Before you remove the linked clone data, make sure that the View Composer service is running. On the View Composer computer, run the SviConfig RemoveSviClone command.

For example: SviConfig -operation=RemoveSviClone -VmName=VM name -AdminUser=the local admin user -AdminPassword=the local admin password -ServerUrl=the View Composer server URL

Where:

  • VmName – The name of the virtual machine to remove.
  • AdminUser – The name of the user who is part of the local administrator group. The default value is Administrator.
  • AdminPassword – The password of the administrator used to connect to the View Composer server.
  • ServerUrl – The View Composer server URL. The default value is https://localhost:18443/SviService/v2_0
  • The VmName and AdminPassword parameters are required. AdminUser and ServerUrl are optional.
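
When there are dozens of orphaned desktops, building the command line by hand gets tedious. Here is a small sketch of how the argument list could be assembled; the function name is hypothetical, and the only flags used are the ones listed above. It does not run SviConfig itself, it just returns the arguments in a form you could print or hand to something like subprocess.run from the View Composer install directory.

```python
from typing import List, Optional


def build_remove_svi_clone_args(
    vm_name: str,
    admin_password: str,
    admin_user: Optional[str] = None,
    server_url: Optional[str] = None,
) -> List[str]:
    """Assemble the SviConfig RemoveSviClone arguments described above.

    VmName and AdminPassword are required; AdminUser and ServerUrl are only
    added when supplied, so SviConfig falls back to its defaults otherwise.
    """
    if not vm_name or not admin_password:
        raise ValueError("vm_name and admin_password are required")
    args = ["SviConfig", "-operation=RemoveSviClone", f"-VmName={vm_name}"]
    if admin_user:
        args.append(f"-AdminUser={admin_user}")
    args.append(f"-AdminPassword={admin_password}")
    if server_url:
        args.append(f"-ServerUrl={server_url}")
    return args


# Example with placeholder values (not real credentials)
print(" ".join(build_remove_svi_clone_args("VDI-001", "examplepassword")))
```

Returning a list instead of one long string avoids quoting problems if a desktop name or password contains spaces.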

Note: The location of SviConfig is:

  • In 32-bit servers – <install_drive>\Program Files\VMware\VMware View Composer
  • In 64-bit servers – <install_drive>\Program Files (x86)\VMware\VMware View Composer

In View 4.0.x and earlier, you must manually delete linked-clone data from the View Composer database.

To remove the linked clone references from the View Composer database:

  • Open SQL Manager > Databases > View Composer database > Tables.
  • Open dbo.SVI_VM_NAME table and delete the entire row where the virtual machine is referenced under column NAME.
  • Open dbo.SVI_COMPUTER_NAME table and delete the entire row where the virtual machine is referenced under column NAME.
  • Open dbo.SVI_SIM_CLONE table, find the virtual machine reference under column VM_NAME and note the ID. If you try to delete this row now, SQL complains about other table dependencies.
  • Open dbo.SVI_SC_PDISK_INFO table and delete the entire row where dbo.SVI_SIM_CLONE ID is referenced under column PARENT_ID.
  • Open dbo.SVI_SC_BASE_DISK_KEYS table and delete the entire row where dbo.SVI_SIM_CLONE ID is referenced under column PARENT_ID.
  • If the linked clone was in the process of being deployed when a problem occurred, there may be additional references to the clone left around in the dbo.SVI_TASK_STATE table and dbo.SVI_REQUEST table:
    • Open dbo.SVI_TASK_STATE table and find the row where dbo.SVI_SIM_CLONE ID is referenced under column SIM_CLONE_ID. Note the REQUEST_ID in that row.
    • Open dbo.SVI_REQUEST table and delete the entire row where dbo.SVI_TASK_STATE REQUEST_ID is referenced under column ID.
    • Delete the entire row from dbo.SVI_TASK_STATE table.
  • In dbo.SVI_SIM_CLONE table, delete the entire row where the virtual machine is referenced.
  • Remove the virtual machine from Active Directory Users and Computers.
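
The order of those deletes matters, so here is a sketch that walks through the same sequence against a stand-in database. Everything about it is an assumption for illustration: it uses an in-memory SQLite database instead of the real SQL Server instance, the tables contain only the columns named in the steps above (without the dbo. prefix), and the function names are hypothetical. The real View Composer schema has more columns and constraints, so treat this as a way to understand the sequence, not as a script to run against production.

```python
import sqlite3

# Simplified stand-in schema: only the columns named in the steps above.
TABLES = (
    "SVI_VM_NAME", "SVI_COMPUTER_NAME", "SVI_SIM_CLONE", "SVI_SC_PDISK_INFO",
    "SVI_SC_BASE_DISK_KEYS", "SVI_REQUEST", "SVI_TASK_STATE",
)
SCHEMA = """
CREATE TABLE SVI_VM_NAME (NAME TEXT);
CREATE TABLE SVI_COMPUTER_NAME (NAME TEXT);
CREATE TABLE SVI_SIM_CLONE (ID INTEGER PRIMARY KEY, VM_NAME TEXT);
CREATE TABLE SVI_SC_PDISK_INFO (PARENT_ID INTEGER);
CREATE TABLE SVI_SC_BASE_DISK_KEYS (PARENT_ID INTEGER);
CREATE TABLE SVI_REQUEST (ID INTEGER PRIMARY KEY);
CREATE TABLE SVI_TASK_STATE (SIM_CLONE_ID INTEGER, REQUEST_ID INTEGER);
"""


def make_demo_db(vm_name: str) -> sqlite3.Connection:
    """Create an in-memory database holding one orphaned clone."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO SVI_VM_NAME VALUES (?)", (vm_name,))
    conn.execute("INSERT INTO SVI_COMPUTER_NAME VALUES (?)", (vm_name,))
    conn.execute("INSERT INTO SVI_SIM_CLONE VALUES (1, ?)", (vm_name,))
    conn.execute("INSERT INTO SVI_SC_PDISK_INFO VALUES (1)")
    conn.execute("INSERT INTO SVI_SC_BASE_DISK_KEYS VALUES (1)")
    conn.execute("INSERT INTO SVI_REQUEST VALUES (10)")
    conn.execute("INSERT INTO SVI_TASK_STATE VALUES (1, 10)")
    conn.commit()
    return conn


def remove_clone_rows(conn: sqlite3.Connection, vm_name: str) -> None:
    """Delete a clone's rows in the same order as the manual steps."""
    cur = conn.cursor()
    cur.execute("DELETE FROM SVI_VM_NAME WHERE NAME = ?", (vm_name,))
    cur.execute("DELETE FROM SVI_COMPUTER_NAME WHERE NAME = ?", (vm_name,))
    row = cur.execute(
        "SELECT ID FROM SVI_SIM_CLONE WHERE VM_NAME = ?", (vm_name,)
    ).fetchone()
    if row is None:  # nothing else references this name
        conn.commit()
        return
    clone_id = row[0]
    # Child rows first; the clone row itself goes last.
    cur.execute("DELETE FROM SVI_SC_PDISK_INFO WHERE PARENT_ID = ?", (clone_id,))
    cur.execute("DELETE FROM SVI_SC_BASE_DISK_KEYS WHERE PARENT_ID = ?", (clone_id,))
    request_ids = cur.execute(
        "SELECT REQUEST_ID FROM SVI_TASK_STATE WHERE SIM_CLONE_ID = ?", (clone_id,)
    ).fetchall()
    for (request_id,) in request_ids:
        cur.execute("DELETE FROM SVI_REQUEST WHERE ID = ?", (request_id,))
    cur.execute("DELETE FROM SVI_TASK_STATE WHERE SIM_CLONE_ID = ?", (clone_id,))
    cur.execute("DELETE FROM SVI_SIM_CLONE WHERE ID = ?", (clone_id,))
    conn.commit()


def remaining_rows(conn: sqlite3.Connection) -> int:
    """Count rows left across all of the stand-in tables."""
    return sum(
        conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        for table in TABLES
    )
```

Calling remove_clone_rows(make_demo_db("VDI-001"), "VDI-001") leaves every stand-in table empty, which is the state you are aiming for in the real database for that desktop.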

Deleting the virtual machine from vCenter Server

Note: If you run the SviConfig RemoveSviClone command to remove linked clone data, the virtual machine is removed from vCenter Server. You can skip this task.

To delete the virtual machine from vCenter Server:

  • Log in to vCenter Server using the vSphere Client.
  • Right-click the linked clone virtual machine and click Delete from Disk.

While this process appears difficult it really isn’t. At the end of the day you have information in about 8 places that you need to remove, and the structure of the View Composer and ADAM databases is rather simple. As I said earlier in the article leaving this information in place can affect View Composer performance and prevent desktops from being created as View “sees” the names still in use. I’ve actually made it a point to check all these database locations when I am done with a test set just to make sure that I have a healthy View instance for my next test.

If you decide to start altering the default View settings to speed up recomposes/refreshes/deployments I recommend you make sure your ADAM and SQL backups are current and that you watch out for these issues.

- Jason

Host Profiles for vSphere customers without Enterprise Plus

Not everyone can justify the costs associated with VMware vSphere Enterprise Plus licenses.

For vSphere 5 the licensing is broken down as follows, with each “higher” license level adding to the features of the previous levels:

  • “Standard” featuring:
    • High Availability
    • Data Recovery
    • vMotion
  • “Enterprise” adds the following features:
    • Virtual Serial Port Concentrator
    • Hot Add
    • vShield Zones
    • Fault Tolerance
    • Storage APIs for Array Integration
    • Storage vMotion
    • Distributed Resource Scheduler
    • Distributed Power Management
  • “Enterprise Plus” adds:
    • Distributed vSwitch
    • Network and Storage I/O Controls
    • Host Profiles
    • Auto Deploy
    • Policy-Driven Storage
    • Storage DRS
Enterprise Plus adds a lot of neat features but can be a little harder to justify when trying to get a project approved. The good news is that you can use vSphere PowerCLI to get some (parts) of those Enterprise Plus “features” for free.
My example for today is host profiles. If you manage an environment with a large number of ESXi hosts the vSphere Enterprise Plus Host Profiles feature is something you should probably have. It will help you streamline host deployment and guarantee a standard configuration across your entire environment (important when you are subject to various regulations). If you run a smaller environment, and regulatory concerns aren’t as big of a deal to you, and IF you are willing to spend some time with PowerCLI perhaps you can learn to live without the Host Profiles (and those more expensive Enterprise Plus licenses).
Disclaimer: I am not a PowerCLI expert. I do however own the VMware vSphere PowerCLI Reference: Automating vSphere Administration book and I can attest that it is the most “powerful” vSphere book out there. I would expect the authors to release an updated version soon which covers some of the new features of vSphere 5, as there appear to be enough new features to warrant it.
My goal was to create a script that configures my lab hosts with all those tedious settings I would otherwise have to do by hand. I’ve divided up the commands into three sections but the truth is that you could paste them all at once into the PowerCLI window. You will of course need to download PowerCLI first to run these commands.
My goals were as follows:
  • Enable SSH Server, create the necessary firewall exceptions, and start the SSH Server service (in that order).
  • Enable the ESXi shell, create the necessary firewall exceptions, and start the ESXi Shell service (in that order).
  • Set the NTP servers, create the necessary firewall exceptions, and enable the NTP service (in that order).
  • Set the syslog server and create any required firewall exceptions.
  • Set the domain name and DNS search domain values.
  • Create the firewall rules required for vCenter Update Manager.
The commands below query vCenter for all attached hosts. You could target individual ESXi hosts if you wanted; you would just need to edit the commands a little.
Set syslog host, search domains, domain name, and NTP servers:
# Query vCenter for every attached host
$esxhosts = Get-VMHost
# Send host logs to the remote syslog server
foreach ($esxhost in $esxhosts) {Set-VMHostAdvancedConfiguration -VMHost $esxhost -Name Syslog.global.logHost -Value 'udp://10.0.0.5:514'}
# Set the DNS search domain and domain name (Set-VMHostNetwork takes the network object from Get-VMHostNetwork)
foreach ($esxhost in $esxhosts) {Get-VMHostNetwork -VMHost $esxhost | Set-VMHostNetwork -SearchDomain "lab.vjason.com"}
foreach ($esxhost in $esxhosts) {Get-VMHostNetwork -VMHost $esxhost | Set-VMHostNetwork -DomainName "lab.vjason.com"}
# Add both NTP servers
foreach ($esxhost in $esxhosts) {Add-VMHostNtpServer -VMHost $esxhost -NtpServer "10.0.0.2"}
foreach ($esxhost in $esxhosts) {Add-VMHostNtpServer -VMHost $esxhost -NtpServer "10.0.0.3"}
Firewall exceptions for VUM, SSH inbound, syslog, and NTP:
Get-VMHost | Get-VMHostFirewallException | where {$_.Name -eq "SSH Server"} | Set-VMHostFirewallException -Enabled:$true
Get-VMHost | Get-VMHostFirewallException | where {$_.Name -eq "vCenter Update Manager"} | Set-VMHostFirewallException -Enabled:$true
Get-VMHost | Get-VMHostFirewallException | where {$_.Name -eq "NTP client"} | Set-VMHostFirewallException -Enabled:$true
Get-VMHost | Get-VMHostFirewallException | where {$_.Name -eq "syslog"} | Set-VMHostFirewallException -Enabled:$true
Enable/start the ESXi Shell, SSH, and NTP services:
Get-VMHost | Get-VMHostService | where { $_.Key -eq "TSM-SSH" } | Set-VMHostService -Policy On
Get-VMHost | Get-VMHostService | where { $_.Key -eq "TSM" } | Set-VMHostService -Policy On
Get-VMHost | Get-VMHostService | where { $_.Key -eq "ntpd" } | Set-VMHostService -Policy On
Get-VMHost | Get-VMHostService | where { $_.Key -eq "TSM-SSH" } | Start-VMHostService
Get-VMHost | Get-VMHostService | where { $_.Key -eq "TSM" } | Start-VMHostService
Get-VMHost | Get-VMHostService | where { $_.Key -eq "ntpd" } | Start-VMHostService
These commands are just the tip of the iceberg when it comes to PowerCLI. You need only refer to the cmdlet reference to see the many different ways that you can leverage PowerCLI to help you configure and administer your vSphere environment, and maybe even make up for the features you are missing by only having Standard or Enterprise licensing.
- Jason

VMware View 5 group policies

This is the first post in a small series I am doing that will walk through some of the new features of VMware View 5. This product was announced on August 30th at VMworld, and features a number of improvements.

The subject for today is the new group policy templates that have been introduced with VMware View 5. VMware has introduced new (Microsoft Active Directory) group policies that will grant View admins and architects further control over their VDI environment. The two big new templates focus on two things: maintaining control over bandwidth utilization by allowing more granular control over session image quality, AND user persona control.

Some of the more prominent policies in these new templates focus on new View 5 features such as:

  • Client side caching: Caches image content on client to avoid retransmission
  • Build to lossless: 0-60 second window for the View client to “build” images to a fully lossless state
    • Perceptually lossless: Known as “build to lossless disabled”. Primarily for task and knowledge workers as well as the majority of desktop use cases. Use when bandwidth efficiency is more important than image quality.
    • Lossless (aka “fully lossless”): Best quality available. Use cases include healthcare imaging, designers, illustrators, etc.
  • Persona management: View 5 Persona Management is designed to extend the use cases for stateless desktops by enabling control over more end user settings than ever before. View admins will be able to “manage settings and files, policies such as access privileges, performance and various other settings, as well as suspend-on-logoff, from a central location”. View Persona Management will maintain this personalization across sessions with a higher level of performance than previous options.

Let’s get to the policies! I am detailing all of the View 5 policies that are available today, although only the first two templates are what you would call new. Most of these settings will be familiar to existing View admins if not Microsoft admins, so I am just going to list all the settings for the time being. If you want to know more about a setting, comment on the article and I will provide more details.

PCoIP Configuration group policy template – pcoip.adm

This template focuses on PCoIP optimization settings and contains machine group policies located in two different sections: “Overridable Administrator Defaults” and “Not Overridable Administrator Settings”. The settings for each section are the same; the only difference is whether or not the values can be overridden.

Top level hierarchy of the PCoIP Session Variables policies:

image

The settings in detail (Again, the settings are the same for both the “Overridable Administrator Defaults” and “Not Overridable Administrator Settings”):

image

These settings are all fairly self explanatory, and combine to give the View admin a significant amount of control over View PCoIP client connections.

View Persona Management group policy template – ViewPM.adm

This template is for View 5 Persona Management settings and contains machine group policies located in four different sections: Roaming & Synchronization, Folder Redirection, Desktop UI, and Logging.

Top level hierarchy of the VMware View Persona Management computer policies:

image

VMware View Persona Management > Roaming & Synchronization – Computer Policies:

image

VMware View Persona Management > Folder Redirection – Computer Policies:

image

VMware View Persona Management > Desktop UI – Computer Policies:

image

VMware View Persona Management > Logging – Computer Policies:

image

Remaining group policy templates (4 in all) – vdm_agent.adm, vdm_client.adm, vdm_common.adm, and vdm_server.adm

These templates are similar to what was included with View 4.6, and are for controlling general settings of the View agents, clients, servers, and other common settings. The policies are broken down as follows:

Top level hierarchy of the View 5 computer policies:

image

VMware View Agent Configuration (folder root) – Computer Policies:

image

VMware View Agent Configuration > Agent Configuration – Computer Policies:

image

VMware View Client Configuration (folder root) – Computer Policies:

image

VMware View Client Configuration > Scripting Definitions – Computer Policies:

image

VMware View Client Configuration > Security Settings – Computer Policies:

image

VMware View Common Configuration (folder root) – Computer Policies:

image

VMware View Common Configuration > Log Configuration – Computer Policies:

image

VMware View Common Configuration > Performance Alarms – Computer Policies:

image

VMware View Server Configuration (folder root) – Computer Policies:

image

Top level hierarchy of the View 5 user policies:

image

VMware View Agent Configuration > Agent Configuration – User Policies:

image

VMware View Client Configuration (folder root) – User Policies:

image

VMware View Client Configuration > Scripting Definitions – User Policies:

image

VMware View Client Configuration > RDP Settings – User Policies:

image

VMware View Client Configuration > Security Settings – User Policies:

image

 

That completes the listing of all the View 5 policies that were in place as of this post. If things change when View 5 is officially released I will update this post with the most current information and make a note of any changes of interest.

If you have questions please don’t hesitate to ask! I’ll do my best to get answers for you.

- Jason

VCP5 exam: Passed!

Today is August 29th, 2011. This also happens to be the day that the VCP-510 exam, also known as the (VMware) VCP5 exam, goes live. My VCP4 number was in the 66000 range, and I really wanted a lower number this time around. I also wanted to get the exam out of the way long before the end of February 2012 “upgrade your VCP4 to 5 without taking any additional classes” deadline. After February 2012 you will be required to attend a “What’s New in vSphere 5” class AND pass the exam in order to earn your VCP5 certification. People who hold the VCP3 were given a reprieve that allows them to take the shorter (read: less expensive) “What’s New in vSphere 5” class and sit for the VCP5 exam. As with VCP4 holders, VCP3 holders have until the end of February 2012 to take the “What’s New” course and pass the exam, after which they will need to take the full vSphere 5 Install/Configure/Manage (or Troubleshooting) course before they can earn their VCP5 cert.

Side note: If you need to take a class JOIN VMUG Advantage! 20% off of classes and exams and you get access to official VMware coursework (visit the link for details). Yes it costs $200 per year but you don’t have to renew. If you are like me virtualization is becoming/has become integral to your career so you probably will renew when the time comes.

What can I say about the test? Nothing specific per the rules of course. I’ll keep my bullet points simple and ambiguous:

  • Read the exam blueprint here (you will need to register for a VMware Learning account).
  • Read what “Andre” has to say about the beta version of the exam.
  • Remind yourself that much of what you know about configuring and administering VMware has NOT changed since 4.X.
  • Read the “vSphere 5 What’s New” documents that Duncan Epping has provided links to here.
  • Review the vSphere 5 documentation. Some of the PDF links don’t work but the other versions seem to. I think the VMware employees are all busy at VMworld right now and haven’t fixed the PDFs yet.
  • The test is 85 questions total; it took me about 90 minutes to complete including about 10 questions I had marked for review.
  • If you passed the VCP4 exam, and feel you could still pass it today, I think you are well on your way to passing the VCP5.
  • The test is fair, in my opinion.
I think the exam is very similar in style, feel, and content to the VCP4 exam. I have about 3.5 years of VMware experience, about a year of which is with ESXi 4 and the rest with 3.5. Most of that time was as an administrator of smaller VMware datacenters; the advantage was that I worked with everything (VMware, Microsoft, EMC storage, Avamar backups) except the core switches (although I could handle what I needed to know for VMware if required). This means I got to do some vSphere setup and upgrades in addition to just administration. I think that it would be very difficult to pass the exam if I had nothing more than VMware administration experience, so if that is you I recommend going through some setups from empty standalone hosts to HA/DRS clusters.
What is next for me certification wise? Well the truth is the exam hasn’t been released yet, nor has the product for that matter. I should be able to answer that one after a few more press releases trickle out of VMworld.

Issue with ESXi 5 upgrades (via VUM) and Iomega IX2-200 VMFS datastores

Over the last couple of months I have been performing upgrades from ESXi 4.1, to 5.0 Beta, to 5.0 RTM, and earlier today to 5.0 GA. With the exception of the 5.0 GA release all these upgrades were handled with VMware Update Manager (VUM). I have encountered a few errors along the way and I felt it was worthwhile to share them.

First of all, to use VUM and the ESXi 5.0 GA ISO to upgrade your hosts to 5.0 GA you must be running ESXi 4.0.4.1 or later (per details from VUM), but not any previous release of 5.0. The odd thing is that you can do an upgrade of your pre-ESXi 5.0 GA hosts by booting with the install CD and choosing the “upgrade” option. This preserves all the settings as you would expect it to. The ESXi “depot” installer for 5.0 GA for VUM has not been released yet so I do not know if you will be able to use it to upgrade 5.0 Beta or RC to 5.0 GA (stay tuned as I have a ton of hosts running 5.0 RTM so I will test the depot install as soon as I get it!).

For further details about upgrade requirements and such, visit the vSphere 5 online documentation here about that very subject. I have had nothing but success using VUM; I’ve used it for ESX 4 to ESXi 5 upgrades as well as ESXi 4 to ESXi 5 upgrades, even with the guest VMs *suspended* during the upgrade (I was feeling adventurous). The only issue I had was with my Iomega IX2-200 (Cloud Edition) that I use for iSCSI shared storage in my lab. I had no issues going from 4.X to 5.0 RTM; the datastores were available as expected after the VUM orchestrated upgrade. This morning though I went from 5.0 RTM to 5.0 GA and my datastores were not available, although the iSCSI-connected devices did display.

Device View:

Datastore View:

Devices look good, but where is my VMFS volume (only the local volume is shown)?

I’ve done a little work with SAN copied VMFS volumes before and as such have had to deal with VMFS volume resignaturing. VMware has a nice KB article that explains why resignaturing is needed:

VMFS3 metadata identifies the volumes by several properties which include the LUN number and the LUN ID (UUID or Serial Number). Because the LUNs now have new UUIDs, the resulting mismatch with the metadata leads to LVM identifying the volumes as snapshots. You must resignature the VMFS3 volumes to make them visible again.

Ignore the bit that specifies VMFS3; the KB article hasn’t been updated and it would appear that this issue applies to VMFS5 as well. In a nutshell, what is happening is that ESXi sees the “new” datastore as a snapshot and as such does not mount it as a VMFS volume.
Resignaturing the drives is quick and painless although please remember that it will affect every host that connects to the datastore that may not have been upgraded yet and/or is still accessing it. I had brought down the whole lab so I wasn’t concerned about this. The steps are as follows:
  1. Power off any/all VM’s that may be running on the datastore(s) you will be resignaturing.
  2. SSH into one of the hosts that currently has access to the datastore(s) that are having problems.
  3. Execute “vmkfstools -V” from the ESXi console to rescan the volumes on the host. If this fixes the problem then you are all set. Odds are you already did this via the vSphere client so you need to move on to the next step.
  4. Remove any VM’s from the ESXi inventory that reside on the volume(s) you will resignature.
  5. Verify that the volumes are seen by the host by executing “esxcfg-mpath -l | less” from the ESXi console.
  6. From the ESXi console execute “esxcfg-advcfg -s 1 /LVM/EnableResignature”. This will resignature ALL datastores that were detected as snapshots during the next rescan, so hopefully you remembered to take all the precautions I specified above.
  7. Repeat step three to initiate the rescan and perform the resignature operation. YOU ARE NOT DONE YET! You should, however, be able to see the VMFS volumes on the host at this point (they will have new datastore names that start with “snapshot-”). If not, your problem goes beyond the scope of this post.
  8. Execute “esxcfg-advcfg -s 0 /LVM/EnableResignature” to disable the resignature-during-rescan option. If you fail to do this your datastores will be resignatured during EVERY rescan, which I am fairly certain you do not want.
  9. Still not done, now execute step 3 again to make sure the volumes stay mounted after the rescan. Assuming that they appeared during step 7 they should still be present after you run another rescan. If they disappear after this step it means you did something wrong in step 8 and the drives were resignatured again. Repeat step 8 again, then this step, and verify that the volumes remain.
  10. Browse the datastore(s) and re-add all your VM’s to your inventory. You do that by browsing to the VM folder, right clicking on the vmx file within, and selecting “Add to Inventory”.
  11. Rename the datastores to match what they were before. This is an optional step but if you are like me the datastore names have meaning and they are part of your overall design.
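
The critical part of the procedure is steps 6 through 9: the resignature flag must always be switched back off, even if the rescan fails. A minimal sketch of that sequencing is below. It is an illustration rather than a tested tool: the function name and the run callback are hypothetical (run stands in for however you execute a command on the host, for example over SSH), and only the three commands quoted in the steps above are used.

```python
from typing import Callable

# Commands taken from the steps above
RESCAN = "vmkfstools -V"
ENABLE_RESIGNATURE = "esxcfg-advcfg -s 1 /LVM/EnableResignature"
DISABLE_RESIGNATURE = "esxcfg-advcfg -s 0 /LVM/EnableResignature"


def resignature_snapshots(run: Callable[[str], None]) -> None:
    """Run steps 6-9 in order, always disabling resignaturing afterwards.

    `run` is a caller-supplied function that executes one command on the
    ESXi host and raises an exception if the command fails.
    """
    run(ENABLE_RESIGNATURE)          # step 6
    try:
        run(RESCAN)                  # step 7: rescan performs the resignature
    finally:
        run(DISABLE_RESIGNATURE)     # step 8: never leave the flag enabled
    run(RESCAN)                      # step 9: confirm the volumes stay mounted


# Dry run: print the commands instead of executing them
resignature_snapshots(print)
```

Using try/finally means that a failed rescan still leaves the host with resignaturing disabled, which avoids the "resignatured on EVERY rescan" trap described in step 8.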
Everything is back to normal in the lab:
I must admit that it is scary when the datastores disappear. Remain calm and remember that during a CD based (boot time) install you don’t have access to the iSCSI or NFS volumes (unlike Fibre Channel) so you are most likely just a resignature away from fixing your problem. The fix takes less than a couple of minutes and you will be off and running with your new ESXi 5.0 GA install.
Update (10-1-2011): I encountered this issue again after updating the firmware of my Iomega IX2-200; the same fix worked to restore access to my datastore.
- Jason

VMware vSphere 5 officially released

This morning VMware decided to “officially” release VMware vSphere 5.

  • To download vSphere 5 access the VMware evaluation center here.
  • The vSphere 5 documentation has also been released; access it here.
  • The ESXi 5 VMware community forum is also available. It is just one of many VMware communities so I encourage you to visit more of them.

I will save my comments for a later blog post since I now have 5 vCenter instances and 30+ hosts that need to be upgraded. In the interim I encourage you to try out vSphere 5 and learn about all the exciting new features that many bloggers before me have already discussed.