Disaster Recovery Is Like Your Sump Pump

Oh snap.

It seems to me that nobody cares about their sump pump until it stops working. I use this analogy all the time with customers, and given yesterday’s torrential downpour and flood warnings here in Northeast Ohio, I thought it fitting to jot down a quick post about it. I’ll start with a story from a few years ago. I was traveling out of town when I got a phone call from my wife that the basement had flooded with 2′ of water. It turned out that the water main had broken away from the foundation of the house and was pouring water into the ground surrounding the foundation at a ridiculous rate. It didn’t take long for the primary sump pump to burn out, and then all of a sudden, boom: my finished basement was a swimming pool. The damage ended up being right around $18,000 after repairs and replacements.

Often the folks I’m dealing with are sysadmins and network admins, and one of the frequent rumblings I hear is that management refuses to spend money on disaster recovery. When I hear this, I ask them a very simple series of questions that usually goes like this:

Do you have a basement? “Yes.”
Do you have a sump pump in your basement? “Yes.”
Do you have two sump pumps in your basement with a battery backup? “No.”
Why? “Because my basement has never flooded.”

And at this point the light bulb goes on, and the sysadmin or network admin realizes that they are managing their own home, something over which they usually have complete autonomy, in the same fashion that management sometimes manages IT. The admins have not invested in a sump pump disaster recovery solution for their own home, and management often takes the same approach to IT systems. Sadly enough, my own behavior in the story above shows the same thing: I only gave real thought to mitigating the risk of a disaster once it had actually happened to me. This is the exact quandary that senior IT management must battle: risk management. All of a sudden, in my case, $330 for a backup sump pump and battery with trickle charger seems a lot more reasonable.
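To put rough numbers on that trade-off, here is a minimal sketch of the break-even arithmetic. The $18,000 and $330 figures come from the story above; the yearly flood probabilities are made up purely for illustration, and the sketch assumes the backup pump would fully prevent the loss.

```python
# Rough break-even check for a mitigation purchase.
# Dollar figures are from the story above; the probabilities below are
# illustrative guesses, not measured flood (or outage) rates.

def expected_annual_loss(loss_per_event: float, annual_probability: float) -> float:
    """Expected yearly cost of doing nothing."""
    return loss_per_event * annual_probability


def break_even_probability(mitigation_cost: float, loss_per_event: float) -> float:
    """Annual event probability above which the mitigation pays for itself in one year."""
    return mitigation_cost / loss_per_event


FLOOD_DAMAGE = 18_000  # repair bill from the story
BACKUP_PUMP = 330      # backup sump pump, battery, and trickle charger

threshold = break_even_probability(BACKUP_PUMP, FLOOD_DAMAGE)
print(f"Break-even annual probability: {threshold:.2%}")

for p in (0.01, 0.05, 0.10):  # hypothetical yearly odds of a flood
    loss = expected_annual_loss(FLOOD_DAMAGE, p)
    print(f"p={p:.0%}: expected annual loss ${loss:,.0f}")
```

The same two functions work for an IT outage: swap in the estimated cost of downtime and the price of the DR solution, and the conversation with management becomes about which probability they believe.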

Sometimes a flooded basement is just what management needs to wake up and smell the coffee. I don’t think anybody wants you to have a flooded basement or wants your IT infrastructure to have a critical fault or suffer a disaster, but our job as good IT consultants is to help prepare folks for when it does and help them realize the risks to their business if they don’t.

How to Open Multiple Instances of the Horizon View Client on OSX

So, I have to put this one up here because I use it way too often. It bugged me that I couldn’t be remoted into my VDI home lab and a customer site at the same time. This command will spawn a separate instance of the View client on OSX that can be used completely independently from the first instance.

open -n /Applications/VMware\ Horizon\ View\ Client.app

My Little VSAN

So I finally got around to refreshing my VSAN after the 1.0 release and I was hit by the AHCI bug. I must say, I had forgotten how powerful it is to put the disk “close to home” with VSAN technology, and how truly simple it is to set up. The goal of this post is simply to provide you with the hardware that I utilized to make my little VSAN lab, which has ~11TB of capacity and 360GB of flash. I also ran two quick Iometer runs and posted their output. And yes, the graphic over there is cheesy. I have lots of children in my house, so that somewhat explains my rationale.

I purchased 4 refurbished C1100 CSTY-24 servers spec’d as such:

(2) Xeon L5520 CPUs @ 2.27GHz

72 GB Memory

(3) 2TB Hitachi  7200 RPM spinning disks per server

I purchased them from deepdiscountservers and they cost me about $900 USD each, which was a pretty good deal, even when comparing against a whitebox build. Of note, since these are refurbed OEM’d custom spec’d systems, Dell will not support them and good luck updating the BIOS on them. One of them had some bizarre fan issue where it would spin the fans all the way up at random even though the temp and workload were not changing, so I’m just dealing with that. If I had to buy again, I’m pretty sure I’d actually choose a different flavor of pizza box. May you have better fortune than I in the refurb land. Also of note, there is no RAID card in this server, just an AHCI controller, which is perfect for what we need to do with VSAN. Assuming you’re on the Beta refresh. : )

I also purchased (3) Kingston SSDNow V300 120GB SATA disks from Microcenter for a cool $70 each. 

So far with the VSAN beta refresh, everything has been great! Two base runs of Iometer on a Windows 7 VM came out as shown below. I used http://vmktree.org/iometer/ to format the output nice and quick. I will say this: not bad considering it’s refurbed servers with 7200 RPM disks and non-certified MLC flash! Not bad for a basement project, not bad at all. Stay thirsty my friends.

VM Policy: 1 Failure

[Image: Iometer results, Windows 7 VM, 1 Failure policy]

VM storage Policy: 1 Failure, 1 Stripe Per Disk, 20% Read Cache

[Image: Iometer results, Windows 7 VM, 1 Failure, 1 Stripe Per Disk, 20% Read Cache policy]

vDGA with View 5.3 and K1 and K2 Cards: Pool Settings, Linked Clones, Black Screens Oh my!

I ran into an issue when deploying vDGA with View 5.3 on Cisco C240 M3s with both K1 and K2 NVIDIA graphics cards.

First, I’m sure if you’re here, you probably already downloaded and read the whitepaper on how to configure the entire setup.

Virtual Machine Graphics Acceleration Deployment Guide – VMware

If not, go read it.

Here was the setup I was utilizing:
vCenter 5.1 U1 1235232
ESXi 5.1 U1 799733
View 5.3 Build 1427931
View Clients
Windows 7 View Client 2.2.0 Build 1404668
Wyse P25 Tera 2

Cisco C240 M3
NVIDIA K1 Graphics Card
Teradici 2800 Apex Card

Cisco C240 M3
NVIDIA K2 Graphics Card
Teradici 2800 Apex Card

The problem I was running into was that most of the documentation was focused on the vSGA configuration, rather than vDGA and it wasn’t perfectly clear what we had tried.

So when I followed the guide to the T with an existing image (and with a new bare-bones Win7 x64 golden image), I would receive a black screen when attempting to log in. Another issue I had with the documentation was that it was definitely not clear what the View pool settings should be, specifically as they pertain to 3D/hardware acceleration for the vDGA virtual machines.

So through my testing, trials, and support calls with VMware, I finally came across a working solution. Some of these tips and tricks will definitely help you surf the water. I really wish someone had posted this before I started; it would have made my life a lot easier.

1. Although the deployment guide in its current form says that ESXi 5.1 is supported, it didn’t work with vDGA for me. We were getting black screens and yellow bangs on the K1 card in Device Manager on the Win7 VM. My team contacted some of the reps at NVIDIA, who cleared up that point for us with the following information: since vDGA was a tech preview with 5.1, VMware still supports it. However, for use in production environments you NEED to be at 5.5. No ifs, ands, or buts. So if you’re not at 5.5, go ahead and plan that upgrade. Once we upgraded to 5.5, things pretty much fell into place.
2. Firmware firmware firmware. Make sure your hosts are COMPLETELY up to date on their firmware. Let’s be honest, you’re standing up a new environment for best in class graphics performance. Do you really want to start off with outdated firmware?
3. Do you need the xorg service to be running for vDGA to work? No, you don’t. Can you mix and match 2 GPUs on vDGA and 2 on vSGA? Sure you can.
4. When configuring the virtual machine for vDGA, it was really not clear how to configure the VM and what pool settings to use for graphics acceleration. The pool needs to be set to “Use vSphere Settings”.

1. Use Virtual Machine hardware version 10. Yes…this does lock you into using the web client to edit the virtual machine settings. Rawr. Also, make sure you set the other advanced parameters that are specified in the vDGA guide that is linked at the top of this article.
2. Use Windows 7 SP1 64 Bit Only
3. Use at least 2 vCPUs and 8GB of RAM
4. Make sure you reserve all the assigned memory to the virtual desktop
5. You must edit these settings using the web client:

Make sure the video card is set to:

1. Software Acceleration

2. 3D Acceleration is UNCHECKED
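As a quick sanity check before deploying a pool, the checklist above can be sketched as a small validation function. This is only an illustration: the dictionary keys are hypothetical names I picked for the sketch, not vSphere or View API properties, and you would have to fill the dictionary in by hand or from your own inventory tooling.

```python
# The checklist above as a validation function. The dictionary keys are
# hypothetical names chosen for this sketch, not vSphere API properties.

def vdga_config_problems(vm: dict) -> list:
    """Return the checklist items that the described VM fails."""
    problems = []
    if vm.get("hardware_version") != 10:
        problems.append("use virtual machine hardware version 10")
    if vm.get("guest_os") != "Windows 7 SP1 64-bit":
        problems.append("use Windows 7 SP1 64-bit")
    if vm.get("vcpus", 0) < 2:
        problems.append("assign at least 2 vCPUs")
    memory_gb = vm.get("memory_gb", 0)
    if memory_gb < 8:
        problems.append("assign at least 8 GB of RAM")
    if vm.get("memory_reservation_gb", 0) < memory_gb:
        problems.append("reserve all assigned memory")
    if vm.get("enable_3d", False):
        problems.append("uncheck 3D acceleration on the video card")
    return problems


# Example: a golden image that was built before reading the list above.
candidate = {
    "hardware_version": 9,
    "guest_os": "Windows 7 SP1 64-bit",
    "vcpus": 2,
    "memory_gb": 8,
    "memory_reservation_gb": 4,
    "enable_3d": True,
}

for problem in vdga_config_problems(candidate):
    print("FIX:", problem)
```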

So once we identified those roadblocks, the configuration now looks like this:

  • vCenter 5.5 U1 1235232
  • ESXi 5.5 1331820
  • View 5.3 Build 1427931
  • View Clients
  • Windows 7 View Client 2.2.0 Build 1404668
  • Wyse P25 Tera 2

Cisco C240 M3

  • (1) NVIDIA K1 Graphics Card
  • Teradici 2800 Apex Card

Cisco C240 M3

  • (1) NVIDIA K2 Graphics Card
  • Teradici 2800 Apex Card

And voila! It worked like a champ.

One of the open issues we discovered showed up when we followed this procedure to add the cards to the linked clone VMs:

1. Configure the base image with a PCI device for one of the video cards.
2. Load the NVIDIA drivers.
3. Run MontereyEnable and shut down.
4. Remove the PCI device.
5. Snapshot and deploy.
6. Configure passthrough devices for each linked clone VM.

When we did that, we were able to see that the cards remained attached to the VMs. However, the first time we tried to log onto the VM, we’d get disconnected from the session during the first logon. When we reconnected, we found that the logon had worked fine, but for some reason we dropped during the initial connection period.  It worked normally for subsequent logins, but every time we refreshed the PC the connection would drop during login. We still have an open case with VMware as to why this is, since the 5.3 release notes say pretty clearly that this is possible.

The moral of the story here is that vDGA rocks, and that we should probably get some more detailed documentation on the setup procedures, particularly around vDGA and linked clones.

Below I’ve got two different videos using a REMOTE connection, both running a simple Geeks3D test: the first using the native graphics and PCoIP, the second with the K1 utilizing vDGA.

vSGA & PCoIP with 3D Acceleration from Joe Clarke on Vimeo.

vDGA with Nvidia GRID K1 & PCoIP from Joe Clarke on Vimeo.

Speeding Up VMware View Logon Times with Windows 7 and Imprivata OneSign

Logon times are funny. It’s the first interaction that a user has with a system and they will base a lot of their opinion around the solution based on that first interaction. With a stateless VDI desktop that VMware View can provide, the user is logging onto a machine that they’ve never logged onto….every time. This creates an interesting problem. The first time you EVER log onto a PC, the experience usually stinks. It’s because your profile is being built from whatever has been defined on your box as the “default profile”, plus whatever scripts and GPOs need to be applied resulting in a 40-60 second login.

I am here to tell you today that the last few weeks of my life were dedicated to strapping a rocket to logon times with Windows 7, then adding the Imprivata solution into the mix and seeing how fast I could get logons from there. The use case that I was working with didn’t care about saving a user’s profile; they wanted the FASTEST VMware View logon times available no matter what it took, and Kiosk mode was not an option because user security context needed to be established. My first thought was, hey, let’s see if we can use mandatory profiles or somehow force all users to use the same profile that’s already loaded on the desktop. I was not successful with mandatory profiles or with forcing a user into a profile that was already on the machine, but during my search I was able to get connected with a sysadmin who had achieved subsequent 10 second logins with Windows 7 and VMware View.

Quick breakdown on the environment I was working with: VMware View 5.1.1, ESXi 5.0, vCenter 5.0, Imprivata OneSign 4.6.

Here is what I learned during the course of my time.

  1. Windows XP is a dog. Get off it if you’re on it. You’re about to be forced to anyway, since the VMware stance is that they will not support a product that the original manufacturer itself won’t support.
  2. Establish baselines without hardware performance in the picture. If performance is an issue, troubleshoot that as well, but do it apart from your baselining. If you can’t prove what you Establish a best case for physical and virtual. Put your VDI linked clones on a flash LUN if you have one to remove the possibility of storage contention.
  3. Look at group policy and logon scripts. In my case this wasn’t the issue, but I’ve seen countless others where simply turning on asynchronous policy processing cut the times in half. Here’s the GPO optimization guide Microsoft has. Good stuff there. http://support.microsoft.com/kb/315418
  4. Slim and trim down the profile. If the use case you have doesn’t need any profile data, such as the desktop or documents folder, guess what? Don’t include it.

Here are the tricks that I used to get 10 second initial logins with VMware View and Windows 7 while ensuring the AppData in the profile didn’t bloat up over time and ultimately hurt login times again. Remember that in this use case, we expected the users to use their mapped drives to save ANY DATA that they cared about. It was understood and expected that their desktop folder and documents folder would NOT follow them around.

  1. We got group policy out of the equation. Machines were in a blocked OU. No loopback policy, only one GPO was utilized for the persona management configurations. Ensure that policy is processed and you log on with domain user in the golden image before building the pool. Let’s be honest, with refresh on logoff, how bad can a user mess up their PC?
  2. We used VMware View persona management and didn’t synchronize anything. I’ll go into the exact policy settings I utilized later and what I would have done if it were a requirement to keep the documents and desktop folders. Remember, we’re going for total speed here, not saving a user’s hello kitty wallpaper.
  3. Windows 7. Optimized for View following the guide. 2GB Memory. 1 vCPU. Initial logon times with a 1.75MB profile on a known good performance profile share. Consistently 15-17 second login.
  4. Windows 7. Optimized for View following the guide. 2GB Memory. 2 vCPU. Initial logon times with a 1.75MB profile on a known good performance profile share. Consistently 10 second login. This was one of the bigger performance notices. And before you go wild bumping your golden images to 2 vCPUs, make darn sure you’re aware of the impact that’s going to make on your hosts.

At this point we had a working solution with consistent 10 second logins post profile build. Now what do we do? We start layering in pieces of the solution and document logon times ONE BY ONE.

One such piece of the solution was the Imprivata OneSign agent. We noticed that once we installed the OneSign agent MSI, the logon times went from 10 seconds to 30. Then we noticed that if for some reason we didn’t refresh that machine, subsequent logins were back to 10 seconds again. What the heck? We discovered that when you install the OneSign agent from the OneSign appliance web page, it automatically stuffs the trusted CA cert for the self-signed cert on the Imprivata appliances into the machine’s local certificate store. This doesn’t happen when you install the MSI. So once we manually put the CA cert in the machine’s local store, we were back to 10 second logins. Another tip is to ensure that Domain Users have modify rights to C:\ProgramData\SSOProvider, as that’s where Imprivata keeps its DAT files, which are regularly updated. Since users are the ones logging on, they should have full rights to that folder so the files can be updated.
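The layer-by-layer approach is easy to keep honest with a tiny log of measurements. Here is a minimal sketch using the logon times from this post; the layer labels and the 5 second threshold are my own illustrative choices, and the times would come from your own stopwatch or logon monitoring.

```python
# Track logon time after each change and flag the step that hurt.
# The threshold is an arbitrary illustrative value.

def logon_deltas(measurements):
    """measurements: (label, seconds) pairs in the order changes were made.

    Returns (label, seconds, change_from_previous_step) tuples.
    """
    results = []
    previous = None
    for label, seconds in measurements:
        delta = 0.0 if previous is None else seconds - previous
        results.append((label, seconds, delta))
        previous = seconds
    return results


def suspects(measurements, threshold_seconds=5.0):
    """Labels of the steps that slowed logon by more than the threshold."""
    return [label for label, _, delta in logon_deltas(measurements)
            if delta > threshold_seconds]


runs = [
    ("baseline: optimized Win7, 2 vCPU", 10),
    ("+ OneSign agent installed from MSI", 30),
    ("+ appliance CA cert added to local store", 10),
]

for label, seconds, delta in logon_deltas(runs):
    print(f"{label:45s} {seconds:3d}s ({delta:+.0f}s)")
print("Investigate:", suspects(runs))
```

Because only one thing changes between rows, a jump points straight at the step that caused it, which is how the missing CA cert was found.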

And so the story continues for me, layering in the solution one at a time. One pro-tip….do AV LAST. If you do it first it might have a performance impact that you’re confusing with application performance, when in fact it’s AV. Another pro-tip is to use a VDI optimized AV, such as a host based AV, or one that can scan and hash all the files in the golden image.

So here’s what you probably scrolled to the bottom to see: the list of Persona Management settings. Well, here you go. We also ran into the “white start bar icons” issue with Windows 7, and were able to work around it by including the synchronization of AppData\Local\IconCache.db, along with the AppData\Roaming\Microsoft\Internet Explorer\Quick Launch folder. Pretty much everything else was excluded. If we had wanted the Documents and Desktop folders redirected, we would have excluded them from the “Do not sync” policy, and redirected them either with native Windows policy or using the “VMware View Agent Configuration/Persona Management/Folder Redirection” section of the GPO.

If you have knowledge worker/power users that need all their documents and AppData redirected and are going to be using Linked Clones, my suggestion is to not go the persona management route, but to use a third party product such as Liquidware Labs, RES, or another suitable product. I’ve seen those two work well.

GPO Settings

VMware View Agent Configuration/Persona Management/Roaming & Synchronization

Manage User Persona: Enabled
Profile Upload interval (in minutes): 10

Persona Repository Location: \\servername\servershare\%USERNAME%\Personal
Override Active Directory Path if configured: enabled

Roam Local Settings Folders: enabled

Files and Folders Excluded from Roaming:
AppData
AppData\Local
Contacts
Desktop
Documents
Documents\My Music
Documents\My Pictures
Documents\My Videos
Downloads
Favorites
Links
My Documents\My Music
My Documents\My Pictures
My Documents\My Videos
My Music
My Pictures
My Videos
Saved Games 
Searches

Files and Folders Excluded from Roaming (exceptions):
AppData\Local\IconCache.db
AppData\Roaming\Microsoft\Internet Explorer\Quick Launch
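
To reason about what actually roams with these two lists, here is a simplified model: a path roams unless it sits under an excluded folder, and an exception wins over an exclusion. This is my own sketch of the intent of the settings above, not VMware's implementation, and the real Persona Management matching rules may differ in the details.

```python
from pathlib import PureWindowsPath

# Paths are relative to the user profile, copied from the GPO settings above.
EXCLUDED = [
    "AppData", r"AppData\Local", "Contacts", "Desktop", "Documents",
    r"Documents\My Music", r"Documents\My Pictures", r"Documents\My Videos",
    "Downloads", "Favorites", "Links",
    r"My Documents\My Music", r"My Documents\My Pictures", r"My Documents\My Videos",
    "My Music", "My Pictures", "My Videos", "Saved Games", "Searches",
]
EXCEPTIONS = [
    r"AppData\Local\IconCache.db",
    r"AppData\Roaming\Microsoft\Internet Explorer\Quick Launch",
]


def _is_under(path: str, root: str) -> bool:
    """True if path equals root or lives somewhere beneath it (case-insensitive)."""
    p, r = PureWindowsPath(path), PureWindowsPath(root)
    return p == r or r in p.parents


def roams(relative_path: str) -> bool:
    """Simplified model: exceptions beat exclusions, everything else roams."""
    if any(_is_under(relative_path, e) for e in EXCEPTIONS):
        return True
    if any(_is_under(relative_path, e) for e in EXCLUDED):
        return False
    return True


for sample in (
    r"AppData\Local\IconCache.db",
    r"AppData\Local\Temp\setup.log",
    r"AppData\Roaming\Microsoft\Internet Explorer\Quick Launch\IE.lnk",
    r"Desktop\notes.txt",
):
    print(f"{sample}: {'roams' if roams(sample) else 'stays local'}")
```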