VMware View 5.1 and Imprivata Biometric ProveID workflows: Issues with PCoIP.

I ran into an interesting issue recently during a View deployment for a customer running VMware View 5.1.1, Imprivata OneSign 4.6 Hotfix 11, and the Meditech EMR with Forward Advantage’s Authentication Manager product for ProveID.

Now, some of you may not know what ProveID is, or why we even bother with it. Well, here in the great state of Ohio (and several other states), doctors and nurses are legally required to prove their identity during any EMR (Electronic Medical Record) clinical workflow that results in either the signing of a medication order or the administration of a medication. The reason: if a nurse or doctor were to leave a workstation unlocked, anyone who walked up to it could, purposefully or accidentally, sign a medication order or give a medication while acting as someone else. This is ESPECIALLY important when it comes to authorizing medication. What sets the great state of Ohio apart from other states is that the Ohio Board of Pharmacy has deemed a password alone insufficient security for the ProveID workflow; some form of biometric, token, or other dual-factor authentication is required to ensure that the person is who they say they are. Now, enough of that tangent.

For this particular client, the Windows XP golden images were set, the pools had been configured, and the Imprivata VDI integration module was installed and activated. Some of you VMware junkies may not know what functionality the Imprivata product brings to the table, so here is a drawing I put together that outlines the typical logon process with Imprivata integration, from an SSO and clinical logon workflow perspective.

[Diagram: logon process with Imprivata integration]

Now, keep in mind, the above flowchart is just the process for getting logged in. That process works just fine regardless of the connection protocol. The next drawing is specifically about what happens when a ProveID event takes place, where a person needs to prove their identity specifically in this case with biometrics. It should be noted that the 3rd party authentication software isn’t always needed if Imprivata has native support for ProveID actions in their application. In this case, a third party product was required.

[Diagram: ProveID biometric authentication workflow]

This is how it should look when it is working. The problem I had at this customer was with PCoIP, during step #4, where the fingerprint data is sent back to the VM. If the virtual channel between the endpoint and the VM was broken, the communication that passes the fingerprint data back would not function, resulting in a broken workflow and a “disconnected fingerprint reader” error message in the VM. Another impact of the broken virtual channel was that a lock command sent to the VM would only lock the VM, not the endpoint. When the error occurred sporadically, it most frequently happened the first time a user logged onto a VM they hadn’t previously logged onto, but individual results varied.

  • First we tried using the OneSign 4.6 HF11 agent and a Windows XP View 5.1.1 pool with PCoIP as the required protocol. ProveID did not function with the biometric readers.
  • Second, we tried using the OneSign 4.6 HF11 agent and a Windows XP View 5.1.1 pool with RDP as the required protocol. ProveID did not function with the biometric readers.
  • Third, we tried using the OneSign 4.6 HF5 agent and a Windows XP View 5.1.1 pool with PCoIP as the default protocol. ProveID sporadically functioned with the biometric readers.
  • Finally, we tried using the OneSign 4.6 HF5 agent and a Windows XP View 5.1.1 pool with RDP as the default protocol. ProveID consistently functioned as shown in the workflow above.

At least in the situation I was in, I believe we isolated the issue down to a defect in the Imprivata agent that is only exposed when using VMware View 5.1.1 Windows XP pools with PCoIP and biometrics for ProveID workflows. The ticket with Imprivata is still open, so the jury is still out, but I think we narrowed it down pretty far.

Also as a bonus:

Don’t forget what PCoIP looks like from a stack perspective. For a quick look at the PCoIP network stack, check out this article that Andre Leibovici wrote on disabling clipboard transfer in View; it has a great graphic that highlights how the stack is broken down. http://myvirtualcloud.net/?p=927

Hope this helps someone understand the madness that is the ProveID workflow shenanigans. I will post an update when we get a final answer as to the root cause.

** Update **

This ultimately was happening because Imprivata OneSign 4.6 HF11 was not yet certified for VMware View 5.1.1. After talking to a friend of mine who attended HIMSS, I learned that View 5.1.2 is currently scheduled to be certified by mid-April.

Intermittent Network Connectivity with ESXi 5.0 and DL380p Gen8 Servers with 331FLR NICs

So, I was recently troubleshooting some interesting intermittent network connectivity. Here’s the layout:

Upon examination of the environment, the following issues presented on three hosts in the problem cluster, which consisted only of new Gen8 DL380p servers running ESXi 5.0.0:

  • CDP was not functioning on two vSwitch network uplinks that were not trunked and plugged into physical ports in access mode
  • CDP was sporadically functioning on portgroups with VLAN tagging
  • The observed IP address ranges on the network adapters kept changing every 10 minutes or so
  • VM traffic was intermittently failing entirely (without a presenting failure of the network link)
  • The VMkernel log was filled with messages pertaining to the NetQ feature
  • No changes were being made on the physical network switches and the physical switches were configured properly
  • The tg3 driver version of the 331FLR was 3.123b.v50.1-1OEM.500.0.0.472560
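If you want to confirm which tg3 driver build a host is actually running, commands along these lines from the ESXi shell should do it (a sketch; vmnic0 is a placeholder for whichever uplink is backed by the 331FLR):

```shell
# List the installed tg3 VIB and its version
esxcli software vib list | grep -i tg3

# Show the driver name/version bound to a specific uplink
# (replace vmnic0 with the NIC backed by the 331FLR)
ethtool -i vmnic0
```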

Articles of Note:

These symptoms exactly match the problem described in the article, and would be especially pronounced in a VMware cluster with high network utilization, which was exactly the case in the environment I was troubleshooting.

We followed the article’s recommendation and performed the following workaround:

  • Disabled NetQ on the network adapter on one of the hosts, following the official KB
  • Verified from the command line that NetQ was disabled
  • Verified that the vmkernel log was no longer filling with messages pertaining to the NetQ feature
  • Tested ping traffic to a test VM on the host with NetQ disabled; there were no ping drops
  • vMotioned the test VM to a host that still had NetQ enabled, and saw several ping drops
  • vMotioned the test VM back and saw no dropped packets after the vMotion completed
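For reference, the disable-and-verify steps look roughly like this from the ESXi 5.0 shell. This is a sketch of the generic NetQueue kernel-setting workaround; double-check the exact setting name and procedure against the KB before touching a production host:

```shell
# Disable NetQueue at the VMkernel level (takes effect after a host reboot)
esxcli system settings kernel set --setting="netNetqueueEnabled" --value="FALSE"

# After the reboot, verify that NetQueue is disabled
esxcli system settings kernel list | grep -i netq

# Confirm the vmkernel log is no longer filling with NetQ messages
grep -i netq /var/log/vmkernel.log | tail
```

Remember to evacuate or power off VMs and put the host into maintenance mode before the reboot.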

The question I had for VMware support was: does the updated version of the driver linked above (tg3-3.123b.v50.1-682322.zip) permanently address this issue?

The response was:

Since the release notes for the updated driver (tg3-3.123b.v50.1-682322.zip) make no mention of fixing this specific issue, we have to assume that the updated driver does not permanently remediate the situation. This means the only method of addressing this issue is to do exactly what we have already done in the documented workaround. VMware still recommended updating the driver simply for the sake of being on the latest version, but noted that it would not specifically address this issue.
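If you do update the driver anyway, the offline-bundle install is quick from the ESXi shell. A sketch (the /tmp path is just wherever you copied the bundle; depending on how the download is packaged, you may need to extract the inner offline bundle from the outer zip first):

```shell
# Put the host into maintenance mode, then install the offline bundle
esxcli software vib install -d /tmp/tg3-3.123b.v50.1-682322.zip

# Reboot, then confirm the new tg3 version is active
esxcli software vib list | grep -i tg3
```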

Release notes:

v3.123b (April 03, 2012)
========================
    Fixes
    -----
        1) Problem: (CQ62172) 5720 hangs when running software iSCSI with
                    TSO enabled.
           Cause  : TSO packets with total size equal to or bigger than 64K
                    are transmitted in this setup.  Hardware has a limitation
                    that individual BDs cannot approach 64K in TSO.
           Change : Set dma_limit to 32K for all TSO capable chips when
                    compiled for VMWare.  Driver will break up any BDs bigger
                    than 32K.  Linux limits TSO size to less than 64K, so no
                    workaround is needed.
           Impact : All TSO capable chips under VMWare.

    Enhancements
    ------------
        1) Change : Added psod_on_tx_timeout parameter for VMWare debugging.
           Impact : None.

Prevent Old Computers from Being Discovered in AD by SCCM 2012

What’s that you say? You hate that system discovery in SCCM brings in ancient systems that don’t exist, that nobody bothered to clean up, and that still resolve in DNS?

Have we got a deal for you! One of my favorite new options in system discovery for SCCM 2012 lets you choose to discover only computers that have logged onto the domain in the past X days, and/or only computers that have updated their computer account password in the past X days. This feature is totally awesome because it keeps your database clear of stale, non-client records that somehow still have DNS entries. It will be particularly valuable for organizations that don’t clean out DNS and AD as part of a proper server decommission process. It’s Friday, it’s a good day. Carry on, tech world!

[Screenshot: SCCM 2012 Active Directory System Discovery options]