AppVolumes 2.9 – Near 0 RTO Multi-Datacenter Design Options

So this is more of me just thinking out loud on how this could be accomplished. Any thoughts or comments would be appreciated. Check Dale’s blog “VMware App Volumes Multi-vCenter and Multi-Site Deployments” for a great pre-read resource.

The requirements for the solution I'm working on are for a highly available single AppVolumes instance spanning two highly available datacenters with a 100% VMware View non-persistent deployment. A large constraint is that block-level active/active storage replication is to be avoided, which is largely why UIA writable volumes are such a big piece of the puzzle; we need another way to replicate the UIA persistence without leveraging persistent desktops. CPA will be leveraged for global entitlements and home sites to ensure users only ever connect to one datacenter unless there is a datacenter failure. AppVolumes and UIA writable volumes are to be utilized to get the near-persistent experience. There can be no loss of functionality (including UIA applications on writable volumes) for any user if a datacenter completely fails. RTO = near 0, RPO = 24 hours for UIA data. In this case, assume <1ms latency and dark fiber bandwidth available between sites.

Another important point: as of AppVolumes 2.9, when leveraging a multi-vCenter deployment, the recommended scale per the release notes is 1,700 concurrent connections per AppVolumes Manager, so I bumped my manager count up to three per side (3 x 1,700 = 5,100 concurrent connections per site).

So, let’s talk about the options I’ve come up with leveraging tools in the VMware toolchest.

  • Option 1a: Leveraging AppVolumes storage groups with replication
  • Option 1b: Leveraging AppVolumes storage groups with replication and transfer storage
  • Option 2: Use home sites and In-Guest VHD with DFS replication
  • Option 3a: Use home sites and native storage replication
  • Option 3b: Use home sites and SCP/PowerCLI scripted replication for writable volumes, manual replication for appstacks
  • Option 4: Use a stretched cluster with active/active storage across the two datacenters

Option 1a: Leveraging AppVolumes Storage groups with Replication

[Diagram: Option 1a]

 

The problem is that AppVolumes does not allow storage group replication across two different vCenter servers, and having a separate vCenter per pod is a requirement for the Always-On architecture. According to this community post, the error message in System Messages reads: "Cannot copy files between different vCenters", and a fellow technologist posted recently that he tried it with 2.9 and the error message still exists. So in short, Dale's blog remains true in that storage groups can only replicate within a single vCenter, so this isn't a valid option for multi-pod View.

Option 1b: Leveraging AppVolumes Storage groups with Replication and Transfer Storage

[Diagram: Option 1b]


So that's another pretty good workaround option. It requires native storage connectivity between the two pods, which means there must be plenty of available bandwidth and low latency between the sites, as some users in POD2 might actually be leveraging storage in POD1 during production use. Provided you can get native storage connectivity between the two sites, this is a very doable option. In the event of a failure of POD1, the remaining datastores in POD2 will sufficiently handle the storage requests. Seems like this is actually a pretty darn good option.

** EDIT: Except that storage groups will ONLY replicate AppStacks, not writable volumes, per a test in my lab, so that won't work. Also, there's this KB stating that to even move a writable volume from one location to another, you need to copy it, delete it from the manager, re-import it and re-assign it. Not exactly runbook friendly when dealing with thousands of users.

Option 2: Use Home Sites and In-Guest VHD with DFS Replication

 

[Diagram: Option 2]

The problem with option 2 is that In-Guest VHD seems like a less favorable option, given that block storage should be faster than SMB network-based storage, particularly when the software was written primarily to leverage hypervisor integration. I'm also not sure this configuration will even work. When I tried it in my lab, the AppVolumes agent kept tagging itself as a vCenter type of endpoint even though I wanted it to use In-Guest mode, and I tried to force it to VHD without luck. So I'm pretty sure this option is out for now as well, but the jury is still out on how smart this configuration would be compared to the other options. Leveraging DFS seems pretty sweet though; the automatic replication for both AppStacks and writables would be a very elegant solution, if it works.

Option 3a: Use Home Sites and native storage replication

[Diagram: Option 3a]

The problem with option 3a is that the RTO isn't met and it adds recovery complexity. From an AppVolumes perspective it is the easiest to carry out, as the writable volume LUNs can be replicated and the AppStack LUNs can be copied manually via SCP or the datastore browser without automatic replication. As it stands right now, I believe this is the first supported and functional configuration in the list, but the near 0 RTO isn't met due to storage failover orchestration and AppVolumes reconfiguration.

The aforementioned KB, titled Moving writable volumes to a new location in AppVolumes, plus some extra lab time, indicates to me that in order to actually fail over properly, the writable volume (since it is supposed to exist in one place only, and the datastore location is tied to the AppVolumes configuration in the database) would need to be deleted, imported and reassigned for every user of writable volumes. Not exactly a quick and easy recovery runbook.

Option 3b: Use Home Sites and SCP/PowerCLI scripted replication for writable volumes, manual replication for appstacks

This is basically option 3a, but instead of leveraging block-level replication, we could try an SCP/PowerCLI script that copies only the writable volumes and metadata files from one side's storage to the other in one direction (sketched below). Pretty sure this would just be disastrous. Not recommended.
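For the morbidly curious, here is a minimal sketch of the SCP flavor, assuming SSH is enabled on an ESXi host in each pod and that writables live under the default cloudvolumes/writable folder; the host names and datastore paths below are made up:

#!/bin/bash
# One-way copy of AppVolumes writable volumes (and their metadata files) from POD1 to POD2.
# Host names, datastore names and the cloudvolumes/writable layout are examples only.
SRC_HOST="esxi01.pod1.example.com"
DST_HOST="esxi01.pod2.example.com"
SRC_PATH="/vmfs/volumes/POD1_WRITABLES/cloudvolumes/writable"
DST_PATH="/vmfs/volumes/POD2_WRITABLES/cloudvolumes/writable"

# Grab the file list (descriptor .vmdk, -flat.vmdk and .metadata files) from the source host.
for f in $(ssh root@"$SRC_HOST" "ls $SRC_PATH"); do
  # -3 relays the copy through the admin box running this script,
  # so the two ESXi hosts don't need SSH trust between each other.
  scp -3 "root@${SRC_HOST}:${SRC_PATH}/${f}" "root@${DST_HOST}:${DST_PATH}/${f}"
done

Even if the raw copy succeeds, the POD2 manager still has no record of those volumes, so per the KB above each one would have to be imported and reassigned, and copying VMDKs that are attached to in-use desktops is asking for corruption. Hence: disastrous.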

Option 4: Use a stretched cluster with active/active storage across the two datacenters for all app stacks and writable volumes

[Diagram: Option 4]

This pretty much defeats the whole point, because if we're going to leverage a metrocluster we can simply protect persistent desktops with the same active/active technology and the conversation is over. It also adds complexity, because an entire separate pod must be maintained so that the vCenter/Connection Servers can fail over with the desktops. A simplified config would be to leverage active/active storage replication just for the AppVolumes storage, but that's super expensive for such a small return; at that point we might as well put the whole thing on stretched clusters.

EDIT: But then again, near zero RTO dictates synchronous replication for the persistence. If the users’ UIA applications cannot be persisted via writable volumes, they must be persisted via Persistent desktops, which will end up requiring the same storage solution. I’ve pretty much ended up back here, which is right where I started from a design standpoint.

So! If any of you guys have any input I'd be interested to hear it. The real kicker here is trying to avoid the complexity of a block-level active/active storage configuration. If we remove that constraint, or remove writable volumes from the equation, this whole thing becomes a lot easier.

EDIT: The active/active complexity for near 0 RTO can't be avoided just yet, in my opinion. At least with option 4 we would only be replicating VMDK files instead of full virtual machines. To me that's at least a little simpler, because we don't have to fail over connection brokers and a separate vCenter server just to manage those persistent desktops.

Posted in Uncategorized

Entering VSAN Maintenance Mode Hangs at 65%

Ran into a weird situation in my lab where entering maintenance mode from the Web Client hung while attempting a full data evacuation WHILE there was a failed disk in the disk group. It looked like it was unable to enter maintenance mode because one VM couldn't get moved off the host; the exact reason, I'm not sure.

The short, not-completely-analyzed answer for me was either don't do a full evacuation, or vMotion the VMs to a different host BEFORE attempting to evacuate it. My situation was also weird because I had 1 of the 2 magnetic disks in a failed state in the single disk group on the host I was attempting to move away from.

I canceled the task after it ran for 12 hours, moved the powered-off VM with a normal vMotion, then entered maintenance mode AOK. Sorry for the complete lack of detail, but nothing came up on teh interwebz for VSAN maintenance mode hanging at 65%.


Posted in Uncategorized

RSAT Tools on an App Volume?

App Volumes is probably one of the coolest technologies out right now. When VMware bought them, the first thing I thought was that we finally have the means to control the virtual desktop, soup to nuts. We can make minute changes, have versioning, and quickly react to new challenges presented to us in various use cases.

Administrators have long wanted the granularity of complete control over applications on a desktop, but that control leads to issues when managing at scale with traditional architecture. App Volumes gives us the ability to get down and dirty, to try various setups, and ultimately allows for the flexibility of layering applications. But what happens if what we want to put in an App Volume isn't an application at all? What if you want to enable operating system features like Microsoft Remote Server Administration Tools?

The recommended practice for your provisioning machine is to keep it as clean as possible, putting only the bare essentials in the OS so that the app layer you are trying to capture contains only the application information. Unfortunately, with RSAT Tools there is no application to install, but rather features that you enable after installing a Windows Update that can be found here: http://www.microsoft.com/en-us/download/details.aspx?id=7887

To enable RSAT Tools in an App Stack, install the above Windows Update on both the provisioning PC and the gold image that is used as the parent machine for your VDI pool. Create a new App Stack or update an existing one and attach it to the provisioning machine. Head over to the Control Panel, open Programs and Features, and click Turn Windows features on or off. Under Windows Features, choose Remote Server Administration Tools, and then enable the features you want the users who will subscribe to the App Stack to have. At this point, you can finish the provisioning and reboot the provisioning machine. The only thing left to do is attach the App Stack to a user desktop and test the functionality.

~ DJ Gillit, VCP5-DCV, VCP5-DT, VCP-NV

Follow me on Twitter: @djgillit

Posted in Uncategorized

LAN in a CAN 1.0 – VMware ESXi, Multi-WAN pfSense with QoS, Steam Caching, Game Servers

The goal of this blog post is to highlight one of my "off the clock" creations that has faithfully serviced 4 LAN parties and counting. I wanted to share the recipe I used; I call it LAN-in-a-CAN. The goal of this project was simply to provide a portable platform that could handle LAN parties of up to 50 guests in an easy-to-use and quick-to-set-up configuration, complete with download caching for CDN networks, pre-validated QoS rules to ensure smooth gaming traffic, and a virtual game server capable of hosting any game server we need. Everyone in Ohio and the surrounding area, I invite you to come check our LAN out; you can find details at www.forgelan.com.

BIG PICTURE: Roll in, plug in internet, LAN Party GO! Less headache, more time having fun!

[Photo: the assembled LAN-in-a-CAN rig]

First of all, massive props to the following:

  • All the guys at Multiplay.co.uk and their groundbreaking work on the LAN caching config
  • All the guys investing time in pfSense development
  • @Khrainos from DGSLAN for the pfSense help
  • Sideout from NexusLAN for the Posts and Help

Pictures:

[Photos of the build]

Physical Components:

Case: SKB 4U Rolling Rack ~$100

Network Switch: (2x) Dell PowerConnect 5224 24-port gigabit switch ~$70 each

Server Operating System: VMware ESXi 5.5 U2 Standalone – Free

Server Chassis: iStarUSA D-213-MATX Black Metal/ Aluminum 2U Rackmount microATX Server Chassis – OEM ~$70

Server Motherboard: ASUS P8Z77-M LGA 1155 Intel Z77 HDMI SATA 6Gb/s USB 3.0 Micro ATX Intel Motherboard ~$90 (DISCONTINUED)

Server Memory: G.SKILL Ripjaws Series 32GB (4x 8GB DIMMS) ~$240

Server CPU: Intel Core i3-3250 Ivy Bridge Dual-Core 3.5GHz LGA 1155 55W Desktop Processor ~$120

Server PCI-Express NICS: (3x) Rosewill RNG-407 – PCI-Express Dual Port Gigabit Ethernet Network Adapter – 2 x RJ45 ~$35 each

Server Magnetic Hard Drive: Seagate Barracuda STBD2000101 2TB 7200 RPM 64MB Cache SATA 6.0Gb/s 3.5″ Internal Hard Drive -Retail kit ~$90

Server SSD Hard Drive: (2x) SAMSUNG 850 Pro Series MZ-7KE256BW 2.5″ 256GB SATA III 3-D Vertical Internal Solid State Drive (SSD) ~$340

Wireless Access Point: $35

Total Cost: ~$1350 with Shipping

My comments on the configuration at present:

When you think about it… it's not much more than a PC build. As cool as it looks, this is a garage-sale, lame-oh build. I was able to save some cash by repurposing older hardware I had lying around, such as the CPU, memory, motherboard and hard drives. When examining the scalability of the solution, the first thing that would have to change is the CPU clock speed and core count, followed immediately by a decent RAID controller. If it were required to service LANs larger than 50-70 in attendance, we'd probably want to change the layout so that the caching server had its own physical box.

Virtual Components:

pfSense Virtual Machine

Purpose: Serves DNS, DHCP, QoS, ISP connectivity and routing.

Operating System: FreeBSD 8.3

Virtual vCPU: 2

Virtual Memory: 2GB

Virtual NIC: 3 vNICS, 2 for WAN connectivity, 1 for LAN connectivity

Virtual Hard Drive: 60GB located on 2TB slow disk

Windows Game Server (Could Also be Ubuntu if Linux Savvy)

Purpose: Serves up all the local game servers, LAN web content, and TeamSpeak, and acts as a TeamViewer point of management into the system remotely

Operating System: Windows Server

Virtual vCPU: 2

Virtual Memory: 4GB

Virtual NIC: 1 for LAN connectivity

Virtual Hard Drive: 400GB located on 2TB slow disk

Nginx Caching Server

Purpose: Caches data for Steam, Blizzard, RIOT and Origin CDN networks. I’ve only gotten Steam to actually cache properly. The others proxy successfully, but do not cache.

Operating System: Ubuntu Server 14.04, nginx 1.7.2

Virtual vCPU: 4

Virtual Memory: 12GB

Virtual NIC: 8 for LAN connectivity

Virtual Hard Drive: 1x 80GB on the slow disk, plus 2x 250GB disks, one on each Samsung 850; a ZFS stripe across both holds the cache

Download Box

Purpose: Windows box to pre-cache updates before the LAN starts. Allows multiple people to log in via TeamViewer, then into Steam, and seed all relevant games into the cache.

Operating System: Windows Desktop OS

Virtual vCPU: 1

Virtual Memory: 1GB

Virtual NIC: 1 for LAN connectivity

Virtual Hard Drive: 400GB located on 2TB slow disk

Big Picture Network and Caching Advice:

  • pfSense and Network
    • Use Multi-WAN configuration (yes, that means two ISPs) so that all proxied download traffic via the caching box goes out one WAN unthrottled, and all other traffic goes out a different WAN throttled. 2 WANs for the WIN. If you only have one WAN ISP connection, then just limit the available bandwidth to the steam caching server.
    • Use per-IP limits to ensure that all endpoints are limited to 2Mbps or another reasonable number based on your total available bandwidth. QoSing the individual gaming traffic into queues proved to be too nuanced for me; it was just plain easier and more consistent to grant individual bandwidth maximums. This doesn't count the proxied and cached downloads, which should be handled separately per the above bullet point.
    • Burst still isn't working properly with the limiter config; apparently it's not an easy fix. Bug report here: https://redmine.pfsense.org/issues/3933
    • DNS spoofing is required here, which can be done out of the box using the DNS forwarder in pfSense.
    • We made the config work once on a 5/1 DSL link, but it was horrible. Once we at least had a 35/5 cable connection everything worked great.
    • The PowerConnect 5224 just doesn't have enough available ports to fully service all the connectivity required. While it does allow you to create up to 6 LAGs with up to 4 links each, in practice there aren't enough ports for a main table plus 2 large tables with enough LAGs in between them. I recently purchased a used Dell 2748 for around $100 to replace the 5224 at the core of the rack, and am looking forward to swapping it in.
  • Caching Box
    • Future state for big LANs: this would be a separate box like the guys at Multiplay.co.uk have laid out, with 8x 1TB SSDs in a ZFS RAID, 192GB of memory, 10Gb networking, and dual 12-core procs. i3 procs with Samsung Pros seem to be working OK for us, but that's because of the small scale. Would definitely need a couple of hoss boxes at scale.
    • When using multiple disks and ESXi, don't simply create a VMFS datastore with extents, as the data will mostly be written to one of them and IO will not be staggered evenly. For this reason I opted to present two virtual hard drives to the virtual machine and let ZFS stripe them at the OS level (see the sketch after this list).
    • Created a NIC and IP for each service that I was caching, for simplicity's sake.
    • There are some nginx improvements coming that will natively allow caching for Blizzard and Origin, which don't work today because their random range requests can't be partially cached. See Steven's forum post here for updates: http://forum.nginx.org/read.php?29,255315,255324
    • CLI tips:
    • Utilize nload to display NIC in/out traffic
      • sudo apt-get install nload
      • sudo nload -U G -u M -i 102400 -o 102400 (totals shown in GBytes, current rates in MBytes, graphs scaled to roughly 100Mbit)
    • Log directories to tail for hit/miss logs
      • cd /data/www/logs
      • tail -f /location_of_log/name_of_log.log
    • Storage Disk Sizes & Capacity with Human Readable Values
      • df -h
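As an aside, the ZFS stripe mentioned in the caching box notes above is just a plain two-vdev pool. A minimal sketch, assuming the ZFS packages are already installed, the two 250GB virtual disks show up as /dev/sdb and /dev/sdc, and nginx caches under /data/www (the pool, dataset and mountpoint names are examples):

# Stripe the two SSD-backed virtual disks into one pool (no mirror/raidz keyword = stripe).
sudo zpool create -f cachepool /dev/sdb /dev/sdc

# Carve out a dataset for the nginx cache and mount it where nginx expects it.
sudo zfs create -o mountpoint=/data/www/cache cachepool/cache

# Sanity check: both vdevs should show ONLINE.
sudo zpool status cachepool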

Achievable Performance:

A single Portal 2 download to a Mac with an SSD drive: 68MBps! Whaaaaat??? Ahamazing.

[Screenshot: Portal 2 download at 68MBps]

 

I was able to push 94MBps of storage throughput at a small mini-LAN with 8 people downloading Portal 2 at the same time. Those 8 people downloaded Portal 2 in about 10 minutes total.


[Screenshot: ESXi storage throughput during the mini-LAN]

It's been a great project, and I'm excited for the future, especially with the upcoming nginx modules that will hopefully provide partial cache capability.

Posted in Uncategorized

Installing ESXi 6.0 with NVIDIA Card Gives Fatal Error 10: Out of Resources

Thanks much to the original author of this post in the Cisco community; I was able to find the answer quickly. Figured I'd put it here in case anyone else hits it as well, since the error message is a little off.

https://communities.cisco.com/community/technology/datacenter/ucs_management/blog/2014/02/17/error-loading-toolst00-fatal-error-10-out-of-resources-cisco-ucs-c240-m3-server-with-esxi-55

Decompressed MD5: 00000000000000000000000000000000

Fatal Error: 10 (Out of Resources)

[Screenshots: Fatal Error 10 during the ESXi install]

 

Change the MMCFG base parameter (shown in the referenced post) to 2 and you can install AOK.

Posted in Uncategorized

Horizon Workspace 2.1 – Logon Loop after Joining AD Domain

Turns out that if you join Horizon Workspace to an AD domain that uses a different URL namespace (such as "pyro.local") than what you'd actually publish (such as "workspace.pyro.com"), you have to modify the local hosts file of the appliance; otherwise you'll end up in an endless loop of frustration like I was for the past several hours. The worst part is that the config saves just fine, but authentication no longer works, leaving you puzzled and sad. Just goes to show you: read the release notes. Hopefully this helps someone faster than Google did for me.

Screenshots below. The release notes, which apparently I didn't take the time to read, state:

[Screenshot: release notes excerpt]

 

When joining the AD domain, you'll get this, which is fine, except people on the outside can't resolve that address.

[Screenshot: the domain join message]

Now it's off to vi to add us a hosts entry. I had to log in as sshuser, then sudo to root.

[Screenshots: editing the hosts file in vi]

For those of you not fluent with the vi text editor, myself included, just hit i to insert, type what you need to type, then hit Escape to stop inserting.

After you do that, hit ZZ to save and write the changes to the file. Lo and behold! You can resolve your external address and you don't have a sad loop. It's a Christmas miracle. What a rawr.
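Roughly, the whole fix from the shell looks like this; the IP address is a made-up example, so use your appliance's real address and the FQDN you actually publish:

ssh sshuser@192.168.1.50   # log in to the appliance as sshuser
sudo su -                  # become root
vi /etc/hosts
# then add a line mapping the appliance's IP to the externally published FQDN, e.g.:
#   192.168.1.50   workspace.pyro.com   workspace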


Posted in Uncategorized

Adding Microseconds to vRealize Operations Graphs

In my old role as a vSphere administrator for a single company, we upgraded our storage from legacy spinning disk with a small amount of cache. That environment often experienced disk latency greater than 3ms on average, and oftentimes much higher than that, getting up to 20ms or more for 30 seconds at a time, over a thousand times a day. We upgraded to a hybrid array with the result of less than a millisecond of latency for read and write operations… consistently.

vCenter Performance

We tracked this with vCenter as well as vCenter Operations Manager. In vCenter, it's tracked per virtual machine. We could monitor this by using the vCenter performance tab on a virtual machine, selecting Virtual Disk, and monitoring the read and write latency numbers in both milliseconds and microseconds, as seen below.

Monitoring in vCenter was great, but it's only available for live data, i.e. the last hour. That doesn't help in seeing history. We fixed this by looking into vCenter Ops, and in our installation the numbers were there. Great… history. We could now use this latency number as justification for our new storage selection.

Fast forward to my current role, as a consultant. I recently installed vRealize Operations Manager for a customer. One of the reasons was to monitor disk performance as they evaluated new storage platforms for their virtual desktop environment. As I looked into vCenter, the counters were there, but when I looked into vRealize Ops the counter wasn’t available. What gives?

After contacting my previous coworker, we compared configurations and neither one of us could figure it out. I already had a case open with VMware regarding the vRealize Ops install for an unrelated issue. On a recent call with VMware support, we compared these two vCenter/vRealize Operations Manager environments. It really stumped the support engineer as well. After about five minutes, we figured it out. When vCenter Operations Manager was installed, the selection to gather all metrics was chosen. vRealize Operations Manager doesn't give you this option. Instead, you need to change the policy.

Here’s How:
Log into vRealize Operations Manager, select a VM, and select the Troubleshooting tab. Take note of the policy that is in effect for the VM; if you haven't tweaked vROps, the policy will be the same for all objects.

[Screenshot: vRealize Policy]

Click on this policy; this will take you to the administration portion of vROps, focused on the "Policies" section.

[Screenshot: vRealize Policy List]

Click on the "Policy Library" tab. Drop down the "Base Setting" group, and select the policy that was shown at the top of the virtual machine's Troubleshooting tab. Then click on the edit pencil.

[Screenshot: vRealize Operations Edit Policy]

Once in the "Edit Monitoring Policy" window (shown below), limit the object type to "Virtual Machine" and filter on "Latency". This will reduce the attribute list to a single page.

[Screenshot: vRealize Override Attributes]

Change the “state” for the Virtual Disk|Read Latency (microseconds) and Virtual Disk|Write Latency (microseconds) to “Local” with a green checkmark.

[Screenshot: vRealize Operations Attribute Changes]

After about 5 minutes, verify that the data is now displayed in vRealize Ops.

[Screenshot: vRealize Operations Graph]

 

Why is this counter important?
As SSD arrays like Pure Storage and XtremIO or hybrid arrays like Tintri become more prevalent, millisecond latency numbers sit near 0 and result in a single flat line with no real data, but microsecond granularity really allows administrators to see changes in latency that would otherwise be flattened out. For example, 300µs and 900µs of latency both render as roughly zero on a millisecond chart, even though one is three times the other.

I’d like to thank El Gwhoppo for allowing me to guest author. I hope this tidbit is useful information!

Posted in vCenter, VMware