From the Blogosphere
Doggin’ It with VDI
My IT guys were at their wits' end trying to manage the 1000 or so desktops, spread out over four different locations
Feb. 16, 2013 10:00 AM
Hey there, the IT Dog here, back with some color commentary on our VDI experience. When I first heard the name, I thought VDI was the latest model Volkswagen diesel, but as the guy in the suit explained in the last blog, VDI is Virtual Desktop Infrastructure. Now that you know what VDI means, I'm sure you know all there is to know about it, right? Well, I wish it were that easy for me.
Where Do I Begin?
I guess I'll start from the beginning. At my company, my IT guys were at their wits' end trying to manage the 1000 or so desktops, spread out over four different locations, with various flavors of Windows and who knows how many different versions of applications and other personal stuff. I kept getting the 'we need more staff to manage this mess' line from them. A lot of these problems had to do with acquisitions and an expanding business, which is all good.
Cue the VDI sales guy with the big Mercedes: "I've got just the thing to solve your problems: VDI." Let's see... take control of the company computing assets all from the back room; what a great idea! No more tech support phone calls, no more sending staff out to offices to get chewed out because some website they were on loaded some garbage onto their machine and now it runs slower than a weenie dog in a foot of snow. All this 'problem solving' was going to cost a bundle, however, and I was the guy who had to sell management on it. With my tail on the line, we bit the bone and put the system in.
I Wish My Problems Were “Virtual”
We were at the bleeding edge of the VDI wave, so we expected some start-up and implementation issues. Our vendor helped us specify and design a system to meet the performance requirements within the budget we had sold management on. We installed racks of new servers, more spinning disks than at the Frisbee Dog World Championships, power, cooling, wires, wires and more wires. We had it all going on. I spent a month going around selling all the end users on this, saying their lives were going to be better: no more sitting on hold waiting for support, the latest and greatest applications, easy access anytime, anywhere, with the potential to support any device in the future, yadda, yadda, yadda. We went live a few months later and began to observe performance.
“Houston, We Have A Problem”
It did not take long to find out about some of the potential issues facing VDI installations today. The first problem we had to deal with had to do with simply getting all the users up and running every morning. I learned about the dreaded “boot storm.” Of course I had no idea what a boot storm was until we started this project. I thought it referred to something from Nazi Germany. But there it is, we have a boot storm problem – when lots of users try to start up their machines at the same time, it puts a tremendous load on the VDI hardware and network and all users suffer from poor service and slow startups. I have to admit – it happened to me also and as you know, being a dog, my life is too short to be waiting around for things like that.
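The load spike behind a boot storm is easy to sketch with some back-of-envelope arithmetic. All the per-user figures below are illustrative assumptions of mine (the blog doesn't give any), but they show why 400 simultaneous boots hurt:

```python
# Hypothetical boot-storm math: all per-user IOPS figures are assumed,
# not measurements from this installation.
STEADY_IOPS_PER_USER = 10    # assumed light office workload per desktop
BOOT_IOPS_PER_USER = 50      # assumed burst while a Windows image loads
USERS = 1000
SIMULTANEOUS_BOOTS = 400     # the morning worst case described above

steady_demand = USERS * STEADY_IOPS_PER_USER
storm_demand = (SIMULTANEOUS_BOOTS * BOOT_IOPS_PER_USER
                + (USERS - SIMULTANEOUS_BOOTS) * STEADY_IOPS_PER_USER)

print(f"steady-state demand: {steady_demand:,} IOPS")
print(f"boot-storm demand:   {storm_demand:,} IOPS")
print(f"peak is {storm_demand / steady_demand:.1f}x steady state")
```

Even with these modest assumed numbers, the morning peak is several times the steady-state load the storage array sees the rest of the day.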
It turns out we designed a system for a typical day in the office for our 1000+ users. What we did not do was design a system that would be responsive during a "100 year" type of event, like loading 400 user images at the same time. Basically, our system was 90% perfect, but the last 10% was really causing problems for the company. The feedback I was getting was pretty tough to take. I felt like I had just pooped on the carpet.
Getting to 100%
As you may remember from your Econ 101 class, there is a bell curve distribution for just about everything, and IT system usage fits that model pretty well. I went back to our vendor to discuss what it would take to get that last 10% of performance to manage the "100 year boot storm event" (really it was every day), and it turns out this is a very common problem with VDI. The main challenge is that the 90% system is limited in the number of I/O operations per second (IOPS) it can handle: if you build your VDI for the average 'steady state' IOPS load, you can do it cost-effectively, but then performance is inadequate during the usage storms. One solution is to scale up by adding more disks until you have enough IOPS for the peak. The problem with that is your system ends up costing around twice the total price of all 1,000 PCs, and all that extra IOPS capacity you just bought sits idle most of the time. Since I put my tail on the line for this system, my bosses promptly cut it off, and I was on the hook to fix this problem with limited resources.
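To see why sizing spinning disks for the peak is so wasteful, here is a hypothetical sketch. The per-disk IOPS rating and the demand numbers are my own assumptions for illustration, not figures from this installation:

```python
import math

# Back-of-envelope disk-count sketch: every number here is an
# illustrative assumption, not a measurement from the blog.
DISK_IOPS = 180            # assumed IOPS per 15k RPM spinning disk
STEADY_DEMAND = 10_000     # assumed steady-state load for ~1000 users
PEAK_DEMAND = 26_000       # assumed boot-storm load

disks_for_steady = math.ceil(STEADY_DEMAND / DISK_IOPS)
disks_for_peak = math.ceil(PEAK_DEMAND / DISK_IOPS)
idle_disks = disks_for_peak - disks_for_steady

print(f"disks to cover steady state:   {disks_for_steady}")
print(f"disks to cover the boot storm: {disks_for_peak}")
print(f"disks idle most of the day:    {idle_disks}")
```

Under these assumptions, well over half of the spindles you buy exist only for the morning spike and sit idle the rest of the day, which is exactly the cost problem described above.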
Stop the Spinning
We needed to get creative to solve the IOPS problem. The solution we came up with was to buy a limited quantity of SSDs and use SSD caching software to reduce the IOPS workload on the spinning disks. This made sense, since data like the base PC image, which every user needs to read when they start their system, can easily be held in the SSD cache, and serving it from SSD boosts IOPS without adding more spinning disks. Once the boot storm passed every morning, the SSD caching software would recognize that and automatically start caching other heavily accessed data, so system performance improved all day. We solved our immediate problem and were able to focus on other VDI-related management issues.
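To see why caching the shared image works so well, here is a toy simulation. The LRU policy, block counts, and cache size are all my own assumptions for illustration; real SSD caching software is far more sophisticated:

```python
from collections import OrderedDict

# Toy model of an SSD read cache during a boot storm. The shared base
# image is read by every booting user, so even a small cache absorbs
# almost all reads. All sizes below are assumptions for illustration.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()
        self.hits = self.misses = 0

    def read(self, block):
        if block in self.store:
            self.store.move_to_end(block)   # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1                # this read goes to spinning disk
            self.store[block] = True
            if len(self.store) > self.capacity:
                self.store.popitem(last=False)  # evict least recently used

cache = LRUCache(capacity=2000)                  # assumed SSD cache, in blocks
BASE_IMAGE = [("base", i) for i in range(1000)]  # shared OS image blocks

for user in range(400):              # 400 users booting at once
    for block in BASE_IMAGE:         # every boot reads the same base image
        cache.read(block)
    for i in range(20):              # plus a few per-user profile blocks
        cache.read(("user", user, i))

total = cache.hits + cache.misses
print(f"cache hit rate: {cache.hits / total:.1%}")
```

Because every boot re-reads the base image, those blocks stay hot and are never evicted; only the first user's image reads and each user's small personal set miss, so the spinning disks see only a tiny fraction of the boot-storm traffic.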
Tell me about your experience rolling out VDI.