Citrix Comments on VMware's Latest RDSH Scalability Whitepaper

I've been waiting for this one - after all, it has been nearly a year since our friends in Palo Alto announced Remote Desktop Session Host (RDSH) functionality with Horizon (their XenApp-like product, if you're not familiar). But a few weeks ago, they finally released some performance and sizing "best practices".

And since I have conducted a lot of performance and scalability tests with TS/RDS/XenApp over the last ten years, I thought it was pretty darn interesting. Mainly due to the fact that VMW is new to this "game", and this is their first stab at sizing guest VMs, documenting some best practices, and determining how many RDSH sessions can be hosted on an ESXi host.

So I will give you a quick rundown of some of their most important findings: what I thought was good (i.e., aligned with what Citrix and MSFT have been preaching for years), and what they could probably do better in the future (or what I found "curious" - results that don't align with what we have been doing in this area for a long time).

By the way - if you haven't read my two-part article on XenApp scalability, please start there, since I'm going to touch on a lot of concepts from those articles, such as CPU over-subscription and NUMA.

The Good (Consistent findings or results)

  • Bigger is better. I applaud VMW for testing various vCPU configurations and CPU over-subscription ratios. This is literally the only way to find out what the optimal VM size is and how many users can be squeezed onto a single ESXi host. And what VMW found is that 8 vCPU VMs perform the best, which is exactly what I have been preaching for the past few years (see the sizing sketch after this list). Larger guest VM sizes have the additional advantage of fewer nodes to manage in your environment, not to mention potential cost savings on Windows OS licensing.
  • Think time. I was glad to see that VMW used about 5 seconds of what they call "think time" between each operation carried out by the test script. I have often commented that users do a lot less work than most people think, so adding some sleep or think time to each script matters. In fact, when we test with LoginVSI and XA, we often customize the script to add extra think time to mimic a real XA user. Just check out what Dan Allen did here with LoginVSI.
  • User density. VMW found that a 4×8 box or host can support 288 RDSH-based users on a single ESXi 5.5 host at an acceptable performance level. This is refreshing because we see similar numbers in the field with boxes and chipsets like this (~300 users/host). And if you look at AndyB's table from 2012, where we published some XA scalability numbers for a 4×8 server, the medium or "normal" number of users we said we could support on a single host was 320. So VMW's density numbers are actually right in line with what we see and preach.
  • Protocol comparison. One of the stranger (but cool, in my opinion) things VMW did in this study was to compare the CPU and bandwidth utilization of PCoIP, RDP and ICA/HDX. They found that guest CPU utilization was slightly better with ICA and RDP vs. PCoIP, and that bandwidth utilization was slightly better with PCoIP vs. ICA and RDP. But frankly, the numbers are all negligible if you ask me (72% vs. 71% CPU, and 45 kbps vs. 48 kbps for PCoIP and ICA, respectively). The more interesting thing I keyed in on was the average bandwidth per user or session, which came out to 45-50 kbps. This all depends on the script and what kind of work you do, but we have said for almost 15 years that ICA traffic runs about 20-75 kbps per user. So VMW's findings (and the test script they used) are directly in line with the numbers we see in the field.
  • NUMA matters. VMW also touched on Non-Uniform Memory Access (NUMA) at the end of the whitepaper and basically repeated what I said in my two-part series - try to size your guest VMs with NUMA in mind and don't cross NUMA node boundaries if possible.
  • RDSH tuning and server optimization. Similar to what we always recommend for XA (or XD) workloads, VMW used their Horizon 6 Optimization Tool to tune some Windows elements and achieve maximum user density. It was good to see that they ported their tool from the View/VDI world and are now using it for Horizon/RDSH workloads as well.
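Since several of the numbers above fit together nicely, here is a minimal back-of-the-napkin sketch of the sizing math. To be clear, the inputs are just the figures quoted in this post (the 4×8 host, 8 vCPU VMs, 2:1 over-subscription, 288 users, ~50 kbps per session) - it's my own illustration, not VMW's or Citrix's methodology.

```python
# Back-of-the-napkin sizing math using the figures quoted above.
# Assumed inputs (from the whitepaper numbers cited in this post, not a
# general rule): a 4-socket x 8-core host, 8 vCPU VMs, 2:1 over-subscription,
# 288 users per host, and ~50 kbps of ICA traffic per session.

sockets = 4
cores_per_socket = 8                            # one NUMA node per socket
physical_cores = sockets * cores_per_socket     # 32 cores

vcpus_per_vm = 8                # VMW's "optimal" VM size; note it is <=
                                # cores_per_socket, so a VM fits in one NUMA node
oversubscription = 2.0          # the 2:1 vCPU-to-core ratio they landed on

total_vcpus = int(physical_cores * oversubscription)   # 64 vCPUs
vms_per_host = total_vcpus // vcpus_per_vm             # 8 guest VMs

users_per_host = 288            # measured density on ESXi 5.5
users_per_vm = users_per_host / vms_per_host           # 36 users per RDSH VM
users_per_core = users_per_host / physical_cores       # 9 users per core

kbps_per_session = 50           # midpoint of the 45-50 kbps observed
host_mbps = users_per_host * kbps_per_session / 1000   # ~14.4 Mbps per host

print(f"{vms_per_host} VMs x {vcpus_per_vm} vCPUs = {total_vcpus} vCPUs on "
      f"{physical_cores} cores ({oversubscription:.0f}:1 over-subscription)")
print(f"{users_per_vm:.0f} users/VM, {users_per_core:.0f} users/core, "
      f"~{host_mbps:.1f} Mbps of ICA traffic at full load")
```

Run it and you get 8 VMs, 36 users per VM, 9 users per physical core and roughly 14 Mbps of ICA traffic for a fully loaded host - handy sanity checks the next time someone quotes you a density number.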

The Bad ( "Curious" discoveries or results)

  • View Planner vs. LoginVSI. VMW used their own capacity planning tool (View Planner 3.5) versus something a bit more industry-standard like LoginVSI. And while I understand their reasoning, and they briefly shared what the workload did and how much "think time" they added to the script, they didn't share all the details, and that makes it difficult to compare their numbers to anyone else's, even on similar hardware or with the same ESXi version. I wish VMW would use LoginVSI going forward - that would really help the community.
  • Response time. I must admit I found it very strange how VMW defined "acceptable performance". A year or so ago, in their View performance and scalability whitepaper, they said that 1 second was the threshold for acceptable response time (see Figure 14). But now they say the acceptable response time is 6 seconds?!? Seriously - on page 9, they say that "anything over 6 seconds is too high and therefore unusable". Maybe it's apples and oranges, or the View Planner tool has seen some significant updates, but it sure doesn't seem like it. And why is 6 seconds acceptable? How did VMW arrive at this number when it was 1 second a year ago? We typically use 1 or 2 seconds of response time when doing Citrix scalability tests, so I found this very curious. But here's what's even stranger - after I read that 6 seconds was their threshold, I thought they were going to say that each ESXi box could support 1000 users or something. But no - only 288, which is actually slightly lower than what we typically see in the field with this hardware and a medium/office-ish workload like the one used in their script. So who knows what's going on there - one more reason to explain how they arrived at that 6-second threshold in future whitepapers, or simply make the switch to LoginVSI, which does a great job of looking at dozens of metrics related to response time to determine what is actually acceptable to a user and what is not.
  • Host metrics and hardware. The first thing that struck me as interesting about VMW's test setup was the box they used to run the tests, because it's not really a config I would recommend for RDSH-based workloads or users. This 4×8/512 R820 box is much better suited for VDI workloads in terms of bang for your buck - a dual-socket box with, say, 12 cores (like the R710) or a quad-socket box with probably half the memory is a much better "sweet spot" in terms of cost and performance for RDS workloads. So I thought it was interesting that VMW selected this monster box and got only 300 or so users on it... They did not provide any host metrics (please at least supply ESXi host CPU and memory metrics in the future!), but I would guess the box was probably CPU-bound, since not even half of the ESXi host's memory ended up being used. I thought this was worth mentioning, since it shows VMW's roots in the VDI world - I would probably never recommend this exact box for XA workloads unless I was forced to or got all the memory for free.
  • CPU over-subscription. VMW found that a 2:1 over-subscription ratio was the optimal configuration (basically 8 guest VMs @ 8 vCPUs on each host, which equates to 64 virtual CPUs on a box with only 32 physical cores or CPUs). And that actually doesn't bother me much, because we have used 2:1 configs at times in the past, especially when we want to be a little more aggressive because of cost, or maybe we have a lighter or less critical workload. But I just wanted to point it out, since I have said to use 1.5:1 in the past, and this result is obviously somewhat different. The optimal CPU over-subscription ratio depends on a number of factors, and the only way to nail it down is by testing, like VMW did. So, again, props to VMW for doing this one the right way. I think the sweet spot for CPU over-subscription with most workloads on modern hypervisors these days is between 1.5-2x (see the quick sanity-check sketch below).
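To make that trade-off concrete, here is a tiny, hypothetical helper - again my own illustration, and the function and its thresholds are not from the whitepaper - that sanity-checks a candidate RDSH VM layout against a NUMA boundary and the 1.5-2x over-subscription range I mentioned:

```python
# Hypothetical sanity-check helper - my own illustration, not anything from
# the whitepaper. It flags layouts that cross a NUMA node or exceed the
# 2:1 over-subscription ceiling discussed above.

def check_layout(vms, vcpus_per_vm, physical_cores, cores_per_numa_node):
    """Return the vCPU:pCore ratio and any warnings for a candidate layout."""
    ratio = (vms * vcpus_per_vm) / physical_cores
    warnings = []
    if vcpus_per_vm > cores_per_numa_node:
        warnings.append("VM spans NUMA nodes - expect remote-memory penalties")
    if ratio > 2.0:
        warnings.append("over 2:1 over-subscription - test before you commit")
    return ratio, warnings

# VMW's winning config: 8 x 8 vCPU VMs on a 4x8 (32-core) host.
ratio, warnings = check_layout(8, 8, physical_cores=32, cores_per_numa_node=8)
print(f"{ratio:.1f}:1", warnings if warnings else "- looks sane")
```

Feed it VMW's winning config and it reports a clean 2.0:1; bump the VMs to 10 vCPUs on this same box and it would flag both the NUMA crossing and the over-subscription.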

Anyway, that's all I have. Some pretty amazing stuff in there, and I'm really glad VMW has joined the party and published some test results. It's definitely a great thing for the community, and we can learn from each other in the future if we set aside our minor differences.

-Nick

Nicholas Rintalan

Lead Architect, Americas

Citrix Consulting
