Sizing LUNs with VAAI - Another Perspective

Let's face it - we see a lot of VMware out there. And I probably see a bit more of it than most people, since I'm part of our Consulting organization and work primarily in the Large/Corporate space. In all honesty, I've probably done about as many XenDesktop-on-vSphere deployments as XenDesktop-on-XenServer or Hyper-V deployments. So I've had to stay pretty sharp on VMW over the years... and that means staying on top of the major improvements in the latest releases (vSphere 4.x and 5.x) that affect Citrix. And while it may seem odd that I'm writing an article that is mostly about VMware, I feel it would be a disservice to our customers if I didn't write about this particular topic, because in my opinion it is such a game changer. Before we begin, if you haven't read my last article on sizing block-based LUNs, it would be wise to read it now, because this is essentially a follow-up article and I'll refer to several things from that previous piece.

Before jumping into VAAI (and specifically ATS and the impact this important feature has on LUN sizing), we need to revisit some of the basics of VMFS and how SCSI reservations have historically been used for "locking". As many of you know, VMFS is VMware's proprietary clustered file system. And while VMFS lets VMW do some interesting things such as thin provisioning on block storage, it also has some special considerations because it is a distributed file system. Since several ESX hosts can "share" a single block-based VMFS LUN, something is needed to coordinate operations between those ESX hosts. Otherwise you'd have serious problems, such as data corruption, if multiple hosts tried to write to the same block on a LUN simultaneously. To keep ESX hosts from stepping on each other and prevent this kind of thing from happening, VMFS uses on-disk locks to synchronize metadata updates. Those metadata updates are required whenever you create, modify or delete files. Earlier versions of VMFS (pre-ESX 3.5) used SCSI reservations to acquire those locks. And while a SCSI reservation is only needed for a short time to obtain the lock, if you have many virtual machines on each ESX host (as in an XD deployment) and they all reside on the same LUN, you can see how contention could arise and SCSI reservation conflicts could become a problem. For this reason (and a few others I described in my last article on LUN sizing), the storage vendors, VMW and Citrix have all more or less agreed to effectively cap the number of virtual machines on a block-based LUN somewhere in the neighborhood of 20-30. So most of us out there have been adhering to this 20-30 VMs per LUN guidance, and all is well and good...
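
By the way, if you want to see whether SCSI reservation conflicts are actually hurting you on a shared VMFS LUN, the classic place to look is esxtop on the ESXi host. Here's a rough sketch - the exact keystrokes and field layout can vary by ESXi version, so treat it as a pointer rather than gospel:

```
# Run esxtop on the ESXi host (or resxtop remotely against it).
esxtop
# Press 'u' to switch to the disk device view, then 'f' to toggle additional
# fields (including the error counters). The CONS/s column shows SCSI
# reservation conflicts per second; sustained non-zero values on a heavily
# shared LUN are exactly the contention the 20-30 VM guidance was meant
# to avoid.
```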

But that was then and this is now. Starting with ESX 3.5, VMW recognized this problem with SCSI reservation conflicts and implemented something called optimistic locking. Essentially, it delays acquiring the on-disk lock via a SCSI reservation until as late as possible in the life cycle of the VMFS metadata update. But perhaps most importantly, in optimistic mode a single SCSI reservation is used per transaction, versus one reservation per lock. So smarter locking and far fewer SCSI reservations overall. This certainly made things better, but you still didn't see VMW (or us) recommending more than 30 virtual machines per LUN in ESX 3.5 deployments.

Then the vStorage APIs for Array Integration (VAAI) came along a couple of years ago, and that's when the game began to change (though not entirely when it was originally released - I'll explain in a minute what I mean by that). If VAAI is new to you, you may want to consult VMW's FAQ on it. But the idea behind VAAI is simple - allow ESXi hosts to offload specific VM and storage management operations to compliant storage hardware, freeing up valuable resources on the host. There are several features or "primitives" within VAAI, but the one primitive I really care about in relation to this LUN sizing topic is ATS (Atomic Test & Set). Beginning with vSphere 4.1 and VAAI-capable arrays, VMFS-3 started using ATS for on-disk locks rather than SCSI reservations. ATS is just one of several primitives within VAAI, as I mentioned, and it is sometimes referred to as "hardware assisted locking" or "hardware accelerated locking". What does that mean? Instead of implementing the lock via SCSI reservations (in software), the lock is now offloaded to the array and actually performed in hardware! What ATS really does under the covers is a SCSI "compare and write" of blocks in a single operation, using vendor-specific opcodes for the arrays. And this allows much more granular locking of block storage devices. In other words, instead of locking the whole LUN via a SCSI reservation, we can now lock only the specific blocks within the LUN that we need - and since the ATS locking is done on the array side in hardware, it is much faster and more efficient. I hope the light bulbs are going off, because this absolutely revolutionizes how we size LUNs with VMFS.
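
If you're curious whether your array is actually advertising ATS to a host, here's a quick sketch using the ESXi 5.x command line. The device identifier below is just a placeholder - substitute one of your own naa.* IDs:

```
# List block devices to find the naa.* identifier of the LUN in question.
esxcli storage core device list

# Query VAAI primitive support for that device (placeholder ID shown).
esxcli storage core device vaai status get -d naa.60000000000000000000000000000001
# The output includes an "ATS Status" line; "supported" means the array can
# accept the compare-and-write locking offload described above.
```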

But note that I said "not entirely" in that last paragraph. What I mean is that while VAAI and ATS debuted in vSphere 4.1, the implementation was somewhat half-baked. As one of VMware's storage architects explains in his excellent article on VMFS locking, VMFS-3 in vSphere 4.1 used ATS for only 2 of the 8 total "operations". So we would still fall back to SCSI reservations whenever one of the other 6 operations described in that article was needed. But the real kicker is that ATS was only used if the on-disk lock was uncontended! So if there was any contention whatsoever (think of an XD deployment with 100 virtual machines on ESX hosts sharing LUNs), it would actually fall back to SCSI reservations. So even with a VAAI-capable array in your XD-on-vSphere deployment, I bet VMW would still not recommend more than 30 virtual machines per block-based LUN on vSphere 4.1!

Fast forward to today, and VMW is now shipping vSphere 5.0 U1. vSphere 5.x uses VMFS-5... and here I have to give VMW props for finally getting it right with VMFS-5. Not only do almost all of the enterprise arrays support VAAI now, but VMFS-5 uses ATS for all 8 operations... even when there is contention or "mid-air collisions". That means absolutely no SCSI reservations on VMFS-5 implementations with VAAI-capable arrays. It's awesome.
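
As a side note, you can also ask a VMFS-5 datastore itself which locking mode it is using. The sketch below assumes an ESXi 5.x host and uses a placeholder datastore name; on a VAAI/ATS-capable array, a freshly created VMFS-5 volume typically reports an ATS-only mode:

```
# Print the file system attributes of a VMFS-5 datastore (placeholder name).
vmkfstools -Ph -v1 /vmfs/volumes/YourDatastoreName
# Check the "Mode" line in the output: "public ATS-only" indicates the volume
# is using hardware assisted locking exclusively, with no SCSI reservation
# fallback.
```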

But what does all of that mean, Basil?!? It means that if you're deploying XenDesktop on vSphere 5.x with a VAAI-capable array, you should no longer be sizing block-based LUNs (FC, iSCSI, FCoE) at 20-30 VMs per LUN! Note that I said "block-based" LUNs in that statement - none of this stuff we're talking about (SCSI reservations, LUN locking, ATS, etc.) applies to file-based storage protocols such as NFS.

So I'm sure your next question is: how do we size LUNs with VMFS-5 and VAAI? How many virtual machines per LUN is the sweet spot now? I can tell you that I've personally seen a couple of newer XD-on-vSphere 5.x deployments with VAAI-capable arrays (both EMC and NetApp) go well over 30 virtual machines per LUN. I just looked at a deployment that had about 55 PVS write cache disks on each FC-based LUN... with absolutely zero performance issues and extremely low latency. That was pretty amazing to see, since I'm so used to telling customers to create more, smaller LUNs!

So what is the new magic number with VAAI? I've done a ton of research on this subject, asked some colleagues at EMC and VMW, and devoured every piece of literature on the topic I could find on the web. The bottom line is that the guidance varies depending on who you talk to. In one test conducted by EMC, they saw 50% less I/O queuing and virtual machines booted 4 times faster compared to the same config on a non-VAAI-capable array. In another test conducted by HP, they got 6 times more virtual machines per LUN with VAAI. Hitachi simply says that the number of virtual machines per LUN is no longer a concern (which is a hard sell for me personally, but it speaks volumes about the power of VAAI). NetApp tested 128 VMs per LUN (essentially 5 times what we recommended before) and saw no performance degradation with VAAI. Other experts such as Duncan provide formulas. And finally, and perhaps this is the most revealing data point, VMW publishes configuration maximums and limits with each new release of vSphere and View. Before the days of VAAI, they recommended no more than 64 virtual machines per LUN (and remember - we would typically deploy no more than 20-30 in practice). Now VMW says they recommend no more than 140 virtual machines per LUN with VAAI support. So if you do the math and essentially cut that number in half like we did before (140 ÷ 2 ≈ 70), the "real" number we might want to deploy in practice is more like 50-75 VMs per LUN. After all this analysis and research, it seems the smart money is somewhere around 2-3x what we recommended before. It's nice to see some storage vendors saying 4-6x, but I would start with 50-75 VMs per LUN and only push toward 100 or more virtual machines per LUN after rigorous testing in a non-production environment.

It goes without saying that the LUNs must also be able to support the IOPS the virtual machines generate. But since we're talking about VAAI-capable arrays (i.e., enterprise storage), I'm betting that won't be a problem, because many times the "LUN" is really a virtual representation or abstraction of a small pool of LUNs with perhaps 100 spindles behind it.

Finally, before sizing any LUNs with these numbers, please verify that your array supports VAAI/ATS for your particular storage protocol and that 'HardwareAcceleratedLocking' is enabled on your ESXi hosts (it should be by default, but I would definitely check). Here is an excellent article with a handy table, broken down by storage array and protocol, for checking whether your array supports VAAI and ATS (the "Hardware Assisted Locking" column in the table). And the VMW VAAI FAQ I mentioned earlier has the CLI/vCLI commands to validate whether ATS is enabled.
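
For the host-side check, here's a minimal sketch using esxcli on an ESXi 5.x host; a value of 1 for the setting below means ATS is enabled:

```
# Confirm that hardware accelerated locking (ATS) is enabled on the host.
# It is on by default in 5.x; an Int Value of 1 means enabled.
esxcli system settings advanced list -o /VMFS3/HardwareAcceleratedLocking
```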

I hope this helps. Good luck out there, and feel free to leave a comment below if you liked the article or have a question.

Cheers, Nick

Nick Rintalan, principal architect, enterprise architecture, Citrix Consulting
