XenDesktop - High Availability and Load Balancing Add-On for Web Interface


UPDATE July 21, 2013: If you read this article or downloaded the code when it was originally released on November 24, 2012, I recommend you review it again, as it has been updated to reflect changes with XenDesktop 7, and there is a new version of the Web Interface add-on at the end of the article.

The XenDesktop availability problem

In recent years, I have actively helped many of our enterprise customers deploy large VDI environments. A recurring theme is that most, if not all, of them have experienced outages in their XenDesktop environment. This has not only happened with the large customers I was personally involved with; I also receive emails and reports about outages at large customers worldwide. The disturbing thing is that many of these customers have extremely mature and highly skilled IT staff. In addition, they have often followed all of our best practices by making each component highly available, yet outages still occur.

There are many things that can cause an outage in a XenDesktop environment. I will not list them all, but I will highlight some of the key causes of problems I have seen. Some specific examples include:

  • In XenDesktop 4 and earlier, a desktop group could only be connected to a single hypervisor. If vCenter, SCVMM or the Xen pool master went down, the desktop group would become inaccessible. I cannot count how many times I have seen hypervisor connectivity problems cause outages! Even something as simple as an expiring vCenter certificate can cause an outage!
  • From XenDesktop 5 onward, if SQL connectivity is lost, it's game over. No SQL, no new connections. This is probably one of the biggest challenges of all.
  • With PVS, if SQL connectivity is lost, the server can still function in a limited way as long as the Stream Service is not restarted and offline DB support is enabled. However, if the PVS server is rebooted or the service is restarted, PVS will fail if SQL is not available.
  • With PVS, I have seen the Stream Service hang on all servers in a farm when a failover occurs while the servers are under heavy load. This type of Stream Service failure prevents target devices from streaming successfully.

With XenDesktop 5, we made some improvements to minimize disruptions caused by hypervisor problems, since catalogs and desktop groups can now be populated from more than one hypervisor infrastructure. However, we introduced a new single point of failure by making XenDesktop 100% dependent on SQL Server, whereas in XenDesktop 4 each server had an offline copy (Local Host Cache) of the SQL database and could still broker desktop connections if SQL failed.

You can have the smartest IT architects and the best hardware, with all the high availability best practices followed to the letter (multiple hypervisors, SQL mirroring/clustering, NetScaler load balancing, multiple PVS servers with offline DB, etc.); however, if you have a large environment with a single XenDesktop site, I can promise you one thing: you will have an outage! It is not a question of if the outage will happen, but of when and for how long.

Pod architecture to the rescue?

This XenDesktop availability and scalability problem is really nothing new. Several years ago, Dan Feller proposed a modular Pod architecture in which, instead of deploying a single large XenDesktop site, you deploy multiple independent XenDesktop sites (also known as Pods); if one XenDesktop site goes down, the others are still up and available. In fact, one of our consulting architects, Rich Meesters, wrote an excellent blog and white paper discussing this architecture. For the rest of this blog, I will assume that you have read Rich's articles and understand what is meant by the Pod architecture.

/blogs/2012/04/25/taking-a-modularization-approach-to-xendesktop-design/

http://support.citrix.com/article/CTX133162

To briefly summarize the Pod architecture, the main objective is to deploy multiple independent and isolated XenDesktop sites. Each site has its own dedicated hypervisors, dedicated SQL servers and dedicated PVS servers. It is essential that none of the components (SQL in particular) are shared between sites. By separating all of the key core infrastructure components that make up a XenDesktop site, you ensure that the failure of a component such as SQL or PVS in one site will not affect any of the other sites.

Although the Pod architecture sounds good, there are significant gaps when it comes to implementing it effectively and efficiently. Since we use a common gold image and pooled, non-persistent desktops, it should not matter which Pod a user logs on to. Imagine you have 5 Pods, each with 5,000 desktops. If you connect to Pod 1 today and Pod 3 tomorrow, it should not matter, because they are identical. Ideally, you should be able to manage all 5 Pods as a single logical unit. However, XenDesktop, NetScaler and Web Interface in their current form do not give you the ability to logically view or load balance these 5 Pods as a single unit.

I know what you are thinking... Can't NetScaler and/or some of the Web Interface features such as farm aggregation, recovery farms or user roaming solve this problem? The short answer is no: none of these Web Interface or NetScaler features can properly load balance multiple Pods as a single logical unit. Let's walk through the issues with trying to load balance multiple Pods today...

Web Interface farm aggregation, recovery farms and user roaming...

  • You can simply list all 5 Pods as farms in Web Interface and leverage farm aggregation. Users will see 5 desktop icons and decide for themselves which one to choose. Do you really want a user to see five identical icons? Would you rely on users randomly picking icons to distribute the load? It is not elegant, and it is not a very smart way to distribute load.
  • You can use the Web Interface recovery farm feature and stagger the primary and recovery farms across 5 Web Interface servers. This is done by having each individual Web Interface server point to one primary farm and a list of recovery farms. Web Interface 1 points to Site 1 as primary and Sites 2-5 as recovery. Web Interface 2 points to Site 2 as primary and Sites 1, 3, 4 and 5 as recovery, and so on. A NetScaler can then load balance the Web Interface servers as one logical unit. There are several problems with this approach. Its major Achilles heel is that NetScaler does not know who you are before it sends you to a Web Interface server, nor whether you have a disconnected session in one of the farms. If you had a disconnected session in Pod 1 but connected from a new client, NetScaler could send you to a Web Interface server that uses Pod 4 as its primary farm, and you would start a new session instead of reconnecting to your disconnected one. This solution also suffers from the problem that a farm might be up but out of desktops, because of hypervisor or PVS problems or simply because it has run out of desktops. Web Interface recovery farms are not smart enough to fail over to a backup farm simply because desktops are exhausted.
  • You can try to implement the user roaming feature and send specific users to a specific farm/Pod as their home Pod. This means you have to divide your users into separate AD groups and assign each a primary Pod, which adds administrative overhead. Furthermore, it still does not really load balance or provide failover. If you are assigned to Pod 1 and it runs out of desktops because hypervisors or PVS are down, the Web Interface and DDCs will still be alive, so you will not be failed over to a recovery farm. Also, if your primary farm is temporarily unavailable, you can start a session in the recovery farm. But if you disconnect from the recovery farm and your primary farm is available again, you will not be reconnected to your disconnected session in the recovery farm.
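
To make the staggered recovery farm problem concrete, here is a minimal Python sketch. It is only an illustration: the server and Pod names, `staggered_farm_lists` and `farm_for_launch` are hypothetical and are not Web Interface settings or APIs. It assumes a recovery farm is only tried when the farms ahead of it do not respond, which is the behaviour described above.

```python
# Hypothetical illustration only: none of these names are Web Interface settings.
PODS = ["Pod1", "Pod2", "Pod3", "Pod4", "Pod5"]


def staggered_farm_lists(pods):
    """Give each Web Interface server a different primary farm; the rest are recovery farms."""
    return {
        f"WebInterface{i + 1}": {
            "primary": pod,
            "recovery": [p for p in pods if p != pod],
        }
        for i, pod in enumerate(pods)
    }


def farm_for_launch(config, server, farms_up):
    """Use the primary farm if it responds, otherwise the first recovery farm that does."""
    entry = config[server]
    for farm in [entry["primary"], *entry["recovery"]]:
        if farm in farms_up:
            return farm
    return None


config = staggered_farm_lists(PODS)

# The user has a disconnected session in Pod1, but the load balancer happens
# to send them to WebInterface4, whose primary farm is Pod4.
disconnected_session_pod = "Pod1"
chosen = farm_for_launch(config, "WebInterface4", farms_up=set(PODS))
print(chosen)                              # Pod4
print(chosen == disconnected_session_pod)  # False: a new session starts in Pod4
```

Nothing in this selection looks at where the user's existing session lives or how many desktops are free in a farm, which is exactly the gap described above.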

What about NetScaler? No matter how you try to use NetScaler to load balance connections to Web Interface or to a specific Pod, NetScaler is not smart enough to know whether desktops are actually available in the Pod, and it does not know whether you have a disconnected session in a particular Pod, so you end up being unable to reconnect to your disconnected sessions.

As you can see, we have no way to logically load balance 5 Pods as a single unit and truly distribute load based on actual usage. No matter how you approach it with the current features of Web Interface or with a NetScaler, there is no clean way to load balance Pods.

A new Web Interface enhancement to the rescue!

The problem of load balancing Pods as a single logical unit became a major issue for my customers, and unfortunately there was nothing on our product roadmap that would address it fast enough for them. So I decided to build a solution by designing an add-on for Web Interface. Since I am not really a programmer, I partnered with an outside developer to help write the code. I have to give a shout out here to Wayne Rouse, wayne@pospda.com, as he was the coding mastermind behind this enhancement! Wayne and I spent many a day and night on GoToMeeting sessions in my lab while we developed this solution. Thank you, Wayne! An additional shout out is also due to my colleague Robert Wiggenhorn, who spent many evenings working with Wayne to add new features and capabilities to this enhancement.

So what did we do to enhance Web Interface to fix this Pod load balancing issue? We approached it by leveraging the native farm aggregation capability of Web Interface. You list each Pod in Web Interface as a separate farm, and then configure our code to work its magic! The highlights are listed below:

  • We identify identical, specially designated desktop groups from multiple farms and collapse them into a single icon. If you have 5 Pods offering the same "Windows 7" desktop group, you will only see one icon.
  • We run PowerShell queries from Web Interface against each XenDesktop Pod and build a table of current usage statistics. We know which of the 5 Pods is least loaded and which Pods actually have desktops available.
  • When you click the single icon for a desktop group that has been aggregated from multiple XenDesktop sites/Pods, we first check each farm for disconnected sessions. We always reconnect you to your disconnected session first, regardless of which Pod is least loaded.
  • If you do not have a disconnected session when you launch an aggregated and load balanced desktop group, we check the PowerShell load data table and connect you to the Pod that is least loaded.
  • If for some reason we cannot successfully generate a launch.ica file from the least loaded Pod, we automatically keep trying other Pods until we get a launch.ica file and connect you to a desktop. No more errors about desktops being unavailable or in maintenance mode! One click, and if a desktop is available in any of the Pods, we connect you to it!
  • We also offer enhanced maintenance mode capabilities. Today, if you put a desktop group in maintenance mode, nobody can connect, which prevents disconnected users from reconnecting to their desktops. This is a major pain point when you are trying to drain users off a desktop group. With our new code, you can mark a desktop group for drain-off without putting it in maintenance mode in the Desktop Studio console. This prevents new sessions from connecting to the desktop group while still allowing disconnected users to reconnect to their sessions.
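
The decision order described in the list above can be sketched roughly as follows. This is not the add-on's actual code, which runs inside Web Interface and queries each Pod with PowerShell; it is a minimal Python illustration, and the `Pod` class, its fields and `pick_pod` are hypothetical names invented for this sketch. The `launch_ok` flag simply stands in for whether a launch.ica file could be generated from that Pod.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Pod:
    """One row of a hypothetical per-Pod usage table."""
    name: str
    total_desktops: int
    in_use: int
    draining: bool = False   # marked for drain-off: refuse new sessions
    launch_ok: bool = True   # simulated: can a launch.ica file be generated?
    disconnected_users: Set[str] = field(default_factory=set)

    @property
    def available(self) -> int:
        return self.total_desktops - self.in_use

    @property
    def load(self) -> float:
        # Treat a Pod with no desktops as fully loaded to avoid dividing by zero.
        return self.in_use / self.total_desktops if self.total_desktops else 1.0


def pick_pod(user: str, pods: List[Pod]) -> Optional[str]:
    """Return the name of the Pod a launch should be sent to, or None."""
    # 1. A disconnected session always wins, even if its Pod is draining.
    for pod in pods:
        if user in pod.disconnected_users:
            return pod.name
    # 2. New sessions: only Pods that are not draining and have free desktops,
    #    ordered from least to most loaded.
    candidates = sorted(
        (p for p in pods if not p.draining and p.available > 0),
        key=lambda p: p.load,
    )
    # 3. Keep trying the next Pod until a launch succeeds.
    for pod in candidates:
        if pod.launch_ok:
            return pod.name
    return None


pods = [
    Pod("Pod1", 5000, 4200, disconnected_users={"alice"}),
    Pod("Pod2", 5000, 1000, launch_ok=False),  # least loaded, but launch fails
    Pod("Pod3", 5000, 2500),
    Pod("Pod4", 5000, 500, draining=True),     # draining: no new sessions
    Pod("Pod5", 5000, 5000),                   # no desktops left
]
print(pick_pod("alice", pods))  # Pod1 (reconnect to the disconnected session)
print(pick_pod("bob", pods))    # Pod3 (Pod2 fails to launch, Pod4 is draining)
```

The point of the ordering is that reconnection is checked before load, and that load is only a preference: a failed launch on the preferred Pod falls through to the next candidate instead of surfacing an error to the user.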

The diagram below illustrates how our code works ...

Gotchas, exclusions and other information...

  • This code is provided as a free tool/add-on utility for your Web Interface deployment. It is not an officially supported Citrix product. As with all the other great Web Interface tweaks you can find out there (especially on Thomas Koetzing's impressive site, http://www.thomaskoetzing.de), if you have problems with this add-on, you cannot open a ticket with Citrix Tech Support. Citrix is not responsible for your use of this utility. Please try it in a test environment first!
  • This code only works on Web Interface 5.4 and 5.4.2.
  • This code has been tested with Windows Server 2008 R2 as the web server.
  • Make sure your web servers have at least 2 vCPUs and 4 GB of RAM (you should be doing that already!).
  • We tested this with XenDesktop 5.5, 5.6 and 7.0.
  • This should be used to load balance XenDesktop Pods that are in the same data center or in well-connected data centers.
  • This code does not work with StoreFront or CloudGateway. It is an enhancement for Web Interface 5.4 only.
  • This code is not intended to aggregate XenApp sessions, so please do not try to do so.
  • This code is designed for load balancing pooled, non-persistent desktops delivered via Machine Creation Services and Provisioning Services. You should not aggregate assigned desktops or persistent desktops with Personal vDisk.

I would also like to take a minute to discuss StoreFront. As you well know, StoreFront is our replacement for Web Interface, and starting with XenDesktop 7.0 we are really encouraging customers to begin migrating away from Web Interface to the new StoreFront 2.0 that was released with XenDesktop 7.0. I am also pleased to announce that StoreFront 2.0 now has a basic built-in aggregation and load distribution mechanism. There are some limitations with the new StoreFront 2.0 capabilities, such as:

  • The load balancing is simple round-robin or random load balancing and does not check load state before routing a user to a desktop.
  • If you aggregate pre-XenDesktop 7.0 sites, there is no guarantee that users will be reconnected to their disconnected sessions.
  • If users need pass-through or smart card authentication, they must use the native Receiver, because Receiver for Web with StoreFront supports only explicit username/password authentication when connecting through a browser.

The fact that we now have a basic aggregation and load balancing capability in the new StoreFront 2.0 is a big step in the right direction. I am convinced that in future StoreFront versions we will see this capability improved to support smarter load balancing and other features that will ultimately make it far superior to what we did with this Web Interface enhancement.

It is also important to note that the "official" support position starting with XenDesktop 7.0 is that you must use StoreFront to access XenDesktop 7.0. The reality is that there is really nothing fundamentally different in XenDesktop 7.0 or in the Citrix XML architecture of XenDesktop 7.0 that would make it incompatible with Web Interface. Launching XenDesktop 7.0 desktops works perfectly well with Web Interface. Whenever you make custom code changes to StoreFront or Web Interface, your configuration is technically "unsupported". This simply means that if you have a problem launching a connection through a modified StoreFront or Web Interface site, you must be able to demonstrate that the same problem exists when using a "plain vanilla" StoreFront or Web Interface site. So, as with any changes to Web Interface or StoreFront, I recommend you keep a vanilla Web Interface 5.4 site available when using XenDesktop 5.6 and earlier, or a vanilla StoreFront 2.0 site available when using XenDesktop 7.0, so you can verify that any connectivity problems are not related to the custom changes that have been made.

You can download version 2.0 of the tool from the following link:

https://citrix.sharefile.com/d/sd95c9fd6ef84aab9

I hope you find this tool useful! Feel free to let me know what you think!

Cheers,

Dan Allen
