XenDesktop 7 Session Launch – Part 3, Brokering

In my last post I talked about the ways that the Citrix client/WI enumerates XenDesktop resources by way of NFuse transactions to the site’s XML broker. The XML broker is responsible for telling the StoreFront server which published resources were found for a particular user. For more technical detail on NFuse transactions, check out my XML Broker Health Check post which gives a good example of NFuse transactions by way of some pretty straightforward XML requests sent through PowerShell.

The next major piece of the session launch process is what’s known as Brokering. This process allows a user to click a desktop or app resource, and have a ‘worker’ selected and readied for an inbound ICA connection. XenDesktop 7’s brokering functionality is mostly unchanged from that of XenDesktop 5, the only main difference being that it now includes multi-user RDS workers.

Conceptually, this factor doesn’t change how the Citrix Connection Brokering Protocol works, it simply adds multi-user support for Windows RDS servers. This functionality has actually existed with limited capabilities since XenDesktop 5.6 for CSPs (Hosted Server VDI), so it’s certainly not a huge leap in terms of changes to the broker agent. The XenDesktop brokering process consists of several key components, including:

  • Citrix Desktop Service (CDS / VDA) – This component provides a bridge between the ‘Delivery Controller’ and the ‘Worker’ and is commonly referred to as the ‘Virtual Desktop Agent’ or VDA. In XD5 this was the WorkstationAgent.exe process, though in XD7 the process was renamed to BrokerAgent.exe. However, the directory still reflects the VDA designation, so I still like to refer to it as the VDA:

CDS

  • Citrix Broker Service – The Broker Service is responsible for negotiating session launch requests with ‘workers’. The Broker service communicates with the CDS over a protocol that Citrix refers to as CBP (connection brokering protocol) to validate a worker’s readiness to fulfill a session launch request, gather the necessary details (IP address or host name), and send the details back to the StoreFront site to be packaged and delivered as an .ICA launch file that’s consumed by the Receiver.
  • Connection Brokering Protocol – This protocol behaves much like NFuse, though it uses .NET WCF endpoints to exchange a series of contracts to communicate registration and session launch details between a worker and delivery controller. This protocol was designed with the following key requirements as it’s functionality is highly critical to reliably providing on-demand desktop sessions:
    • Efficient: information should be exchanged only if and when required (just in time). Limiting the data exchange to a minimum also reduces the risk of leaking sensitive data.
    • Versioned: It must be possible for both workers and controllers to evolve concurrently and out of step without breaking protocol syntax or semantics.
    • Scalable: The delivery controller is a key piece of infrastructure, and its performance must not be impacted by unprompted messages and data from workers, as can happen in IMA, for instance during “election storms”.
    • Flexible: the protocol should allow the architecture to evolve over time, by not building key assumptions into the protocol’s foundation code. Factoring independent operations into separate service interfaces is one example of how a protocol can allow for increasing controller differentiation in future.
    • Compliant: Standards-based mechanisms (WCF) are used instead of proprietary technologies (IMA).
    • Secure: Security is critical, and the protocol must support appropriate mechanisms to ensure confidentiality, integrity (WCF contracts), and authenticity (NTLM/Kerberos auth) of data exchanged between workers and controllers.

The XenDesktop brokering process makes the following basic assumptions about CDS workers:

  • Desktops are either Private or Shared
  • Each desktop is associated with a single delivery group
  • Each desktop is backed by a single worker
  • Each worker is individually associated with a hosting unit, with a null unit index value indicating an unmanaged worker (existing or physical catalog types)
  • Desktops within a private desktop group can have permanent user assignments. The association may comprise one or more users, or a single client IP address
  • Multiple desktops within a private desktop group may have the same user assignments
  • Desktops within a shared desktop group may temporarily be assigned to a single user for the duration of a session
  • Multiple desktops within a shared desktop group may be assigned to the same user concurrently
  • Automatic assign-on-first-use behavior involves the broker selecting a desktop within a private desktop group with no assignment, and assigning it to the currently requesting user; the desktop’s group will not change by virtue of user assignment
  • The assignment of a desktop to its assignee(s) in a private desktop group can only be undone by an administrative user through the PoSH SDK

In a nutshell, the Delivery Controller is responsible for negotiating session launch requests by locating and preparing workers to accept ICA sessions that were requested by a StoreFront server via the XML broker.

XD7brokering

The broker service finds a worker to fulfill the session request, powers it on if needed, waits for it to become ready if a power action was sent. Once the worker is ready, the DDC sends the requisite connection details to the StoreFront server to build and deliver the ICA file, which is sent to the Receiver for consumption by the ICA client.

Hopefully this was a decent enough explanation of brokering. While I didn’t get a chance to go into a lot of detail about how a worker is found, and how CBP interacts with the ICA stack, I think this at least gives a good high level overview of the concept to know what components are involved and what their general interactions with each other are.

My next part in this series will look at the ICA stack, and how a connection is established between ICA clients and servers.

Advertisements

XenDesktop Session Launch Hypervisor Interactions

I got an email recently asking if I knew whether or not a XenDesktop site takes a hosting unit’s load or availability into consideration when brokering session launch requests, especially reconnects to desktops that were ‘In Use’ when a host goes down. This question was posed in the context of Desktop Groups with catalogs that are spread across multiple hosting units.

The simple answer to this question is no. XenDesktop’s interactions with the hypervisor (via the Hypervisor Abstaction Layer) were always intended to be used for power action/status, and MCS/PVS related cloning activities. When a XenDesktop site selects a ‘worker’ to fulfill a session launch request, it only looks at the worker’s registration status, and not that of the host that the worker guest VM is running on.

That said, the selection process for the next available worker is determined via stored procedures. To find out what the ‘next available’ worker is going to be in a XenDesktop 5.x site, you can run following T-SQL against the database, specifying the Desktop Group UID in the first line:

declare @DesktopGroupUid int = 1
declare @Readiness int = 3
declare @Uid int

 update Top(1) chb_State.Workers
 set @Uid = W.Uid
from chb_State.Workers W
            with (readpast,
                  index(IX_Workers_DesktopGroupUid_Usage_DynamicSequence))
         where DesktopGroupUid = @DesktopGroupUid
           and LaunchReadiness >= @Readiness
           and SinBinReleaseTime is null;select * from chb_State.WorkerNames
where Uid = @Uid
go

This script will return the name of the machine that the site will use to satisfy the next pooled-random session launch request to the specified desktop group, and doesn’t care what’s going on with the hosting unit where the worker lives. The site is only concerned with the worker’s registration state, and could care less if the power state of the VM is On, Off, or Unknown, much less does it care about the load of the hosting unit where that machine is running.

To that point, if a worker continues to register when a hosting unit connection becomes inaccessible (vCenter is down, but not the ESX host, for example), the desktop will still be available for session launch, but not for power management. This scenario can cause problems, such as ‘tainted’ workers that don’t get powered off after use, and end up in the unfortunate sounding ‘SinBin’. This process is only temporary, and is only corrected after the machine is rebooted by the XenDesktop site (check out the CTX article I wrote for more info).

As far as a scenario where a session was ‘In Use’ when the host goes down, the broker reaper site service will eventually clean up the failed worker when the ‘DDC Ping’ times out (controlled by the ‘HeartbeatPeriodMs’ value on the DDC running the reaper service). So, by default, you could potentially get into a situation where reconnects for ‘In Use’ session will keep selecting the failed worker until the reaper cleans it up. While this shouldn’t take longer than 5 minutes with the default heartbeat value, it may cause problems if there are frequent outages or service interruptions between geographically dispersed datacenters.

To work around the ~5 minute functionality gap of hosting unit availability awareness, as it relates to session launch anyways, one could easily trigger a XenDesktop PoSH script in the event of an outage (and the reverse when the outage is recovered) to toggle the ‘maintenance mode’ flag on any workers on a failed host. I’d like to hope that the XenDesktop product team has at least considered the potential for expanding the site’s visibility into the status of a guest VM’s host, and would love to see ‘smarter’ brokering logic such as thing in future releases.