High availability

The XenServer high availability (HA) feature ensures that your VMs keep running with minimal downtime and no data corruption. When issues such as host hardware failures and network disruption occur, the XenServer pool responds by restarting affected VMs on stable hosts in the pool. This feature enables you to keep your VMs running until you are able to rectify any hardware issues.

XenServer ensures VM high availability by the following means:

  • Using a network heartbeat to verify connectivity between hosts in the pool.
  • Using a storage heartbeat to verify connectivity between the hosts and the shared storage.
  • Detecting whether a host has failed.
  • Detecting whether one or more hosts have become unreachable.
  • Fencing hosts that cannot communicate with the largest partition of hosts in the pool. A fenced host restarts immediately, causing all VMs running on it to be stopped. When it has restarted, it tries to rejoin the resource pool. This precaution prevents VMs from running on two hosts at once and risking data corruption.
  • Marking a failed, fenced, or unreachable host as no longer part of the live set of hosts in the pool.
  • Marking any VMs that were running on that host as halted.
  • If the failed, fenced, or unreachable host is the pool coordinator, reassigning the coordinator role to another host in the pool.
  • Restarting any halted VMs according to the failover plan you configured.
  • Monitoring any changes to the pool configuration to check that the configured failover plan can be enacted.
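
Once HA is enabled, you can observe much of this state from the xe CLI. The following is a minimal sketch; the UUIDs are placeholders, and the parameter names are taken from the pool and host records, so check them against your release:

    # Check whether HA is enabled on the pool
    xe pool-param-get uuid=<pool-uuid> param-name=ha-enabled

    # Show the number of host failures for which a failover plan currently exists
    xe pool-param-get uuid=<pool-uuid> param-name=ha-plan-exists-for

    # List each host and whether xapi currently considers it live
    xe host-list params=name-label,host-metrics-live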

Because the HA feature automatically restarts VMs on other hosts in the pool, XenServer must be sure that the original (failed or unreachable) host of the VM is no longer running the VM. Two instances of the same VM running at the same time can cause VM data corruption. To guard against this possibility, XenServer hosts in an HA-enabled pool are proactive about self-fencing if they are in situations that might lead to two instances of the same VM running.

The HA feature keeps your key VMs running until you can resolve the underlying hardware or network issue. When you realize an HA event has occurred, investigate and resolve the underlying failure to return your pool to full capacity.

This article describes high availability concepts, requirements, and expected behaviors. For information about configuring and managing high availability, see Configure high availability.

Requirements

To use the high availability feature, you need the following items in your environment:

  • A XenServer pool: The HA feature operates within a single resource pool.

    • We recommend that the pool is homogeneous. In a homogeneous pool, each host exposes the same set of CPU features to the VMs, which makes it easier for the VMs to be restarted anywhere in the pool.
    • For the heartbeat mechanism to work effectively, we recommend that the pool has at least 3 hosts.
    • Ensure that all hosts in the pool are online before enabling HA.
  • Shared storage for all hosts in the pool: To enable any VM in the pool to be restarted on any host in the pool after a failure, all hosts in the pool must have access to the SR where the VM disks are stored.

  • A heartbeat SR: This SR can be the same SR as the one where the VM disks are stored. On this SR, the pool stores information that enables it to coordinate failure detection and recovery in the event of a failure.

    • The heartbeat SR must be on an iSCSI, NFS, or Fibre Channel LUN. Storage attached using SMB, or using iSCSI authenticated with CHAP, cannot be used as the heartbeat SR. We recommend that this SR is highly reliable with low latency.
    • XenServer 8.4 requires 4 GB for the heartbeat SR.

      The information stored on the heartbeat SR includes:

      • 4 MB heartbeat volume: Provides the storage heartbeat, which verifies that hosts in the pool have access to the storage.
      • The metadata volume: Stores the pool coordinator metadata to be used if there is a failover of the pool coordinator. This volume takes up the rest of the required space.
  • Reliable and redundant storage communication for the heartbeat SR: For the HA feature to have the most accurate view of which hosts can access the shared storage, configure your environment to ensure that storage traffic is reliable. For iSCSI and Fibre Channel SRs, configure multipathing. For NFS SRs, use a resilient bonded network as your storage network.

  • Static IP addresses for all hosts: HA treats a change of host IP address as the host losing connection and assumes that the host’s network has failed. As a result, the host can fence. Avoid this by using only static IP addresses in your pool.

  • A dedicated bonded interface on the management network: For the HA feature to have the most accurate view of the pool state, you require reliable and redundant network communications between hosts.

  • The management network allows network heartbeat UDP traffic over port 694: The network heartbeat verifies that the hosts in the pool are live and can contact each other.
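
Before you enable HA, you can confirm some of these requirements from the xe CLI. The following commands are a sketch only; substitute the SR types and names from your own pool:

    # Confirm that all hosts in the pool are online and enabled
    xe host-list params=name-label,address,enabled

    # List shared SRs that are candidates for VM disks and the heartbeat SR
    # (for example, lvmoiscsi, nfs, or lvmohba SRs)
    xe sr-list shared=true params=uuid,name-label,type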

To protect a VM running in your high availability pool, set up your VM with the following configuration:

  • Store the VM disks on shared storage available to all hosts in the pool.
  • Set up its virtual network interfaces on pool-wide networks.
  • Ensure that the VM can use live migration. For more information, see Migration compatibility requirements.
  • Do not connect the VM to a local DVD drive.

A VM that fulfills all these criteria is called agile.
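
For example, you can check from the xe CLI whether a VM's disks are on shared storage. The following is a sketch; the UUIDs are placeholders:

    # Find the virtual disks (VDIs) attached to the VM
    xe vbd-list vm-uuid=<vm-uuid> params=vdi-uuid,type

    # For each VDI, find the SR that stores it
    xe vdi-param-get uuid=<vdi-uuid> param-name=sr-uuid

    # Confirm that the SR is shared across the pool
    xe sr-param-get uuid=<sr-uuid> param-name=shared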

A VM that uses NVIDIA vGPU or GPU pass-through cannot be protected by HA. However, the HA mechanism can attempt to restart this VM on a best-effort basis.

Requirements for clustered pools

The high availability behavior for clustered pools uses a different underlying mechanism and as such has some different requirements and behaviors. For more information, see Clustered pools.

HA failover plan

The HA mechanism calculates a pool-wide failover plan based on the following criteria:

  • VM recovery requirements: Each VM can have a restart priority and start order defined.
  • Available pool resources: The main resource that is considered is host memory.
  • The number of host failures to tolerate: After you enable HA in your pool, XenServer can compute the maximum number of hosts that can fail in the pool before protected VMs cannot be restarted. You can set the number of host failures to tolerate to any value less than or equal to this computed maximum.

If a failover plan meeting these criteria cannot be calculated, the pool is considered overcommitted and XenServer raises a system alert because it can no longer guarantee that the protected VMs can be restarted. This alert is also shown in the XenCenter Notifications panel.
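
For example, after enabling HA you can compare the computed maximum with your configured value by using the xe CLI. This is a sketch; the pool UUID is a placeholder and the value shown is illustrative:

    # Compute the maximum number of host failures for which
    # a failover plan can currently be calculated
    xe pool-ha-compute-max-host-failures-to-tolerate

    # Set the number of host failures to tolerate
    # (must not exceed the computed maximum)
    xe pool-param-set uuid=<pool-uuid> ha-host-failures-to-tolerate=2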

For every VM in your pool, you can define its recovery behavior.

Restart priority

You can assign a VM one of the following restart priorities:

  • Protected: If the VM or its host goes unexpectedly offline, HA restarts the VM on another host. This restart is guaranteed, provided the pool isn’t overcommitted and the VM is agile. If the VM restart fails, HA attempts to start the VM when there is extra capacity in the pool. This value is restart on the xe CLI and Restart in XenCenter.
  • Best-effort: If the host running the VM goes unexpectedly offline, HA attempts to restart the VM on another host. It makes this attempt only after all protected VMs have been successfully restarted. High availability makes only one attempt to restart a best-effort VM. If this attempt fails, high availability does not make further attempts to restart the VM. This value is best-effort on the xe CLI and Restart if possible in XenCenter.
  • Unprotected: If the VM or its host goes unexpectedly offline, HA does not attempt to restart the VM. This is the default setting. This value is an empty string on the xe CLI and Do not restart in XenCenter.

High availability never stops or migrates a running VM to free resources to restart a VM with a higher restart priority.
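
For example, you can set these values from the xe CLI as follows. This is a sketch; substitute your own VM UUID:

    # Protect the VM so that HA restarts it after a failure
    xe vm-param-set uuid=<vm-uuid> ha-restart-priority=restart

    # Restart the VM on a best-effort basis only
    xe vm-param-set uuid=<vm-uuid> ha-restart-priority=best-effort

    # Leave the VM unprotected (the default)
    xe vm-param-set uuid=<vm-uuid> ha-restart-priority=""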

Start order

The start order is the order in which XenServer high availability attempts to restart protected VMs when a failure occurs. This value is used for protected VMs only. The default value is 0, which is the highest priority. Protected VMs with a start order value of 0 are restarted first. The higher the start order value, the later in the sequence the VM is restarted.
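
For example, you can set the start order for a protected VM from the xe CLI. This is a sketch; the UUID and the values shown are placeholders:

    # Restart this VM in the first group of protected VMs (the default)
    xe vm-param-set uuid=<vm-uuid> order=0

    # Restart this VM after the VMs with lower order values
    xe vm-param-set uuid=<vm-uuid> order=2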

Pool behavior

After you have enabled HA in your XenServer pool, the pool exhibits the following behaviors.

Behavior during setup

When you enable HA in a pool, the pool coordinator performs the following setup:

  • Calculates the initial failover plan.
  • Configures the database to write updates to the heartbeat SR. This setting ensures that VM configuration changes are not lost when a host fails.
  • Sets up the pool coordinator metadata on the heartbeat SR.

All pool members:

  • Send network heartbeats to each other. As a result, there is a small increase in management network traffic as hosts in the pool verify that they can communicate with each other. This network traffic continues while HA is enabled.
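
This setup runs when you enable HA, for example from the xe CLI. The following is a sketch; the heartbeat SR UUID is a placeholder and the optional timeout value is illustrative:

    # Enable HA, nominating the heartbeat SR and, optionally, the HA timeout
    xe pool-ha-enable heartbeat-sr-uuids=<heartbeat-sr-uuid> ha-config:timeout=60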

Behavior during normal operation

During normal operation, the pool coordinator of an HA pool performs the following actions (in addition to its usual functions):

  • Dynamically maintains a failover plan. This plan details what to do when a set of hosts in the pool fails at any given time. The plan takes into account the maximum number of host failures that can be tolerated and ensures that all protected VMs can be restarted. The plan is dynamically recalculated based on VM lifecycle operations and movement. If changes (for example, the addition of new VMs to the pool) mean that all protected VMs can no longer be restarted after the maximum number of host failures, a plan cannot be calculated and the pool becomes overcommitted. When the pool becomes overcommitted, XenServer raises an alert via XenCenter, email, SNMP trap, or NRPE.

During normal operation, each member of an HA pool performs the following actions (in addition to its usual functions):

  • Checks that the pool coordinator is alive. The host does this by attempting to acquire a “master lock” on the shared storage. If a pool coordinator already exists, this attempt fails.
  • Sends a network heartbeat. This network heartbeat is sent using UDP over port 694 on the management network to all other hosts in the pool.
  • Maintains a record of the liveset of hosts in the pool. The liveset of hosts according to each individual host is the set of other hosts it believes to be live. If a host has not received a network heartbeat from another host within the period specified by the HA timeout (by default, 60 seconds), it communicates with the other hosts in the pool to agree whether the liveset must be updated.
  • Writes to the state file on the storage heartbeat volume. This action verifies that the host still has access to the storage. It also allows the hosts to communicate their state to one another (in addition to the communication with the network heartbeat).
  • Updates the database on the heartbeat SR. Each host records any changes to the configuration of the VMs it is hosting.
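
For example, you can inspect a host's view of the heartbeat mechanism from the xe CLI. This is a sketch; the UUIDs are placeholders and the parameter names are taken from the host record, so check them against your release:

    # Show the state files on the heartbeat SR that this host is using
    xe host-param-get uuid=<host-uuid> param-name=ha-statefiles

    # Show the hosts that this host can currently reach over the network heartbeat
    xe host-param-get uuid=<host-uuid> param-name=ha-network-peers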

Some pool operations are blocked or not advised when HA is enabled. Temporarily disable high availability to perform these operations:

  • Adding a host to the pool.
  • Removing a host from the pool. Blocked if this action can cause the pool to become overcommitted.
  • Shutting down a host in the pool. Blocked if this action can cause the pool to become overcommitted.
  • Changing the management network.
  • Changing the SR attached to the pool.
  • Enabling clustering. Some high availability behavior and requirements are different for clustered pools. For more information, see Clustered pools.
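
For example, a typical sequence from the xe CLI is to disable HA, perform the operation, and then re-enable HA. This is a sketch; the heartbeat SR UUID is a placeholder:

    # Disable HA across the pool
    xe pool-ha-disable

    # ...perform the pool operation, for example adding or removing a host...

    # Re-enable HA with the same heartbeat SR
    xe pool-ha-enable heartbeat-sr-uuids=<heartbeat-sr-uuid>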

During normal operation, performing these actions in the pool does not activate the HA failover plan:

  • VM clean shutdown from XenCenter or the xe CLI. The HA mechanism does not consider this VM to have failed and does not attempt to restart it. For more information about this action, see Shut down a VM protected by high availability.
  • Host clean shutdown from XenCenter or the xe CLI. The HA mechanism does not consider this host to have failed and does not attempt to restart any VMs that were hosted on it. However, if this action causes the pool to become overcommitted, it is blocked by XenServer. For more information about this action, see Shut down a host when high availability is enabled.
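
For example, the following xe CLI commands perform clean shutdowns that HA does not treat as failures. This is a sketch; the UUIDs are placeholders and it assumes that the host's VMs can be migrated elsewhere in the pool:

    # Cleanly shut down a VM; HA does not attempt to restart it
    xe vm-shutdown uuid=<vm-uuid>

    # Cleanly shut down a host: disable it, migrate its VMs to other hosts,
    # then shut it down
    xe host-disable uuid=<host-uuid>
    xe host-evacuate uuid=<host-uuid>
    xe host-shutdown uuid=<host-uuid>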

Behavior during hardware failures or infrastructure instability

When a failure occurs, all hosts in the pool are responsible for detecting their own connectivity status and agreeing on the connectivity status of the other hosts in the pool.

XenServer HA detects and handles the following types of failure:

  • Failed host or hosts: In this situation, all remaining hosts notice very quickly that the failed host or hosts have stopped updating the state file and are no longer sending network heartbeats. After an appropriate delay, the failed hosts are removed from the liveset.
  • Network partition: In this situation, one or more hosts cannot communicate with one or more other hosts. A host notices that it has not received network heartbeats from one or more other hosts within the defined timeout and starts a fault handler. This fault handler process communicates through the state file and any working network heartbeats, and uses that information to determine what network partitions (groups of hosts that can communicate with each other) exist. The hosts in the largest partition form the liveset and survive. If the partitions are of equal size, the hosts in the partition containing the host with the lowest host UUID survive.
  • Failed storage connection: In this situation, a host notices that it cannot reach the storage, or other hosts notice that its updates are not present on the storage. The hosts communicate through the network heartbeat to check whether other hosts have lost storage access:
    • If all hosts have lost storage access, but not network access, this is considered to be a temporary loss of storage and the hosts remain up to wait for the storage to come back. If any further failure occurs, all hosts in the pool fence. This rule prevents the storage from being a single point of failure.
    • If only some hosts have lost storage access, but all hosts still have network access, the hosts that have lost storage access are removed from the liveset.

If a host knows that it will appear failed or unreachable to the majority of the pool, that host self-fences. Fencing is an expected behavior designed as a protective measure for VM data. It ensures that a VM is not running in two places at once. A host uses the following criteria to decide that it needs to self-fence:

  • If the host’s toolstack is not running and cannot be restarted, the host self-fences.
  • If the host has lost both the network and storage heartbeats, the host considers itself unreachable and self-fences.
  • If the host has lost the storage heartbeat, but is still receiving network heartbeats:
    • If the host can still contact all other pool members and all those members have also lost the storage heartbeat, the host stays alive. This case prevents the storage from acting as a single point of failure and fencing the whole pool.
    • If the host can’t contact one or more other hosts in the pool, it self-fences.
  • If the host has lost any network heartbeats, but still has the storage heartbeat, it determines whether it is in the largest network partition. If it is not, the host self-fences.
  • There is a chance that a network communication failure might split the pool into partitions of equal size. If, using the information in the state file on the heartbeat SR, a host knows that it is in such a network partition:
    • If the partition contains the host with the lowest UUID, the host stays alive.
    • If the partition does not contain the host with the lowest UUID, the host self-fences.

When a fence action is taken, the host restarts immediately and abruptly, causing all VMs running on it to be stopped. The fenced host enters a reboot sequence, and when it has restarted, it tries to rejoin the resource pool.

Behavior during recovery

If the pool coordinator is the host that has failed, fenced, or become unreachable, the other hosts attempt to acquire the master lock. The host that succeeds becomes the new pool coordinator.

Hosts that have self-fenced restart and attempt to rejoin the pool.

When a host is marked as dead and its VMs halted, the pool coordinator is responsible for the following recovery actions:

  • Restart all protected VMs according to the failover plan.
  • If there are not enough resources to start all protected VMs, the pool coordinator waits until resources become available (for example, when previously fenced hosts rejoin the pool) and then attempts to start the protected VMs.
  • After all protected VMs are successfully started, the pool coordinator makes one attempt to restart each best-effort VM.
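
For example, after a recovery you can check where the protected VMs are now running and whether the pool is still overcommitted. This is a sketch; the pool UUID is a placeholder:

    # List protected VMs, their power state, and the host they are resident on
    xe vm-list ha-restart-priority=restart params=name-label,power-state,resident-on

    # Check whether the pool is currently overcommitted
    xe pool-param-get uuid=<pool-uuid> param-name=ha-overcommitted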