Configure high availability
High availability attempts to protect your VM workload in the case of host or hardware failure.
This article describes the tasks to perform to configure high availability. For information about high availability concepts, requirements, and expected behaviors, see High availability.
1. Configure your pool for high availability
For a pool to be compatible with high availability, it must fulfill certain requirements:
- Ensure that your pool is homogeneous, contains three or more hosts, and all the hosts are online. For more information, see Create a pool.
- Set up a dedicated, bonded interface for the pool management network and ensure that this network allows UDP traffic over port 694. For more information, see Create NIC bonds in resource pools.
- Configure static IP addresses for all hosts in the pool.
- Set up shared storage for all hosts in the pool. For more information, see Create an SR.
- Ensure that one shared storage repository in the pool is on an iSCSI, NFS, or Fibre Channel LUN that meets the following requirements:
  - 4 GB or more in size.
  - Has resilient storage communication: For iSCSI and Fibre Channel SRs, configure multipathing. For NFS SRs, use a resilient bonded network as your storage network.
  This SR is used as the heartbeat SR.
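The host-count requirement above can be checked from the CLI before you continue. The following is a minimal sketch, not an official procedure: the function names are hypothetical, it assumes that `xe host-list --minimal` prints the host UUIDs as one comma-separated line, and it only counts hosts. It does not verify that the hosts are online or that the pool is homogeneous.

```shell
# Count the hosts that xe reports for the pool.
# Assumption: --minimal prints the UUIDs as a single comma-separated line.
pool_host_count() {
    xe host-list --minimal | tr ',' '\n' | grep -c .
}

# Hypothetical helper: succeed only if the pool has three or more hosts.
check_ha_host_count() {
    count="$(pool_host_count)"
    if [ "$count" -lt 3 ]; then
        echo "pool has $count host(s); high availability needs three or more" >&2
        return 1
    fi
    echo "pool has $count hosts"
}
```

Run `check_ha_host_count` on any host in the pool. A non-zero exit status means the pool is too small for high availability.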
2. Prepare your VMs for high availability
Complete these steps for any VMs in your pool that you want to protect with high availability:
- Ensure that the VM is agile:
  - Ensure that its virtual disks are on shared storage.
    If the disks are not on shared storage, use the following command to create a copy of the VM with its disks on shared storage:
    xe vm-copy uuid=<vm_uuid> new-name-label=<name_for_copy> sr-uuid=<uuid_of_sr>
    Set sr-uuid to the UUID of the pool shared storage.
  - Ensure that the VM is not connected to a local DVD drive.
    - Check whether the VM has any CD/DVD drives that are not empty by running the following command:
      xe vbd-list type=CD empty=false vm-uuid=<vm_uuid>
      If there is an attached drive, this command returns its information. Note the uuid, which is the first item in the list.
    - To empty the CD/DVD drives of the VM, run the following command:
      xe vbd-eject uuid=<uuid>
  - Ensure that its virtual network interfaces are on pool-wide networks.
  - Ensure that the VM meets the requirements for live migration. For more information, see Migration requirements.
- Specify a set of priorities that determine which VMs are restarted first when a pool is overcommitted.
  Set the value of ha-restart-priority to one of the following options:
  - restart: The VM is restarted on another host in the pool, provided the pool isn’t overcommitted. XenServer retries this restart until it succeeds.
  - best-effort: XenServer makes a single attempt to restart the VM on another host in the pool. It makes this attempt only after all protected VMs have been successfully restarted.
  When setting the value of order for the VMs, give the highest priority VMs the lowest start order. The default value for this setting is 0, and VMs with this start order are started first.
  xe vm-param-set uuid=<vm_uuid> ha-restart-priority=<priority> order=<start_order>
  You can instead use XenCenter to configure these settings on your VMs. For more information, see Start options.
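The VM preparation steps above can be wrapped in two small shell functions when you have several VMs to protect. This is a sketch under stated assumptions rather than a supported tool: the function names `eject_vm_cds` and `protect_vm` are hypothetical, and the first function assumes that adding `--minimal` to `xe vbd-list` prints only the VBD UUIDs as a comma-separated line.

```shell
# Eject every non-empty CD/DVD drive of a VM.
# Assumption: --minimal makes vbd-list print only the VBD UUIDs, comma-separated.
eject_vm_cds() {
    vm_uuid="$1"
    vbds="$(xe vbd-list type=CD empty=false vm-uuid="$vm_uuid" --minimal)" || return 1
    for vbd in $(echo "$vbds" | tr ',' ' '); do
        xe vbd-eject uuid="$vbd" || return 1
    done
}

# Set the restart priority and start order, rejecting priorities other than
# the two values described above.
protect_vm() {
    vm_uuid="$1"; priority="$2"; order="$3"
    case "$priority" in
        restart|best-effort) ;;
        *) echo "unsupported ha-restart-priority: $priority" >&2; return 1 ;;
    esac
    xe vm-param-set uuid="$vm_uuid" ha-restart-priority="$priority" order="$order"
}
```

For example, `protect_vm <vm_uuid> restart 0` marks a VM as protected and started first, while `protect_vm <vm_uuid> best-effort 2` gives another VM a single restart attempt with a later start order.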
3. Enable high availability on your XenServer pool
You can enable high availability on a pool by using either XenCenter or the command-line interface (CLI). For information about using XenCenter to enable high availability, see Enable high availability.
To enable high availability by using the xe CLI:
- To enable high availability on the pool and, optionally, specify a timeout, run the following command:
  xe pool-ha-enable heartbeat-sr-uuids=<sr uuid> ha-config:timeout=<timeout_in_seconds>
  The timeout configured by this command applies only to this enablement of high availability. If you don’t specify a timeout, the default is 60 seconds. To change this default timeout for your pool, see Configure the high availability timeout.
- Compute the maximum number of hosts that can fail before there are insufficient resources to run all the protected VMs in the pool by running the following command:
  xe pool-ha-compute-max-host-failures-to-tolerate
  The number returned is how many host failures are currently possible in the pool without loss of the liveness guarantee for protected VMs. This value can change as conditions in the pool change.
- Set the number of host failures to tolerate to a value that is less than or equal to the number returned by the previous step:
  xe pool-param-set ha-host-failures-to-tolerate=<failure_value> uuid=<pool uuid>
  The number of failures to tolerate determines when an alert is sent. The system recomputes a failover plan as the state of the pool changes. It uses this computation to identify the pool capacity and how many more failures are possible without loss of the liveness guarantee for protected VMs. A system alert is generated when this computed value falls below the specified value.
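The last two steps can be combined so that the configured value never exceeds the computed maximum. The following sketch is illustrative only: the function name is hypothetical, and it assumes that high availability is already enabled, that `xe pool-list --minimal` returns the UUID of the single pool, and that the compute command prints a bare integer.

```shell
# Set ha-host-failures-to-tolerate, capping the requested value at the
# maximum that the pool can currently tolerate.
set_failures_to_tolerate() {
    wanted="$1"
    pool_uuid="$(xe pool-list --minimal)" || return 1
    max="$(xe pool-ha-compute-max-host-failures-to-tolerate)" || return 1
    if [ "$wanted" -gt "$max" ]; then
        echo "requested $wanted exceeds computed maximum $max; using $max" >&2
        wanted="$max"
    fi
    xe pool-param-set uuid="$pool_uuid" ha-host-failures-to-tolerate="$wanted"
}
```

Because the computed maximum changes with pool conditions, a value accepted today might later exceed the maximum, at which point the system alert described above is raised.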
Configure the high availability timeout
The timeout is the period during which networking or storage is not accessible by the hosts in your pool. If any XenServer host is unable to access networking or storage within the timeout period, it can self-fence and restart. The default timeout is 60 seconds. However, you can change this value by setting a default high availability timeout for your pool:
xe pool-param-set uuid=<pool uuid> other-config:default_ha_timeout=<timeout in seconds>
If you enable high availability by using XenCenter instead of the xe CLI, this default still applies.
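If you script this setting, it can help to validate the value and read it back afterwards. A minimal sketch, assuming a hypothetical helper name and that the stored key can be read with `xe pool-param-get` by using `param-name=other-config` and `param-key=default_ha_timeout`:

```shell
# Set the pool default HA timeout and print the stored value.
set_default_ha_timeout() {
    pool_uuid="$1"; seconds="$2"
    # Accept only a whole number of seconds.
    case "$seconds" in
        ''|*[!0-9]*) echo "timeout must be a whole number of seconds" >&2; return 1 ;;
    esac
    xe pool-param-set uuid="$pool_uuid" other-config:default_ha_timeout="$seconds" || return 1
    # Read the value back to confirm what is stored.
    xe pool-param-get uuid="$pool_uuid" param-name=other-config param-key=default_ha_timeout
}
```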
Remove high availability protection from a VM
To disable high availability features for a VM, use the following command:
xe vm-param-set uuid=<vm_uuid> ha-restart-priority=
This command retains the start order settings. You can enable high availability for this VM again by setting the ha-restart-priority parameter to restart or best-effort as appropriate.
Recover an unreachable host
If a host cannot access the high availability state file, the host might become unreachable. To recover your XenServer installation, you might have to disable high availability by running the host-emergency-ha-disable command on the host:
xe host-emergency-ha-disable --force
If the host was the pool coordinator, it starts up as normal with high availability disabled. Pool members reconnect and automatically disable high availability. If the host was a pool member and cannot contact the pool coordinator, you might have to take one of the following actions:
- Force the host to reboot as a pool coordinator (xe pool-emergency-transition-to-master):
  xe pool-emergency-transition-to-master uuid=<host uuid>
- Tell the host where the new pool coordinator is (xe pool-emergency-reset-master):
  xe pool-emergency-reset-master master-address=<new pool coordinator hostname>
When all hosts have successfully restarted, re-enable high availability:
xe pool-ha-enable heartbeat-sr-uuids=<sr uuid>
Shut down a host when high availability is enabled
Take special care when shutting down or rebooting a host to prevent the high availability mechanism from assuming that the host has failed. To shut down a host cleanly when high availability is enabled, disable the host, evacuate the host, and finally shut down the host by using either XenCenter or the CLI. To do this by using the CLI, run these commands:
xe host-disable host=<host name>
xe host-evacuate uuid=<host uuid>
xe host-shutdown host=<host name>
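When you script this sequence, stop at the first failing step so that the host is never shut down while it still runs VMs. The following is a sketch with a hypothetical function name. It takes both the host name and the host UUID because the commands above use both forms.

```shell
# Disable, evacuate, then shut down a host, stopping at the first failure.
ha_safe_host_shutdown() {
    host_name="$1"; host_uuid="$2"
    xe host-disable host="$host_name" || return 1
    # If evacuation fails, return without shutting the host down.
    xe host-evacuate uuid="$host_uuid" || return 1
    xe host-shutdown host="$host_name"
}
```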
Shut down a VM protected by high availability
When a VM is protected under a high availability plan and set to restart automatically, it cannot be shut down while this protection is active. To shut down such a VM, first disable its high availability protection and then run the shutdown command:
xe vm-param-set uuid=<vm_uuid> ha-restart-priority=
xe vm-shutdown uuid=<vm_uuid>
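Clearing ha-restart-priority discards the previous value, so a script might record it first in order to restore the protection later. A minimal sketch, assuming a hypothetical function name and that `xe vm-param-get` with `param-name=ha-restart-priority` prints the current value:

```shell
# Remove HA protection from a VM, shut it down, and print the priority
# it had before, so that the caller can restore it later.
shutdown_protected_vm() {
    vm_uuid="$1"
    previous="$(xe vm-param-get uuid="$vm_uuid" param-name=ha-restart-priority)" || return 1
    xe vm-param-set uuid="$vm_uuid" ha-restart-priority= || return 1
    xe vm-shutdown uuid="$vm_uuid" || return 1
    echo "$previous"
}
```

After the VM is started again, the printed value can be passed back to `xe vm-param-set uuid=<vm_uuid> ha-restart-priority=<priority>` to restore its protection.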
XenCenter offers you a dialog box to automate disabling the protection when you select the Shutdown button of a protected VM.
Note:
If you shut down a VM from within the guest, and the VM is protected, it is automatically restarted under the high availability failure conditions. The automatic restart helps ensure that an operator error doesn’t result in a protected VM being left shut down accidentally. If you want to shut down this VM, disable its high availability protection first.