Thin-provisioned shared GFS2 block storage
Thin provisioning better utilizes the available storage by allocating disk storage space to VDIs as data is written to the virtual disk, rather than allocating the full virtual size of the VDI in advance. Thin provisioning enables you to significantly reduce the amount of space required on a shared storage array, and with that your Total Cost of Ownership (TCO).
Thin provisioning for shared block storage is of particular interest in the following cases:
- You want increased space efficiency. Images are sparsely allocated rather than thickly allocated.
- You want to reduce the number of I/O operations per second on your storage array. The GFS2 SR is the first SR type to support storage read caching on shared block storage.
- You use a common base image for multiple virtual machines. The images of individual VMs will then typically utilize even less space.
- You use snapshots. Each snapshot is an image and each image is now sparse.
- You want to create VDIs that are greater than 2 TiB in size. The GFS2 SR supports VDIs up to 16 TiB in size.
- Your storage doesn’t support NFS or SMB3 and only supports block storage. If your storage supports NFS or SMB3, we recommend you use these SR types instead of GFS2.
- Your storage doesn’t support thin provisioning of LUNs. If your storage does thin provision LUNs, you can encounter problems and run out of space when combining it with GFS2. Combining GFS2 with a thin-provisioned LUN does not provide many additional benefits and is not recommended.
Note:
We recommend that you do not use a GFS2 SR with a VLAN. A known issue prevents you from adding or removing hosts on a clustered pool if the cluster network is on a non-management VLAN.
The shared GFS2 type represents disks as a filesystem created on an iSCSI or HBA LUN. VDIs stored on a GFS2 SR are stored in the QCOW2 image format.
This article describes how to set up your GFS2 environment by using the xe CLI. To set up a GFS2 environment by using XenCenter, see the XenCenter product documentation.
1. Plan your GFS2 environment
To provide the benefits of thin provisioning on shared block storage without risk of data loss, your pool must deliver a good level of reliability and connectivity. It is crucial that the hosts in the resource pool that uses GFS2 can reliably communicate with one another. To ensure this, XenServer requires that you use a clustered pool with your GFS2 SR. We also recommend that you design your environment and configure XenServer features to provide as much resiliency and redundancy as possible.
Before setting up your XenServer pool to work with GFS2 SRs, review the following requirements and recommendations for an ideal GFS2 environment:
- Recommended: Configure redundant networking infrastructure
- Recommended: Create a dedicated bonded network
- Required: Set up a clustered pool
- Optional: Increase your control domain memory
- Recommended: Configure storage multipathing
- Required: Create a GFS2 SR
A clustered pool with GFS2 SRs behaves differently from other types of pool and SR in some respects. For more information, see Constraints.
2. Configure redundant networking infrastructure
A bonded network links two or more NICs together to create a single channel for network traffic. We recommend that you use a bonded network for your clustered pool traffic. However, before you set up your bonded network, ensure that your network hardware configuration promotes redundancy in the bonded network. Consider implementing as many of these recommendations as is feasible for your organization and environment.
The following best practices add resiliency against software, hardware, or power failures that can affect your network switches.
- Ensure that you have separate physical network switches available for use in the bonded network, not just ports on the same switch.
- Ensure that the separate switches draw power from different, independent power distribution units (PDUs).
- If possible, in your data center, place the PDUs on different phases of the power feed or even feeds provided by different utility companies.
- Consider using uninterruptible power supply units to ensure that the network switches and servers can continue to function or perform an orderly shutdown in the event of a power failure.
3. Create a dedicated bonded network
It is important to ensure that hosts in a clustered pool can communicate reliably with one another. Creating a bonded network for this pool traffic increases the resiliency of your clustered pool.
A bonded network creates a bond between two or more NICs to create a single, high-performing channel that your clustered pool can use for cluster heartbeat traffic. We strongly recommend that this bonded network is not used for any other traffic. Create a separate network for the pool to use for management traffic.
Warning:
If you choose not to follow this recommendation, you are at a higher risk of losing cluster management network packets. Loss of cluster management network packets can cause your clustered pool to lose quorum, in which case some or all hosts in the pool self-fence.
If your cluster is fencing or facing a problem in this unrecommended configuration, XenServer Support might ask you to reproduce the same problem on a recommended configuration during the course of investigation.
To create a bonded network to use as the clustering network:
- If you have a firewall between the hosts in your pool, ensure that hosts can communicate on the cluster network using the following ports:
  - TCP: 8892, 8896, 21064
  - UDP: 5404, 5405
  For more information, see Communication ports used by XenServer.
- Open a console on the XenServer host that you want to act as the pool coordinator.
- Create a network for use with the bonded NIC by using the following command:
  xe network-create name-label=bond0
  The UUID of the new network is returned.
- Find the UUIDs of the PIFs to use in the bond by using the following command:
  xe pif-list
- Create your bonded network in either active-active mode, active-passive mode, or LACP bond mode. Depending on the bond mode you want to use, complete one of the following actions:
  - To configure the bond in active-active mode (default), use the bond-create command to create the bond. Specify the newly created network UUID and the UUIDs of the PIFs to be bonded, using commas to separate the PIF UUIDs:
    xe bond-create network-uuid=<network_uuid> \
    pif-uuids=<pif_uuid_1>,<pif_uuid_2>,<pif_uuid_3>,<pif_uuid_4>
    Type two UUIDs when you are bonding two NICs and four UUIDs when you are bonding four NICs. The UUID for the bond is returned after running the command.
  - To configure the bond in active-passive or LACP bond mode, use the same syntax, add the optional mode parameter, and specify lacp or active-backup:
    xe bond-create network-uuid=<network_uuid> \
    pif-uuids=<pif_uuid_1>,<pif_uuid_2>,<pif_uuid_3>,<pif_uuid_4> \
    mode=<balance-slb | active-backup | lacp>
- After you create your bonded network on the pool coordinator, when you join other XenServer hosts to the pool, the network and bond information is automatically replicated to the joining server.
For more information, see Networking.
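The network and bond steps above can be chained in a small script. The following is a minimal sketch, not a definitive procedure: the function name create_cluster_bond is hypothetical, and it assumes that you have already looked up the PIF UUIDs with xe pif-list.

```shell
#!/bin/sh
# Sketch: create the clustering network, then bond the given PIFs onto it.
# create_cluster_bond is an illustrative helper name, not part of the xe CLI.

create_cluster_bond() {
    # $1 = bond mode: balance-slb, active-backup, or lacp
    # $2 = comma-separated PIF UUIDs, for example "<pif_uuid_1>,<pif_uuid_2>"
    bond_mode=$1
    bond_pifs=$2

    # network-create prints the UUID of the new network.
    bond_network=$(xe network-create name-label=bond0) || return 1

    # bond-create prints the UUID of the new bond.
    xe bond-create network-uuid="$bond_network" \
        pif-uuids="$bond_pifs" mode="$bond_mode"
}

# Example call (run on the pool coordinator):
#   create_cluster_bond active-backup "<pif_uuid_1>,<pif_uuid_2>"
```

Capturing the network UUID in a variable avoids copying it by hand between the two commands.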
Note:
- Changing the IP address of the cluster network by using XenCenter requires clustering and GFS2 to be temporarily disabled.
- Do not change the bonding of your clustering network while the cluster is live and has running VMs. This action can cause hosts in the cluster to hard restart (fence).
- If you have an IP address conflict (multiple hosts having the same IP address) on your clustering network involving at least one host with clustering enabled, the cluster does not form correctly and the hosts are unable to fence when required. To fix this issue, resolve the IP address conflict.
To test your active-passive bonded network failover times:
For bonded networks that use active-passive mode, if the active link fails, there is a failover period when the network link is broken while the passive link becomes active. If the time it takes for your active-passive bonded network to fail over is longer than the cluster timeout, some or all hosts in your clustered pool might still fence.
You can test your bonded network failover time by forcing the network to fail over by using one of the following methods:
- By physically pulling out the network cables
- By disabling switch ports on one network link
Repeat the test a number of times to ensure the result is consistent.
The cluster timeout value of your pool depends on how many hosts are in your cluster. Run the following command to find the token-timeout value in seconds for the pool:
xe cluster-param-get uuid=<cluster_uuid> param-name=token-timeout
If the failover time is likely to be greater than the timeout value, your network infrastructure and configuration might not be reliable enough to support a clustered pool.
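To compare a measured failover time with the timeout, a small helper can do the arithmetic. This is a sketch under assumptions: the function name failover_within_timeout is hypothetical, you supply the failover time from your own tests, and both values are in seconds.

```shell
#!/bin/sh
# Sketch: succeed only when the measured failover time is shorter than the
# cluster token-timeout. Both arguments are in seconds and may be fractional.

failover_within_timeout() {
    # $1 = measured failover time, $2 = token-timeout value
    awk -v f="$1" -v t="$2" 'BEGIN { exit !(f + 0 < t + 0) }'
}

# Example use, assuming the pool has a single cluster object:
#   cluster_uuid=$(xe cluster-list --minimal)
#   timeout=$(xe cluster-param-get uuid="$cluster_uuid" param-name=token-timeout)
#   if failover_within_timeout 3.5 "$timeout"; then
#       echo "failover completes within the cluster timeout"
#   else
#       echo "failover is too slow for this cluster"
#   fi
```

A failover time equal to the timeout is treated as too slow, which errs on the side of caution.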
4. Set up a clustered pool
To use shared GFS2 storage, the XenServer resource pool must be a clustered pool. Enable clustering on your pool before creating a GFS2 SR.
A clustered pool is a pool of XenServer hosts that are more closely connected and coordinated than hosts in non-clustered pools. The hosts in the cluster maintain constant communication with each other on a selected network. All hosts in the cluster are aware of the state of every host in the cluster. This host coordination enables the cluster to control access to the contents of the GFS2 SR. To ensure that the clustered pool always remains in communication, each host in a cluster must always be in communication with at least half of the hosts in the cluster (including itself). This state is known as a host having quorum. If a host does not have quorum, it hard restarts and removes itself from the cluster. This action is referred to as ‘fencing’.
For more information, see Clustered pools.
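The quorum rule described above reduces to simple arithmetic. The following sketch uses hypothetical function names and applies the rule exactly as stated (a host needs contact with at least half of the hosts, counting itself). It does not model how the cluster resolves an exactly even split.

```shell
#!/bin/sh
# Sketch of the quorum rule as stated in this article.

has_quorum() {
    # $1 = hosts this host can reach, counting itself
    # $2 = total hosts in the cluster
    [ $(( $1 * 2 )) -ge "$2" ]
}

min_hosts_for_quorum() {
    # Smallest group size (including the host itself) that is at least half
    # of a cluster with $1 hosts.
    echo $(( ($1 + 1) / 2 ))
}

# Example: in a 5-host cluster, a host that can reach only one other host
# (2 hosts in total) does not have quorum and fences.
#   has_quorum 2 5 || echo "no quorum"
```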
Before you start setting up your clustered pool, ensure that the following prerequisites are met:
- Plan to create a pool of between 3 and 16 hosts.
  Where possible, use an odd number of hosts in a clustered pool as this ensures that hosts are always able to determine if they have quorum. We recommend that you use clustering only in pools containing at least three hosts, as pools of two hosts are prone to self-fencing the entire pool.
  Clustered pools only support up to 16 hosts per pool.
- All XenServer hosts in the clustered pool must have at least 2 GiB of control domain memory.
- All hosts in the cluster must use static IP addresses for the cluster network.
- If you are clustering an existing pool, ensure that high availability is disabled. You can enable high availability again after clustering is enabled.
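The host-count prerequisite can be checked before you enable clustering. This is a minimal sketch with a hypothetical function name. It covers only the pool size, not the memory, IP address, or high availability prerequisites.

```shell
#!/bin/sh
# Sketch: check a planned host count against the 3 to 16 host range and the
# odd-number recommendation from this article.

check_cluster_pool_size() {
    # $1 = number of hosts planned for the clustered pool
    pool_size=$1
    if [ "$pool_size" -lt 3 ] || [ "$pool_size" -gt 16 ]; then
        echo "outside the planned range of 3 to 16 hosts"
        return 1
    fi
    if [ $(( pool_size % 2 )) -eq 0 ]; then
        echo "ok: consider an odd number of hosts"
    else
        echo "ok"
    fi
}

# Example use on an existing pool (host-list --minimal prints a
# comma-separated list of host UUIDs):
#   count=$(xe host-list --minimal | tr ',' '\n' | grep -c .)
#   check_cluster_pool_size "$count"
```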
To use the xe CLI to create a clustered pool:
- Create a resource pool of at least three XenServer hosts.
  Repeat the following steps on each joining XenServer host that is not the pool coordinator:
  - Open a console on the XenServer host.
  - Join the XenServer host to the pool on the pool coordinator by using the following command:
    xe pool-join master-address=<master_address> \
    master-username=<administrators_username> \
    master-password=<password>
    The value of the master-address parameter must be set to the fully qualified domain name of the XenServer host that is the pool coordinator. The password must be the administrator password set when the pool coordinator was installed.
  For more information, see Hosts and resource pools.
- For every PIF that belongs to the clustering network, set disallow-unplug=true.
  - Find the UUIDs of the PIFs that belong to the network by using the following command:
    xe pif-list
  - Run the following command on a XenServer host in your resource pool:
    xe pif-param-set disallow-unplug=true uuid=<pif_uuid>
- Enable clustering on your pool. Run the following command on a XenServer host in your resource pool:
  xe cluster-pool-create network-uuid=<network_uuid>
  Provide the UUID of the bonded network that you created in an earlier step.
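The last two steps can be combined so that every PIF on the clustering network is protected before clustering is enabled. The following is a sketch rather than a definitive script: the function name enable_clustering is hypothetical, and it assumes the network UUID is the one returned when you created the bonded network.

```shell
#!/bin/sh
# Sketch: set disallow-unplug=true on every PIF attached to the clustering
# network, then enable clustering on the pool.

enable_clustering() {
    # $1 = UUID of the bonded network created earlier
    cluster_network=$1

    # --minimal prints the matching PIF UUIDs as one comma-separated line.
    cluster_pifs=$(xe pif-list network-uuid="$cluster_network" --minimal) || return 1

    for pif_uuid in $(printf '%s' "$cluster_pifs" | tr ',' ' '); do
        xe pif-param-set disallow-unplug=true uuid="$pif_uuid" || return 1
    done

    xe cluster-pool-create network-uuid="$cluster_network"
}

# Example call (run on a host in the pool after all hosts have joined):
#   enable_clustering "<network_uuid>"
```

Filtering xe pif-list by network-uuid keeps the loop from touching PIFs on unrelated networks.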
5. Increase your control domain memory
If you have insufficient control domain memory on your hosts, your pool can experience network instability. Network instability can cause problems for a clustered pool with GFS2 SRs.
It is important to ensure that your clustered pool has an appropriate amount of control domain memory. For information about changing the amount of control domain memory and monitoring the memory behavior, see Memory usage.
6. Configure storage multipathing
Ensure that storage multipathing is set up between your clustered pool and your GFS2 SR.
Multipathing routes storage traffic to a storage device over multiple paths for redundancy. All routes can have active traffic on them during normal operation, which results in increased throughput.
Before enabling multipathing, verify that the following statements are true:
- Your ethernet or fibre switch is configured to make multiple targets available on your storage server.
  For example, an iSCSI storage back-end queried for sendtargets on a given portal returns multiple targets, as in the following example:
  iscsiadm -m discovery --type sendtargets --portal 192.168.0.161
  192.168.0.161:3260,1 iqn.strawberry:litchie
  192.168.0.204:3260,2 iqn.strawberry:litchie
  However, you can perform additional configuration to enable iSCSI multipath for arrays that only expose a single target. For more information, see iSCSI multipath for arrays that only expose a single target.
- For iSCSI only, the control domain (dom0) has an IP address on each subnet used by the multipathed storage.
  Ensure that for each path to the storage, you have a NIC and that there is an IP address configured on each NIC. For example, if you want four paths to your storage, you must have four NICs that each have an IP address configured.
- For iSCSI only, every iSCSI target and initiator has a unique IQN.
- For iSCSI only, the iSCSI target ports are operating in portal mode.
- For HBA only, multiple HBAs are connected to the switch fabric.
- If possible, use multiple redundant switches.
To enable multipathing by using the xe CLI
We recommend that you enable multipathing for all hosts in your pool before creating the SR. If you create the SR before enabling multipathing, you must put your hosts into maintenance mode to enable multipathing.
- Open a console on the XenServer host.
- Unplug all PBDs on the host by using the following command:
  xe pbd-unplug uuid=<pbd_uuid>
  You can use the command xe pbd-list to find the UUID of the PBDs.
- Set the value of the multipathing parameter to true by using the following command:
  xe host-param-set uuid=<host_uuid> multipathing=true
- If there are existing SRs on the hosts running in single path mode that have multiple paths:
  - Migrate or suspend any running guests with virtual disks in the affected SRs.
  - Replug the PBD of any affected SRs to reconnect them using multipathing:
    xe pbd-plug uuid=<pbd_uuid>
- Repeat these steps to enable multipathing on all hosts in the pool.
Ensure that you enable multipathing on all hosts in the pool. All cabling and, in the case of iSCSI, subnet configurations must match the corresponding NICs on each host.
For more information, see Storage multipathing.
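For one host, the unplug, enable, and replug sequence can be sketched as a single function. Treat this as an illustration under assumptions: the function name is hypothetical, it replugs every PBD it unplugged, and it expects that you have already migrated or suspended any guests that use the affected SRs.

```shell
#!/bin/sh
# Sketch: enable multipathing on one host by unplugging its PBDs, setting the
# host parameter, and plugging the same PBDs again.

enable_multipathing_on_host() {
    # $1 = UUID of the host
    mp_host=$1

    # --minimal prints the host's PBD UUIDs as one comma-separated line.
    mp_pbds=$(xe pbd-list host-uuid="$mp_host" --minimal) || return 1
    mp_pbds=$(printf '%s' "$mp_pbds" | tr ',' ' ')

    for pbd_uuid in $mp_pbds; do
        xe pbd-unplug uuid="$pbd_uuid" || return 1
    done

    xe host-param-set uuid="$mp_host" multipathing=true || return 1

    for pbd_uuid in $mp_pbds; do
        xe pbd-plug uuid="$pbd_uuid" || return 1
    done
}

# Example call, repeated for each host in the pool:
#   enable_multipathing_on_host "<host_uuid>"
```

The function stops at the first failing command so that a PBD that cannot be unplugged is noticed before the host setting changes.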
7. Create a GFS2 SR
Create your shared GFS2 SR on an iSCSI or an HBA LUN that is visible to all XenServer hosts in your resource pool. We do not recommend using a thin-provisioned LUN with GFS2. However, if you do choose this configuration, you must ensure that the LUN always has enough space to allow XenServer to write to it.
You can add up to 62 GFS2 SRs to a clustered pool.
If you have previously used your block-based storage device for thick provisioning with LVM, this is detected by XenServer. XenCenter gives you the opportunity to use the existing LVM partition or to format the disk and set up a GFS2 partition.
Create a shared GFS2 over iSCSI SR
You can create GFS2 over iSCSI SRs by using XenCenter. For more information, see Software iSCSI storage in the XenCenter product documentation.
Alternatively, you can use the xe CLI to create a GFS2 over iSCSI SR.
Device-config parameters for GFS2 SRs:
Parameter name | Description | Required? |
---|---|---|
provider | The block provider implementation. In this case, iscsi. | Yes |
target | The IP address or hostname of the iSCSI filer that hosts the SR | Yes |
targetIQN | The target IQN of the iSCSI filer that hosts the SR | Yes |
SCSIid | Device SCSI ID | Yes |
You can find the values to use for these parameters by using the xe sr-probe-ext command.
xe sr-probe-ext type=<type> host-uuid=<host_uuid> device-config:=<config> sm-config:=<sm_config>
- Start by running the following command:
  xe sr-probe-ext type=gfs2 device-config:provider=iscsi
  The output from the command prompts you to supply additional parameters and gives a list of possible values at each step.
- Repeat the command, adding new parameters each time.
- When the command output starts with "Found the following complete configurations that can be used to create SRs:", you can create the SR by using the xe sr-create command and the device-config parameters that you specified.
  Example output:
  Found the following complete configurations that can be used to create SRs:
  Configuration 0:
    SCSIid : 36001405852f77532a064687aea8a5b3f
    targetIQN: iqn.2009-01.example.com:iscsi192a25d6
    target: 198.51.100.27
    provider: iscsi
  Configuration 0 extra information:
To create a shared GFS2 SR on a specific LUN of an iSCSI target, run the following command on a server in your clustered pool:
xe sr-create type=gfs2 name-label="Example GFS2 SR" --shared \
device-config:provider=iscsi device-config:targetIQN=<target_iqns> \
device-config:target=<portal_address> device-config:SCSIid=<scsi_id>
If the iSCSI target is not reachable while GFS2 filesystems are mounted, some hosts in the clustered pool might hard restart (fence).
For more information about working with iSCSI SRs, see Software iSCSI support.
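When scripting the probe and create steps, you can pull individual values out of saved probe output. The following sketch assumes the output lists one "name: value" pair per line, as in the example above; the function name probe_field and the file name probe.txt are hypothetical.

```shell
#!/bin/sh
# Sketch: read sr-probe-ext output on stdin and print the value of one field.
# Only the first matching line is used.

probe_field() {
    # $1 = field name, for example SCSIid, targetIQN, or target
    awk -v key="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)          # drop leading indentation
            n = index(line, ":")              # split at the first colon only
            if (n == 0 || found) next
            name = substr(line, 1, n - 1)
            sub(/[ \t]+$/, "", name)
            if (name == key) {
                value = substr(line, n + 1)
                sub(/^[ \t]+/, "", value)
                print value
                found = 1
            }
        }'
}

# Example use with probe output saved to a file:
#   scsi_id=$(probe_field SCSIid < probe.txt)
#   target_iqn=$(probe_field targetIQN < probe.txt)
#   portal=$(probe_field target < probe.txt)
#   xe sr-create type=gfs2 name-label="Example GFS2 SR" --shared \
#       device-config:provider=iscsi device-config:targetIQN="$target_iqn" \
#       device-config:target="$portal" device-config:SCSIid="$scsi_id"
```

Splitting at the first colon only keeps IQN values, which contain colons themselves, intact.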
Create a shared GFS2 over HBA SR
You can create GFS2 over HBA SRs by using XenCenter. For more information, see Hardware HBA storage in the XenCenter product documentation.
Alternatively, you can use the xe CLI to create a GFS2 over HBA SR.
Device-config parameters for GFS2 SRs:
Parameter name | Description | Required? |
---|---|---|
provider | The block provider implementation. In this case, hba. | Yes |
SCSIid | Device SCSI ID | Yes |
You can find the values to use for the SCSIid parameter by using the xe sr-probe-ext command.
xe sr-probe-ext type=<type> host-uuid=<host_uuid> device-config:=<config> sm-config:=<sm_config>
- Start by running the following command:
  xe sr-probe-ext type=gfs2 device-config:provider=hba
  The output from the command prompts you to supply additional parameters and gives a list of possible values at each step.
- Repeat the command, adding new parameters each time.
- When the command output starts with "Found the following complete configurations that can be used to create SRs:", you can create the SR by using the xe sr-create command and the device-config parameters that you specified.
  Example output:
  Found the following complete configurations that can be used to create SRs:
  Configuration 0:
    SCSIid : 36001405852f77532a064687aea8a5b3f
    targetIQN: iqn.2009-01.example.com:iscsi192a25d6
    target: 198.51.100.27
    provider: iscsi
  Configuration 0 extra information:
To create a shared GFS2 SR on a specific LUN of an HBA target, run the following command on a server in your clustered pool:
xe sr-create type=gfs2 name-label="Example GFS2 SR" --shared \
device-config:provider=hba device-config:SCSIid=<device_scsi_id>
For more information about working with HBA SRs, see Hardware host bus adapters.
What’s next?
Now that you have your GFS2 environment set up, it is important that you maintain the stability of your clustered pool by ensuring it has quorum. For more information, see Manage your clustered pool.
If you encounter issues with your GFS2 environment, see Troubleshoot clustered pools.
You can manage your GFS2 SR the same way as you do other SRs. For example, you can add capacity to the storage array to increase the size of the LUN. For more information, see Live LUN expansion.
Constraints
Shared GFS2 storage currently has the following constraints:
- As with any thin-provisioned SR, if the GFS2 SR usage grows to 100%, further writes from VMs fail. These failed writes can then lead to failures within the VM, possible data corruption, or both.
- XenCenter shows an alert when your SR usage grows to 80%. Ensure that you monitor your GFS2 SR for this alert and take the appropriate action if seen. On a GFS2 SR, high usage causes a performance degradation. We recommend that you keep your SR usage below 80%.
- VM migration with storage migration (live or offline) is not supported for VMs whose VDIs are on a GFS2 SR. You also cannot migrate VDIs from another type of SR to a GFS2 SR.
- The Software FCoE transport is not supported with GFS2 SRs (for fully offloaded FCoE use HBA).
- Trim/unmap is not supported on GFS2 SRs.
- CHAP is not supported on GFS2 SRs.
- You cannot export VDIs that are greater than 2 TiB as VHD or OVA/OVF. However, you can export VMs with VDIs larger than 2 TiB in XVA format.
- We do not recommend using a thin-provisioned LUN with GFS2. However, if you do choose this configuration, you must ensure that the LUN always has enough space to allow XenServer to write to it.
- We do not recommend using SAN deduplication with GFS2 SRs. However, if you do choose this configuration, you must use suitable external monitoring of your SAN utilization to ensure that there is always space for XenServer to write to.
- Your GFS2 file system cannot be larger than 100 TiB.
- You cannot have more than 62 GFS2 SRs in your pool.
- Clustered pools only support up to 16 hosts per pool.
- To enable HA on your clustered pool, the heartbeat SR must be a GFS2 SR.
- For cluster traffic, we strongly recommend that you use a bonded network that uses at least two different network switches. Do not use this network for any other purposes.
- Changing the IP address of the cluster network by using XenCenter requires clustering and GFS2 to be temporarily disabled.
- Do not change the bonding of your clustering network while the cluster is live and has running VMs. This action can cause hosts in the cluster to hard restart (fence).
- If you have an IP address conflict (multiple hosts having the same IP address) on your clustering network involving at least one host with clustering enabled, the cluster does not form correctly and the hosts are unable to fence when required. To fix this issue, resolve the IP address conflict.
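To watch the 80% usage recommendation from a script as well as from XenCenter alerts, you can compute the usage from the SR size fields. This is a minimal sketch with hypothetical function names. It assumes you read the physical-utilisation and physical-size parameters of the SR, both reported in bytes.

```shell
#!/bin/sh
# Sketch: compute SR usage as a whole-number percentage (rounded down) and
# check it against the recommended 80% ceiling.

sr_usage_percent() {
    # $1 = physical-utilisation in bytes, $2 = physical-size in bytes
    if [ "$2" -le 0 ]; then
        return 1
    fi
    echo $(( $1 * 100 / $2 ))
}

sr_usage_ok() {
    # Succeeds only when usage is below 80%.
    usage_pct=$(sr_usage_percent "$1" "$2") || return 1
    [ "$usage_pct" -lt 80 ]
}

# Example use:
#   used=$(xe sr-param-get uuid=<sr_uuid> param-name=physical-utilisation)
#   size=$(xe sr-param-get uuid=<sr_uuid> param-name=physical-size)
#   sr_usage_ok "$used" "$size" || echo "GFS2 SR usage is at or above 80%"
```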