Disaster Recovery

Disaster Recovery as a Service (DRaaS) for comprehensive infrastructure protection and automated failover.

Overview

Xelon HQ Disaster Recovery provides a fully managed DRaaS solution that protects your virtual infrastructure against site-level failures. By defining resource pools at a secondary site and importing VMs for DR protection, you can ensure rapid recovery with tested failover procedures and minimal data loss.

DRaaS components

Disaster Recovery builds on the replication engine and adds orchestrated failover, resource pool management, and test failover capabilities for enterprise-grade protection.

Setting Up Disaster Recovery

Access the DR dashboard

Navigate to Virtual Datacenter > Disaster Recovery in the sidebar to open the DR management interface.

Add a resource pool

Click Add Resource Pool to allocate compute and storage resources at the recovery site. Resource pools reserve capacity to ensure resources are available when a disaster occurs.

Import replicas

The dashboard displays available replicas from your primary site. Click on a replica to configure its DR settings, including network mapping and replication parameters.

Defining Resource Pools

Resource pools ensure that sufficient compute, memory, and storage resources are pre-allocated at the recovery site.

| Resource | Description | Sizing Guidance |
|----------|-------------|-----------------|
| CPU | Number of vCPU cores reserved for DR workloads. | Match or exceed the total vCPUs of protected VMs. |
| Memory | RAM allocated for DR VMs at the recovery site. | Match the memory allocation of protected VMs. |
| Storage | Disk capacity for replicated VM data. | Account for current disk usage plus growth. |

Resource pool sizing

If a resource pool is under-provisioned, some VMs may fail to start during a failover. Regularly review and adjust resource pool allocations as your infrastructure grows.
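
The sizing guidance above can be turned into a quick capacity check. The following is a minimal sketch, not part of Xelon HQ: the `VmSpec` and `ResourcePool` structures, the function names, and the 20% default storage growth margin are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSpec:
    """Resources of one protected VM (hypothetical structure)."""
    name: str
    vcpus: int
    memory_gb: int
    disk_used_gb: float


@dataclass(frozen=True)
class ResourcePool:
    """Capacity reserved at the recovery site (hypothetical structure)."""
    vcpus: int
    memory_gb: int
    storage_gb: float


def required_pool(vms, storage_growth=0.2):
    """Sum the resources of all protected VMs.

    Storage is padded by storage_growth, an assumed growth margin.
    """
    return ResourcePool(
        vcpus=sum(vm.vcpus for vm in vms),
        memory_gb=sum(vm.memory_gb for vm in vms),
        storage_gb=sum(vm.disk_used_gb for vm in vms) * (1 + storage_growth),
    )


def shortfalls(pool, needed):
    """Return a dict of resources where the pool is smaller than needed."""
    gaps = {}
    if pool.vcpus < needed.vcpus:
        gaps["vcpus"] = needed.vcpus - pool.vcpus
    if pool.memory_gb < needed.memory_gb:
        gaps["memory_gb"] = needed.memory_gb - pool.memory_gb
    if pool.storage_gb < needed.storage_gb:
        gaps["storage_gb"] = needed.storage_gb - pool.storage_gb
    return gaps
```

An empty result from `shortfalls` means the pool covers every protected VM under the assumed growth margin. Rerunning the check whenever VMs are added or resized is one way to carry out the regular review recommended above.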

Managing DR Replicas

To manage VMs in your disaster recovery plan:

View available replicas

The DR dashboard lists replicas from your primary site. Each replica shows its name, status, and the resource pool it is assigned to.

Configure replication settings

For each VM, set the replication frequency and configure the network mapping, which defines how the VM will be connected at the recovery site.

Start protection

Confirm the import to begin initial replication. The first sync can take a long time, depending on VM size and available bandwidth.
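
To get a rough idea of how long the first sync might take, you can divide the amount of data by the usable bandwidth. The sketch below is a back-of-the-envelope estimate only: the function name and the `efficiency` factor (the assumed share of nominal bandwidth that replication can actually use) are illustrative, and the calculation ignores compression, deduplication, and changes made to the VM during the transfer.

```python
def estimate_initial_sync_hours(disk_used_gb, bandwidth_mbps, efficiency=0.8):
    """Estimate the hours needed to copy disk_used_gb over a link.

    disk_used_gb   -- data to transfer, in decimal gigabytes
    bandwidth_mbps -- nominal link speed in megabits per second
    efficiency     -- assumed usable fraction of the nominal bandwidth
    """
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    total_bits = disk_used_gb * 8 * 1000**3           # gigabytes to bits
    usable_bits_per_second = bandwidth_mbps * 1000**2 * efficiency
    return total_bits / usable_bits_per_second / 3600  # seconds to hours
```

For example, 450 GB over a fully usable 1 Gbit/s link works out to about one hour; with the default 80% efficiency the same transfer takes about 1.25 hours.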

Testing Failover

Regular failover testing validates your DR plan without affecting production workloads.

Initiate a test failover

From the DR dashboard, select the VMs to test and click Test Failover. This creates temporary copies of the replicated VMs at the recovery site.

Validate the recovery

Connect to the test VMs and verify that applications, data, and network connectivity are functioning correctly.

Clean up test resources

After validation, click Clean Up Test to remove the temporary VMs and release resources. Production replication continues uninterrupted throughout the test.

Non-disruptive testing

Test failovers use isolated copies and do not affect your production VMs or ongoing replication. Schedule tests at least quarterly.
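
If you track test dates outside the dashboard, a small check can flag when the quarterly test is overdue. This is a minimal sketch under stated assumptions: the function name is hypothetical, and "quarterly" is approximated as 92 days.

```python
from datetime import date, timedelta

# Assumed length of one quarter; adjust to match your own policy.
MAX_TEST_INTERVAL = timedelta(days=92)


def is_test_overdue(last_test, today, max_interval=MAX_TEST_INTERVAL):
    """Return True if more than max_interval has passed since last_test."""
    return today - last_test > max_interval
```

Passing `today` explicitly, rather than calling `date.today()` inside the function, keeps the check easy to test and to run against historical dates.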

Recovery Procedures

In the event of a real disaster, follow these steps to activate your DR plan:

Assess the situation

Determine the scope of the failure and confirm that the primary site is unavailable. Verify that replication was current before the failure.

Declare a disaster

From the DR dashboard, click Failover to begin activating VMs at the recovery site.

Activate recovery VMs

The system will start the protected VMs at the recovery site using the most recent replicated data. Network configurations defined during import are applied automatically.

Verify and communicate

Validate that all services are running and notify stakeholders. Update DNS records or external configurations as needed.

Plan for failback

Once the primary site is restored, plan the failback process to return workloads and re-establish replication.

Post-failover replication

After a failover, replication from the original primary site stops. You must reconfigure replication once the primary site is available again to restore full DR protection.
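
The "Assess the situation" step asks you to verify that replication was current before the failure. One way to reason about this is to compare the time of the last successful replication with the time of the failure. The sketch below assumes you have both timestamps available; the function names and the idea of comparing against the configured replication interval are illustrative and not part of the product.

```python
from datetime import datetime, timedelta


def data_loss_window(last_replication, failure_time):
    """Return the time span of changes that were not replicated."""
    if failure_time < last_replication:
        raise ValueError("failure_time is earlier than last_replication")
    return failure_time - last_replication


def replication_was_current(last_replication, failure_time, replication_interval):
    """Return True if the last sync happened within one replication interval."""
    window = data_loss_window(last_replication, failure_time)
    return window <= replication_interval
```

A window larger than the configured replication frequency suggests that one or more syncs were missed before the failure, so more data than expected may need to be recovered by other means.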