XAVS — Advanced Virtualization Solutions
Product details on xloud.tech
Instance High Availability
User Guide
Understand protection segments, instance protection policies, and how to monitor
recovery workflows for your running workloads.
Admin Guide
Configure failover segments, host and instance monitors, notification drivers, and
integrate Instance HA with your compute cluster.
CLI Reference
Complete command reference for managing failover segments, hosts, and recovery
notifications using the openstack CLI.
Compute Service
Xloud Compute provides the hypervisor layer that Instance HA monitors and manages
during host failover events.
Key Capabilities
Host Failure Detection
IPMI and SSH-based monitors detect unreachable hosts in seconds and immediately
trigger evacuation of all protected instances.
Automatic Instance Recovery
Failed instances are automatically restarted on healthy hosts within the same
protection segment, respecting affinity rules.
Reserved Host Failover
Designate standby compute hosts that remain idle until a failover event occurs —
guaranteeing resource availability for recovery.
Protection Segments
Group hosts and instances into logical fault domains. Each segment has its own
recovery policy, monitors, and notification targets.
Notification Drivers
Integrate with IPMI, SSH, and custom notification sources to receive precise
fault signals from infrastructure monitoring tools.
Audit Trail
Every recovery event is logged with timestamps, affected instances, and resolution
outcomes — fully queryable via the Dashboard and CLI.
How It Works
Platform Resilience
Xloud-Developed — These resilience capabilities are developed by Xloud and ship with XAVS.
VM High Availability
Automatic VM restart on host failure. Configurable per-VM priority. Failover segments for per-group recovery policies. Requires XAVS.
Power Recovery Automation
9-phase automated recovery playbook. Target recovery time: 7-13 minutes. Sequential service startup with health gates between each phase. Requires XAVS.
Container Self-Healing
Three-tier autoheal daemon with dependency-aware restart ordering. Circuit breaker pattern prevents restart loops. Exponential backoff. Requires XAVS.
Proactive Monitoring
Pre-configured alert rules across 13 groups covering storage, database, message queue, compute, networking, containers, APIs, system resources, disk, memory, security, and capacity. Predictive alerts for capacity forecasting. Requires XAVS.
Network Resilience
L3 high availability and DHCP high availability with sub-3-second failover. Automatic ARP gratuitous announcements for fast VIP convergence. Requires XAVS.
Rolling Upgrades with Rollback
Per-service container upgrades with 2-10 second swap time. Canary deployment (first node only). Image tag rollback mechanism. Previous images cached locally. Requires XAVS.
Related Services
Xloud Compute
The hypervisor layer monitored and managed by Instance HA
Resource Optimizer
Rebalances workloads after recovery to restore cluster efficiency
Xloud Block Storage
Persistent volumes that survive host failover when using shared storage