Overview
Host monitors are the detection layer of Instance HA. They continuously poll registered compute hosts and emit fault notifications when a host becomes unreachable. Xloud Instance HA supports two monitor types: IPMI (out-of-band, recommended for production) and SSH (in-band, for environments without IPMI access). This page covers configuration for both types and the timing parameters that control detection sensitivity.
Monitor Types
IPMI (Recommended)
Uses out-of-band management hardware. Detects failures even when the host OS,
kernel, or all network interfaces are completely unresponsive.
SSH
Attempts an SSH TCP connection to the host. Simpler to set up but dependent on
the host network stack: a network-only fault can look like a host failure, and a
hardware fault that leaves the SSH port reachable goes undetected.
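As a minimal sketch of what an in-band TCP probe of this kind looks like, the function below opens a plain TCP connection to the SSH port and reports whether it succeeded. The function name and the 5-second timeout are illustrative assumptions, not the monitor's actual implementation or defaults.

```python
import socket


def ssh_port_reachable(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Illustrative only: the name and the timeout value are assumptions.
    """
    try:
        # create_connection performs the TCP handshake; no SSH login is attempted.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

A probe like this only shows that something is accepting connections on the port, which is why it cannot see failures below the network stack.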
IPMI Host Monitor
The IPMI monitor uses the host’s control_attributes JSON field, set when registering
the host in a segment.
Configure IPMI Credentials
- Dashboard
- CLI
When adding a host to a segment, set the Control Attributes field to a JSON
object with the IPMI endpoint:
IPMI control attributes
| Key | Description |
|---|---|
| host | IPMI management IP address |
| username | IPMI user with chassis status read permissions |
| password | IPMI user password |
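For illustration, a control attributes object with the three keys from the table above could be assembled and sanity-checked as follows. This is a sketch under assumptions: the helper name and the sample values are hypothetical, and any keys or validation rules beyond those in the table are not confirmed by this page.

```python
import ipaddress
import json


def ipmi_control_attributes(host: str, username: str, password: str) -> str:
    """Return a JSON object string with the host, username, and password keys.

    Hypothetical helper: only the three key names come from the table above.
    """
    ipaddress.ip_address(host)  # raises ValueError if host is not an IP address
    if not username or not password:
        raise ValueError("username and password must be non-empty")
    return json.dumps({"host": host, "username": username, "password": password})


# Placeholder values only; 192.0.2.0/24 is a reserved documentation range.
print(ipmi_control_attributes("192.0.2.10", "ha-monitor", "example-password"))
```

Generating the string programmatically avoids quoting mistakes when the value is pasted into the Control Attributes field.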
Validate IPMI Connectivity
Before registering hosts, validate IPMI access from the controller node:
Test IPMI connectivity
Output containing "System Power State: on" confirms that IPMI access is working.
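As a rough sketch of such a check, the snippet below assumes the ipmitool CLI is installed on the controller node (an assumption; this page does not name the tool). It builds a chassis status query and looks for a "System Power" line reporting "on". The helper names are hypothetical, and the password is passed through the IPMI_PASSWORD environment variable via ipmitool's -E option so it stays off the command line.

```python
import os
import subprocess


def build_ipmi_command(host: str, username: str) -> list[str]:
    """Build an ipmitool chassis-status command (ipmitool itself is an assumption)."""
    # -I lanplus selects IPMI v2.0 over LAN; -E reads the password from IPMI_PASSWORD.
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", username, "-E",
            "chassis", "status"]


def power_is_on(status_output: str) -> bool:
    """Return True if a line starting with 'System Power' reports 'on'."""
    for line in status_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower().startswith("system power"):
            return value.strip().lower() == "on"
    return False


def check_host(host: str, username: str, password: str, timeout: float = 10.0) -> bool:
    """Run the query against one host; requires ipmitool on the PATH."""
    env = dict(os.environ, IPMI_PASSWORD=password)
    result = subprocess.run(build_ipmi_command(host, username), env=env,
                            capture_output=True, text=True, timeout=timeout)
    return result.returncode == 0 and power_is_on(result.stdout)
```

The IPMI user only needs chassis status read permission for this query, which matches the username requirement in the control attributes table.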
SSH Host Monitor
The SSH monitor attempts a TCP connection to port 22. It uses the SSH key configured for the Instance HA service account; password authentication is not used.
Deploy SSH Keys
Locate the service key
The Instance HA host monitor generates an SSH key pair at service startup. Locate
the public key on the controller node:
Find service public key
Deploy to compute hosts
Append the public key to the authorized_keys file of the user the monitor will
connect as (typically root or a dedicated service account):
Deploy key to compute host
Register with SSH attributes
When registering the host in the segment, use the IP address only — no credentials:
Register host for SSH monitoring
Timing Parameters
Adjust monitoring sensitivity through the parameters below:
| Section | Parameter | Default | Description |
|---|---|---|---|
| [DEFAULT] | wait_period_after_service_update | 180 | Seconds to wait after a host enters maintenance before triggering recovery; prevents false alarms during planned restarts |
| [DEFAULT] | long_rpc_timeout | 300 | Maximum seconds to wait for a Compute RPC call to complete before declaring it failed |
| [host_failure] | host_failure_recovery_interval | 17 | Seconds between recovery retry attempts when the first evacuation attempt fails |
| [host_failure] | ignore_lease_seconds | 0 | Seconds after host boot during which fault notifications are suppressed; set to 60-120 to avoid startup noise |
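To make the section and parameter layout concrete, the sketch below parses an INI-style fragment that uses the section names, parameter names, and defaults from the table above. It assumes the configuration file follows standard INI syntax readable by Python's configparser; the load_timing helper is hypothetical and is not part of Instance HA.

```python
import configparser

# Sample fragment: sections, names, and defaults are taken from the table above.
SAMPLE_CONF = """
[DEFAULT]
wait_period_after_service_update = 180
long_rpc_timeout = 300

[host_failure]
host_failure_recovery_interval = 17
ignore_lease_seconds = 0
"""

# (section, parameter) pairs to read; both columns come from the table.
TIMING_KEYS = [
    ("DEFAULT", "wait_period_after_service_update"),
    ("DEFAULT", "long_rpc_timeout"),
    ("host_failure", "host_failure_recovery_interval"),
    ("host_failure", "ignore_lease_seconds"),
]


def load_timing(text: str) -> dict[str, int]:
    """Parse the timing parameters and reject negative values (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    timing = {name: parser.getint(section, name) for section, name in TIMING_KEYS}
    for name, value in timing.items():
        if value < 0:
            raise ValueError(f"{name} must be non-negative, got {value}")
    return timing


print(load_timing(SAMPLE_CONF))
```

A check like this can be run against an edited file before saving it, to catch typos in parameter names or non-numeric values.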
- XDeploy
- CLI
Open Advanced Configuration
In XDeploy, navigate to Advanced Configuration. In the Service Tree,
select masakari.
Edit timing parameters
Select or create instance-ha.conf in the Code Editor. Add or modify the
timing parameters:
Host monitor timing in XDeploy Advanced Configuration
Click Save Current File.
Monitor Health Check
Verify the host monitor is running and detecting hosts correctly:
Check monitor service status
View monitor logs
An UNREACHABLE log entry for a host that is actually running indicates a credential
or network configuration issue, not a genuine host failure.
Next Steps
Instance Monitors
Configure guest-level instance heartbeat monitoring for per-VM fault detection.
Failover Segments
Register and manage compute hosts within protection segments.
Engine Configuration
Tune recovery engine timing and retry parameters.
Security
Secure IPMI credentials and restrict access to Instance HA APIs.