Overview
Notification drivers are the bridge between external monitoring systems and the Instance HA recovery engine. They receive fault signals in various formats, translate them into structured Instance HA notifications, and route them to the Recovery Engine. Xloud Instance HA ships with the NovaNotificationDriver as the default. Custom drivers allow integration with existing monitoring infrastructure such as Prometheus Alertmanager or Nagios.
Built-in Drivers
| Driver | Source | Protocol | When to Use |
|---|---|---|---|
| NovaNotificationDriver | Xloud Compute message bus | AMQP | Default: all standard deployments |
| TaskFlowDriver | TaskFlow workflow engine | Internal RPC | Advanced workflow orchestration |
| Custom webhook driver | Third-party tools (Prometheus, Nagios) | HTTP POST | Environments with existing monitoring infrastructure |
NovaNotificationDriver (Default)
The NovaNotificationDriver is enabled by default in all Xloud Instance HA deployments. It subscribes to the Xloud Compute AMQP message bus and listens for compute.host.error and compute.instance.error notification events.
How it works
When a compute host enters a failure state, the Compute service publishes an error notification on the AMQP message bus. The NovaNotificationDriver receives this message, extracts the affected host information, and creates an Instance HA notification record to trigger the recovery workflow.
This driver requires no additional configuration beyond what is provided by the standard XDeploy deployment.
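
The receive, extract, and create flow described above can be sketched in a few lines of Python. This is only an illustration and not the driver's actual code: the `HANotification` class, the `handle_bus_message` function, and the `host` field in the bus message are assumptions made for the example. Only the two event names and the COMPUTE_HOST / COMPUTE_INSTANCE types come from this page.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Event types the driver listens for, as named in this section.
WATCHED_EVENTS = {"compute.host.error", "compute.instance.error"}


@dataclass
class HANotification:
    """Hypothetical stand-in for an Instance HA notification record."""
    type: str
    hostname: str
    payload: dict
    generated_time: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def handle_bus_message(event_type: str, message: dict) -> Optional[HANotification]:
    """Translate one bus message into a notification, or ignore it."""
    if event_type not in WATCHED_EVENTS:
        return None  # not a fault signal this driver cares about
    hostname = message.get("host")  # assumed field name, for illustration only
    if not hostname:
        return None  # nothing to recover without a host
    if event_type == "compute.host.error":
        kind = "COMPUTE_HOST"
    else:
        kind = "COMPUTE_INSTANCE"
    # Keep the original message so the recovery workflow can inspect it.
    return HANotification(type=kind, hostname=hostname, payload=dict(message))
```

Messages with any other event type are dropped, which mirrors the idea that the driver reacts only to the two error events it subscribes to.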
Verify the driver is active
Check driver configuration
Confirm AMQP connectivity
Webhook Notification Driver
For environments that use Prometheus, Nagios, or other external monitoring tools as the primary fault detection system, Instance HA exposes an HTTP notification endpoint that accepts structured fault payloads.
Endpoint
Payload Format
Host fault notification payload
| Field | Type | Description |
|---|---|---|
| hostname | string | The compute hostname as registered in the segment |
| type | string | COMPUTE_HOST, COMPUTE_INSTANCE, or COMPUTE_PROCESS |
| payload.event | string | STOPPED or STARTED |
| payload.cluster_status | string | ONLINE or OFFLINE |
| payload.host_status | string | NORMAL or UNKNOWN |
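
As a rough sketch, a client could build and validate a payload from the fields in the table before sending it. The helper name `build_fault_payload` is hypothetical, and the nesting under `payload` is inferred from the dotted field names in the table. Whether the API expects an additional outer envelope is not covered here and should be checked against the endpoint reference.

```python
import json

# Allowed values, copied from the field table above.
ALLOWED_TYPES = {"COMPUTE_HOST", "COMPUTE_INSTANCE", "COMPUTE_PROCESS"}
ALLOWED_EVENTS = {"STOPPED", "STARTED"}
ALLOWED_CLUSTER_STATUS = {"ONLINE", "OFFLINE"}
ALLOWED_HOST_STATUS = {"NORMAL", "UNKNOWN"}


def build_fault_payload(hostname, fault_type, event, cluster_status, host_status):
    """Return a payload dict, raising ValueError on any value outside the table."""
    if not hostname:
        raise ValueError("hostname must be the name registered in the segment")
    checks = [
        ("type", fault_type, ALLOWED_TYPES),
        ("payload.event", event, ALLOWED_EVENTS),
        ("payload.cluster_status", cluster_status, ALLOWED_CLUSTER_STATUS),
        ("payload.host_status", host_status, ALLOWED_HOST_STATUS),
    ]
    for field_name, value, allowed in checks:
        if value not in allowed:
            raise ValueError(
                f"{field_name} must be one of {sorted(allowed)}, got {value!r}"
            )
    return {
        "hostname": hostname,
        "type": fault_type,
        "payload": {
            "event": event,
            "cluster_status": cluster_status,
            "host_status": host_status,
        },
    }


if __name__ == "__main__":
    body = build_fault_payload(
        "compute-01", "COMPUTE_HOST", "STOPPED", "OFFLINE", "UNKNOWN"
    )
    print(json.dumps(body, indent=2))
```

Validating on the client side keeps malformed values from reaching the notification endpoint, where they would otherwise surface only as a rejected request.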
Example: Prometheus Alertmanager Webhook
Configure an Alertmanager receiver that calls the Instance HA notification endpoint:
alertmanager.yml (webhook receiver)
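
If Alertmanager's native webhook body has to be translated into the Instance HA payload fields before delivery, a small adapter could do the mapping. The sketch below relies on the documented Alertmanager webhook shape (a top-level `alerts` list whose items carry `status` and `labels`). Everything else is an assumption for illustration: the function name, the use of the `instance` label as the hostname, and the mapping of firing alerts to STOPPED / OFFLINE / UNKNOWN.

```python
def alerts_to_notifications(webhook_body: dict, host_label: str = "instance") -> list:
    """Map an Alertmanager webhook body to a list of Instance HA payload dicts."""
    notifications = []
    for alert in webhook_body.get("alerts", []):
        raw_host = alert.get("labels", {}).get(host_label)
        if not raw_host:
            continue  # cannot route an alert without a host label
        # Exporter targets often look like "compute-01:9100"; drop the port.
        hostname = raw_host.partition(":")[0]
        firing = alert.get("status") == "firing"
        notifications.append({
            "hostname": hostname,
            "type": "COMPUTE_HOST",
            "payload": {
                # Assumed mapping: firing means the host is down.
                "event": "STOPPED" if firing else "STARTED",
                "cluster_status": "OFFLINE" if firing else "ONLINE",
                "host_status": "UNKNOWN" if firing else "NORMAL",
            },
        })
    return notifications
```

The hostname produced here must match the name registered in the segment, so the label used for `host_label` may need adjusting to the local Prometheus labelling scheme.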
TaskFlowDriver
The TaskFlow driver enables advanced workflow orchestration for recovery actions. It is used internally when the default recovery workflow requires multi-step sequencing with retry and rollback support. This driver operates transparently alongside the NovaNotificationDriver and does not require separate configuration in standard deployments. To customize the TaskFlow task pipeline, implement the BaseTask interface and register the plugin in the configuration.
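
To make "multi-step sequencing with retry and rollback" concrete, here is a minimal, self-contained sketch. The `BaseTask` class below is a hypothetical stand-in written for this example and is not the product's actual interface. The `execute`/`revert` pairing follows the general pattern used by task-based workflow engines, and `run_pipeline`, `DisableHost`, and `FlakyEvacuate` are illustrative names.

```python
from abc import ABC, abstractmethod


class BaseTask(ABC):
    """Hypothetical stand-in for a recovery task (not the real interface)."""

    @abstractmethod
    def execute(self, context: dict) -> None:
        """Perform one recovery step."""

    def revert(self, context: dict) -> None:
        """Undo execute(); the default does nothing."""


def run_pipeline(tasks, context, retries=2):
    """Run tasks in order; retry failures, roll back completed tasks on give-up."""
    done = []
    for task in tasks:
        for attempt in range(retries + 1):
            try:
                task.execute(context)
                done.append(task)
                break
            except Exception:
                if attempt == retries:
                    # Out of attempts: undo finished steps in reverse order.
                    for finished in reversed(done):
                        finished.revert(context)
                    return False
    return True


class DisableHost(BaseTask):
    def execute(self, context):
        context["host_disabled"] = True

    def revert(self, context):
        context["host_disabled"] = False


class FlakyEvacuate(BaseTask):
    """Fails a fixed number of times before succeeding, to exercise retries."""

    def __init__(self, fail_times):
        self.fail_times = fail_times
        self.attempts = 0

    def execute(self, context):
        self.attempts += 1
        if self.attempts <= self.fail_times:
            raise RuntimeError("evacuation failed")
        context["evacuated"] = True
```

With `retries=2`, a task that fails once succeeds on its second attempt, while a task that keeps failing causes every previously completed task to be reverted in reverse order.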
Open Advanced Configuration
In XDeploy, navigate to Advanced Configuration. In the Service Tree,
select masakari.
Edit workflow targets
Select or create instance-ha.conf in the Code Editor. Add the custom workflow targets, then click Save Current File.
TaskFlow workflow in XDeploy Advanced Configuration
Validation
Navigate to Admin → Compute → Instance HA → Notifications.
Simulate a notification by creating one manually (test environments only):
- Click Create Notification (admin view)
- Set type to COMPUTE_HOST, hostname to a registered host, and event to STOPPED
- Confirm the notification appears and transitions to running
Notification is received, logged, and triggers the recovery workflow.
Next Steps
Recovery Methods
Configure how instances are evacuated after a notification triggers recovery.
Instance Monitors
Configure guest-level monitoring independent of the notification driver.
Engine Configuration
Tune recovery engine timing, retries, and workflow task ordering.
Security
Secure the notification API endpoint and service account credentials.