Overview
This guide covers the most common operational issues encountered in Xloud Compute environments. Each section provides the diagnostic commands, root cause analysis, and resolution steps needed to restore normal operation.
Prerequisites
- Admin credentials sourced from admin-openrc.sh
- openstack CLI installed and configured
- SSH access to compute nodes for log inspection when needed
Common Issues
Instance stuck in BUILD status
Cause: The scheduler placed the instance on a host, but the Compute Agent failed
to complete provisioning. Common causes include image download failure, networking
misconfiguration, or storage attachment failure.
Diagnosis:
- Check the instance event log
- Identify the target host
- Check Compute Agent logs on the target host (via XDeploy terminal)
Common causes and resolutions:
| Symptom in Event Log | Resolution |
|---|---|
| Image download failed | Verify Xloud Image Service reachability from the compute node |
| Quota exceeded on host | Check host capacity with openstack hypervisor show <host> |
| Network interface allocation failed | Verify network agent status on the host |
| Volume attachment failed | Check Xloud Block Storage service health |
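The first two diagnosis steps for a BUILD-stuck instance map onto standard OpenStack CLI calls. The following is a minimal sketch, assuming the stock `openstack` client with admin credentials loaded. The function names `instance_events` and `instance_host` are illustrative helpers, not Xloud commands, and the host column name may differ between client versions. Compute Agent log locations depend on the deployment and are not shown.

```shell
# Sketch only: thin wrappers around standard OpenStack CLI calls.

# List the lifecycle events recorded for an instance ($1 = instance ID or name).
instance_events() {
  openstack server event list "$1"
}

# Print the compute host the scheduler selected for the instance.
# Assumption: the admin-only column is named OS-EXT-SRV-ATTR:host.
instance_host() {
  openstack server show "$1" -f value -c OS-EXT-SRV-ATTR:host
}
```

For example, run `instance_events <instance-id>` to find the failing action, then `instance_host <instance-id>` to learn which node's Compute Agent logs to inspect.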
Instance in ERROR state
Cause: A fatal error occurred during instance creation, a running operation, or
hypervisor interaction. The fault details are stored in the instance record.
Diagnosis:
- Show error fault details
- View full instance event log
Resolution:
- If the error is recoverable (e.g., a temporary network partition that has since
  resolved), attempt to rebuild the instance from its original image.
- If the error is caused by a host-level hardware failure, migrate the instance to
  a healthy host before attempting a rebuild. See Live Migration for instructions.
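The fault lookup, event log, and rebuild described for an instance in ERROR state can be sketched with standard OpenStack CLI calls. This is a hedged example, assuming the stock `openstack` client; the helper names are hypothetical, and the rebuild relies on the client's documented default of reusing the image currently recorded on the instance when `--image` is omitted.

```shell
# Sketch only: thin wrappers around standard OpenStack CLI calls.

# Print only the fault field of the instance record ($1 = instance ID or name).
instance_fault() {
  openstack server show "$1" -f value -c fault
}

# List all recorded events for the instance with extra detail columns.
instance_event_log() {
  openstack server event list --long "$1"
}

# Rebuild the instance from the image it currently uses.
# Caution: a rebuild overwrites the contents of the root disk.
rebuild_from_current_image() {
  openstack server rebuild "$1"
}
```

Check the output of `instance_fault <instance-id>` before rebuilding, since a rebuild on a host with failed hardware is likely to fail again.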
Live migration fails
Cause: CPU compatibility mismatch, insufficient destination capacity, or a
network timeout during the migration data transfer.
Diagnosis:
- Check migration status and error message
- Show detailed migration information
Common errors and resolutions:
| Error Message | Root Cause | Resolution |
|---|---|---|
| guest CPU doesn't match specification: missing features | CPU microarchitecture difference between hosts | Configure a common CPU baseline model on all hosts via XDeploy under Compute → Advanced Settings → CPU Compatibility |
| No valid host found | Destination host lacks capacity or is disabled | Check destination host capacity with openstack hypervisor show <host>; verify host is enabled and up |
| Connection timeout | Network disruption on migration network | Verify network connectivity between compute nodes; check firewall rules on the management interface |
| Block migration disk copy failed | Insufficient free disk on destination | Check available disk with openstack hypervisor show <host> |
See Live Migration for a full walkthrough of the migration procedure.
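The two migration diagnosis steps can be sketched with standard OpenStack CLI calls. This is an illustrative example, assuming the stock `openstack` client; the helper names are hypothetical, and the detail view generally applies only to in-progress live migrations and needs a sufficiently recent compute API microversion.

```shell
# Sketch only: thin wrappers around standard OpenStack CLI calls.

# List migrations recorded for an instance, including their status
# ($1 = instance ID or name).
migration_status() {
  openstack server migration list --server "$1"
}

# Show one migration in detail ($1 = instance, $2 = migration ID taken
# from the list output above).
migration_detail() {
  openstack server migration show "$1" "$2"
}
```

Match the error text reported by these commands against the table above to pick a resolution.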
No valid host found (scheduling failure)
Cause: All compute hosts were eliminated by the scheduler filter chain, so no
eligible host satisfies the combined instance requirements.
Diagnosis:
- Check cluster-wide capacity
- List all hosts with capacity details
Common causes:
| Cause | Resolution |
|---|---|
| All hosts at vCPU or RAM capacity | Scale out the cluster or increase over-commit ratios via XDeploy |
| Availability zone constraint too restrictive | Verify target AZ has active hosts with openstack availability zone list --long |
| Host aggregate metadata mismatch | Verify flavor extra specs match aggregate metadata keys on target hosts |
| Server group anti-affinity exhausted | Group has used all distinct hosts; scale out or remove anti-affinity constraint |
| Flavor requires PCI device not available | Verify PCI passthrough devices are configured on target hosts |
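The capacity checks for a scheduling failure can be sketched with standard OpenStack CLI calls. This is a hedged example, assuming the stock `openstack` client; the helper names are hypothetical, and the aggregate statistics and per-host usage columns are only returned by older compute API microversions, so output may vary by release.

```shell
# Sketch only: thin wrappers around standard OpenStack CLI calls.

# Show aggregate vCPU, RAM, and disk totals across all hypervisors.
# Assumption: the deployment's compute API still serves these statistics.
cluster_capacity() {
  openstack hypervisor stats show
}

# List every hypervisor with per-host usage columns.
host_capacity() {
  openstack hypervisor list --long
}

# List availability zones together with their hosts and service state.
az_hosts() {
  openstack availability zone list --long
}
```

If `cluster_capacity` shows free resources overall, compare `host_capacity` and `az_hosts` against the flavor and zone constraints of the failed request to see which filter removed the remaining hosts.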
Console connection refused
Cause: The VNC or SPICE console proxy service is not running, the firewall
is blocking the console port, or the console token has expired.
Diagnosis:
- Verify console proxy service status
- Check all compute services for degraded state
Resolution:
- Verify ports 6080 (VNC), 6082 (SPICE), and 6083 (serial) are open in your firewall rules from the administrator’s workstation to the controller node.
- If the console proxy service is down, restart it through XDeploy under Compute → Services → Console Proxy.
- If the connection is refused immediately after generating a URL, the token may have expired. Generate a fresh console URL.
Console tokens expire after a short period. If the browser reports an
authentication error when accessing the console URL, always generate a new URL
rather than refreshing the page.
See Console Access for firewall port requirements and proxy configuration details.
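Checking compute service state and issuing a fresh console URL can be sketched with standard OpenStack CLI calls. This is an illustrative example, assuming the stock `openstack` client; the helper names are hypothetical. The console proxy itself may not appear in the compute service list, so its status is best confirmed through XDeploy.

```shell
# Sketch only: thin wrappers around standard OpenStack CLI calls.

# List compute services with their state (up/down) and status.
compute_services() {
  openstack compute service list
}

# Issue a new noVNC console URL with a fresh token ($1 = instance ID or name).
# The client also accepts --spice or --serial in place of --novnc.
fresh_console_url() {
  openstack console url show --novnc "$1"
}
```

Open the URL printed by `fresh_console_url <instance-id>` promptly, since the embedded token is short-lived.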
Quota exceeded errors
Cause: The project has reached its allocation limit for instances, vCPUs,
or RAM. New instance creation or resize operations are blocked until the quota
is increased or existing resources are released.
Diagnosis:
- Show current quota usage
- List instances consuming quota in the project
Resolution:
- Option 1: Increase the project quota for instances, vCPUs, and RAM.
- Option 2: Free capacity by removing unused instances. Coordinate with the
  project owner to identify and delete instances that are no longer in use. Do
  not delete instances without explicit confirmation from the project owner.
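The quota checks and the quota increase can be sketched with standard OpenStack CLI calls. This is a hedged example, assuming the stock `openstack` client with admin credentials; the helper names are hypothetical, and the numbers passed to `raise_quota` are placeholders to be replaced with values agreed for the project.

```shell
# Sketch only: thin wrappers around standard OpenStack CLI calls.

# Show absolute limits and current usage for a project ($1 = project ID or name).
quota_usage() {
  openstack limits show --absolute --project "$1"
}

# List the instances that count against the project's quota (admin only).
project_instances() {
  openstack server list --project "$1"
}

# Raise the compute quota for a project.
# $1 = project, $2 = instances, $3 = vCPU cores, $4 = RAM in MiB.
raise_quota() {
  openstack quota set --instances "$2" --cores "$3" --ram "$4" "$1"
}
```

For instance, `raise_quota <project> 20 40 81920` would set limits of 20 instances, 40 vCPUs, and 80 GiB of RAM; rerun `quota_usage <project>` afterwards to confirm the new limits.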
Next Steps
- Compute Hosts: Monitor and manage hypervisor host health to prevent scheduling failures.
- Live Migration: Move instances off degraded hosts before performing maintenance.
- Admin Guide: Return to the Compute Administration Guide index.