
Overview

Live vCPU/RAM scaling allows users to adjust compute resources on running instances without downtime. As an administrator, you configure which flavors support live scaling by setting hot-add extra specs that define the resource envelope.
Xloud-Developed — Live vCPU/RAM Scaling is developed by Xloud and ships with XAVS / XPCI.
Prerequisites
  • Admin credentials sourced from admin-openrc.sh
  • Understanding of flavor extra specs and NUMA topology

Key Capabilities

Bidirectional CPU Scaling

Add and remove vCPUs on a running instance. Xloud is the only enterprise cloud platform that supports vCPU hot-remove through the management UI — other platforms support add-only.

Three Memory Methods

  • Balloon — fast cooperative resize within the existing ceiling
  • DIMM Hotplug — 256 MB-aligned persistent modules
  • virtio-mem — 4 MB granularity with bidirectional add/remove support

CPU Topology Awareness

Automatic thread alignment for SMT (hyperthreading) hosts. Five-method detection cascade ensures correct topology mapping across all CPU generations.

Configure a Live-Scalable Flavor

Before users can live-scale instances, you must create flavors with hot-add extra specs that define the min/max resource envelope.

Open Flavor Management

Navigate to Admin > Compute > Flavors.

Create or edit a flavor

Click Create Flavor (or select an existing flavor and click Update Metadata). Set the base resources — these are the starting values when an instance is launched:
Field           Example            Description
Name            m1.large-scalable  Descriptive name indicating live scaling is enabled
vCPUs           4                  Starting vCPU count
RAM (MB)        8192               Starting RAM in MB
Root Disk (GB)  80                 Root disk (unchanged by live scaling)

Add hot-add extra specs

In the flavor’s Extra Specs panel, add the following keys:
Key                 Example Value  Description
hw:cpu_max_sockets  16             Maximum CPU sockets (sets the vCPU ceiling)
hw:cpu_max_cores    1              Cores per socket (keep at 1 unless using NUMA)
hw:cpu_max_threads  1              Threads per core
hw:mem_max_mb       65536          Maximum RAM in MB (64 GB ceiling)
hw:mem_page_size    any            Set to any or large for memory balloon resizing
The ceiling for vCPUs = hw:cpu_max_sockets x hw:cpu_max_cores x hw:cpu_max_threads. For a flavor starting at 4 vCPUs scaling to 16, set hw:cpu_max_sockets=16, hw:cpu_max_cores=1, hw:cpu_max_threads=1.
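The same flavor can be configured from the CLI instead of the Dashboard — a sketch assuming the standard python-openstackclient is installed and admin credentials are sourced from admin-openrc.sh (the flavor name and values mirror the example above):

```shell
# Create the base flavor (starting values at instance launch)
openstack flavor create m1.large-scalable \
  --vcpus 4 --ram 8192 --disk 80

# Add the hot-add extra specs that define the live-scaling envelope
# (vCPU ceiling = 16 x 1 x 1 = 16; RAM ceiling = 64 GB)
openstack flavor set m1.large-scalable \
  --property hw:cpu_max_sockets=16 \
  --property hw:cpu_max_cores=1 \
  --property hw:cpu_max_threads=1 \
  --property hw:mem_max_mb=65536 \
  --property hw:mem_page_size=any
```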

Resource Bounds Reference

The allowed range for live scaling is determined by the flavor’s base values and hot-add extra specs:
Resource  Minimum                 Maximum                                                      Configured By
vCPUs     Flavor base vCPU count  hw:cpu_max_sockets x hw:cpu_max_cores x hw:cpu_max_threads  Flavor extra specs
RAM       Flavor base RAM (MB)    hw:mem_max_mb                                                Flavor extra spec
The minimum is the base flavor value — users cannot scale below the flavor’s original allocation via live scaling. To permanently reduce resources below the base, use a standard flavor resize (requires reboot).
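The bounds rule can be sketched as a small shell check — a minimal illustration, assuming a hypothetical envelope of base 4 vCPUs and a ceiling of 16 (hw:cpu_max_sockets=16 with 1 core and 1 thread):

```shell
# Hypothetical flavor envelope: base 4 vCPUs, ceiling 16 x 1 x 1 = 16
BASE_VCPUS=4
MAX_VCPUS=$((16 * 1 * 1))

# Returns success only when the requested count is inside the envelope
in_envelope() {
  [ "$1" -ge "$BASE_VCPUS" ] && [ "$1" -le "$MAX_VCPUS" ]
}

in_envelope 8  && echo "8 vCPUs: allowed"
in_envelope 2  || echo "2 vCPUs: below flavor base (needs a standard resize)"
in_envelope 32 || echo "32 vCPUs: above ceiling (raise hw:cpu_max_sockets)"
```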

API Reference

Live scaling is available via the Xloud Compute API extension:
Adjust vCPU and RAM on a running instance
curl -X POST \
  "https://api.<your-domain>/compute/v2.1/os-xloud-adjust/<server-id>" \
  -H "X-Auth-Token: $OS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "xloud_adjust": {
      "current_vcpus": 8,
      "current_memory_mb": 16384
    }
  }'
Field              Type     Description
current_vcpus      integer  New desired vCPU count (within flavor envelope)
current_memory_mb  integer  New desired RAM in MB (within flavor envelope)
You can pass one or both fields — omit the field you do not want to change.
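For example, a memory-only adjustment omits current_vcpus so the vCPU count is left unchanged (the endpoint, domain, and server ID are placeholders as in the example above; 32768 MB is an illustrative target):

```shell
# Grow only RAM to 32 GB; vCPU count stays as-is
curl -X POST \
  "https://api.<your-domain>/compute/v2.1/os-xloud-adjust/<server-id>" \
  -H "X-Auth-Token: $OS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"xloud_adjust": {"current_memory_mb": 32768}}'
```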
Get current live resource state
curl -X GET \
  "https://api.<your-domain>/compute/v2.1/servers/<server-id>/xloud-status" \
  -H "X-Auth-Token: $OS_TOKEN"

NUMA Considerations

When an instance has a NUMA topology defined (via hw:numa_nodes), vCPU changes must keep each NUMA node balanced:
  • A 2-NUMA-node instance with 4 vCPUs has 2 vCPUs per node
  • Scaling to 6 vCPUs is valid (3 per node); scaling to 5 vCPUs is not (uneven split)
  • Scaling to 8 vCPUs is valid (4 per node)
Check NUMA topology of a flavor
openstack flavor show m1.large-scalable -c properties | grep numa
For instances without an explicit hw:numa_nodes extra spec, the hypervisor uses a single NUMA cell and any valid vCPU count within the envelope is accepted.
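Because of the even-split rule, valid vCPU targets are the multiples of the NUMA node count inside the envelope. A quick way to list them, assuming a hypothetical 2-NUMA-node instance with base 4 vCPUs and a ceiling of 16 (the base is itself a multiple of the node count, so stepping by the node count stays balanced):

```shell
# Valid live-scaling vCPU targets for a 2-NUMA-node instance
# (hypothetical envelope: base 4 vCPUs, ceiling 16)
NUMA_NODES=2
BASE_VCPUS=4
MAX_VCPUS=16

seq "$BASE_VCPUS" "$NUMA_NODES" "$MAX_VCPUS"
# prints 4 6 8 10 12 14 16 (one per line)
```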

Troubleshooting

Cause: The instance’s flavor does not have hot-add extra specs configured.
Fix: Add hw:cpu_max_sockets and/or hw:mem_max_mb to the flavor. Existing instances on that flavor will immediately become live-scalable — no reboot required.
Check if flavor has hot-add specs
openstack flavor show <flavor-name> -c properties
Cause: The requested vCPU or RAM value exceeds the flavor’s configured maximum.
Fix: Either adjust the requested value, or update the flavor’s max specs:
Increase vCPU ceiling to 32
openstack flavor set <flavor-name> --property hw:cpu_max_sockets=32
Cause: The requested vCPU count cannot be evenly distributed across the instance’s NUMA nodes.
Fix: Choose a vCPU count that divides evenly across NUMA nodes. For a 2-NUMA instance: use 2, 4, 6, 8, 10…
Cause: Older Linux kernels (before 4.15) require manual CPU onlining.
Fix: Run inside the guest:
Bring all hotplugged CPUs online
for cpu in /sys/devices/system/cpu/cpu*/online; do echo 1 > "$cpu"; done
Cause: The guest OS memory balloon driver (virtio_balloon) may not be loaded.
Fix: Verify and load:
Check and load balloon driver
lsmod | grep virtio_balloon
modprobe virtio_balloon
For persistence, add virtio_balloon to /etc/modules.
Cause: Instances with hw:cpu_policy=dedicated (CPU pinning) do not support live vCPU scaling — this is by design to prevent pinning conflicts.
Fix: Use hw:cpu_policy=shared (default) for instances that need live scaling. CPU-pinned instances must use standard flavor resize instead.
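For a pinned instance, the standard resize path looks like the following sketch — server and flavor names are hypothetical, and the resize reboots the instance (recent python-openstackclient releases accept the "resize confirm" form shown here; older clients use the --confirm flag instead):

```shell
# Resize a CPU-pinned instance via a standard (rebooting) flavor resize
openstack server resize --flavor m1.pinned-xlarge my-pinned-server

# After the instance reaches VERIFY_RESIZE, confirm to finalize
openstack server resize confirm my-pinned-server
```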

Next Steps

User Guide: Live Scaling

End-user guide for scaling vCPU and RAM from the Dashboard

Flavor Management

Create and manage flavors with extra specs

Advanced Features

CPU pinning, NUMA, huge pages, vTPM, and GPU passthrough

Compute Admin Guide

Return to the Compute Administration Guide index