Overview
Xloud Orchestration provides native auto-scaling through three coordinated resource types: an auto-scaling group, a scaling policy, and an alarm trigger. When a metric threshold is breached — such as CPU utilization exceeding 80% — the alarm fires a webhook that activates the scaling policy, which adds or removes instances from the group. Prometheus is the monitoring backend for alarm-driven scaling. It replaces legacy telemetry stacks (Ceilometer/Aodh) and provides a more reliable, standards-based metrics pipeline with support for multi-dimensional labels, alerting rules, and long-retention storage.
Prerequisites
- Xloud Orchestration enabled in your project
- A compatible image and flavor for scaled instances
- Prometheus deployed and scraping compute metrics (see Prometheus integration)
- Basic familiarity with orchestration templates
Architecture
Scaling Components
| Component | Resource Type | Purpose |
|---|---|---|
| Auto-Scaling Group | OS::Heat::AutoScalingGroup | Manages a pool of identical instances with min/max size constraints |
| Scaling Policy | OS::Heat::ScalingPolicy | Defines how to adjust the group — add N, remove N, or set exact size |
| Signal URL | Produced by ScalingPolicy | Webhook endpoint activated by Prometheus Alertmanager or manual POST |
| Prometheus | External | Scrapes instance metrics, evaluates alert rules, fires webhooks |
Use Cases
| Use Case | Description | Trigger |
|---|---|---|
| Elastic web / application tier | Scale web server VMs based on HTTP request rate or CPU utilization | Prometheus alert |
| CI/CD build farm | Add worker nodes during active builds, shrink on idle | Schedule or webhook |
| Batch processing cluster | Provision compute nodes for heavy batch jobs, release when complete | Manual or scheduled |
| Dev/test resource pools | Automatically scale out environments for short-lived test runs | On-demand webhook |
| Disaster recovery warm pool | Maintain standby instances that scale out during failover events | Alertmanager webhook |
Orchestration Templates
Static Cluster Template
Use this template when you need a fixed number of instances deployed as a named group. Each instance is declared as a discrete resource — suitable for small, stable clusters.
static-cluster.yaml
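The original template body is not reproduced here; a minimal sketch of what a static cluster template might contain, assuming the Heat-compatible resource type OS::Nova::Server and illustrative parameter names:

```yaml
heat_template_version: 2021-04-16

description: Fixed two-node cluster; each instance is a discrete resource.

parameters:
  image:
    type: string
  flavor:
    type: string
  network:
    type: string

resources:
  node_1:
    type: OS::Nova::Server
    properties:
      name: cluster-node-1
      image: { get_param: image }
      flavor: { get_param: flavor }
      networks:
        - network: { get_param: network }

  node_2:
    type: OS::Nova::Server
    properties:
      name: cluster-node-2
      image: { get_param: image }
      flavor: { get_param: flavor }
      networks:
        - network: { get_param: network }
```

Adding a node means copying a resource block, so this pattern stays manageable only for small, stable clusters — the auto-scaling template below avoids the duplication.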
Auto-Scaling Template
This template creates a web tier that scales between 1 and 10 instances. The scaling policy signal URLs are exposed as stack outputs and can be wired into Prometheus Alertmanager webhook receivers.
autoscaling-stack.yaml
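The full template is likewise not reproduced here; a sketch of what autoscaling-stack.yaml might look like, assuming the Heat-style OS::Heat::AutoScalingGroup and OS::Heat::ScalingPolicy types from the table above and the `alarm_url` attribute for signal URLs:

```yaml
heat_template_version: 2021-04-16

description: Web tier that scales between min_size and max_size instances.

parameters:
  image: { type: string }
  flavor: { type: string }
  key_name: { type: string }
  network: { type: string }
  min_size: { type: number, default: 1 }
  max_size: { type: number, default: 10 }

resources:
  web_group:
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: { get_param: min_size }
      max_size: { get_param: max_size }
      resource:
        type: OS::Nova::Server
        properties:
          image: { get_param: image }
          flavor: { get_param: flavor }
          key_name: { get_param: key_name }
          networks:
            - network: { get_param: network }

  scale_out_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      auto_scaling_group_id: { get_resource: web_group }
      adjustment_type: change_in_capacity
      scaling_adjustment: 1      # add one instance per signal
      cooldown: 60

  scale_in_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      auto_scaling_group_id: { get_resource: web_group }
      adjustment_type: change_in_capacity
      scaling_adjustment: -1     # remove one instance per signal
      cooldown: 60

outputs:
  scale_out_url:
    description: Webhook URL that adds one instance
    value: { get_attr: [scale_out_policy, alarm_url] }
  scale_in_url:
    description: Webhook URL that removes one instance
    value: { get_attr: [scale_in_policy, alarm_url] }
```

The two outputs are the webhook endpoints referenced throughout the rest of this page.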
Adjustment Types
| adjustment_type | Behavior | Example |
|---|---|---|
| change_in_capacity | Add or remove N instances relative to current count | scaling_adjustment: 2 adds 2 instances |
| exact_capacity | Set the group to exactly N instances | scaling_adjustment: 5 sets group size to 5 |
| percent_change_in_capacity | Change capacity by a percentage of current size | scaling_adjustment: 25 adds 25% more instances |
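As an illustration, a hypothetical policy using percent_change_in_capacity might be declared as follows (the resource and group names are assumptions, not taken from the original templates):

```yaml
scale_out_25pct:
  type: OS::Heat::ScalingPolicy
  properties:
    auto_scaling_group_id: { get_resource: web_group }
    adjustment_type: percent_change_in_capacity
    scaling_adjustment: 25   # grow the group by 25% of its current size
    cooldown: 120
```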
Prometheus Integration
Prometheus Alertmanager delivers scaling signals by sending an HTTP POST to the policy signal URL. Configure a webhook receiver in your Alertmanager configuration:
alertmanager.yml
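A sketch of the receiver wiring; the alert and receiver names are illustrative, and the webhook URL is the scale_out_url value copied from the stack outputs:

```yaml
route:
  receiver: default
  routes:
    - match:
        alertname: HighCPUUsage      # must match the alert rule name
      receiver: scale-out-webhook

receivers:
  - name: default
  - name: scale-out-webhook
    webhook_configs:
      - url: "<scale_out_url>"       # paste the stack output value here
        send_resolved: false
```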
alert-rules.yml
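A matching alert rule might look like the following sketch; the metric expression assumes node_exporter-style CPU metrics and is an example, not a verbatim rule from this guide:

```yaml
groups:
  - name: autoscaling
    rules:
      - alert: HighCPUUsage
        # Average non-idle CPU across scraped instances, as a percentage
        expr: avg(rate(node_cpu_seconds_total{mode!="idle"}[5m])) * 100 > 80
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Average CPU above 80% for 2 minutes"
```

The `for: 2m` clause keeps short CPU spikes from firing the webhook; tune it together with the policy cooldown discussed below.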
Deploy and Trigger Scaling
Deploy the auto-scaling stack
Navigate to Project → Orchestration → Stacks and click Launch Stack. Upload autoscaling-stack.yaml and fill in the parameters:
| Parameter | Example Value | Description |
|---|---|---|
| image | Ubuntu-22.04 | Base image for scaled instances |
| flavor | m1.small | Instance size |
| key_name | my-keypair | SSH key pair for access |
| network | private | Network for instances |
| min_size | 1 | Minimum instance count |
| max_size | 10 | Maximum instance count |
Click Launch. The stack reaches Create Complete and the scaling group shows the initial instance count.
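The same deployment works from the command line; a sketch assuming the openstack CLI with the Orchestration plugin and a stack named web-tier:

```console
openstack stack create -t autoscaling-stack.yaml \
  --parameter image=Ubuntu-22.04 \
  --parameter flavor=m1.small \
  --parameter key_name=my-keypair \
  --parameter network=private \
  --parameter min_size=1 \
  --parameter max_size=10 \
  web-tier

# Poll until the stack reaches CREATE_COMPLETE
openstack stack show web-tier -c stack_status
```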
Retrieve webhook URLs
Open the stack detail page and select the Outputs tab. Copy the values for scale_out_url and scale_in_url — these are used as Alertmanager webhook receiver URLs.
Manually trigger scale-out
To test scaling without waiting for an alert, send an HTTP POST to the signal URL:
Trigger scale-out via webhook
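For example, with `<scale_out_url>` standing in for the value copied from the stack outputs:

```console
curl -X POST "<scale_out_url>"
```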
The scaling group adds one instance. Refresh the Topology view to confirm the new member.
Cooldown Periods
Cooldown prevents rapid successive scaling events from destabilizing your workload. The cooldown value is specified in seconds per scaling policy.
| Scenario | Recommended Scale-Out Cooldown | Recommended Scale-In Cooldown |
|---|---|---|
| Fast-booting instances (cloud image, no init) | 30–60 s | 60–90 s |
| Instances with cloud-init provisioning | 90–120 s | 120–180 s |
| Instances requiring application warm-up | 120–180 s | 180–300 s |
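In template terms, the cooldown is a property of each scaling policy; a sketch (resource and group names are illustrative) for an instance type with cloud-init provisioning:

```yaml
scale_out_policy:
  type: OS::Heat::ScalingPolicy
  properties:
    auto_scaling_group_id: { get_resource: web_group }
    adjustment_type: change_in_capacity
    scaling_adjustment: 1
    cooldown: 120   # ignore further scale-out signals for 120 s after each adjustment
```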
Troubleshooting
Stack fails to create with CREATE_FAILED
Cause: Insufficient quota, unavailable flavor, or invalid image name.
Resolution: Review the stack events and the resource_status_reason field for each failed resource. Common causes:
- Compute quota exceeded — check with openstack quota show
- Image not found — verify with openstack image list
- Flavor not available in the target availability zone
Check stack events for the error message
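Assuming the stack is named web-tier, the events and per-resource status reasons can be pulled like this:

```console
openstack stack event list web-tier

# Show the failure reason for each resource in the stack
openstack stack resource list web-tier \
  -c resource_name -c resource_status -c resource_status_reason
```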
Scale-out webhook returns 401 or 403
Cause: The signal URL contains a temporary token that has expired, or the URL was copied incorrectly.
Resolution: Retrieve a fresh signal URL from the stack outputs. Signal URLs are valid as long as the stack exists; update your Alertmanager config with the current URL after any stack update.
Refresh signal URL
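A sketch of re-reading the output, assuming the stack name web-tier:

```console
openstack stack output show web-tier scale_out_url -f value -c output_value
```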
Auto-scaling group does not grow beyond min_size
Cause: Alertmanager is not reaching the signal URL, or the Prometheus alert is not firing.
Resolution:
- Verify Alertmanager is running: curl http://<alertmanager-host>:9093/-/healthy
- Check alert state in the Prometheus UI under Alerts
- Confirm the webhook receiver URL in the Alertmanager config matches the stack output
- Test manually: curl -X POST "<scale_out_url>" — if this works, the stack is healthy
Instances in the group show BUILD or ERROR
Cause: Compute capacity exhausted on available hosts, or image boot failure.
Resolution: Identify failed instances and check their events:
List instances in the scaling group
Check instance events
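Assuming the stack name web-tier, the group members can be listed by walking the nested stack, and each failed server inspected individually:

```console
# Nested resources of the scaling group, two levels deep
openstack stack resource list -n 2 web-tier

# For a failed server, inspect its status, fault, and event history
openstack server show <server-id> -c status -c fault
openstack server event list <server-id>
```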
Next Steps
Template Guide
Learn intrinsic functions and conditions used in scaling templates
Manage Stacks
Update, suspend, and manage the auto-scaling stack lifecycle
Prometheus Integration
Configure Prometheus scrape targets and alert rules for scaling triggers
Xloud Load Balancer
Front auto-scaling groups with a load balancer for traffic distribution