Overview

The Decision Engine relies on data sources to build its cluster model and collect performance metrics. Which data sources are available determines which optimization strategies can be used. The Compute API data source is always active. Prometheus and Telemetry are optional and unlock additional strategies.

Data Source Overview

| Data Source | Provides | Required For |
| --- | --- | --- |
| Compute API | Host inventory, vCPU/memory usage, VM placement | All strategies (always active) |
| Prometheus | Infrastructure metrics (node CPU, temperature) | outlet_temperature, saving_energy |
| Telemetry (Ceilometer) | Historical per-instance CPU, memory metrics | workload_stabilization, noisy_neighbor |
| IPMI | Server power state, inlet temperature | outlet_temperature (physical temperature) |

Compute API Data Source

The Compute API data source is enabled by default and requires no additional configuration. It provides:
  • Real-time host inventory and hypervisor utilization
  • Current instance placement (which instance runs on which host)
  • vCPU and memory allocation per host
Verify that the Compute API collector loads inside the Decision Engine container
docker exec -it watcher_decision_engine python3 -c "
from watcher.decision_engine.model.collector.nova import NovaClusterDataModelCollector
print('Compute collector loaded successfully')
"

Prometheus Data Source

Enable Prometheus Monitoring

Open XDeploy and navigate to Configuration. Select the Monitoring tab and toggle Enable Prometheus to Yes.

Configure Custom Data Source Settings (Optional)

For advanced Prometheus configuration, navigate to Advanced Configuration. In the Service Tree (left panel), select watcher. Click New File, or select an existing watcher.conf from the File Browser (right panel). Add the following in the Code Editor (center panel):
/etc/xavs/config/watcher/watcher.conf
[watcher_cluster_data_model_collectors.prometheus]
enabled = True
host = 10.0.1.71
port = 9291

[prometheus_client]
host = 10.0.1.71
port = 9291
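
Before applying the file, it can help to check that the two sections agree. The following is a minimal sketch, not part of XDeploy or Watcher: the helper name `check_prometheus_config` is hypothetical, and it assumes both sections are meant to point at the same Prometheus endpoint, as they do in the snippet above.

```python
import configparser

# Section names taken from the watcher.conf snippet above.
COLLECTOR_SECTION = "watcher_cluster_data_model_collectors.prometheus"
CLIENT_SECTION = "prometheus_client"


def check_prometheus_config(text):
    """Return a list of problems found in watcher.conf text (empty if none)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    problems = [
        f"missing section [{name}]"
        for name in (COLLECTOR_SECTION, CLIENT_SECTION)
        if not parser.has_section(name)
    ]
    if problems:
        return problems
    if not parser.getboolean(COLLECTOR_SECTION, "enabled", fallback=False):
        problems.append("collector is not enabled")
    # Assumption: host and port should match across both sections.
    for key in ("host", "port"):
        collector_value = parser.get(COLLECTOR_SECTION, key, fallback=None)
        client_value = parser.get(CLIENT_SECTION, key, fallback=None)
        if collector_value is None or client_value is None:
            problems.append(f"{key} is not set in both sections")
        elif collector_value != client_value:
            problems.append(
                f"{key} differs between sections: {collector_value} vs {client_value}"
            )
    return problems


SAMPLE = """
[watcher_cluster_data_model_collectors.prometheus]
enabled = True
host = 10.0.1.71
port = 9291

[prometheus_client]
host = 10.0.1.71
port = 9291
"""

if __name__ == "__main__":
    print(check_prometheus_config(SAMPLE) or "configuration looks consistent")
```

An empty list means the two sections are present, the collector is enabled, and the host and port values match.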

Save and Apply

Click Save Current File. Return to Operations and run reconfigure to apply the changes to the Decision Engine.
Prometheus data source configured and applied via XDeploy.
Test Prometheus connectivity
curl -s "http://10.0.1.71:9291/api/v1/query?query=up" \
  | jq '.status'
Expected: "success"
Verify temperature metrics are available
curl -s "http://10.0.1.71:9291/api/v1/query?query=node_hwmon_temp_celsius" \
  | jq '.data.result | length'
Expected: a non-zero count of temperature sensor results.
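
The two curl checks above can also be scripted. This is a rough sketch using only the Python standard library. The helper names (`build_query_url`, `count_series`, `query_count`) are hypothetical, and the endpoint is the one used in the examples above. It relies on the Prometheus HTTP API returning a JSON body with a `status` field and, on success, a `data.result` list.

```python
import json
import urllib.parse
import urllib.request

# Endpoint from the examples above; adjust for your deployment.
PROMETHEUS_URL = "http://10.0.1.71:9291"


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def count_series(payload):
    """Return the number of result series, or raise if the query failed."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    return len(payload["data"]["result"])


def query_count(base_url, promql, timeout=5):
    """Run the query over HTTP and return the number of series found."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return count_series(json.load(response))


# Example (needs network access to Prometheus):
#   query_count(PROMETHEUS_URL, "up")
#   query_count(PROMETHEUS_URL, "node_hwmon_temp_celsius")
```

A result of zero for `node_hwmon_temp_celsius` corresponds to the failure case of the second curl check: Prometheus is reachable but no temperature sensors are being scraped.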

Telemetry Data Source

Telemetry integration requires the Xloud Telemetry service (Ceilometer) to be deployed and collecting per-instance metrics.

Configuration

/etc/xavs/watcher/watcher.conf
[collector]
collector_plugins = compute, ceilometer

[ceilometer_client]
endpoint_type = internalURL
Check available metrics from Telemetry
openstack metric list --limit 20
Verify that per-instance CPU metrics are present:
Check cpu_util metrics
openstack metric resource list \
  --type instance \
  | head -5
For workload_stabilization, collect at least 2–4 hours of metric history before expecting meaningful recommendations from the strategy.
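
To judge whether enough history has accumulated, one option is to compare the oldest and newest sample timestamps for an instance. The sketch below is illustrative only: `history_span` and `has_enough_history` are hypothetical helpers, the 2-hour default is the lower bound quoted above, and the timestamps are assumed to have been exported from Telemetry by whatever means your deployment provides.

```python
from datetime import datetime, timedelta, timezone


def history_span(timestamps):
    """Return the time between the oldest and newest sample."""
    if not timestamps:
        return timedelta(0)
    return max(timestamps) - min(timestamps)


def has_enough_history(timestamps, min_hours=2.0):
    """True if the samples cover at least min_hours of wall-clock time."""
    return history_span(timestamps) >= timedelta(hours=min_hours)


if __name__ == "__main__":
    # Synthetic data: one sample every 10 minutes, covering 3 hours.
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    samples = [start + timedelta(minutes=10 * i) for i in range(19)]
    print(history_span(samples), has_enough_history(samples))
```

Passing `min_hours=4.0` applies the upper end of the recommended range instead.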

Data Source and Strategy Matrix

| Goal | Compute API | Prometheus | Telemetry |
| --- | --- | --- | --- |
| Server Consolidation | Required | - | - |
| Energy Savings | Required | Optional | - |
| Zone Rebalancing | Required | - | - |
| Thermal Optimization | Required | Required (temp) | - |
| Workload Stabilization | Required | - | Required |
| Noisy Neighbor | Required | - | Required |
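
The matrix can be read as a lookup from enabled data sources to usable goals. The sketch below encodes only the "Required" cells. The short source names (`compute`, `prometheus`, `telemetry`) and the function `available_goals` are labels chosen for this example, not Watcher identifiers.

```python
# Required data sources per goal, transcribed from the matrix above.
# "Optional" cells are left out because they do not gate a goal.
REQUIREMENTS = {
    "Server Consolidation": {"compute"},
    "Energy Savings": {"compute"},
    "Zone Rebalancing": {"compute"},
    "Thermal Optimization": {"compute", "prometheus"},
    "Workload Stabilization": {"compute", "telemetry"},
    "Noisy Neighbor": {"compute", "telemetry"},
}


def available_goals(enabled_sources):
    """Return the goals whose required data sources are all enabled."""
    # The Compute API data source is always active.
    enabled = set(enabled_sources) | {"compute"}
    return [goal for goal, needed in REQUIREMENTS.items() if needed <= enabled]


if __name__ == "__main__":
    print(available_goals([]))
    print(available_goals(["prometheus"]))
    print(available_goals(["prometheus", "telemetry"]))
```

With no optional sources enabled, only the three Compute-only goals are returned. Enabling Prometheus adds Thermal Optimization, and enabling Telemetry adds Workload Stabilization and Noisy Neighbor.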

Next Steps

Strategy Configuration

Tune strategy parameters for each configured data source.

Custom Strategies

Build strategies using data from these configured sources.

Troubleshooting

Diagnose data source connectivity failures.

Architecture

Review how data sources feed into the Decision Engine pipeline.