Overview

Grafana connects to Prometheus and other Xloud data sources to provide operational dashboards for infrastructure metrics, service health, storage utilization, and network performance. Grafana is included in the XIMP monitoring stack and is pre-configured with a Prometheus data source pointing to the cluster’s Prometheus instance.
Prerequisites
  • Grafana 9.0 or later (included in XIMP stack)
  • Prometheus deployed and scraping targets (Prometheus integration)
  • Grafana admin credentials (default: sourced from XDeploy configuration)
  • Network access from the Grafana host to Prometheus on port 9090

Data Source Configuration

Open data source settings

Log in to Grafana and navigate to Configuration → Data Sources → Add data source (in Grafana 10 and later, the same page is under Connections → Data sources). Select Prometheus from the time series databases section.

Configure connection

Field | Value | Description
Name | Xloud Prometheus | Display name in dashboard queries
URL | http://10.0.1.71:9090 | Prometheus server address
Scrape interval | 15s | Must match global.scrape_interval in prometheus.yml
HTTP Method | POST | Required for long queries
Leave authentication blank if Prometheus has no auth configured. If basic auth is enabled, enter credentials under the Auth section.

Test and save

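Click Save & Test. Grafana sends a test query to Prometheus.
On success, the status banner reports that the data source is working and shows a sample metric count. If the test fails, check the URL and confirm that the Grafana host can reach Prometheus on port 9090.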
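
The same settings can also be applied without the UI. The following is a minimal sketch, assuming the Grafana HTTP API endpoint POST /api/datasources and a service account token; the Grafana address and token are placeholders that do not come from this guide. It only builds the request, so nothing is sent until you pass it to urllib.request.urlopen.

```python
import json
import urllib.request


def build_datasource_payload(name, url, scrape_interval="15s", http_method="POST"):
    """Return the JSON body for Grafana's POST /api/datasources endpoint."""
    return {
        "name": name,
        "type": "prometheus",
        "access": "proxy",  # the Grafana backend proxies queries to Prometheus
        "url": url,
        "jsonData": {
            "httpMethod": http_method,         # "HTTP Method" field in the UI
            "timeInterval": scrape_interval,   # "Scrape interval" field in the UI
        },
    }


def build_create_request(grafana_url, api_token, payload):
    """Build (but do not send) the HTTP request that creates the data source."""
    return urllib.request.Request(
        url=grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    payload = build_datasource_payload("Xloud Prometheus", "http://10.0.1.71:9090")
    print(json.dumps(payload, indent=2))
    # To apply it (hypothetical host and token):
    # urllib.request.urlopen(build_create_request("http://grafana.example:3000", "TOKEN", payload))
```

Keeping the payload builder separate from the request makes it easy to review the exact values before anything reaches Grafana.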

Pre-Built Dashboard Templates

Node Exporter Full Dashboard

The Node Exporter Full dashboard (Grafana ID 1860) provides comprehensive per-host metrics including CPU, memory, disk I/O, network throughput, and system load.

Open import dialog

Navigate to Dashboards → Import. Enter dashboard ID 1860 in the Import via grafana.com field and click Load.

Select data source

Select Xloud Prometheus from the data source dropdown. Set the dashboard name and folder, then click Import.
The Node Exporter Full dashboard appears with live data for all scraped instances.
Dashboard | Grafana ID | Description
Node Exporter Full | 1860 | Complete per-host metrics: CPU, memory, disk, network
Ceph Cluster | 2842 | Ceph OSD, pool, and IOPS metrics
Prometheus Stats | 2 | Prometheus internal metrics and scrape health
Alertmanager | 9578 | Alert routing and notification delivery status
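
For scripted imports, the data source selection step can be expressed as data. The sketch below assumes that the dashboard JSON has already been downloaded, that it declares its data source placeholders in an `__inputs` list (as dashboards exported for sharing usually do), and that the payload is sent to Grafana's POST /api/dashboards/import endpoint. Treat the endpoint and the payload shape as assumptions to verify against your Grafana version.

```python
def build_import_payload(dashboard, datasource_name, overwrite=True):
    """Map every data source placeholder in the dashboard to one data source.

    `dashboard` is the parsed dashboard JSON. Placeholders of other types
    (for example constants) are left out and must be filled in separately.
    """
    inputs = [
        {
            "name": item["name"],
            "type": "datasource",
            "pluginId": item.get("pluginId", "prometheus"),
            "value": datasource_name,
        }
        for item in dashboard.get("__inputs", [])
        if item.get("type") == "datasource"
    ]
    return {"dashboard": dashboard, "overwrite": overwrite, "inputs": inputs}


if __name__ == "__main__":
    # Trimmed stand-in for a downloaded dashboard file, for illustration only.
    sample = {
        "title": "Node Exporter Full",
        "__inputs": [
            {"name": "DS_PROMETHEUS", "type": "datasource", "pluginId": "prometheus"}
        ],
    }
    print(build_import_payload(sample, "Xloud Prometheus")["inputs"])
```

A dashboard without an `__inputs` list produces an empty `inputs` array; in that case the dashboard typically selects its data source through a template variable instead.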

Custom Dashboard Configuration

Infrastructure Overview Panel

Create a summary row with stat panels for key fleet metrics:
CPU Usage Stat Panel (panel JSON fragment)
{
  "type": "stat",
  "title": "Average CPU Usage",
  "targets": [
    {
      "expr": "100 - avg(rate(node_cpu_seconds_total{mode='idle'}[5m])) * 100",
      "legendFormat": "CPU %"
    }
  ],
  "fieldConfig": {
    "defaults": {
      "unit": "percent",
      "thresholds": {
        "steps": [
          { "color": "green", "value": 0 },
          { "color": "yellow", "value": 60 },
          { "color": "red", "value": 80 }
        ]
      }
    }
  }
}
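
To make the expression and thresholds concrete, here is a small worked sketch of the arithmetic. It mirrors the panel above outside Grafana: the idle fraction is the averaged result of the rate() query, and a reading takes the color of the highest threshold step it meets or exceeds. The function names are illustrative and not part of any Grafana API.

```python
# Threshold steps copied from the panel JSON fragment above.
STEPS = [
    {"color": "green", "value": 0},
    {"color": "yellow", "value": 60},
    {"color": "red", "value": 80},
]


def cpu_usage_percent(idle_fraction):
    """Convert the average idle fraction (0.0 to 1.0) into a busy percentage."""
    return 100 - idle_fraction * 100


def threshold_color(value, steps):
    """Return the color of the highest step whose value is <= the reading."""
    ordered = sorted(steps, key=lambda step: step["value"])
    color = ordered[0]["color"]  # the base step covers everything below it
    for step in ordered:
        if value >= step["value"]:
            color = step["color"]
    return color


if __name__ == "__main__":
    usage = cpu_usage_percent(0.25)  # CPUs were idle 25% of the time
    print(usage, threshold_color(usage, STEPS))
```

With these steps, the panel stays green below 60%, turns yellow from 60% up to 80%, and turns red at 80% and above.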

Auto-Scaling Group Size Panel

Track the current size of Orchestration auto-scaling groups using a time series panel:
Auto-scaling group size query
# Requires the Heat exporter or custom metric pushed from Orchestration stack outputs
xloud_asg_current_size{stack="web-asg-stack"}
Export stack outputs to Prometheus using a cron job that calls the Orchestration API and pushes a gauge metric via the Prometheus Pushgateway. This gives near-real-time ASG size visibility in Grafana without a dedicated Heat exporter; the freshness of the value is bounded by the cron interval plus the Prometheus scrape interval.
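
The push half of such a cron job could look like the sketch below. It assumes the group size has already been read from the Orchestration API (that call is not shown), and it uses a hypothetical Pushgateway address and job name. The body follows the Prometheus text exposition format, and PUT replaces all metrics previously pushed under the same job.

```python
import urllib.request


def render_gauge(metric, value, labels=None):
    """Render one gauge sample in the Prometheus text exposition format."""
    label_str = ""
    if labels:
        pairs = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
        label_str = "{" + pairs + "}"
    return f"# TYPE {metric} gauge\n{metric}{label_str} {value}\n"


def build_push_request(pushgateway_url, job, body):
    """Build (but do not send) a PUT to the Pushgateway for the given job."""
    return urllib.request.Request(
        url=f"{pushgateway_url.rstrip('/')}/metrics/job/{job}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/plain; version=0.0.4"},
        method="PUT",
    )


if __name__ == "__main__":
    current_size = 3  # placeholder: value read from the stack outputs
    body = render_gauge("xloud_asg_current_size", current_size, {"stack": "web-asg-stack"})
    print(body, end="")
    # To push (hypothetical host and job name):
    # urllib.request.urlopen(build_push_request("http://pushgateway.example:9091", "xloud_asg", body))
```

Prometheus must also scrape the Pushgateway for the metric to appear in queries such as the one shown above.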

Alerting in Grafana

Grafana can evaluate alert rules against Prometheus queries and route notifications independently of Alertmanager. Use this for dashboard-level alerts that notify specific teams via Slack, email, or PagerDuty.
Grafana alert rule example
# Configured via Grafana UI: Alerting → Alert rules → New alert rule
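# Expression: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.8
# (busy fraction per instance; grouping by instance keeps the label used in the annotation)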
# For: 2m
# Labels: severity=warning, team=ops
# Annotations: summary="High CPU detected on {{ $labels.instance }}"
Configure notification delivery:

Add a contact point

Navigate to Alerting → Contact points → Add contact point. Select the notification type (Email, Slack, PagerDuty, Webhook) and configure the destination.

Create a notification policy

Navigate to Alerting → Notification policies. Create a policy that matches your alert labels (e.g., severity=critical) and routes to the appropriate contact point.

Create an alert rule

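Navigate to Alerting → Alert rules → New alert rule. Select the Prometheus data source, write the PromQL expression, set the evaluation interval and pending period, and add the labels that your notification policy matches so that the alert is routed to the intended contact point.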

Dashboard Variables

Use template variables to make dashboards interactive across multiple instances, projects, and availability zones:
Instance variable query
label_values(node_uname_info, instance)
Job variable query
label_values(up, job)
Add variables under Dashboard Settings → Variables → Add variable. Set query type to Query, select the Prometheus data source, and enter the label values query.
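
To preview the values a variable would offer before adding it, the lookup can be approximated against the Prometheus HTTP API. This is a rough sketch, assuming the /api/v1/series endpoint is reachable at the data source URL; it approximates what label_values(metric, label) returns and is not Grafana's own implementation.

```python
import urllib.parse


def build_series_url(prometheus_url, metric):
    """Build the /api/v1/series URL that lists all series of one metric."""
    query = urllib.parse.urlencode({"match[]": metric})
    return f"{prometheus_url.rstrip('/')}/api/v1/series?{query}"


def extract_label_values(series_response, label):
    """Return the sorted, de-duplicated values of `label` across all series."""
    values = {
        series[label]
        for series in series_response.get("data", [])
        if label in series
    }
    return sorted(values)


if __name__ == "__main__":
    print(build_series_url("http://10.0.1.71:9090", "node_uname_info"))
    # Sample response with made-up instances, shaped like the API's JSON output.
    sample = {
        "status": "success",
        "data": [
            {"__name__": "node_uname_info", "instance": "10.0.1.72:9100", "job": "node"},
            {"__name__": "node_uname_info", "instance": "10.0.1.73:9100", "job": "node"},
        ],
    }
    print(extract_label_values(sample, "instance"))
```

If the returned list is empty, the metric is not being scraped, and the dashboard variable would be empty as well.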

Next Steps

Prometheus Integration

Configure Prometheus scrape targets and alert rules that feed Grafana data sources

XIMP Monitoring

Explore the built-in XIMP monitoring stack that includes pre-configured Grafana

Auto-Scaling

Visualize auto-scaling group size changes on Grafana time series dashboards

Wazuh SIEM

Integrate Wazuh security events into Grafana for unified security dashboards