Grafana Dashboards

Overview

Grafana connects to Prometheus and other Xloud data sources to provide operational dashboards for infrastructure metrics, service health, storage utilization, and network performance. Grafana is included in the XIMP monitoring stack and is pre-configured with a Prometheus data source pointing to the cluster’s Prometheus instance.

Prerequisites

Grafana 9.0 or later (included in XIMP stack)
Prometheus deployed and scraping targets (Prometheus integration)
Grafana admin credentials (default: sourced from XDeploy configuration)
Network access from the Grafana host to Prometheus on port 9090
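The network prerequisite can be checked from the Grafana host before any configuration. The following is a minimal sketch, assuming Python 3 is available on that host; `can_connect` is a hypothetical helper name, not part of Grafana or XIMP.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

Calling `can_connect("10.0.1.71", 9090)` with the example Prometheus address used on this page should return `True` when the port is reachable. A TCP check only shows that the port is open; it does not confirm that Prometheus is scraping targets.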

Data Source Configuration

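The data source can be configured from the Grafana dashboard (UI) or from the CLI through the HTTP API. The dashboard steps come first.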

Open data source settings

Log in to Grafana and navigate to Configuration → Data Sources → Add data source. Select Prometheus from the time series databases section.

Configure connection

Field	Value	Description
Name	`Xloud Prometheus`	Display name in dashboard queries
URL	`http://10.0.1.71:9090`	Prometheus server address
Scrape interval	`15s`	Must match `global.scrape_interval` in prometheus.yml
HTTP Method	`POST`	Sends the query in the request body, which avoids URL length limits on long queries

Leave authentication blank if Prometheus has no auth configured. If basic auth is enabled, enter credentials under the Auth section.

Test and save

Click Save & Test. Grafana sends a test query to Prometheus.

Status shows Data source is working with a sample metric count.
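The same check can be made outside the UI by running the instant query `up` against the Prometheus HTTP API (`/api/v1/query`) and counting the targets that report `1`. This is a rough sketch under stated assumptions: the function names are hypothetical, and the URL passed in would be the Prometheus address configured above.

```python
import json
import urllib.parse
import urllib.request


def count_up_targets(query_response: dict) -> tuple[int, int]:
    """Return (up, total) from a Prometheus instant-query response for `up`."""
    if query_response.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    samples = query_response["data"]["result"]
    # Each sample value is [timestamp, "string value"]; "1" means the target is up.
    up = sum(1 for s in samples if float(s["value"][1]) == 1.0)
    return up, len(samples)


def fetch_up(prometheus_url: str) -> dict:
    """Run the instant query `up` against the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": "up"})
    url = prometheus_url.rstrip("/") + "/api/v1/query?" + query
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)
```

For example, `count_up_targets(fetch_up("http://10.0.1.71:9090"))` returns a pair such as the number of healthy targets and the total number of scraped targets.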

For automated deployments, provision the data source through the Grafana HTTP API instead of the UI:

Add Prometheus data source via API

curl -X POST -u "admin:$GRAFANA_PASS" http://localhost:3000/api/datasources \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Xloud Prometheus",
    "type": "prometheus",
    "url": "http://10.0.1.71:9090",
    "access": "proxy",
    "isDefault": true,
    "jsonData": {
      "httpMethod": "POST",
      "timeInterval": "15s"
    }
  }'

A successful response includes "message": "Datasource added" and the "id" of the new data source, for example "id": 1.
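The same request can be built in a script that sends the credentials in an `Authorization` header rather than in the URL, so passwords containing characters such as `@` or `:` are handled safely. This is a minimal sketch, not the XDeploy provisioning method; `build_datasource_request` is a hypothetical helper. It uses `timeInterval`, which is the `jsonData` key the Grafana Prometheus data source uses for the Scrape interval setting.

```python
import base64
import json
import urllib.request


def build_datasource_request(grafana_url: str, user: str, password: str,
                             prometheus_url: str) -> urllib.request.Request:
    """Build a POST /api/datasources request with HTTP basic auth."""
    payload = {
        "name": "Xloud Prometheus",
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",
        "isDefault": True,
        "jsonData": {"httpMethod": "POST", "timeInterval": "15s"},
    }
    # Basic auth is base64("user:password") in the Authorization header.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request. Grafana rejects a second data source with the same name, so a deployment script would typically treat that error as "already provisioned".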

Pre-Built Dashboard Templates

Node Exporter Full Dashboard

The Node Exporter Full dashboard (Grafana ID 1860) provides comprehensive per-host metrics including CPU, memory, disk I/O, network throughput, and system load.

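The dashboard can be imported by ID from grafana.com or, for hosts without internet access, from a JSON file. The import-by-ID steps come first.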

Open import dialog

Navigate to Dashboards → Import. Enter dashboard ID 1860 in the Import via grafana.com field and click Load.

Select data source

Select Xloud Prometheus from the data source dropdown. Set the dashboard name and folder, then click Import.

The Node Exporter Full dashboard appears with live data for all scraped instances.

For air-gapped environments, download the dashboard JSON from grafana.com on a connected machine, transfer it to the Grafana host, and import it through the API:

Import dashboard JSON via API

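# The import endpoint expects the dashboard wrapped in a "dashboard" key. "inputs" maps the
# file's data source placeholder (assumed DS_PROMETHEUS; check its __inputs) to a data source.
# Assumes jq is installed on the host.
jq '{dashboard: ., overwrite: true, inputs: [{name: "DS_PROMETHEUS", type: "datasource", pluginId: "prometheus", value: "Xloud Prometheus"}]}' node-exporter-full.json \
  | curl -X POST -u "admin:$GRAFANA_PASS" http://localhost:3000/api/dashboards/import \
      -H "Content-Type: application/json" -d @-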
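Dashboards exported from grafana.com usually carry an `__inputs` list naming the data source placeholders they expect, and the import request has to map each placeholder to an existing data source. The sketch below derives that mapping from the file instead of hard-coding it. The body shape (`dashboard`, `overwrite`, `inputs`, `folderId`) is an assumption based on what the Grafana import dialog sends, and `build_import_payload` is a hypothetical helper.

```python
import json


def build_import_payload(dashboard: dict, datasource_name: str,
                         folder_id: int = 0) -> dict:
    """Wrap an exported dashboard in a body for POST /api/dashboards/import."""
    inputs = [
        {
            "name": item["name"],
            "type": "datasource",
            "pluginId": item["pluginId"],
            "value": datasource_name,
        }
        for item in dashboard.get("__inputs", [])
        if item.get("type") == "datasource"
    ]
    return {
        "dashboard": dashboard,
        "overwrite": True,
        "inputs": inputs,
        "folderId": folder_id,
    }


def load_dashboard(path: str) -> dict:
    """Read a dashboard JSON file downloaded from grafana.com."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A script would call `build_import_payload(load_dashboard("node-exporter-full.json"), "Xloud Prometheus")` and post the JSON-encoded result to the import endpoint.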

Recommended Dashboard IDs

Dashboard	Grafana ID	Description
Node Exporter Full	`1860`	Complete per-host metrics — CPU, memory, disk, network
Ceph Cluster	`2842`	Ceph OSD, pool, and IOPS metrics
Prometheus Stats	`2`	Prometheus internal metrics and scrape health
Alertmanager	`9578`	Alert routing and notification delivery status

Custom Dashboard Configuration

Infrastructure Overview Panel

Create a summary row with stat panels for key fleet metrics:

CPU Usage Stat Panel (panel JSON fragment)

{
  "type": "stat",
  "title": "Average CPU Usage",
  "targets": [
    {
      "expr": "100 - avg(rate(node_cpu_seconds_total{mode='idle'}[5m])) * 100",
      "legendFormat": "CPU %"
    }
  ],
  "fieldConfig": {
    "defaults": {
      "unit": "percent",
      "thresholds": {
        "steps": [
          { "color": "green", "value": 0 },
          { "color": "yellow", "value": 60 },
          { "color": "red", "value": 80 }
        ]
      }
    }
  }
}
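To make the panel arithmetic concrete: the query subtracts the average idle fraction, scaled to a percentage, from 100, and the stat panel then colors the value by the highest threshold step it has reached. The sketch below mirrors that logic in simplified form; it is an illustration of the fragment above, not Grafana's implementation, and both function names are hypothetical.

```python
# Threshold steps copied from the panel fragment above.
STEPS = [
    {"color": "green", "value": 0},
    {"color": "yellow", "value": 60},
    {"color": "red", "value": 80},
]


def cpu_usage_percent(avg_idle_rate: float) -> float:
    """Mirror the PromQL expression: 100 - avg(idle rate) * 100."""
    return 100 - avg_idle_rate * 100


def threshold_color(value: float, steps: list[dict]) -> str:
    """Return the color of the highest step whose value is <= the reading."""
    ordered = sorted(steps, key=lambda s: s["value"])
    color = ordered[0]["color"]
    for step in ordered:
        if value >= step["value"]:
            color = step["color"]
    return color
```

With an average idle rate of 0.25, the usage is 75 percent, which falls between the 60 and 80 steps and renders yellow.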

Auto-Scaling Group Size Panel

Track the current size of Orchestration auto-scaling groups using a time series panel:

Auto-scaling group size query

# Requires the Heat exporter or custom metric pushed from Orchestration stack outputs
xloud_asg_current_size{stack="web-asg-stack"}

Export stack outputs to Prometheus using a cron job that calls the Orchestration API and pushes a gauge metric via the Prometheus Pushgateway. This gives near-real-time ASG size visibility in Grafana without a dedicated Heat exporter; freshness is bounded by the cron interval plus the Prometheus scrape interval.
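The push half of such a cron job could look like the sketch below. It assumes a Pushgateway on its default port 9091 and a hypothetical job name `xloud_asg`; how the group size is read from the Orchestration API is left out and passed in as `size`. In the Pushgateway API the grouping labels go in the URL path and the sample goes in the body, in the text exposition format.

```python
import urllib.parse
import urllib.request


def build_push_request(pushgateway_url: str, stack: str,
                       size: int) -> urllib.request.Request:
    """Build a PUT that sets xloud_asg_current_size for one stack."""
    # Grouping key: job and stack labels are encoded in the URL path.
    path = "/metrics/job/xloud_asg/stack/" + urllib.parse.quote(stack, safe="")
    # Text exposition format; the body must end with a newline.
    body = (
        "# TYPE xloud_asg_current_size gauge\n"
        f"xloud_asg_current_size {size}\n"
    )
    return urllib.request.Request(
        pushgateway_url.rstrip("/") + path,
        data=body.encode(),
        method="PUT",
    )
```

Sending `build_push_request("http://localhost:9091", "web-asg-stack", 3)` with `urllib.request.urlopen` makes the series available with a `stack="web-asg-stack"` label once Prometheus scrapes the Pushgateway. The scrape job for the Pushgateway usually needs `honor_labels: true` so the pushed labels are preserved.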

Alerting in Grafana

Grafana can evaluate alert rules against Prometheus queries and route notifications independently of Alertmanager. Use this for dashboard-level alerts that notify specific teams via Slack, email, or PagerDuty.

Grafana alert rule example

# Configured via Grafana UI: Alerting → Alert rules → New alert rule
# Expression: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.8
# For: 2m
# Labels: severity=warning, team=ops
# Annotations: summary="High CPU detected on {{ $labels.instance }}"
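The `For: 2m` setting means the condition must hold continuously for two minutes before the rule notifies anyone. The following is a simplified model of that behavior, assuming a fixed evaluation interval and ignoring the NoData and Error states; it is a sketch for intuition, not Grafana's state machine, and `alert_states` is a hypothetical name.

```python
def alert_states(condition_results: list[bool], eval_interval_s: int,
                 for_s: int) -> list[str]:
    """Map a sequence of evaluation results to Normal/Pending/Alerting."""
    states = []
    active_for = None  # seconds the condition has been continuously true
    for result in condition_results:
        if not result:
            # Any false evaluation resets the timer.
            active_for = None
            states.append("Normal")
            continue
        active_for = 0 if active_for is None else active_for + eval_interval_s
        states.append("Alerting" if active_for >= for_s else "Pending")
    return states
```

With a one-minute evaluation interval and `for_s=120`, a CPU spike that lasts a single evaluation only reaches Pending and never notifies, which is the purpose of the setting.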

Configure notification routing:

Add a contact point

Navigate to Alerting → Contact points → Add contact point. Select the notification type (Email, Slack, PagerDuty, Webhook) and configure the destination.

Create a notification policy

Navigate to Alerting → Notification policies. Create a policy that matches your alert labels (e.g., severity=critical) and routes to the appropriate contact point.

Create an alert rule

Navigate to Alerting → Alert rules → New alert rule. Select the Prometheus data source, write the PromQL expression, set the evaluation parameters, and add labels that match the notification policy so the alert is routed to the intended contact point.

Dashboard Variables

Use template variables to make dashboards interactive across multiple instances, projects, and availability zones:

Instance variable query

label_values(node_uname_info, instance)

Job variable query

label_values(up, job)

Add variables under Dashboard Settings → Variables → Add variable. Set the variable type to Query, select the Prometheus data source, and enter the label values query. Reference the variable in panel queries with a matcher such as instance=~"$instance".
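Conceptually, `label_values(metric, label)` collects the distinct values of one label across all series of a metric. The sketch below reproduces that on the `data` list of a Prometheus `/api/v1/series` response, which can help when checking what a variable dropdown should contain. The function name and the sample addresses in the usage note are illustrative assumptions.

```python
def label_values(series: list[dict], label: str) -> list[str]:
    """Return the sorted distinct values of `label` across a list of series."""
    # Series that do not carry the label are skipped rather than reported as empty.
    return sorted({s[label] for s in series if label in s})
```

Given two `node_uname_info` series with `instance` values `10.0.1.12:9100` and `10.0.1.11:9100`, the function returns both addresses in sorted order, matching the entries expected in the instance dropdown.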

Next Steps

Prometheus Integration

Configure Prometheus scrape targets and alert rules that feed Grafana data sources

XIMP Monitoring

Explore the built-in XIMP monitoring stack that includes pre-configured Grafana

Auto-Scaling

Visualize auto-scaling group size changes on Grafana time series dashboards

Wazuh SIEM

Integrate Wazuh security events into Grafana for unified security dashboards
