# Troubleshooting

> Diagnose XIMP administrative issues: high-cardinality metric performance problems, log ingestion backlogs, missing dashboard data, and scrape target failures.

## Overview

This page covers administrator-level XIMP troubleshooting. For user-facing issues such as alert delivery failures and missing metrics on dashboards, see the [XIMP User Guide Troubleshooting](/services/monitoring/user-guide/troubleshooting) page.

**Administrator Access Required**: The operations on this page require the `admin` role. Contact your Xloud administrator if you do not have sufficient permissions.

**Prerequisites**

* Administrator credentials with the `admin` role
* Access to the XIMP CLI and management interfaces

***

## Common Issues

### High-cardinality metrics

**Cause**: Metric labels with unbounded values (e.g., request IDs, user IDs, or ephemeral container names) create millions of unique metric series, degrading query performance and consuming excessive storage.

**Diagnosis**:

```bash title="List highest-cardinality metric series" theme={null}
ximp metric cardinality top --limit 20
```

**Resolution**: Drop or relabel high-cardinality labels in the scrape configuration:

```yaml title="Relabel config: drop high-cardinality label" theme={null}
relabel_configs:
  - source_labels: [request_id]
    action: drop
```

Apply via:

```bash title="Apply relabel configuration" theme={null}
ximp target update --relabel-file relabel.yaml
```

Dropping a label is irreversible. Metrics ingested after the change are stored without the label, and it cannot be added back to that data later. Consider using `labelmap` to replace high-cardinality values with aggregate labels instead of dropping them entirely.

If XIMP follows Prometheus relabeling semantics, confirm the action names before applying: in Prometheus, `drop` discards every target or series that matches the rule, `labeldrop` removes only the matching label, and `labelmap` renames labels rather than rewriting their values.

### Log ingestion backlog

**Cause**: Log volume exceeds the collector's processing capacity, causing a write backlog and delayed delivery to the search index.

**Diagnosis**:

```bash title="Check ingestion queue depth" theme={null}
ximp log ingest-status
```

**Resolution**:

* Reduce log verbosity on high-volume services (for example, disable `DEBUG` or raise the log level to `WARNING`):

  ```bash title="Example: reduce Nova log level" theme={null}
  docker exec nova_api crudini --set /etc/nova/nova.conf DEFAULT debug false
  ```
* Increase the log collector worker count in the XIMP configuration: navigate to **Monitor Center > Logging** (Collector Settings, admin view).
* Add a second log collector node through XDeploy for horizontal scaling.

A single high-verbosity service at `DEBUG` level can generate more log volume than 100 services at `INFO`. Identify the top log emitters with `ximp log stats top-emitters --last 1h`.

### Scrape target reported as DOWN

**Cause**: The scrape target is unreachable because of a firewall block, a stopped service, or an authentication failure.

**Diagnosis**:

```bash title="Check specific target health" theme={null}
ximp target health --target --verbose
```

Common causes:

| Symptom                 | Cause                              | Resolution                                     |
| ----------------------- | ---------------------------------- | ---------------------------------------------- |
| Connection refused      | Service not running on target port | Verify service is running; check port          |
| Timeout                 | Firewall blocking                  | Add inbound rule for XIMP collector IP         |
| 401 Unauthorized        | Invalid auth credentials           | Update auth config in target definition        |
| 503 Service Unavailable | Service overloaded                 | Review service health; reduce scrape frequency |

### Metric store storage exhausted

**Cause**: Metric volume has exceeded the storage allocated to the metric store. This can result from high cardinality, insufficient retention management, or unexpected metric bursts.

**Diagnosis**:

```bash title="Check metric store disk usage" theme={null}
ximp storage status
```

**Resolution** (in order of preference):

1. Reduce raw metric retention to free space immediately:

   ```bash title="Reduce raw retention to 15 days (emergency)" theme={null}
   ximp retention set --type metrics-raw --duration 15d
   ```
2. Identify and drop high-cardinality series (see the high-cardinality issue above).
3. Expand storage on the metric store node through XDeploy.
4. Add a second metric store node for horizontal capacity.

### Missing dashboard data

**Cause**: The scrape target is down, the agent is offline, or the metric name has changed after a software update.

**Diagnosis**:

1. Check target health: `ximp target health --target `
2. Verify the agent is active: `ximp agent list --node `
3. Search for the metric by prefix to find renamed metrics:

   ```bash title="Search metrics by prefix" theme={null}
   ximp metric search --prefix xloud_compute_cpu
   ```

If the metric was renamed in a recent software update, update dashboard queries and alert rules to use the new metric name.

***

## Diagnostics Reference

| Issue            | Diagnostic Command                       |
| ---------------- | ---------------------------------------- |
| Cardinality      | `ximp metric cardinality top --limit 20` |
| Log backlog      | `ximp log ingest-status`                 |
| Target DOWN      | `ximp target health --verbose`           |
| Storage usage    | `ximp storage status`                    |
| Agent offline    | `ximp agent list --status offline`       |
| Top log emitters | `ximp log stats top-emitters --last 1h`  |

***

## Next Steps

* Review and fix agent configuration that may be causing issues
* Adjust retention settings to address storage pressure
* Review and fix scrape target configurations
* User-facing issues: alerts not firing, log delays
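
As a companion to the Diagnostics Reference table, the commands listed there can be collected into a single triage pass. The following is a minimal sketch, not an official XIMP tool: it assumes `ximp` is on `PATH` and that the caller holds the `admin` role. The `XIMP_BIN` variable and the `run_diagnostics` function are hypothetical names introduced only for this illustration; the argument lists themselves are copied verbatim from the table.

```bash
#!/usr/bin/env bash
# Triage sketch: run every command from the Diagnostics Reference table in order.
# Assumptions: `ximp` is on PATH and the caller holds the `admin` role.
# XIMP_BIN is a hypothetical override (not an XIMP setting), handy for dry runs.
XIMP_BIN="${XIMP_BIN:-ximp}"

# Argument lists copied verbatim from the Diagnostics Reference table.
diagnostics=(
  "metric cardinality top --limit 20"
  "log ingest-status"
  "target health --verbose"
  "storage status"
  "agent list --status offline"
  "log stats top-emitters --last 1h"
)

run_diagnostics() {
  local failed=0 args
  for args in "${diagnostics[@]}"; do
    printf '== ximp %s ==\n' "$args"
    # Unquoted on purpose: each entry is a plain space-separated argument list.
    # shellcheck disable=SC2086
    "$XIMP_BIN" $args || failed=$((failed + 1))
  done
  printf '%d of %d diagnostics failed\n' "$failed" "${#diagnostics[@]}"
  return "$failed"
}

# Run only when the CLI is actually available on this host.
if command -v "$XIMP_BIN" >/dev/null 2>&1; then
  run_diagnostics
fi
```

Saved under a name of your choosing, the script can be dry-run by setting `XIMP_BIN=echo`, which prints each argument list instead of executing it. A failing command does not stop the pass, so one unreachable subsystem does not hide the state of the others.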