Skip to main content

Overview

This guide covers platform-level DNS issues that require administrator access. For user-facing issues such as record set errors or propagation delays, see the DNS Troubleshooting guide.
Administrator Access Required — This operation requires the admin role. Contact your Xloud administrator if you do not have sufficient permissions.

Diagnostic Checklist

Before investigating individual components, run this quick health check:
Check DNS service container status
docker ps --filter name=dns
Verify DNS API is responding
openstack zone list --limit 1
Check nameserver is authoritative for a zone
dig @<nameserver-ip> example.com. SOA +norecurse

Platform Issues

DNS API not responding

Cause: The DNS API service container may be unavailable or crashed.Resolution:
Check DNS service containers
docker ps --filter name=dns
Review container logs
docker logs dns-api --tail 100
If the container exited unexpectedly, check for configuration errors in the startup log. Restart affected containers:
Restart DNS API container
docker restart dns-api
If the container fails to start, verify the database connection and configuration files deployed by XDeploy.
Cause: The DNS worker may be unable to push updates to backend nameservers.Resolution:
  1. Verify the worker container is running and connected to the message queue:
    Check DNS worker container
    docker ps --filter name=dns-worker
    docker logs dns-worker --tail 50
    
  2. Check for push errors in the worker logs — look for connection refused or timeout errors to pool target addresses
  3. Verify network connectivity from the worker host to each nameserver’s management port:
    Test connectivity to nameserver target
    curl -s http://<nameserver-ip>:<target-port>/
    
  4. If a nameserver is unreachable, temporarily remove it from the pool via XDeploy while investigating
Cause: Zone synchronization lag — one nameserver received the update while another did not.Diagnosis:
Check zone serial on each nameserver
dig @<ns1-ip> example.com. SOA +short
dig @<ns2-ip> example.com. SOA +short
SOA serial numbers should match. A lagging nameserver indicates a synchronization failure.Resolution: Check the DNS worker logs for push errors to the specific nameserver’s target configuration. If the push is failing due to authentication or connectivity, correct the target configuration in XDeploy and redeploy.
Cause: The worker cannot process zone updates as fast as they are being created — typically due to a slow or overloaded nameserver backend.Resolution:
Check message queue depth
docker logs dns-central --tail 50 | grep queue
  • If a specific nameserver backend is slow, investigate its resource usage
  • Consider adding worker replicas via XDeploy to parallelize processing
  • If the backlog grew due to a temporary nameserver outage, it should drain automatically once connectivity is restored
Cause: The DNS service cannot reach the database, or the database schema is not initialized.Resolution:
Check DNS API logs for DB errors
docker logs dns-api --tail 100 | grep -i "error\|database\|connect"
Verify the database service is running and that the DNS service configuration in XDeploy has the correct connection credentials and host address.

Log Locations

ComponentLog Command
DNS APIdocker logs dns-api
DNS Centraldocker logs dns-central
DNS Workerdocker logs dns-worker
DNS MDnsdocker logs dns-mdns

Next Steps

DNS Troubleshooting (User)

User-facing DNS issues — zone errors, propagation, CNAME conflicts

Architecture

Understand component roles to narrow down failure scope

Backend Configuration

Verify and update pool target configuration

Pool Management

Remove or replace unhealthy nameservers from pools