Overview
This guide covers administrator-level troubleshooting for the K8SaaS platform, from Conductor startup failures and Heat stack errors to certificate issues and quota enforcement problems. For user-facing issues such as individual cluster access failures, see the Kubernetes User Troubleshooting guide.
Common Issues
Clusters fail to create across projects
Cause: The K8SaaS Conductor cannot reach the Orchestration service, or the cluster template references an image or flavor that does not exist.
Resolution:
Check Conductor logs
Look for ConnectionError, NotFound, or AuthenticationRequired messages.
Verify Orchestration service is healthy
Verify node image exists
If the image is missing, upload it and ask project teams to retry cluster creation.
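As a rough aid for the log check above, the following sketch counts how often each of the three error names appears in a set of Conductor log lines. The function name `count_conductor_errors` is hypothetical, and the way you obtain the log lines (file, container log command, log aggregator) is left open because it depends on your deployment.

```python
import re
from collections import Counter

# Error names taken from the troubleshooting guidance above.
PATTERNS = ("ConnectionError", "NotFound", "AuthenticationRequired")
_MATCHER = re.compile("|".join(PATTERNS))


def count_conductor_errors(lines):
    """Count occurrences of each known error name in an iterable of log lines.

    Hypothetical helper: matching is by substring, so a line containing
    "ImageNotFound" is counted under "NotFound".
    """
    counts = Counter()
    for line in lines:
        for match in _MATCHER.findall(line):
            counts[match] += 1
    return counts
```

A dominant `ConnectionError` or `AuthenticationRequired` count points toward the Orchestration service or its credentials, while `NotFound` more often points toward a missing image or flavor.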
Heat stack in FAILED state
Cause: The Orchestration template failed during resource creation because of quota exhaustion, a dependency failure (LB or DNS), or a template rendering error.
Resolution:
Show failed stack events
Find the stack name for a cluster
Address the root cause (quota, network, service availability) and then delete the failed cluster before retrying:
Delete failed cluster
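When a stack has many events, it can help to reduce them to the failures only. The sketch below assumes the events have already been exported as a list of dictionaries (for example from a JSON-formatted event listing) and that each entry carries `resource_name`, `resource_status`, and `resource_status_reason` keys. Those key names and both function names are assumptions for illustration; adjust them to match what your tooling emits.

```python
def failed_events(events):
    """Return events whose status ends in _FAILED, in their original order."""
    return [
        e for e in events
        if str(e.get("resource_status", "")).endswith("_FAILED")
    ]


def summarize_failures(events):
    """Build one line per failed event: resource, status, and reason."""
    return [
        "{}: {} - {}".format(
            e.get("resource_name", "?"),
            e.get("resource_status"),
            e.get("resource_status_reason", ""),
        )
        for e in failed_events(events)
    ]
```

The reason text is usually what distinguishes quota exhaustion from a dependency failure, so reading the summary lines first is a reasonable way to decide which root cause to address.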
Certificate errors after node replacement
Cause: Replaced nodes received new TLS certificates that do not match the cluster CA recorded in the K8SaaS database.
Resolution: Rotate the cluster CA to regenerate consistent certificates:
Rotate cluster CA
Notify all project users to refresh their kubeconfig after the rotation completes.
DELETE_FAILED — cluster cannot be removed
Cause: The underlying Heat stack has resources in a failed state that prevent cleanup, or a resource dependency is blocking deletion.
Resolution:
Show stack deletion error
List stack resources
Manually delete the blocking resource (e.g., a floating IP still attached to a deleted VM):
Force-delete the Heat stack
After manual cleanup, delete the cluster record from K8SaaS:
Force delete cluster record
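To decide which resources need manual cleanup, one option is to group the stack resources that failed to delete by their type. This is a minimal sketch, assuming the resource listing is available as a list of dictionaries with `resource_type`, `physical_resource_id`, and `resource_status` keys; the key names and the function name `blocking_resources` are assumptions, not a documented K8SaaS interface.

```python
from collections import defaultdict


def blocking_resources(resources):
    """Group physical IDs of DELETE_FAILED resources by resource type."""
    grouped = defaultdict(list)
    for res in resources:
        if res.get("resource_status") == "DELETE_FAILED":
            grouped[res.get("resource_type", "unknown")].append(
                res.get("physical_resource_id", "")
            )
    return dict(grouped)
```

The resulting mapping gives the IDs to inspect in the owning service (networking, compute, and so on) before retrying the stack deletion.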
Conductor not processing cluster tasks
Cause: The Conductor is overloaded, has lost database connectivity, or crashed due to an unhandled exception.
Resolution:
Check Conductor status and logs
Restart the Conductor if it shows as unhealthy or has no recent log output:
Restart Conductor
Increase the worker count if the Conductor is consistently behind:
/etc/xavs/kubernetes/kubernetes.conf
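Before changing the worker count, it may be useful to confirm what is currently configured. The sketch below reads INI-style configuration text and returns an integer option if present. The section name `conductor` and option name `workers` are placeholders chosen for illustration; this guide does not state the actual option name, so check your configuration file for the real one.

```python
import configparser


def conductor_workers(conf_text, section="conductor", option="workers"):
    """Return the configured worker count, or None if the option is not set.

    The default section and option names are hypothetical placeholders.
    """
    # strict=False tolerates repeated sections or options in the file.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    if not parser.has_option(section, option):
        return None
    return parser.getint(section, option)
```

To use it against the file named above, read the file contents into a string and pass them as `conf_text`. A `None` result means the option is not set explicitly, in which case the service is presumably running with its built-in default.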
Diagnostic Commands Reference
Check all K8SaaS container statuses
List all clusters across all projects
Show cluster with full detail
Show Orchestration stack events
Check K8SaaS API logs
Next Steps
Monitoring
Monitor all clusters for failed and stuck lifecycle states.
Certificates
Resolve certificate errors with CA rotation.
Quotas
Resolve quota-related cluster creation failures.
User Troubleshooting
User-facing guide for individual cluster access and health issues.