
Overview

This page covers common issues encountered when using XSDS block, object, and shared file storage — including diagnosis steps and resolutions for each scenario.
Prerequisites
  • Access to the Xloud Dashboard and CLI (openstack CLI authenticated)
  • For advanced diagnostics, contact your storage administrator. Your administrator can configure this through XDeploy.

Common Issues

Volume not creating
Cause: The storage scheduler could not place the volume on a suitable backend, or the backend is temporarily unavailable.
Diagnosis:
Check volume status and fault message
openstack volume show <VOLUME_NAME> -c status -c fault
Common causes and resolutions:
Cause | Resolution
Insufficient capacity in the target pool | Choose a different volume type, or contact your administrator to expand capacity.
Volume type references an unavailable backend | Try a different volume type; contact your administrator if the issue persists.
Storage service temporarily unhealthy | Wait 2–3 minutes and check the status again; contact your administrator if the problem persists beyond 5 minutes.
Quota exceeded | Check the quota with openstack quota show --volume and request an increase from your administrator.
Contact your storage administrator if the volume remains in the creating state for more than 5 minutes.
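The 5-minute guidance above can be automated with a small polling loop. The following is a minimal sketch, not an official Xloud tool: the function names, the 15-second poll interval, and the injectable `get_status`, `sleep`, and `clock` parameters are assumptions made for illustration. It assumes the `openstack` CLI is installed and authenticated.

```python
import json
import subprocess
import time


def volume_status(volume):
    """Return the volume's status using the openstack CLI's JSON output."""
    out = subprocess.run(
        ["openstack", "volume", "show", volume, "-f", "json", "-c", "status"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)["status"]


def wait_while_creating(volume, get_status=volume_status,
                        timeout_s=300, interval_s=15,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until the volume leaves 'creating' or timeout_s elapses.

    Returns the last observed status; 'creating' means the wait timed out.
    """
    deadline = clock() + timeout_s
    status = get_status(volume)
    while status == "creating" and clock() < deadline:
        sleep(interval_s)
        status = get_status(volume)
    return status
```

If `wait_while_creating("<VOLUME_NAME>")` still returns `creating`, the volume has exceeded the 5-minute window and the issue is worth escalating to your storage administrator.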
Volume cannot be deleted
Cause: A snapshot derived from this volume still exists, or an ongoing operation holds a lock on the volume.
Diagnosis:
List snapshots from this volume
openstack volume snapshot list --volume <VOLUME_NAME_OR_ID>
Resolution:
  1. Delete all snapshots that were created from this volume:
    Delete child snapshot
    openstack volume snapshot delete <SNAPSHOT_ID>
    
  2. After all child snapshots are deleted, retry the volume deletion:
    Retry volume deletion
    openstack volume delete <VOLUME_NAME_OR_ID>
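The ordering in the steps above (child snapshots first, then the volume) can be sketched as a helper that turns the snapshot list into an ordered command plan. This is an illustrative sketch: `snapshot_ids` and `deletion_plan` are hypothetical names, and the input is assumed to be the output of `openstack volume snapshot list --volume <VOLUME_NAME_OR_ID> -f json`, where each row carries an `ID` column.

```python
import json


def snapshot_ids(snapshot_list_json):
    """Extract snapshot IDs from the CLI's JSON listing (a list of row objects)."""
    return [row["ID"] for row in json.loads(snapshot_list_json)]


def deletion_plan(volume, snapshot_list_json):
    """Return the CLI commands in order: child snapshots first, the volume last."""
    plan = [["openstack", "volume", "snapshot", "delete", sid]
            for sid in snapshot_ids(snapshot_list_json)]
    plan.append(["openstack", "volume", "delete", volume])
    return plan
```

Each entry can be passed to `subprocess.run(cmd, check=True)`. Snapshot deletion may complete asynchronously, so it is safer to re-run the snapshot list and confirm it is empty before issuing the final volume delete.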
    
Snapshot fails or is inconsistent
Cause: The source volume is attached and has in-flight I/O, or the snapshot quota for the project has been reached.
Diagnosis:
Check project snapshot quota
openstack quota show --volume
Look for the snapshots entry in the output. If usage has reached the limit, request a quota increase.
Resolution:
  • For quota exceeded: contact your administrator to increase the snapshot quota.
  • For consistency issues: flush application writes before taking a snapshot of a database volume (see Snapshots — Consistency)
Crash-consistent snapshots capture the on-disk state at the moment of the snapshot request. For databases and stateful applications, coordinate with application-level freeze/thaw procedures to ensure data integrity.
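One way to coordinate a freeze and thaw around the snapshot request is to suspend filesystem writes with `fsfreeze` for the duration of the request. The sketch below rests on several assumptions: it runs as root inside the instance, the volume is mounted at a data mount point (not the root filesystem), and the `openstack` CLI is available there. The `frozen` and `consistent_snapshot` names are hypothetical, and a database may still need its own flush or backup mode on top of this.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def frozen(mountpoint, run=subprocess.run):
    """Freeze the filesystem at mountpoint and always thaw it on exit."""
    run(["fsfreeze", "--freeze", mountpoint], check=True)
    try:
        yield
    finally:
        # Thaw even if the snapshot request fails, so writes are never left blocked
        run(["fsfreeze", "--unfreeze", mountpoint], check=True)


def consistent_snapshot(volume, name, mountpoint, run=subprocess.run):
    """Request a snapshot while writes to the mounted filesystem are suspended."""
    with frozen(mountpoint, run=run):
        # --force allows the snapshot request while the volume is attached
        run(["openstack", "volume", "snapshot", "create",
             "--volume", volume, "--force", name], check=True)
```

The `run` parameter exists only so the command sequence can be exercised without touching a real filesystem; in normal use the default `subprocess.run` applies.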
Object storage transfers are slow
Cause: Large numbers of small objects, high latency between the client and the gateway, or intra-cluster traffic being routed through the public internet.
Resolution:
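  • Use multipart upload for objects larger than 100 MB:
    boto3 multipart upload
    import boto3
    from boto3.s3.transfer import TransferConfig

    # Add endpoint_url=... if your regional endpoint is not already configured
    s3 = boto3.client('s3')
    s3.upload_file(
        'large_file.tar.gz', 'my-bucket', 'large_file.tar.gz',
        Config=TransferConfig(
            multipart_threshold=1024 * 1024 * 100,  # use multipart above 100 MB
            multipart_chunksize=1024 * 1024 * 50,   # upload in 50 MB parts
        ),
    )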
    
  • For small-object workloads, batch objects into larger archives where the application permits
  • Verify network path to the storage endpoint — avoid routing through the public internet for intra-cluster traffic
Use the S3 API endpoint local to your region for lowest latency. Check Project → Object Store → Endpoints in the Dashboard for your regional endpoint URL.
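For the small-object batching suggestion above, a minimal sketch using only the Python standard library might look like the following. The `batch_into_archive` name is hypothetical, and the archive path is assumed to live outside the source directory so the archive does not include itself.

```python
import tarfile
from pathlib import Path


def batch_into_archive(source_dir, archive_path):
    """Pack every regular file under source_dir into one gzip-compressed tar.

    archive_path should be outside source_dir. Returns the number of files added.
    """
    source = Path(source_dir)
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store paths relative to source_dir so the archive is relocatable
                tar.add(path, arcname=path.relative_to(source).as_posix())
                count += 1
    return count
```

The resulting archive can then be uploaded as a single object, which replaces many small requests with one larger transfer. Whether this fits depends on how the application reads the data back.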
NFS mount fails
Cause: Firewall rules are blocking NFS traffic, the export path is incorrect, or the NFS gateway service is unhealthy.
Diagnosis:
List the exports offered by the NFS gateway
showmount -e <gateway-ip>
List the RPC services registered on the gateway
rpcinfo -p <gateway-ip>
Resolution:
  • Ensure security group rules on the client instance permit outbound traffic to the NFS gateway on ports 111 (portmapper) and 2049 (NFS)
  • Verify the export path matches exactly what was provided in the Dashboard
  • If showmount hangs, the NFS gateway may be temporarily unavailable — contact your storage administrator
NFS port 2049 must be open in the security group applied to client instances. Navigate to Project → Network → Security Groups and verify the rule exists.
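If `showmount` or `rpcinfo` is not installed on the client, a plain TCP connection test against ports 111 and 2049 can help separate a blocked port from an unhealthy gateway. This is a sketch using the standard `socket` module; the function names and the 3-second timeout are illustrative choices, and a successful connection only shows the port is reachable, not that the export is valid.

```python
import socket


def tcp_port_open(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


def check_nfs_ports(gateway_ip, ports=(111, 2049)):
    """Map each NFS-related port to whether it accepted a TCP connection."""
    return {port: tcp_port_open(gateway_ip, port) for port in ports}
```

A result such as `{111: True, 2049: False}` from `check_nfs_ports("<gateway-ip>")` would point at a missing rule for port 2049 rather than a gateway outage.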
Object storage authentication fails
Cause: The access key or secret key is incorrect, expired, or belongs to a different project.
Resolution:
  1. Verify credentials in the Dashboard under Project → Object Store → Access Keys
  2. If the key was deleted or lost, generate a new key pair:
    • Navigate to Project → Object Store → Access Keys → Create Key
    • Update all applications and configuration files using the old key
  3. Confirm the endpoint URL matches your region:
    Verify S3 endpoint
    openstack catalog show object-store
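When a request is rejected, the error code in the S3 response often narrows down which of the causes above applies. The mapping below is a sketch based on standard S3 error codes; it assumes the XSDS object gateway returns the same codes, which this page does not confirm, and `credential_hint` is a hypothetical helper name.

```python
# Standard S3 error codes mapped to the most likely credential problem.
S3_ERROR_HINTS = {
    "InvalidAccessKeyId": "The access key is unknown: check for typos or a deleted key.",
    "SignatureDoesNotMatch": "The secret key does not match the access key.",
    "AccessDenied": "The key is valid but lacks access: it may belong to a different project.",
}


def credential_hint(error_code):
    """Return a troubleshooting hint for an S3 error code."""
    return S3_ERROR_HINTS.get(
        error_code,
        "Unrecognized code: verify the endpoint URL and regenerate the key pair if needed.",
    )
```

With boto3, the code is available on a caught `botocore.exceptions.ClientError` as `exc.response["Error"]["Code"]`.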
    

Diagnostics Reference

Issue | First Diagnostic Command
Volume not creating | openstack volume show <VOL> -c status -c fault
Quota check | openstack quota show --volume
Snapshot list for volume | openstack volume snapshot list --volume <VOL>
Object storage endpoint | openstack catalog show object-store
NFS gateway reachability | showmount -e <gateway-ip>

When to Contact Support

Contact support@xloud.tech if:
  • A volume has been stuck in creating or deleting state for more than 10 minutes
  • The storage administrator cannot resolve the issue from the cluster admin CLI
  • You observe data inconsistency after a snapshot restore
  • Object storage bucket contents are missing unexpectedly
When opening a support ticket, include the output of openstack volume show <VOLUME_ID> or openstack volume snapshot show <SNAPSHOT_ID> — the fault and migration_status fields are particularly useful for diagnosis.
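To pull out the fields mentioned above for a ticket, the JSON form of the command output can be reduced to a short summary. This sketch assumes the input comes from `openstack volume show <VOLUME_ID> -f json`; `support_summary` is a hypothetical name, and any field that is absent from the output is reported as `None` rather than guessed.

```python
import json


def support_summary(volume_show_json,
                    fields=("id", "status", "fault", "migration_status")):
    """Pick the fields most useful for a support ticket from the CLI's JSON output."""
    data = json.loads(volume_show_json)
    # .get() keeps the summary well-formed even when a field is not present
    return {field: data.get(field) for field in fields}
```

Attach the full command output to the ticket as well; the summary is only a convenience for the ticket description.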

Next Steps

XSDS Admin Troubleshooting

Cluster-level diagnostics for storage administrators — OSD failures, slow requests

Data Protection

Configure replication and erasure coding to reduce exposure to hardware failures

Snapshots

Best practices for creating consistent snapshots to minimize recovery risk

Support

Contact Xloud support for issues that require cluster-level investigation