
Overview

This guide covers platform-level object storage issues that require administrator access. For user-facing issues such as 403 access errors or upload timeouts, see the Object Storage Troubleshooting guide.
Administrator Access Required — This operation requires the admin role. Contact your Xloud administrator if you do not have sufficient permissions.

Diagnostic Checklist

Overall cluster health
xavs-storage-recon --all
Disk usage across all nodes
xavs-storage-recon --diskusage
Replication status
xavs-storage-recon --replication
Ring file consistency
xavs-storage-recon --md5
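The four checks above can be run in sequence with a small wrapper. The `xavs-storage-recon` flags are taken directly from the checklist; the `run_checks` helper and its `DRY_RUN` toggle are conveniences sketched here, not part of the tool:

```shell
#!/bin/sh
# run_checks: run the four diagnostic checks in order. With DRY_RUN=1 the
# commands are only printed, useful for review on a production cluster.
run_checks() {
  for flag in --all --diskusage --replication --md5; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      printf 'xavs-storage-recon %s\n' "$flag"
    else
      xavs-storage-recon "$flag" \
        || printf 'check %s reported errors\n' "$flag" >&2
    fi
  done
}

DRY_RUN=1 run_checks   # print the plan; drop DRY_RUN=1 to execute
```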

Platform Issues

507 Insufficient Storage on object PUT

Cause: One or more storage nodes targeted by the ring have insufficient free space to accept the write.
Diagnosis:
Check disk usage per node
xavs-storage-recon --diskusage
Resolution:
  • Identify nodes or drives above 85% utilization
  • Expand storage capacity by adding new drives (see Ring Management)
  • Alternatively, rebalance the ring to shift weight toward nodes with available space:
    Adjust weight on high-capacity node
    xavs-ring-builder object.builder set_weight <device-id> <reduced-weight>
    xavs-ring-builder object.builder rebalance
    xavs-ring-builder object.builder write_ring
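To spot drives above the 85% threshold mechanically, a small awk filter over standard `df -P` output works on any storage node. The `flag_full` helper and the threshold default are illustrative, not part of the platform tooling:

```shell
# flag_full: read `df -P`-style output on stdin and print mount points
# whose Use% (column 5 in POSIX `df -P` output) exceeds a threshold.
flag_full() {
  threshold="${1:-85}"
  awk -v t="$threshold" 'NR > 1 {
    use = $5; sub(/%/, "", use)        # strip the trailing % sign
    if (use + 0 > t) print $6, use "%"
  }'
}

# Example (run on a storage node): df -P | flag_full 85
```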
    
Slow object reads and writes

Cause: Replication traffic competing with foreground I/O, or a storage node with degraded drives experiencing high read latency.
Diagnosis:
Check replication load
xavs-storage-recon --replication --verbose
If a specific node shows high replication_time, inspect that node’s disk I/O:
Check disk I/O on storage node (SSH required)
iostat -x 1 5
Resolution:
  • Consider throttling the replicator with --concurrency 1 during peak hours
  • If a specific drive is degraded, reduce its ring weight to shift load away
  • Replace drives showing high latency or recurring I/O errors in dmesg
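The iostat output above can be filtered to surface only saturated devices. The `busy_devices` helper is a sketch that assumes %util is the last column, which holds for recent sysstat releases; check the header line on your version:

```shell
# busy_devices: read `iostat -x` output on stdin and print devices whose
# %util exceeds a threshold. Assumes %util is the LAST column of each
# device line (true for recent sysstat versions).
busy_devices() {
  threshold="${1:-90}"
  awk -v t="$threshold" '/^[a-z]/ { if ($NF + 0 > t) print $1, $NF "%" }'
}

# Example (on the storage node): iostat -x 1 5 | busy_devices 90
```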

Ring file mismatch after rebalance

Cause: The updated ring file was not distributed to all nodes after a rebalance.
Diagnosis:
Check MD5 of ring files on all nodes
xavs-storage-recon --md5
Resolution: Nodes with mismatched MD5 hashes have stale ring files. Redistribute the ring to affected nodes:
Copy ring files to an affected node
scp /etc/xavs-object-storage/*.ring.gz <node-ip>:/etc/xavs-object-storage/
Restart the object-server and replicator on the affected node after distribution.
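Before and after redistribution, each node's ring files can be verified locally against the builder host using standard `md5sum -c`. The `stale_rings` helper, the `RING_DIR` override, and the reference-file workflow are illustrative:

```shell
# stale_rings: check ring files in RING_DIR against a reference checksum
# file generated on the builder host with:
#   md5sum *.ring.gz > rings.md5
# Prints a "FAILED" line for every stale file; prints nothing if all match.
stale_rings() {
  ref="$1"
  ( cd "${RING_DIR:-/etc/xavs-object-storage}" \
      && md5sum -c --quiet "$ref" 2>&1 ) || true
}
```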

Objects quarantined by the auditor

Cause: Drive failure, bit rot, or network errors during replication causing data corruption detected by the auditor.
Diagnosis:
Check quarantine counts
xavs-storage-recon --quarantined --verbose
Check drive health (SSH to node)
smartctl -a /dev/<device>
dmesg | grep -i "error\|fail\|ata"
Resolution:
  1. If the drive is failing, set its ring weight to 0 and rebalance to drain data
  2. Replace the physical drive
  3. Add the replacement drive to the ring and rebalance
  4. The replicator will restore the quarantined objects from healthy replicas
Do not simply delete quarantined objects — they may be the only remaining copy if other replicas are also corrupted. Always verify healthy replicas exist before any quarantine cleanup.
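Steps 1 and 3 reuse the ring-builder workflow shown earlier for the 507 case. A sketch that prints the drain commands for review before execution; the helper name and the device-id argument are placeholders, not part of the CLI:

```shell
# drain_device_plan: print the xavs-ring-builder commands that drain a
# failing device by setting its weight to 0, rebalancing, and writing the
# ring. Printing rather than executing lets the operator review first.
drain_device_plan() {
  dev="$1"
  printf 'xavs-ring-builder object.builder set_weight %s 0\n' "$dev"
  printf 'xavs-ring-builder object.builder rebalance\n'
  printf 'xavs-ring-builder object.builder write_ring\n'
}
```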

Proxy cannot reach storage node quorum

Cause: The proxy-server cannot reach a quorum of storage nodes for an operation.
Diagnosis:
Check proxy container status
docker ps --filter name=swift-proxy
docker logs swift-proxy --tail 50
Verify storage nodes are reachable from proxy
xavs-storage-recon --all | grep -i "error\|fail"
Resolution:
  • Verify storage node containers are running: docker ps --filter name=swift
  • Check network connectivity from proxy hosts to storage nodes on ports 6200, 6201, 6202
  • If nodes are degraded, the proxy will still serve reads from available replicas but writes require the configured replica quorum
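The replica quorum mentioned above is, in Swift-style replicated policies, a simple majority of the replica count; verify this assumption against your policy configuration:

```shell
# write_quorum: majority quorum for a replicated policy,
# computed as floor(replicas / 2) + 1.
write_quorum() {
  echo $(( $1 / 2 + 1 ))
}

write_quorum 3   # a 3-replica policy needs 2 successful writes
```

With three replicas a write therefore succeeds even with one storage node down, which is why degraded clusters often serve writes normally until a second node fails.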

Log Locations

Component          Log command
Proxy server       docker logs swift-proxy
Object server      docker logs swift-object
Container server   docker logs swift-container
Account server     docker logs swift-account
Replicator         docker logs swift-object-replicator

Next Steps

Object Storage Troubleshooting (User)

User-facing issues — 403 errors, upload timeouts, versioning failures

Ring Management

Add capacity and redistribute data after failures

Replication

Monitor and restore data durability

Monitoring

Proactively catch issues before they become outages