
Overview

The replicator daemon continuously ensures each object has the configured number of replicas across different zones, detecting and repairing divergence. Monitoring replication health is essential for maintaining data durability guarantees.
Administrator Access Required — This operation requires the admin role. Contact your Xloud administrator if you do not have sufficient permissions.

Monitor Replication

# Check replication status across all nodes
xavs-storage-recon --replication

# Verbose replication check
xavs-storage-recon --replication --verbose
Key replication metrics:
| Metric | Healthy Value | Concern Threshold |
| --- | --- | --- |
| replication_time | < 60 seconds | > 300 seconds |
| replication_last | Recent timestamp | Older than 600 seconds |
| object_count | Consistent across replicas | Divergence > 1% |
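As a minimal sketch, the concern thresholds above can be encoded in small shell helpers for cron-based alerting. The function names, and the idea of feeding recon output into them, are illustrative assumptions, not part of the xavs-storage CLI:

```shell
#!/bin/sh
# Hypothetical helpers that classify replication metrics against the
# concern thresholds in the table above.

REPL_TIME_CONCERN=300   # seconds; the "> 300 seconds" threshold
REPL_LAST_CONCERN=600   # seconds since the last completed pass

# check_replication_time SECONDS -> prints "ok" or "alert"
check_replication_time() {
  if [ "$1" -gt "$REPL_TIME_CONCERN" ]; then echo alert; else echo ok; fi
}

# check_replication_last EPOCH_SECONDS -> "ok" if the last pass is recent
check_replication_last() {
  now=$(date +%s)
  age=$(( now - $1 ))
  if [ "$age" -gt "$REPL_LAST_CONCERN" ]; then echo alert; else echo ok; fi
}

# Intended use (commented out; how you extract the metric values from
# `xavs-storage-recon --replication` output depends on your version):
# xavs-storage-recon --replication | <extract replication_time> | ...
```

Wire the "alert" branch into whatever notification path your monitoring stack already uses.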

Quarantined Objects

The auditor daemon detects data corruption (bit rot, write errors) through checksum verification. Corrupted objects are moved to a quarantine directory and excluded from reads; requests are served from a healthy replica instead.
A high quarantine count indicates data corruption, potentially caused by drive failures, bit rot, or write or network errors during replication. Investigate and replace affected drives promptly.
# Check quarantine counts by node
xavs-storage-recon --quarantined --verbose

# View quarantined objects on a node (SSH to the node first)
ls /var/lib/xavs-object-storage/quarantined/
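To watch the quarantine directory directly on a node, a small sketch (the default path follows the `ls` example above; the helper name is illustrative):

```shell
#!/bin/sh
# Count quarantined object files under a node's quarantine directory.
# The default path matches the ls example above; adjust if your
# deployment lays out storage differently.
quarantine_count() {
  dir="${1:-/var/lib/xavs-object-storage/quarantined}"
  find "$dir" -type f 2>/dev/null | wc -l | tr -d ' '
}
```

Running this periodically and alerting when the count stays above zero across audit cycles gives an early signal of failing media.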
If quarantine counts are high on a specific node:
  1. Check drive health with smartctl or the hardware vendor's tool
  2. Remove the degraded device from the ring so its data drains to healthy replicas
  3. Replace the failing drive and add the replacement device to the ring
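The drive-health check in step 1 can be scripted across devices. The parsing below assumes the standard `smartctl -H` summary line from smartmontools; treat it as a sketch, not a vendor-agnostic check:

```shell
#!/bin/sh
# Read `smartctl -H /dev/sdX` output on stdin and print the overall
# health verdict (e.g. PASSED or FAILED), or UNKNOWN if the summary
# line is absent.
smart_health() {
  awk -F': ' '/overall-health self-assessment/ {print $2; found=1}
              END {if (!found) print "UNKNOWN"}'
}

# Intended use on a storage node (requires smartmontools; commented out):
# for dev in /dev/sd?; do
#   printf '%s: %s\n' "$dev" "$(smartctl -H "$dev" | smart_health)"
# done
```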

Replication Configuration

Key replication parameters configurable through XDeploy:
| Parameter | Description | Default |
| --- | --- | --- |
| concurrency | Number of parallel replication threads per daemon | 1 |
| interval | Seconds between replication passes | 30 |
| node_timeout | Seconds before marking a replica push as failed | 10 |
Adjust concurrency during off-peak hours to accelerate replication after large ring changes:
# Temporarily increase replication concurrency (XDeploy config)
# Edit object storage configuration → replicator section
# Set concurrency = 4, then deploy
xavs-ansible deploy -t swift
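For reference, the rendered replicator configuration after this change might look like the following. This assumes a Swift-style ini section; the exact section name and file path depend on your XDeploy templates:

```ini
[object-replicator]
# Raised from the default of 1 for faster rebalance; revert after the
# ring change has settled so replication does not compete with client traffic.
concurrency = 4
interval = 30
node_timeout = 10
```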

Next Steps

- Ring Management: add or remove drives that affect replication targets
- Monitoring: set up capacity and replication health monitoring
- Admin Troubleshooting: diagnose replication failures and high-latency nodes
- Storage Policies: review replication factors for each storage policy