Overview

The XSDS distributed storage cluster is composed of several service types that work together to provide unified block, object, and file storage. Services are deployed and managed by XDeploy.
XDeploy GUI — The distributed storage cluster can be bootstrapped and managed through the XSDS Storage interface. Storage tiers, CRUSH rules, and Ceph configuration are all accessible from the GUI. No manual file editing required.
Prerequisites
  • Administrator credentials with the admin role
  • Familiarity with distributed storage concepts — replication, erasure coding, and data placement

Architecture Diagram


Service Components

Service | Role | Minimum Count
Monitor (MON) | Maintains authoritative cluster state and quorum. Must have an odd number for majority voting. | 3
Manager (MGR) | Provides metrics, orchestration API, and dashboard. Active-standby. | 2
OSD | Object Storage Daemon — one per physical storage device. Handles I/O, replication, and recovery. | 3 per replica factor
MDS | Metadata Server — required for shared file storage. Active-standby. | 2
RGW | RADOS Gateway — provides the S3-compatible object storage API. | 2 (HA pair)

Component Deep Dive

Monitors maintain the authoritative cluster state map, which includes:
  • OSD map: Which OSDs are up, down, in, or out
  • CRUSH map: Placement topology and rules
  • PG map: State of all placement groups
  • MDS map: Metadata server state (if using shared file storage)
Monitors use Paxos consensus to agree on cluster state. A majority (quorum) of monitors must be reachable for the cluster to accept writes. With 3 monitors, the cluster survives 1 monitor failure. With 5 monitors, it survives 2.
Never run fewer than 3 monitors in production. A 2-monitor cluster loses quorum if either monitor fails, halting all write operations.
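The quorum arithmetic above follows directly from Paxos majority voting and can be sketched as:

```python
def monitor_failures_tolerated(monitors: int) -> int:
    """Failures a monitor set can survive while keeping quorum.

    Paxos requires a strict majority: quorum = floor(n / 2) + 1.
    The cluster therefore tolerates n - quorum monitor failures.
    """
    quorum = monitors // 2 + 1
    return monitors - quorum
```

Note that 2 monitors tolerate 0 failures (quorum is 2 of 2), which is why a 2-monitor cluster is strictly worse than the operational simplicity suggests: either failure halts writes.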
Each OSD manages one physical storage device. OSDs are responsible for:
  • Serving client read/write requests
  • Replicating data to peer OSDs according to the CRUSH map
  • Running scrub operations to detect and repair data corruption
  • Reporting health status to monitors
OSD state has two dimensions:
  • up/down: Whether the OSD process is running
  • in/out: Whether the OSD is participating in data distribution
An OSD that is down but in triggers recovery. An OSD that is out has its data redistributed to remaining OSDs.
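The two state dimensions combine into distinct cluster responses; a minimal sketch (function and return labels are illustrative, not part of any XSDS API):

```python
def osd_cluster_response(up: bool, is_in: bool) -> str:
    """Map the two OSD state dimensions to the cluster's reaction.

    up/down  = whether the OSD process is running
    in/out   = whether the OSD participates in data distribution
    """
    if not is_in:
        return "rebalance"   # out: data is redistributed to remaining OSDs
    if not up:
        return "recovery"    # down but in: peers re-replicate its placement groups
    return "normal"          # up and in: serving client I/O
```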
RGW provides the S3-compatible object storage API. It translates S3 API requests into RADOS operations against the underlying storage cluster. RGW is stateless — all state is stored in the cluster. Deploy at least 2 RGW instances behind a load balancer for high availability; XDeploy configures the HAProxy frontend automatically.

RGW supports:
  • S3-compatible API (buckets, objects, ACLs, lifecycle policies)
  • Multi-site replication between XSDS clusters
  • Pre-signed URLs for time-limited object access
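Because pre-signed URLs use standard AWS Signature Version 4 query-string authentication, RGW accepts URLs produced by any S3 SDK (e.g. boto3's `generate_presigned_url`). As a stdlib-only sketch of what such a URL contains — the endpoint, bucket, and credentials below are placeholders:

```python
import datetime
import hashlib
import hmac
from urllib.parse import quote

def presign_get(endpoint, bucket, key, access_key, secret_key,
                region="us-east-1", expires=3600, now=None):
    """Build an AWS SigV4 pre-signed GET URL (query-string auth)."""
    now = now or datetime.datetime.utcnow()
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    datestamp = now.strftime("%Y%m%d")
    host = endpoint.split("//", 1)[-1]
    uri = f"/{bucket}/{quote(key)}"
    scope = f"{datestamp}/{region}/s3/aws4_request"
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    qs = "&".join(f"{quote(k, safe='')}={quote(v, safe='')}"
                  for k, v in sorted(params.items()))
    # Canonical request: method, URI, query string, headers, signed headers, payload
    canonical = "\n".join(["GET", uri, qs, f"host:{host}\n", "host",
                           "UNSIGNED-PAYLOAD"])
    to_sign = "\n".join(["AWS4-HMAC-SHA256", amz_date, scope,
                         hashlib.sha256(canonical.encode()).hexdigest()])
    # Derive the signing key by chaining HMACs over date, region, service
    def h(key, msg):
        return hmac.new(key, msg.encode(), hashlib.sha256).digest()
    k = h(h(h(h(b"AWS4" + secret_key.encode(), datestamp), region), "s3"),
          "aws4_request")
    signature = hmac.new(k, to_sign.encode(), hashlib.sha256).hexdigest()
    return f"{endpoint}{uri}?{qs}&X-Amz-Signature={signature}"
```

The resulting URL grants anyone holding it time-limited read access to the object, expiring `X-Amz-Expires` seconds after `X-Amz-Date`; in practice, prefer an S3 SDK over hand-rolled signing.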
MDS manages the metadata for the distributed file system. Each client inode, directory, and file name is tracked by an active MDS instance.

MDS instances are either active (serving metadata requests) or standby (ready to take over). On active MDS failure, a standby takes over within seconds.

Multiple active MDS instances (multi-active MDS) can be configured for large deployments with high metadata operation rates. Contact your Xloud support team for multi-active MDS configuration guidance.

Deployment Architecture

A standard single-site XSDS cluster distributes OSDs across at least 3 hosts, with monitor and manager services co-located on the same hosts:

Next Steps

Cluster Management

Monitor cluster health, manage services, and perform operational procedures

Pool Management

Create and configure storage pools with the appropriate protection scheme

CRUSH Maps

Define failure domains and device class rules for data placement

Capacity Planning

Monitor utilization and plan cluster expansion before capacity is exhausted