Overview
Scaling a Kubernetes cluster adjusts the worker node count in the default node group. Scale up to handle increased workload demand; scale down to reduce infrastructure cost during low-utilization periods. Scale-up operations are non-disruptive. Scale-down operations drain and remove nodes, so ensure workloads are properly distributed before reducing the node count.
Prerequisites
- A running cluster in CREATE_COMPLETE or UPDATE_COMPLETE status
- Sufficient compute quota for the new node count (scale-up only)
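The two prerequisites above can be expressed as a small pre-flight check. The following is a minimal sketch, not part of the Xloud tooling: the function name, the result type, and the `free_instance_quota` input are hypothetical, and quota is simplified to a count of additional instances (a real project quota usually also covers vCPUs and memory).

```python
from dataclasses import dataclass

# Statuses the Prerequisites list names as valid for a resize.
RESIZABLE_STATUSES = {"CREATE_COMPLETE", "UPDATE_COMPLETE"}


@dataclass
class PreflightResult:
    ok: bool
    reason: str


def preflight_resize(status: str, current_nodes: int, target_nodes: int,
                     free_instance_quota: int) -> PreflightResult:
    """Check the documented prerequisites before requesting a resize.

    free_instance_quota is a hypothetical input: how many more instances
    the project may still create. Obtaining it is left to the caller.
    """
    if status not in RESIZABLE_STATUSES:
        return PreflightResult(False, f"cluster status is {status}")

    # The resize dialog takes the new TOTAL node count, so derive the delta.
    added = target_nodes - current_nodes
    if added == 0:
        return PreflightResult(False, "target equals current node count")

    # Quota only matters when nodes are being added (scale-up only).
    if added > 0 and added > free_instance_quota:
        return PreflightResult(
            False,
            f"need {added} more instances, quota allows {free_instance_quota}",
        )
    return PreflightResult(True, "scale-up" if added > 0 else "scale-down")
```

For example, `preflight_resize("CREATE_COMPLETE", 3, 5, 2)` passes, while the same request with only one free instance is rejected before any resize is attempted.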
Scale Up (Add Nodes)
Navigate to the cluster
Log in to the Xloud Dashboard (https://connect.<your-domain>) and navigate to
Project → Containers → Clusters.
Resize the cluster
Click Actions → Resize Cluster next to your cluster.
Enter the new total node count in the Node Count field.
Scale Down (Remove Nodes)
Prepare workloads
Before scaling down, verify that no stateful or single-replica workloads are
running on the nodes that will be removed. Use the Kubernetes Dashboard or
kubectl to inspect current pod placement.
Resize the cluster
Navigate to Project → Containers → Clusters → Actions → Resize Cluster.
Enter the reduced node count and click Resize.
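The pod placement check described under Prepare workloads can be partly automated. The sketch below assumes you have saved the output of `kubectl get pods --all-namespaces -o json` and parsed it into a dictionary, and that you have your own guess at which nodes will be removed; the function name and the `nodes_to_remove` input are hypothetical. It flags pods owned by a StatefulSet and bare pods with no controller. Detecting single-replica Deployments would additionally require looking up each owner's replica count, which this sketch does not do.

```python
def risky_pods(pod_list: dict, nodes_to_remove: set) -> list:
    """Return (namespace, pod, reason) for pods that need attention before drain.

    pod_list is the parsed JSON of a Kubernetes PodList.
    nodes_to_remove is the caller's assumption about which nodes will go.
    """
    flagged = []
    for pod in pod_list.get("items", []):
        node = pod.get("spec", {}).get("nodeName")
        if node not in nodes_to_remove:
            continue  # pod is not on a node scheduled for removal

        meta = pod.get("metadata", {})
        owners = [ref.get("kind") for ref in meta.get("ownerReferences", [])]
        if not owners:
            # No controller owns this pod, so nothing recreates it elsewhere.
            reason = "bare pod (not recreated after eviction)"
        elif "StatefulSet" in owners:
            reason = "stateful workload"
        else:
            continue  # ReplicaSet, DaemonSet, Job, etc. are not flagged here

        flagged.append(
            (meta.get("namespace", "default"), meta.get("name", ""), reason)
        )
    return flagged
```

An empty result does not prove the scale-down is safe; it only means none of the flagged categories were found on the selected nodes.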
Best Practices
Maintain Minimum Replicas
Configure PodDisruptionBudget resources for stateful workloads to ensure a
minimum number of replicas remain available during node removal.
Use Node Groups for Fine-Grained Scaling
Use separate node groups for different workload types to scale them independently
without affecting other workloads on the cluster.
Scale During Low-Traffic Windows
Perform scale-down operations during low-traffic periods, when pod evictions
have minimal user impact.
Check Compute Quota Before Scale-Up
Verify sufficient compute quota before adding nodes to avoid partial scale-up failures.
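As an illustration of the Maintain Minimum Replicas practice, the following sketch builds a PodDisruptionBudget manifest using the standard Kubernetes `policy/v1` API. The helper name and the example values (`db-pdb`, `prod`, the `app: db` label, a minimum of 2) are hypothetical; substitute the labels your workload actually uses.

```python
import json


def build_pdb(name: str, namespace: str, match_labels: dict,
              min_available: int) -> dict:
    """Build a PodDisruptionBudget manifest as a plain dictionary."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Evictions are refused if they would drop the number of
            # matching pods below this value.
            "minAvailable": min_available,
            "selector": {"matchLabels": match_labels},
        },
    }


# Hypothetical example: keep at least 2 pods labelled app=db running.
manifest = build_pdb("db-pdb", "prod", {"app": "db"}, 2)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.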
Next Steps
Node Groups
Create and manage separate node pools for specialized workloads.
Access Cluster
Verify cluster access after scaling via kubectl.
Cluster Upgrades
Upgrade your cluster to a newer Kubernetes version.
Troubleshooting
Resolve scaling failures and node health issues.