
Vertical autoscaling

Vertical autoscaling resizes an existing instance by adding (or removing) CPU, memory, or storage, rather than adding more instances. It is useful for workloads that don't parallelise well (single-threaded servers, stateful databases) and for right-sizing chronically over-provisioned services. It is less responsive than horizontal scaling because a resize typically requires a restart.
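As a rough illustration of right-sizing, the sketch below derives a resource request from observed usage by taking a high percentile and adding headroom. The function name `recommend_size`, the 95th-percentile choice, the 15% headroom, and the sample data are all assumptions made for this example; real autoscalers use their own, more elaborate estimators.

```python
import math


def recommend_size(usage_samples, headroom=0.15, percentile=0.95):
    """Suggest a resource allocation from observed usage (hypothetical helper).

    Takes the nearest-rank percentile of the samples and adds a fractional
    headroom on top, so short spikes do not immediately exhaust the instance.
    """
    if not usage_samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(usage_samples)
    # Nearest-rank percentile: the smallest sample with at least
    # `percentile` of all samples at or below it.
    rank = max(1, math.ceil(percentile * len(ordered)))
    observed = ordered[rank - 1]
    return observed * (1 + headroom)


# Hypothetical CPU usage (in cores) for a service that was given 8 cores.
cpu_usage = [1.2, 1.4, 1.1, 1.9, 2.0, 1.5, 1.3, 1.6, 1.7, 1.8]
print(f"suggested CPU allocation: {recommend_size(cpu_usage):.2f} cores")
```

With these samples the suggestion lands well below the 8 cores originally provisioned, which is the kind of gap right-sizing is meant to close.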

May 23, 2026

Vertical scaling is the simpler design: there is no need for load balancing, no concern about session affinity, and no distributed state coordination. The trade-offs are threefold. There is a hard ceiling on how big an instance can get (roughly 128 vCPU / 2 TB RAM on many mainstream instance families, though the exact limit varies by provider and family). Resizing is slow (seconds to minutes) and disruptive (most Vertical Pod Autoscaler setups evict the pod to apply the new size). Cost scales superlinearly above mid-range instance sizes. The most common use case is a stateful database that can't be sharded: the operator runs it on the largest instance the workload needs and resizes annually. For stateless services, horizontal scaling almost always wins on flexibility and cost.
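The hard ceiling can be made concrete with a small sketch: pick the smallest instance size that satisfies a workload's needs, and report when nothing in the catalogue is big enough. The catalogue, the size names, and `pick_size` are hypothetical and do not correspond to any real provider's offering; the top entry simply mirrors the approximate ceiling mentioned above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class InstanceSize:
    name: str
    vcpu: int
    ram_gib: int


# Hypothetical catalogue, ordered from smallest to largest.
CATALOGUE = [
    InstanceSize("small", 4, 16),
    InstanceSize("medium", 16, 64),
    InstanceSize("large", 64, 512),
    InstanceSize("max", 128, 2048),
]


def pick_size(
    need_vcpu: int,
    need_ram_gib: int,
    catalogue: Sequence[InstanceSize] = CATALOGUE,
) -> Optional[InstanceSize]:
    """Return the smallest size meeting both requirements, or None.

    None means the workload has outgrown the largest instance, so vertical
    scaling alone can no longer help.
    """
    for size in catalogue:
        if size.vcpu >= need_vcpu and size.ram_gib >= need_ram_gib:
            return size
    return None


choice = pick_size(need_vcpu=10, need_ram_gib=32)
print(choice.name if choice else "ceiling reached")

too_big = pick_size(need_vcpu=200, need_ram_gib=100)
print(too_big.name if too_big else "ceiling reached")
```

Returning `None` rather than the largest size keeps the "ceiling reached" case explicit, so a caller has to decide deliberately between sharding, horizontal scaling, or accepting degraded performance.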