Vertical autoscaling

Vertical autoscaling resizes an existing instance by adding (or removing) CPU, memory, or storage, rather than changing the number of instances. It is useful for workloads that don't parallelise well (single-threaded servers, stateful databases) and for right-sizing chronically over-provisioned services. It is less responsive than horizontal scaling because a resize typically requires a restart.
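
As a rough illustration of right-sizing, the sketch below derives a suggested size from observed usage: it takes a high percentile of the samples, adds headroom, and clamps the result between a floor and a ceiling. The function name `recommend_size` and all its parameters and defaults are assumptions made for this example, not the algorithm of any particular autoscaler.

```python
def recommend_size(samples, percentile=95, headroom=0.15,
                   minimum=0.25, ceiling=64.0):
    """Suggest a resource size (e.g. CPU cores) from usage samples.

    All defaults are illustrative assumptions, not values taken
    from any real autoscaler.
    """
    if not samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples)
    # Nearest-rank percentile, using integer ceiling division so the
    # rank is not affected by floating-point rounding.
    rank = max(1, -(-percentile * len(ordered) // 100))
    target = ordered[rank - 1] * (1 + headroom)
    # Clamp to the allowed range: a floor so the service is never
    # starved, and a ceiling standing in for the largest instance.
    return min(max(target, minimum), ceiling)
```

For example, `recommend_size([2.0, 2.0, 2.0, 2.0], percentile=50, headroom=0.5)` suggests 3.0 cores, and any result above `ceiling` is capped there. That cap is the point at which vertical scaling alone stops helping.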

Vertical scaling is the simpler design: it needs no load balancing, raises no session-affinity concerns, and requires no distributed state coordination. The trade-offs are threefold. There is a hard ceiling on how big an instance can get; the exact limit depends on the provider and instance family, but it is often on the order of hundreds of vCPUs and several terabytes of RAM. A resize is slow (seconds to minutes) and usually disruptive (the Kubernetes Vertical Pod Autoscaler, for example, typically evicts the pod to apply new resource requests). And cost tends to grow faster than capacity at the largest instance sizes. The most common use case is a stateful database that can't be sharded: the operator runs it on an instance sized for the workload's peak and resizes it infrequently, for example during annual capacity planning. For stateless services, horizontal scaling almost always wins on flexibility and cost.
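
The instance-size ceiling can be made concrete with a small sketch: choose the smallest instance that satisfies a CPU and memory requirement, and report when nothing in the catalog is large enough. The `InstanceType` tuple, the `CATALOG` entries, and `pick_instance` are hypothetical names and sizes invented for this example; they do not correspond to any real provider's offerings.

```python
from typing import NamedTuple, Optional, Sequence


class InstanceType(NamedTuple):
    name: str
    vcpus: int
    memory_gib: int


# Hypothetical catalog for illustration only; not real provider SKUs.
CATALOG = [
    InstanceType("small", 2, 8),
    InstanceType("medium", 8, 32),
    InstanceType("large", 32, 128),
    InstanceType("xlarge", 128, 512),
]


def pick_instance(need_vcpus: int, need_memory_gib: int,
                  catalog: Sequence[InstanceType] = CATALOG
                  ) -> Optional[InstanceType]:
    """Return the smallest instance that fits, or None at the ceiling."""
    for inst in sorted(catalog, key=lambda i: (i.vcpus, i.memory_gib)):
        if inst.vcpus >= need_vcpus and inst.memory_gib >= need_memory_gib:
            return inst
    # Nothing is big enough: vertical scaling has hit its ceiling and
    # the workload would need sharding or horizontal scaling instead.
    return None
```

With this catalog, `pick_instance(4, 16)` returns the `medium` entry, while `pick_instance(256, 64)` returns `None`. A `None` result signals that the workload has outgrown the largest available machine.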
