Scaling deployments

Scalability is a key benefit of using Kubernetes and GKE. Scalability is simply the ability to match capacity to demand, and it is closely linked with resiliency. Elasticity, a complementary term, is the ability to increase or decrease resources as needed to meet the current capacity needs of your application or services.

A scalable web application is one that works well with one user or one million users, and that handles peaks and dips in traffic gracefully and automatically. By adding nodes only when needed and removing them when demand drops, scalable apps consume only the resources necessary to meet demand.

Kubernetes provides scalability at the pod level, allowing more pods to be added to a cluster as needed, based on load. The maximum number of pods possible within a cluster is bounded by the compute, memory, and storage resources allocated to the cluster.
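
To make pod-level scaling concrete, here is a minimal sketch of the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up. The function name, the parameter names, and the default bounds are assumptions chosen for illustration. The real controller also accounts for pod readiness, missing metrics, and stabilization windows, none of which are modeled here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Estimate how many pod replicas are needed for the observed load.

    Hypothetical helper: mirrors the documented formula
    desired = ceil(current * currentMetric / targetMetric),
    clamped to [min_replicas, max_replicas].
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # Skip scaling when the metric is close enough to the target,
    # which avoids constant small adjustments.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# Four pods averaging 90% CPU against a 60% target scale out to six.
print(desired_replicas(4, 90, 60))   # 6
# The same four pods at 30% CPU scale in to two.
print(desired_replicas(4, 30, 60))   # 2
```

The clamp to a maximum replica count matters because, as noted above, the cluster can only hold as many pods as its allocated resources allow.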

GKE provides another level of scalability with nodes and node pools, allowing additional virtual machines to be added to the node pools that support your container cluster.
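
The relationship between pod capacity and node count can be sketched with some simple arithmetic. The example below is a rough estimate only: the function name, the per-pod resource requests, the node sizes, and the per-node pod limit are all hypothetical values, and the calculation ignores the CPU and memory that each node reserves for system components.

```python
import math


def nodes_needed(pod_count, pod_cpu_m, pod_mem_mib,
                 node_cpu_m, node_mem_mib, max_pods_per_node):
    """Estimate how many identical nodes are needed to hold pod_count pods.

    Hypothetical helper: CPU is in millicores and memory in MiB.
    Assumes every pod requests the same resources and every node is
    the same size, with no capacity reserved for system components.
    """
    # Each node holds as many pods as its scarcest resource allows.
    pods_per_node = min(node_cpu_m // pod_cpu_m,
                        node_mem_mib // pod_mem_mib,
                        max_pods_per_node)
    if pods_per_node < 1:
        raise ValueError("a single pod does not fit on this node size")

    return math.ceil(pod_count / pods_per_node)


# 40 pods requesting 250m CPU and 512 MiB each, on nodes with
# 4000m CPU and 16384 MiB: CPU limits each node to 16 pods.
print(nodes_needed(40, 250, 512, 4000, 16384, 110))  # 3
```

In this sketch CPU is the limiting resource, so adding a node to the pool raises the cluster's capacity by 16 pods.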