Automatically scaling deployments

The ability to autoscale at both the pod level and the container cluster level is extremely powerful. The following diagram shows an example workflow of how these two autoscaling mechanisms work in concert to give you a highly scalable overall solution:

This can be boiled down to the following: GKE autoscaling adds nodes when Kubernetes autoscaling has created pods that cannot be scheduled because the existing nodes don't have enough resources to run them. Conversely, GKE autoscaling removes nodes that are underutilized and whose pods can be rescheduled onto the remaining nodes.
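As a concrete sketch, the two layers can be enabled independently. The names below (a demo-cluster cluster, a default-pool node pool, and a web deployment in zone us-central1-a) are hypothetical; substitute your own:

```
# Enable cluster autoscaling on an existing node pool so GKE can add or
# remove nodes between the given bounds (hypothetical cluster and pool names).
gcloud container clusters update demo-cluster \
    --zone us-central1-a \
    --node-pool default-pool \
    --enable-autoscaling --min-nodes 1 --max-nodes 5

# Create a Horizontal Pod Autoscaler for a deployment named "web",
# scaling between 2 and 10 pods to hold average CPU utilization near 60%.
kubectl autoscale deployment web --cpu-percent=60 --min=2 --max=10
```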

Given the power and usefulness of autoscaling, one would think that configuring and implementing it for your workloads would require a lot of work. Fortunately, with GKE and Kubernetes, that's not the case at all. You can configure autoscaling using both the Cloud Console and the CLI. GKE and Kubernetes also give you the flexibility to update your autoscaling configuration on a running cluster, which is especially useful when experimenting to find the load and scale requirements of your services and applications.
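For example, you can inspect the live state of a pod autoscaler and widen a node pool's autoscaling range without recreating anything. A minimal sketch, reusing the hypothetical names from the previous example:

```
# Check the live state of the pod autoscaler (targets, current replica count).
kubectl get hpa web
kubectl describe hpa web

# Raise the node ceiling on the running node pool while the cluster serves traffic.
gcloud container clusters update demo-cluster \
    --zone us-central1-a \
    --node-pool default-pool \
    --enable-autoscaling --min-nodes 1 --max-nodes 10
```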

GKE's Workloads dashboard shows the details of our deployed workloads and also lets us update the autoscaling configuration for those workloads:

From here, you can easily either update the minimum and maximum number of pods for autoscaling or disable autoscaling altogether for the current workload.
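The same changes can be made from the command line. A minimal sketch, again against the hypothetical web deployment's HorizontalPodAutoscaler:

```
# Adjust the minimum and maximum replica counts on the existing autoscaler.
kubectl patch hpa web --patch '{"spec": {"minReplicas": 3, "maxReplicas": 15}}'

# Disable autoscaling for this workload entirely by deleting its autoscaler;
# the deployment simply keeps its current replica count afterwards.
kubectl delete hpa web
```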