real-time, based on in-flight requests
  ◦ designed with spiky workloads in mind
  ◦ request buffering during autoscaling
• Kubernetes:
  ◦ we all typically over-provision for unpredicted load
  ◦ CPU/memory metrics are a side-effect of load, and often collected w/ a delay
  ◦ without a buffering meat shield, excessive load will crash Pods

Kubernetes+Cloud Run=?
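
To make the contrast between the two scaling signals concrete, here is a minimal sketch. It is not Cloud Run's actual autoscaler. The function names, the target of 80 concurrent requests per instance, and the sample numbers are all illustrative assumptions. The CPU rule follows the ratio formula documented for the Kubernetes Horizontal Pod Autoscaler: `ceil(currentReplicas * currentMetric / targetMetric)`.

```python
import math


def desired_by_concurrency(in_flight: int, target_per_instance: int) -> int:
    """Hypothetical request-based rule: enough instances so that each one
    handles at most `target_per_instance` in-flight requests."""
    return max(1, math.ceil(in_flight / target_per_instance))


def desired_by_cpu(current_replicas: int, observed_cpu: float, target_cpu: float) -> int:
    """HPA-style ratio rule: scale the replica count by observed/target utilization."""
    return max(1, math.ceil(current_replicas * observed_cpu / target_cpu))


# A spike arrives: 400 requests are in flight right now (made-up number).
by_requests = desired_by_concurrency(in_flight=400, target_per_instance=80)

# The CPU sample was collected before the spike, so it still looks calm
# (made-up numbers: 2 replicas at 50% CPU against a 60% target).
by_stale_cpu = desired_by_cpu(current_replicas=2, observed_cpu=0.5, target_cpu=0.6)

print(by_requests, by_stale_cpu)
```

With these sample inputs the request-based rule asks for more instances immediately, while the CPU-based rule sees no reason to scale until a fresh metric sample arrives.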