Exploring best practices for a greener Kubernetes [Cloud Native Bergen]

Did you know that, on average, 37% of CPUs provisioned for cloud-native applications are never used, and the gap between resources allocated and resources used is widening? This inefficiency is causing cloud waste, and is negatively impacting the environment.

Sustainability of cloud infrastructure is a growing concern, and not only for people inside the IT sector. Knowledge of what is environmentally sustainable within IT is still somewhat limited, and misconceptions persist, such as the belief that running cheap automatically equals running green. With factors like committed spend discounts and cloud regions running on cheap but dirty energy, that belief is often incorrect.

Kubernetes has become the orchestration platform of choice. While its capabilities have been increasing with each passing year, a system can only be as efficient as its users let it be, and Kubernetes on its own is not a silver bullet for solving environmental sustainability challenges.

In this talk, I want to highlight areas of running workloads on Kubernetes that organizations can improve on. We'll take a starting point in the "reuse, reduce, recycle" principle of environmental protection, and see how we can apply it to our applications running on Kubernetes.

Attendees will leave with actionable insights on how to directly improve the sustainability and efficiency of the Kubernetes clusters they manage. Join me to learn how to employ Kubernetes for a greener cloud future.

Marta Paciorkowska

October 28, 2025

Transcript

  1. From waste management to workload orchestration
    Brought to you by the CNCF’s Technical Advisory Group ENV project Sustainable Kubernetes.
  2. Refuse: you can say no to Pods
    Use admission controllers to enforce good practices.
    Use-case #1: You offer node pools tailored to specific workloads.
    Use-case #2: You want to enforce better utilization from the start.
  3. Refuse: you can say no to Pods
    Use admission controllers to enforce good practices.
    Benefits?
    - Potentially better node/resource utilization.
    Drawbacks?
    - Setting requests & limits isn’t a one-time action.
    - Will only fit particular use-cases.
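
A minimal sketch of the kind of check a validating admission webhook could apply for use-case #2, written in Python with only the standard library. It takes an `AdmissionReview` request as a plain dict and denies Pods whose containers omit resource requests or limits. The function name `review_pod` and the policy itself are illustrative assumptions, not something shown in the talk; a real setup would also need an HTTPS endpoint and a webhook registration, or an off-the-shelf policy engine.

```python
def review_pod(admission_review: dict) -> dict:
    """Build an AdmissionReview response for a Pod creation request.

    Hypothetical policy: every container must declare both
    resources.requests and resources.limits.
    """
    request = admission_review["request"]
    pod = request["object"]

    problems = []
    for container in pod["spec"].get("containers", []):
        resources = container.get("resources", {})
        for field in ("requests", "limits"):
            if not resources.get(field):
                problems.append(
                    f"container {container['name']!r} has no resources.{field}"
                )

    # The response must echo the request uid and state whether it is allowed.
    response = {"uid": request["uid"], "allowed": not problems}
    if problems:
        response["status"] = {"message": "; ".join(problems)}

    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

As the drawbacks above note, a gate like this only enforces that values exist; it does not keep them accurate over time.
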
  4. Reduce: run what you truly need
    Turn off unneeded workloads, permanently or temporarily.
    The most environmentally friendly code is the code we choose not to write.
    Identify unnecessary work. Prevent zombie drift. Employ the scream test.
  5. Reduce: run what you truly need
    Turn off unneeded workloads, permanently or temporarily.
    Few services need to be always-on.
    - Employ kube-green (namespaces on a schedule).
    - Leverage KEDA (event-driven autoscaler).
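
To make "namespaces on a schedule" concrete, here is a small Python sketch of the decision a sleep scheduler has to make: should a workload be running right now? This is not kube-green's configuration or API. The function name `should_be_awake` and the default office hours (08:00 to 18:00, Monday to Friday) are assumptions chosen for illustration.

```python
from datetime import datetime


def should_be_awake(now: datetime, start_hour: int = 8, end_hour: int = 18) -> bool:
    """Return True if a non-production workload should be running at `now`.

    Assumed policy: awake on weekdays between start_hour (inclusive)
    and end_hour (exclusive), asleep otherwise.
    """
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start_hour <= now.hour < end_hour
```

With these defaults a workload is awake 50 of the 168 hours in a week, so roughly 70% of its always-on runtime could be switched off.
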
  6. Repair: broken workloads are a waste
    Workloads can be visibly or invisibly broken, and they waste resources.
    A Pod that doesn’t heal will reserve resources on a Node each time it’s restarted and scheduled.
    Mutating webhooks can scale crash-looping workloads down to 0.
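
Before anything can be scaled down, the crash-looping workload has to be identified. The sketch below, in plain Python, inspects a Pod object (as a dict, in the shape the Kubernetes API returns) and lists containers that are waiting with reason `CrashLoopBackOff` after a minimum number of restarts. The function name and the `min_restarts` threshold of 5 are assumptions, and the scale-down step itself is left out.

```python
def crash_looping_containers(pod: dict, min_restarts: int = 5) -> list[str]:
    """Return names of containers that look stuck in a crash loop.

    A container counts as crash-looping when its current state is
    `waiting` with reason CrashLoopBackOff and it has restarted at
    least `min_restarts` times (an assumed threshold).
    """
    names = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        waiting = status.get("state", {}).get("waiting") or {}
        restarts = status.get("restartCount", 0)
        if waiting.get("reason") == "CrashLoopBackOff" and restarts >= min_restarts:
            names.append(status["name"])
    return names
```
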
  7. Repair: broken workloads are a waste
    Workloads can be visibly or invisibly broken, and they waste resources.
    Invisibly broken workloads are invisible to Kubernetes, but visible to well-defined alerts that recognize apps running idle while using a lot of resources.
  8. Resize: adjust capacity as needs change
    Rightsizing is at the heart of sustainable system architecture.
    Reasons for cloud waste:
    - Lack of visibility,
    - Overprovisioning (wrong request/limit settings),
    - Leaving cloud resources idle,
    - Low usage of spot machines.
  9. Resize: adjust capacity as needs change
    Optimize by rightsizing. Ensure visibility. Pick newer machine types. Employ advanced autoscaling. Periodically review resource requests/limits. Use leftover capacity via spot nodes (temporary solution).
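
One possible way to approach the periodic review of requests is to derive a suggested CPU request from observed usage. The Python sketch below takes usage samples in millicores, picks a high percentile, and adds headroom. The function name, the 95th percentile, and the 15% headroom are all assumptions for illustration; the talk does not prescribe a formula, and suitable values depend on the workload.

```python
def suggest_request(usage_millicores, percentile=95, headroom_percent=15):
    """Suggest a CPU request (in millicores) from observed usage samples.

    Uses the nearest-rank percentile of the samples plus a headroom
    margin, rounding up. Integer arithmetic avoids float surprises.
    """
    if not usage_millicores:
        raise ValueError("need at least one usage sample")
    ordered = sorted(usage_millicores)
    # Nearest-rank percentile: ceil(percentile / 100 * n), at least 1.
    rank = max(1, -(-percentile * len(ordered) // 100))
    baseline = ordered[rank - 1]
    # Add headroom and round up to a whole millicore.
    return -(-baseline * (100 + headroom_percent) // 100)


# Hypothetical samples: mostly 100m with a single spike to 500m.
samples = [100] * 19 + [500]
suggested = suggest_request(samples)
print(f"suggested request: {suggested}m")
```

Comparing the suggestion against the currently configured request gives a rough measure of overprovisioning per workload.
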
  10. Reschedule: even out usage spikes
    Shift demand in time or in space.
    Distribute your workloads around the clock based on resource usage (easier) or grid carbon intensity (harder).
    - move around job schedules
    - Karpenter, WattTime, …
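
As an illustration of shifting demand in time, the Python sketch below picks the start hour for a job so that it runs during the window with the lowest total grid carbon intensity. The forecast values are made up, and the function name `greenest_start` is hypothetical; in practice the forecast would come from a carbon-intensity data provider, whose API is not modeled here.

```python
def greenest_start(forecast, duration_hours):
    """Return the index of the start hour with the lowest total intensity.

    `forecast` is a list of hourly carbon-intensity values (for example
    gCO2/kWh); the job is assumed to run for `duration_hours` consecutive
    hours. Ties go to the earliest window.
    """
    if not 0 < duration_hours <= len(forecast):
        raise ValueError("duration must fit inside the forecast")

    # Sliding-window sum over consecutive hours.
    window = sum(forecast[:duration_hours])
    best_start, best_total = 0, window
    for start in range(1, len(forecast) - duration_hours + 1):
        window += forecast[start + duration_hours - 1] - forecast[start - 1]
        if window < best_total:
            best_start, best_total = start, window
    return best_start


# Made-up hourly forecast for the next six hours.
hourly_intensity = [300, 280, 120, 90, 110, 400]
start_hour = greenest_start(hourly_intensity, duration_hours=2)
```

The same function works for the easier variant mentioned above if the list holds expected cluster load instead of carbon intensity.
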
  11. Reschedule: even out usage spikes
    Shift demand in time or in space.
    Spatial shifting might be the single most impactful decision you make, but has challenges:
    - latency,
    - feature availability.
  12. Repeat: instead of running non-stop
    Identify workloads that can run on a schedule.
    200MB of RAM 24/24h (Deployment) vs 2/24h (CronJob)
    Good use-case:
    - Exports, batch jobs, database backups.
    Bad use-case:
    - API server.
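
The Deployment versus CronJob comparison can be worked through as simple arithmetic. The Python sketch below assumes the job needs the same 200MB while it runs, runs for a total of 2 hours per day, and reserves nothing in between; these are readings of the slide, not measured figures.

```python
def memory_hours_per_day(memory_mb, hours_running):
    """Memory reserved over a day, in MB-hours."""
    return memory_mb * hours_running


deployment = memory_hours_per_day(200, 24)  # always-on Deployment
cronjob = memory_hours_per_day(200, 2)      # CronJob active 2 hours a day
saved_share = (deployment - cronjob) / deployment

print(f"Deployment: {deployment} MB-hours/day")
print(f"CronJob:    {cronjob} MB-hours/day")
print(f"Reduction:  {saved_share:.0%}")
```

Under these assumptions the reservation drops from 4800 to 400 MB-hours per day, a reduction of about 92%.
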
  13. Repeat: instead of running non-stop
    Identify workloads that can run on a schedule.
    Benefits?
    - Cut down on idling workloads.
    Drawbacks?
    - Potentially time-intensive changes to code & architecture.
  14. Full talk at the GSF Oslo GitHub
    Kristina Devochko, Platform Engineer, GSF Oslo group founder
    https://github.com/gsf-oslo
  15. CREDITS: This presentation template was created by Slidesgo, including icons by Flaticon, infographics & images by Freepik.
    Thank you!
    Marta Paciorkowska, Platform Engineer @ Oda
    [email protected]
    Mastodon: @[email protected]