

Scalability and Versatility of Energy-Aware Workload Allocation Optimizer (WAO) for Kubernetes | IEEE JCC 2025


Shunsuke Ise

July 23, 2025


  1. Copyright © 2025 Bitmedia, Inc. All Rights Reserved.
     Scalability and Versatility of Energy-Aware Workload Allocation Optimizer (WAO) for Kubernetes
     Shunsuke Ise, Chizuko Mizumoto, Ying-Feng Hsu, Kazuhiro Matsuda and Morito Matsuoka
     IEEE JCC 2025 | 21-24 July 2025 | Tucson, AZ, USA
  2. Outline
     1. Introduction
     2. Experimental Procedure
     3. Results and Discussion
        1. Power Consumption Models
        2. Scalability
        3. Computational Performance
        4. Heterogeneous Environments
     4. Summary
  4. Motivation: Why a software-based energy-aware approach?
     - Data center energy use is soaring
     - Hardware upgrades are costly and slow
     - Built on Kubernetes → widely usable
     [Figure: estimated electricity demand from traditional data centres, dedicated AI data centres and cryptocurrencies, 2022 and 2026, base case. IEA, CC BY 4.0.]
     International Energy Agency, "Electricity 2024." [Online]. Available: https://www.iea.org/reports/electricity-2024
  5. WAO Concept: Energy-aware workload placement
     - Predicted incremental power draw as the placement criterion
     - Uses per-server power models and environmental metrics
     - Applicable to both workload placement and load balancing
     Y.-F. Hsu, C. Mizumoto, K. Matsuda, and M. Matsuoka, "Sustainable data center energy management through server workload allocation optimization and HVAC system," in Proc. IEEE Cloud Summit, 2024, pp. 17-23.
  6. Research Questions
     - How well does WAO scale?
     - Can it save power without slowing jobs?
     - What happens in heterogeneous environments?
  8. Testbed & Instrumentation
     - Container data center with 4 server models
     - Environmental and system metrics
       - CPU utilization (from OS)
       - Ambient (inlet) temperature (via Redfish or IPMI)
       - Static pressure differential (via front/rear sensors)

     Server | CPU               | Number of Threads
     A      | Intel Xeon Bronze | 12
     B      | Intel Xeon Silver | 32
     C      | Intel Xeon Gold   | 96
     D      | AMD EPYC          | 96
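The inlet-temperature metric collected via Redfish comes from the standard Redfish Thermal resource, whose temperature sensors carry a `PhysicalContext` field ("Intake" for inlet/ambient sensors). As a minimal sketch, not WAO's actual collector, the reading could be parsed from such a payload like this (the sensor names and values below are made up):

```python
import json

def inlet_temperature_celsius(thermal_json: str):
    """Extract the inlet (ambient) temperature from a Redfish Thermal payload.

    Looks for a temperature sensor whose PhysicalContext is "Intake",
    which is how the Redfish Thermal schema labels inlet sensors.
    Returns None if no such sensor is present.
    """
    payload = json.loads(thermal_json)
    for sensor in payload.get("Temperatures", []):
        if sensor.get("PhysicalContext") == "Intake":
            return sensor.get("ReadingCelsius")
    return None

# Example payload following the Redfish Thermal schema (values illustrative).
sample = json.dumps({
    "Temperatures": [
        {"Name": "CPU1 Temp", "PhysicalContext": "CPU", "ReadingCelsius": 62.0},
        {"Name": "Inlet Temp", "PhysicalContext": "Intake", "ReadingCelsius": 24.5},
    ]
})
```

In a live deployment the payload would be fetched over HTTPS from the BMC's Thermal endpoint; here it is parsed offline to keep the sketch self-contained.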
  9. Kubernetes Integration
     - Bare-metal worker nodes
     - Customized kube-scheduler and kube-proxy with WAO logic
     - Custom resources for WAO configuration and control
  11. Power Consumption Models: Fundamentals and PCL
      - Train per-server models from measured metrics
      - Power Consumption Linearity (PCL) represents the potential for power optimization
      [Figure: power consumption model. Explanatory variables: ambient temperature, server fan rotation, server fan management policy, CPU management policy (DVFS: Dynamic Voltage and Frequency Scaling; CS: Context Switching). Objective variable: server power consumption.]
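To make "train per-server models from measured metrics" concrete, here is an illustrative least-squares fit of power against CPU utilization alone. This is not the paper's model form (which also uses ambient temperature and fan metrics); the sample numbers are synthetic:

```python
def fit_linear_power_model(cpu_util, power_watts):
    """Ordinary least squares fit: power ≈ idle_watts + slope * cpu_util.

    cpu_util: utilization samples in [0, 1]; power_watts: measured power.
    Returns (idle_watts, slope).
    """
    n = len(cpu_util)
    mean_x = sum(cpu_util) / n
    mean_y = sum(power_watts) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(cpu_util, power_watts))
    var = sum((x - mean_x) ** 2 for x in cpu_util)
    slope = cov / var
    idle = mean_y - slope * mean_x
    return idle, slope

def predict_power(idle, slope, cpu_util):
    """Predict power draw (W) at a given CPU utilization."""
    return idle + slope * cpu_util

# Synthetic training samples: ~100 W idle, +150 W at full load (made-up numbers).
utils = [0.0, 0.25, 0.5, 0.75, 1.0]
watts = [100.0, 137.5, 175.0, 212.5, 250.0]
idle, slope = fit_linear_power_model(utils, watts)
```

A real model would be fit per server model and per operating mode, as the next slide describes.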
  12. Power Consumption Models: Adapting to CPU Operating Modes
      - Power profiles vary across CPU frequency governors
      - Mode-aware modeling improves accuracy with per-governor training
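Per-governor training amounts to keeping one fitted model per CPU frequency governor and dispatching on the node's current governor. A minimal sketch (the governor names are standard Linux cpufreq governors; the coefficients are made up, not measured values):

```python
# One (idle_watts, watts_per_unit_util) pair per cpufreq governor.
# Coefficients are illustrative placeholders, not measurements.
MODELS = {
    "performance": (120.0, 180.0),
    "powersave":   (90.0, 130.0),
    "ondemand":    (100.0, 150.0),
}

def predict_power(governor: str, cpu_util: float) -> float:
    """Predict node power using the model trained for its current governor."""
    idle, slope = MODELS[governor]
    return idle + slope * cpu_util
```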
  14. Scalability: Small-Scale
      - Savings are visible even with 2 servers and saturate at 5-10 servers
      - Peak: >40% (Xeon Gold), >30% (EPYC)
  15. Scalability: Small-Scale
      - Peak savings are similar regardless of server count
        - Fewer servers reach the peak at a lower number of occupied threads
        - More servers need a higher number of occupied threads
      [Figure: power saving with WAO (%) vs. number of servers for Model C (Intel Xeon Gold, 96 threads), at 96/384/960/1536/3072 occupied threads]
  16. Scalability: Large-Scale
      - Caching reduces score computation from O(N × P) to O(P), where N is the number of Nodes and P is the number of Pods
      - Maintains constant latency as the cluster scales in node count
  17. Scalability: Large-Scale
      - Latency remains low regardless of Pod or Node count
      - Same performance as the stock kube-scheduler across all cases
  19. Computational Performance: Energy-Performance Trade-off
      - Score: μ = α · power_consumption + β · processing_time
      - Tuning α and β controls the energy-performance trade-off
      - α = 0.5 achieves a good balance
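Node selection with this score can be sketched as follows. The normalization of the two terms and the choice β = 1 − α are assumptions made for the sketch (one plausible reading of "α = 0.5, balanced" and "α = 1.0, energy-aware"), not the paper's exact formulation; all node names and numbers are made up:

```python
def score(alpha, power_watts, processing_time_s, max_watts, max_time_s):
    """mu = alpha * power + beta * time, with beta = 1 - alpha (assumed).

    Both terms are normalized to [0, 1] so the weights are comparable;
    this normalization is an assumption for illustration. Lower is better.
    """
    beta = 1.0 - alpha
    return (alpha * (power_watts / max_watts)
            + beta * (processing_time_s / max_time_s))

def pick_node(alpha, candidates, max_watts, max_time_s):
    """Pick the (name, predicted_watts, predicted_seconds) with lowest score."""
    return min(candidates,
               key=lambda c: score(alpha, c[1], c[2], max_watts, max_time_s))

# Two hypothetical candidate nodes.
nodes = [("fast-but-hungry", 400.0, 100.0),
         ("slow-but-frugal", 250.0, 180.0)]
```

With α = 1.0 only power matters and the frugal node wins; with α = 0.0 only processing time matters and the fast node wins; intermediate α trades the two off.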
  20. Computational Performance: Energy-Performance Trade-off
      - Similar trends on both Model C (Xeon Gold) and Model D (EPYC)
      - Performance degrades beyond 50% CPU utilization, likely due to resource contention under SMT
      [Figure: processing time (s) vs. CPU utilization (%) and vs. total power consumption (W) for Model D (AMD EPYC, 96 threads), comparing Uniform, WAO (α = 0.5, balanced) and WAO (α = 1.0, energy-aware) at 10/20/50/80% CPU utilization; evaluation score μ = α (power consumption) + β (processing time)]
  22. Heterogeneous Environment: Mixed Servers
      - Mixing Models A/B/D into Model C servers
      - Similar PCL but different curve shape → "boost effect"
      - Large PCL gap (Model A) → diminished power-saving effect
  23. Heterogeneous Environment: Extended Scoring
      - Extend WAO scoring to support heterogeneous clusters
      - Evaluate groups of servers with per-model weighted scores
      - Enables consistent power-aware placement across mixed hardware

      Workload allocation score (μ) for homogeneous servers:
          μ = α + β
      Workload allocation score (μ) for heterogeneous servers:
          μ = Σ_i ζ_i (α_i + β_i)
      where:
          i: index over server models
          ζ_i: contribution factor for server model i
          α_i: power consumption score for server model i
          β_i: computational performance score for server model i
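The heterogeneous score is a direct weighted sum over server models and can be transcribed from the formula above almost verbatim (all weights and scores below are illustrative numbers, not measured values):

```python
def heterogeneous_score(terms):
    """mu = sum_i zeta_i * (alpha_i + beta_i)

    terms: list of (zeta_i, alpha_i, beta_i), one entry per server model,
    where zeta_i is model i's contribution factor, alpha_i its power
    consumption score, and beta_i its computational performance score.
    """
    return sum(zeta * (alpha + beta) for zeta, alpha, beta in terms)

# Two server models with illustrative contribution factors and scores.
mu = heterogeneous_score([(0.6, 0.4, 0.3),   # e.g. Model C servers
                          (0.4, 0.7, 0.2)])  # e.g. Model A servers
```

With a single server model and ζ = 1 this reduces to the homogeneous score μ = α + β.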
  25. Conclusion: Research Questions (Recap)
      ✓ How well does WAO scale?
        - Effect saturates at 5-10 servers
        - Matches kube-scheduler performance at large scale, thanks to caching
      ✓ Can it save power without slowing jobs?
        - α = 0.5 is the sweet spot: large savings, minimal slowdown
      ✓ What happens in heterogeneous environments?
        - "Boost effect" observed in moderately mixed setups
        - Works with extended scoring
  26. GitHub
      - WAO is open source (github.com/waok8s/waok8s) and runs on Kubernetes v1.30+
      - Contributions are welcome
  27. Acknowledgement
      This work is supported by the New Energy and Industrial Technology Development Organization (NEDO) under its "Program to Develop and Promote the Commercialization of Energy Conservation Technologies to Realize a Decarbonized Society" (JPNP21005).