

Scalability and Versatility of Energy-Aware Workload Allocation Optimizer (WAO) for Kubernetes | IEEE JCC 2025


Shunsuke Ise

July 23, 2025


  1. Copyright © 2025 Bitmedia, Inc. All Rights Reserved.
     Scalability and Versatility of Energy-Aware Workload Allocation Optimizer (WAO) for Kubernetes
     Shunsuke Ise, Chizuko Mizumoto, Ying-Feng Hsu, Kazuhiro Matsuda and Morito Matsuoka
     IEEE JCC 2025 | 21-24 July 2025 | Tucson, AZ, USA
  2. Outline
     1. Introduction
     2. Experimental Procedure
     3. Results and Discussion
        1. Power Consumption Models
        2. Scalability
        3. Computational Performance
        4. Heterogeneous Environments
     4. Summary
  4. Motivation: Why a software-based energy-aware approach?
     - Data center energy use is soaring
     - Hardware upgrades are costly and slow
     - Built on Kubernetes → widely usable
     [Figure: estimated electricity demand from traditional data centres, dedicated AI data centres and cryptocurrencies, 2022 and 2026, base case. IEA, CC BY 4.0.]
     International Energy Agency, "Electricity 2024." [Online]. Available: https://www.iea.org/reports/electricity-2024
  5. WAO Concept: Energy-aware workload placement
     - Predicted incremental power draw as the placement criterion
     - Uses per-server power models and environmental metrics
     - Applicable to both workload placement and load balancing
     Y.-F. Hsu, C. Mizumoto, K. Matsuda, and M. Matsuoka, "Sustainable data center energy management through server workload allocation optimization and HVAC system," in Proc. IEEE Cloud Summit, 2024, pp. 17-23.
  6. Research Questions
     - How well does WAO scale?
     - Can it save power without slowing jobs?
     - What happens in heterogeneous environments?
  8. Testbed & Instrumentation
     - Container data center with 4 server models
     - Environmental and system metrics
       - CPU utilization (from OS)
       - Ambient (inlet) temperature (via Redfish or IPMI)
       - Static pressure differential (via front/rear sensors)

     Server | CPU               | Number of Threads
     A      | Intel Xeon Bronze | 12
     B      | Intel Xeon Silver | 32
     C      | Intel Xeon Gold   | 96
     D      | AMD EPYC          | 96
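The inlet-temperature metric collected via Redfish comes from the standard Redfish Thermal resource, whose temperature sensors carry a `PhysicalContext` field ("Intake" for inlet/ambient sensors). As a minimal sketch, not WAO's actual collector, the reading could be parsed from such a payload like this (the sensor names and values below are made up):

```python
import json

def inlet_temperature_celsius(thermal_json: str):
    """Extract the inlet (ambient) temperature from a Redfish Thermal payload.

    Looks for a temperature sensor whose PhysicalContext is "Intake",
    which is how the Redfish Thermal schema labels inlet sensors.
    Returns None if no such sensor is present.
    """
    payload = json.loads(thermal_json)
    for sensor in payload.get("Temperatures", []):
        if sensor.get("PhysicalContext") == "Intake":
            return sensor.get("ReadingCelsius")
    return None

# Example payload following the Redfish Thermal schema (values illustrative).
sample = json.dumps({
    "Temperatures": [
        {"Name": "CPU1 Temp", "PhysicalContext": "CPU", "ReadingCelsius": 62.0},
        {"Name": "Inlet Temp", "PhysicalContext": "Intake", "ReadingCelsius": 24.5},
    ]
})
```

In a live deployment the payload would be fetched over HTTPS from the BMC's Thermal endpoint; here it is parsed offline to keep the sketch self-contained.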
  9. Kubernetes Integration
     - Bare-metal worker nodes
     - Customized kube-scheduler and kube-proxy with WAO logic
     - Custom resources for WAO configuration and control
  11. Power Consumption Models: Fundamentals and PCL
      - Train per-server models from measured metrics
      - Power Consumption Linearity (PCL) represents the potential for power optimization
      [Figure: power consumption model. Explanatory variables: ambient temperature, server fan rotation, server fan management policy, CPU management policy (DVFS: Dynamic Voltage and Frequency Scaling; CS: Context Switching). Objective variable: server power consumption.]
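To make "train per-server models from measured metrics" concrete, here is an illustrative least-squares fit of power against CPU utilization alone. This is not the paper's model form (which also uses ambient temperature and fan metrics); the sample numbers are synthetic:

```python
def fit_linear_power_model(cpu_util, power_watts):
    """Ordinary least squares fit: power ≈ idle_watts + slope * cpu_util.

    cpu_util: utilization samples in [0, 1]; power_watts: measured power.
    Returns (idle_watts, slope).
    """
    n = len(cpu_util)
    mean_x = sum(cpu_util) / n
    mean_y = sum(power_watts) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(cpu_util, power_watts))
    var = sum((x - mean_x) ** 2 for x in cpu_util)
    slope = cov / var
    idle = mean_y - slope * mean_x
    return idle, slope

def predict_power(idle, slope, cpu_util):
    """Predict power draw (W) at a given CPU utilization."""
    return idle + slope * cpu_util

# Synthetic training samples: ~100 W idle, +150 W at full load (made-up numbers).
utils = [0.0, 0.25, 0.5, 0.75, 1.0]
watts = [100.0, 137.5, 175.0, 212.5, 250.0]
idle, slope = fit_linear_power_model(utils, watts)
```

A real model would be fit per server model and per operating mode, as the next slide describes.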
  12. Power Consumption Models: Adapting to CPU Operating Modes
      - Power profiles vary across CPU frequency governors
      - Mode-aware modeling improves accuracy with per-governor training
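Per-governor training amounts to keeping one fitted model per CPU frequency governor and dispatching on the node's current governor. A minimal sketch (the governor names are standard Linux cpufreq governors; the coefficients are made up, not measured values):

```python
# One (idle_watts, watts_per_unit_util) pair per cpufreq governor.
# Coefficients are illustrative placeholders, not measurements.
MODELS = {
    "performance": (120.0, 180.0),
    "powersave":   (90.0, 130.0),
    "ondemand":    (100.0, 150.0),
}

def predict_power(governor: str, cpu_util: float) -> float:
    """Predict node power using the model trained for its current governor."""
    idle, slope = MODELS[governor]
    return idle + slope * cpu_util
```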
  14. Scalability: Small-Scale
      - Savings are visible even with 2 servers and saturate at 5-10 servers
      - Peak: >40% (Xeon Gold), >30% (EPYC)
  15. Scalability: Small-Scale
      - Peak savings are similar regardless of server count
        - Fewer servers reach the peak at a lower number of occupied threads
        - More servers need a higher number of occupied threads
      [Figure: power saving with WAO (%) vs. number of servers for Model C (Intel Xeon Gold, 96 threads), at 96/384/960/1536/3072 occupied threads]
  16. Scalability: Large-Scale
      - Caching reduces score computation from O(N × P) to O(P), where N is the number of Nodes and P is the number of Pods
      - Maintains constant latency as the cluster scales in node count
  17. Scalability: Large-Scale
      - Latency remains low regardless of Pod or Node count
      - Same performance as the stock kube-scheduler across all cases
  19. Computational Performance: Energy-Performance Trade-off
      - Score: μ = α · power_consumption + β · processing_time
      - Tuning α and β controls the energy-performance trade-off
      - α = 0.5 achieves a good balance
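Node selection with this score can be sketched as follows. The normalization of the two terms and the choice β = 1 − α are assumptions made for the sketch (one plausible reading of "α = 0.5, balanced" and "α = 1.0, energy-aware"), not the paper's exact formulation; all node names and numbers are made up:

```python
def score(alpha, power_watts, processing_time_s, max_watts, max_time_s):
    """mu = alpha * power + beta * time, with beta = 1 - alpha (assumed).

    Both terms are normalized to [0, 1] so the weights are comparable;
    this normalization is an assumption for illustration. Lower is better.
    """
    beta = 1.0 - alpha
    return (alpha * (power_watts / max_watts)
            + beta * (processing_time_s / max_time_s))

def pick_node(alpha, candidates, max_watts, max_time_s):
    """Pick the (name, predicted_watts, predicted_seconds) with lowest score."""
    return min(candidates,
               key=lambda c: score(alpha, c[1], c[2], max_watts, max_time_s))

# Two hypothetical candidate nodes.
nodes = [("fast-but-hungry", 400.0, 100.0),
         ("slow-but-frugal", 250.0, 180.0)]
```

With α = 1.0 only power matters and the frugal node wins; with α = 0.0 only processing time matters and the fast node wins; intermediate α trades the two off.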
  20. Computational Performance: Energy-Performance Trade-off
      - Similar trends on both Model C (Xeon Gold) and Model D (EPYC)
      - Performance degrades beyond 50% CPU utilization, likely due to resource contention under SMT
      [Figure: processing time (s) vs. CPU utilization (%) and vs. total power consumption (W) for Model D (AMD EPYC, 96 threads), comparing Uniform, WAO (α = 0.5, balanced) and WAO (α = 1.0, energy-aware) at 10/20/50/80% CPU utilization; evaluation score μ = α (power consumption) + β (processing time)]
  22. Heterogeneous Environment: Mixed Servers
      - Mixing Models A/B/D into Model C servers
      - Similar PCL but different curve shape → "boost effect"
      - Large PCL gap (Model A) → diminished power-saving effect
  23. Heterogeneous Environment: Extended Scoring
      - Extend WAO scoring to support heterogeneous clusters
      - Evaluate groups of servers with per-model weighted scores
      - Enables consistent power-aware placement across mixed hardware

      Workload allocation score (μ) for homogeneous servers:
          μ = α + β
      Workload allocation score (μ) for heterogeneous servers:
          μ = Σ_i ζ_i (α_i + β_i)
      where:
          i: index over server models
          ζ_i: contribution factor for server model i
          α_i: power consumption score for server model i
          β_i: computational performance score for server model i
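The heterogeneous score is a direct weighted sum over server models and can be transcribed from the formula above almost verbatim (all weights and scores below are illustrative numbers, not measured values):

```python
def heterogeneous_score(terms):
    """mu = sum_i zeta_i * (alpha_i + beta_i)

    terms: list of (zeta_i, alpha_i, beta_i), one entry per server model,
    where zeta_i is model i's contribution factor, alpha_i its power
    consumption score, and beta_i its computational performance score.
    """
    return sum(zeta * (alpha + beta) for zeta, alpha, beta in terms)

# Two server models with illustrative contribution factors and scores.
mu = heterogeneous_score([(0.6, 0.4, 0.3),   # e.g. Model C servers
                          (0.4, 0.7, 0.2)])  # e.g. Model A servers
```

With a single server model and ζ = 1 this reduces to the homogeneous score μ = α + β.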
  25. Conclusion: Research Questions (Recap)
      ✓ How well does WAO scale?
        - Effect saturates at 5-10 servers
        - Matches kube-scheduler performance at large scale, thanks to caching
      ✓ Can it save power without slowing jobs?
        - α = 0.5 is the sweet spot: large savings, minimal slowdown
      ✓ What happens in heterogeneous environments?
        - "Boost effect" observed in moderately mixed setups
        - Works with extended scoring
  26. GitHub
      - WAO is open source (github.com/waok8s/waok8s) and runs on Kubernetes v1.30+
      - Contributions are welcome
  27. Acknowledgement
      This work is supported by the New Energy and Industrial Technology Development Organization (NEDO) under its "Program to Develop and Promote the Commercialization of Energy Conservation Technologies to Realize a Decarbonized Society" (JPNP21005).