GreenOps
・I gave a presentation introducing Kepler at Kubernetes Meetup Tokyo #58
・Introduced in the KubeCon EU 2023 Recap - Title: Sustainability Through Accountability in a CNCF Ecosystem
Electricity consumption due to the use of generative AI (*1)
・Global data centre electricity consumption was estimated at 460 TWh in 2022 and is projected to exceed 1,000 TWh in 2026. This figure is comparable to the total electricity consumption of Japan.
・When using services like ChatGPT, we rarely consider the electricity consumption or carbon emissions.
・The use of local LLMs is increasing, so it is important to understand the electricity consumption and carbon emissions associated with them.
・How can we calculate this when running a local LLM on Kubernetes?
(*1) https://www.iea.org/reports/electricity-2024/executive-summary
Exporter - measures the power consumption of workloads by tracing CPU performance counters and Linux kernel tracepoints
・Uses eBPF to probe energy-related system stats and exports them as Prometheus metrics.
・Kepler metrics can be visualized with Grafana.
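Kepler exposes cumulative energy counters such as `kepler_container_joules_total`, so average power is just the change in joules divided by elapsed time. A minimal sketch of that calculation (the sample values and timestamps below are made up for illustration):

```python
# Estimate average power (watts) of a container from two scrapes of
# Kepler's cumulative energy counter, kepler_container_joules_total.
# The sample values below are illustrative, not real measurements.

def average_power_watts(joules_t0: float, joules_t1: float,
                        t0: float, t1: float) -> float:
    """Power (W) = delta energy (J) / delta time (s)."""
    if t1 <= t0:
        raise ValueError("t1 must be after t0")
    return (joules_t1 - joules_t0) / (t1 - t0)

# Two hypothetical scrapes 60 s apart: the counter grew by 900 J.
watts = average_power_watts(joules_t0=1200.0, joules_t1=2100.0,
                            t0=0.0, t1=60.0)
print(f"{watts:.1f} W")  # 15.0 W
```

In Prometheus/Grafana the equivalent would be a `rate()` over the counter (e.g. `rate(kepler_container_joules_total[1m])`), which is what the Kepler dashboard panels are built on.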
Measuring the Power Consumption and Carbon Emissions of each LLM model (*2)
・Model 2: gemma2 2b
・Model 3: gemma2 9b
・The subject of measurement is the ollama pod.
Benchmark question: I went to the market and bought 10 apples. I gave 2 apples to the neighbor and 2 to the repairman. I then went and bought 5 more apples and ate 1. I also gave 3 bananas to my brother. How many apples did I remain with? Let's think step by step.
(*2) Asking 60+ LLMs a set of 20 questions (https://benchmarks.llmonitor.com/)
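To attribute energy to a single benchmark run, one option is to integrate the pod's power samples over the inference window. A minimal sketch with made-up sample data (in practice, since Kepler's counters are cumulative joules, you can also simply difference the counter across the run):

```python
# Integrate power samples (watts) over time (seconds) to get energy in
# joules, then convert to watt-hours. The sample data is illustrative.

def energy_joules(timestamps: list[float], watts: list[float]) -> float:
    """Trapezoidal integration of power over time."""
    total = 0.0
    for i in range(1, len(timestamps)):
        dt = timestamps[i] - timestamps[i - 1]
        total += 0.5 * (watts[i] + watts[i - 1]) * dt
    return total

# Hypothetical power samples during one inference, every 10 s for 40 s.
ts = [0.0, 10.0, 20.0, 30.0, 40.0]
w = [12.0, 30.0, 34.0, 28.0, 14.0]
joules = energy_joules(ts, w)
print(f"{joules:.0f} J = {joules / 3600:.3f} Wh")  # 1050 J = 0.292 Wh
```

Repeating this per model (gemma2 2b vs. 9b) gives comparable energy-per-question figures.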
・By using Kepler, you can measure the power consumption of an LLM pod on Kubernetes. You only need to deploy Kepler and import kepler_dashboard.json into Grafana.
Power consumption and carbon emissions by LLM model:
・For the same model, a larger model size consumes more power.
・Across different models, a smaller model size can sometimes consume more power.
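Going from measured energy to carbon emissions is a single multiplication by the carbon intensity of the local grid. A sketch, assuming a hypothetical intensity of 450 gCO2/kWh (substitute the published figure for your region or provider):

```python
# Convert measured energy (joules) into estimated CO2 emissions.
# 450 gCO2/kWh is an assumed, illustrative grid carbon intensity.

JOULES_PER_KWH = 3_600_000  # 1 kWh = 3.6e6 J

def co2_grams(joules: float, grid_g_per_kwh: float = 450.0) -> float:
    """Emissions (g CO2) = energy (kWh) * grid intensity (g/kWh)."""
    return (joules / JOULES_PER_KWH) * grid_g_per_kwh

# e.g. an inference run that consumed 1,050 J:
print(f"{co2_grams(1050.0):.3f} g CO2")  # ~0.13 g
```

Kepler's Grafana dashboard performs the same style of conversion when it displays carbon emissions alongside power.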