Billing the Cloud
Pierre-Yves Ritschard
December 15, 2016
This talk describes how Exoscale approaches usage metering and billing with Apache Kafka
Transcript
Billing the cloud
Real world stream processing

@pyr
- Co-Founder, CTO at Exoscale
- Open source developer

Tonight
- Problem domain
- Scaling methodologies
- Our approach

Infrastructure isn't free!

Business Model
- Provide cloud infrastructure
- ???
- Profit!

10000 mile high view
- Quantities
- Resources

Quantities
- 10 megabytes have been sent from 159.100.251.251 over the last minute

Resources
- Account geneva-jug started instance foo with profile large today at 12:00
- Account geneva-jug stopped instance foo today at 12:15
A bit closer to reality

    {:type     :usage
     :entity   :vm
     :action   :create
     :time     #inst "2016-12-12T15:48:32.000-00:00"
     :template "ubuntu-16.04"
     :source   :cloudstack
     :account  "geneva-jug"
     :uuid     "7a070a3d-66ff-4658-ab08-fe3cecd7c70f"
     :version  1
     :offering "medium"}

A bit closer to reality

    message IPMeasure {
      /* Versioning */
      required uint32 header = 1;
      required uint32 saddr = 2;
      required uint64 bytes = 3;
      /* Validity */
      required uint64 start = 4;
      required uint64 end = 5;
    }
Theory

Quantities are simple
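Metering a quantity comes down to summing measurements per source. Here is a minimal Python sketch, assuming records shaped like the IPMeasure message shown earlier. The `Measure` class, its field names, and the `total_bytes` helper are illustrative and do not come from the talk.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    """Hypothetical stand-in for one IPMeasure record."""
    saddr: str        # source address (a uint32 in the protobuf; a string here for readability)
    byte_count: int   # bytes sent during the validity window
    start: int        # window start, in epoch seconds (assumed unit)
    end: int          # window end, in epoch seconds (assumed unit)


def total_bytes(measures):
    """Sum the bytes reported for each source address."""
    totals = defaultdict(int)
    for m in measures:
        totals[m.saddr] += m.byte_count
    return dict(totals)


measures = [
    Measure("159.100.251.251", 10_000_000, 0, 60),
    Measure("159.100.251.251", 5, 60, 120),
    Measure("10.0.0.1", 7, 0, 60),
]
totals = total_bytes(measures)
```

Because addition is commutative and associative, the order in which measurements arrive does not change the totals, which is one way to read "quantities are simple".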
Resources are harder

This is per-account

Solving for all events

    resources = {}   # uuid -> time of the most recent 'start' event
    metering = []    # accumulated usage records

    def usage_metering():
        for event in fetch_all_events():
            uuid = event.uuid()
            time = event.time()
            if event.action() == 'start':
                resources[uuid] = time
            else:
                # Any other action closes the interval opened by 'start'.
                timespan = duration(resources[uuid], time)
                usage = Usage(uuid, timespan)
                metering.append(usage)
        return metering
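The slide code relies on helpers that are not shown (`fetch_all_events`, `duration`, `Usage`). Below is a self-contained sketch of the same idea, assuming each event carries a uuid, an action of 'start' or 'stop', and a timestamp. The `Event` and `Usage` classes are hypothetical stand-ins, and skipping a stop that has no recorded start is a choice made for this sketch, not something the talk specifies.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    uuid: str
    action: str       # 'start' or 'stop' (assumed vocabulary)
    time: datetime


@dataclass(frozen=True)
class Usage:
    uuid: str
    timespan: timedelta


def usage_metering(events):
    """Pair each stop with the preceding start of the same resource."""
    running = {}      # uuid -> start time of resources currently running
    metering = []
    for event in sorted(events, key=lambda e: e.time):
        if event.action == "start":
            running[event.uuid] = event.time
        elif event.uuid in running:
            started = running.pop(event.uuid)
            metering.append(Usage(event.uuid, event.time - started))
        # A stop with no recorded start is skipped in this sketch.
    return metering


events = [
    Event("foo", "start", datetime(2016, 12, 12, 12, 0)),
    Event("foo", "stop", datetime(2016, 12, 12, 12, 15)),
]
usage = usage_metering(events)
```

With the two events from the "Resources" slide (started at 12:00, stopped at 12:15), the result is a single usage record of 15 minutes for instance foo.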
Practical matters
- This is a never-ending process
- Minute precision billing
- Only apply once an hour
- Avoid over billing at all cost
- Avoid under billing (we need to eat!)

Practical matters
- Keep a small operational footprint

A naive approach

    32 * * * * usage-metering >/dev/null 2>&1
Advantages
- Low operational overhead
- Simple functional boundaries
- Easy to test

Drawbacks
- High pressure on SQL server
- Hard to avoid overlapping jobs
- Overlaps result in longer metering intervals
You are in a room full of overlapping cron jobs.
You can hear the screams of a dying MySQL server.
An Oracle vendor is here.
To the West, a door is marked "Map/Reduce"
To the East, a door is marked "Streaming"

> Talk to Oracle
You have been eaten by a grue.

> Go West

- Conceptually simple
- Spreads easily
- Data-locality aware processing

- ETL
- High latency
- High operational overhead

> Go East
Continuous computation on an unbounded stream

- Each event processed as it comes in
- Very low latency
- A never ending reduce

    (reductions + [1 2 3 4])
    ;; => (1 3 6 10)
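The same running reduce can be sketched in Python with `itertools.accumulate`, which likewise emits one intermediate result per input element. The `running_totals` name is illustrative. Because it consumes its input lazily, it also works on a generator that never ends.

```python
import operator
from itertools import accumulate, count, islice


def running_totals(stream):
    """Yield the running sum after each element of the stream."""
    return accumulate(stream, operator.add)


finite = list(running_totals([1, 2, 3, 4]))

# count(1) is an unbounded stream; islice takes only the first four results.
first_four = list(islice(running_totals(count(1)), 4))
```

Both `finite` and `first_four` evaluate to `[1, 3, 6, 10]`, matching the Clojure output on the slide.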
- Conceptually harder
- Where do we store intermediate results?
- How does data flow between computation steps?

Deciding factors

Our shopping list
- Operational simplicity
- Integration through our whole stack
- Going beyond billing
- Room to grow
Operational simplicity
- Experience matters
- Spark and Storm are intimidating
- HBase & Hive discarded

Integration
- HDFS would require simple integration
- Spark usually goes hand in hand with Cassandra
- Storm tends to prefer Kafka

Room to grow
- A ton of logs
- A ton of metrics

Thursday confessions
- Previously knew Kafka
- Publish & Subscribe
- Processing
- Store

Publish & Subscribe
- Messages are produced to topics
- Topics have a predefined number of partitions
- Each message has a key which determines its partition
- Consumers get assigned a set of partitions
- Consumers store their last consumed offset
- Brokers own partitions, handle replication
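The key-to-partition rule can be sketched as a hash of the key modulo the partition count. This is an illustration only: `partition_for` is a hypothetical helper, and CRC32 is a stand-in hash (Kafka's Java client uses murmur2 in its default partitioner). Using the account name as the message key is also an assumption.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index in [0, num_partitions)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # CRC32 is a stand-in; any stable hash gives the same guarantee.
    return zlib.crc32(key) % num_partitions


# Every event keyed by the same account lands on the same partition,
# so a single consumer sees that account's events in order.
first = partition_for(b"geneva-jug", 8)
second = partition_for(b"geneva-jug", 8)
```

The useful property is stability: as long as the partition count does not change, a given key always maps to the same partition, so the consumer that owns that partition sees every event for the key.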
- Stable consumer topology
- Memory disaggregation
- Can rely on in-memory storage

Stream expiry
Problem solved?
- Process crashes
- Undelivered message?
- Avoiding double billing

Process crashes
- Triggers a rebalance
- Loss of in-memory cache
- No initial state!

Reconciliation
- Snapshot of full inventory
- Converges stored resource state if necessary
- Handles failed deliveries as well
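The talk does not show how the reconciler is built. One way to picture it is as a diff between what the stream processor believes is running and a snapshot of the full inventory, producing corrective events for anything that was missed. The `reconcile` function and the tuple layout below are assumptions made for this sketch.

```python
def reconcile(state, snapshot, now):
    """Return corrective events that bring `state` in line with `snapshot`.

    state:    dict uuid -> start time, as held in memory by the consumer
    snapshot: dict uuid -> start time, taken from the full inventory
    now:      timestamp to use for stops that were never delivered
    """
    events = []
    for uuid, started in snapshot.items():
        if uuid not in state:
            # Running according to the inventory but unknown to us: missed start.
            events.append(("start", uuid, started))
    for uuid in state:
        if uuid not in snapshot:
            # Known to us but gone from the inventory: missed stop.
            events.append(("stop", uuid, now))
    return events


# After a crash the in-memory state is empty, so every running
# resource in the snapshot is replayed as a start.
after_crash = reconcile({}, {"a": 1, "b": 2}, now=10)

# In steady state only the differences are emitted.
drift = reconcile({"a": 1, "b": 2}, {"b": 2, "c": 3}, now=10)
```

The same pass covers both failure modes listed on the slides: a restarted consumer with no initial state, and individual messages that were never delivered.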
Avoiding double billing
- Reconciler acts as logical clock
- When supplying usage, attach a unique transaction ID
- Reject multiple transaction attempts on a single ID
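Rejecting repeated transaction IDs makes usage submission idempotent, so a retry after a crash cannot bill twice. Here is a minimal in-memory sketch. The `Ledger` class is hypothetical, and building the ID from the resource uuid plus a reconciler tick is an assumption about how a logical clock could feed into it.

```python
class Ledger:
    """Accepts each transaction ID at most once."""

    def __init__(self):
        self._seen = set()
        self.entries = []

    def submit(self, transaction_id, account, minutes):
        """Record usage; return False if this ID was already applied."""
        if transaction_id in self._seen:
            return False
        self._seen.add(transaction_id)
        self.entries.append((transaction_id, account, minutes))
        return True


ledger = Ledger()
# Assumed ID scheme: resource uuid plus the reconciler tick that produced the usage.
tx_id = "7a070a3d-66ff-4658-ab08-fe3cecd7c70f:42"
first_attempt = ledger.submit(tx_id, "geneva-jug", 15)
retry = ledger.submit(tx_id, "geneva-jug", 15)
```

In a real deployment the uniqueness check would have to live in durable storage, for example as a unique constraint, since an in-memory set is lost on restart.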
Looking back
- Things stay simple (roughly 600 LoC)
- Room to grow
- Stable and resilient
- DNS, Logs, Metrics, Event Sourcing

What about batch
- Streaming doesn't work for everything
- Sometimes throughput matters more than latency
- Building models in batch, applying with stream processing

Questions?
Thanks!