RICON West 2012: Bringing Consistency to Riak (Part 2)
Joseph Blomstedt
October 30, 2013
Transcript
Bringing Consistency To Riak (Part 2)
Joseph Blomstedt (@jtuple), Basho Technologies
Tuesday, October 29, 13
2 CAP Theorem
3-4 Partition-tolerance, Consistency, Availability
5-7 Partition-tolerance, Consistency, Availability; CP, AP
8-9 C/P; Strict Quorum A/P; Sloppy Quorum A/P
10-13 [Diagram: Node 1 to Node 5 and three clients]
14 C/P; Strict Quorum A/P; Sloppy Quorum A/P
15-18 [Diagram: Node 1 to Node 5 and three clients]
19 C/P; Strict Quorum A/P; Sloppy Quorum A/P
20-23 [Diagram: Node 1 to Node 5 and three clients]
24 [Diagram: Node 1 to Node 5 and five clients]
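The difference between a strict and a sloppy quorum can be sketched in a few lines of Python. This is only an illustrative model based on the usual definitions of the two terms: the node names, the `up` set, and both function names are assumptions made for the example and do not come from the slides or from Riak.

```python
# Hypothetical model: a request needs `quorum` reachable replicas to succeed.

def strict_quorum_ok(primaries, up, quorum):
    """Strict quorum: only the primary replicas for the key may answer."""
    reachable = [n for n in primaries if n in up]
    return len(reachable) >= quorum


def sloppy_quorum_ok(primaries, fallbacks, up, quorum):
    """Sloppy quorum: fallback nodes may stand in for unreachable primaries."""
    reachable = [n for n in primaries + fallbacks if n in up]
    return len(reachable) >= quorum


primaries = ["node1", "node2", "node3"]
fallbacks = ["node4", "node5"]
up = {"node1", "node4", "node5"}  # node2 and node3 are partitioned away

strict = strict_quorum_ok(primaries, up, quorum=2)
sloppy = sloppy_quorum_ok(primaries, fallbacks, up, quorum=2)
print(strict, sloppy)
```

With two of the three primaries unreachable, the strict check refuses the request while the sloppy check accepts it, which is the availability trade-off the slides contrast.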
25 Eventual Consistency
26-27 [Diagram: three replicas, each holding A]
28-29 [Diagram: replicas A, A, A; a write of B arrives]
30 [Diagram: replicas A, A, A become B, B, B]
31 [Diagram: three replicas, each holding A]
32-33 [Diagram: replicas A, A, A; concurrent writes of B and C arrive]
34 [Diagram: replicas A, A, A become siblings {B,C}, {B,C}, {B,C}]
35-40 Write Once, Immutable, Last Write Wins, Business Rules, CRDTs/Monotonicity
41 Strong Consistency
42 Strong Consistency: Why?
43 Recency
44 Recency, Partial Writes
45-47 Recency, Partial Writes, Atomicity
48 Eventual consistency is great
49 But, when is eventual?
50 Do I have the most recent value?
51 CRDTs don't help
52 [Diagram: three replicas each hold (a,1); a read returns =1]
53 [Diagram: three replicas each hold (a,1)]
54 [Diagram: +1 at one replica gives (a,2); +3 at another gives (a,1),(b,3); the third still holds (a,1); reads return =2 and =4]
55 [Diagram: as slide 54, without the reads]
56 [Diagram: all three replicas converge to (a,2),(b,3)]
57 [Diagram: all three replicas hold (a,2),(b,3); a read returns =5]
58 [Diagram: as slide 54; reads before convergence return =2 and =4]
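The counter diagrams correspond to a grow-only counter (G-counter) CRDT. The sketch below is a minimal Python model, not Riak's datatype implementation; the function names and the dictionary representation are assumptions made for illustration. It reproduces the values on the slides: reads of 2 and 4 before the replicas merge, and 5 afterwards.

```python
# Minimal G-counter sketch: state maps actor id -> that actor's total.

def increment(state, actor, amount):
    """Return a new state with `actor`'s entry increased by `amount`."""
    new_state = dict(state)
    new_state[actor] = new_state.get(actor, 0) + amount
    return new_state


def value(state):
    """The counter value is the sum of all per-actor entries."""
    return sum(state.values())


def merge(x, y):
    """Merge takes the per-actor maximum, so it is commutative and idempotent."""
    return {k: max(x.get(k, 0), y.get(k, 0)) for k in x.keys() | y.keys()}


# Three replicas start with (a,1).
r1, r2, r3 = {"a": 1}, {"a": 1}, {"a": 1}

r1 = increment(r1, "a", 1)  # +1 via actor a -> (a,2)
r2 = increment(r2, "b", 3)  # +3 via actor b -> (a,1),(b,3)

before = (value(r1), value(r2), value(r3))  # reads before convergence
merged = merge(merge(r1, r2), r3)           # (a,2),(b,3)
after = value(merged)
print(before, merged, after)
```

The merge always converges, but a client reading before convergence cannot tell whether 2 or 4 is the latest value, which is the recency problem the slides point at.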
59 Recency, Partial Writes, Atomicity
60 [Diagram: replicas A, A, A; "write B" fails after reaching one replica, leaving B, A, A]
61 [Diagram: replicas B, A, A]
62-63 [Diagram: replicas B, A, A; three reads each return A]
64 [Diagram: replicas B, A, A; after three reads of A, a fourth read returns B]
65 Recency, Partial Writes, Atomicity
66 Strong Consistency
67 Strong Consistency: What does it mean for Riak 2.0?
68 Conditional single-key atomic operations
69 No siblings
70 get sees most recent put
71-72 get/modify/put fails if object changed (e.g. concurrent put)
73 put w/o vclock fails if object exists
74 partial writes resolved on read
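The conditional single-key semantics listed above can be modelled with a small in-memory store. This is a hedged sketch, not the Riak API: `ConsistentBucket`, its methods, and the integer `version` that stands in for a vclock are all hypothetical names chosen for the example.

```python
# Hypothetical model of conditional single-key puts. An integer version
# stands in for the vclock a real client would carry on the object.

class ConsistentBucket:
    def __init__(self):
        self._data = {}  # key -> (version, value)

    def get(self, key):
        """Return (version, value), or None if the key does not exist."""
        return self._data.get(key)

    def put(self, key, value, version=None):
        """Succeed only if `version` matches the stored object's version."""
        current = self._data.get(key)
        if current is None:
            if version is not None:
                return "failed"  # caller believes an object exists
            self._data[key] = (1, value)
            return "ok"
        if version != current[0]:
            return "failed"      # no version given, or object has changed
        self._data[key] = (current[0] + 1, value)
        return "ok"


bucket = ConsistentBucket()
first = bucket.put("key", "1")                     # creates the object
blind = bucket.put("key", "2")                     # no version, object exists
version, _ = bucket.get("key")
update = bucket.put("key", "2", version=version)   # get/modify/put
stale = bucket.put("key", "22", version=version)   # same version reused
print(first, blind, update, stale)
```

The second update fails because it was built from a read that an intervening put has made stale, so no siblings are ever created.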
75 Consensus
76 Paxos
77 [Diagram: three nodes exchange prepare(N); promise(N, V_B); promise(N, V_C); V_N = f(V_A, V_B, V_C); commit(N, V_N); accept(N)]
78 Rinse/repeat for each request
79 2 round trips/request
80 Multi-Paxos
81 First Request
82 [Diagram: three nodes exchange prepare(N, I); promise(N, I, V_B); promise(N, I, V_C); V_N = f(V_A, V_B, V_C); commit(N, I, V_N); accept(N, I)]
83 Each Additional Request
84 [Diagram: three nodes exchange commit(N, I, V); accept(N, I)]
85 1 round trip/request (common case)
86 Problem: Shipping entire state each request is expensive
87 Solution: Paxos + Replicated Log
88 Problem: Now I have N problems
89 Log recovery, Log trimming, Rollup, Snapshots, Fault Recovery
90 Choose your own adventure...
91 Better Solution: Build log replication into protocol
92 Better Solution: ZK Atomic Broadcast, Raft
93 Zab
94-97 [Slides with no extracted text]
98 Raft
99 [Slide with no extracted text]
100 raftconsensus.github.io
102 Back to Riak
103 Key/Value, Keys are independent, Active Anti-Entropy, Tunable backends
104 Each key is independent state
105 Simple multi-paxos per key
106 1B keys = 1B consensus groups?
107 No
108 Consensus group per preflist (replica set)
109 Emulate paxos per key
110 Node 0, Node 1, Node 2
111-115 [Diagram: ring partitions 1 2 3 4 5 6 7 ...; preflists 123, 234, 345, 456, 567, ... added one per slide]
116 [Diagram: ring partitions 1 2 3 4 5 6 7 ...; preflists 123, 234, 345, 456, 567, ... labelled "Ensembles"]
117 64 partition ring = 64 ensembles
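The mapping from ring partitions to ensembles can be sketched as one consecutive window per partition. The replica count of 3 is an assumption read off the three-digit preflists in the diagram (123, 234, ...), and the function name is hypothetical.

```python
# Sketch: one ensemble (consensus group) per preflist, where a preflist is
# a window of consecutive ring partitions that wraps around the ring.

def ensembles(ring_size, n_val=3):
    """Return one preflist per partition; partitions are numbered from 1."""
    return [
        tuple((start + i) % ring_size + 1 for i in range(n_val))
        for start in range(ring_size)
    ]


groups = ensembles(64)
print(len(groups))   # number of ensembles in a 64-partition ring
print(groups[:5])    # first preflists
print(groups[-1])    # last preflist wraps around the ring
```

The number of consensus groups therefore depends on the ring size, not on the number of keys stored.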
118 Each Ensemble: Elects leader, Establishes epoch, Supports get/put ops
119 Establish a new epoch
120 [Diagram: three nodes exchange prepare(N, I); promise(N, I, V_B); promise(N, I, V_C); V_N = f(V_A, V_B, V_C); commit(N, I, V_N); accept(N, I)]
121 consensus state: epoch, sequence, membership, leader
122 K/V objects: epoch, sequence, key, value
123 GET
    leader reads local object
    if obj.epoch old: refresh
    reply w/ val
124 [Diagram, stale case (comparison operator between obj.epoch and epoch lost in extraction): get(Key); reply(Epoch_B, Seq_B, Val_B); reply(Epoch_C, Seq_C, Val_C); Val = latest(Val_A, Val_B, Val_C); Val.epoch = epoch; write(Epoch, Seq, Val); ack(Epoch, Seq)]
125 [Diagram, current case (comparison operator between obj.epoch and epoch lost in extraction): Reply local_get(Key)]
126 2 roundtrips/get (worst), 0 roundtrip/get (best)
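The GET path can be sketched as follows. This is a simplified single-process model under stated assumptions: followers are plain dictionaries, every follower answers, quorum counting and failures are ignored, and the `Leader` class and its field names are hypothetical rather than taken from Riak.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    epoch: int
    seq: int
    val: str


class Leader:
    """Hypothetical ensemble leader serving reads for its preflist."""

    def __init__(self, epoch, local, followers):
        self.epoch = epoch
        self.seq = 0
        self.local = local          # key -> Obj on the leader
        self.followers = followers  # list of key -> Obj dicts
        self.round_trips = 0

    def _refresh(self, key):
        # Round trip 1: fetch follower copies and pick the latest one.
        self.round_trips += 1
        copies = [self.local[key]] + [f[key] for f in self.followers if key in f]
        latest = max(copies, key=lambda o: (o.epoch, o.seq))
        # Round trip 2: rewrite the value under the current epoch.
        self.round_trips += 1
        self.seq += 1
        fresh = Obj(self.epoch, self.seq, latest.val)
        self.local[key] = fresh
        for f in self.followers:
            f[key] = fresh
        return fresh

    def get(self, key):
        obj = self.local[key]
        if obj.epoch < self.epoch:  # object predates the current epoch
            obj = self._refresh(key)
        return obj.val


leader = Leader(
    epoch=3,
    local={"k": Obj(2, 5, "X")},
    followers=[{"k": Obj(2, 5, "X")}, {"k": Obj(2, 4, "W")}],
)
first_read = leader.get("k")
trips_after_first = leader.round_trips
second_read = leader.get("k")
trips_after_second = leader.round_trips
print(first_read, trips_after_first, second_read, trips_after_second)
```

The first read pays two round trips to bring the object into the current epoch; the second is served from the leader's local copy with none.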
127 PUT
    leader reads local object
    if obj.epoch old: refresh
    if modify(obj) false: fail
    commit modified obj
    reply ok
128 [Diagram, stale case (comparison operator between obj.epoch and epoch lost in extraction): get(Key); reply(Epoch_B, Seq_B, Val_B); reply(Epoch_C, Seq_C, Val_C); Latest = latest(Val_A, Val_B, Val_C); Val = modify(Latest); write(Epoch, Seq, Val); ack(Epoch, Seq)]
129 [Diagram, current case (comparison operator between obj.epoch and epoch lost in extraction): Latest = local_get(Key); Val = modify(Latest); write(Epoch, Seq, Val); ack(Epoch, Seq)]
130 2 roundtrips/put (worst), 1 roundtrip/put (best)
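The PUT path can be sketched in the same spirit. Again this is a simplified model, not Riak's implementation: the leader is a plain dictionary, every follower acknowledges every write, and a `modify` function that returns `None` is an assumed convention for "the condition failed".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    epoch: int
    seq: int
    val: str


def leader_put(leader, key, modify):
    """Apply `modify` to the latest value; return (status, round_trips)."""
    round_trips = 0
    obj = leader["local"].get(key)
    if obj is not None and obj.epoch < leader["epoch"]:
        # Stale object: one round trip to read the follower copies.
        copies = [obj] + [f[key] for f in leader["followers"] if key in f]
        obj = max(copies, key=lambda o: (o.epoch, o.seq))
        round_trips += 1
    new_val = modify(obj.val if obj is not None else None)
    if new_val is None:  # assumed convention: None means the condition failed
        return ("failed", round_trips)
    # One round trip to write the new object to the followers.
    leader["seq"] += 1
    new_obj = Obj(leader["epoch"], leader["seq"], new_val)
    leader["local"][key] = new_obj
    for f in leader["followers"]:
        f[key] = new_obj
    round_trips += 1
    return ("ok", round_trips)


leader = {
    "epoch": 3,
    "seq": 0,
    "local": {"k": Obj(2, 7, "1")},
    "followers": [{"k": Obj(2, 7, "1")}, {}],
}
worst = leader_put(leader, "k", lambda v: "2" if v == "1" else None)
rejected = leader_put(leader, "k", lambda v: "22" if v == "1" else None)
best = leader_put(leader, "k", lambda v: v + "!")
print(worst, rejected, best)
```

The first put needs a refresh plus a write, the rejected put never reaches the followers, and the last put costs a single write round trip because the local object is already in the current epoch.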
131 Leader abandons leadership if any quorum operation ever fails
132 Which forces new epoch to be established
133 Partial Writes
134 [Diagram: failed partial write. Replicas X (2), X (2), X (2) in epoch 2 become X (2), X (2), Y (2) in epoch 3]
135 [Diagram: read / rewrite / reply X. Replicas X (2), X (2), Y (2) become X (3), X (3), Y (2); epoch 3]
136 [Diagram: read / repair / reply X. Replicas X (3), X (3), Y (2) become X (3), X (3), X (3); epoch 3]
137 Usage
138 AP or CP per bucket type
139 consistent = true
140 $ riak-admin bucket-type create strong \
        '{"props": {"consistent": true}}'
    strong created
141 $ riak-admin bucket-type activate strong
    strong has been activated
142 > riakc_pb_socket:get(Socket, {<<"strong">>, <<"bucket">>}, <<"key">>).
    {error,notfound}
143 > Obj = riakc_obj:new({<<"strong">>, <<"bucket">>}, <<"key">>, <<"1">>).
    > riakc_pb_socket:put(Socket, Obj).
    ok
144 > Obj2 = riakc_obj:new({<<"strong">>, <<"bucket">>}, <<"key">>, <<"2">>).
    > riakc_pb_socket:put(Socket, Obj2).
    {error, failed}
145 {ok, Obj3} = riakc_pb_socket:get(Socket, {<<"strong">>, <<"bucket">>}, <<"key">>).
146 Obj4 = riakc_obj:update_value(Obj3, <<"2">>).
147 Obj5 = riakc_obj:update_value(Obj3, <<"22">>).
148 > riakc_pb_socket:put(Socket, Obj4).
    ok
149 > riakc_pb_socket:put(Socket, Obj5).
    {error,<<"failed">>}
150 Your client may vary
151 Your client may vary. We're working on it
152 Tech Preview
153 No AAE syncing, No 2i, No stats
154 Will be in 2.0 final
155 Coming Soon
156-160 Datatypes, Multi-DC, Lightweight Tx?, Perf benchmarks
161 Questions?