Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Notes on Eventual Consistency
Search
UENISHI Kota
December 11, 2014
Technology
5
4.1k
Notes on Eventual Consistency
結果整合性などの復習
UENISHI Kota
December 11, 2014
Tweet
Share
More Decks by UENISHI Kota
See All by UENISHI Kota
Metadata Management in Distributed File Systems
kuenishi
2
480
Behind The Scenes: Cloud Native Storage System for AI
kuenishi
2
350
Apache Ozone behind Simulation and AI Industries
kuenishi
0
330
Distributed Deep Learning with Chainer and Hadoop
kuenishi
3
1.2k
A Few Ways to Accelerate Deep Learning
kuenishi
0
1k
Introducing Retz
kuenishi
5
1.1k
Introducing Retz and how to develop practical frameworks
kuenishi
3
700
Formalization and Proof of Distributed Systems (ja)
kuenishi
10
6.3k
Mesos Frameworkの作り方 (How to Make Mesos Framework)
kuenishi
7
2.3k
Other Decks in Technology
See All in Technology
リーダブルテストコード 〜メンテナンスしやすい テストコードを作成する方法を考える〜 #DevSumi #DevSumiB / Readable test code
nihonbuson
11
7.3k
『衛星データ利用の方々にとって近いようで触れる機会のなさそうな小話 ~ 衛星搭載ソフトウェアと衛星運用ソフトウェア (実物) を動かしながらわいわいする編 ~』 @日本衛星データコミニティ勉強会
meltingrabbit
0
150
あれは良かった、あれは苦労したB2B2C型SaaSの新規開発におけるCloud Spanner
hirohito1108
2
600
Oracle Base Database Service 技術詳細
oracle4engineer
PRO
6
57k
ハッキングの世界に迫る~攻撃者の思考で考えるセキュリティ~
nomizone
13
5.2k
スタートアップ1人目QAエンジニアが QAチームを立ち上げ、“個”からチーム、 そして“組織”に成長するまで / How to set up QA team at reiwatravel
mii3king
2
1.5k
CZII - CryoET Object Identification 参加振り返り・解法共有
tattaka
0
370
2.5Dモデルのすべて
yu4u
2
860
データの品質が低いと何が困るのか
kzykmyzw
6
1.1k
なぜ私は自分が使わないサービスを作るのか? / Why would I create a service that I would not use?
aiandrox
0
740
レビューを増やしつつ 高評価維持するテクニック
tsuzuki817
1
720
運用しているアプリケーションのDBのリプレイスをやってみた
miura55
1
720
Featured
See All Featured
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
4
330
Building Your Own Lightsaber
phodgson
104
6.2k
Raft: Consensus for Rubyists
vanstee
137
6.8k
I Don’t Have Time: Getting Over the Fear to Launch Your Podcast
jcasabona
32
2.1k
Building Flexible Design Systems
yeseniaperezcruz
328
38k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
27
1.9k
Agile that works and the tools we love
rasmusluckow
328
21k
Optimising Largest Contentful Paint
csswizardry
34
3.1k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
10
1.3k
Git: the NoSQL Database
bkeepers
PRO
427
64k
Fantastic passwords and where to find them - at NoRuKo
philnash
51
3k
Building Adaptive Systems
keathley
40
2.4k
Transcript
݁Ռ߹ੑͳͲͷ෮श 2014/12/11 Ϗοάσʔλج൫ษڧձ @NTTଂ௨ݚ Bashoδϟύϯɹ্
ࣗݾհ • @kuenishi • Github, Twitter, etc • ࢄγεςϜྺ6 •
Bashoδϟύϯͷํ͔Βདྷ·ͨ͠ • Riak CSͷ։ൃ • ͦͷଞຊͷ͜ͱ • msgpack-erlang ϝϯςφ
BashoͱRiak •ࢄσʔλϕʔεʁ •RiakΛ͍ͬͯΔʁ •BashoΛ͍ͬͯΔʁ
͋Β͢͡ •݁Ռ߹ੑڧ߹ੑͷྼԽ൛Ͱͳ͘ɺ ผछͷΛղͨ͘ΊͷҟͳΔఆٛ •ผछͷ is Մ༻ੑ •ηϚϯςΟΫε͕ҟͳΔͷͰΞϓϦͷઃܭ ͷํ͕ͪΐͬͱมΘΔ
݁Ռ߹ੑ͏ݹ͍ʁ •2006ͷٕज़Ͱ͠ΐ •Ϗοάσʔλؔͳ͘Ͷʁ •DynamoDBڧ߹ੑΛఏڙ͍ͯ͠Δ •ωοτϫʔΫΕͳ͍Ͱ͠ΐʁ •ͦΜͳͷͬͯΔਓ͍Δͷʁ •ΞϓϦ͕࡞Γʹ͍͘…
Ϗοάσʔλج൫ݚڀձͱ ݁Ռ߹ੑ •ϏοάσʔλΛѻ͏େنͳγεςϜʹͳ ΕͳΔ΄ͲյΕΔ෦ଟ͍ •͕͔͔͍ۚͬͯΔͷͰɺٻΊΒΕΔՄ༻ੑ ߴ͍ •ӡ༻ָ͕
None
݁Ռ߹ੑͷ࣮༻ྫʢۙʣ •σʔλϕʔεͷόοΫΞοϓ •rsyncͰͷϑΝΠϧͷόοΫΞοϓ •Google Wave aaaa bbbb y x
CAP Theorem • Consistent: ෳͷAtomic Objectʹର͢Δ ࿈ଓͨ͠ૢ࡞ (w1, w3, w4,
….) ͕શͯಉҰ Ͱ͋Δ͜ͱ (linearizable) • Available: Atomic Objectʹૢ࡞ w1, w2, …Λ࣮ߦͯ͠Ϩεϙϯε͕ಘΒΕΔ͜ͱ • Partition Tolerant: ૹͬͨϝοηʔδ͕ ࣦͯ͠ਖ਼͍͠ʢatomicʣͳϨεϙϯε͕ ಘΒΕΔ͜ͱ G1 G2 write read Gilbert and Lynch, Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services
CAPఆཧ͕ఆٛ͢Δ߹ੑ •CAPఆཧ͕ఆٛ͢Δ߹ੑ㲈Linearizability •શͯͷෳͰɺશͯͷʢߋ৽ʣૢ࡞͕ॱ൪௨Γ ϦϓϨΠ͞ΕΔ͜ͱΛอো͢Δ •ACIDͷͦΕͱͪΐͬͱҧ͏
ͳͥڧ߹ੑͷ࣮ݱ͕͍͠ͷ͔ • ଟܾͱ͔ atomic broadcast Λ͏ͱͯ͠ύ ϑΥʔϚϯεͷϖφϧςΟ͕͋Δ • asynchrony +
partial failureͷ͠͞ • ࢮ׆ࢹ is hard => Downtime • ӡ༻ੑ • Γସ͑ɺΓ͠ɺ༳Ε·͢༳Ε·͢
Consistency͍͠ •ߋ৽ΛࢭΊΔʢAvailabilityΛԼ͛Δʣ͔ɺߋ৽ͷ্ॻ͖Λ ڐ͢ʢσʔλΛࣦ͏ʣ͔͔͠બࢶ͕ͳ͍ Server2 Server1 Server3 PUT V=42 PUT V=0
V=?
Atomic Broadcasting is Difficult • ϨϓϦέʔγϣϯॱ൪͕ೖΕସΘΔ • CPUͷΞτΦϒΦʔμʔ࣮ߦͱಉ͡ w1 w1
w1 w2 w2 w2 Actor 0 Actor 1 Actor 2 w2 w2 w1
Consensus Based Replication • ϨϓϦέʔγϣϯͷϦʔμʔΛଟܾͰબग़ • or ϨϓϦέʔγϣϯຖʹଟܾ w1 w1
w1 w2 w2 w2 Actor 0 Actor 1 Actor 2 w2 w2 w1
݁Ռ߹ੑ •Eventual Consistency •Ͳ͏͍͏ܦ࿏ΛḷΔʹͤΑɺෳ͕ ࠷ऴతʹಉ͡ঢ়ଶʹऩଋ͢Δ͜ͱ •Read Repair •AAE •CRDT v0
v1 •(Vector Clocks)
Siblings •ͱΓ͋͑ͣෳͷόʔδϣϯͷڞଘΛڐ͢ •Ͳͷόʔδϣϯ͕ਖ਼͍͔͠ɺ͘͠Ϛʔδ͢Δ͔ΛRead࣌ʹܾఆ Server2 Server1 Server3 PUT V=42 PUT V=0
V=0 or 42 V=0 V=0 or 42 V=42
APΛ࣮ݱ •ωοτϫʔΫஅ͕ى͖͍ͯͯͱΓ͋͑ͣॻ͖ࠐΈΛڐ͢ Server2 Server1 Server3 PUT V=42 PUT V=0 Server4
෮چͨ͠Βॻ͖͢ ྆ํ͓࣋ͬͯ͘
γϣοϐϯάΧʔτͷྫ •UnionΛͱΕΑ͍ Server2 Server1 Server3 PUT cart=[a,b,d] PUT cart=[a,b,c] union([a,b,c],
[a,b,d]) => [a,b,c,d] [a,b,c] [a,b,c] or [a,b,d] [a,b,d]
Read Repair v2 v2 get(“conferences/thoughtworks”) Get Handler (FSM) client Riak
Coordinating node Cluster 6 7 8 9 10 11 12 13 14 15 16 R=2 v1 v2 v2 v1 v2 v1 v1 v2 v2
Active Anti Entropy • APࢦͷDBͷσʔλྼԽΛ͙ ͨΊͷόοΫάϥϯυॲཧ • Merkle-TreeΛͬͯύʔςΟγϣ ϯຖͷʮνΣοΫαϜʯΛܭࢉ •
ࠩΛݟ͚ͭͨΒͦ͜ΛRead Repair͢Δ hash(vnode=0, pid=0) hash(vnode=1, pid=0) hash(vnode=2, pid=0)
CRDT • ॱ൪͕ೖΕସΘͬͯ݁Ռ͕มΘΒͳ͍ܕ • update(w1, update(w2, Data0) = update(w2, update(w1,
Data0) = Data w1 w1 w1 w2 w2 w2 Actor 0 Actor 1 Actor 2 w1(w2(Data0)) => Data w1(w2(Data0)) => Data w2(w1(Data0)) => Data
CRDT: PN-Counter • merge • {a: {1,-1}, b: {1,0}, c:
{2,0}} • {a: {0,0}, b: {2, 0}, c: {0, -2}} • => {a: {1,-1}, b:{2,0}, c:{2,-2}} => 2 • update • a͕ {increment, 3} Λड͚͚Δͱ • {a: {4,-1}, b: {1,0}, c: {2,0}}
CRDT: OR-Sets • merge • {a:{“foo”:true}, b:{“bar”:false}} • + {a:{“foo”:true},
b:{“foo”:false, “bar”:false}} • => {a:{“foo”:true}, b:{“foo”:false, “bar”:true}} • => [“bar”] • update • add: {a:{}} => +”foo” => {a:{“foo”:false}} • remove: {a: {“foo”:false}} => {a: {“foo”:true}}
ӡ༻ָ͕ • ΧδϡΞϧʹϊʔυωοτϫʔΫΛ্͛Լ͛Ͱ͖Δ • ߹ੑΛอͭͨΊͷϚελʔ͕୭͔Λؾʹ͢Δඞཁ͕ͳ͍ • ڧ߹ੑΛอͭͨΊͷΦϖϨʔγϣϯ͕μϯλΠϜʹͳ Βͳ͍ • ߹ੑνΣοΫɺϦΧόϦɺόοΫΞοϓ
• ނো࣌ͷΦϖϨʔγϣϯ͕͔ͳΓ୯७
݁Ռ߹ੑΛ࠾༻ͨ͠ ߹ͷ՝ •;ͭ͏ͷϓϩάϥϛϯάͱҟͳΔηϚϯςΟ ΫεʹͳΔ •ΞϓϦέʔγϣϯʹ͜Ε·ͰͱҟͳΔલఏΛཁ ٻ͢Δ͜ͱʹͳΔ •CRDTͰҰ෦ղܾɺ͚ͩͲ…
Ԡ༻ྫ
League of Legends •MMORPGͷνϟοτՄ༻ੑͱϨε ϙϯελΠϜ໋͕ •10ms ͕ੜࢮΛ͚Δ (C) Riot Games
•Riak্Ͱಈ͘ “ߴՄ༻” Ϋϥυ ετϨʔδ •ΦϒδΣΫτͷϝλσʔλ݁Ռ ߹తσʔλߏ •: 5GBͷσʔλ͕Concurrentʹ Ξοϓϩʔυ͞Ε͖ͯͨΒʁ •:
͔ͦ͠Ε͕ผͷେͷ ΞοϓϩʔυͩͬͨΒʁ /foo.bar
σʔληϯλʔؒϨϓϦέʔγϣϯ •DCؒωοτϫʔΫଓੑ ଳҬ·ͰؚΊͯৗʹਖ਼͘͠ӡ ༻͢Δͷ͕͍͠ •CAPఆཧͷཁ͔Βɺಉظత ϨϓϦέʔγϣϯ͍͠ •Մ༻ੑΛอͭͨΊʹɺ݁Ռ ߹͢ΔσʔλϞσϧΛ࠾༻
·ͱΊ •݁Ռ߹ੑڧ߹ੑͷྼԽ൛Ͱͳ͘ɺผछͷ Λղͨ͘ΊͷҟͳΔఆٛ •ผछͷ is Մ༻ੑ •ηϚϯςΟΫε͕ҟͳΔͷͰΞϓϦͷઃܭͷํ͕ ͪΐͬͱมΘΔ •݁Ռ߹ੑΛอͭͨΊͷ͍͔ͭ͘ͷٕज़Λհ
We are hiring. •࣮ੈքͷࢄγεςϜͷ ʹڵຯ͋Δਓʂ •@BashoJapan •
[email protected]
Questions?