Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Scaling MongoDB | Sergey Gavruk
Search
Minsk MongoDB User Group
October 04, 2012
Programming
2
170
Scaling MongoDB | Sergey Gavruk
Sergey Gavruk
Meetup #7
Minsk MongoDB User Group
October 04, 2012
Tweet
Share
More Decks by Minsk MongoDB User Group
See All by Minsk MongoDB User Group
MongoDB by Chef | Yauhen Artsiukhou
bymongo
0
120
MongoDB at IronMQ | Alexander Kolesen
bymongo
0
840
Event sourcing + CQRS + MongoDB | Alex Shkor
bymongo
1
640
How it works. Indexes | Kirill Duborenko
bymongo
5
270
Aggregation Framework | Mikhail Burtylev
bymongo
1
88
MongoDB 2.2: Release update + Roadmap | Alvin Richards
bymongo
1
89
Meetup#6 Intro | Alex Litvinok
bymongo
1
44
Deploying MongoDB on Amazon WS | Michael Karpitsky
bymongo
2
110
About the problem of DBMS choice & what to do if you have gone the wrong way | Roman Bugaev
bymongo
3
120
Other Decks in Programming
See All in Programming
PagerDuty を軸にした On-Call 構築と運用課題の解決 / PagerDuty Japan Community Meetup 4
horimislime
1
110
プロジェクト新規参入者のリードタイム短縮の観点から見る、品質の高いコードとアーキテクチャを保つメリット
d_endo
1
1k
C#/.NETのこれまでのふりかえり
tomokusaba
1
160
Go言語でターミナルフレンドリーなAIコマンド、afaを作った/fukuokago20_afa
monochromegane
2
140
CPython 인터프리터 구조 파헤치기 - PyCon Korea 24
kennethanceyer
0
250
Progressive Web Apps für Desktop und Mobile mit Angular (Hands-on)
christianliebel
PRO
0
110
WEBエンジニア向けAI活用入門
sutetotanuki
0
300
From Subtype Polymorphism To Typeclass-based Ad hoc Polymorphism- An Example
philipschwarz
PRO
0
170
Kubernetes for Data Engineers: Building Scalable, Reliable Data Pipelines
sucitw
1
200
Kaigi on Rails 2024 - Rails APIモードのためのシンプルで効果的なCSRF対策 / kaigionrails-2024-csrf
corocn
5
3.4k
カラム追加で増えるActiveRecordのメモリサイズ イメージできますか?
asayamakk
4
1.6k
Jakarta Concurrencyによる並行処理プログラミングの始め方 (JJUG CCC 2024 Fall)
tnagao7
1
240
Featured
See All Featured
10 Git Anti Patterns You Should be Aware of
lemiorhan
654
59k
Designing for humans not robots
tammielis
249
25k
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
47
5k
What's new in Ruby 2.0
geeforr
342
31k
Building a Scalable Design System with Sketch
lauravandoore
459
33k
Fireside Chat
paigeccino
32
3k
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
37
1.8k
Designing the Hi-DPI Web
ddemaree
280
34k
Being A Developer After 40
akosma
86
590k
KATA
mclloyd
29
13k
The Cult of Friendly URLs
andyhume
78
6k
The World Runs on Bad Software
bkeepers
PRO
65
11k
Transcript
Scaling Sergey Gavruk @gavruk
Scaling • Ver2cal • Horizontal • By op2miza2on – Op2mize
your queries, schema, indexes – Tune you file system – Choose right disks
Share nothing architecture • Michael Stonebraker First
implementa2on in 1983 Google calls this “Sharding”
Sharding goals • App doesn’t know about clusters • Cluster
should always be available for reads and writes • Cluster should grow easily
Sharding features • Range-‐based data par22oning • Automa2c
data volume distribu2on • Transparent query rou2ng
[“a”, “g”) [“g”, “m”) [“m”, “s”) [“s”,
“z”)
[“a”, “g”) [“g”, “m”) [“m”, “s”) [“s”,
“z”) [“d”, “g”) 100 GB 500 GB 100 GB 100 GB 100 GB 400 GB 200 GB 100 GB
[“a”, “g”) [“g”, “m”) [“m”, “s”) [“s”,
“z”)
[“a”, “d”) 300 [“g”, “k”) 300 [“m”,
“s”) [“s”, “z”) 400 GB 400 GB 100 GB 100 GB [“d”, “g”) 100 [“k”, “m”) 100
None
Chunks -‐∞ +∞
Chunks -‐∞ +∞
null Numbers Strings Objects Arrays
binary data ObjectIds booleans Dates regular expressions smaller bigger
Balancing mongos balancer Config server Config
server Config server Shard 1 Shard 2
Balancer goals • keep data distributed • minimize the amount
of data transfered
Balancing mongos balancer Config server Config
server Config server Shard 1 Shard 2 Move chunk X to shard 2
Balancing Number of chunks Migra:on threshold <
20 2 21-‐80 4 80+ 8
Balancing schedule db.seangs.update({ _id : "balancer" },
{ $set : { ac2veWindow : { start : "23:00", stop : "6:00" } } }, true )
Routed Request mongos Shard 1 Shard 2
Shard 3
mongos Shard 1 Shard 2 Shard 3
Request without shard key
Without shard key + sor2ng mongos Shard 1
Shard 2 Shard 3
Consider the shard cluster if: • Data exceeds the
storage capacity of a single node • Size of working set will soon exceed your RAM • Large amount of writes
Restric2ons • You cannot update a shard key
• You must use a shard key for a single update • Index on shard key
Ideal shard key • easily divisible. • will
distribute write opera2ons among the cluster • will make it possible for the mongos to return most query opera2ons directly from a single specific mongod instance
Choosing a shard key {
_id: "1", user_id: "2345652221", date_2me: "2012-‐10-‐04“, tweet_text: “Hello world” } Reliability
Choosing a shard key Ascending { TimeStamp:
12355232, … }
Choosing a shard key Low-‐cardinality key {
Con:nent: “Europe”, Name: “Tom”, … } Zip code?
Demo
Any questions? mailto:
[email protected]