Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
MongoDB for Analytics
Search
Sponsored
·
Ship Features Fearlessly
Turn features on and off without deploys. Used by thousands of Ruby developers.
→
John Nunemaker
PRO
October 18, 2011
Programming
16
30k
MongoDB for Analytics
Presented at Mongo Chicago 2011.
John Nunemaker
PRO
October 18, 2011
Tweet
Share
More Decks by John Nunemaker
See All by John Nunemaker
AI: The stuff that nobody shows you
jnunemaker
PRO
3
450
Atom
jnunemaker
PRO
10
4.8k
MongoDB for Analytics
jnunemaker
PRO
11
1k
Addicted to Stable
jnunemaker
PRO
32
2.8k
MongoDB for Analytics
jnunemaker
PRO
21
2.3k
Why You Should Never Use an ORM
jnunemaker
PRO
61
9.8k
Why NoSQL?
jnunemaker
PRO
10
990
Don't Repeat Yourself, Repeat Others
jnunemaker
PRO
7
3.5k
I Have No Talent
jnunemaker
PRO
14
1k
Other Decks in Programming
See All in Programming
AIとペアプロして処理時間を97%削減した話 #pyconshizu
kashewnuts
1
250
GC言語のWasm化とComponent Modelサポートの実践と課題 - Scalaの場合
tanishiking
0
120
encoding/json/v2のUnmarshalはこう変わった:内部実装で見る設計改善
kurakura0916
0
420
エンジニアの「手元の自動化」を加速するn8n 2026.02.27
symy2co
0
160
The Ralph Wiggum Loop: First Principles of Autonomous Development
sembayui
0
3.7k
TipKitTips
ktcryomm
0
170
LangChain4jとは一味違うLangChain4j-CDI
kazumura
1
190
メタプログラミングで実現する「コードを仕様にする」仕組み/nikkei-tech-talk43
nikkei_engineer_recruiting
0
190
ポーリング処理廃止によるイベント駆動アーキテクチャへの移行
seitarof
3
1.1k
「やめとこ」がなくなった — 1月にZennを始めて22本書いた AI共創開発のリアル
atani14
0
400
Rで始めるML・LLM活用入門
wakamatsu_takumu
0
190
[SF Ruby Feb'26] The Silicon Heel
palkan
0
110
Featured
See All Featured
Learning to Love Humans: Emotional Interface Design
aarron
275
41k
Utilizing Notion as your number one productivity tool
mfonobong
4
260
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
16
1.9k
GitHub's CSS Performance
jonrohan
1032
470k
Java REST API Framework Comparison - PWX 2021
mraible
34
9.2k
Visualization
eitanlees
150
17k
Mind Mapping
helmedeiros
PRO
1
120
Google's AI Overviews - The New Search
badams
0
930
Deep Space Network (abreviated)
tonyrice
0
92
Claude Code のすすめ
schroneko
67
220k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
PRO
199
73k
Effective software design: The role of men in debugging patriarchy in IT @ Voxxed Days AMS
baasie
0
260
Transcript
Ordered List John Nunemaker MongoChi 2011 October 18, 2011 MongoDB
for Analytics A loving conversation with @jnunemaker
Background As presented through interpretive dance
None
None
None
~1 month Of evenings and weekends
~4 dog years Since public launch
~6 tiny servers 2 web, 2 app, 2 db
~1-2 Million Page views per day
None
None
Implementation Imma show you how we do what we do
baby
Doing It Live No aggregate querying
get('/track.gif') do Hit.record(...) TrackGif end
class Hit def record site.atomic_update(site_updates) Resolution.record(self) Technology.record(self) Location.record(self) Referrer.record(self) Content.record(self)
Search.record(self) Notification.record(self) View.record(self) end end
class Resolution def record(hit) query = {'_id' => "..."} update
= {'$inc' => {}} update['$inc']["sx.#{hit.screenx}"] = 1 update['$inc']["bx.#{hit.browserx}"] = 1 update['$inc']["by.#{hit.browsery}"] = 1 collection(hit.created_on) .update(query, update, :upsert => true) end end end
Pros
Pros Space
Pros Space RAM
Pros Space RAM Reads
Pros Space RAM Reads Live
Cons
Cons Writes
Cons Writes Constraints
Cons Writes Constraints More Forethought
Cons Writes Constraints More Forethought No raw data
Time Frame Minute, hour, month, day, year, forever?
# of Variations One document vs many
Single Document Per Time Frame
None
{ "t" => 336381, "u" => 158951, "2011" => {
"02" => { "18" => { "t" => 9, "u" => 6 } } } }
{ '$inc' => { 't' => 1, 'u' => 1,
'2011.02.18.t' => 1, '2011.02.18.u' => 1, } }
Single Document For all ranges in time frame
None
{ "_id" =>"...:10", "bx" => { "320" => 85, "480"
=> 318, "800" => 1938, "1024" => 5033, "1280" => 6288, "1440" => 2323, "1600" => 3817, "2000" => 137 }, "by" => { "480" => 2205, "600" => 7359,
"600" => 7359, "768" => 4515, "900" => 3833, "1024"
=> 2026 }, "sx" => { "320" => 191, "480" => 179, "800" => 195, "1024" => 1059, "1280" => 5861, "1440" => 3533, "1600" => 7675, "2000" => 1279 } }
{ '$inc' => { 'sx.1440' => 1, 'bx.1280' => 1,
'by.768' => 1, } }
Many Documents Search terms, content, referrers...
None
[ { "_id" => "<oid>:<hash>", "t" => "ruby class variables",
"sid" => BSON::ObjectId('<oid>'), "v" => 352 }, { "_id" => "<oid>:<hash>", "t" => "ruby unless", "sid" => BSON::ObjectId('<oid>'), "v" => 347 }, ]
Writes {'_id' => "#{site_id}:#{hash}"}
Reads [['sid', 1], ['v', -1]]
Growth The best laid plans of mice and men
Partition Hot Data Currently using collections for time frames
Bigger, Faster Server More CPU, RAM, Disk Space
Users Sites Content Referrers Terms Engines Resolutions Locations Users Sites
Content Referrers Terms Engines Resolutions Locations
Partition by Function Spread writes across a few servers
Users Sites Content Referrers Terms Engines Resolutions Locations
Partition by Server Spread writes across a ton of servers,
way down the road, not worried yet
Ordered List Thank you!
[email protected]
John Nunemaker MongoChi 2011 October
18, 2011 @jnunemaker