Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
The Hardest Problem in Data
Search
Ronnie Chen
August 24, 2017
Technology
0
210
The Hardest Problem in Data
Ronnie Chen
August 24, 2017
Tweet
Share
More Decks by Ronnie Chen
See All by Ronnie Chen
ChaosConf 2018
ronnieftw
4
1.8k
devopsdays MSP 2018: Staying Alive
ronnieftw
1
570
Luck Driven Development: Building for Serendipity in Slack's Data Platform
ronnieftw
1
440
Staying Alive: Patterns for Failure Management From the Bottom of the Ocean
ronnieftw
0
230
Scaling Data at Slack: A Series of Unfortunate Events
ronnieftw
0
1.5k
Other Decks in Technology
See All in Technology
次世代KYC活動報告 / 20250219-BizDay17-KYC-nextgen
oidfj
0
260
Developer Summit 2025 [14-D-1] Yuki Hattori
yuhattor
19
6.2k
データ資産をシームレスに伝達するためのイベント駆動型アーキテクチャ
kakehashi
PRO
2
550
2.5Dモデルのすべて
yu4u
2
880
データの品質が低いと何が困るのか
kzykmyzw
6
1.1k
株式会社EventHub・エンジニア採用資料
eventhub
0
4.3k
2025-02-21 ゆるSRE勉強会 Enhancing SRE Using AI
yoshiiryo1
1
370
AndroidXR 開発ツールごとの できることできないこと
donabe3
0
130
Amazon S3 Tablesと外部分析基盤連携について / Amazon S3 Tables and External Data Analytics Platform
nttcom
0
140
【Developers Summit 2025】プロダクトエンジニアから学ぶ、 ユーザーにより高い価値を届ける技術
niwatakeru
2
1.4k
RECRUIT TECH CONFERENCE 2025 プレイベント【高橋】
recruitengineers
PRO
0
160
開発スピードは上がっている…品質はどうする? スピードと品質を両立させるためのプロダクト開発の進め方とは #DevSumi #DevSumiB / Agile And Quality
nihonbuson
2
3k
Featured
See All Featured
[RailsConf 2023] Rails as a piece of cake
palkan
53
5.2k
Rails Girls Zürich Keynote
gr2m
94
13k
Designing on Purpose - Digital PM Summit 2013
jponch
117
7.1k
I Don’t Have Time: Getting Over the Fear to Launch Your Podcast
jcasabona
32
2.1k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
366
25k
Learning to Love Humans: Emotional Interface Design
aarron
273
40k
The Cost Of JavaScript in 2023
addyosmani
47
7.3k
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
45
9.4k
Documentation Writing (for coders)
carmenintech
67
4.6k
The Straight Up "How To Draw Better" Workshop
denniskardys
232
140k
Music & Morning Musume
bryan
46
6.3k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
10
1.3k
Transcript
The Hardest Problem in Data Ronnie Chen @rondoftw Data Engineering
Slack 1 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
2 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
→ Machine learning → Predictive modeling → Neural networks →
Artificial intelligence 3 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counting ?! 4 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
5 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
A simple counting problem 6 — WriteSpeakCode 2017 | Ronnie
Chen @rondoftw
The Rules: 1. Only one number 2. Convince me it's
correct 7 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
How many friends do you have? 8 — WriteSpeakCode 2017
| Ronnie Chen @rondoftw
Will I get the same number if... !"#$ I ask
every person you know if they consider you their friend? 9 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Will I get the same number if... ! " I
ask every person that knows you if they think you would consider them a friend? 10 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Is this the number of people that you'd tell a
secret to? 11 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
But it depends!! 12 — WriteSpeakCode 2017 | Ronnie Chen
@rondoftw
How many users do we have? 13 — WriteSpeakCode 2017
| Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users 14 — WriteSpeakCode 2017 | Ronnie
Chen @rondoftw
user_id name email deleted 1 Alice alice@*** 2 Bob bob@***
true 3 Carol 15 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users WHERE deleted != true AND email
!= null 16 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users WHERE last_active > 2017-07-24 17 —
WriteSpeakCode 2017 | Ronnie Chen @rondoftw
user_id email 12334
[email protected]
38602
[email protected]
52981
[email protected]
67640
[email protected]
18 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
¯\_(ϑ)_/¯ 19 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
What are you not even aware of? 20 — WriteSpeakCode
2017 | Ronnie Chen @rondoftw
Okay, I get it. But what's the big deal? 21
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
26% of professional computing jobs were held by women in
2016 22 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
23 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Numbers give you authority and the appearance of objectivity 24
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counting is power. 25 — WriteSpeakCode 2017 | Ronnie Chen
@rondoftw
26 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counts can determine funding, set agendas, and shift priorities 27
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Machine learning is like money laundering for bias — Maciej
Cegłowski, founder of @Pinboard 28 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
29 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
30 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
What you count determines what is important. 31 — WriteSpeakCode
2017 | Ronnie Chen @rondoftw