Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
The Hardest Problem in Data
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
Ronnie Chen
August 24, 2017
Technology
240
0
Share
The Hardest Problem in Data
Ronnie Chen
August 24, 2017
More Decks by Ronnie Chen
See All by Ronnie Chen
ChaosConf 2018
ronnieftw
4
1.8k
devopsdays MSP 2018: Staying Alive
ronnieftw
1
680
Luck Driven Development: Building for Serendipity in Slack's Data Platform
ronnieftw
1
510
Staying Alive: Patterns for Failure Management From the Bottom of the Ocean
ronnieftw
0
280
Scaling Data at Slack: A Series of Unfortunate Events
ronnieftw
0
1.7k
Other Decks in Technology
See All in Technology
Babylon.js Japan Activities (2026/4)
limes2018
0
170
I ran an automated simulation of fake news spread using OpenClaw.
zzzzico
1
930
【PHPカンファレンス小田原2026】Webアプリケーションエンジニアにも知ってほしい オブザーバビリティ の本質
fendo181
0
190
主催・運営として"場をつくる”というアウトプットのススメ
_mossann_t
0
110
制約を設計する - 非決定性との境界線 / Designing constraints
soudai
PRO
6
1.8k
Oracle Cloud Infrastructure:2026年3月度サービス・アップデート
oracle4engineer
PRO
0
380
Kubernetes基盤における開発者体験 とセキュリティの両⽴ / Balancing developer experience and security in a Kubernetes-based environment
chmikata
0
170
【関西電力KOI×VOLTMIND 生成AIハッカソン】空間AIブレイン ~⼤阪おばちゃんフィジカルAIに続く道~
tanakaseiya
0
160
Cortex Code君、今日から内製化支援担当ね。
coco_se
0
270
2026-04-02 IBM Bobオンボーディング入門
yutanonaka
0
210
TanStack Start エコシステムの現在地 / TanStack Start Ecosystem 2026
iktakahiro
1
300
ログ基盤・プラグイン・ダッシュボード、全部整えた。でも最後は人だった。
makikub
2
230
Featured
See All Featured
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
35
2.4k
Designing for Timeless Needs
cassininazir
0
180
SEO Brein meetup: CTRL+C is not how to scale international SEO
lindahogenes
1
2.5k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
PRO
199
73k
Navigating Team Friction
lara
192
16k
Build your cross-platform service in a week with App Engine
jlugia
234
18k
The Invisible Side of Design
smashingmag
302
51k
Agile Actions for Facilitating Distributed Teams - ADO2019
mkilby
0
170
A designer walks into a library…
pauljervisheath
211
24k
How to optimise 3,500 product descriptions for ecommerce in one day using ChatGPT
katarinadahlin
PRO
1
3.5k
How To Speak Unicorn (iThemes Webinar)
marktimemedia
1
430
The Pragmatic Product Professional
lauravandoore
37
7.2k
Transcript
The Hardest Problem in Data Ronnie Chen @rondoftw Data Engineering
Slack 1 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
2 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
→ Machine learning → Predictive modeling → Neural networks →
Artificial intelligence 3 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counting ?! 4 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
5 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
A simple counting problem 6 — WriteSpeakCode 2017 | Ronnie
Chen @rondoftw
The Rules: 1. Only one number 2. Convince me it's
correct 7 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
How many friends do you have? 8 — WriteSpeakCode 2017
| Ronnie Chen @rondoftw
Will I get the same number if... !"#$ I ask
every person you know if they consider you their friend? 9 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Will I get the same number if... ! " I
ask every person that knows you if they think you would consider them a friend? 10 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Is this the number of people that you'd tell a
secret to? 11 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
But it depends!! 12 — WriteSpeakCode 2017 | Ronnie Chen
@rondoftw
How many users do we have? 13 — WriteSpeakCode 2017
| Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users 14 — WriteSpeakCode 2017 | Ronnie
Chen @rondoftw
user_id name email deleted 1 Alice alice@*** 2 Bob bob@***
true 3 Carol 15 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users WHERE deleted != true AND email
!= null 16 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users WHERE last_active > 2017-07-24 17 —
WriteSpeakCode 2017 | Ronnie Chen @rondoftw
user_id email 12334
[email protected]
38602
[email protected]
52981
[email protected]
67640
[email protected]
18 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
¯\_(ϑ)_/¯ 19 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
What are you not even aware of? 20 — WriteSpeakCode
2017 | Ronnie Chen @rondoftw
Okay, I get it. But what's the big deal? 21
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
26% of professional computing jobs were held by women in
2016 22 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
23 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Numbers give you authority and the appearance of objectivity 24
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counting is power. 25 — WriteSpeakCode 2017 | Ronnie Chen
@rondoftw
26 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counts can determine funding, set agendas, and shift priorities 27
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Machine learning is like money laundering for bias — Maciej
Cegłowski, founder of @Pinboard 28 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
29 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
30 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
What you count determines what is important. 31 — WriteSpeakCode
2017 | Ronnie Chen @rondoftw