Upgrade to PRO for Only $50/Year—Limited-Time Offer! 🔥
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
The Hardest Problem in Data
Search
Ronnie Chen
August 24, 2017
Technology
0
230
The Hardest Problem in Data
Ronnie Chen
August 24, 2017
Tweet
Share
More Decks by Ronnie Chen
See All by Ronnie Chen
ChaosConf 2018
ronnieftw
4
1.8k
devopsdays MSP 2018: Staying Alive
ronnieftw
1
650
Luck Driven Development: Building for Serendipity in Slack's Data Platform
ronnieftw
1
500
Staying Alive: Patterns for Failure Management From the Bottom of the Ocean
ronnieftw
0
270
Scaling Data at Slack: A Series of Unfortunate Events
ronnieftw
0
1.6k
Other Decks in Technology
See All in Technology
New Relic 1 年生の振り返りと Cloud Cost Intelligence について #NRUG
play_inc
0
240
フルカイテン株式会社 エンジニア向け採用資料
fullkaiten
0
9.9k
通勤手当申請チェックエージェント開発のリアル
whisaiyo
3
470
[Data & AI Summit '25 Fall] AIでデータ活用を進化させる!Google Cloudで作るデータ活用の未来
kirimaru
0
3.9k
「もしもデータ基盤開発で『強くてニューゲーム』ができたなら今の僕はどんなデータ基盤を作っただろう」
aeonpeople
0
250
AWSの新機能をフル活用した「re:Inventエージェント」開発秘話
minorun365
2
460
[Neurogica] 採用ポジション/ Recruitment Position
neurogica
1
130
ペアーズにおけるAIエージェント 基盤とText to SQLツールの紹介
hisamouna
2
1.7k
Amazon Connect アップデート! AIエージェントにMCPツールを設定してみた!
ysuzuki
0
140
AI with TiDD
shiraji
1
300
ActiveJobUpdates
igaiga
1
320
たまに起きる外部サービスの障害に備えたり備えなかったりする話
egmc
0
420
Featured
See All Featured
Why Our Code Smells
bkeepers
PRO
340
57k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
9
1.1k
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
231
22k
Claude Code どこまでも/ Claude Code Everywhere
nwiizo
61
50k
Lessons Learnt from Crawling 1000+ Websites
charlesmeaden
PRO
0
960
Testing 201, or: Great Expectations
jmmastey
46
7.8k
Conquering PDFs: document understanding beyond plain text
inesmontani
PRO
4
2.1k
From Legacy to Launchpad: Building Startup-Ready Communities
dugsong
0
110
Thoughts on Productivity
jonyablonski
73
5k
Odyssey Design
rkendrick25
PRO
0
440
Collaborative Software Design: How to facilitate domain modelling decisions
baasie
0
100
HDC tutorial
michielstock
0
280
Transcript
The Hardest Problem in Data Ronnie Chen @rondoftw Data Engineering
Slack 1 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
2 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
→ Machine learning → Predictive modeling → Neural networks →
Artificial intelligence 3 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counting ?! 4 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
5 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
A simple counting problem 6 — WriteSpeakCode 2017 | Ronnie
Chen @rondoftw
The Rules: 1. Only one number 2. Convince me it's
correct 7 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
How many friends do you have? 8 — WriteSpeakCode 2017
| Ronnie Chen @rondoftw
Will I get the same number if... !"#$ I ask
every person you know if they consider you their friend? 9 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Will I get the same number if... ! " I
ask every person that knows you if they think you would consider them a friend? 10 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Is this the number of people that you'd tell a
secret to? 11 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
But it depends!! 12 — WriteSpeakCode 2017 | Ronnie Chen
@rondoftw
How many users do we have? 13 — WriteSpeakCode 2017
| Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users 14 — WriteSpeakCode 2017 | Ronnie
Chen @rondoftw
user_id name email deleted 1 Alice alice@*** 2 Bob bob@***
true 3 Carol 15 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users WHERE deleted != true AND email
!= null 16 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users WHERE last_active > 2017-07-24 17 —
WriteSpeakCode 2017 | Ronnie Chen @rondoftw
user_id email 12334
[email protected]
38602
[email protected]
52981
[email protected]
67640
[email protected]
18 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
¯\_(ϑ)_/¯ 19 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
What are you not even aware of? 20 — WriteSpeakCode
2017 | Ronnie Chen @rondoftw
Okay, I get it. But what's the big deal? 21
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
26% of professional computing jobs were held by women in
2016 22 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
23 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Numbers give you authority and the appearance of objectivity 24
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counting is power. 25 — WriteSpeakCode 2017 | Ronnie Chen
@rondoftw
26 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counts can determine funding, set agendas, and shift priorities 27
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Machine learning is like money laundering for bias — Maciej
Cegłowski, founder of @Pinboard 28 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
29 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
30 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
What you count determines what is important. 31 — WriteSpeakCode
2017 | Ronnie Chen @rondoftw