The Hardest Problem in Data
Ronnie Chen
August 24, 2017
Technology
Transcript
The Hardest Problem in Data
Ronnie Chen (@rondoftw), Data Engineering, Slack
WriteSpeakCode 2017
→ Machine learning
→ Predictive modeling
→ Neural networks
→ Artificial intelligence
Counting?!
A simple counting problem
The Rules:
1. Only one number
2. Convince me it's correct
How many friends do you have?
Will I get the same number if I ask every person you know whether they consider you their friend?
Will I get the same number if I ask every person who knows you whether they think you would consider them a friend?
Is this the number of people that you'd tell a secret to?
But it depends!!
How many users do we have?
SELECT COUNT(*) FROM prod.users
user_id  name   email      deleted
1        Alice  alice@***
2        Bob    bob@***    true
3        Carol
SELECT COUNT(*) FROM prod.users WHERE deleted IS NOT TRUE AND email IS NOT NULL
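How a filter like this treats NULL changes the count: a comparison written with != never matches a NULL value, while IS NOT NULL and COALESCE handle it explicitly. The following is a minimal sketch using Python's standard-library sqlite3 module and the three rows from the example table. The full email addresses, the integer encoding of deleted, and the in-memory prod schema are assumptions made for illustration and are not from the talk.

```python
import sqlite3


def count_users():
    """Run three ways of counting users against the three-row example table."""
    conn = sqlite3.connect(":memory:")
    # Attach a second in-memory database so the table can be called prod.users.
    conn.execute("ATTACH DATABASE ':memory:' AS prod")
    conn.execute(
        "CREATE TABLE prod.users ("
        "user_id INTEGER PRIMARY KEY, name TEXT, email TEXT, deleted INTEGER)"
    )
    # Hypothetical rows: blank cells on the slide are stored as NULL.
    conn.executemany(
        "INSERT INTO prod.users VALUES (?, ?, ?, ?)",
        [
            (1, "Alice", "alice@example.com", None),
            (2, "Bob", "bob@example.com", 1),
            (3, "Carol", None, None),
        ],
    )
    queries = {
        # Every row, including deleted users and users without an email.
        "all_rows": "SELECT COUNT(*) FROM prod.users",
        # Comparing against NULL with != yields NULL, so no row passes.
        "naive_filter": (
            "SELECT COUNT(*) FROM prod.users "
            "WHERE deleted != 1 AND email != NULL"
        ),
        # Treat a missing deleted flag as not deleted; require an email.
        "null_safe_filter": (
            "SELECT COUNT(*) FROM prod.users "
            "WHERE COALESCE(deleted, 0) = 0 AND email IS NOT NULL"
        ),
    }
    counts = {label: conn.execute(sql).fetchone()[0] for label, sql in queries.items()}
    conn.close()
    return counts


counts = count_users()
print(counts)
```

On this data the three queries return 3, 0, and 1: the same table gives three different answers to "how many users do we have?".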
SELECT COUNT(*) FROM prod.users WHERE last_active > '2017-07-24'
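The number of "active" users also depends on where the cutoff is drawn. The sketch below assumes hypothetical rows with last_active stored as ISO-8601 text. The helper name count_active_since and the sample dates are illustrative only. The cutoff is passed as a quoted string parameter, because an unquoted 2017-07-24 would be read as integer subtraction.

```python
import sqlite3


def count_active_since(conn, cutoff):
    """Count users whose last_active date is strictly after cutoff (ISO-8601 text)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM prod.users WHERE last_active > ?", (cutoff,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
# Attach a second in-memory database so the table can be called prod.users.
conn.execute("ATTACH DATABASE ':memory:' AS prod")
conn.execute(
    "CREATE TABLE prod.users (user_id INTEGER PRIMARY KEY, last_active TEXT)"
)
# Hypothetical activity dates; user 4 has never been active (NULL).
conn.executemany(
    "INSERT INTO prod.users VALUES (?, ?)",
    [(1, "2017-08-20"), (2, "2017-07-01"), (3, "2016-12-31"), (4, None)],
)

# The same question asked with a 1-month, 2-month, and 1-year window
# measured back from the talk's date.
active_counts = {
    cutoff: count_active_since(conn, cutoff)
    for cutoff in ("2017-07-24", "2017-06-24", "2016-08-24")
}
print(active_counts)
conn.close()
```

Each window is a defensible definition of "active", and each produces a different count from the same table.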
user_id  email
12334    [email protected]
38602    [email protected]
52981    [email protected]
67640    [email protected]
¯\_(ツ)_/¯
What are you not even aware of?
Okay, I get it. But what's the big deal?
26% of professional computing jobs were held by women in 2016
Numbers give you authority and the appearance of objectivity
Counting is power.
Counts can determine funding, set agendas, and shift priorities
"Machine learning is like money laundering for bias" (Maciej Cegłowski, founder of @Pinboard)
What you count determines what is important.