Paradoxes and theorems every developer should know
Joshua Thijssen
June 21, 2016
Transcript
Joshua Thijssen (jaytaph)
Consultant and trainer @ NoxLogic
Founder of TechAnalyze.io
Symfony Rainbow Books author
Mastering the SPL author
Blog: http://adayinthelifeof.nl
Email: [email protected]
Twitter: @jaytaph
https://dutchtechrecruitment.nl/
Disclaimer: I'm not a (mad) scientist nor a mathematician.
German Tank Problem
Observed serial numbers: 53, 72, 8, 15
k = number of elements
m = largest number
Estimate: m + (m / k) - 1
72 + (72 / 4) - 1 = 89
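The arithmetic above can be reproduced in a few lines. This is a minimal Python sketch, assuming the four serial numbers shown on the slides and the common estimator m + m/k - 1; the function name `estimate_total` is made up for illustration.

```python
def estimate_total(serials):
    """Estimate the population size from a sample of observed serial numbers."""
    k = len(serials)   # number of observed elements
    m = max(serials)   # largest observed number
    return m + m / k - 1


print(estimate_total([53, 72, 8, 15]))  # 89.0
```

The estimate only depends on the sample size and the largest value seen, which is why a handful of leaked IDs is already enough to make a guess.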
Month         Intelligence  Statistics  Actual
June 1940     1000          169         122
June 1941     1550          244         271
August 1942   1550          327         342
Source: https://en.wikipedia.org/wiki/German_tank_problem
➡ Data leakage.
➡ User IDs, invoice IDs, etc.
➡ Used to approximate the number of iPhones sold in 2008.
➡ Calculate approximations of datasets with (incomplete) information.
➡ Avoid leaking (semi-)sequential data.
➡ Adding randomness and offsets will NOT solve the issue.
➡ Use UUIDs (better: time-based short IDs, you don't need UUIDs).
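As a rough illustration of the advice on IDs, here is a small Python sketch using the standard `uuid` module. The invoice number is a hypothetical value, not something from the talk.

```python
import uuid

# Hypothetical sequential invoice number: anyone who sees it can guess
# roughly how many invoices exist.
sequential_id = 1042

# A random (version 4) UUID says nothing about how many records exist.
random_id = uuid.uuid4()

print(sequential_id, random_id)
```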
Collecting (big) data is easy.
Analyzing big data is the hard part.
Confirmation Bias
2 4 6
Z = {…, −2, −1, 0, 1, 2, …}
21%
5 8 ? ?
If a card shows an even number on one face, then its opposite face is blue.
< 10%
coke beer 35 17
If you drink beer then you must be 18 years or older.
Cognitive adaptation for social exchange
Hint: try to place your "technical problem" in a more social context.
BDD
5 8 ? ?
If a card shows an even number on one face, then its opposite face is blue.
TESTING
➡ Step 1: Write code
➡ Step 2: Write tests
➡ Step 3: Profit
public function isLeapYear($year) {
    return ($year % 4 == 0);
}

testIs1996ALeapYear();
testIs2000ALeapYear();
testIs2004ALeapYear();
testIs2008ALeapYear();
testIs2012ALeapYear();
testIs1997NotALeapYear();
testIs1998NotALeapYear();
testIs2001NotALeapYear();
testIs2013NotALeapYear();

https://www.sundoginteractive.com/blog/confirmation-bias-in-unit-testing
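To see why those nine tests give false confidence, here is a Python sketch that mirrors the PHP snippet above and compares it with the Gregorian rule. The function names are made up for this example, and 1900 is picked as a year the listed tests never touch.

```python
def is_leap_year_naive(year):
    # Same logic as the PHP snippet: divisible by 4 and nothing else.
    return year % 4 == 0


def is_leap_year(year):
    # Gregorian rule: divisible by 4, except century years not divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


leap = [1996, 2000, 2004, 2008, 2012]
not_leap = [1997, 1998, 2001, 2013]

# All nine confirming tests pass for the naive version.
assert all(is_leap_year_naive(y) for y in leap)
assert not any(is_leap_year_naive(y) for y in not_leap)

# A test that tries to disprove the code exposes the bug.
print(is_leap_year_naive(1900), is_leap_year(1900))  # True False
```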
➡ Tests were written based on actual code.
➡ Tests were written to CONFIRM actual code, not to DISPROVE actual code!
TDD
➡ Step 1: Write tests
➡ Step 2: Write code
➡ Step 3: Profit, as this is less prone to confirmation bias (there is nothing to be biased by yet!)
Birthday paradox
Question: how many people do you need for a > 50% chance that two of them share a birthday?
4 March, 18 September, 5 December, 25 July, 2 February, 9 October
23 people
366 people = 100% (with 365 possible birthdays)
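The 23-people figure can be checked numerically. This is a minimal Python sketch, assuming 365 equally likely birthdays and ignoring leap days; the function name is illustrative.

```python
def shared_birthday_probability(people, days=365):
    """Probability that at least two of `people` share a birthday."""
    if people > days:
        return 1.0
    p_unique = 1.0
    for i in range(people):
        # Person i must avoid the i birthdays that are already taken.
        p_unique *= (days - i) / days
    return 1.0 - p_unique


print(round(shared_birthday_probability(22), 3))  # 0.476
print(round(shared_birthday_probability(23), 3))  # 0.507
print(shared_birthday_probability(366))           # 1.0
```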
Collisions occur more often than you realize
Hash collisions
16 bits means about 300 values before > 50% collision probability
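The same birthday calculation applies to hashes. The Python sketch below assumes hash values are uniformly random, which real hash functions only approximate, and uses a made-up function name.

```python
def values_until_likely_collision(bits, threshold=0.5):
    """Smallest number of random values for which P(collision) exceeds threshold."""
    space = 2 ** bits
    p_unique = 1.0
    n = 0
    while 1.0 - p_unique <= threshold:
        # The next value must avoid the n values already drawn.
        p_unique *= (space - n) / space
        n += 1
    return n


print(values_until_likely_collision(16))  # roughly 300
print(values_until_likely_collision(32))  # grows with the square root of the hash space
```

With the default threshold, the result grows with the square root of the hash space, so doubling the number of bits squares the number of values you can store before a collision becomes likely.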
Watch out for:
➡ Too small hashes.
➡ Unique data.
➡ Your data might be less "protected" than you might think.
Heisenberg uncertainty principle
It's not about Star Trek (Heisenberg compensators)
nor crystal meth
σx · σp ≥ ħ / 2
x: position
p: momentum (mass × velocity)
ħ: 1.054571800E-34 J·s
The more precisely you know one property, the less you know the other.
This is NOT about observing!
Observer effect: heisenbug
It's about trade-offs
Benford's law
Numbers beginning with 1 are more common than numbers beginning with 9.
Default behavior for naturally occurring numbers.
find . -name \*.php -exec wc -l {} \; | sort | cut -b 1 | uniq -c

1073 1
 886 2
 636 3
 372 4
 352 5
 350 6
 307 7
 247 8
 222 9
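For comparison, Benford's law predicts a share of log10(1 + 1/d) for leading digit d. The Python sketch below sets the counts from the command output above against that prediction; it is only an illustration, not a statistical test.

```python
import math

# Counts per leading digit, copied from the command output above.
observed = {1: 1073, 2: 886, 3: 636, 4: 372, 5: 352, 6: 350, 7: 307, 8: 247, 9: 222}
total = sum(observed.values())


def benford_expected(digit):
    """Expected share of numbers with this leading digit under Benford's law."""
    return math.log10(1 + 1 / digit)


for digit, count in observed.items():
    print(f"{digit}: observed {count / total:.1%}, Benford {benford_expected(digit):.1%}")
```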
Bayesian filtering
What's the probability of an event, based on conditions that might be related to the event?
What is the chance that a message is spam when it contains certain words?
P(A|B) = P(B|A) · P(A) / P(B)
P(A|B): probability of event A, given event B (conditional)
P(A): probability of event A
P(B): probability of event B
P(B|A): probability of event B, given event A
➡ Figure out the probability that a {mail, tweet, comment, review} is {spam, negative}, etc.
➡ 10 out of 50 comments are "negative".
➡ 25 out of 50 comments use the word "horrible".
➡ 8 comments with the word "horrible" are marked as "negative".
[Venn diagram: negative (10 comments), "horrible" (25 comments), overlap (8 comments)]
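Plugging the comment counts into Bayes' theorem gives the chance that a comment containing "horrible" is negative. This is a Python sketch using exact fractions; the variable names are chosen for this example only.

```python
from fractions import Fraction

total_comments = 50
negative = 10               # comments marked "negative"
with_horrible = 25          # comments containing "horrible"
negative_and_horrible = 8   # comments that are both

p_negative = Fraction(negative, total_comments)                        # P(A)
p_horrible = Fraction(with_horrible, total_comments)                   # P(B)
p_horrible_given_negative = Fraction(negative_and_horrible, negative)  # P(B|A)

# Bayes: P(A|B) = P(B|A) * P(A) / P(B)
p_negative_given_horrible = p_horrible_given_negative * p_negative / p_horrible
print(p_negative_given_horrible, float(p_negative_given_horrible))  # 8/25 0.32
```

The result matches the direct count: 8 of the 25 comments containing "horrible" are negative.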
➡ More words?
➡ Complex algorithm,
➡ but we can assume that words are independent of each other
➡ Naive Bayes approach
We must know beforehand which comments are negative?
TRAINING SET
Negative: "Your product is horrible and does not work properly. Also, you suck."
Positive: "I had a horrible experience with another product. But yours really worked well. Thank you!"
➡ You might want to filter stop-words first.
➡ You might want to make sure negations are handled properly: "not great" => negative.
➡ Bonus points if you can spot sarcasm.
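A toy version of the naive Bayes approach, trained on the two example comments, could look like the Python sketch below. The stop-word list, the function names and the add-one smoothing are assumptions made for this illustration. It does not handle negation or sarcasm.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "is", "i", "with", "but", "also"}

TRAINING_SET = [
    ("negative", "Your product is horrible and does not work properly. Also, you suck."),
    ("positive", "I had a horrible experience with another product. "
                 "But yours really worked well. Thank you!"),
]


def tokenize(text):
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def train(examples):
    word_counts = {}
    doc_counts = Counter()
    for label, text in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    return word_counts, doc_counts


def classify(text, word_counts, doc_counts):
    vocabulary = set().union(*word_counts.values())
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


word_counts, doc_counts = train(TRAINING_SET)
print(classify("you suck", word_counts, doc_counts))                       # negative
print(classify("worked really well, thank you", word_counts, doc_counts))  # positive
```

Note that "horrible" appears in both training comments, so on its own it does not help this toy classifier tell the two classes apart.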
➡ Collaborative filtering (Mahout):
➡ If a user likes products A, B and C, what is the chance that they like product D?
Mess up your (training) data, and nothing can save you (except a training set reboot)
Binomial probability
➡ 30% chance of acceptance for a CFP
➡ 5 CFPs
1 - (0.7 * 0.7 * 0.7 * 0.7 * 0.7) = 1 - 0.168 = 0.832
83% chance of getting selected at least once!
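The same number falls out of the binomial distribution. This Python sketch assumes the five submissions are independent and each has a 30% acceptance chance; `binomial_pmf` is a helper written for this example and relies on `math.comb` (Python 3.8+).

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 5, 0.3
p_none = binomial_pmf(0, n, p)   # all five CFPs rejected
print(round(1 - p_none, 3))      # 0.832

# Full distribution: chance of exactly k acceptances out of 5.
for k in range(n + 1):
    print(k, round(binomial_pmf(k, n, p), 3))
```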
http://farm1.static.flickr.com/73/163450213_18478d3aa6_d.jpg
Find me on twitter: @jaytaph
Find me for development and training: www.noxlogic.nl / www.techademy.nl
Find me on email: [email protected]
Find me for blogs: www.adayinthelifeof.nl