Bayesian Statistical Analysis: A Gentle Introduction
Get to know the Reverend Bayes.
Chris Fonnesbeck
December 05, 2011
Transcript
Bayesian Statistical Analysis: A Gentle Introduction
Center for Quantitative Sciences Workshop, 18 November 2011
Christopher J. Fonnesbeck
Monday, December 5, 11
What is Bayesian Inference?
Practical methods for making inferences from data using probability models for quantities we observe and about which we wish to learn. (Gelman et al., 2004)
Rev. Thomas Bayes, Simon Laplace
Conclusions in terms of probability statements: $p(\theta \mid y)$, with unknowns $\theta$ and observations $y$
Classical inference conditions on the unknown parameter: $p(y \mid \theta)$, with unknowns $\theta$ and observations $y$
Classical vs Bayesian Statistics
Frequentist
Frequentist: observations random
Frequentist: model, parameters fixed
Frequentist Inference
Choose an estimator based on frequentist (asymptotic) criteria: $\hat{\mu} = \frac{\sum x_i}{n}$
Choose a test statistic based on frequentist (asymptotic) criteria: $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
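As a minimal sketch, the estimator $\hat{\mu}$ and the test statistic $t$ from the two slides above can be computed with Python's standard library. The sample values and the hypothesized mean `mu0` are made up for illustration and do not come from the talk.

```python
import math
import statistics

def sample_mean(xs):
    """Estimator mu_hat = sum(x_i) / n."""
    return sum(xs) / len(xs)

def t_statistic(xs, mu0):
    """t = (xbar - mu0) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(xs)
    s = statistics.stdev(xs)  # divides by n - 1
    return (sample_mean(xs) - mu0) / (s / math.sqrt(n))

# Hypothetical data, chosen only to exercise the formulas.
sample = [4.8, 5.1, 5.3, 4.9, 5.4]
mu_hat = sample_mean(sample)
t = t_statistic(sample, mu0=5.0)
print(f"mu_hat = {mu_hat:.3f}, t = {t:.3f}")
```

In the frequentist reading, `mu0` is a fixed value and the randomness sits entirely in `sample`.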
Bayesian
Bayesian: observations fixed
Bayesian: model, parameters “random”
Components of Bayesian Statistics
1. Specify full probability model: $\Pr(y \mid \theta)\,\Pr(\theta \mid \Theta)\,\Pr(\Theta)$
data $y$, covariates $X$, parameters $\theta$, missing data $\tilde{y}$
2. Calculate posterior distribution: $\Pr(\theta \mid y)$
3. Check model for lack of fit
Why Bayes?
“... the Bayesian approach is attractive because it is useful. Its usefulness derives in large measure from its simplicity. Its simplicity allows the investigation of far more complex models than can be handled by the tools in the classical toolbox.” Link and Barker (2010)
coherence: $X$, $\tilde{y}$, $y$, $\theta$
Interpretation
Confidence Interval: $\Pr\left(\bar{Y} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{Y} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$; $\Pr(a(Y) < \theta < b(Y) \mid \theta) = 0.95$
Credible Interval: $\Pr(a(y) < \theta < b(y) \mid Y = y) = 0.95$
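The confidence statement conditions on the parameter and treats the data $Y$ as random, so its 0.95 refers to the share of intervals that cover the fixed value across repeated datasets. A small simulation can make that reading concrete. This is only a sketch: the normal data model, the known `sigma`, and all numeric settings are assumptions chosen for illustration.

```python
import math
import random

def confidence_interval(sample, sigma):
    """Interval ybar +/- 1.96 * sigma / sqrt(n), assuming sigma is known."""
    n = len(sample)
    ybar = sum(sample) / n
    half_width = 1.96 * sigma / math.sqrt(n)
    return ybar - half_width, ybar + half_width

def coverage(mu, sigma, n, n_repeats, seed=0):
    """Fraction of repeated datasets whose interval contains the fixed mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = confidence_interval(sample, sigma)
        if lower < mu < upper:
            hits += 1
    return hits / n_repeats

# Hypothetical settings; the rate should land close to 0.95.
rate = coverage(mu=10.0, sigma=2.0, n=25, n_repeats=4000)
print(f"coverage over repeated samples: {rate:.3f}")
```

The credible interval makes a different statement: it conditions on the one observed dataset $Y = y$ and assigns the probability to $\theta$.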
Uncertainty
Complex Models [figure: graphs of model nodes, including alpha, beta, mu, psi, pi, z, N, Ntotal, Ndist, occupied, a, b, a_psi, b_psi, C]
Probability
(1) classical: $\Pr(A) = \frac{m}{n}$, where $A$ = an event of interest, $m$ = no. of favourable outcomes, $n$ = total no. of possible outcomes
all elementary events are equally likely
(2) frequentist: $\Pr(A) = \lim_{n \to \infty} \frac{m}{n}$, where $n$ = no. of identical and independent trials, $m$ = no. of times $A$ has occurred
Between 1745 and 1770 there were 241,945 girls and 251,527 boys born in Paris
$A$ = “Chris has Type A blood”
$A$ = “Titans will win Superbowl XLVI”
$A$ = “The prevalence of diabetes in Nashville is > 0.15”
(3) subjective: $\Pr(A)$
$\Pr(A)$: measure of one’s uncertainty regarding the occurrence of $A$
$\Pr(A \mid H)$
$A$ = “It is raining in Atlanta”
$\Pr(A \mid H) = 0.5$
$\Pr(A \mid H) = \begin{cases} 0.4 & \text{if raining in Nashville} \\ 0.25 & \text{otherwise} \end{cases}$
$\Pr(A \mid H) = \begin{cases} 1 & \text{if raining} \\ 0 & \text{otherwise} \end{cases}$
[Venn diagram: event $A$ in sample space $S$] $\Pr(A) = \frac{\text{area of } A}{\text{area of } S}$
[Venn diagram: $A$, $B$ and $A \cap B$ in $S$] $\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B)$
conditional probability: $\Pr(B \mid A) = \frac{\Pr(A \cap B)}{\Pr(A)}$
Independence: $\Pr(B \mid A) = \Pr(B)$
$\Pr(A \mid B) = \frac{\Pr(A \cap B)}{\Pr(B)}$, $\Pr(B \mid A) = \frac{\Pr(A \cap B)}{\Pr(A)}$
$\Pr(A \cap B) = \Pr(A \mid B)\Pr(B) = \Pr(B \mid A)\Pr(A)$
Bayes Theorem: $\Pr(B \mid A) = \frac{\Pr(A \mid B)\Pr(B)}{\Pr(A)}$
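The addition rule, conditional probability, and Bayes' theorem can be checked by direct counting under the classical definition. The sketch below assumes a fair six-sided die as the sample space; the die and the events `A` and `B` are illustrative choices, not examples from the talk.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair die (equally likely outcomes).
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # hypothetical event: roll is even
B = {4, 5, 6}   # hypothetical event: roll is greater than 3

def pr(event):
    """Classical probability: favourable outcomes / possible outcomes."""
    return Fraction(len(event & S), len(S))

def pr_given(event, given):
    """Conditional probability Pr(event | given) = Pr(both) / Pr(given)."""
    return pr(event & given) / pr(given)

union = pr(A | B)
addition_rule = pr(A) + pr(B) - pr(A & B)
bayes = pr_given(A, B) * pr(B) / pr(A)   # should equal Pr(B | A)

print(union, addition_rule)      # both sides of the addition rule
print(bayes, pr_given(B, A))     # both sides of Bayes' theorem
```

Using `Fraction` keeps the arithmetic exact, so the two sides of each identity can be compared with `==`.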
Bayes Theorem: $\Pr(\theta \mid y) = \frac{\Pr(y \mid \theta)\Pr(\theta)}{\Pr(y)}$, where $\Pr(\theta \mid y)$ = posterior probability, $\Pr(\theta)$ = prior probability, $\Pr(y \mid \theta)$ = likelihood of observations, $\Pr(y)$ = normalizing constant
Bayes Theorem: $\Pr(\theta \mid y) = \frac{\Pr(y \mid \theta)\Pr(\theta)}{\int \Pr(y \mid \theta)\Pr(\theta)\,d\theta}$
“proportional to”: $\Pr(\theta \mid y) \propto \Pr(y \mid \theta)\Pr(\theta)$
$\Pr(\theta \mid y) \propto \Pr(y \mid \theta)\Pr(\theta)$: posterior $\propto$ likelihood $\times$ prior
information: $p(\theta \mid y) \propto p(y \mid \theta)\,p(\theta)$
“Following observation of $y$, the likelihood $L(\theta \mid y)$ contains all experimental information from $y$ about the unknown $\theta$.”
binomial model, sampling distribution of $X$ (data $X = x$, parameter $\theta$): $p(X \mid \theta) = \binom{N}{x} \theta^{x} (1 - \theta)^{N - x}$
binomial model, likelihood function for $\theta$: $L(\theta \mid X) = \binom{N}{x} \theta^{x} (1 - \theta)^{N - x}$
prior distribution: $p(\theta \mid y) \propto p(y \mid \theta)\,p(\theta)$
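One way to see "posterior is proportional to likelihood times prior" in action is a grid approximation for the binomial model: evaluate the product on a grid of $\theta$ values and normalize by the sum, which stands in for the integral in the denominator of Bayes' theorem. This is a sketch under assumptions: the counts `x=3`, `n=10`, the flat prior, and the grid size are all hypothetical.

```python
import math

def binomial_likelihood(theta, x, n):
    """L(theta | x) = C(n, x) * theta^x * (1 - theta)^(n - x)."""
    return math.comb(n, x) * theta**x * (1 - theta) ** (n - x)

def grid_posterior(x, n, prior, n_points=1001):
    """Normalized posterior weights on an evenly spaced grid over [0, 1]."""
    grid = [i / (n_points - 1) for i in range(n_points)]
    unnormalized = [binomial_likelihood(t, x, n) * prior(t) for t in grid]
    total = sum(unnormalized)  # plays the role of the normalizing constant
    weights = [u / total for u in unnormalized]
    return grid, weights

# Hypothetical data (3 successes in 10 trials) with an assumed flat prior.
grid, weights = grid_posterior(x=3, n=10, prior=lambda theta: 1.0)
posterior_mean = sum(t * w for t, w in zip(grid, weights))
print(f"posterior mean of theta: {posterior_mean:.4f}")
```

With a flat prior the exact posterior is Beta(x + 1, n - x + 1), so the grid mean should be close to (x + 1) / (n + 2).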
Prior as population distribution
Prior as information state
All plausible values
Between 1745 and 1770 there were 241,945 girls and 251,527 boys born in Paris
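The Paris birth counts can be turned into a posterior for the proportion of female births. The sketch below assumes a binomial likelihood and a uniform Beta(1, 1) prior over all values in [0, 1], which gives a Beta posterior in closed form; the prior choice and the normal approximation used for the tail probability are assumptions made here for illustration, not steps shown on the slides.

```python
import math

girls, boys = 241_945, 251_527

# Assumption: uniform Beta(1, 1) prior, so the posterior is
# Beta(girls + 1, boys + 1).
a = girls + 1
b = boys + 1

post_mean = a / (a + b)
post_sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))

# Normal approximation to the posterior (reasonable for counts this large)
# for the probability that the proportion of girls is at least 0.5.
z = (0.5 - post_mean) / post_sd
prob_at_least_half = 0.5 * math.erfc(z / math.sqrt(2))

print(f"posterior mean = {post_mean:.5f}, sd = {post_sd:.5f}")
print(f"Pr(theta >= 0.5 | y) is approximately {prob_at_least_half:.2e}")
```

With nearly half a million births, the posterior is concentrated so tightly below 0.5 that the choice among reasonable priors has little effect on the conclusion.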
Bayesian analysis is subjective
Statistical analysis is subjective
“... all forms of statistical inference make assumptions, assumptions which can only be tested very crudely and can almost never be verified.” - Robert E. Kass
3. Model checking
[Figure: separation; $p(x)$ plotted against $x$]
source: Gelman et al. 2008
[Figure: weakly-informative prior; $\Pr(x)$ plotted against xrange]
source: Gelman et al. 2008
example: genetic probabilities
X-linked recessive
[Pedigree: Woman, Husband, Brother, Mother; legend: affected, carrier, no gene, unknown] Is the woman a carrier?
$\Pr(\theta = 1) = \Pr(\theta = 0) = \frac{1}{2}$; prior odds: $\frac{\Pr(\theta = 1)}{\Pr(\theta = 0)} = 1$
[Pedigree: Woman, Husband, Brother, Mother, Son, Son; legend: affected, carrier, no gene, unknown]
$\Pr(y_1 = 0, y_2 = 0 \mid \theta = 1) = (0.5)(0.5) = 0.25$
$\Pr(y_1 = 0, y_2 = 0 \mid \theta = 0) = 1$
“likelihood ratio”: $\frac{p(y_1 = 0, y_2 = 0 \mid \theta = 1)}{p(y_1 = 0, y_2 = 0 \mid \theta = 0)} = \frac{0.25}{1} = 1/4$
what about Mom?
With $y = \{y_1 = 0, y_2 = 0\}$: $\Pr(\theta = 1 \mid y) = \frac{\Pr(y \mid \theta = 1)\Pr(\theta = 1)}{\Pr(y)} = \frac{\Pr(y \mid \theta = 1)\Pr(\theta = 1)}{\sum_{\theta} \Pr(y \mid \theta)\Pr(\theta)}$
$\Pr(\theta = 1 \mid y) = \frac{p(y \mid \theta = 1)\Pr(\theta = 1)}{p(y \mid \theta = 1)\Pr(\theta = 1) + p(y \mid \theta = 0)\Pr(\theta = 0)} = \frac{(0.25)(0.5)}{(0.25)(0.5) + (1.0)(0.5)} = \frac{0.125}{0.625} = 0.2$
3rd unaffected son? $\Pr(\theta = 1 \mid y_3) = \frac{(0.5)(0.2)}{(0.5)(0.2) + (1)(0.8)} = 0.111$, where 0.2 is the posterior from the previous step
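The carrier calculation is a two-hypothesis application of Bayes' theorem, and the third-son step reuses the earlier posterior as the new prior. A minimal sketch of both steps follows; the function name `update_carrier` is a hypothetical helper, and the numbers are the ones given on the slides.

```python
def update_carrier(prior_carrier, lik_if_carrier, lik_if_not_carrier):
    """Posterior Pr(theta = 1 | y) for the two hypotheses theta in {0, 1}."""
    numerator = lik_if_carrier * prior_carrier
    evidence = numerator + lik_if_not_carrier * (1 - prior_carrier)
    return numerator / evidence

# Two unaffected sons: likelihood 0.5 * 0.5 if carrier, 1 if not.
after_two_sons = update_carrier(0.5, 0.5 * 0.5, 1.0)

# Third unaffected son: yesterday's posterior becomes today's prior.
after_three_sons = update_carrier(after_two_sons, 0.5, 1.0)

# Same answer when all three sons are processed in one step.
all_at_once = update_carrier(0.5, 0.5**3, 1.0)

print(round(after_two_sons, 3), round(after_three_sons, 3), round(all_at_once, 3))
```

The agreement between the sequential and one-step results reflects that the sons' outcomes are treated as conditionally independent given the mother's carrier status.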
Hierarchical Models
example: effectiveness of cardiac surgery
Hospital  Operations  Deaths
A         47          0
B         148         18
C         119         8
D         810         46
E         211         8
F         196         13
G         148         9
H         215         31
I         207         14
J         97          8
K         256         29
L         360         24
clustering induces dependence between observations
parameters sampled from common distribution: $\theta_j$ = hospital $j$ survival rate
population distribution: $\theta_j \sim f(\Theta)$, with hyperparameters $\Theta$
[Diagram: $\varphi_\mu, \varphi_\sigma$; hyperparameters $\mu, \sigma^2$; parameters $\theta_1, \theta_2, \ldots, \theta_k$; deaths $y_1, y_2, \ldots, y_k$]
non-hierarchical models of hierarchical data can easily be underfit or overfit
“experiments” $j = 1, \ldots, J$
likelihood: $\text{deaths}_j \sim \text{Binomial}(\text{operations}_j, \theta_j)$
population model: $\text{logit}(\theta_j) \sim N(\mu, \sigma^2)$
priors: $\mu \sim P_\mu$, $\sigma^2 \sim P_\sigma$
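To make the layered structure concrete, the sketch below evaluates the log of the likelihood and population-model terms for given values of $\mu$, $\sigma^2$, and the hospital-level parameters on the logit scale. It is not a fitting procedure, and it leaves out the hyperpriors $P_\mu$ and $P_\sigma$ because the slides do not specify them. The function names, the subset of four hospitals, and the trial parameter values are assumptions for illustration.

```python
import math

def inv_logit(x):
    """Map a logit-scale value back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def log_binomial(deaths, operations, theta):
    """Log of the Binomial(operations, theta) pmf at `deaths`."""
    return (math.log(math.comb(operations, deaths))
            + deaths * math.log(theta)
            + (operations - deaths) * math.log1p(-theta))

def log_normal(x, mu, sigma2):
    """Log density of N(mu, sigma2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma2) - (x - mu) ** 2 / (2 * sigma2)

def log_joint(mu, sigma2, logit_thetas, operations, deaths):
    """Sum of population-model and likelihood terms (hyperpriors omitted)."""
    total = 0.0
    for logit_theta, n, y in zip(logit_thetas, operations, deaths):
        total += log_normal(logit_theta, mu, sigma2)        # population model
        total += log_binomial(y, n, inv_logit(logit_theta))  # likelihood
    return total

# Hospitals A-D from the table; the parameter values are arbitrary guesses.
operations = [47, 148, 119, 810]
deaths = [0, 18, 8, 46]
logit_thetas = [-3.0, -2.0, -2.6, -2.8]
value = log_joint(-2.5, 0.5, logit_thetas, operations, deaths)
print(f"log joint (up to hyperpriors): {value:.3f}")
```

A sampler or optimizer would evaluate a function like `log_joint`, plus the log hyperprior terms, many times; here it only shows how the three levels of the model combine.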
0/47 = 0, 18/148 = 0.12, 8/119 = 0.07, 46/810 = 0.06
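The raw rates above are deaths divided by operations for each hospital. As a quick sketch, the same arithmetic can be run over the full table from the cardiac surgery example, together with a single pooled rate across all hospitals. The pooled rate is added here for comparison and is not shown on the slides.

```python
# (operations, deaths) per hospital, copied from the table in the talk.
hospitals = {
    "A": (47, 0),    "B": (148, 18), "C": (119, 8),  "D": (810, 46),
    "E": (211, 8),   "F": (196, 13), "G": (148, 9),  "H": (215, 31),
    "I": (207, 14),  "J": (97, 8),   "K": (256, 29), "L": (360, 24),
}

# Separate estimate per hospital: deaths / operations.
raw_rates = {name: d / n for name, (n, d) in hospitals.items()}

# One estimate for everyone: total deaths / total operations.
total_operations = sum(n for n, _ in hospitals.values())
total_deaths = sum(d for _, d in hospitals.values())
pooled_rate = total_deaths / total_operations

for name, rate in raw_rates.items():
    print(f"{name}: {rate:.2f}")
print(f"pooled: {pooled_rate:.3f}")
```

Hospital A's raw rate of exactly 0 from 47 operations is the kind of estimate a hierarchical model would pull toward the rates of the other hospitals.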