[…] about what "causal influence" means here is to imagine changing […] which other variables also change as a consequence. For example, […] of globe tosses N, then both W and L might change. But p would […] and L but not p. We draw that like this:

[Diagram: nodes p, N, W, L, with arrows from N into W and L]

Generative model of the globe
Begin conceptually: How do the variables influence one another?
N influences W and L
[…] L directly, for example by manipulating the […] on p or N. But changing p, by for example erasing a random con[…], […] influence W and L, at least on average. So we need two more arrows […]

[Diagram: nodes p, N, W, L, with arrows from both p and N into W and L]

[…] causal diagram of the globe tossing experiment. There are some […]
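One way to keep the diagram honest is to write its arrows down as data. The sketch below is only an illustration in base R: the names `dag` and `parents_of` are hypothetical and do not come from the text, and the edge list simply encodes the four arrows described above (N and p each pointing into W and L).

```r
# The four arrows of the globe-tossing diagram as a plain edge list.
# `dag` and `parents_of` are illustrative names, not from the text.
dag <- data.frame(
  from = c("N", "N", "p", "p"),
  to   = c("W", "L", "W", "L")
)

# Direct causes of a node: every variable with an arrow into it
parents_of <- function(node) dag$from[dag$to == node]

# Variables with no incoming arrow: nothing else in the diagram changes them
roots <- setdiff(unique(dag$from), dag$to)

parents_of("W")
roots
```

Reading the diagram this way makes the asymmetry explicit: W and L each have two direct causes, while p and N have none inside the diagram.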
[…] a function can also make testing easier. Here's a very simple function that simulates sampling nine times from a globe with a water […]

R code
    # function to toss a globe covered p by water N times
    sim_globe <- function( p=0.7 , N=9 ) {
        sample(c("W","L"),size=N,prob=c(p,1-p),replace=TRUE)
    }

Nothing happens until we call the function by its name:

R code
    sim_globe()

    [1] "L" "W" "W" "W" "L" "L" "L" "W" "L"

Repeat calling the function to see that it simulates a different sample each time. And by naming the proportion of water p and number of tosses N in the function definition, we can easily change these values when we call the function.

[Slide annotations on the sample() call: possible observations; number of tosses; probability of each possible observation]
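Because p and N are named arguments, the simulation can be checked against cases where the answer is known in advance. The following is a minimal sketch of two such checks, not a complete test suite. The seed, the number of tosses, and the variable names `all_water` and `long_run` are arbitrary choices made here for illustration.

```r
# Same simulation function as above, repeated so this block runs on its own
sim_globe <- function(p = 0.7, N = 9) {
  sample(c("W", "L"), size = N, prob = c(p, 1 - p), replace = TRUE)
}

set.seed(1)  # arbitrary seed, only for reproducibility

# Extreme setting: a globe that is all water should never produce "L"
all_water <- sim_globe(p = 1, N = 11)

# Long run: with many tosses the share of "W" should be close to p
long_run <- mean(sim_globe(p = 0.5, N = 1e4) == "W")

all_water
long_run
```

If either check fails, the bug is in the simulation, and it is found before any estimator is involved.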
    # compute posterior distribution
    compute_posterior <- function( the_sample , poss=c(0,0.25,0.5,0.75,1) ) {
        W <- sum(the_sample=="W") # number of W observed
        L <- sum(the_sample=="L") # number of L observed
        ways <- sapply( poss , function(q) (q*4)^W * ((1-q)*4)^L )
        post <- ways/sum(ways)
        bars <- sapply( post, function(q) make_bar(q) )
        data.frame( poss , ways , post=round(post,3) , bars )
    }

To use this function, you need to give it a sample. And we can just embed the previous simulation function inside it.

Ways for p to produce W,L = $(4p)^W \times (4-4p)^L$
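The function above calls `make_bar`, which is not defined in this excerpt. The sketch below fills it in with a hypothetical stand-in that draws a text bar proportional to the probability (the real helper may differ), then feeds `compute_posterior` first a simulated sample and then a fixed sample small enough to check by hand.

```r
# Simulation function from earlier, repeated so this block runs on its own
sim_globe <- function(p = 0.7, N = 9) {
  sample(c("W", "L"), size = N, prob = c(p, 1 - p), replace = TRUE)
}

# ASSUMPTION: make_bar is not shown in the excerpt. This stand-in draws
# one "#" per 5% of posterior probability.
make_bar <- function(q, width = 20) {
  strrep("#", round(q * width))
}

compute_posterior <- function(the_sample, poss = c(0, 0.25, 0.5, 0.75, 1)) {
  W <- sum(the_sample == "W")  # number of W observed
  L <- sum(the_sample == "L")  # number of L observed
  # ways each candidate proportion could produce the sample: (4q)^W * (4-4q)^L
  ways <- sapply(poss, function(q) (q * 4)^W * ((1 - q) * 4)^L)
  post <- ways / sum(ways)     # normalize so the probabilities sum to 1
  bars <- sapply(post, function(q) make_bar(q))
  data.frame(poss, ways, post = round(post, 3), bars)
}

# Embed the simulation directly in the call
result <- compute_posterior(sim_globe())

# A fixed sample that can be verified by counting: W, L, W
fixed <- compute_posterior(c("W", "L", "W"))
fixed
```

For the fixed sample W = 2 and L = 1, so the ways are 0, 3, 8, 9, 0 across the five candidate proportions, which normalize to 0, 0.15, 0.40, 0.45, 0.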
[Figure: posterior probability against proportion water (p), one panel per observation as the sample W L W W W L W L W accumulates]
[…] 89% intervals, but we need to see the 95% intervals so we can tell whether any of the effects are robust.” […] that an arbitrary interval contains an arbitrary value is not meaningful. Use the whole distribution.
[…] it than to ignore it.
Same goes for: missing data, compliance, inclusion, etc.
Good news: Samples do not need to be representative of population in order to provide good estimates of population.
What matters is why the sample differs.