Gaussian: Abstracts from generative error model, replaces it with a normal distribution; mechanistically silent
Useful when handled with care
Many special cases: ANOVA, ANCOVA, t-test, others
From Breath of Bones: A Tale of the Golem
normal distribution
(2) Inferential: For estimating mean and variance, the normal distribution is the least informative distribution (maxent)
A variable does not have to be normally distributed for the normal model to be useful. It’s a machine for estimating mean and variance.
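A minimal sketch of the "machine for estimating mean/variance" point, assuming a deliberately skewed (exponential) variable. The data, the seed, and the names `nll`, `mu_hat`, and `sd_hat` are illustrative assumptions, not part of the original slides. A normal likelihood is fit by maximum likelihood with base R's `optim`, and the estimates are compared with the sample mean and standard deviation.

```r
set.seed(2)

# A clearly non-normal variable: exponential with true mean 3 and true sd 3
x <- rexp(1e4, rate = 1/3)

# Negative log-likelihood of a normal model.
# p[1] is the mean, p[2] is log(sd) so that the sd stays positive.
nll <- function(p) -sum(dnorm(x, mean = p[1], sd = exp(p[2]), log = TRUE))

fit <- optim(c(1, 0), nll)   # Nelder-Mead, the default method
mu_hat <- fit$par[1]
sd_hat <- exp(fit$par[2])

# Normal-model estimates next to the plain sample statistics
print(c(mu_hat = mu_hat, sample_mean = mean(x)))
print(c(sd_hat = sd_hat, sample_sd = sd(x)))
```

The normal model recovers the mean and spread even though the variable itself is far from normal; what it cannot do is describe the shape of the distribution.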
[Figure: axes height (cm) and weight (kg)]
library(rethinking)
data(Howell1)
d <- Howell1[Howell1$age>=18,]  # keep adults only
H → W
W = f(H)
“Weight is some function of height”
mass and height (length) derive from growth pattern; Gaussian variation is the result of summed fluctuations
(2) Static: Changes in height result in changes in weight, but no mechanism; Gaussian variation is the result of growth history
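The "summed fluctuations" claim can be checked with a short simulation. This is a sketch under assumed settings: 50 uniform fluctuations per individual, each in [-1, 1]. The step count, step distribution, and variable names are illustrative choices, not taken from the slides.

```r
set.seed(3)

n_people <- 1e4   # simulated individuals
n_steps  <- 50    # fluctuations summed per individual

# Each row holds one individual's fluctuations, uniform in [-1, 1]
steps <- matrix(runif(n_people * n_steps, -1, 1), nrow = n_people)
total <- rowSums(steps)

# A uniform(-1, 1) step has variance 1/3, so the sum has sd sqrt(n_steps / 3)
print(c(simulated_sd = sd(total), theoretical_sd = sqrt(n_steps / 3)))

# For a normal distribution about 68% of values lie within one sd of the mean
frac_within_1sd <- mean(abs(total) < sd(total))
print(frac_within_1sd)
```

None of the individual steps is Gaussian, yet the sums behave like draws from a normal distribution.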
influence of unobserved causes:
Generative model: H → W
H → W ← U
[Book excerpt, partly cropped] … the proportionality constant. If it’s …, for example, then a person who … kg. …usly not everyone with the same height has exactly the same weight … should reflect this. So we need to introduce some variation. We’ll use … The way I’ll do this is to simulate our unobserved U variable from th… Then we can compute a person’s weight as W = βH + U. …e code to do this …
# function to simulate weights of individuals from height
sim_weight <- function(H,b,sd) { …
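The function body is cut off in the excerpt. A minimal sketch consistent with the description (simulate U, then compute W = βH + U) might look like the following. Drawing U from a mean-zero normal distribution with standard deviation `sd` is an assumption here, as are the seed and the sample of 200 heights.

```r
# Simulate weights of individuals from height.
# ASSUMPTION: the unobserved influence U is mean-zero normal with sd `sd`.
sim_weight <- function(H, b, sd) {
  U <- rnorm(length(H), 0, sd)  # unobserved causes of weight
  W <- b * H + U                # W = beta * H + U
  return(W)
}

set.seed(4)
H <- runif(200, min = 130, max = 170)   # illustrative heights in cm
W <- sim_weight(H, b = 0.5, sd = 5)
print(head(cbind(H, W)))
```

Setting `sd = 0` removes the unobserved variation, so `sim_weight(150, b = 0.5, sd = 0)` returns exactly `b * H`.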
justifiable priors
Justify with information outside the data, like the rest of the model
Priors not so important in simple models
Very important/useful in complex models
Need to practice now: simulate, understand
[Figure: axes height (cm) and weight (kg)]
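One way to practice "simulate, understand" is a prior predictive simulation: draw parameters from the priors and look at the regression lines they imply before seeing any data. The sketch below assumes the priors a ~ Normal(0, 10) and b ~ Uniform(0, 1), which match the model fit later in this section. The number of draws and the height grid are illustrative choices.

```r
set.seed(5)

n <- 1000
a <- rnorm(n, 0, 10)   # prior draws for the intercept
b <- runif(n, 0, 1)    # prior draws for the slope

# Heights at which to evaluate each prior-implied line
H_seq <- seq(130, 170, length.out = 30)

# Row i holds the line mu = a[i] + b[i] * H evaluated over H_seq
mu <- sapply(H_seq, function(h) a + b * h)

# Range of expected weights the priors consider plausible at 150 cm
mu_150 <- a + b * 150
print(quantile(mu_150, c(0.055, 0.5, 0.945)))
```

Plotting a few dozen rows of `mu` against `H_seq` shows at a glance whether the priors allow scientifically implausible relationships.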
from scientific model
Golem might be broken
Even working golems might not deliver what you hoped
Strong test: Simulation-Based Calibration
Fahrvergnügen
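A rough sketch of the idea behind simulation-based calibration, using a toy model rather than the height/weight model: a normal mean with known standard deviation and a conjugate normal prior, so that exact posterior draws are available in base R. All settings and names here are assumptions made for illustration. Each repetition draws a "true" parameter from the prior, simulates data, draws from the posterior, and records the rank of the true value among the posterior draws. If the inference is correct, the ranks should be roughly uniform.

```r
set.seed(1)

n_sims  <- 500   # simulated data sets
n_obs   <- 20    # observations per data set
n_draws <- 99    # posterior draws per data set, so ranks fall in 0..99

prior_mean <- 0
prior_sd   <- 10
sigma      <- 5  # known observation sd (toy assumption)

ranks <- integer(n_sims)
for (i in seq_len(n_sims)) {
  mu_true <- rnorm(1, prior_mean, prior_sd)   # parameter from the prior
  y <- rnorm(n_obs, mu_true, sigma)           # data given that parameter

  # Exact conjugate posterior for the mean of a normal with known sd
  post_prec <- 1 / prior_sd^2 + n_obs / sigma^2
  post_mean <- (prior_mean / prior_sd^2 + sum(y) / sigma^2) / post_prec
  post_sd   <- sqrt(1 / post_prec)

  draws <- rnorm(n_draws, post_mean, post_sd)
  ranks[i] <- sum(draws < mu_true)            # rank of the true value
}

# Roughly equal counts per bin indicate calibrated inference
print(table(cut(ranks, breaks = seq(0, 100, by = 10), right = FALSE)))
```

A pile-up of ranks at the extremes or in the middle would signal that the posterior is too narrow, too wide, or biased.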
# simulate a sample of 10 people
H <- runif(10,130,170)
W <- sim_weight(H,b=0.5,sd=5)

# run the model
library(rethinking)
m3.1 <- quap(
    alist(
        W ~ dnorm(mu,sigma),
        mu <- a + b*H,
        a ~ dnorm(0,10),
        b ~ dunif(0,1),
        sigma ~ dunif(0,10)
    ) , data=list(W=W,H=H) )

# summary
precis( m3.1 )

       mean   sd  5.5% 94.5%
a      5.19 9.43 -9.88 20.26
b      0.49 0.07  0.38  0.59
sigma  5.64 1.29  3.57  7.71

Vary the slope and make sure the posterior mean tracks it
Use a large sample to see that it converges to the data-generating value
Same for other unknowns (parameters)
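The checks listed above can be sketched without the rethinking package by swapping in ordinary least squares (`lm`) for `quap`. This is a stand-in, not the model above: `lm` uses no priors and does not constrain the slope to [0, 1]. The helper `recover_slope`, the compact `sim_weight` definition (U assumed mean-zero normal), and the chosen slopes and sample sizes are all illustrative assumptions.

```r
# ASSUMPTION: U is mean-zero normal, so W = b*H + U
sim_weight <- function(H, b, sd) b * H + rnorm(length(H), 0, sd)

# Hypothetical helper: simulate n people with a known slope, then estimate it
recover_slope <- function(b_true, n, sd = 5) {
  H <- runif(n, 130, 170)
  W <- sim_weight(H, b = b_true, sd = sd)
  unname(coef(lm(W ~ H))["H"])   # least-squares slope, stand-in for quap
}

set.seed(6)
b_values  <- c(0.25, 0.5, 0.75)
est_small <- sapply(b_values, recover_slope, n = 10)    # small sample
est_large <- sapply(b_values, recover_slope, n = 1e4)   # large sample

print(cbind(b_values, est_small, est_large))
```

With 10 people the estimates wander around the true slopes; with 10,000 they should sit very close to them. The same exercise applies to the intercept and the residual standard deviation.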