(Re)Doing Bayesian Data Analysis
1
What’s in These Notes
I The Basics: Models, Probability, Bayes, and R
2
Credibility, Models, and Parameters
2.1
The Steps of Bayesian Data Analysis
2.1.1
R code
2.1.2
R packages
2.2
Example 1: Which coin is it?
2.2.1
Freedom of choice
2.3
Distributions
2.3.1
Beta distributions
2.3.2
Normal distributions
2.4
Example 2: Height vs Weight
2.4.1
Data
2.4.2
Describing a model for the relationship between height and weight
2.4.3
Prior
2.4.4
Posterior
2.4.5
Posterior Predictive Check
2.5
Where do we go from here?
2.6
Exercises
2.7
Footnotes
3
Some Useful Bits of R
3.1
You Gotta Have Style
3.1.1
An additional note about homework
3.2
Vectors, Lists, and Data Frames
3.2.1
Vectors
3.2.2
Lists
3.2.3
Data frames for rectangular data
3.2.4
Other types of data
3.3
Plotting with ggformula
3.4
Creating data with expand.grid()
3.5
Transforming and summarizing data with dplyr and tidyr
3.6
Writing Functions
3.6.1
Why write functions?
3.6.2
Function parts
3.6.3
The function() function has its function
3.7
Some common error messages
3.7.1
object not found
3.7.2
Any message mentioning yaml
3.8
Exercises
3.9
Footnotes
4
Probability
4.1
Some terminology
4.2
Distributions in R
4.2.1
Example: Normal distributions
4.2.2
Simulating running proportions
4.3
Joint, marginal, and conditional distributions
4.3.1
Example: Hair and eye color
4.3.2
Independence
4.4
Exercises
4.5
Footnotes
5
Bayes’ Rule and the Grid Method
5.1
The Big Bayesian Idea
5.1.1
Likelihood
5.1.2
When Bayes is easy
5.2
Estimating the bias in a coin using the Grid Method
5.2.1
Creating a Grid
5.2.2
HDI from the grid
5.2.3
Automating the grid
5.3
Working on the log scale
5.4
Discrete Parameters
5.5
Exercises
5.6
Footnotes
II Inferring a Binomial Probability
6
Inferring a Binomial Probability via Exact Mathematical Analysis
6.1
Beta distributions
6.2
Beta and Bayes
6.2.1
The Bernoulli likelihood function
6.2.2
A convenient prior
6.2.3
Pros and Cons of conjugate priors
6.3
Getting to know the Beta distributions
6.3.1
Important facts
6.3.2
Alternative parameterizations of Beta distributions
6.3.3
beta_params()
6.3.4
Automating Bayesian updates for a proportion (beta prior)
6.4
What if the prior isn’t a beta distribution?
6.5
Exercises
7
Markov Chain Monte Carlo (MCMC)
7.1
King Markov and Adviser Metropolis
7.2
Quick Intro to Markov Chains
7.2.1
More info, please
7.2.2
Definition
7.2.3
Time-Homogeneous Markov Chains
7.2.4
Matrix representation
7.2.5
Regular Markov Chains
7.3
Back to King Markov
7.4
How well does the Metropolis Algorithm work?
7.4.1
Jumping to any island
7.4.2
Jumping only to neighbor islands
7.5
Markov Chains and Posterior Sampling
7.5.1
Example 1: Estimating a proportion
7.5.2
Example 2: Estimating mean and variance
7.5.3
Issues with Metropolis Algorithm
7.6
Two coins
7.6.1
The model
7.6.2
Exact analysis
7.6.3
Metropolis
7.6.4
Gibbs sampling
7.6.5
Advantages and Disadvantages of Gibbs vs Metropolis
7.6.6
So what do we learn about the coins?
7.7
MCMC posterior sampling: Big picture
7.7.1
MCMC = Markov chain Monte Carlo
7.7.2
Posterior sampling: Random walk through the posterior
7.7.3
Where do we go from here?
7.8
Exercises
8
JAGS – Just Another Gibbs Sampler
8.1
What JAGS is
8.1.1
JAGS documentation
8.1.2
Updating C and CLANG
8.2
Example 1: estimating a proportion
8.2.1
The Model
8.2.2
Load Data
8.2.3
Specify the model
8.2.4
Run the model
8.3
Extracting information from a JAGS run
8.3.1
posterior()
8.3.2
Side note: posterior sampling and the grid method
8.3.3
Using coda
8.3.4
Using bayesplot
8.3.5
Using Kruschke’s functions
8.4
Optional arguments to jags()
8.4.1
Number and size of chains
8.4.2
Starting point for chains
8.4.3
Running chains in parallel
8.5
Example 2: comparing two proportions
8.5.1
The data
8.5.2
The model
8.5.3
Describing the model to JAGS
8.5.4
Fitting the model
8.5.5
Inspecting the results
8.5.6
Difference in proportions
8.5.7
Sampling from the prior
8.6
Exercises
9
Hierarchical Models
9.1
Gamma Distributions
9.2
One coin from one mint
9.3
Multiple coins from one mint
9.4
Multiple coins from multiple mints
9.5
Therapeutic Touch
9.5.1
Abstract
9.5.2
Data
9.5.3
A hierarchical model
9.6
Other parameterizations we might have tried
9.6.1
Shape parameters for Beta
9.6.2
Mean instead of mode
9.7
Shrinkage
9.8
Example: Baseball Batting Average
9.9
Exercises
10
(Model Comparison)
11
(NHST)
12
(Point Null Hypotheses)
13
Goals, Power, Sample Size
13.1
Intro
13.1.1
Goals
13.1.2
Obstacles
13.2
Power
13.2.1
Three ways to increase power
13.3
Calculating Power: 3 step process
13.4
Power Examples
13.4.1
Example: Catching an unfair coin
13.4.2
Example: Estimating a Proportion
13.4.3
More Complex Example
14
Stan
14.1
Why Stan might work better
14.2
Describing a model to Stan
14.3
Sampling from the prior
14.4
Exercises
15
GLM Overview
15.1
Data consists of observations of variables
15.1.1
Variable Roles
15.1.2
Types of Variables
15.2
GLM Framework
16
Estimating One and Two Means
16.1
Basic Model for Two Means
16.1.1
Data
16.1.2
Model
16.2
An Old Sleep Study
16.2.1
Data
16.2.2
Model
16.2.3
Separate standard deviations for each group
16.2.4
Comparison to t-test
16.2.5
ROPE (Region of Practical Equivalence)
16.3
Variations on the theme
16.3.1
Other distributions for the response
16.3.2
Other Priors for \(\sigma\) (or \(\tau\))
16.3.3
Paired Comparisons
16.4
How many chains? How long?
16.4.1
Why multiple chains?
16.4.2
What large n.eff does and doesn’t do for us
16.5
Looking at Likelihood
16.6
Exercises
17
Simple Linear Regression
17.1
The deluxe basic model
17.1.1
Likelihood
17.1.2
Priors
17.2
Example: Galton’s Data
17.2.1
Describing the model to JAGS
17.2.2
Problems and how to fix them
17.3
Centering and Standardizing
17.3.1
\(\beta_0\) and \(\beta_1\) are still correlated
17.4
We’ve fit a model, now what?
17.4.1
Estimate parameters
17.4.2
Make predictions
17.4.3
Posterior Predictive Checks
17.4.4
Posterior predictive checks with bayesplot
17.4.5
PPC with custom data
17.5
Fitting models with Stan
17.6
Two Intercepts model
17.7
Exercises
18
Multiple Metric Predictors
18.1
SAT
18.1.1
SAT vs expenditure
18.1.2
SAT vs expenditure and percent taking the test
18.1.3
What’s wrong with this picture?
18.1.4
Multiple predictors in pictures
18.2
Interaction
18.2.1
SAT with interaction term
18.3
Fitting a linear model with brms
18.3.1
Adjusting the model with brm()
18.4
Interpreting a model with an interaction term
18.4.1
Thinking about the noise
18.5
Exercises
19
Nominal Predictors
19.1
Fruit flies Study
19.2
Model 1: Out-of-the-box
19.3
Model 2: Custom Priors
19.4
Models 3 and 4: alternate parameterizations
19.5
Comparing groups
19.5.1
Comparing to the “intercept group”
19.5.2
Comparing other pairs of groups
19.5.3
Contrasts: Comparing “groups of groups”
19.6
More Variations
19.7
Exercises
20
Multiple Nominal Predictors
20.1
Crop Yield by Till Method and Fertilizer
20.1.1
What does \(\sigma\) represent?
20.2
Split Plot Design
20.3
Which model should we use?
20.3.1
Modeling Choices
20.3.2
Measuring a Model – Prediction Error
20.3.3
Out-of-sample prediction error
20.3.4
Approximating out-of-sample prediction error
20.4
Using loo
20.5
Overfitting Example
20.5.1
Brains Data
20.5.2
Measuring fit with \(r^2\)
20.5.3
Leave One Out Analysis
20.6
Exercises
21
Dichotomous Response
21.1
What Model?
21.1.1
A method with issues: Linear regression
21.1.2
The usual approach: Logistic regression
21.1.3
Other approaches
21.1.4
Preparing the data
21.1.5
Specifying family and link function in brm()
21.2
Interpreting Logistic Regression Models
21.3
Robust Logistic Regression
22
Nominal Response
23
Ordinal Response
24
Count Response
24.1
Hair and eye color data
24.2
Are hair and eye color independent?
24.3
Poisson model
24.4
Exercises
25
This and That
25.1
Wells in Bangladesh
25.1.1
The data
25.1.2
The Question
25.1.3
Distance as a predictor
25.1.4
Updating a model without recompiling
25.1.5
Interpreting coefficients – discrete change
25.1.6
Interpreting coefficients – the divide by 4 trick
25.1.7
Other predictors: arsenic
25.1.8
Still more predictors
25.2
Gelman/Hill Principles for Building Models
25.2.1
Example: Bicycling
25.3
Electric Company
25.3.1
Interpreting a model with interaction
25.3.2
Comparing models
25.3.3
Comparing Treatment Effect by City
References