You may use your notes, the class text, any online materials from the class, and your previous homework. You may not consult other resources or other people. Send me an email if you have questions or difficulties of any sort.
Prepare your solutions using R Markdown and submit via Gradescope. Also email me your Rmd file.
Link in case embedded video isn’t working
(Yes, this was state-of-the-art TV when I was a kid.)
This problem is about a study of the effect of watching the PBS TV show The Electric Company on kids’ reading performance. To keep things a bit simpler, we will just use the data from one grade: grade 2.
This study used a randomized, paired design: At each school, the two worst performing classes were selected, and it was randomly determined which class would watch The Electric Company (treatment group) and which one would not (control group). In the ElecComp
data set, id
is a numeric code for each school, and you can see that there are two classes, one in each group, at each school.
Students were given a reading test before (pre) and after (post) the experimental period. The data provide mean scores on these tests for each class. (It would be even better to have individual-level data, but that’s not what we have.)
The data live (undocumented) in the CalvinBayes package. Let’s select the second grade and take a look.
Grade2 <- ElecComp %>%
filter(grade == 2)
Grade2 %>% arrange(id) %>% reactable::reactable()
Grade2 %>% gf_bar( ~ id, fill = ~ group) %>%
gf_theme(legend.position = "top")
Note that the id numbers are not contiguous and don’t start at 1. Let’s fix that and also add a group_idx
variable and recode the missing values in method
so Stan doesn’t complain about them:
library(forcats) # tools for categorical variables
Grade2 <-
Grade2 %>%
mutate(
orig_id = id,
id = as.integer(factor(id)),
group_idx = as.integer(factor(group)),
method = fct_explicit_na(method, na_level = "-"),
method_idx = as.integer(factor(method))
)
Grade2 %>% reactable::reactable()
And a few more data summaries for good measure.
df_stats(pre + post ~ group , data = Grade2)
response | group | min | Q1 | median | Q3 | max | mean | sd | n | missing |
---|---|---|---|---|---|---|---|---|---|---|
pre | control | 40.8 | 63.225 | 72.85 | 81.175 | 90.0 | 70.70882 | 13.60787 | 34 | 0 |
pre | treatment | 51.4 | 69.725 | 75.35 | 83.100 | 97.7 | 75.89706 | 10.82190 | 34 | 0 |
post | control | 67.6 | 85.475 | 97.20 | 102.975 | 110.3 | 93.21176 | 11.93149 | 34 | 0 |
post | treatment | 66.4 | 96.275 | 102.05 | 109.125 | 114.6 | 101.57059 | 10.24226 | 34 | 0 |
gf_point(post ~ group, data = Grade2) %>%
gf_line(group = ~ id, alpha = 0.5) |
gf_point((post - pre) ~ group, height = 0, width = 0.2, data = Grade2) %>%
gf_line(group = ~ id, alpha = 0.5)
1. What are the lines on those last two plots and why are they useful?
2. Let’s fit two models. (Feel free to adjust things like the number of iterations, etc. as needed.) Show the precis()
output for each model.
ec1 <-
ulam(
data = Grade2,
alist(
post ~ dnorm(mu, sigma),
mu <- pre + bg[group_idx],
bg[group_idx] ~ dnorm(0, 20),
sigma ~ dexp(0.10)
),
chains = 4, cores = 4, iter = 2000,
log_lik = TRUE
)
ec2 <-
ulam(
data = Grade2,
alist(
post ~ dnorm(mu, sigma),
mu <- bp * pre + bg[group_idx],
bg[group_idx] ~ dnorm(0, 20),
bp ~ dnorm(1, 1),
sigma ~ dexp(0.10)
),
chains = 4, cores = 4, iter = 2000,
log_lik = TRUE
)
3. Model Predictions.
For each model, create a plot that shows the predictions the model makes together with uncertainty in those predictions and the data. (There is not one right plot here, but there will be style points for how well your model communicates.)
How do the two models’ predictions differ?
4. Is one of these two models is better? If so which? Explain what you mean by better and how you decided.
5. According to the better model (or model ec2
if they are roughly equivalent), does it appear that Electric Company improves reading?
So far our models have not taken into account that we have a paired design.
6. Add school id to ec2
as part of a multilevel model. Let’s call it ec3
.
Why do we want to use a multilevel model here? [Hint: What would the alternative be?]
Explain your choice of prior for ec3
.
7. What does ec3
think about the effectiveness of watching Electric Company? How does this answer compare to ec2
?
8. What does ec3
think about school-to-school variation?
9. ec3
has a lot more parameters than ec2
. Which model does WAIC indicate is a better choice?
Note: in a complete treatment of thes data, we might include other things as well. Examples: all four grades of students, the city in which the schools were, varying slopes, and interactions. These were not included here to keep things a tad simpler.
There were actually 3 groups of classes in this study, not two. Some of the classes watched Electric Company as a replacement for their usual reading programs (R). Other classes watched Electric Company as a supplement to their regular reading programs (S).
10. Create a new model ec4
that has three groups (R, S, -) instead of two. Modify your ec3
if that is working successfully, else modify ec2
. Provide the precis()
output for your new model.
11. Does ec4
think that there is a preferred method? If so, which method? How much better?