9 Heierarchical Models

9.1 Gamma Distributions

We will use gamma distributions for some of our priors in this chapter. Gamma distributions have support \((0,\infty)\) and are skewed to the right. Both R and JAGS paramterize the gamma distributions with two parameters called shape and rate.

gf_dist("gamma", shape = 2, rate = 3, color = ~"Gamma(2, 3)") %>%
gf_dist("gamma", shape = 4, rate = 3, color = ~"Gamma(4, 3)") %>%
gf_dist("gamma", shape = 2, rate = 6, color = ~"Gamma(2, 6)") %>%
gf_dist("gamma", shape = 4, rate = 6, color = ~"Gamma(4, 6)") %>%
  gf_labs(title = "Some Gamma distributions")

The mean, mode, standard deviation can be calcuated from the shape \(s\) and rate \(r\) as follows:

\[\begin{align*} \mu &= \frac{s}{r} \\ \omega &= \frac{s−1}{r} \qquad (s > 1) \\ \sigma & = \frac{\sqrt{s}}{r} \end{align*}\] In addition, the scale parameter (1/rate) is sometimes used in place of the rate parameter. The gamma_params() function will automate conversion between various parameterizations. It works just like beta_params() that we have seen before.

gamma_params(mode = 15, sd = 10, plot = TRUE)

shape rate scale mode mean sd
4 0.2 5 15 20 10

As the shape parameter gets larger and larger, the gamma distribution becomes less and less skewed (and more and more like a normal distribution):

gf_dist("gamma", shape = 25, rate = 5, color = ~"Gamma(25, 5)") %>%
  gf_dist("norm", mean = 5, sd = 1, color = ~"Norm(5, 1)")

gf_dist("gamma", shape = 100, rate = 5, color = ~"Gamma(25, 5)") %>%
  gf_dist("norm", mean = 20, sd = 2, color = ~"Norm(5, 1)")

9.2 One coin from one mint

9.3 Multiple coins from one mint

9.4 Multiple coins from multiple mints

9.5 Therapeutic Touch

The study is described in the text. The article reporting on the study can be found at https://jamanetwork.com/journals/jama/fullarticle/187390. Here’s the abstract:

9.5.1 Abstract

Context.— Therapeutic Touch (TT) is a widely used nursing practice rooted in mysticism but alleged to have a scientific basis. Practitioners of TT claim to treat many medical conditions by using their hands to manipulate a “human energy field” perceptible above the patient’s skin.

Objective.— To investigate whether TT practitioners can actually perceive a “human energy field.”

Design.— Twenty-one practitioners with TT experience for from 1 to 27 years were tested under blinded conditions to determine whether they could correctly identify which of their hands was closest to the investigator’s hand. Placement of the investigator’s hand was determined by flipping a coin. Fourteen practitioners were tested 10 times each, and 7 practitioners were tested 20 times each.

Main Outcome Measure.— Practitioners of TT were asked to state whether the investigator’s unseen hand hovered above their right hand or their left hand. To show the validity of TT theory, the practitioners should have been able to locate the investigator’s hand 100% of the time. A score of 50% would be expected through chance alone.

Results.— Practitioners of TT identified the correct hand in only 123 (44%) of 280 trials, which is close to what would be expected for random chance. There was no significant correlation between the practitioner’s score and length of experience (r=0.23). The statistical power of this experiment was sufficient to conclude that if TT practitioners could reliably detect a human energy field, the study would have demonstrated this.

Conclusions.— Twenty-one experienced TT practitioners were unable to detect the investigator’s “energy field.” Their failure to substantiate TT’s most fundamental claim is unrefuted evidence that the claims of TT are groundless and that further professional use is unjustified.

9.5.2 Data

library(mosaic)
head(TherapeuticTouch, 3)
y s
1 S01
0 S01
0 S01
gf_barh(s ~ ., data = TherapeuticTouch, fill = ~ factor(y))

9.5.3 A heierarchical model

Big ideas:

  • The ten trials for each subject are a sample from the many trials that could have been done.

    • distribution of results: \({\sf Bern}(\theta_s)\) – each subject has a potentially different \(\theta_s\).
  • The subjects themselves are just a sample from all of the TT practitioners that could have been in the study.

    • So the \(\theta_s\) values are a sample from a distribution of \(\theta\) values for all TT practititioners and tell us something about that distribution.

    • We will assume a beta distribution for this, where the parameters are unknown and estimated from the data.

  • Use the data to estimate both the individual level \(\theta_s\) values and the group level parameters of the beta distribution.

  • Parameterization of the beta distribution for \(\theta_s\).

    • We are primarily interested in the mean or mode of this distribution (typical value of \(\theta\) for TT practitioner).

    • Many combinations of shape parameters give the same mean (or mode), and they are highly correlated. For example, \({\sf Beta}(2,4)\), \({\sf Beta}(20,40)\), and \({\sf Beta}(200,400)\) all have a mean of 1/3.

    • We will parameterize this Beta distribution with mode (\(\omega\)), and concentration (\(\kappa\))

    • We will need to convert mode and concentration into the two shape parameters, since JAGS and R use the two shape parameters.

    \[\begin{align} \alpha &= \omega (\kappa - 2) + 1\\ \beta &= (1 - \omega) (\kappa - 2) + 1) \end{align}\]

  • \(\omega\) and \(\kappa\) will need priors

    • \(\omega\): Beta
    • \(\kappa - 2\): Gamma (because \(\kappa >2\))

Putting this altogether we have the following picture:

Now we code it up for JAGS.

gamma_params(mean = 1, sd = 10)
shape rate scale mode mean sd
0.01 0.01 100 NA 1 10
touch_model <- function() {
   for (i in 1:Ntotal) {
     y[i] ~ dbern(theta[s[i]])
   }
  for (s in 1:Nsubj) {
    theta[s] ~ dbeta(omega * (kappa - 2) + 1, (1 - omega) * (kappa - 2) + 1)
  }
  omega ~ dbeta(1, 1)
  kappa <- kappaMinusTwo + 2
  kappaMinusTwo ~ dgamma(0.01, 0.01)     # mean = 1, sd = 10
  }
set.seed(1234)
TouchData <- list(
  Ntotal = nrow(TherapeuticTouch),
  Nsubj = length(unique(TherapeuticTouch$s)),
  y = TherapeuticTouch$y,
  # must convert subjects to sequence 1:Nsubj
  s = as.numeric(factor(TherapeuticTouch$s))
)
touch_jags <-
  jags(
  data = TouchData,
  model = touch_model,
  parameters.to.save = c("theta", "kappa", "omega"),
)
touch_jags
## Inference for Bugs model at "/var/folders/py/txwd26jx5rq83f4nn0f5fmmm0000gn/T//RtmpTl7XcA/model163463f4f8d70.txt", fit using jags,
##  3 chains, each with 2000 iterations (first 1000 discarded)
##  n.sims = 3000 iterations saved
##           mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
## kappa      52.286  54.028   8.932  20.025  34.399  65.489 210.966 1.056    44
## omega       0.439   0.035   0.367   0.416   0.439   0.462   0.508 1.004  1100
## theta[1]    0.363   0.088   0.175   0.305   0.369   0.426   0.517 1.016   210
## theta[2]    0.384   0.084   0.208   0.331   0.389   0.444   0.536 1.015   190
## theta[3]    0.409   0.081   0.241   0.358   0.414   0.463   0.564 1.005   970
## theta[4]    0.411   0.083   0.236   0.357   0.415   0.467   0.564 1.020   270
## theta[5]    0.409   0.080   0.244   0.358   0.411   0.461   0.563 1.010   500
## theta[6]    0.410   0.082   0.235   0.359   0.414   0.464   0.562 1.006  1100
## theta[7]    0.409   0.080   0.244   0.359   0.412   0.461   0.560 1.019   180
## theta[8]    0.410   0.080   0.243   0.360   0.413   0.462   0.565 1.012   510
## theta[9]    0.407   0.080   0.247   0.356   0.410   0.459   0.561 1.008   650
## theta[10]   0.409   0.081   0.238   0.358   0.413   0.463   0.559 1.008   710
## theta[11]   0.433   0.077   0.276   0.384   0.433   0.483   0.588 1.004  1700
## theta[12]   0.433   0.081   0.278   0.383   0.434   0.485   0.593 1.005   710
## theta[13]   0.434   0.078   0.280   0.385   0.434   0.484   0.587 1.002  3000
## theta[14]   0.432   0.080   0.272   0.380   0.432   0.482   0.593 1.001  3000
## theta[15]   0.433   0.080   0.275   0.381   0.434   0.484   0.590 1.004  3000
## theta[16]   0.457   0.080   0.300   0.408   0.454   0.504   0.624 1.002  3000
## theta[17]   0.456   0.079   0.295   0.406   0.453   0.506   0.621 1.002  1600
## theta[18]   0.454   0.081   0.295   0.401   0.453   0.504   0.626 1.003  3000
## theta[19]   0.457   0.078   0.313   0.407   0.454   0.505   0.618 1.001  3000
## theta[20]   0.455   0.080   0.298   0.403   0.452   0.508   0.612 1.003  3000
## theta[21]   0.457   0.082   0.296   0.404   0.455   0.510   0.621 1.005  1000
## theta[22]   0.457   0.079   0.311   0.405   0.454   0.507   0.625 1.003  1800
## theta[23]   0.480   0.083   0.328   0.425   0.474   0.529   0.658 1.008   510
## theta[24]   0.482   0.083   0.327   0.427   0.476   0.531   0.660 1.003  1300
## theta[25]   0.506   0.088   0.351   0.446   0.497   0.562   0.691 1.008   260
## theta[26]   0.507   0.088   0.354   0.446   0.498   0.560   0.700 1.003   680
## theta[27]   0.505   0.087   0.355   0.445   0.495   0.559   0.695 1.008   250
## theta[28]   0.527   0.093   0.374   0.460   0.517   0.581   0.735 1.008   300
## deviance  378.963   5.156 368.591 375.437 379.284 382.411 388.755 1.010   220
## 
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
## 
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 13.2 and DIC = 392.1
## DIC is an estimate of expected predictive error (lower deviance is better).

What do we learn from a quick look at this output?

  • The Rhat values look good
  • The autocorrelation varies from parameter to parameter. For some parameters, it looks like it will take a much longer run to get a large effective sample size.

So let’s do a larger run.

touch_jags <-
  jags.parallel(
  data = TouchData,
  model = touch_model,
  parameters.to.save = c("theta", "kappa", "omega"),
  n.burnin = 1000,
  n.iter = 41000,
  n.chains = 5,
  n.thin = 10,
  jags.seed = 54321
)    
touch_jags
## Inference for Bugs model at "touch_model", fit using jags,
##  5 chains, each with 41000 iterations (first 1000 discarded), n.thin = 10
##  n.sims = 20000 iterations saved
##           mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
## kappa      55.710  55.745   8.632  21.295  37.091  68.780 210.357 1.001  8100
## omega       0.435   0.037   0.362   0.412   0.436   0.460   0.508 1.001  8900
## theta[1]    0.360   0.087   0.169   0.305   0.367   0.422   0.513 1.001 19000
## theta[2]    0.384   0.084   0.206   0.332   0.389   0.441   0.536 1.001 20000
## theta[3]    0.407   0.080   0.240   0.357   0.410   0.460   0.560 1.001 20000
## theta[4]    0.407   0.080   0.239   0.357   0.410   0.461   0.560 1.001 19000
## theta[5]    0.408   0.080   0.240   0.359   0.411   0.460   0.561 1.001  8400
## theta[6]    0.407   0.080   0.240   0.357   0.410   0.459   0.561 1.001 20000
## theta[7]    0.407   0.080   0.241   0.358   0.410   0.459   0.559 1.001  8100
## theta[8]    0.407   0.080   0.238   0.357   0.411   0.459   0.560 1.001 12000
## theta[9]    0.408   0.080   0.241   0.358   0.411   0.460   0.563 1.001  6900
## theta[10]   0.408   0.080   0.242   0.358   0.411   0.460   0.563 1.001 13000
## theta[11]   0.431   0.079   0.274   0.382   0.431   0.481   0.591 1.001 20000
## theta[12]   0.430   0.079   0.271   0.380   0.430   0.480   0.588 1.001 20000
## theta[13]   0.431   0.079   0.272   0.381   0.430   0.481   0.588 1.001 10000
## theta[14]   0.430   0.078   0.276   0.380   0.430   0.479   0.590 1.001 20000
## theta[15]   0.431   0.078   0.274   0.382   0.432   0.481   0.590 1.001 20000
## theta[16]   0.454   0.080   0.300   0.402   0.451   0.503   0.622 1.001 20000
## theta[17]   0.454   0.080   0.303   0.402   0.451   0.503   0.624 1.001 20000
## theta[18]   0.455   0.080   0.301   0.403   0.451   0.505   0.624 1.001 20000
## theta[19]   0.454   0.080   0.301   0.402   0.452   0.502   0.626 1.001 20000
## theta[20]   0.454   0.079   0.300   0.403   0.452   0.502   0.620 1.001 20000
## theta[21]   0.455   0.080   0.303   0.403   0.452   0.504   0.621 1.001 20000
## theta[22]   0.455   0.080   0.302   0.402   0.451   0.504   0.622 1.001 18000
## theta[23]   0.477   0.083   0.327   0.422   0.471   0.527   0.659 1.001 12000
## theta[24]   0.476   0.082   0.328   0.422   0.470   0.526   0.657 1.001 15000
## theta[25]   0.501   0.087   0.349   0.441   0.492   0.553   0.691 1.001  9700
## theta[26]   0.500   0.086   0.351   0.441   0.493   0.551   0.691 1.001 20000
## theta[27]   0.500   0.086   0.350   0.441   0.493   0.553   0.691 1.001  9900
## theta[28]   0.525   0.093   0.370   0.460   0.514   0.581   0.733 1.001 20000
## deviance  379.089   5.187 368.486 375.671 379.322 382.640 388.787 1.001 20000
## 
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
## 
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 13.5 and DIC = 392.5
## DIC is an estimate of expected predictive error (lower deviance is better).
touch_mcmc <- as.mcmc(touch_jags)
plot_post(touch_mcmc[, "omega"], comparison_value = 0.5)

## $posterior
##        ESS   mean median   mode
## var1 14724 0.4354 0.4357 0.4379
## 
## $hdi
##   prob     lo     hi
## 1 0.95 0.3638 0.5091
## 
## $comparison
##   value P(< comp. val.) P(> comp. val.)
## 1   0.5          0.9597         0.04025
diag_mcmc(touch_mcmc, par = "omega")

diag_mcmc(touch_mcmc, par = "kappa")

diag_mcmc(touch_mcmc, par = "theta[1]")
mcmc_pairs(touch_mcmc, pars = c("omega", "kappa"))

GGally::ggpairs(posterior(touch_jags) %>% select(omega, kappa))

gf_point(kappa ~ omega, data = posterior(touch_jags), alpha = 0.05) %>%
  gf_density2d(kappa ~ omega, data = posterior(touch_jags))

9.6 Other parameterizations we might have tried

9.6.1 Shape parameters for Beta

Suppose we decided to parameterize the beta distribution with shape parameters like this?

touch_model2 <- function() {
   for (i in 1:Ntotal) {
     y[i] ~ dbern(theta[s[i]])
   }
  for (s in 1:Nsubj) {
    theta[s] ~ dbeta(alpha, beta)
  }
  kappa <- alpha + beta
  mu <- alpha / (alpha + beta)
  alpha <- alphaMinusOne + 1
  beta  <- betaMinusOne + 1
  alphaMinusOne ~ dgamma(0.01, 0.01)
  betaMinusOne ~ dgamma(0.01, 0.01)
  }

We’ll run it with the same options we used above to faciliate easy comparisons.

touch_jags2 <-
  jags.parallel(
  data = TouchData,
  model = touch_model2,
  parameters.to.save = c("theta", "alpha", "beta", "mu", "omega", "kappa"),
  n.burnin = 1000,
  n.iter = 41000,
  n.chains = 5,
  n.thin = 10,
  jags.seed = 54321
)    

The resuls are disasterous: Rhat values well above 1 and effective sample sizes that are much smaller than before.

touch_jags2
## Inference for Bugs model at "touch_model2", fit using jags,
##  5 chains, each with 41000 iterations (first 1000 discarded), n.thin = 10
##  n.sims = 20000 iterations saved
##           mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
## alpha      19.586  23.261   1.000   4.833  12.099  24.854  89.225 2.300     7
## beta       24.922  29.869   1.000   6.200  15.476  31.857 110.226 2.411     7
## kappa      44.508  52.940   2.000  11.184  27.621  56.852 197.948 2.369     7
## mu          0.448   0.040   0.369   0.420   0.447   0.482   0.506 1.157    27
## theta[1]    0.322   0.120   0.056   0.247   0.344   0.409   0.509 1.708    10
## theta[2]    0.355   0.107   0.112   0.292   0.370   0.431   0.532 1.435    14
## theta[3]    0.391   0.098   0.172   0.333   0.399   0.455   0.567 1.210    30
## theta[4]    0.391   0.097   0.175   0.335   0.400   0.455   0.570 1.205    30
## theta[5]    0.392   0.098   0.176   0.335   0.401   0.457   0.571 1.196    31
## theta[6]    0.391   0.098   0.170   0.334   0.400   0.456   0.570 1.213    30
## theta[7]    0.392   0.097   0.176   0.335   0.399   0.455   0.571 1.188    33
## theta[8]    0.391   0.097   0.174   0.334   0.399   0.456   0.567 1.204    30
## theta[9]    0.391   0.098   0.171   0.333   0.400   0.455   0.568 1.212    29
## theta[10]   0.391   0.098   0.175   0.333   0.399   0.455   0.568 1.205    31
## theta[11]   0.427   0.094   0.233   0.370   0.428   0.484   0.621 1.066   250
## theta[12]   0.427   0.094   0.233   0.371   0.429   0.485   0.619 1.064   260
## theta[13]   0.427   0.094   0.231   0.370   0.428   0.484   0.619 1.073   180
## theta[14]   0.428   0.094   0.229   0.371   0.429   0.485   0.619 1.071   220
## theta[15]   0.427   0.094   0.230   0.370   0.429   0.483   0.616 1.070   190
## theta[16]   0.463   0.096   0.286   0.402   0.456   0.517   0.679 1.046   260
## theta[17]   0.463   0.096   0.285   0.403   0.456   0.517   0.676 1.044   330
## theta[18]   0.464   0.096   0.286   0.403   0.457   0.518   0.682 1.048   270
## theta[19]   0.462   0.096   0.282   0.401   0.456   0.517   0.678 1.049   280
## theta[20]   0.463   0.095   0.288   0.402   0.457   0.517   0.677 1.047   260
## theta[21]   0.462   0.097   0.283   0.401   0.456   0.516   0.682 1.045   290
## theta[22]   0.463   0.096   0.287   0.401   0.456   0.518   0.680 1.048   280
## theta[23]   0.499   0.104   0.325   0.429   0.485   0.554   0.747 1.134    35
## theta[24]   0.500   0.104   0.325   0.430   0.486   0.555   0.747 1.135    35
## theta[25]   0.533   0.117   0.353   0.452   0.512   0.597   0.814 1.280    17
## theta[26]   0.533   0.117   0.354   0.450   0.513   0.597   0.812 1.282    17
## theta[27]   0.533   0.118   0.353   0.451   0.512   0.597   0.819 1.289    16
## theta[28]   0.569   0.133   0.373   0.471   0.540   0.644   0.880 1.459    12
## deviance  378.572   5.644 367.282 374.715 378.780 382.457 389.202 1.034   120
## 
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
## 
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 15.4 and DIC = 394.0
## DIC is an estimate of expected predictive error (lower deviance is better).

9.6.2 Mean instead of mode

This change seems less dramatic. Let’s see how using mean and concentraion compares to using mode and concentration.

touch_model3 <- function() {
   for (i in 1:Ntotal) {
     y[i] ~ dbern(theta[s[i]])
   }
  for (s in 1:Nsubj) {
    theta[s] ~ dbeta(mu * kappa, (1 - mu) * kappa)
  }
  mu ~ dbeta(2, 2)
  kappa <- kappaMinusTwo + 2
  kappaMinusTwo ~ dgamma(0.01, 0.01)
  }
touch_jags3 <-
  jags.parallel(
  data = TouchData,
  model = touch_model3,
  parameters.to.save = c("theta", "mu", "kappa"),
  n.burnin = 1000,
  n.iter = 41000,
  n.chains = 5,
  n.thin = 10,
  jags.seed = 54321
)    

This model seems to perform reasonably well.

touch_jags3
## Inference for Bugs model at "touch_model3", fit using jags,
##  5 chains, each with 41000 iterations (first 1000 discarded), n.thin = 10
##  n.sims = 20000 iterations saved
##           mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
## kappa      58.519  58.419   9.258  22.258  39.046  72.852 222.144 1.001 16000
## mu          0.441   0.033   0.377   0.419   0.441   0.463   0.507 1.001  8700
## theta[1]    0.364   0.087   0.174   0.310   0.373   0.425   0.514 1.001 18000
## theta[2]    0.387   0.082   0.213   0.337   0.392   0.443   0.536 1.001 20000
## theta[3]    0.409   0.078   0.248   0.360   0.412   0.460   0.558 1.001 17000
## theta[4]    0.409   0.078   0.244   0.361   0.412   0.461   0.559 1.001 15000
## theta[5]    0.409   0.079   0.241   0.360   0.412   0.461   0.557 1.001 19000
## theta[6]    0.409   0.080   0.243   0.360   0.412   0.462   0.561 1.001 20000
## theta[7]    0.409   0.079   0.242   0.360   0.412   0.461   0.560 1.001 12000
## theta[8]    0.408   0.079   0.242   0.359   0.411   0.460   0.559 1.001 11000
## theta[9]    0.409   0.079   0.244   0.360   0.413   0.461   0.562 1.001 20000
## theta[10]   0.410   0.079   0.243   0.361   0.412   0.462   0.559 1.001 20000
## theta[11]   0.432   0.078   0.276   0.382   0.432   0.481   0.589 1.001 13000
## theta[12]   0.433   0.078   0.276   0.383   0.433   0.482   0.588 1.001  9900
## theta[13]   0.431   0.077   0.274   0.382   0.432   0.480   0.584 1.001 12000
## theta[14]   0.431   0.078   0.274   0.382   0.432   0.481   0.586 1.001 20000
## theta[15]   0.431   0.078   0.276   0.383   0.431   0.480   0.590 1.001 20000
## theta[16]   0.454   0.078   0.306   0.403   0.451   0.503   0.619 1.001 13000
## theta[17]   0.455   0.078   0.307   0.404   0.452   0.503   0.619 1.001 20000
## theta[18]   0.454   0.079   0.302   0.403   0.451   0.503   0.618 1.001 18000
## theta[19]   0.455   0.077   0.309   0.405   0.452   0.503   0.617 1.001 20000
## theta[20]   0.454   0.079   0.302   0.402   0.451   0.503   0.619 1.001 15000
## theta[21]   0.454   0.078   0.300   0.404   0.452   0.502   0.616 1.001  6200
## theta[22]   0.455   0.079   0.303   0.404   0.452   0.503   0.622 1.001 17000
## theta[23]   0.477   0.082   0.329   0.422   0.471   0.526   0.657 1.001 20000
## theta[24]   0.477   0.081   0.329   0.423   0.472   0.526   0.655 1.001 20000
## theta[25]   0.500   0.086   0.352   0.441   0.491   0.551   0.692 1.001 20000
## theta[26]   0.499   0.085   0.353   0.441   0.491   0.549   0.690 1.001 20000
## theta[27]   0.498   0.084   0.350   0.441   0.491   0.549   0.685 1.001 20000
## theta[28]   0.522   0.091   0.368   0.458   0.512   0.577   0.725 1.001 12000
## deviance  379.246   5.135 368.618 375.817 379.516 382.831 388.728 1.001 20000
## 
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
## 
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 13.2 and DIC = 392.4
## DIC is an estimate of expected predictive error (lower deviance is better).
TouchData0 <- list(
  Ntotal = 0,
  Nsubj = length(unique(TherapeuticTouch$s)),
  # y = TherapeuticTouch$y,
  # must convert subjects to sequence 1:Nsubj
  s = as.numeric(factor(TherapeuticTouch$s))
)

So why do we prefer the mode to the mean? Let’s take a look at the prior distribution on one of the \(\theta\)s.

touch_jags_prior <-
  jags.parallel(
  data = TouchData0,
  model = touch_model,
  parameters.to.save = c("theta", "kappa", "omega"),
  n.burnin = 1000,
  n.iter = 41000,
  n.chains = 5,
  n.thin = 10,
  DIC = FALSE,
  jags.seed = 54321
)    
touch_jags_prior3 <-
  jags.parallel(
  data = TouchData0,
  model = touch_model3,
  parameters.to.save = c("theta", "kappa", "mu"),
  n.burnin = 1000,
  n.iter = 41000,
  n.chains = 5,
  n.thin = 10,
  DIC = FALSE,
  jags.seed = 54321
)    
gf_dens( ~ theta.1, data = posterior(touch_jags_prior), color = ~"mode") %>%
gf_dens( ~ theta.1, data = posterior(touch_jags_prior3), color = ~"mean") 

Using the mean rather than mode corresponds to an unfortunate prior on \(\theta_i\).

9.7 Shrinkage

9.8 Example: Baseball Batting Average

9.9 Exerciess