4 Grid Method

1. Bob plays basketball. He shoots 70% of his shots from 2-point range and makes 48% of these shots. He shoots 30% of his shots from 3-point range and makes 32% of these shots. Bob just made a shot. What is the probability that it was a 3-point shot?

Do this problem twice. The first time, use probability rules, carefully denoting the probabilities involved. (For any number that appears, I should be able to tell from your notation where it came from.) The second time, use the “grid method” (create a tibble with prior, likelihood, and posterior for each of the possibilities – it will be a very small grid in this case).

2. The exponential distributions are a family of skewed distributions that can be used to model a number of situations, often related to the amount of time until some event occurs. The rate parameter (often denoted \(\lambda\)) is the only parameter for this family. In this problem, we will use the grid method to estimate the rate parameter.

Plot a few exponential distributions using

gf_dist("exp", rate = 1, color = ~ "rate = 1") %>%
  gf_dist("exp", rate = 2, color = ~ "rate = 2") %>%
  gf_dist("exp", rate = 1/2, color = ~ "rate = 1/2")

Use the grid method to approximate the posterior distribution for the rate parameter using the following data set.

4.22, 2.88, 6.65, 0.16, 0.28, 1.58, 1.57, 0.73, 13.63, 0.15, 5.02, 2.4, 1.41, 1.89, 0.94, 4.25, 7.82, 2.39, 2.95 and 20.21

Here’s how you could do it with a uniform prior. Adjust this to use a \({\sf Triangle}(0, 3, 1)\) distribution instead.

library(purrr)
x <- c(2.10, 36.06, 4.23, 1.13, 5.5, 11.24, 6.82,  2.88, 13.63, 6.56, 
       0.45,  1.53, 5.34, 1.57, 4.87, 9.44, 2.82, 12.88,  5.24, 5.12)

ExpGrid <-
  expand_grid(
    rate = seq(0, 2, length.out = 1001)
  ) %>%
  mutate(
    logprior = 0,
    loglikelihood = map_dbl(rate, ~ sum(dexp(x, rate = .x, log = TRUE))),
    logposterior = 0 + loglikelihood,
    posterior = exp(logposterior)
  )

Plot the posterior distribution for the rate parameter. What values of the rate parameter seem plausible, given this data?
Pick three plausible values for the rate parameter, one on the high end of the plausible range and one on the low end and one near the peak of the posterior distribution. (You may use an HPDI for this, but you can also just inspect the plot you made.) Now make a plot overlaying those three exponential distributions on top of a density histogram for the data.

3. Let’s try the grid method for a model with two parameters. Suppose we want to estimate the mean and standard deviation of the heights of 21-year-old American men or women (your choice which group). First, lets get some data.

```r
library(NHANES)
Men   <- NHANES %>% filter(Gender == "male", Age == 21)
Women <- NHANES %>% filter(Gender == "female", Age == 21)
```

Likelihood Our model is that heights are normally distributed with mean \(\mu\) and standard deviation \(\sigma\):
Prior. For our prior, let’s use something informed just a little bit by what we know about people (we could do better with other priors, but that’s not our goal at the moment):
- the mean height is somewhere between 5 and 7 feet (let’s use 150 and 200 cm which are close to that)
- the standard deviation is positive, but no more than 20 cm (so 95% of people are within 40 cm (~ 16 inches) of average – that seems like a pretty safe bet).
- we will use a uniform prior over these ranges (even though you probably believe that some parts of the ranges are much more credible than others).

So our model is \[\begin{align*} \mathrm{Height} & \sim {\sf Norm}(\mu, \sigma) \\ \mu & \sim {\sf Unif}(150, 200) \\ \sigma & \sim {\sf Unif}(0, 20) \end{align*}\]

Grid. Use a grid that has 201-501 values for each parameter. Fill in the ?? below to create and update your grid.

Notes:

It is more numerically stable to work on the log-scale as much as possible,
You may normalize if you want to, but it isn’t necessary for this problem.

library(purrr)
Height_Grid <- 
  expand_grid(
    mu      = seq(??, ??, ?? = ??),
    sigma   = seq(??, ??, ?? = ??)    
  ) %>%
  filter(sigma != 0) %>%   # remove if sigma == 0
  mutate(
    prior = ??,
    logprior = ??,
    loglik = 
      map2_dbl(mu, sigma, 
               ~ ?? )      # use .x for mu and .y for sigma
    logpost = logprior + loglik,
    posterior = exp(logpost)
  )

Once you have created and updated your grid, you can visualize your posterior using a command like this (use gf_lims() if you want to zoom in a bit):

gf_tile(posterior ~ mu + sigma, data = Height_Grid) %>%
  gf_contour(posterior ~ mu+ sigma, data = Height_Grid, color = "yellow")

Now answer the following questions.

Using the picture you just created, what would you say are credible values for \(\mu\) and for \(\sigma\)?
Create posterior samples using slice_sample(). Be sure to use weight_by and replace = TRUE. Recreate a plot like the one above using your posterior samples. The two plots should look similar. [Hint: use gf_density2d_filled().]
Use your posterior samples to compute the 95% highest density (posterior) interval for each parameter.
Create a plot showing the posterior distribution for \(\mu\) and a 95% HDI for \(\mu\).

4. Redo the previous problem using a triangle prior for each parameter. You will need to choose where to put the edges and the peak of the triangle. Choose values that reflect what you now about people’s height. Feel free to choose a more informative prior that we used in problem 3.

5. Redo the previous problem using a normal prior for each parameter. Choose what values to use for the mean and standard deviation that reflect what you know about heights. Feel free to choose a more informative prior that we used in problem 3.

6. Repeat problem 3 using quap() and overlay the two posteriors (one from the grid method and one from quadratic approximation) for each parameter. How do they compare?

7. Repeat problem 4 using quap() and overlay the two posteriors (one from the grid method and one from quadratic approximation) for each parameter. How do they compare?

8. Repeat problem 5 using quap() and overlay the two posteriors (one from the grid method and one from quadratic approximation) for each parameter. How do they compare?