Getting Started with Bayes

\[ \operatorname{Pr}(A \mid B) = \frac{\operatorname{Pr}(\mbox{both})}{\operatorname{Pr}(\mbox{condition})} = \frac{\operatorname{Pr}(\mbox{joint})}{\operatorname{Pr}(\mbox{marginal})} = \frac{\operatorname{Pr}(A, B)}{\operatorname{Pr}(B)} \]

Rearranging:

\[\begin{align*} \operatorname{Pr}(A, B) &= \operatorname{Pr}(B) \cdot \operatorname{Pr}(A \mid B) \\ &= \operatorname{Pr}(A) \cdot \operatorname{Pr}(B \mid A) \end{align*}\]

More rearranging:

\[\begin{align*} \operatorname{Pr}(B) \operatorname{Pr}(A \mid B) &= \operatorname{Pr}(A) \cdot \operatorname{Pr}(B \mid A) \\[5mm] \operatorname{Pr}(A \mid B) &= \frac{\operatorname{Pr}(A) \cdot \operatorname{Pr}(B \mid A)}{\operatorname{Pr}(B)} \end{align*}\]

And some different letters (and words):

\[\begin{align*} \operatorname{Pr}(H \mid D) &= \frac{\operatorname{Pr}(H) \cdot \operatorname{Pr}(D \mid H)}{\operatorname{Pr}(D)} \\[5mm] \operatorname{Pr}(\mbox{hypothesis} \mid \mbox{data}) &= \frac{\operatorname{Pr}(\mbox{hypothesis}) \cdot \operatorname{Pr}(\mbox{data} \mid \mbox{hypothesis})}{\operatorname{Pr}(\mbox{data})} \end{align*}\]

The big take-away:

\[ \mbox{posterior} \propto \mbox{prior} \cdot \mbox{likelihood} \]
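
The proportionality holds because the denominator \(\operatorname{Pr}(D)\) does not depend on the hypothesis. As a short sketch, assume a discrete set of hypotheses \(H_i\) (the index \(i\) is notation introduced here for illustration, not part of the derivation above). Then the law of total probability gives

\[ \operatorname{Pr}(D) = \sum_i \operatorname{Pr}(H_i) \cdot \operatorname{Pr}(D \mid H_i) \]

so dividing prior \(\cdot\) likelihood by its sum over all hypotheses recovers the posterior.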

First Example of Grid Approximation Method

Data: 6 of 9 sampled points were water

Prior: \(p \sim \operatorname{Unif}(0,1)\)

Likelihood: How likely are the data given a parameter value?

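library(tidyr)      # expand_grid()
library(dplyr)      # mutate(), %>%
library(ggformula)  # gf_line()
library(patchwork)  # `|` places plots side by side

WaterGrid <-
  expand_grid(
    p = seq(0, 1, length.out = 501)     # grid of candidate values for p
  ) %>%
  mutate(
    prior = 1,                          # uniform prior (unnormalized)
    likelihood = p^6 * (1-p)^3,         # 6 water, 3 not water
    likelihood = likelihood / sum(likelihood),
    posterior0 = prior * likelihood,    # prior times likelihood
    posterior = posterior0 / sum(posterior0)   # normalize so heights sum to 1
  )

gf_line( prior ~ p, data = WaterGrid) |
gf_line( likelihood ~ p, data = WaterGrid) |
gf_line( posterior ~ p, data = WaterGrid)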
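
The likelihood used above, p^6 * (1-p)^3, leaves out the binomial coefficient. A minimal sketch of why that is harmless, assuming the data are modeled as 6 successes in 9 independent trials (the variable names `kernel` and `binom` are introduced here for illustration): the constant cancels as soon as we divide by the sum.

```r
# Same grid of candidate values for p as in the example above
p <- seq(0, 1, length.out = 501)

# Kernel used in the text: omits the constant choose(9, 6)
kernel <- p^6 * (1 - p)^3

# Full binomial likelihood from base R
binom <- dbinom(6, size = 9, prob = p)

# After normalizing, the two agree up to floating-point error
all.equal(kernel / sum(kernel), binom / sum(binom))
```

Either form can therefore be used in the grid; the kernel just saves a little typing.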

Rescaling

Those plots are on very different scales. Mostly this doesn’t matter, but technically the posterior plot is wrong: the posterior is a density, so it should be scaled so that the area under the curve is 1, and we scaled it so that the heights sum to 1 instead. The likelihood is not a density, but we can rescale it the same way to make it comparable with the other two.

To rescale to get area 1, we need to take into account the width of our grid cells. If we let \(h_p\) be the height of one of these curves at position \(p\), and approximate the area under the curve with rectangles of width \(\Delta p\), we want

\[ \sum h_p \Delta p = 1 \] so \[ \sum h_p = 1 / \Delta p \] With 501 grid points on \([0, 1]\), \(\Delta p = 1/500 = 0.002\), so we can get this by adding an additional / 0.002 to the rescaling that made the sum equal 1.

WaterGrid <-
  expand_grid(
    p = seq(0, 1, length.out = 501) 
  ) %>%
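  mutate(
    # each *0 column is unnormalized; dividing by sum() makes the heights
    # sum to 1, and dividing by 0.002 (the grid width) makes the area 1
    prior0 = 1, 
    prior = prior0 / sum(prior0) / 0.002,
    likelihood0 = p^6 * (1-p)^3,
    likelihood = likelihood0 / sum(likelihood0) / 0.002,
    posterior0 = prior * likelihood,
    posterior = posterior0 / sum(posterior0) / 0.002
  )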
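
As a quick check on the rescaling, here is a minimal base R sketch. It assumes an evenly spaced grid and computes the width from the grid itself rather than hard-coding 0.002; the name `delta_p` is introduced here for illustration.

```r
# Same grid as above: 501 points, so 500 intervals on [0, 1]
p <- seq(0, 1, length.out = 501)

# Grid width computed from the grid (assumes even spacing)
delta_p <- diff(p)[1]

# Rescale the likelihood so the rectangle approximation has area 1
likelihood0 <- p^6 * (1 - p)^3
likelihood  <- likelihood0 / sum(likelihood0) / delta_p

# Sum of height times width over the grid; this should be 1
sum(likelihood * delta_p)
```

Computing `delta_p` from the grid means the rescaling stays correct if the number of grid points changes.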

Now we can plot these functions on the same scale:

gf_line( prior ~ p, data = WaterGrid, color = ~ "prior") %>% 
gf_line( likelihood ~ p, data = WaterGrid, color = ~ "likelihood", size = 1.7) %>%
gf_line( posterior ~ p, data = WaterGrid, color = ~ "posterior")

We saved the intermediate values (prior0, likelihood0, posterior0) only so that we can inspect them. In the next example we rescale in place.

Second Version – New Prior

New Prior: \(p \sim \operatorname{Triangle}(0, 1, 0.7)\)

library(triangle)
WaterGrid2 <-
  expand_grid(
    p = seq(0, 1, length.out = 501)
  ) %>%
  mutate(
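    prior = dtriangle(p, 0, 1, 0.7),   # already a density: min 0, max 1, mode 0.7
    likelihood = p^6 * (1-p)^3,
    likelihood = likelihood / sum(likelihood) / 0.002,   # rescale in place to area 1
    posterior = prior * likelihood,
    posterior = posterior / sum(posterior) / 0.002       # rescale in place to area 1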
  )
gf_line( prior ~ p, data = WaterGrid2, color = ~ "prior") %>% 
gf_line( likelihood ~ p, data = WaterGrid2, color = ~ "likelihood", size = 1.7) %>%
gf_line( posterior ~ p, data = WaterGrid2, color = ~ "posterior")

Let’s compare the two posteriors.

gf_line( posterior ~ p, data = WaterGrid, color = ~ "uniform prior") %>%
  gf_line( posterior ~ p, data = WaterGrid2, color = ~ "triangle prior")
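
A numerical summary can complement the plot. The sketch below uses base R only and makes two assumptions that are not in the original: the Triangle(0, 1, 0.7) density is written out by hand instead of calling dtriangle(), and `summarize_posterior()` is a hypothetical helper defined just for this comparison.

```r
p <- seq(0, 1, length.out = 501)
likelihood <- p^6 * (1 - p)^3

# Triangle density on (0, 1) with mode 0.7, written out by hand
# (assumption: same parameterization as dtriangle(p, 0, 1, 0.7))
tri_prior <- ifelse(p <= 0.7, 2 * p / 0.7, 2 * (1 - p) / 0.3)

# Posteriors normalized to sum to 1 (probabilities on the grid)
post_unif <- likelihood / sum(likelihood)
post_tri  <- tri_prior * likelihood / sum(tri_prior * likelihood)

# Hypothetical helper: grid value with the largest posterior, and posterior mean
summarize_posterior <- function(post) {
  c(mode = p[which.max(post)], mean = sum(p * post))
}

rbind(
  uniform  = summarize_posterior(post_unif),
  triangle = summarize_posterior(post_tri)
)
```

Because these posteriors are normalized to sum to 1 rather than to have area 1, they can be used directly as weights when computing the mean.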