Let’s create a function that computes the probability of having matching birthdays among \(n\) people. We’ll assume there are 365 equally likely birthdays.
The formula for this when there are \(n\) people is
\[ P(\mbox{matching birthday}) = 1 - P(\mbox{no matching birthday}) = 1 - \frac{365}{365} \cdot \frac{364}{365} \cdot \frac{363}{365} \cdots \frac{365 - n + 1}{365} \]
To create a function in R, we use the following template
my_function <- function(inputs) {
# code goes here
}
inputs
is a comma-separated list of named inputs. (You can also give the default values if you like.)
The function code can (and usually will) refer to these inputs.
The last line of the function, or any line wrapped in return()
provides the output of the function.
Let’s try this with our birthday problem. As we will see, R has some nice features that allow us compute products of the sort we need for our birthday problem.
bday <- function(num_people) {
1 - prod((365:(365 - num_people + 1)/365))
}
bday(10)
## [1] 0.1169482
bday(20)
## [1] 0.4114384
bday(25)
## [1] 0.5686997
bday(23)
## [1] 0.5072972
Looks like things are working. But R has the ability to receive a vector of inputs and return a vector of outputs. As written, our bday()
function doesn’t do this, but we can ask R to convert it over to a vectorized function for us:
bday <- Vectorize(bday) # convert to vectorized version
bday(1:23) # compute multiple probabilities all at once
## [1] 0.000000000 0.002739726 0.008204166 0.016355912 0.027135574 0.040462484
## [7] 0.056235703 0.074335292 0.094623834 0.116948178 0.141141378 0.167024789
## [13] 0.194410275 0.223102512 0.252901320 0.283604005 0.315007665 0.346911418
## [19] 0.379118526 0.411438384 0.443688335 0.475695308 0.507297234
This allows us to generate all the data needed to create a nice graphical view of what is going on here:
gf_point(bday(1:50) ~ 1:50) |>
gf_hline(yintercept = ~ 0.5, color = "red") |>
gf_labs(title = "Probability of matching birthdays in a group of people",
x = "number of people in the group",
y = "probablitiy of matching birthdays"
)