Transformations and combinations allow us to create new random variables from old.
Toss a pair of fair dice. Let X be the result on the first die and Y the result on the second die.
Then S=X+Y is the sum of the two dice and P=XY is the product.
Let F be the temperature of a randomly selected object in degrees Fahrenheit.
Then C = \frac{5}{9}(F - 32) is the object’s temperature in degrees Celsius.
A random sample of n adult females is chosen. Let X_i be the height (in inches) of the ith person in the sample. Then
\overline{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}
is the sample mean.
General Question: If X is the result of combining two or more known random variables or of transforming a single random variable, what can we know about the distribution of X?
0. \operatorname{Var}(X) = \operatorname{E}(X^2) - \operatorname{E}(X)^2
1a. \operatorname{E}(X+b) = \operatorname{E}(X) + b and \operatorname{Var}(X+b) = \operatorname{Var}(X).
1b. \operatorname{E}(aX) = a \operatorname{E}(X) and \operatorname{Var}(aX) = a^2 \operatorname{Var}(X).
1. \operatorname{E}(aX + b) = a \operatorname{E}(X) + b
2. \operatorname{E}(X+Y) = \operatorname{E}(X) + \operatorname{E}(Y).
3. If X and Y are independent, then \operatorname{E}(XY) = \operatorname{E}(X)\operatorname{E}(Y).
4. If X and Y are independent, then \operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y).
These rules are not too difficult to prove, but we will focus mainly on how to use them.
X and Y are independent random variables if the distribution of X is the same for each value of Y and vice versa. This is equivalent to saying that
\operatorname{P}(X \le x \text{ and } Y \le y) = \operatorname{P}(X \le x) \cdot \operatorname{P}(Y \le y) for all x and y.
If a pair of fair dice are tossed, X is the value of the first die, and Y is the value of the second, then X and Y are independent.
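For example, with two fair dice, \operatorname{P}(X \le 3 \text{ and } Y \le 2) = \frac{6}{36} = \frac{3}{6} \cdot \frac{2}{6} = \operatorname{P}(X \le 3) \cdot \operatorname{P}(Y \le 2), and the same factoring works for every other pair of values.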
If X and Y are the height and weight of a randomly selected person, then X and Y are not independent. (The distribution of weights is different for taller people compared to the distribution for shorter people.)
In the examples below we will compare simulations to the rules above.
Two Dice
Let X and Y be the values on two fair dice. Look at the sum and product: X+Y and XY.
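Here is one way to run such a simulation and check it against the rules above. This is just a sketch: the variable names are illustrative, and gf_dhistogram() comes from the ggformula package used in the plots below (it may already be loaded in your setup).
library(ggformula)                       # for gf_dhistogram(); skip if already loaded
X <- sample(1:6, 10000, replace = TRUE)  # 10,000 tosses of the first die
Y <- sample(1:6, 10000, replace = TRUE)  # 10,000 tosses of the second die
S <- X + Y                               # sums
P <- X * Y                               # products
mean(S); var(S)    # rules 2 and 4: E(S) = 3.5 + 3.5 = 7, Var(S) = 35/12 + 35/12 ≈ 5.83
mean(P)            # rule 3 (independence): E(P) = 3.5 * 3.5 = 12.25
gf_dhistogram( ~ S, title = "Sum of two dice")
gf_dhistogram( ~ P, title = "Product of two dice")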
Uniform
Let X and Y be independent random variables that have the uniform distribution on [0,1]. Again, let’s look at the sum and product.
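The same kind of check works here. The rules predict \operatorname{E}(X+Y) = 0.5 + 0.5 = 1, \operatorname{Var}(X+Y) = 1/12 + 1/12 = 1/6, and (by independence) \operatorname{E}(XY) = 0.5 \cdot 0.5 = 0.25. A sketch, with illustrative names:
X <- runif(10000, 0, 1)
Y <- runif(10000, 0, 1)
sumXY <- X + Y
prodXY <- X * Y
mean(sumXY); var(sumXY)   # compare to 1 and 1/6 ≈ 0.167
mean(prodXY)              # compare to 0.25
gf_dhistogram( ~ sumXY, title = "Sum of two Unif(0,1) random variables")
gf_dhistogram( ~ prodXY, title = "Product of two Unif(0,1) random variables")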
Build your own examples
You can do a similar thing with any distributions you like. It even works if X and Y have different distributions!
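For instance, here is a sketch mixing a normal and an exponential random variable (the parameters are chosen only for illustration). Rule 2 predicts \operatorname{E}(X+Y) = 10 + 1 = 11 and rule 4 predicts \operatorname{Var}(X+Y) = 4 + 1 = 5.
X <- rnorm(10000, mean = 10, sd = 2)   # Norm(10, 2): mean 10, variance 4
Y <- rexp(10000, rate = 1)             # Exp(1): mean 1, variance 1
sumXY <- X + Y
mean(sumXY); var(sumXY)                # compare to 11 and 5
gf_dhistogram( ~ sumXY, title = "Normal plus exponential")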
We can combine the rules above to build two more rules.
5. \operatorname{E}(a_1 X_1 + a_2 X_2 + \cdots + a_n X_n) = a_1 \operatorname{E}(X_1) + a_2 \operatorname{E}(X_2) + \cdots + a_n \operatorname{E}(X_n)
6. If X_1, X_2, \dots, X_n are independent, then \operatorname{Var}(a_1 X_1 + a_2 X_2 + \cdots + a_n X_n) = a_1^2 \operatorname{Var}(X_1) + a_2^2 \operatorname{Var}(X_2) + \cdots + a_n^2 \operatorname{Var}(X_n)
\operatorname{SD}(a_1 X_1 + a_2 X_2 + \cdots + a_n X_n) = \sqrt{ a_1^2 \operatorname{SD}(X_1)^2 + a_2^2 \operatorname{SD}(X_2)^2 + \cdots + a_n^2 \operatorname{SD}(X_n)^2 }
Fact 1 Any linear transformation of a normal random variable is normal.
Fact 2 Any linear combination of independent normal random variables is normal.
Example
If X \sim {\sf Norm}(1,1), Y \sim {\sf Norm}(-1,2), W \sim {\sf Norm}(5,4), and X,Y,W are independent, find the distribution of C = 2X + 3Y + W.
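Reading {\sf Norm}(\mu, \sigma) as mean and standard deviation (the rnorm() convention), rules 5 and 6 give \operatorname{E}(C) = 2(1) + 3(-1) + 5 = 4 and \operatorname{Var}(C) = 2^2 \cdot 1^2 + 3^2 \cdot 2^2 + 4^2 = 4 + 36 + 16 = 56, and Fact 2 says C is normal, so C \sim {\sf Norm}(4, \sqrt{56}). A quick simulation sketch to check:
C <- 2 * rnorm(10000, 1, 1) + 3 * rnorm(10000, -1, 2) + rnorm(10000, 5, 4)
mean(C); sd(C)     # should be close to 4 and sqrt(56) ≈ 7.48
gf_dhistogram( ~ C, title = "2X + 3Y + W")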
Fact 3 If X_1, X_2, \dots, X_n are independent random variables that have the same distribution, then the sum will be approximately normal, no matter what distribution X_i has, provided n is “large enough”.
The approximation gets better and better as n increases.
Simulation
# 10,000 independent draws from Unif(0, 1) for each of six variables
sim1 <- runif(10000, 0, 1)
sim2 <- runif(10000, 0, 1)
sim3 <- runif(10000, 0, 1)
sim4 <- runif(10000, 0, 1)
sim5 <- runif(10000, 0, 1)
sim6 <- runif(10000, 0, 1)
# sums of 2, 4, and 6 of these independent uniform random variables
sum2 <- sim1 + sim2
sum4 <- sim1 + sim2 + sim3 + sim4
sum6 <- sim1 + sim2 + sim3 + sim4 + sim5 + sim6
gf_dhistogram( ~ sum2, title = "Sum of 2 uniform random variables")
gf_dhistogram( ~ sum4, title = "Sum of 4 uniform random variables")
gf_dhistogram( ~ sum6, title = "Sum of 6 uniform random variables")
Because the exponential distribution is skewed, it takes more terms for the sum to look normal, but it still converges to normal pretty quickly.
# 10,000 independent draws from an exponential distribution (rate 1) for each variable
sim1 <- rexp(10000, 1)
sim2 <- rexp(10000, 1)
sim3 <- rexp(10000, 1)
sim4 <- rexp(10000, 1)
sim5 <- rexp(10000, 1)
sim6 <- rexp(10000, 1)
sim7 <- rexp(10000, 1)
sim8 <- rexp(10000, 1)
sum2 <- sim1 + sim2
sum4 <- sim1 + sim2 + sim3 + sim4
sum6 <- sim1 + sim2 + sim3 + sim4 + sim5 + sim6
sum8 <- sim1 + sim2 + sim3 + sim4 + sim5 + sim6 + sim7 + sim8
gf_dhistogram( ~ sum2, title = "Sum of 2 exponential random variables")
gf_dhistogram( ~ sum4, title = "Sum of 4 exponential random variables")
gf_dhistogram( ~ sum6, title = "Sum of 6 exponential random variables")
gf_dhistogram( ~ sum8, title = "Sum of 8 exponential random variables")
Often we are interested in the mean of something. The mean can be written as
\overline{X} = \frac{X_1 + X_2 + X_3 + \cdots + X_n}{n}
If each X_i is randomly selected from the same population, then the numerator is a sum of independent and identically distributed (iid) random variables, so…
Our rules tell us the mean and variance (and standard deviation) of \overline {X}.
\overline{X} will be approximately normal, provided our sample is large enough.
This means we can approximate probabilities involving \overline{X}, no matter what distribution the X_i’s come from!
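For instance, here is a quick simulation sketch: even though each observation comes from a skewed exponential distribution, the sample means (here with n = 30) already pile up in a roughly normal shape.
# 10,000 sample means, each computed from n = 30 exponential observations
xbar <- replicate(10000, mean(rexp(30, rate = 1)))
gf_dhistogram( ~ xbar, title = "Sample means, n = 30, exponential population")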
1. If each X_i has a mean of \mu and a standard deviation of \sigma, and the X_i are independent, fill in the question marks below:
\overline{X} \approx {\sf Norm}(?, ?)
This is arguably the most important result in all of statistics and is referred to as the Central Limit Theorem.
2. If X and Y are independent random variables, \operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y). What about \operatorname{Var}(X-Y)? Your first guess might be that \operatorname{Var}(X-Y) = \operatorname{Var}(X) - \operatorname{Var}(Y). But this cannot be true, since if \operatorname{Var}(Y) > \operatorname{Var}(X) we would have \operatorname{Var}(X-Y) < 0, but variance is always non-negative.
What is the correct rule? [Hint: use the rules we have.]
\operatorname{Var}(X - Y) = ??
3. Suppose X \sim {\sf Norm}(10,3), Y \sim {\sf Norm}(6,2), and X and Y are independent.
4. Suppose X is a random variable with mean = 6 and sd = 2, Y is a random variable with mean = -1 and sd = 2, and X and Y are independent.
What are the mean and standard deviation of 2X-Y?
What are the mean and standard deviation of 3X+4Y?
5. Let X and Y be independent {\sf Gamma}(shape = 3, rate = 4) random variables.
Let S = X+Y and D = X-Y. Is a gamma distribution a good fit for S? Use a simulation with 10000 repetitions to test this. For each (sum and difference):
* Use gf_fitdistr() to compare your histogram to the best fitting Gamma distribution.
* Use fitdistr() to get the shape and rate parameters for the best fit.
(A sketch of the basic call pattern appears after exercise 6.)

6. So, it looks like the sum of two independent gamma random variables with the same shape and rate also has a gamma distribution. We would not expect the difference to have a gamma distribution, since values of D can be negative and a gamma random variable cannot be negative. Use simulations to see whether it looks like D is normal.
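Here is a sketch of the call pattern for exercise 5. It assumes fitdistr() is the version in the MASS package and that dist = "dgamma" is how gf_fitdistr() names the gamma density (check ?gf_fitdistr if your version differs); judging whether the fit is good is still left to you.
X <- rgamma(10000, shape = 3, rate = 4)
Y <- rgamma(10000, shape = 3, rate = 4)
S <- X + Y
gf_dhistogram( ~ S) %>% gf_fitdistr(dist = "dgamma")   # histogram with best-fitting gamma overlaid
MASS::fitdistr(S, "gamma")                             # estimated shape and rate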
7. Suppose X \sim {\sf Gamma}(3, 4) and W \sim {\sf Gamma}(1,5) and W is independent of X. Use simulations to check whether it is reasonable to conclude that X+W has a gamma distribution.
8. Another important distribution in applications is the Chi-squared distribution. The Chi-squared distribution is a special case of the Gamma distribution. If n is a positive integer, then X has a Chi-squared distribution with n degrees of freedom if X has a Gamma distribution with shape = n/2 and rate = 1/2, i.e., {\sf Chisq}(n) = {\sf Gamma}(n/2, 1/2). The abbreviation for Chi-squared in R is chisq.
Let Z be the standard normal random variable; i.e., Z \sim {\sf Norm}(0,1). What kind of distribution does Z^2 have? One of the following is true about Z^2:
* It has a normal distribution, or
* It has a Chi-squared distribution with 1 degree of freedom.
Use simulations to determine which of these two alternatives is correct.