
Creating new random variables from old

Transformations and combinations allow us to create new random variables from old.

Examples

  • Toss a pair of fair dice. Let X be the result on the first die and Y the result on the second die.
    Then S=X+Y is the sum of the two dice and P=XY is the product.

  • Let F be the temperature of a randomly selected object in degrees Fahrenheit.
    Then C = \frac{5}{9}(F - 32) is the object’s temperature in degrees Celsius.

  • A random sample of n adult females is chosen. Let X_i be the height (in inches) of the ith person in the sample. Then \overline{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} is the sample mean.

Important Rules for Combinations and Transformations

General Question: If X is the result of combining two or more known random variables or of transforming a single random variable, what can we know about the distribution of X?



Rules for transforming and combining random variables

0. \operatorname{Var}(X) = \operatorname{E}(X^2) - \operatorname{E}(X)^2

  • Not really about transformation and combinations, but useful to remember.

1a. \operatorname{E}(X+b) = \operatorname{E}(X) + b and \operatorname{Var}(X+b) = \operatorname{Var}(X).

1b. \operatorname{E}(aX) = a \operatorname{E}(X) and \operatorname{Var}(aX) = a^2 \operatorname{Var}(X).

1. \operatorname{E}(aX + b) = a \operatorname{E}(X) + b

  • The expected value of a linear transformation is the linear transformation of the expected value.

2. \operatorname{E}(X+Y) = \operatorname{E}(X) + \operatorname{E}(Y).

  • The expected value of a sum is the sum of the expected values.

3. If X and Y are independent, then \operatorname{E}(XY) = \operatorname{E}(X)\operatorname{E}(Y).

  • The expected value of a product is the product of the expected values – provided the random variables are independent.

4. If X and Y are independent, then \operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y).

  • The variance of a sum is the sum of the variances – provided the random variables are independent.

These rules are not too difficult to prove, but we will focus mainly on how to use the rules.

Independent Random Variables

X and Y are independent random variables if the distribution of X is the same for each value of Y and vice versa. This is equivalent to saying that

\operatorname{P}(X \le x \text{ and } Y \le y) = \operatorname{P}(X \le x) \cdot \operatorname{P}(Y \le y) for all x and y.

Independence Examples

  • If a pair of fair dice are tossed, X is the value of the first die, and Y is the value of the second, then X and Y are independent.

  • If X and Y are the height and weight of a randomly selected person, then X and Y are not independent. (The distribution of weights is different for taller people compared to the distribution for shorter people.)

Examples

In the examples below we will compare simulations to the rules above.

Two Dice

Let X and Y be the values on two fair dice. Look at the sum and product: X+Y and XY.
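Here is one way such a comparison might look in R (a sketch; the names X and Y and the 10,000 repetitions are arbitrary choices). A single fair die has expected value 3.5 and variance 35/12, so the rules predict \operatorname{E}(X+Y) = 7, \operatorname{Var}(X+Y) = 35/6 \approx 5.83, and \operatorname{E}(XY) = 3.5 \cdot 3.5 = 12.25.

X <- sample(1:6, 10000, replace = TRUE)   # 10,000 rolls of the first die
Y <- sample(1:6, 10000, replace = TRUE)   # 10,000 rolls of the second die
mean(X + Y)    # rule 2 predicts 3.5 + 3.5 = 7
var(X + Y)     # rule 4 predicts 35/12 + 35/12 = 35/6, about 5.83
mean(X * Y)    # rule 3 predicts 3.5 * 3.5 = 12.25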


Uniform

Let X and Y be independent random variables that have the uniform distribution on [0,1]. Again, let’s look at the sum and product.
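A similar sketch works here. A Unif(0,1) random variable has mean 1/2 and variance 1/12, so the rules predict \operatorname{E}(X+Y) = 1, \operatorname{Var}(X+Y) = 1/6, and \operatorname{E}(XY) = 1/4.

X <- runif(10000, 0, 1)   # 10,000 draws from Unif(0, 1)
Y <- runif(10000, 0, 1)
mean(X + Y)    # rule 2 predicts 0.5 + 0.5 = 1
var(X + Y)     # rule 4 predicts 1/12 + 1/12 = 1/6, about 0.167
mean(X * Y)    # rule 3 predicts 0.5 * 0.5 = 0.25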


Build your own examples

You can do a similar thing with any distributions you like. It even works if X and Y have different distributions!
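For example, here is a sketch mixing a normal and an exponential random variable (the particular distributions and parameters are arbitrary choices).

X <- rnorm(10000, mean = 2, sd = 1)   # E(X) = 2, Var(X) = 1
Y <- rexp(10000, rate = 1)            # E(Y) = 1, Var(Y) = 1
mean(X + Y)    # rule 2 predicts 2 + 1 = 3
var(X + Y)     # rule 4 predicts 1 + 1 = 2
mean(X * Y)    # rule 3 predicts 2 * 1 = 2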

Linear Combinations

We can combine the rules above to build two more rules.


Rules for transforming and combining random variables (continued)


5. \operatorname{E}(a_1 X_1 + a_2 X_2 + \cdots + a_n X_n) = a_1 \operatorname{E}(X_1) + a_2 \operatorname{E}(X_2) + \cdots + a_n \operatorname{E}(X_n)

  • The expected value of a linear combination is the linear combination of the expected values.


6. If X_1, X_2, \dots, X_n are independent, then \operatorname{Var}(a_1 X_1 + a_2 X_2 + \cdots + a_n X_n) = a_1^2 \operatorname{Var}(X_1) + a_2^2 \operatorname{Var}(X_2) + \cdots + a_n^2 \operatorname{Var}(X_n)

  • Note the squaring.
  • We can write this in terms of standard deviation if we like

\operatorname{SD}(a_1 X_1 + a_2 X_2 + \cdots + a_n X_n) = \sqrt{ a_1^2 \operatorname{SD}(X_1)^2 + a_2^2 \operatorname{SD}(X_2)^2 + \cdots + a_n^2 \operatorname{SD}(X_n)^2 }

  • This is sometimes called the Pythagorean identity for standard deviation. The independence assumption is analogous to the assumption that the triangle has a right angle.
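As a quick check of rule 6, here is a sketch using a_1 = 2, a_2 = 3 and two independent Unif(0,1) random variables (arbitrary choices); rule 6 predicts \operatorname{Var}(2 X_1 + 3 X_2) = 4 \cdot \frac{1}{12} + 9 \cdot \frac{1}{12} = \frac{13}{12}.

X1 <- runif(10000, 0, 1)
X2 <- runif(10000, 0, 1)
var(2 * X1 + 3 * X2)    # rule 6 predicts 13/12, about 1.08
sd(2 * X1 + 3 * X2)     # sqrt(13/12), about 1.04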

Normal Distributions are special

Fact 1 Any linear transformation of a normal random variable is normal.

Fact 2 Any linear combination of independent normal random variables is normal.

Example
If X \sim {\sf Norm}(1,1), Y \sim {\sf Norm}(-1,2), W \sim {\sf Norm}(5,4), and X,Y,W are independent, find the distribution of C = 2X + 3Y + W.
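One way to work this out (reading the second parameter as the standard deviation, as in R): rules 5 and 6 give \operatorname{E}(C) = 2(1) + 3(-1) + 5 = 4 and \operatorname{Var}(C) = 2^2 \cdot 1 + 3^2 \cdot 4 + 1 \cdot 16 = 56, and Facts 1 and 2 say C is normal, so C \sim {\sf Norm}(4, \sqrt{56}). A quick simulation sketch to check:

X <- rnorm(10000, 1, 1)
Y <- rnorm(10000, -1, 2)
W <- rnorm(10000, 5, 4)
C <- 2 * X + 3 * Y + W
mean(C)    # should be close to 4
sd(C)      # should be close to sqrt(56), about 7.48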



Fact 3 If X_1, X_2, \dots, X_n are independent random variables that have the same distribution, then the sum will be approximately normal, no matter what distribution X_i has, provided n is “large enough”.
The approximation gets better and better as n increases.

Simulation

library(ggformula)   # provides gf_dhistogram(), if not already loaded

# 10,000 independent draws from each of six Unif(0, 1) random variables
sim1 <- runif(10000, 0, 1)
sim2 <- runif(10000, 0, 1)
sim3 <- runif(10000, 0, 1)
sim4 <- runif(10000, 0, 1)
sim5 <- runif(10000, 0, 1)
sim6 <- runif(10000, 0, 1)
sum2 <- sim1 + sim2
sum4 <- sim1 + sim2 + sim3 + sim4
sum6 <- sim1 + sim2 + sim3 + sim4 + sim5 + sim6
gf_dhistogram( ~ sum2, title = "Sum of 2 uniform random variables")

gf_dhistogram( ~ sum4, title = "Sum of 4 uniform random variables")

gf_dhistogram( ~ sum6, title = "Sum of 6 uniform random variables")

Convergence takes longer for a skewed distribution like the exponential, but the sum still looks approximately normal pretty quickly.

# 10,000 independent draws from each of eight Exp(rate = 1) random variables
sim1 <- rexp(10000, 1)
sim2 <- rexp(10000, 1)
sim3 <- rexp(10000, 1)
sim4 <- rexp(10000, 1)
sim5 <- rexp(10000, 1)
sim6 <- rexp(10000, 1)
sim7 <- rexp(10000, 1)
sim8 <- rexp(10000, 1)
sum2 <- sim1 + sim2
sum4 <- sim1 + sim2 + sim3 + sim4
sum6 <- sim1 + sim2 + sim3 + sim4 + sim5 + sim6
sum8 <- sim1 + sim2 + sim3 + sim4 + sim5 + sim6 + sim7 + sim8
gf_dhistogram( ~ sum2, title = "Sum of 2 exponential random variables")

gf_dhistogram( ~ sum4, title = "Sum of 4 exponential random variables")

gf_dhistogram( ~ sum6, title = "Sum of 6 exponential random variables")

gf_dhistogram( ~ sum8, title = "Sum of 8 exponential random variables")

What does this have to do with statistics?

Often we are interested in the mean of something. The mean can be written as

\overline{X} = \frac{X_1 + X_2 + X_3 + \cdots + X_n}{n}

If each X_i is randomly selected from the same population, then the numerator is a sum of independent and identically distributed (iid) random variables, so…


  1. Our rules tell us the mean and variance (and standard deviation) of \overline {X}.

  2. \overline{X} will be approximately normal, provided our sample is large enough.

  3. This means we can approximate probabilities involving \overline{X}, no matter what distribution the X_i’s come from! (A small simulation illustrating this is sketched below.)
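Here is a small sketch of this idea (the sample size of 30, the 5000 repetitions, and the exponential population are arbitrary choices): take repeated samples from a decidedly non-normal population and look at the distribution of the sample means.

library(ggformula)   # for gf_dhistogram(), if not already loaded
xbar <- replicate(5000, mean(rexp(30, rate = 1)))   # 5000 sample means, each from n = 30
gf_dhistogram( ~ xbar, title = "Sample means from an exponential population (n = 30)")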


Some Practice

1. If each X_i has a mean of \mu and a standard deviation of \sigma, and the X_i are independent, fill in the question marks below:

\overline{X} \approx {\sf Norm}(?, ?)

This is arguably the most important result in all of statistics and is referred to as the Central Limit Theorem.

2. If X and Y are independent random variables, \operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y). What about \operatorname{Var}(X-Y)? Your first guess might be that \operatorname{Var}(X-Y) = \operatorname{Var}(X) - \operatorname{Var}(Y). But this cannot be true, since if \operatorname{Var}(Y) > \operatorname{Var}(X) we would have \operatorname{Var}(X-Y) < 0, but variance is always non-negative.

What is the correct rule? [Hint: use the rules we have.]

\operatorname{Var}(X - Y) = ??

3. Suppose X \sim {\sf Norm}(10,3), Y \sim {\sf Norm}(6,2), and X and Y are independent.

  1. What is the distribution of 2X - Y?
  2. What is the distribution of X+Y?
  3. What is \operatorname{P}(X \le 4)?
  4. What is \operatorname{P}(Y \ge 2)?
  5. What is \operatorname{P}(1 \le 2X - Y \le 4)?
  6. What is \operatorname{P}(X - Y \le 4)?
  7. What is \operatorname{P}(X \le Y)? (Hint: \operatorname{P}(X \le Y) = \operatorname{P}(X-Y \le 0). Use the distribution of X-Y.)

4. Suppose X is a random variable with mean = 6 and sd = 2, Y is a random variable with mean = -1 and sd = 2, and X and Y are independent.

  1. What are the mean and standard deviation of 2X-Y?

  2. What are the mean and standard deviation of 3X+4Y?

5. Let X and Y be independent {\sf Gamma}(shape = 3, rate = 4) random variables.
Let S = X+Y and D = X-Y. Is a gamma distribution a good fit for S? Use a simulation with 10000 repetitions to test this. For each (sum and difference):

6. So, it looks like the sum of two independent gamma random variables with the same shape and rate also has a gamma distribution.

We would not expect the difference to have a gamma distribution, since values of D can be negative and a gamma RV cannot be negative. Use simulations to see whether it looks like D is normal.

7. Suppose X \sim {\sf Gamma}(3, 4) and W \sim {\sf Gamma}(1,5) and W is independent of X. Use simulations to check whether it is reasonable to conclude that X+W has a gamma distribution.

8. Another important distribution in applications is the Chi-squared distribution. The Chi-squared distribution is a special case of the Gamma distribution. If n is a positive integer, then X has a Chi-square distribution with n degrees of freedom if X has a Gamma distribution with shape = n/2 and rate = 1/2, i.e., {\sf Chisq}(n) = {\sf Gamma}(n/2,1/2). The abbreviation for Chi-squared in R is chisq.
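To see this relationship concretely, here is a sketch comparing the two density functions in R at a few (arbitrarily chosen) values; they should agree exactly:

x <- c(0.5, 1, 2, 5)
dchisq(x, df = 3)                      # Chisq(3) density
dgamma(x, shape = 3/2, rate = 1/2)     # Gamma(3/2, 1/2) density, same values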

Let Z be the standard normal random variable; i.e., Z \sim {\sf Norm}(0,1). What kind of distribution does Z^2 have? One of the following is true about Z^2:

  • It has a normal distribution, or
  • It has a Chi-squared distribution with 1 degree of freedom.

Use simulations to determine which of these two alternatives is correct.