Remember that the standard error is the standard deviation of a null distribution or a sampling distribution. If we are using randomization or bootstrap, we can estimate the standard error by simply taking the standard deviation of the resulting distribution. But in many situations, there are formulas for the standard error we can use to avoid simulation.
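As a rough illustration of that comparison, the sketch below bootstraps a sample mean and sets the simulation-based standard error next to the formula-based one. The data are simulated purely for illustration (they do not come from the course), and the variable names are hypothetical.

```r
# Illustrative only: simulated data, not from the course.
set.seed(1)
x <- rnorm(50, mean = 10, sd = 3)   # a made-up sample of size 50

# Bootstrap: resample with replacement and record the mean each time
boot_means <- replicate(5000, mean(sample(x, replace = TRUE)))

se_boot    <- sd(boot_means)            # SE estimated by simulation
se_formula <- sd(x) / sqrt(length(x))   # SE from the one-mean formula, s in place of sigma

c(bootstrap = se_boot, formula = se_formula)
```

The two values should be close but not identical, since the bootstrap value carries some simulation noise.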
Standard error measures sample-to-sample variability. This variability explains why samples, even under ideal conditions, don’t match populations exactly, and helps us quantify how much we can learn about the population from our sample. But standard error does not take into account other reasons why our sample might not match the population. For example, standard error does not measure problems like these:
Some of these problems can be addressed by more complicated statistical procedures, but the methods we have developed in this course are generally valid only when ideal statistical conditions are met.
Here are our four SE formulas arranged in a table based on the parameter of interest and the number of groups. When there are two groups, we are interested in the difference in proportions or difference in means.
parameter type | one group | two groups |
---|---|---|
proportion | \(\displaystyle SE = \sqrt{\frac{p (1-p)}{n}}\) | \(\displaystyle SE = \sqrt{ \frac{p_1 (1-p_1)}{n_1} + \frac{p_2 (1-p_2)}{n_2}}\) |
mean | \(\displaystyle SE = \frac{\sigma}{\sqrt{n}}\) | \(\displaystyle SE = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\) |
The table above will be provided on Test 3 and on the final exam.
The formulas above depend on population parameters (\(p\) and \(\sigma\)) that we do not know. But we can estimate them, substituting in a value that comes either from the null hypothesis or from our sample data.
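A minimal sketch of that substitution: the four formulas written as small R functions, with estimates (a sample proportion for \(p\), a sample standard deviation for \(\sigma\)) passed in place of the unknown parameters. The function names and the numbers are made up for illustration.

```r
# SE formulas from the table; the function names are hypothetical.
# Pass in estimates (p-hat, s) or null-hypothesis values for the parameters.
se_one_prop  <- function(p, n) sqrt(p * (1 - p) / n)
se_two_props <- function(p1, n1, p2, n2) {
  sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
}
se_one_mean  <- function(s, n) s / sqrt(n)
se_two_means <- function(s1, n1, s2, n2) sqrt(s1^2 / n1 + s2^2 / n2)

# Made-up numbers for illustration
se_one_prop(p = 0.5, n = 100)                      # sqrt(0.25 / 100) = 0.05
se_one_mean(s = 12, n = 36)                        # 12 / 6 = 2
se_two_means(s1 = 3, n1 = 9, s2 = 4, n2 = 16)      # sqrt(1 + 1)
```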
When we have a paired design, the first step is to convert two variables into a single variable (usually by subtraction, but sometimes a ratio is used instead). After we do that conversion, we are left with a single quantitative variable and we are interested in the mean of that variable. So the paired situation is just a special case (with an extra step) of the situation for one or two means.
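The extra step can be sketched in R as follows. The measurements are invented for illustration; the point is only that a one-sample analysis of the differences agrees with R's paired analysis of the two original variables.

```r
# Made-up paired measurements (the same 8 subjects measured twice)
before <- c(12.1, 11.4, 13.0, 12.7, 11.9, 12.5, 13.3, 12.0)
after  <- c(11.8, 11.0, 12.9, 12.1, 11.7, 12.6, 12.8, 11.5)

d    <- after - before               # the extra step: one quantitative variable
se_d <- sd(d) / sqrt(length(d))      # one-mean SE formula applied to the differences

one_sample <- t.test(d)                          # one-sample t on the differences
paired     <- t.test(after, before, paired = TRUE)

# Both approaches give the same test statistic
c(one_sample = unname(one_sample$statistic), paired = unname(paired$statistic))
```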
We won’t learn the standard error formulas for the intercept, slope, or predictions of a linear model, but output from statistical software provides those values for us or uses them to do other calculations (like p-values and confidence intervals). The standard error formulas are similar to the ones above (the numerator includes a measure of variability in the population and the denominator includes a measure of sample size) but more complicated.
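As a rough sketch of where software reports these values, the example below fits a linear model to simulated data (not course data) and pulls the slope's standard error out of the coefficient table. It then builds an interval from that standard error by hand and compares it with `confint()`. The variable names are hypothetical.

```r
# Illustrative only: simulated data, not from the course.
set.seed(2)
x <- runif(40, min = 60, max = 75)
y <- -70 + 3.5 * x + rnorm(40, sd = 20)

fit <- lm(y ~ x)
coef_table <- summary(fit)$coefficients   # columns: Estimate, Std. Error, t value, Pr(>|t|)

est    <- coef_table["x", "Estimate"]
se     <- coef_table["x", "Std. Error"]   # SE computed by the software
t_star <- qt(0.975, df = df.residual(fit))

manual_ci <- c(lower = est - t_star * se, upper = est + t_star * se)
manual_ci
confint(fit, "x", level = 0.95)           # should match manual_ci
```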
Chi-squared (goodness of fit and 2-way tables) is a bit different from the other situations we have covered.
The Chi-squared test statistic can be computed from a table of counts using
\[X^2 = \sum \frac{(\mathrm{observed} - \mathrm{expected})^2}{\mathrm{expected}} \;.\]
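For a goodness of fit test, this formula can be applied directly to a vector of counts. The sketch below uses made-up counts and made-up null proportions, and checks the hand computation against `chisq.test()`.

```r
observed <- c(30, 50, 20)             # made-up counts
null_p   <- c(0.25, 0.50, 0.25)       # made-up hypothesized proportions
expected <- sum(observed) * null_p    # 25, 50, 25

# (30-25)^2/25 + (50-50)^2/50 + (20-25)^2/25 = 1 + 0 + 1
X2 <- sum((observed - expected)^2 / expected)
X2

chisq.test(observed, p = null_p)$statistic   # same statistic from R
```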
In each scenario below,
determine what variables you would need to collect and whether each is categorical or quantitative,
determine the parameter(s) of interest, and state the null and alternative hypotheses if you are going to do a hypothesis test,
determine whether you are using a paired design.
Sometimes there may be more than one way to design the study, but don’t design a poor study when a better option is available.
You want to know what proportion of Calvin students got a flu shot this year.
You want to know whether male students or female students were more likely to get a flu shot this year.
You want to know which of three diet plans is most effective at helping people lose weight.
You want to know whether rhubarb grows faster or slower if you cover it with a bucket for 3 weeks.
You want to know whether people can swim faster if they wear wet suits.
You want to know if there is an association between education level and smoking.
Make up additional examples for any of the scenarios we have covered in class that were not used above. (You can also make up additional examples like these just to practice identifying the correct analysis method for a study.)
Write down the word equation for a test statistic based on a standard error and a normal or t distribution.
Write down the word equation for a confidence interval based on a standard error and a normal or t distribution.
In each of the situations in problems 9 and 10, compute a p-value or confidence interval (or both).
In a study to compare the endurance of male and female mice, mice were made to swim in a bucket with a weight attached to their tail and rescued when they became exhausted. The table below gives some information about the distribution of these “times to exhaustion” (in minutes).
sex | n | mean | sd |
---|---|---|---|
female | 162 | 11.4 | 26.09 |
male | 135 | 6.7 | 6.69 |
Use the data in StudentSurvey (from Lock5withR) to answer the following questions. In each case, some summary output is provided. That should be all you need.
(This is a sample of students from one particular university, so we can only generalize results to students at that university or perhaps to students at “similar universities” – and then only if the sample was reasonably representative.)
Do men exercise more than women? How much more?
response | Sex | min | Q1 | median | Q3 | max | mean | sd | n | missing |
---|---|---|---|---|---|---|---|---|---|---|
Exercise | Female | 0 | 4 | 7 | 12 | 27 | 8.11 | 5.199 | 168 | 1 |
Exercise | Male | 0 | 5 | 10 | 14 | 40 | 9.876 | 6.069 | 193 | 0 |
Are men more likely to be smokers than women? If so, how much more likely?
response | Sex | prop_No | prop_Yes | n |
---|---|---|---|---|
Smoke | Female | 0.9053 | 0.09467 | 169 |
Smoke | Male | 0.8601 | 0.1399 | 193 |
Give a 95% confidence interval for the slope of a regression of weight on height for the men based on this study. How would you interpret this slope?
coefficient | Estimate | Std. Error | t value | Pr(>\|t\|) |
---|---|---|---|---|
(Intercept) | -69.06 | 44.17 | -1.564 | 0.1196 |
Height | 3.496 | 0.6228 | 5.613 | 7.119e-08 |
What is the average number of piercings for women at this university?
response | Sex | min | Q1 | median | Q3 | max | mean | sd | n | missing |
---|---|---|---|---|---|---|---|---|---|---|
Piercings | Female | 0 | 2 | 3 | 5 | 10 | 3.379 | 1.991 | 169 | 0 |
Piercings | Male | 0 | 0 | 0 | 0 | 5 | 0.1719 | 0.7566 | 192 | 1 |
Since we have the data for problem 10, we could get R to do all the work using t.test(), lm(), and prop.test().[1] Do that.
(Reminder: To get just the men for part c, you can use StudentSurvey %>% filter(...).)
For part c, we should really perform some checks to make sure that the linear model is OK to use here. What four things are we looking for? Perform the checks and state your conclusions.
Go back to some of the situations in problems 9 and 10 and create a randomization or bootstrap distribution, then use sd() to estimate the standard error. How do these results compare with the answers you got using formulas?
[1] prop.test() is slightly different from the method we learned. (1) It uses a Chi-squared statistic instead of a z statistic. Chi-squared is just the square of z when df = 1. (2) By default, prop.test() uses a “continuity correction” to improve its accuracy. If you turn that off with correct = FALSE, then the results will be more similar to your hand calculations.
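The relationship between the two statistics can be checked with a small sketch. The counts below are made up. The hand calculation assumes the null hypothesis value (the pooled sample proportion) is substituted for both \(p_1\) and \(p_2\) in the two-proportion SE formula.

```r
# Made-up counts: successes and sample sizes in two groups
x <- c(30, 45)
n <- c(100, 100)

p_hat  <- x / n
p_pool <- sum(x) / sum(n)    # null hypothesis: both groups share one proportion

# Two-proportion SE formula with the pooled estimate used for p1 and p2
se0 <- sqrt(p_pool * (1 - p_pool) / n[1] + p_pool * (1 - p_pool) / n[2])
z   <- (p_hat[1] - p_hat[2]) / se0

res <- prop.test(x, n, correct = FALSE)   # continuity correction turned off

c(z_squared = z^2, X_squared = unname(res$statistic))   # these should agree
c(p_from_z = 2 * pnorm(-abs(z)), p_from_prop_test = res$p.value)
```

With the default correct = TRUE, the Chi-squared statistic would be somewhat smaller than the square of z.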