Randall Pruim

eCOTS 2014

- Often used on first day of class
- Story
  - A woman claims she can tell whether the milk was poured into the tea or the tea into the milk.
- Question: How do we test this claim?
- Have her taste some tea prepared each way. (Flip a coin to decide which way.)
- How many cups? (Let's try 10.)
- How do we evaluate her score?

Have students guess a sequence of 10 heads and tails.

Flip a coin 10 times and compare.

Before the reveal, ask students:

- What do you expect your score to be?
- What do you think the best score in the class will be?
- Is a perfect score possible just by guessing? Likely?
- What about 9 out of 10?
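
For reference, the last two questions have exact answers if we assume each guess is an independent fair coin flip, so that the score $X$ follows a Binomial(10, 0.5) distribution (that modeling assumption is ours, matching the coin-flip setup):

```latex
\begin{align*}
P(X = 10) &= \binom{10}{10}\left(\tfrac{1}{2}\right)^{10} = \frac{1}{1024} \approx 0.001 \\
P(X \ge 9) &= \frac{\binom{10}{9} + \binom{10}{10}}{2^{10}} = \frac{11}{1024} \approx 0.011
\end{align*}
```

So a perfect score by guessing is possible but rare, and 9 or better happens for roughly 1 guesser in 100.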

Use `rflip()` to simulate flipping coins:

```
rflip()
```

```
Flipping 1 coin [ Prob(Heads) = 0.5 ] ...
H
Number of Heads: 1 [Proportion Heads: 1]
```

Faster if we flip multiple coins at once:

```
rflip(10)
```

```
Flipping 10 coins [ Prob(Heads) = 0.5 ] ...
H H H T T T H H H T
Number of Heads: 6 [Proportion Heads: 0.6]
```

- It is easier to consider `heads` = correct and `tails` = incorrect than to compare with a given pattern.
  - This switch bothers me more than it bothers my students.

`rflip(10)` simulates 1 lady tasting 10 cups 1 time.

We can do that many times to see how guessing ladies do:

```
do(2) * rflip(10)
```

```
n heads tails prop
1 10 3 7 0.3
2 10 5 5 0.5
```

- `do()` is clever about what it remembers.
- 2 isn't many; we'll do many next.

```
Ladies <- do(5000) * rflip(10)
head(Ladies, 1)
```

```
n heads tails prop
1 10 4 6 0.4
```

```
histogram( ~ heads, data=Ladies, width=1 )
```

```
tally( ~(heads >= 9) , data=Ladies)
```

```
TRUE FALSE
63 4937
```

```
tally( ~(heads >= 9) , data=Ladies,
format="prop")
```

```
TRUE FALSE
0.0126 0.9874
```

```
prop( ~(heads >= 9), data=Ladies)
```

```
TRUE
0.0126
```
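
As a cross-check on the simulated proportion, the same experiment can be sketched in base R, assuming each guessing lady's score is Binomial(10, 0.5). `rbinom()` and `pbinom()` are base R functions; the variable names and the seed are illustrative, and the simulated value will vary with the seed.

```r
# Base R sketch (no mosaic needed): 5000 guessing ladies, 10 cups each.
set.seed(2014)  # arbitrary seed, only for reproducibility

# Number of correct guesses for each simulated lady
heads <- rbinom(5000, size = 10, prob = 0.5)

# Simulated proportion of ladies scoring 9 or 10 by guessing
sim_prop <- mean(heads >= 9)

# Exact probability: P(X >= 9) = P(X > 8) for X ~ Binomial(10, 0.5)
exact_prop <- pbinom(8, size = 10, prob = 0.5, lower.tail = FALSE)

sim_prop
exact_prop  # equals 11/1024, about 0.0107
```

The simulated proportion should land close to the exact value, which is the point of the exercise: simulation gets students to the right answer before they know any formulas.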

- Do it for your data
- Do it for “random” data
- Do it lots of times for “random” data

- The definition of “random” is important, but can often be handled by `shuffle()` or `resample()`.
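
Roughly speaking, `shuffle()` permutes its input and `resample()` samples with replacement. A base R sketch of the same ideas using `sample()` follows; the `sex` vector and `toy` data frame are made-up toy data, not `HELPrct`.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Hypothetical toy labels
sex <- c("female", "female", "male", "male", "male")

# Like shuffle(sex): same values, random order (sampling without replacement)
shuffled <- sample(sex)

# Like resample(sex): same length, drawn with replacement,
# so some values may repeat and others may be left out
resampled <- sample(sex, replace = TRUE)

# Resampling whole rows of a data frame, as a bootstrap does
toy <- data.frame(id = 1:5, sex = sex)
boot_rows <- toy[sample(nrow(toy), replace = TRUE), ]
```

Shuffling breaks any link between a variable and the rest of the data (useful for a null hypothesis); resampling mimics drawing a new sample from the population (useful for intervals).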

```
diffmean(age ~ sex, data=HELPrct)
```

```
diffmean
-0.7841
```

```
do(1) *
diffmean(age ~ shuffle(sex), data=HELPrct)
```

```
diffmean
1 1.798
```

```
Null <- do(5000) *
diffmean(age ~ shuffle(sex), data=HELPrct)
```

```
prop( ~(abs(diffmean) > 0.7841), data=Null )
```

```
TRUE
0.3586
```
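
The three steps (your data, shuffled data, many shuffles) can be written out in base R to show what `do()` and `shuffle()` automate. This is only a sketch on simulated toy data: the `age` and `sex` vectors are invented for illustration (not `HELPrct`), and `diff_mean()` is a hypothetical helper, not a mosaic function.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Hypothetical toy data: two groups drawn from the same distribution
sex <- rep(c("female", "male"), times = c(40, 60))
age <- rnorm(100, mean = 35, sd = 8)

# Hypothetical helper: mean of the second group minus mean of the first
# (groups are ordered alphabetically, so this is male minus female)
diff_mean <- function(y, g) {
  m <- tapply(y, g, mean)
  unname(m[2] - m[1])
}

# 1. Do it for your data
observed <- diff_mean(age, sex)

# 2. and 3. Do it for shuffled labels, lots of times
null_stats <- replicate(5000, diff_mean(age, sample(sex)))

# Two-sided p-value: how often is a shuffled difference at least as extreme?
p_value <- mean(abs(null_stats) >= abs(observed))
p_value
```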

```
histogram(~ diffmean, data=Null, v=-.7841)
```

```
Bootstrap <- do(5000) *
diffmean(age~sex, data= resample(HELPrct))
histogram( ~diffmean, data=Bootstrap,
v=-.7841, glwd=4 )
```

```
cdata(~diffmean, data=Bootstrap, p=.95)
```

```
low hi central.p
-2.4617 0.8276 0.9500
```
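
The central 95% reported by `cdata()` corresponds to the 2.5th and 97.5th percentiles of the bootstrap statistics. A base R sketch of such a percentile interval using `quantile()` follows, again on invented toy data rather than `HELPrct`; `one_boot()` is a hypothetical helper.

```r
set.seed(7)  # arbitrary seed, only for reproducibility

# Hypothetical toy data frame
toy <- data.frame(
  sex = rep(c("female", "male"), times = c(40, 60)),
  age = rnorm(100, mean = 35, sd = 8)
)

# One bootstrap replicate: resample rows with replacement, then
# compute mean(male) - mean(female) on the resampled data
one_boot <- function(d) {
  b <- d[sample(nrow(d), replace = TRUE), ]
  mean(b$age[b$sex == "male"]) - mean(b$age[b$sex == "female"])
}

boot_stats <- replicate(5000, one_boot(toy))

# Percentile interval: cut off 2.5% in each tail
ci <- quantile(boot_stats, probs = c(0.025, 0.975))
ci
```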

```
confint(Bootstrap, method="quantile")
```

```
name lower upper level method
1 diffmean -2.462 0.8276 0.95 quantile
```

```
confint(Bootstrap)  # default method uses the standard error
```

```
name lower upper level method estimate margin.of.error
1 diffmean -2.432 0.8761 0.95 stderr -0.778 1.654
```
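
The `stderr` bounds in this output are the reported estimate plus or minus the reported margin of error:

```latex
\[
-0.778 \pm 1.654
\quad\Longrightarrow\quad
(-0.778 - 1.654,\; -0.778 + 1.654) = (-2.432,\; 0.876)
\]
```

This agrees with the printed `lower` and `upper` values up to rounding. Unlike the quantile interval, it is symmetric about the estimate.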

```
do(1) * lm(width ~ length, data=KidsFeet)
```

```
Intercept length sigma r.squared
1 2.862 0.2479 0.3963 0.411
```

```
do(3) * lm( width ~ shuffle(length), data=KidsFeet)
```

```
Intercept length sigma r.squared
1 8.975 0.0007113 0.5164 3.382e-06
2 8.416 0.0232976 0.5155 3.629e-03
3 8.345 0.0261777 0.5152 4.581e-03
```

```
do(1) *
lm(width ~ length + sex, data=KidsFeet)
```

```
Intercept length sexG sigma r.squared
1 3.641 0.221 -0.2325 0.3849 0.4595
```

```
do(3) *
lm( width ~ length + shuffle(sex),
data=KidsFeet)
```

```
Intercept length sexG sigma r.squared
1 2.907 0.2486 -0.127647 0.3963 0.4271
2 2.883 0.2464 0.034996 0.4014 0.4122
3 2.849 0.2487 -0.008942 0.4018 0.4111
```

```
Null <- do(5000) *
lm( width ~ length + shuffle(sex),
data=KidsFeet)
histogram( ~ sexG, data=Null,
v=-0.2325, glwd=4)
```

```
prop( ~ (sexG <= -0.2325), data=Null )
```

```
TRUE
0.0382
```