Population, Samples, and Estimates Exercises
Exercises
For these exercises, we will be using the following dataset:
library(downloader)
url <- "https://raw.githubusercontent.com/genomicsclass/dagdata/master/inst/extdata/mice_pheno.csv"
filename <- basename(url)
download(url, destfile=filename)
dat <- read.csv(filename)
We will remove the lines that contain missing values:
dat <- na.omit( dat )
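As a small illustration of what na.omit does, the sketch below applies it to a made-up data frame. The object toy and its values are hypothetical and are not taken from mice_pheno.csv; the point is only that any row containing at least one NA is dropped.

```r
# Hypothetical toy data, only to illustrate na.omit(); not the mice data.
toy <- data.frame(Sex = c("F", "M", "M"),
                  Bodyweight = c(21.5, NA, 30.2))

# na.omit() drops every row that contains at least one NA.
toy_clean <- na.omit(toy)

nrow(toy)        # number of rows before
nrow(toy_clean)  # number of rows after
```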

Use dplyr to create a vector x with the body weight of all males on the control (chow) diet. What is this population's average?
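The filtering pattern this exercise calls for can be sketched on a small made-up data frame. This is only one possible approach: the column names Sex, Diet and Bodyweight are assumed to mirror those in the downloaded file, and the values are invented, so the result is not the answer to the exercise.

```r
library(dplyr)

# Hypothetical toy data; the column names are an assumption about the
# layout of mice_pheno.csv and the values are made up.
toy <- data.frame(
  Sex        = c("M", "M", "F", "M", "F"),
  Diet       = c("chow", "hf", "chow", "chow", "hf"),
  Bodyweight = c(30, 35, 24, 32, 27)
)

# Keep the rows matching both conditions, keep one column,
# then flatten the one-column data frame into a numeric vector.
v <- toy %>%
  filter(Sex == "M" & Diet == "chow") %>%
  select(Bodyweight) %>%
  unlist()

mean(v)  # average of the selected values
```

Swapping the toy data frame for dat and adjusting the conditions should give the vectors asked for in these exercises, provided the column names match.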
Now use the rafalib package and use the popsd function to compute the population standard deviation.
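To see why a population standard deviation is asked for rather than the result of sd, the following base R sketch computes both on a made-up vector. The values are hypothetical; the population version divides by n, while sd divides by n - 1.

```r
# Hypothetical values, treated here as a complete population.
v <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Population SD: average the squared deviations (divide by n).
pop_sd <- sqrt(mean((v - mean(v))^2))

# sd() divides by n - 1, so it is slightly larger than pop_sd.
sample_sd <- sd(v)

pop_sd
sample_sd
# With rafalib installed, rafalib::popsd(v) is expected to agree with pop_sd.
```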
Set the seed at 1. Take a random sample of size 25 from x. What is the sample average?
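A minimal sketch of the seed-then-sample step, using a made-up population in place of x (the vector pop is hypothetical):

```r
# Hypothetical population of 100 values standing in for x.
pop <- 20 + 0.2 * (0:99)

set.seed(1)            # fix the state of the random number generator
s <- sample(pop, 25)   # draw 25 values without replacement
mean(s)                # the sample average
```

The values drawn depend on the R version: R 3.6.0 changed the default sampling algorithm, so results obtained with older versions may differ. Calling set.seed(1, sample.kind = "Rounding") should reproduce the earlier behavior, at the cost of a warning.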
Use dplyr to create a vector y with the body weight of all males on the high fat (hf) diet. What is this population's average?
Now use the rafalib package and use the popsd function to compute the population standard deviation.
Set the seed at 1. Take a random sample of size 25 from y. What is the sample average?
What is the difference in absolute value between the population difference (the average of x minus the average of y) and the sample difference $\bar{X} - \bar{Y}$?
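One reading of this question is the absolute gap between the difference of the two population averages and the difference of the two sample averages. The sketch below shows that arithmetic with invented numbers; all four averages are hypothetical placeholders, not results from the data.

```r
# Hypothetical averages (not results from the mice data).
pop_mean_x  <- 30.0   # population average of x
pop_mean_y  <- 34.0   # population average of y
samp_mean_x <- 31.0   # sample average of x
samp_mean_y <- 33.5   # sample average of y

pop_diff  <- pop_mean_x - pop_mean_y     # population difference
samp_diff <- samp_mean_x - samp_mean_y   # sample difference

# Distance between the sample estimate and the population value.
gap <- abs(pop_diff - samp_diff)
gap
```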

Repeat the above for females. Make sure to set the seed to 1 before each sample call. What is the difference in absolute value between the population difference (the average of x minus the average of y) and the sample difference $\bar{X} - \bar{Y}$?
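The reason for resetting the seed before every draw can be sketched with a made-up population (the vector pop is hypothetical). Each call to sample advances the generator, so a second call without a reset starts from a different state.

```r
pop <- 1:100   # hypothetical population

set.seed(1)
a <- sample(pop, 25)
b <- sample(pop, 25)    # seed not reset: the generator state has moved on

set.seed(1)
a2 <- sample(pop, 25)   # seed reset: same starting state as for a

identical(a, a2)   # same seed, same draw
identical(a, b)    # different generator state, different draw
```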
For the females, our sample estimates were closer to the population difference than with males. What is a possible explanation for this?
 A) The population variance of the females is smaller than that of the males; thus, the sample variable has less variability.
 B) Statistical estimates are more precise for females.
 C) The sample size was larger for females.
 D) The sample size was smaller for females.