Problems for Exam 1 (this is a take-home-only exam)
Researchers wish to examine whether total agricultural output differs among
states in different regions of the U.S. They have taken a sample of three
states from each of five regions: the West, the Southwest, the Midwest, the
Southeast, and the Northeast. The data (taken from
http://www.stuffaboutstates.com/agriculture) are listed below, in millions of
dollars for 2004:
Region  State  Output
SW      NM      2.56
SW      TX     16.5
SW      OK      5.05
WE      WA      5.87
WE      UT      1.25
WE      NV      0.454
MW      IL      9.71
MW      SD      4.88
MW      OH      5.46
SE      MS      4.09
SE      NC      8.21
SE      GA      6.11
NE      PA      4.86
NE      NH      0.169
NE      MD      1.74
Write one or more sentences summarizing your conclusions for each of the
following problems. Your summary should allow a person to understand your
results even if they do not know the software that you used (SAS, R, etc.).
1. Conduct an analysis of variance (fill out the ANOVA table, including the F
statistic and P value) to test the null hypothesis of equality of agricultural
output means by region. Examine a residual-by-predicted plot and a normal plot
of the residuals to assess model assumptions.
2. For the analysis in Problem 1, regress log(standard deviation) on log(mean)
across regions to examine whether a power transformation is needed. Reanalyze
the data using a transformation suggested by the regression. Are the
assumptions better met in this analysis?
3. Suppose each region had a large set of states, and you thought that the true
mean agricultural output in these regions (in millions of dollars) was SW = 8,
WE = 2.5, MW = 7, SE = 6, and NE = 2.5. Based on your previous analyses, how
many states would need to be sampled from each region to detect these
differences with 80% power at alpha = .05?
4. In addressing these research questions, is the effect of region within the
U.S. better considered a fixed effect or a random effect?
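
For Problem 1, the following is a minimal sketch of how the ANOVA table entries could be computed in Python instead of SAS or R. It assumes NumPy and SciPy are available; the function name one_way_anova and the dictionary keys are illustrative choices, not part of the exam. It also returns the fitted values and residuals needed for the two diagnostic plots, which can be drawn with any plotting package.

```python
import numpy as np
from scipy import stats

# Agricultural output by region, copied from the data listing above.
output = {
    "SW": [2.56, 16.5, 5.05],
    "WE": [5.87, 1.25, 0.454],
    "MW": [9.71, 4.88, 5.46],
    "SE": [4.09, 8.21, 6.11],
    "NE": [4.86, 0.169, 1.74],
}


def one_way_anova(groups):
    """Return one-way ANOVA table entries plus fitted values and residuals.

    The name and return format are assumptions made for this sketch.
    """
    samples = [np.asarray(g, dtype=float) for g in groups.values()]
    pooled = np.concatenate(samples)
    grand_mean = pooled.mean()

    # Between-region and within-region sums of squares.
    ss_between = sum(len(s) * (s.mean() - grand_mean) ** 2 for s in samples)
    ss_within = sum(((s - s.mean()) ** 2).sum() for s in samples)

    df_between = len(samples) - 1
    df_within = len(pooled) - len(samples)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    f_stat = ms_between / ms_within
    p_value = stats.f.sf(f_stat, df_between, df_within)  # upper-tail area

    # In a one-way model the fitted value is the mean of the region.
    fitted = np.concatenate([np.full(len(s), s.mean()) for s in samples])
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": f_stat, "p": p_value,
        "fitted": fitted, "residuals": pooled - fitted,
    }


table = one_way_anova(output)

# Coordinates for a normal plot of the residuals: theoretical quantiles
# against the ordered residuals.
(theoretical_q, ordered_resid), _ = stats.probplot(table["residuals"])

for key in ("ss_between", "ss_within", "ms_between", "ms_within", "F", "p"):
    print(f"{key:>10}: {table[key]:.4f}")
```

Plotting table["residuals"] against table["fitted"] gives the residual-by-predicted plot, and plotting ordered_resid against theoretical_q gives the normal plot.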
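
For Problem 2, a common rule of thumb is that if the standard deviation is roughly proportional to mean^b, then raising the response to the power 1 - b (or taking logs when 1 - b is near 0) tends to stabilize the variance. The sketch below fits that regression with NumPy. Treat it as one possible reading of the problem: the function names are hypothetical, and rounding the power to the nearest half is an assumption made here for convenience, not something the exam prescribes.

```python
import numpy as np

# Agricultural output by region, copied from the data listing above.
output = {
    "SW": [2.56, 16.5, 5.05],
    "WE": [5.87, 1.25, 0.454],
    "MW": [9.71, 4.88, 5.46],
    "SE": [4.09, 8.21, 6.11],
    "NE": [4.86, 0.169, 1.74],
}


def suggested_power(groups):
    """Regress log(sd) on log(mean) across groups; return (slope, 1 - slope)."""
    means = np.array([np.mean(g) for g in groups.values()])
    sds = np.array([np.std(g, ddof=1) for g in groups.values()])  # sample sd
    slope, _intercept = np.polyfit(np.log(means), np.log(sds), 1)
    return slope, 1.0 - slope


def power_transform(y, power):
    """Apply y**power, using log(y) when the power is exactly 0."""
    y = np.asarray(y, dtype=float)
    return np.log(y) if power == 0 else y ** power


slope, power = suggested_power(output)

# Assumption: round to the nearest half so the transformation is a familiar
# one (1 = none, 0.5 = square root, 0 = log, -0.5 = reciprocal root, ...).
rounded = round(power * 2) / 2

transformed = {region: power_transform(v, rounded) for region, v in output.items()}
print(f"slope = {slope:.3f}, suggested power = {power:.3f}, rounded = {rounded}")
```

The transformed values can then be run through the same ANOVA and residual plots as in Problem 1 to compare how well the assumptions hold on each scale.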
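
For Problem 3, one way to approach the sample size question is through the noncentral F distribution: with n states per region, k regions, and error variance sigma^2, the noncentrality parameter is n * sum((mu_i - mu_bar)^2) / sigma^2, with k - 1 and k(n - 1) degrees of freedom. The sketch below assumes SciPy; the function names are hypothetical, and the value sigma2 = 10.0 is a placeholder for illustration only. It should be replaced by the error variance estimated in your own analysis, and if that analysis used a transformed scale, the hypothesized means must be expressed on the same scale.

```python
import numpy as np
from scipy import stats

# Hypothesized region means from Problem 3 (original scale).
means = [8.0, 2.5, 7.0, 6.0, 2.5]

# Placeholder error variance (an assumption for this sketch only);
# substitute the mean square error from your own ANOVA.
sigma2 = 10.0


def anova_power(means, sigma2, n, alpha=0.05):
    """Power of the one-way ANOVA F test with n observations per group."""
    mu = np.asarray(means, dtype=float)
    k = len(mu)
    df1, df2 = k - 1, k * (n - 1)
    noncentrality = n * ((mu - mu.mean()) ** 2).sum() / sigma2
    f_crit = stats.f.ppf(1 - alpha, df1, df2)
    # Probability that a noncentral F variate exceeds the critical value.
    return stats.ncf.sf(f_crit, df1, df2, noncentrality)


def required_n(means, sigma2, target=0.80, alpha=0.05, max_n=1000):
    """Smallest per-group n (at least 2) whose power reaches the target."""
    for n in range(2, max_n + 1):
        if anova_power(means, sigma2, n, alpha) >= target:
            return n
    raise ValueError("target power not reached within max_n")


n_needed = required_n(means, sigma2)
print(n_needed, anova_power(means, sigma2, n_needed))
```

Because the answer depends directly on sigma2, it is worth rerunning the search over a small range of plausible variances to see how sensitive the required sample size is.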