Problems for Exam 1 (this is a take-home-only exam)

Researchers wish to examine whether total agricultural output differs among states in different regions of the U.S. They have taken a sample of three states from each of five regions: the West (WE), the Southwest (SW), the Midwest (MW), the Southeast (SE), and the Northeast (NE). The data (taken from http://www.stuffaboutstates.com/agriculture) are listed below as region, state, and output (in millions of dollars, from 2004):

Region State Output
SW NM 2.56
SW TX 16.5
SW OK 5.05
WE WA 5.87
WE UT 1.25
WE NV .454
MW IL 9.71
MW SD 4.88
MW OH 5.46
SE MS 4.09
SE NC 8.21
SE GA 6.11
NE PA 4.86
NE NH .169
NE MD 1.74
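
As a starting point, the listing above can be read into a per-region structure and summarized. The following is a minimal Python sketch, not a required approach; the names RAW, parse, groups, and summary are illustrative assumptions, and SAS or R would serve equally well.

```python
from statistics import mean, stdev

# Data copied from the listing above: region, state, output.
RAW = """\
SW NM 2.56
SW TX 16.5
SW OK 5.05
WE WA 5.87
WE UT 1.25
WE NV .454
MW IL 9.71
MW SD 4.88
MW OH 5.46
SE MS 4.09
SE NC 8.21
SE GA 6.11
NE PA 4.86
NE NH .169
NE MD 1.74
"""

def parse(raw):
    """Return a dict mapping each region code to its list of outputs."""
    groups = {}
    for line in raw.strip().splitlines():
        region, _state, value = line.split()
        groups.setdefault(region, []).append(float(value))
    return groups

groups = parse(RAW)

# Per-region sample mean and sample standard deviation (n - 1 denominator).
summary = {region: (mean(ys), stdev(ys)) for region, ys in groups.items()}

for region, (m, s) in summary.items():
    print(f"{region}: n={len(groups[region])} mean={m:.3f} sd={s:.3f}")
```

The per-region means and standard deviations computed here are the inputs to the log(standard deviation) versus log(mean) regression asked for in Problem 2.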

Write one or more sentences summarizing your conclusions for each of the following problems. Your summary should allow a person to understand your results even if they do not know the software that you used (SAS, R, etc.).

1. Conduct an analysis of variance (fill out the ANOVA table, including the F statistic and P value) to test the null hypothesis that mean agricultural output is equal across regions. Examine a residual-by-predicted plot and a normal probability plot of the residuals to assess the model assumptions.

2. For the analysis in Problem 1, regress log(standard deviation) on log(mean), using the five regional standard deviations and means, to examine whether a power transformation is needed. Reanalyze the data using the transformation suggested by the regression. Are the assumptions better met in this analysis?

3. Suppose each region had a large set of states, and you thought that the true mean agricultural output in these regions (in millions of dollars) was SW = 8, WE = 2.5, MW = 7, SE = 6, and NE = 2.5. Based on your previous analyses, how many states would need to be sampled from each region to detect these differences with 80% power at alpha = .05?

4. In addressing these research questions, is the effect of region within the U.S. better considered a fixed effect or a random effect?
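
The arithmetic behind Problems 1 and 2 can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a replacement for the SAS or R output: the function names are hypothetical, and the rule that a slope b suggests the power 1 - b (with 0 read as the log transformation) is the usual variance-stabilizing heuristic, assumed here rather than taken from the problem statement. The P value is not computed, since it needs the upper tail of an F distribution from a statistics package (for example scipy.stats.f.sf).

```python
from math import log
from statistics import mean, stdev

# Outputs by region, copied from the data listing.
groups = {
    "SW": [2.56, 16.5, 5.05],
    "WE": [5.87, 1.25, 0.454],
    "MW": [9.71, 4.88, 5.46],
    "SE": [4.09, 8.21, 6.11],
    "NE": [4.86, 0.169, 1.74],
}

def one_way_anova(groups):
    """Return (SS_between, df_between, SS_within, df_within, F)."""
    values = [y for ys in groups.values() for y in ys]
    grand = mean(values)
    ss_between = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in groups.values())
    ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, df_between, ss_within, df_within, f_stat

def log_sd_on_log_mean_slope(groups):
    """Least-squares slope of log(sd) regressed on log(mean) across groups."""
    xs = [log(mean(ys)) for ys in groups.values()]
    zs = [log(stdev(ys)) for ys in groups.values()]
    xbar, zbar = mean(xs), mean(zs)
    sxz = sum((x - xbar) * (z - zbar) for x, z in zip(xs, zs))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxz / sxx

ss_b, df_b, ss_w, df_w, f_stat = one_way_anova(groups)
slope = log_sd_on_log_mean_slope(groups)

print(f"Region: SS={ss_b:.3f} df={df_b}")
print(f"Error:  SS={ss_w:.3f} df={df_w}")
print(f"F = {f_stat:.3f}")
print(f"slope b = {slope:.3f}, suggested power 1 - b = {1 - slope:.3f}")
```

To reanalyze on a transformed scale, apply the chosen transformation to every value in groups and call one_way_anova again.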