Stat 407/507 Fall 2022 Exam 2

Stat 407/507 Fall 2022 Exam 2 (take home part)

Bring to the in-class exam on October 17

Please remember that this is an exam, so you are to do your own work and not communicate with other students about the exam.

1. How do rent costs vary by state and the number of bedrooms? To address this question certain states were selected and two estimates of rent costs were obtained for a number of bedrooms varying from 0 to 4. The data are in this file. Here is some code to help read the data into R and SAS. Remember to show your work and comment on your results for each problem and sub-problem.

a. Use boxplots to visualize potential effects of state and number of bedrooms on rents. What do the plots suggest about group differences and model assumptions?

b. Assume that these eight states are the only ones of interest. Conduct an analysis of variance to test the null hypotheses of equality of rents by state, by number of bedrooms, and of no interaction between the two factors (show the ANOVA table, including sums of squares, mean squares, F statistic, and P value). If there is no interaction, use Tukey multiple comparison tests to compare levels of significant main effect factors. If an interaction is found, use an interaction plot and simple main effect tests to help interpret the interaction.

c. Examine a residual by predicted plot and a normal plot of the residuals to assess model assumptions of equal variance and normality.

d. Use the Box-Cox procedure to find a recommended transformation of the response. Would you reject the null hypothesis that the power parameter = 1?

e. Select a transformation of the rents, and repeat the ANOVA on the transformed data (just parts b and c above). Do the results change? Which analysis is more appropriate? What is your conclusion about the effects of state and number of bedrooms on rent?
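
As a rough illustration of the arithmetic behind the ANOVA table requested in part b, the following Python sketch computes the sums of squares, degrees of freedom, mean squares, and F statistics for a balanced two-factor design with both factors fixed. It is only a sketch: the data are made-up toy values (not the exam data), the function name `two_way_anova` is hypothetical, and in practice the analysis would be run in R or SAS as the exam intends. P values are omitted because they require an F distribution function that the Python standard library does not provide.

```python
from itertools import product
from statistics import mean

# Hypothetical toy data (NOT the exam data):
# rents[(state, bedrooms)] = two rent estimates for that cell.
rents = {
    ("X", 0): [500.0, 520.0],
    ("X", 1): [600.0, 640.0],
    ("X", 2): [750.0, 770.0],
    ("Y", 0): [800.0, 840.0],
    ("Y", 1): [950.0, 990.0],
    ("Y", 2): [1300.0, 1340.0],
}


def two_way_anova(cells):
    """Balanced two-factor ANOVA with interaction, both factors fixed."""
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # replicates per cell (assumed equal)
    a, b = len(a_levels), len(b_levels)

    grand = mean(y for ys in cells.values() for y in ys)
    a_mean = {i: mean(y for j in b_levels for y in cells[(i, j)]) for i in a_levels}
    b_mean = {j: mean(y for i in a_levels for y in cells[(i, j)]) for j in b_levels}
    cell_mean = {k: mean(v) for k, v in cells.items()}

    ss = {
        "A": b * n * sum((a_mean[i] - grand) ** 2 for i in a_levels),
        "B": a * n * sum((b_mean[j] - grand) ** 2 for j in b_levels),
        "A*B": n * sum(
            (cell_mean[(i, j)] - a_mean[i] - b_mean[j] + grand) ** 2
            for i, j in product(a_levels, b_levels)
        ),
        "Error": sum((y - cell_mean[k]) ** 2 for k, ys in cells.items() for y in ys),
    }
    df = {"A": a - 1, "B": b - 1, "A*B": (a - 1) * (b - 1), "Error": a * b * (n - 1)}
    ms = {k: ss[k] / df[k] for k in ss}
    # With both factors fixed, every effect is tested against the error mean square.
    f = {k: ms[k] / ms["Error"] for k in ("A", "B", "A*B")}
    return ss, df, ms, f


ss, df, ms, f = two_way_anova(rents)
for term in ss:
    f_text = f"{f[term]:.2f}" if term in f else ""
    print(f"{term:<6} df={df[term]:>2}  SS={ss[term]:>12.2f}  MS={ms[term]:>12.2f}  F={f_text}")
```

Here factor A plays the role of state and factor B the number of bedrooms. The sketch relies on the design being balanced; with unequal cell counts the sums of squares no longer decompose this simply.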

2. Listed below are the sums of squares from a completely randomized factorial experiment with three factors. Factors A and B are fixed effects with 3 and 2 levels, respectively. Factor C is a random effect with 5 levels. The design is balanced with 2 replicates for each treatment combination.

a. Complete the ANOVA table including degrees of freedom, mean squares, and a row for total SS and df.

b. Then, using a table of expected values of mean squares (linked with lecture 16), calculate the value of the F statistic for each effect and report the degrees of freedom for each F statistic. Show your formula for the F statistic as well as its numerical value. If an exact F test cannot be found, calculate an approximate F value (you do not have to calculate degrees of freedom for approximate F tests, and you do not have to find P values for the tests).

Source    DF    Sum of squares    Mean square    F
A               40.4                             F = MSA/MS? = xx.xx on y and z df
B               6.21
C               1096.4
A*B             43.3
A*C             220.5
B*C             110.4
A*B*C           76.3
Error           104.7
Total           1698.3
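
The degrees of freedom and mean squares for part a follow mechanically from the design: a main effect has (levels - 1) df, an interaction has the product of its factors' df, and error has abc(n - 1) df in a balanced layout. The Python sketch below illustrates that bookkeeping under the levels stated in the problem (3, 2, and 5 levels, 2 replicates). The variable and function names are hypothetical, and the sketch deliberately stops short of choosing F-test denominators, which depend on the expected mean squares table from the course.

```python
from math import prod

# Levels from the problem statement: A has 3, B has 2, C has 5; 2 replicates per cell.
levels = {"A": 3, "B": 2, "C": 5}
n_reps = 2

# Sums of squares copied from the table above.
ss = {"A": 40.4, "B": 6.21, "C": 1096.4, "A*B": 43.3, "A*C": 220.5,
      "B*C": 110.4, "A*B*C": 76.3, "Error": 104.7}


def effect_df(term):
    """df for a main effect or interaction: product of (levels - 1) over its factors."""
    return prod(levels[factor] - 1 for factor in term.split("*"))


df = {term: effect_df(term) for term in ss if term != "Error"}
df["Error"] = prod(levels.values()) * (n_reps - 1)  # abc(n - 1)
df_total = prod(levels.values()) * n_reps - 1       # abcn - 1

# Mean square = sum of squares / degrees of freedom.
ms = {term: ss[term] / df[term] for term in ss}

for term in ss:
    print(f"{term:<6} df={df[term]:>2}  SS={ss[term]:>8.2f}  MS={ms[term]:>8.3f}")
print(f"Total  df={df_total}")
```

A quick sanity check on any such table is that the effect and error df add up to the total df.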