Fall 2010    Stat 514     Test 3    Take Home Problems

Bring these results to the exam on December 13.  Keep answers on separate sheets, as you will hand in the solution to one or more problems. 

1. Data was collected on the per capita income (PCINC) and the percentage of the labor force employed in agriculture (AGR), industry (IND), and service occupations (SER) for 20 European countries in 1960.  We wish to develop a model to predict per capita income using the percentage of the labor force employed in agriculture and in service occupations.  Perform four multiple regression analyses of these data, using least squares, bootstrapped least squares (tests only), and robust regression (methods M and MM).  Examining the results of these analyses and using plots of the data, which analyses give similar results?  Which method(s) give the best model, and why did you choose that (those) method(s)?

2. Salary data was collected for the 2010 season for the top four major league baseball teams in each of three divisions (East, Central, and West) in the two leagues (American and National).  Consider the average salary data for each team, which is the 'avgsalary' variable in this SAS program.  Analyze the average salary data to test for the effects of league, division, and their interaction using least squares, aligned ranks, and robust regression methods.  Which method(s) are the most appropriate for analysis of these data, and what are the results for that (those) analyses?
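
As a reminder of what the aligned-ranks step involves, here is a minimal Python sketch of one common alignment, the one used when testing the interaction in a two-way layout: remove the estimated main effects from each observation, then rank the aligned values. It is an illustration under assumptions, not the required procedure. The function names `align_for_interaction` and `midranks` are hypothetical, the toy values are made up and are not the salary data, and the alignment taught in the course (or used by the SAS program) may differ in detail.

```python
from collections import defaultdict
from statistics import mean


def align_for_interaction(rows):
    """rows: list of (factor_a, factor_b, y) tuples.

    Returns y - (mean for level a) - (mean for level b) + grand mean,
    i.e. each observation with the estimated main effects removed.
    """
    grand = mean(y for _, _, y in rows)
    by_a, by_b = defaultdict(list), defaultdict(list)
    for a, b, y in rows:
        by_a[a].append(y)
        by_b[b].append(y)
    mean_a = {k: mean(v) for k, v in by_a.items()}
    mean_b = {k: mean(v) for k, v in by_b.items()}
    return [y - mean_a[a] - mean_b[b] + grand for a, b, y in rows]


def midranks(values):
    """Rank from 1..n, giving tied values the average of their ranks.

    Ties are detected by exact equality, so aligned values that differ
    only by floating-point rounding are NOT treated as tied.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


# Hypothetical, purely additive toy data (one value per league/division cell).
toy = [
    (league, division, a + b)
    for league, a in [("AL", 1.0), ("NL", 3.0)]
    for division, b in [("East", 0.5), ("Central", 2.0), ("West", 4.0)]
]
aligned = align_for_interaction(toy)  # all essentially zero: no interaction
ranks = midranks(aligned)
```

With real data, the ranks of the aligned values would then go through the usual two-way analysis, and only the interaction line of that analysis would be interpreted; main effects need their own alignments.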

3. In performing a multiple regression analysis (such as the one above involving economic data from the European countries), suppose that you are considering using either a bootstrap method or a robust regression-based method for analyzing the data.  In what situations would a bootstrap approach be preferred?  In what situations would a robust regression-based approach be preferred?