Module 1, Take-home part of the final

1) Using one of the four sets of means from Homework 2, part 1, use 1000 iterations to simulate the values of the mean-squared error (MSE) from the ANOVA analysis. Do this two ways: first using the OUTSTAT= option in PROC GLM, and second using ODS statements. Run PROC UNIVARIATE with the PLOT option for each of the two simulations, and email me your two sets of code and the PROC UNIVARIATE results ONLY. Hand in the code and UNIVARIATE printout.
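
For orientation only, here is a minimal sketch of the two capture mechanisms named above. Everything specific in it is an assumption made for illustration, not the Homework 2 setup: the group means (10, 12, 14), the error standard deviation (2), the 10 observations per group, the seed, and the data set names sim, stat1, anova2, mse1 and mse2. The OUTSTAT= data set stores the error sum of squares and degrees of freedom in its _TYPE_='ERROR' row, while the ODS table OverallANOVA carries the mean square directly in its MS column.

```sas
/* Hypothetical simulation: 3 groups, 10 obs per group, 1000 iterations. */
/* The means 10, 12, 14 and sd 2 are placeholders, NOT the Homework 2 values. */
data sim ;
  do iter = 1 to 1000 ;
    do group = 1 to 3 ;
      do rep = 1 to 10 ;
        y = 8 + 2*group + 2*rannor(12345) ;
        output ;
      end ;
    end ;
  end ;
run ;

/* Way 1: OUTSTAT= data set. The ERROR row holds SS and DF, so MSE = SS/DF. */
proc glm data=sim noprint outstat=stat1 ;
  by iter ;
  class group ;
  model y = group ;
run ;
quit ;

data mse1 ;
  set stat1 ;
  where _type_ = 'ERROR' ;
  mse = ss / df ;
run ;

/* Way 2: ODS OUTPUT. NOPRINT would suppress the ODS tables, so the */
/* printed output is turned off with ODS EXCLUDE instead.           */
ods exclude all ;
ods output OverallANOVA=anova2 ;
proc glm data=sim ;
  by iter ;
  class group ;
  model y = group ;
run ;
quit ;
ods exclude none ;

data mse2 ;
  set anova2 ;
  where Source = 'Error' ;
  mse = MS ;
run ;

/* Distribution of the 1000 simulated MSE values from each approach */
proc univariate data=mse1 plot ;
  var mse ;
run ;

proc univariate data=mse2 plot ;
  var mse ;
run ;
```

Because both approaches read the same simulated data set in this sketch, the two UNIVARIATE summaries should agree; with separately generated data they would differ only by simulation noise.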

2) Each of the three variables in this data set is highly skewed and needs a log transformation. However, 0 values occur in the data. One often-used approach in this case is to add a small constant to each data point before taking the log. Use ARRAY statements along with other DATA step statements and/or PROCs to create new variables which are equal to the log of (original value + (minimum of variable)/4 ) for each of these three variables.
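
As a hedged illustration of the ARRAY mechanics involved, the sketch below assumes a data set named raw with variables v1, v2 and v3; those names, and the output names mins, logged and log_v1-log_v3, are hypothetical. It also assumes that "minimum of variable" means the smallest positive value, since a minimum of 0 would leave log(0) undefined for the zero observations; that reading is an assumption, not something stated in the problem.

```sas
/* Pass 1: smallest positive value of each variable (assumed interpretation). */
/* RAW, V1-V3 and MIN1-MIN3 are hypothetical names.                          */
data mins ;
  set raw end=last ;
  array orig{3} v1 v2 v3 ;
  array vmin{3} min1-min3 ;
  retain min1-min3 ;
  do j = 1 to 3 ;
    /* MIN() ignores missing arguments, so the first positive value starts the running minimum */
    if orig{j} > 0 then vmin{j} = min(vmin{j}, orig{j}) ;
  end ;
  if last then output ;
  keep min1-min3 ;
run ;

/* Pass 2: attach the minima to every row and take log(value + minimum/4). */
data logged ;
  if _n_ = 1 then set mins ;
  set raw ;
  array orig{3} v1 v2 v3 ;
  array vmin{3} min1-min3 ;
  array lg{3} log_v1 log_v2 log_v3 ;
  do j = 1 to 3 ;
    lg{j} = log(orig{j} + vmin{j}/4) ;
  end ;
  drop j min1-min3 ;
run ;
```

Reading mins with a conditional SET on the first iteration keeps min1-min3 available on every row, because variables read by SET are retained automatically.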

3) Below is a SAS program to do a regression simulation, along with part of its output. Something is wrong with the output: report what is wrong with it, and which mistake in the program causes the problem. Although you can enter the code yourself to find or verify the answer, try to do it first just by looking at this information.

data regsim ;
  do i = 1 to 100 ;
    x = 15 + 2*rannor(0) ;
    y = 7 + 3*x + x2 + 8*rannor(0) ;
    x2 = x**2 ;
    x3 = x**3 ;
    x4 = x**4 ;
    output ;
  end ;
  proc reg ;
    model y = x ;
    model y = x x2  ;
    model y = x x2 x3  ;
    model y = x x2 x3 x4  ;
  run ;

The REG Procedure
                                 Model: MODEL2
                             Dependent Variable: y
                              Analysis of Variance

                                     Sum of           Mean
 Source                   DF        Squares         Square    F Value    Pr > F
 Model                     2     9852.15067     4926.07533       1.33    0.2681
 Error                    96         354354     3691.19013
 Corrected Total          98         364206

              Root MSE             60.75517    R-Square     0.0271
              Dependent Mean      274.38425    Adj R-Sq     0.0068
              Coeff Var            22.14237

                              Parameter Estimates
                           Parameter       Standard
      Variable     DF       Estimate          Error    t Value    Pr > |t|
      Intercept     1      136.61230      221.94699       0.62      0.5397
      x             1       13.78691       29.87413       0.46      0.6455
      x2            1       -0.29757        0.99843      -0.30      0.7663