Statistics 521 Homework #1 due by Friday, January 23

1. Find a multivariate data set that has between 5 and 15 response variables and between 30 and 200 observations. Write a SAS program to read in the data set, and write a summary of the data (no more than one page). The summary should identify all of the variables, discuss the population from which they were sampled, and describe some interesting questions that the data might address. Turn in a copy of the SAS program and your data summary. (This is essentially the same as problem 1 in Chapter 1 of our text.) I may select some of these data sets to be posted on our course web site and discussed in class. Note that this data set (or a subset of it) may serve as the data for your class project.

2. Are the measured variables in your data set from problem 1 in comparable units? Do you think the response variables should be standardized prior to performing multivariate analyses? Why or why not?

3. Are the experimental units in your data set from problem 1 likely to satisfy the independence conditions that many multivariate analyses require? Explain your answer.

4. Consider the following data matrix x:

2 4 3
2 3 4
3 5 4
1 2 6
2 6 8

Answer the following questions by doing calculations by hand. Show all work.

a. What are the values of p, N, x_(32), and x_3?

b. Calculate the mean vector and the sample covariance matrix.

c. Calculate Z, the matrix of Z scores for these data.

d. Construct a scatterplot of x_3 against x_2.

e. Construct a scatterplot of z_3 against z_2.

(This is essentially the same as problem 1.6 in our text.)
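
Problem 4 must still be worked by hand, but a short script can be used afterward to check the arithmetic in parts b and c. The following is a minimal sketch in Python (not SAS), using only the standard library. It assumes that the rows of x are observations and the columns are variables, that the sample covariance matrix uses the divisor N - 1, and that each Z score is the deviation from the column mean divided by the sample standard deviation. Check these conventions against the text. The function names (mean_vector, sample_covariance, z_scores) are made up for illustration.

```python
import math

# Data matrix from problem 4: rows are observations, columns are variables.
X = [
    [2, 4, 3],
    [2, 3, 4],
    [3, 5, 4],
    [1, 2, 6],
    [2, 6, 8],
]


def mean_vector(data):
    """Return the list of column means."""
    n = len(data)
    return [sum(col) / n for col in zip(*data)]


def sample_covariance(data):
    """Return the p x p sample covariance matrix (assumed divisor: N - 1)."""
    n = len(data)
    p = len(data[0])
    m = mean_vector(data)
    return [
        [
            sum((row[j] - m[j]) * (row[k] - m[k]) for row in data) / (n - 1)
            for k in range(p)
        ]
        for j in range(p)
    ]


def z_scores(data):
    """Return the N x p matrix of Z scores (column-wise standardization)."""
    m = mean_vector(data)
    s = sample_covariance(data)
    # Sample standard deviations are the square roots of the diagonal of S.
    sd = [math.sqrt(s[j][j]) for j in range(len(m))]
    return [[(row[j] - m[j]) / sd[j] for j in range(len(m))] for row in data]


xbar = mean_vector(X)
S = sample_covariance(X)
Z = z_scores(X)

print("mean vector:", xbar)
print("sample covariance matrix:")
for row in S:
    print(row)
print("Z scores:")
for row in Z:
    print([round(z, 4) for z in row])
```

Compare the printed values with your hand calculations. If they disagree only by a constant factor, the likely cause is a different divisor (N instead of N - 1).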