Module 3, Homework 1

1) Report on steps 1 and 2 of the Data Mining Process (from Lecture 1) for your project. In other words, describe the problem that you will address and the database that you will use to address it. Identify the variables and state how many observations the dataset contains.

2) In the SAS directory there is a STAT folder, which contains a SAMPLES folder. In that folder, find the file named REGEX; it contains many datasets and examples of using SAS PROC REG. One of the datasets, named HTWT, contains 111 records of Height, Weight, Age, and Sex for a set of children. Create a library and save this data as a SAS data file in your library. Fit a regression model (Weight = Height Age Sex) that predicts Weight from the other three variables, once using ordinary SAS code in the Editor window and once using Enterprise Miner. Verify that the two analyses give the same results. Hand in both sets of output. If you would like more practice: is this the best model possible?

For question 2, hand in a hard copy of your SAS code and of the results from both analyses (Editor window and Enterprise Miner).
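
As a starting point for the Editor-window part of question 2, here is a minimal sketch of the library and regression steps, not a definitive solution. It rests on several assumptions: the library name HW1 and its folder path are hypothetical, the HTWT DATA step from the REGEX file has already been run so that WORK.HTWT exists, and Sex is stored as a character variable coded 'f'/'m'. Check each of these against your own copy of the data. PROC REG accepts only numeric regressors, so the sketch builds a hypothetical 0/1 indicator named FEMALE in place of Sex.

```sas
/* Hypothetical library name and path: point this at a folder you can write to. */
libname hw1 'C:\MyProjects\hw1';

/* Assumes the HTWT DATA step from the REGEX sample file has already been
   submitted, so that WORK.HTWT exists in the current session. */
data hw1.htwt;
   set work.htwt;
   /* Assumption: Sex is character, coded 'f' or 'm'.
      FEMALE is 1 for 'f' and 0 otherwise. */
   female = (upcase(sex) = 'F');
run;

/* Ordinary least squares fit of Weight on Height, Age, and the Sex indicator. */
proc reg data=hw1.htwt;
   model weight = height age female;
run;
quit;
```

When comparing against Enterprise Miner, make sure its regression node codes Sex the same way; a different reference level changes the intercept and the sign of the Sex coefficient, even though the fitted values agree.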