Fitting nested terms in SAS and R: This lecture has code for a nested experiment. The data is from the text "Design of Experiments: Statistical Principles of Research Design and Analysis" by Robert O. Kuehl (second edition) Glucose was measured in a laboratory on three different days, with two separate runs per day. Thus runs are nested within days. SAS In SAS, to specify a nested term use parentheses. In the code below we see that run is nested within day by writing "run(day)" proc glm data = table77 ; class day run ; model glucose = day run(day) ; random day run(day) / test ; lsmeans day / e = run(day) stderr ; run ; proc mixed data = table77 ; class day run ; model glucose = ; random day run(day) ; run ; R R has more than one way to specify a nested term, depending on what package is used. We will mention three ways to specify nested effects in R models: ------------------------------------------------------------------------------ METHOD 1: In random or mixed effects models, if the factor is coded as nested, then you can specify it in the model as if it is crossed and R will recognize it as nested. In the R code for nesting for this glucose example (in a separate file), day 1 has runs 1 and 2, while day 2 has runs 3 and 4, etc. This identifies the run factor as being nested in day. In the glucose example, if day is fixed and run is nested within day (as a random effect), we could specify: table77.lmen <- lme(glucose ~ day, random = ~ 1 | run, data = table77, method = "REML") The "random = ~ 1 | run" part would typically specify a random effect that is potentially crossed (not nested) with day. Since the coding makes run nested within day, R instead knows to treat run as nested within day. The coding for the altered data below, however, uses runs 1 and 2 for each day, allowing it to be interpreted as crossed with day. ------------------------------------------------------------------------------ METHOD 2: Using the aov function to obtain nested sums of squares, but F ratios are not correct. summary(aov(glucose ~ day + Error(day/run), data = table77)) summary(aov(glucose ~ day + run %in% day, data = table77)) Either way calculates the sums of squares for run within day correctly, but they do not form the correct F ratio. Also, the two notations above are not often recognized by other packages. ------------------------------------------------------------------------------ METHOD 3: This way is reliable, but not very transparent. This way is based on the identity that: SSB(A) = SSB + SSA*B (In other words, the SS for a nested effect equals what SSB + SSA*B would have been had the factors been crossed instead of nested) Thus one way to calculate nested SS is to pretend the effect is crossed, calculate both SSB and SSA*B and add them: summary(aov(glucose ~ day + run + run:day, data = table77a)) However this is inconvenient and sometimes not possible depending on how the nested effect was coded. Instead we can use the fact that if a model is specified with an interaction but excluding a lower-order term, the software generally includes the term automatically. In other words if the software sees: summary(aov(glucose ~ day + run:day, data = table77a)) It will automatically use SSrun + SSrun:day as the result for SSrun:day. This approach works more widely than just with the aov function, many of my examples use it. Altered data for this example (the coding does not identify the run as nested): data <- matrix(scan(),ncol=3,byrow=TRUE) 1 1 42.5 1 1 43.3 1 1 42.9 1 2 42.2 1 2 41.4 1 2 41.8 2 1 48.0 2 1 44.6 2 1 43.7 2 2 42.0 2 2 42.8 2 2 42.8 3 1 41.7 3 1 43.4 3 1 42.5 3 2 40.6 3 2 41.8 3 2 41.8 data table77a <- data.frame(data) colnames(table77a) <- c("day","run","glucose") rm(data) table77a$day <- as.factor(table77a$day) table77a$run <- as.factor(table77a$run) summary(aov(glucose ~ day + run %in% day, data = table77a)) # has separate SS for run and run by day interaction summary(aov(glucose ~ day + run + run:day, data = table77a)) # since no run effect is listed, run:day is interpreted as being run +run:day summary(aov(glucose ~ day + run:day, data = table77a)) # NOTE: if the code below is used, a different set of SS will result # because the coding does not identify run as nested summary(aov(glucose ~ day + run, data = table77a))