Repeated Measures ANOVA using R

Posted on March 9, 2009 by Paul Gribble

While so-called “between-subjects” ANOVA is absolutely straightforward in R, performing repeated measures (within-subjects) ANOVA is not so obvious. I have come across at least three different ways of performing repeated measures ANOVA in R. Which method you use depends on a number of things including (a) how you wish to perform follow-up tests (e.g. Tukey tests); (b) whether you care about (and want to correct for) heterogeneity of variances of differences between groups (sphericity).

Note: to adhere to the sum-to-zero convention for effect weights, you should always do this before running anovas in R:

> options(contrasts=c("contr.sum","contr.poly"))

Put the code right at the top of your script, run it each time you start a new R session.

This matters sometimes (not always). If you don’t do it, your sum of squares calculations may not match what you get, for example, in SPSS, or in many common textbooks on ANOVA (e.g. Maxwell & Delaney). See [obtaining the same ANOVA results in R as in SPSS – the difficulties with Type II and Type III sums of squares] for a more detailed account of why this is.

Of course, if you have some principled reason why you do not want to adhere to the sum-to-zero convention for effects weights, then by all means, use some other convention. The important thing to remember is that R, by default, does NOT adhere to the sum-to-zero convention, so if you fail to use the above code snippet, your results may not look like what you get in SPSS or in common texts on ANOVA.

On to some example data and procedures:

Let’s assume we have the following dataset: one dependent variable and a single repeated-measures factor with three levels:

> dv <- c(1,3,4,2,2,3,2,5,6,3,4,4,3,5,6)
> subject <- factor(c("s1","s1","s1","s2","s2","s2","s3","s3","s3",
+ "s4","s4","s4","s5","s5","s5"))
> myfactor <- factor(c("f1","f2","f3","f1","f2","f3","f1","f2","f3",
+ "f1","f2","f3","f1","f2","f3"))
> mydata <- data.frame(dv, subject, myfactor)

1. Univariate approach using aov()

This method gives you the “usual” looking ANOVA table

> am1 <- aov(dv ~ myfactor + Error(subject/myfactor), data=mydata)
> summary(am1)
Error: subject
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals  4   12.4     3.1
Error: subject:myfactor
          Df  Sum Sq Mean Sq F value   Pr(>F)
myfactor   2 14.9333  7.4667  13.576 0.002683 **
Residuals  8  4.4000  0.5500
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

You get no test for sphericity, and if you try to use mauchly.test() you get an error message:

> mauchly.test(am1)
Error in UseMethod("mauchly.test", object) :
  no applicable method for "mauchly.test"

You also cannot use TukeyHSD() to perform posthoc Tukey tests:

> TukeyHSD(am1)
Error in UseMethod("TukeyHSD") : no applicable method for "TukeyHSD"

The way to perform posthoc tests is to form a linear contrast that tests the pairwise comparison of interest, compute an F ratio associated with the contrast, and then correct for Type-I error as you wish — e.g. if you want to use Tukey tests, convert the F into a q (studentized range statistic) and then look up a probability using ptukey(). The error term (denominator) of the F ratio is read off of the ANOVA table output. Let’s say you wish to compare the means of levels 1 and 2 of factor1:

> SScomp <- (mean(dv[myfactor=="f1"]) -
+ mean(dv[myfactor=="f2"]))^2
> dfcomp <- 1
> n <- 5 #5 subjects per group
> SSerr <- 4.4 #read off the ANOVA table
> dferr <- 8 # read off the ANOVA table
> MSerr <- SSerr / dferr
> Fcomp <- (n * SScomp/2)/MSerr
> pcomp <- 1-pf(Fcomp, dfcomp, dferr)
> Fcomp
[1] 11.63636
> pcomp
[1] 0.00920665

now compute a Tukey probability:

> qobs <- sqrt(2*Fcomp)
> ngroups <- length(levels(myfactor))
> ptk <- 1 - ptukey(qobs, ngroups, (ngroups - 1) * (n - 1))
> ptk
[1] 0.02234716

2. Univariate approach using lme()

This method fits a linear mixed-effects model, and produces output that looks more like a typical linear model that you get using lm(). First we need to load in the nlme package.

> require(nlme)
> am2 <- lme(dv ~ myfactor, random = ~1|subject/myfactor,
+ data=mydata)
> summary(am2)
Linear mixed-effects model fit by REML
 Data: NULL
       AIC      BIC    logLik
  50.62575 53.53519 -19.31288
Random effects:
 Formula: ~1 | subject
        (Intercept)
StdDev:   0.9219536
 Formula: ~1 | myfactor %in% subject

        (Intercept)  Residual
StdDev:   0.6727578 0.3120855

Fixed effects: dv ~ myfactor
            Value Std.Error DF  t-value p-value
(Intercept)   2.2 0.5291500  8 4.157611  0.0032
myfactorf2    1.6 0.4690417  8 3.411210  0.0092
myfactorf3    2.4 0.4690417  8 5.116815  0.0009
 Correlation:
           (Intr) myfct2
myfactorf2 -0.443
myfactorf3 -0.443  0.500
Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-0.48688951 -0.31208553  0.05125158  0.28645975  0.44662036 

Number of Observations: 15
Number of Groups:
              subject myfactor %in% subject
                    5                    15

Note that this output is not like a traditional ANOVA table, it is more like what you get from the lm() function. You can get it to spit out the F test for the myfactor main effect using anova():

> anova(am2)
            numDF denDF  F-value p-value
(Intercept)     1     8 60.40869  0.0001
myfactor        2     8 13.57575  0.0027

You can see the F-value and p-value for the myfactor effect are the same as in the ANOVA table output using aov() above in 1.

Note there is no test of sphericity, and mauchly.test() gives an error message:

> mauchly.test(am2)
Error in UseMethod("mauchly.test", object) :
  no applicable method for "mauchly.test"

with TukeyHSD() you also get no joy:

> TukeyHSD(am2)
Error in UseMethod("TukeyHSD") : no applicable method for "TukeyHSD"

To perform pairwise tests you can follow the same procedure as above in 1, OR you can do the following using the multcomp package and the glht() function:

> require(multcomp)
> summary(glht(am2,linfct=mcp(myfactor="Tukey")))
     Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts
Fit: lme.formula(fixed = dv ~ myfactor, data = mydata, random = ~1 |
    subject/myfactor)
Linear Hypotheses:
             Estimate Std. Error z value Pr(>|z|)
f2 - f1 == 0    1.600      0.469   3.411  0.00185 **
f3 - f1 == 0    2.400      0.469   5.117  < 1e-04 ***
f3 - f2 == 0    0.800      0.469   1.706  0.20308
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)

Note that this method essentially uses the error term from the model fit as a whole … in other words it essentially assumes the sphericity assumption holds.

3. Multivariate approach using lm() / mlm()

The first step is to reorganize the data into a matrix form where rows are subjects, and columns are levels of the repeated measures factor:

> dvm <- with(mydata, cbind(dv[myfactor=="f1"],
+ dv[myfactor=="f2"], dv[myfactor=="f3"]))
> dvm
     [,1] [,2] [,3]
[1,]    1    3    4
[2,]    2    2    3
[3,]    2    5    6
[4,]    3    4    4
[5,]    3    5    6

The next step is to define a multivariate linear model:

> mlm1 <- lm(dvm ~ 1)
> mlm1
Call:
lm(formula = dvm ~ 1)
Coefficients:
             [,1]  [,2]  [,3]
(Intercept)  2.2   3.8   4.6

Note how the coefficients are equal to the means of each group.

The next step is to define a variable that defines the design of the study:

> rfactor <- factor(c("f1","f2","f3"))
> rfactor
[1] f1 f2 f3
Levels: f1 f2 f3

The next step is to load the car package, which contains the Anova() function (note the capital A in Anova())

> library(car)

Now we define a new anova model object mlm1.aov by a function call to Anova(), and then output the summary. Note in the summary we pass a flag, multivariate=FALSE, to avoid seeing the multivariate tests:

> mlm1.aov <- Anova(mlm1, idata = data.frame(rfactor),
+ idesign = ~rfactor, type="III")
> summary(mlm1.aov, multivariate=FALSE)
Univariate Type III Repeated-Measures ANOVA Assuming Sphericity
                 SS num Df Error SS den Df      F   Pr(>F)
(Intercept) 187.267      1   12.400      4 60.409 0.001477 **
rfactor      14.933      2    4.400      8 13.576 0.002683 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Mauchly Tests for Sphericity

        Test statistic p-value
rfactor        0.26171 0.13388
Greenhouse-Geisser and Huynh-Feldt Corrections
 for Departure from Sphericity
         GG eps Pr(>F[GG])
rfactor 0.57528    0.01537 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
         HF eps Pr(>F[HF])
rfactor 0.65851    0.01086 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

So we get a traditional ANOVA table with the F test for the myfactor effect (called rfactor). Note that the F-value and p-value are the same as in the univariate approach above in 1. We also get, however, a test of sphericity (Mauchly test), and we get two corrected p-values, a Greenhouse-Geisser correction, and a Huynh-Feldt correction. Unfortunately TukeyHSD() still will not work here, so we have to do any followup tests by one of the methods outlined above in 1. or 2… OR, if we are concerned about the sphericity assumption, one may simply do a t.test on the difference scores between the two groups of interest, e.g.:

> contrastD <- dvm[,1]-dvm[,2]
> t.test(contrastD)

    One Sample t-test

data:  contrastD
t = -3.1379, df = 4, p-value = 0.03492
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 -3.0157148 -0.1842852
sample estimates:
mean of x
     -1.6

Note that this is a test uncorrected for Type-I error. You could of course transform the t-statistic into an F (since dfnum=1, F = t^2), then convert the t into a q, and compute ptukey, as above.

Good luck

About Paul Gribble

Professor The University of Western Ontario London, Ontario, Canada

View all posts by Paul Gribble →

This entry was posted in Uncategorized and tagged ANOVA, aov(), linear model, lm(), multivariate, posthoc, R, r-project, repeated measures, sphericity, statistics, Tukey, univariate, within-subjects. Bookmark the permalink.

49 Responses to Repeated Measures ANOVA using R

Pete Hurd says:

March 24, 2009 at 4:15 pm

Alas, my copy is in another city, or I’d provide the exact page numbers, but I believe Faraway’s “Extending the linear model in R” has a few pages on bootstrapping p values for effects on lme() models to avoid the thorny (http://134.148.236.121/R/help/06/05/27666.html) error term problem.
Pete Hurd says:

March 27, 2009 at 7:27 pm

A link to changes in the book I mentioned earlier of potential interest to readers of this thread: http://www.maths.bath.ac.uk/~jjf23/ELM/mixchange.pdf
Rafael Laboissière says:

May 20, 2009 at 1:16 pm

Great lab website and very informative blog, Paul. Congratulations! I hope to see you soon again.
Lucy says:

July 13, 2009 at 9:36 am

Hi,
I’m trying to perform a repeated measures ANOVA for the effect of time (in weeks) on the preferences of giraffe feeding on a particular species of tree. Would anyone have any advice on how to do this?
Thanks!
Martin O'Gorman says:

August 20, 2009 at 11:03 am

Excellent and informative blog but sadly illustrating the real need for some sort of uniformity within R. How can it be this “hard” to do something so straightforward?
Matt Dubins says:

September 1, 2009 at 10:33 pm

Wow. For the first time since I long ago started to poke around in R you’ve helped me to run a mixed effects ANOVA without having to restructure my damned dataset! Thanks!!
Ruediger Pfister says:

November 28, 2009 at 3:22 pm

Excellent information, thanks! I just want to point out an R-package “ez” by Michael Lawrence which somewhat helps to analyze mixed ANOVA models:

http://cran.at.r-project.org/web/packages/ez/index.html
or says:

November 29, 2009 at 9:22 am

Thank you for this post.
I would like to know how can I use your method of comparisons to check for simple main effects.
Paul says:

November 30, 2009 at 3:23 pm

Thanks for your question. Simple main effects are follow up tests done in designs with more than one factor, and refer to tests for the significance of one factor within the level of another factor. The easy way to do this is to essentially run a new ANOVA using only the data from the single level of the factor you’re testing within. So for example if you have two factors, A (a1,a2,a3) and B (b1,b2), and you are interested in the simple main effects of A within b1, then you would take the data corresponding to (a1b1, a2b1, a3b1) and run a new ANOVA on those data alone, ignoring the data from b2. If this were a between-subjects design then you would have to so something tricky, i.e. use the MSresiduals not from this new little ANOVA but from the original ANOVA … but for within-subjects designs, it’s better to use separate error terms for each analysis (you are essentially assuming homogeneity of variance is violated).

So to summarize, in a within-subjects design, performing a simple main effects analysis simply amounts to conducting a number of additional one-way, within-subject ANOVAs using data from only the level of the factor you’re interested in. Again, note that for doing this in a between-subjects design, you should use the MSerror term from the original 2-way ANOVA instead of the error term for the new one-way ANOVA. For within-subjects designs, this is not necessary (separate error terms are better).

Finally note that these guidelines are based on my own opinion (others may vary), which has been informed in large part by the approach taken in the Maxwell & Delaney book: Designing Experiments and Analysing Data: A Model Comparison Perspective (2nd Edition) by Scott E. Maxwell & Harold D. Delaney. Lawrence Erlbaum Associates (2003). ISBN: 0805837183
kay says:

March 7, 2010 at 8:07 am

hello,

that lab helped ma a lot, thanks!

i’ve got one question regarding formula expansion if a second fixed factor, say..

f2<-c(rep("a",9),rep("b",6))

is included. i'd like to use this design with the lme approach and post-hocs with multcomp. what would the formula look like?

thanks for any advise,
kay
kay says:

March 7, 2010 at 8:43 am

sorry, i wanted to say “between subjects factor” not “a second fixed factor”….
Pingback: Repeated measures ANOVA with r | R-statistics blog
Building Confidence in Children says:

April 23, 2010 at 12:16 pm

Good Website! I wanted to ask if I might be able to site some of your website and use a couple of items for a school assignment. Please drop me an email if its ok or not. Thanks
Pingback: Analyse von wiederholten Messungen
Paul says:

May 28, 2010 at 1:28 pm

Sure no problem
Xiao says:

June 8, 2010 at 3:58 pm

This is a great website and is very informative!.

I have a question regarding lme() and aov() though. I recently did a psycholinguistic study using a 2 by 2 within-subject design. I used aov() to do both subject and item analysis. The functions are as follows:
aov(rt~factor1*factor2 + Error(Subject/(factor1*factor2)), data=ReadingTime)
aov(rt~factor1*factor2 + Error(Item/(factor1*factor2)), data=ReadingTime)

I attempted to use lme() to do the same analysis, but I don’t know how to include factor1 and factor2 into the “random effects” part of the function. I tried the following, but R gave me an error message. I wonder if you know what is going on with my function.

summary(result<-lme(p5rt3SD~MSubj*ESubj, random=~1|Subject/(MSubj*ESubj), data=data))

Error in getGroups.data.frame(dataMix, groups) :
Invalid formula for groups
Error in summary(result <- lme(p5rt3SD ~ MSubj * ESubj, random = ~1 | :
Tinus says:

June 14, 2010 at 2:53 am

You could do all the above or you could BLOCK for the repeated measures. You want to know if there is a difference in treatment effects. Thus blocking will remove the nonsense effects of the repeated measures and only leave the treatments. Much less effort than above.

Note that the p-value given by blocks is not helpful and should not be interpreted.
Rafael Laboissière says:

August 9, 2010 at 4:33 pm

Hi Paul,

If one is interested in just doing the Mauchly’s test of sphericity, it is possible to get it directly from the data, without going trough Anova(), as illustrated below. Note that it yields the same result (W statistic and associated p-value) as showed in your post above.

> dv mlm mauchly.test (lm (mlm ~ 1), X = ~1)
Mauchly's test of sphericity Contrasts orthogonal to ~1
data: SSD matrix from lm(formula = mlm ~ 1) W = 0.2617, p-value = 0.1339

Rafael Laboissière says:

August 9, 2010 at 4:46 pm

Sorry for messing up the comments. WordPress does not seem to like the “<-" assignment inside "code" or "pre" tags. Here is my next try:

> dv <- c (1, 3, 4, 2, 2, 3, 2, 5, 6, 3, 4, 4, 3, 5, 6)
> mlm <- matrix (dv, nrow = 5, byrow = TRUE)
> mauchly.test (lm (mlm ~ 1), X = ~1)

	Mauchly's test of sphericity
	Contrasts orthogonal to
	~1


data:  SSD matrix from lm(formula = mlm ~ 1) 
W = 0.2617, p-value = 0.1339

Charles Nicholson says:

September 17, 2010 at 6:13 pm

I have question that I hope you might be able to assist with.

For my disseration, I have completed 2,430 tests to compare algorithm performance. The design is as follows: three algorithms (M1, M2, M3) were tested on problems designed with 3 factors (Supply, Cost, FixedCost) which have three levels each (High, Medium, Low); 30 random instances were created for each possible factor level (3x3x3X30 = 810 distinct problems). Each algorithm was tested on all 810 distinct problems.

I am testing for difference in solution time at every level for the algorithms. Now, regardless of method used, the factor levels themselves dominate the performance (FixedCost = Low will run 100 times faster than FixedCost = High), though this is not of interest.

For example: aov(time ~ Method)
shows a significant difference, however, I would like to know which methods (M1, M2, M3) are different from each other at every possible factor level.

If I use: mymodel <- aov(time ~ Method + Supply)
and then: TukeyHSD(myModel)
I get the Method differences overall, and the Supply differences overall, but this is not what I need.

I need Method differences overall, and Method differences for Supply = High, then Method differences for Supply = Medium, then Method difference for Supply = Low.

Is this a reasonable thing to ask?

Since I am testing each method on the same 30 problems per each factor level (810 problems total) do I need to worry about an specified error-term or not?
By the way, my Ph.D. is obviously not Stats… I designed the algorithms and appreciate your help with the analysis model.

Thank you for your time, consideration, and any insight you might offer.
Paul says:

October 27, 2010 at 7:35 pm

Sorry for the long delay! Sounds like you need to include the interaction effect in your model. Try using * instead of +
good luck!
Chunhao Tu says:

February 8, 2011 at 11:50 pm

Hi,
This is a great website and I learn a lot from the posts. I have one question. How do we calculate the sample size/power analysis for two way repeated-measures ANOVA?

many thanks
Sophia Cai says:

April 28, 2011 at 12:43 pm

I am wondering why we cannot use simultaneous error correction techniques (e.g., TukeyHSD) for withi-subjects designs. Is it because the sphericity assumption? If so, what is the reason for it?
Marcus Morrisey says:

May 30, 2011 at 2:35 pm

thanks for the clear summary, very helpful
Steph says:

December 8, 2011 at 10:08 am

Dear Paul,
Thank you for the very informative site!
I used your suggestion for preforming a TukeyHSD for repeated measure ANOVA :).
I’m writing a paper and i need to write a reference for the Tukey test. Can you tell me please a name of a book or a paper that you used for this test ?

Thanks !
Mark Lee says:

December 11, 2011 at 12:51 am

Very nicely done. My student version of SPSS would not perform Repeated Measures ANOVA and, as an R novice, I had no idea how to best structure the data set. Your protocol worked like a charm and I’ve since been able to compare it with the SPSS General Linear/Univariate approach, and the results are comparable. Thank you.
Paul says:

December 12, 2011 at 11:47 am

Hi,

Thanks for your comment.

A book I use for my graduate statistics course is:

Designing Experiments and Analysing Data: A Model Comparison Perspective (2nd Edition) by Scott E. Maxwell & Harold D. Delaney. Lawrence Erlbaum Associates (2003). ISBN: 0805837183

It has lots of material on post-hoc tests and multiple comparisons, including the Tukey test(s).
Paul says:

December 12, 2011 at 11:48 am

Hi,

Power analyses with higher-order ANOVA designs is sometimes not so simple. I can refer you to this book that I use in my graduate statistics course:

Designing Experiments and Analysing Data: A Model Comparison Perspective (2nd Edition) by Scott E. Maxwell & Harold D. Delaney. Lawrence Erlbaum Associates (2003). ISBN: 0805837183

It has some material on power analyses in higher-order designs, and it refers to other source material with more details still.

Good luck,
Paul says:

December 12, 2011 at 11:51 am

By all means, you can use Tukey tests for within-subjects designs… it’s just that some software packages don’t have that capability built-in. It’s not that big of a deal to do them by hand … or if you’re using a language like R, to program your own function to do it. I can refer you to the book I use for my graduate statistics course, which covers post-hoc tests in all sorts of ANOVA designs including within-subjects designs:

Designing Experiments and Analysing Data: A Model Comparison Perspective (2nd Edition) by Scott E. Maxwell & Harold D. Delaney. Lawrence Erlbaum Associates (2003). ISBN: 0805837183

Good luck,
TAFL says:

March 29, 2012 at 1:46 pm

Hi Paul,
Thank you for such a well written example. I am having one problem though. I have been copying your R commands, and all seems to be working up until the last step

>summary(glht(am2,linfct=mcp(myfactor=”Tukey”)))

where I get this message

Error in contrMat(table(mf[[nm]]), type = types[pm]) :
less than two groups

I have been searching for a solution, but have not found one. The only thing I could find online that might have been relevant is to check if the factors are have actually recognized as factors by R. However, I have confirmed that “myfactor” is a factor, so I don’t know how I can get around this hurdle.
Paul says:

March 29, 2012 at 2:44 pm

Strange … I’m not sure I have a magic bullet answer. You can confirm the number of levels of your “myfactor” variable using:

> levels(myfactor)
TAFL says:

March 29, 2012 at 4:16 pm

Hi Paul,
Thanks for the response. Unfortunately, the levels(myfactor) does insist there are three factors “f1” “f2” “f3”, so I guess we are stuck…if I can solve this problem, I will be sure to follow up on this post.
ZC says:

May 25, 2012 at 9:53 pm

Try factor(as.character(myfactor))
Sonja says:

August 28, 2012 at 2:16 am

Maybe you can add how your data set looks like (in a grid) to understand the first bit, how you assign the variables to factors, easier?
Dimitris Fouskakis says:

September 17, 2012 at 7:46 am

Hello Paul and thanks for the very nice website. I have a question. You are performing Tukey tests with two different ways (1. & 2.) and both of them assumes sphericity. When you compare groups 1 and 2, with your code you get a p-value of 0.02234716. When you are using the glht (2nd approach) function the p-value is 0.00185. Why the two methods lead to different p-values?
Thanks in advance, Dimitris
gshenaut says:

September 19, 2012 at 2:18 pm

I’m a new R user and I have a question. For several decades, I’ve been relying on Gary Perlman’s UNIX|Stat anova for my various research analysis needs. I’m learning R because I need to do some other kinds of analyses, and I’m starting with ANOVA just to get used to it. Anyway, the interesting thing about Perlman’s ANOVA program is that instead of manually specifying the factors, factor types, error term, and so on, as you have described here, you structure the dataset to specify each data point’s role in the design, and the program figures out what parameters to use. Here’s a description of the program’s input format: http://oldwww.acm.org/perlman/stat/doc/anova.htm

Anyway, my question is, has anyone that you know of developed an R script that can extract the design information from labels in the dataset? From this article, I suppose the script would need to be passed a parameter giving the type of ANOVA to perform (1, 2, 3) and the name of the datafile. (If not, I suppose I’m going to try to do this myself, probably using Perlman’s code as a starting point.)
Paul says:

September 19, 2012 at 2:24 pm

No, at least not that I know of. Sounds like an interesting idea though.
Paul says:

September 19, 2012 at 2:24 pm

that’s a good question….
Eliecer says:

March 4, 2013 at 5:24 pm

HI Paul, excelent explanations but I will need to do a 2-ways RM-ANOVA
can you explain how to insert another factor in the multvariate way for running RM-ANOVA?
Paul Gribble says:

March 6, 2013 at 10:58 am

Hi Eliecer,
You can find some information about doing 2-factor repeated measures anova in R, in my notes for the statistics course I teach:

http://www.gribblelab.org/neurostats/index.html

Click to access SplitPlot.pdf
Nils Ekeroth says:

May 2, 2013 at 9:38 am

Thank you for this very informative post!
I have one question/suggestion:
I don’t think that the aov-formula in section 1 is directly equivalent to the lme-formula in section 2. In your example, F-ratios and p-values are identical but this is not the case for me when I try the same thing with my data (which very much deviate from the assumption of sphericity).
I have to specify “compound symmetry” as the correlation matrix in the lme-formula to get output identical to the aov-formula. Hence, I think the lme-function should be like this:

am2 <- lme(dv ~ myfactor, random = ~1|subject, data=mydata, corr = corCompSymm(form = ~1|subject)

If I omit the "corr" part I THINK R automatically create a model which include to the most appropriate correlation matrix? In your case, that is probably "compound symmetry" in my case some other matrix is used?

I am doing a two-way mixed model with one between factor (3 levels) and one within factor (3 levels). If I remove the between-factor for the analysis, the two models (i.e. with and without the "corr" part) are nearly exactly identical.
I would really appreciate your comments on this!
Best, Nils
Paul Gribble says:

May 3, 2013 at 11:00 am

Hi Nils,
I’m afraid the issue of compound symmetry is beyond my expertise…
ange says:

May 27, 2013 at 7:15 am

Hi, someone knows how can i obtein the coefficients of the polynomial of degree n that fits a set of given points? Thanks…… :)
Rob Baer says:

May 27, 2013 at 3:05 pm

Nils,

?lme tells us that the default for the corr= argument is NULL.

correlation
an optional corStruct object describing the within-group correlation structure. See the documentation of corClasses for a description of the available corStruct classes. Defaults to NULL, corresponding to no within-group correlations.

This is appropriately set to the structure of you data set — often empirically, to minimize AIC, BIC, or optimize log-likelihood of the fit. For possible choices of var-covar structures, see vales under ?corClasses.
Nils Ekeroth says:

May 28, 2013 at 9:02 am

Thank you very much for your reply Rob,

I don’t fully understand what “no within-group correlation” (i.e. corr=NULL) means? The correlation matrix has to be ‘something’, right?

I have done what you describe with my data. That is, tested different classes of correlation structures for each dependent variable of my data set and then chosen the appropriate model by comparing log-likelihoods and AICs.

In this comparison I normally include 4 alternative correlation matrices (corr=corCAR1, corr=corCompSymm , corr=corSymm and the default corr=NULL). I have noted that the log-likelihood of the model with (corr=NULL) always seems to be exactly (down to the fifth decimal) identical to one of the alternative models. Often, but not always, the “corr=NULL” model log-likelihood is identical to that of the “corr=corCompSymm” model as seems to be the case for the data in the post by Paul G. This is why I drew the conclusion that R in some way automatically selects the best correlation structure if corr= is set to NULL, as I wrote in my original comment..

Paul – I believe that the aov-model you use implicitly assumes compound symmetry of your data. Therefore, it may be that the corCompSymm(form = ~1|subject)-argument should be included in the lme-model formulation to make it exactly equivalent to the aov-model?
Pip Howard says:

June 16, 2013 at 11:23 am

For anyone who may be interested, another way to get Mauchley’s test of sphericity (including the GG and HF corrected p values) within your ANOVA output and without having to restructure your data is to use:

ezANOVA(data= datafile, dv=.(dependent variable), wid=.(participant/stimuli label column), within=.(repeated measures variable), between=.(optional between groups variable), detailed=TRUE, type = “III”)
Charles Darwin II (@CharlesDarwinII) says:

November 18, 2013 at 6:23 pm

Thank you for taking the time to write this (in 2009!). I have been looking for post-hoc analysis methods for n-factor anova w/ with-in subject factors, and looking at options from Matlab to MyStat and now R!! Your post is very well written, and have saved me so much time!! Thank you!
egorananyev says:

December 1, 2013 at 7:34 am

Thank you very much for this step-by-step guide. Maybe I’m missing something major, but I would like to find the corrected df’s for HF or GG corrections from Anova(). Any help is appreciated!
Paul Gribble says:

December 1, 2013 at 10:55 am

Hi,
Sorry but I don’t know the answer, I’m not sure how to get the adjusted df directly. What you could do is manually calculate the epsilon parameter for the G-G correction (look up the formula online) and then apply it to the df to get the corrected df (again, look up the formula online).