Non-normal Data in Repeated Measures ANOVA: Impact on Type I Error and Power

Keywords

Violation of normality
Within-subject design
Robustness
Power
ANOVA

How to Cite

Blanca, M. J., Arnau, J., García-Castro, F. J., Alarcón, R., & Bono, R. (2023). Non-normal Data in Repeated Measures ANOVA: Impact on Type I Error and Power. Psicothema, 35(1), 21–29. Retrieved from https://reunido.uniovi.es/index.php/PST/article/view/19372

Abstract

Background: Repeated measures designs are commonly used in health and social sciences research. Although there are other, more advanced, statistical analyses, the F-statistic of repeated measures analysis of variance (RM-ANOVA) remains the most widely used procedure for analyzing differences in means. The impact of the violation of normality has been extensively studied for between-subjects ANOVA, but this is not the case for RM-ANOVA. Therefore, studies that extensively and systematically analyze the robustness of RM-ANOVA under the violation of normality are needed. This paper reports the results of two simulation studies aimed at analyzing the Type I error and power of RM-ANOVA when the normality assumption is violated but sphericity is fulfilled.

Method: Study 1 considered 20 distributions, both known and unknown, and we manipulated the number of repeated measures (3, 4, 6, and 8) and sample size (from 10 to 300). Study 2 involved unequal distributions in each repeated measure. The distributions analyzed represent slight, moderate, and severe deviation from normality.

Results: Overall, the results show that the Type I error and power of the F-statistic are not altered by the violation of normality.

Conclusions: RM-ANOVA is generally robust to non-normality when the sphericity assumption is met.
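
The kind of check described in the abstract can be illustrated with a very small Monte Carlo sketch. This is not the authors' simulation code: it is a minimal illustration under assumptions chosen here. It uses a single non-normal distribution (the exponential, with skewness 2 and excess kurtosis 6), three repeated measures with equal means, and a shared subject effect plus independent errors, which gives a compound-symmetric covariance and therefore satisfies sphericity. The function names (`rm_anova_f`, `f_sf_df1_2`, `type1_error`) are hypothetical. Fixing the number of repeated measures at 3 lets the p-value be computed with the closed form of the F(2, df2) upper tail, so only the standard library is needed.

```python
import math
import random


def rm_anova_f(data):
    """One-way repeated measures ANOVA.

    data: list of n subjects, each a list of k repeated measures.
    Returns (F, df_effect, df_error).
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    # Residual after removing subject and condition effects.
    ss_err = sum(
        (data[i][j] - subj_means[i] - cond_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    df1, df2 = k - 1, (n - 1) * (k - 1)
    return (ss_cond / df1) / (ss_err / df2), df1, df2


def f_sf_df1_2(f, df2):
    """Upper-tail probability of F(2, df2). Valid only when df1 = 2."""
    return math.pow(1.0 + 2.0 * f / df2, -df2 / 2.0)


def type1_error(n=30, reps=2000, alpha=0.05, seed=1):
    """Empirical rejection rate of the RM-ANOVA F-test under a true null."""
    rng = random.Random(seed)
    k = 3  # assumption: 3 repeated measures, so df1 = 2
    rejections = 0
    for _ in range(reps):
        data = []
        for _subject in range(n):
            subject_effect = rng.expovariate(1.0)
            # Same exponential error distribution in every condition:
            # equal means (null is true) and equal variances.
            data.append([subject_effect + rng.expovariate(1.0) for _ in range(k)])
        f, _df1, df2 = rm_anova_f(data)
        if f_sf_df1_2(f, df2) < alpha:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    print("Empirical Type I error:", type1_error())
```

An empirical rate close to the nominal 0.05 would be in line with the robustness reported above. One commonly used benchmark is Bradley's liberal criterion, under which a test is considered robust if the empirical rate lies between 0.025 and 0.075 for a nominal alpha of 0.05. A single distribution and sample size, as used here, cannot substitute for the range of conditions examined in the study.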
