Abstract
Background: In repeated measures designs, the traditional ANOVA F-statistic assumes normality and sphericity. Bootstrap-F (B-F) has been proposed as a procedure for dealing with violations of these assumptions in one-way repeated measures ANOVA. However, evidence regarding its robustness and power is limited. Our aim is to extend knowledge about the behavior of B-F under a wider range of conditions. Method: A simulation study was performed, manipulating the number of repeated measures, sample size, epsilon value, and distribution shape. Results: B-F may become conservative at higher values of epsilon, and liberal when sample sizes are small and both normality and sphericity are extremely violated. In the latter case, B-F may be used with a more stringent alpha level (.025). The results also show that power is affected by sphericity: the lower the epsilon value, the larger the sample size required to ensure adequate power. Conclusions: B-F is robust to non-normality and non-sphericity with sample sizes larger than 20-25.
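For readers unfamiliar with the procedure, the following is a minimal sketch of how a bootstrap-F test for a one-way repeated measures design is commonly described; it is not the code used in the simulation study. It assumes a complete subjects-by-measures data matrix, centers each repeated measure at its own mean so that the null hypothesis holds in the resampling population, resamples whole subjects with replacement, and takes the p-value as the proportion of bootstrap F values at least as large as the observed F. The function names, the number of bootstrap samples (599), and the seed are illustrative assumptions.

```python
import numpy as np


def rm_anova_f(data):
    """F statistic for a one-way repeated measures ANOVA.

    data: 2-D array, rows = subjects, columns = repeated measures.
    """
    n, k = data.shape
    grand_mean = data.mean()
    # Sums of squares for conditions (columns) and subjects (rows)
    ss_cond = n * np.sum((data.mean(axis=0) - grand_mean) ** 2)
    ss_subj = k * np.sum((data.mean(axis=1) - grand_mean) ** 2)
    ss_total = np.sum((data - grand_mean) ** 2)
    # Error term is the subject-by-condition interaction
    ss_error = ss_total - ss_cond - ss_subj
    return (ss_cond / (k - 1)) / (ss_error / ((n - 1) * (k - 1)))


def bootstrap_f(data, n_boot=599, alpha=0.05, seed=0):
    """Bootstrap-F test (sketch). n_boot and seed are assumed values."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    f_obs = rm_anova_f(data)
    # Center each measure so that all condition means are equal (null true)
    centered = data - data.mean(axis=0)
    f_star = np.empty(n_boot)
    for b in range(n_boot):
        # Resample whole subjects to preserve the covariance structure
        idx = rng.integers(0, n, size=n)
        f_star[b] = rm_anova_f(centered[idx])
    # p-value: share of null bootstrap F values at least as large as f_obs
    p_value = float(np.mean(f_star >= f_obs))
    return f_obs, p_value, p_value < alpha
```

Calling `bootstrap_f(data, alpha=0.025)` applies the more stringent alpha level suggested above for small samples with extreme violations of both normality and sphericity.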