Today I would like to focus on simple statistical methods commonly used in the biomedical sciences and on the quality of the information reported about them in scientific publications. The emphasis is on Student's t-test and the analysis of variance, commonly referred to as ANOVA.

I will not even talk about more complicated methods just yet.

Weissgerber et al. carried out a systematic review to examine the quality of reporting for two statistical tests, t-tests and ANOVA, for papers published in a selection of physiology journals in June 2017.

Quality of Reporting
  • Of the 328 original research articles examined
    • 277 (84.5%) included an ANOVA or t-test or both
    • However, papers in the sample were routinely missing essential information about both types of tests:
    • 213 papers (95% of the papers that used ANOVA) did not contain the information needed to determine what type of ANOVA was performed
    • 26.7% of papers did not specify what post-hoc test was performed.
    • Most papers also omitted the information needed to verify ANOVA results.
    • Essential information about t-tests was also missing in many papers.
  • Measures that must be taken to improve the quality of reporting

The Problem

Transparent reporting is essential for the critical evaluation of studies. However, the reporting of statistical methods for studies in the biomedical sciences is often quite limited.

Moreover, the statistical methods section has invariably been the weakest part of the publications I have reviewed over the past 30 years. Scientific publications rarely contain the information needed to fully appreciate, understand and verify the statistical analysis that was carried out. This limits how much we can rely on published findings and undermines transparency, robustness and reproducibility.

It is obviously insufficient to report that “A t-test was used” or “An ANOVA was used”. To make matters worse, the complete ANOVA table is seldom reported in publications.

The Solutions

Investigators must learn to report more than that. For readers to assess the statistical analysis, the materials and methods section needs to provide the following information:

  • Sources of variation that are accounted for in the statistical analysis
    • Factors
    • Interactions between factors
    • Any other terms such as blocking variables or covariates
    • Interactions between factors and covariates
  • Details of the model that was used
    • t or F statistics
    • Degrees of freedom (df), which let readers check whether experimental units were omitted or removed without explanation
    • Pairing
    • Blocking
    • Whether the underlying test assumptions were met
    • Any remedial measures used when they were not
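
The degrees-of-freedom check mentioned in the list above can be illustrated with a small sketch. It assumes an equal-variance unpaired t-test, a paired t-test, or the error term of a one-way ANOVA. The function names and the numbers are hypothetical. Welch's unequal-variance t-test is deliberately left out, because its degrees of freedom are not a simple function of the sample size.

```python
def expected_df(test, n, n_groups=2):
    """Degrees of freedom implied by the design when no unit is dropped.

    test     -- "unpaired_t" (equal variances), "paired_t" or "one_way_anova"
    n        -- number of experimental units (number of pairs for "paired_t")
    n_groups -- number of groups (only used for "one_way_anova")
    """
    if test == "unpaired_t":
        return n - 2          # two group means are estimated
    if test == "paired_t":
        return n - 1          # one mean difference is estimated
    if test == "one_way_anova":
        return n - n_groups   # error df: one mean per group
    raise ValueError(f"unsupported test: {test!r}")


def unexplained_units(test, n_described, df_reported, n_groups=2):
    """Units described in the methods but absent from the reported analysis."""
    return expected_df(test, n_described, n_groups) - df_reported


# Hypothetical paper: 24 animals in two groups, result reported as t(19).
missing = unexplained_units("unpaired_t", n_described=24, df_reported=19)
print(f"Expected df = {expected_df('unpaired_t', 24)}, reported df = 19, "
      f"units unaccounted for = {missing}")
```

A non-zero result does not prove that anything went wrong, but it is the kind of discrepancy that authors should explain in the methods section.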

Make Science Robust, Transparent & Reproducible

Researchers need to provide more details so that reviewers and readers can fully understand how the statistical analysis was carried out and verify that it mirrors the structure under which the data were collected. This step is crucial for readers to get a precise idea of the scope and limitations of the findings, and it is needed to move towards a more robust, transparent and reproducible future in science.

Checklist for Reporting t-Test Results

  • State whether the test was unpaired (for comparing independent groups) or paired (for non-independent data, including repeated measurements on the same individual or matched participants, specimens or samples).
  • State whether the alternative hypothesis is one- or two-sided, and justify the choice.
  • State whether the test assumed equal or unequal variances between the two groups, and how this was determined (visual inspection, stating which plot was used, and/or a test for homogeneity of variance).
  • Report:
    • t-statistic
    • degrees of freedom
    • exact p-value
  • To focus on the magnitude of the difference, it is strongly recommended to report effect sizes (for example, the mean of the outcome variable in each group and the difference between group means) along with confidence intervals.
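
As an illustration of the checklist above, here is a minimal sketch that produces every reported quantity for an unpaired, two-sided t-test with unequal variances (Welch's test). The data, the variable names and the `welch_t_report` helper are hypothetical, and SciPy is assumed to be installed. It is one possible way to assemble the numbers, not a prescribed workflow.

```python
import math
from statistics import mean, variance

from scipy import stats

# Hypothetical measurements for two independent groups (illustrative only).
control = [4.1, 5.0, 4.8, 5.6, 4.4, 5.2]
treated = [5.9, 6.3, 5.1, 6.8, 6.0, 5.7]


def welch_t_report(a, b, level=0.95):
    """Unpaired, two-sided Welch t-test (unequal variances) for mean(a) - mean(b)."""
    se2_a = variance(a) / len(a)  # squared standard error of each group mean
    se2_b = variance(b) / len(b)
    diff = mean(a) - mean(b)      # effect size: difference in group means
    se = math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    t_stat = diff / se
    p_value = 2 * stats.t.sf(abs(t_stat), df)       # two-sided p-value
    margin = stats.t.ppf(0.5 + level / 2, df) * se  # half-width of the CI
    return {"t": t_stat, "df": df, "p": p_value, "diff": diff,
            "ci": (diff - margin, diff + margin)}


report = welch_t_report(treated, control)
print(
    "Unpaired two-sided Welch t-test: "
    f"t({report['df']:.1f}) = {report['t']:.2f}, p = {report['p']:.4f}; "
    f"mean difference = {report['diff']:.2f} "
    f"(95% CI {report['ci'][0]:.2f} to {report['ci'][1]:.2f})"
)
```

The printed sentence names the type of test, the sidedness, the variance assumption, the t-statistic with its degrees of freedom, the exact p-value, and the effect size with its confidence interval. That is everything the checklist asks for except the justification of the choices, which has to be written by the authors.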

Checklist for Reporting ANOVA Results

  • Specify the number of factors included in the ANOVA (i.e., one-, two-way or multi-way ANOVA).
  • For each factor, specify the name and levels of the factor and state whether it was entered as a within-subjects or a between-subjects factor.
  • Identify the relationship between factors: crossed or nested.
  • If several factors were included in the ANOVA, specify whether interaction terms were included and which ones.
  • Report whether blocking structures for controlling nuisance variables were used.
  • State whether any adjustment for covariates was made. If so, the ANOVA becomes an ANCOVA (Analysis of Covariance).
  • Ideally report the ANOVA table for all outcome variables analysed.
  • If the ANOVA table cannot be reported, the minimal requirements for each factor, interaction or other term incorporated into the ANOVA model are:
    • F-statistic
    • Corresponding degrees of freedom (df)
    • Exact p-value
  • Accounting for multiplicity.
    • Specify if a post-hoc test was performed for locating differences.
    • If post-hoc tests were performed, specify the type of post-hoc test and, if applicable, the test statistic and p-value.
  • To focus on the magnitude of the difference, it is strongly recommended to report effect sizes (for example, the mean of the outcome variable in each group and the differences between group means) along with confidence intervals.
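
To show what a complete ANOVA table contains, the sketch below builds one by hand for the simplest case: a single between-subjects factor with three levels. The data, the factor name and the helper functions are hypothetical, and SciPy is assumed to be installed. Designs with several factors, within-subjects factors, blocking or covariates need a proper modelling library and are not covered here.

```python
from statistics import mean

from scipy import stats

# Hypothetical outcome values for one between-subjects factor with three levels.
groups = {
    "low dose": [4.2, 5.1, 4.8, 5.5, 4.9],
    "medium dose": [5.8, 6.1, 5.4, 6.6, 6.0],
    "high dose": [6.9, 7.4, 6.5, 7.8, 7.1],
}


def one_way_anova_table(groups):
    """Rows of (source, sum of squares, df, mean square, F, p) for a one-way ANOVA."""
    values = [x for g in groups.values() for x in g]
    grand_mean = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    f_stat = ms_between / ms_within
    p_value = stats.f.sf(f_stat, df_between, df_within)  # upper-tail F probability
    return [
        ("Factor (dose)", ss_between, df_between, ms_between, f_stat, p_value),
        ("Residual", ss_within, df_within, ms_within, None, None),
        ("Total", ss_between + ss_within, df_between + df_within, None, None, None),
    ]


def fmt(value, spec):
    """Format a number, leaving cells that do not apply blank."""
    return "" if value is None else format(value, spec)


table = one_way_anova_table(groups)
print(f"{'Source':<15}{'SS':>10}{'df':>5}{'MS':>10}{'F':>10}{'p':>12}")
for source, ss, df, ms, f_stat, p in table:
    print(f"{source:<15}{ss:>10.3f}{df:>5d}{fmt(ms, '>10.3f')}"
          f"{fmt(f_stat, '>10.3f')}{fmt(p, '>12.2e')}")
```

Reporting the table in this form lets a reader recompute the F-statistic from the mean squares and confirm from the degrees of freedom that all experimental units were included.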

Reference

Weissgerber TL, Garcia-Valencia O, Garovic VD, Milic NM, Winham SJ. Why we need to report more than ‘Data were Analyzed by t-tests or ANOVA’. eLife. 2018 Dec 21;7:e36163. doi: 10.7554/eLife.36163. PMID: 30574870; PMCID: PMC6326723.

Conclusions

  • Investigators need more training in biostatistics, and biostatisticians should be more involved in the writing of scientific publications.
  • Biostatisticians should write more guidelines for effective reporting in scientific publications.