Metagenomic data have specific features that call for adapted statistical methods. Learn about the most commonly used statistical tools in this field. The workshop first reviews the mechanism of statistical hypothesis testing. The focus then shifts to understanding the data structure and the specific issues in omics data. Classical ANOVA and its alternatives in genomics are then covered. Finally, advanced methods and data visualisation tools are discussed.
Course Outline
The recommended duration for this course is 5 online sessions.
The most popular version of this course comprises 4 sessions described as follows:
Session 1: Statistical Hypothesis Testing
- Accounting for sampling variation in the decision-making process
- Mechanism underlying hypothesis testing
- Null and alternative hypotheses
- One- and two-sided tests
- Test statistic
- Decision rule: critical value and p-value
- Statistical vs. practical significance, effect size to detect
- Risk involved in hypothesis testing: confidence level and power
- Controlling risks
- General idea underlying sample size and power determination
- Specific features of metagenomics data
- Specific issues
- Differences with classical data
- Summarising data efficiently
- Multiplicity in statistical tests
- What is multiplicity?
- Identifying situations leading to multiplicity
- Dealing with multiplicity: Bonferroni, Tukey, Benjamini-Hochberg, etc.
- Refresher on basic principles underlying ANOVA
- Interpretation of an ANOVA table
- Significance of factors and interactions
- Interpretation of factor effects and interactions
- Principle underlying MANOVA
- Specifics of metagenomic data and alternative methods
- Principle of ANOSIM and PERMANOVA
- Comparison of centroids
- Heterogeneity of dispersion across groups
- Multiple comparisons
- Results visualisation
- Various distance measures in metagenomics
- Principle & selection of an appropriate metric
- An Overview of Visualisation Tools - Ordination Methods
- Principal Component Analysis (PCA)
- Correspondence Analysis (CA)
- MultiDimensional Scaling (MDS)
- Principal Coordinates Analysis (PCoA)
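To give a flavour of the multiplicity topic listed in the outline, here is a minimal Python sketch comparing the Bonferroni and Benjamini-Hochberg adjustments. It is not taken from the course material: the function names and the p-values are made up for illustration, and in practice an established statistics package would normally be used instead.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values, e.g. one test per taxon (illustrative only).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
alpha = 0.05

bonf = bonferroni(p_values)
bh = benjamini_hochberg(p_values)

n_bonf = sum(p <= alpha for p in bonf)
n_bh = sum(p <= alpha for p in bh)
print("Significant after Bonferroni:", n_bonf)
print("Significant after Benjamini-Hochberg:", n_bh)
```

Bonferroni controls the family-wise error rate and is the more conservative of the two, while Benjamini-Hochberg controls the false discovery rate, which is often the preferred trade-off when many features are tested at once.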
We just finished a series of training sessions on the Statistical Analysis of Metagenomics data, and I was impressed by Natalie’s delivery of the content. Despite its complexity, Natalie was able to explain it clearly to us, providing many well-made examples. I highly recommend her for your statistical training needs. We are eagerly looking forward to the next course!
The training we attended on biostatistical analysis of metagenomic data was greatly appreciated by all attendees. We demystified which type of statistical models to apply and how to interpret the data. Our instructor Natalie has a remarkable ability to communicate complex concepts in this field of application. I recommend her without any hesitation.