MAY 10, 2018 06:00 AM PDT

Differential Abundance Analysis for Microbial Marker-Gene Surveys

C.E. CREDITS: P.A.C.E. CE | Florida CE
  • Research Fellow, Dana-Farber Cancer Institute
      I am a Research Fellow in the Department of Biostatistics and Computational Biology at the Dana-Farber Cancer Institute and the Department of Biostatistics at the Harvard T.H. Chan School of Public Health, under the guidance of Professor John Quackenbush. Prior to joining Harvard, I was a National Science Foundation Graduate Research Fellow at the University of Maryland, College Park, where I received my Ph.D. in Applied Mathematics, Statistics and Scientific Computation.
      As a computer scientist and computational biologist, I am interested in developing computational methods for the analysis of high-throughput sequencing data. I also aim to release and support these methods as open-source software for the broader scientific community through Bioconductor and popular domain tools such as QIIME and phyloseq. My most popular tool, metagenomeSeq, is in the top 5% of all Bioconductor packages downloaded in the last year, with over 5,000 unique users. I am excited to leverage statistical and network methodologies to account for technological bias when identifying disease markers.


    We introduce a differential abundance analysis method for sparse high-throughput data from large-scale surveys of marker genes for microbial communities. Our approach relies on cumulative sum scaling (CSS) normalization - a count data normalization technique - and the zero-inflated Gaussian (ZIG) model as a statistical method for detecting differential abundance of taxonomic features. The ZIG differential abundance detection method accounts for bias introduced by the under-sampling of microbial communities commonly found in large-scale marker gene studies. We have implemented these methods in the publicly available metagenomeSeq Bioconductor package. In addition, we highlight the utility of the method in a large-scale study characterizing the diarrheal microbiome in young children from developing countries. Diarrhea is a major cause of mortality and morbidity in young children from developing countries, leading to as many as 15% of all deaths in children under 5 years of age. While many causes of this disease are already known, conventional diagnostic approaches fail to detect a pathogen in up to 60% of diarrheal cases. Using our novel methodology, we found Streptococci to be statistically associated with diarrheal disease in general and with more severe forms (such as dysentery) in particular.
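
To make the CSS idea in the abstract more concrete, here is a minimal Python sketch of one way cumulative sum scaling can be computed. It is an illustration only, not the metagenomeSeq implementation (which is an R package). The function name `css_normalize`, the fixed quantile of 0.5, the nearest-rank quantile rule, and the scale constant of 1000 are all assumptions made for this example; in practice the quantile is typically chosen from the data rather than fixed.

```python
import math


def css_normalize(counts, quantile=0.5, scale=1000.0):
    """Simplified cumulative sum scaling (illustrative sketch).

    counts: list of samples, each a list of raw feature counts.
    For each sample, counts are divided by the sum of all counts that
    fall at or below the chosen quantile of that sample's nonzero
    counts, then multiplied by a constant (assumed value: 1000).
    """
    normalized = []
    for sample in counts:
        nonzero = sorted(c for c in sample if c > 0)
        if not nonzero:
            # A sample with no reads cannot be scaled; return zeros.
            normalized.append([0.0] * len(sample))
            continue
        # Nearest-rank quantile of the nonzero counts (assumed rule).
        idx = max(0, math.ceil(quantile * len(nonzero)) - 1)
        threshold = nonzero[idx]
        # Scaling factor: sum of the counts up to the quantile threshold.
        scaling = sum(c for c in sample if c <= threshold)
        normalized.append([c / scaling * scale for c in sample])
    return normalized


# Two hypothetical samples; the first is dominated by a single feature.
raw = [
    [0, 1, 2, 3, 94],
    [0, 10, 20, 30, 40],
]
normalized = css_normalize(raw)
```

Because the scaling factor uses only the counts at or below the quantile, the one dominant feature in the first sample does not drive the normalization, which is the motivation for preferring this over dividing by the total count per sample.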
