A New Approach for Scalable Analysis of Microbial Communities
Microbial communities play important roles in the function and maintenance of
various biosystems, ranging from the human body to the environment. Current methods
for analysis of microbial communities are typically based on taxonomic
phylogenetic alignment using 16S rRNA metagenomic or Whole Genome Sequencing
data. In typical characterizations of microbial communities, studies deal with
billions of microbial sequences, aligning them to a phylogenetic tree. We
introduce a new approach for the efficient analysis of microbial communities.
Our new reference-free analysis technique is based on n-gram sequence
analysis of 16S rRNA data and reduces the processing data size dramatically (by
10^5-fold), without requiring taxonomic alignment. The proposed approach is
applied to characterize phenotypic microbial community differences in
different settings. Specifically, we applied this approach in classification of
microbial communities across different body sites, characterization of oral
microbiomes associated with healthy and diseased individuals, and
classification of microbial communities longitudinally during the development
of infants. Different dimensionality reduction methods are introduced that
offer a more scalable analysis framework, while minimizing the loss in
classification accuracies. Among dimensionality reduction techniques, we
propose a continuous vector representation for microbial communities, which can
be widely used for deep learning applications in microbial informatics.
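
The reference-free n-gram idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name kmer_profile, the choice k=2 in the usage, and the decision to skip n-grams containing ambiguous bases are all assumptions made for illustration. The sketch pools the reads of one sample and maps them to a fixed-length vector of normalized n-gram frequencies, so no alignment to a phylogenetic tree is needed.

```python
from collections import Counter
from itertools import product


def kmer_profile(reads, k=3, alphabet="ACGT"):
    """Return (vocabulary, frequencies) for reads pooled from one sample.

    Hypothetical helper: k and the handling of ambiguous bases are
    illustrative assumptions, not values taken from the paper.
    """
    # Fixed vocabulary of all len(alphabet)**k possible n-grams.
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    valid = set(vocab)

    counts = Counter()
    for read in reads:
        read = read.upper()
        # Slide a window of width k over the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer in valid:  # skip n-grams with ambiguous bases such as N
                counts[kmer] += 1

    total = sum(counts.values())
    # Normalize so samples with different read depths are comparable.
    freqs = [counts[km] / total if total else 0.0 for km in vocab]
    return vocab, freqs


# Toy usage with made-up reads (not real 16S rRNA data).
vocab, freqs = kmer_profile(["ACGTAC", "GGGNA"], k=2)
```

Each sample is then represented by a vector whose length depends only on k, not on the number of reads, which is what makes downstream classification or dimensionality reduction independent of the raw sequence volume.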