2 research outputs found

    Leveraging latent patterns in the study of living systems

    Thesis: Ph. D., Massachusetts Institute of Technology, Computational and Systems Biology Program, 2019. Cataloged from PDF version of thesis. "June 2019." Includes bibliographical references.
    The development of high-throughput techniques to observe and perturb biological systems has led to remarkable progress in the last several decades. From the tremendous amounts of data being accumulated, new opportunities have emerged, including the possibility of finding latent patterns in high-dimensional variables that are reflective of underlying biological processes. While these methods have led to countless discoveries and innovations, it is clear there is much more we could learn by measuring and perturbing at far greater scales. Here, I advance methods to understand and utilize latent patterns in new types of high-dimensional data. I devise a method of analyzing networks of 'frequency interactions' in 16S/18S time series data, showing that these can be used to identify microbial communities and associated environmental factors. Then, as part of a highly collaborative project, I show how latent patterns in single cell RNA-Seq can be used together with optimal transport analysis to identify cell types and cell type trajectories, regulatory pathways, and cell-cell interactions in a time-course of developmental reprogramming. I then step back to ask a fundamental question: how do we choose which observations and perturbations to make, and how many of each are necessary? I approach this question on the basis of the inherency of latent structure in biology, and on foundational mathematical results concerning the analysis of highly-structured data. I present the beginnings of a framework to formalize how random composite experiments can make biological discovery more efficient by leveraging latent patterns. I first show how to recover individual genomes using covariance patterns in a series of composite (meta-) genomic data. I then describe how random composite measurements and compressed sensing can be used to make gene expression profiling more efficient. Finally, I apply this idea to in situ imaging transcriptomics, demonstrating how many individual gene images can be efficiently recovered from a small number of composite gene images.
    by Brian Cleary.

    Using viral load and epidemic dynamics to optimize pooled testing in resource-constrained settings

    Virological testing is central to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) containment, but many settings face severe limitations on testing. Group testing offers a way to increase throughput by testing pools of combined samples; however, most proposed designs have not yet addressed key concerns over sensitivity loss and implementation feasibility. Here, we combined a mathematical model of epidemic spread and empirically derived viral kinetics for SARS-CoV-2 infections to identify pooling designs that are robust to changes in prevalence and to ratify sensitivity losses against the time course of individual infections. We show that prevalence can be accurately estimated across a broad range, from 0.02 to 20%, using only a few dozen pooled tests and using up to 400 times fewer tests than would be needed for individual identification. We then exhaustively evaluated the ability of different pooling designs to maximize the number of detected infections under various resource constraints, finding that simple pooling designs can identify up to 20 times as many true positives as individual testing with a given budget. Crucially, we confirmed that our theoretical results can be translated into practice using pooled human nasopharyngeal specimens by accurately estimating a 1% prevalence among 2304 samples using only 48 tests and through pooled sample identification in a panel of 960 samples. Our results show that accounting for variation in sampled viral loads provides a nuanced picture of how pooling affects sensitivity to detect infections. Using simple, practical group testing designs can vastly increase surveillance capabilities in resource-limited settings.
    National Institute of General Medical Sciences (Grant U54GM088558
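
To make the prevalence-estimation idea in the abstract above concrete, here is a minimal sketch of the textbook group-testing estimator. It is not the authors' method: the abstract describes a model that uses viral loads and epidemic dynamics, whereas this sketch assumes a perfect test and independent samples. The function names, the pool size of 48 (which assumes each of the 2304 samples sits in exactly one of the 48 pools), and the count of 18 positive pools are all hypothetical values chosen for illustration.

```python
import random


def estimate_prevalence(positive_pools: int, total_pools: int, pool_size: int) -> float:
    """Estimate prevalence from pooled results.

    Assumes a perfect test and independent samples (a simplification;
    the abstract's approach accounts for viral load and sensitivity loss).
    A pool is negative only if all of its samples are negative, so
    P(pool negative) = (1 - p) ** pool_size, which is solved for p.
    """
    if total_pools <= 0 or pool_size <= 0:
        raise ValueError("total_pools and pool_size must be positive")
    if not 0 <= positive_pools <= total_pools:
        raise ValueError("positive_pools must lie between 0 and total_pools")
    negative_fraction = 1 - positive_pools / total_pools
    return 1 - negative_fraction ** (1 / pool_size)


def simulate_positive_pools(prevalence: float, total_pools: int,
                            pool_size: int, rng: random.Random) -> int:
    """Count pools containing at least one infected sample (hypothetical helper)."""
    positive = 0
    for _ in range(total_pools):
        if any(rng.random() < prevalence for _ in range(pool_size)):
            positive += 1
    return positive


if __name__ == "__main__":
    # Hypothetical outcome: 18 of 48 pools positive, 48 samples per pool.
    print(f"{estimate_prevalence(18, 48, 48):.4f}")
    # Simulated outcome at a true prevalence of 1%.
    hits = simulate_positive_pools(0.01, 48, 48, random.Random(0))
    print(hits, f"{estimate_prevalence(hits, 48, 48):.4f}")
```

Under these assumptions, a 1% prevalence makes roughly 38% of 48-sample pools positive, so the fraction of positive pools carries enough information to recover the prevalence without testing any sample individually.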