8 research outputs found

    Linear algebraic techniques in theoretical computer science and population genetics

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Department of Mathematics, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (pages 149-155). In this thesis, we present several algorithmic results for problems in spectral graph theory and computational biology. The first part concerns the problem of spectral sparsification. It is known that every dense graph can be approximated in a strong sense by a sparse subgraph, known as a spectral sparsifier of the graph. Furthermore, researchers have recently developed efficient algorithms for computing such approximations. We show how to make these algorithms faster, and also give a substantial improvement in space efficiency. Since sparsification is an important first step in speeding up approximation algorithms for many graph problems, our results have numerous applications. In the second part of the thesis, we consider the problem of inferring human population history from genetic data. We give an efficient and principled algorithm for using single nucleotide polymorphism (SNP) data to infer admixture history of various populations, and apply it to show that Europeans have evidence of mixture with ancient Siberians. Finally, we turn to the problem of RNA secondary structure design. In this problem, we want to find RNA sequences that fold to a given secondary structure. We propose a novel global sampling approach, based on the recently developed RNAmutants algorithm, and show that it has numerous desirable properties when compared to existing solutions. Our method can prove useful for developing the next generation of RNA design algorithms. By Alex Levin.

    Scalable Probabilistic Model Selection for Network Representation Learning in Biological Network Inference

    A biological system is a complex network of heterogeneous molecular entities and their interactions, which together contribute to various biological characteristics of the system. Although biological networks provide both an elegant theoretical framework and a mathematical foundation for analyzing, understanding, and learning from complex biological systems, their reconstruction remains an important and unsolved problem. Current biological networks are noisy, sparse, and incomplete, which limits the ability to form a holistic view of the reconstructions and thus to gain a system-level understanding of biological phenomena. Experimental identification of missing interactions is both time-consuming and expensive. Recent advancements in high-throughput data generation and significant improvements in computational power have led to novel computational methods for predicting missing interactions. However, these methods still suffer from several unresolved challenges. It is challenging to extract information about interactions and incorporate that information into the computational model. Furthermore, biological data are not only heterogeneous but also high-dimensional and sparse, which makes modeling from indirect measurements difficult. The heterogeneous nature and sparsity of biological data pose significant challenges to the design of deep neural network structures, which essentially rely on either empirical or heuristic model selection methods. These unscalable methods rely heavily on expertise and experimentation, a time-consuming and error-prone process, and are prone to overfitting. Furthermore, complex deep networks tend to be poorly calibrated, assigning high confidence to incorrect predictions. In this dissertation, we describe novel algorithms that address these challenges.
    In Part I, we design novel neural network structures to learn representations for biological entities and further expand the model to integrate heterogeneous biological data for biological interaction prediction. In Part II, we develop a novel Bayesian model selection method to infer the most plausible network structures warranted by the data. We demonstrate that our methods achieve state-of-the-art performance on tasks across various domains, including interaction prediction. Experimental studies on various interaction networks show that our method makes accurate and calibrated predictions. Our novel probabilistic model selection approach enables the network structures to evolve dynamically to accommodate incrementally available data. In conclusion, we discuss the limitations of and future directions for the proposed work.

    29th International Symposium on Algorithms and Computation: ISAAC 2018, December 16-19, 2018, Jiaoxi, Yilan, Taiwan

    LIPIcs, Volume 244, ESA 2022, Complete Volume

    LIPIcs, Volume 274, ESA 2023, Complete Volume

    Uncertainty in Artificial Intelligence: Proceedings of the Thirty-Fourth Conference

    LIPIcs, Volume 261, ICALP 2023, Complete Volume
