
    A Knowledge Gradient Policy for Sequencing Experiments to Identify the Structure of RNA Molecules Using a Sparse Additive Belief Model

    We present a sparse knowledge gradient (SpKG) algorithm for adaptively selecting the targeted regions within a large RNA molecule to identify which regions are most amenable to interactions with other molecules. Experimentally, such regions can be inferred from fluorescence measurements obtained by binding a complementary probe with fluorescence markers to the targeted regions. We use a biophysical model which shows that the fluorescence ratio on the log scale has a sparse linear relationship with the coefficients describing the accessibility of each nucleotide, since not all sites are accessible (due to the folding of the molecule). The SpKG algorithm uniquely combines the Bayesian ranking and selection problem with the frequentist ℓ1-regularized regression approach (Lasso). We use this algorithm both to identify the sparsity pattern of the linear model and to sequentially decide the best regions to test before the experimental budget is exhausted. We also develop two further algorithms: the batch SpKG algorithm, which sequentially generates several suggestions so that experiments can be run in parallel; and batch SpKG with a procedure we call length mutagenesis, which dynamically adds new alternatives, in the form of new types of probes, created by inserting, deleting or mutating nucleotides within existing probes. In simulation, we demonstrate these algorithms on the Group I intron (a mid-size RNA molecule), showing that they efficiently learn the correct sparsity pattern, identify the most accessible region, and outperform several other policies.
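A minimal sketch of the abstract's premise (our illustration, not the authors' SpKG code): the log-scale fluorescence ratio y is modeled as linear in a sparse vector of per-nucleotide accessibility coefficients, so an ℓ1-regularized fit recovers which sites are accessible. All names and sizes here are invented for the example; the Lasso is solved with a plain ISTA loop.

```python
import numpy as np

rng = np.random.default_rng(0)

n_probes, n_sites = 60, 40
X = rng.normal(size=(n_probes, n_sites))     # hypothetical probe-to-site design
beta_true = np.zeros(n_sites)
beta_true[[3, 11, 27]] = [1.5, -2.0, 1.0]    # only three accessible sites
y = X @ beta_true + 0.05 * rng.normal(size=n_probes)  # noisy log-ratio data

def ista_lasso(X, y, lam=1.0, n_iter=500):
    """Minimize 0.5*||y - Xb||^2 + lam*||b||_1 by proximal gradient (ISTA)."""
    L = np.linalg.norm(X, 2) ** 2            # Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = b - X.T @ (X @ b - y) / L        # gradient step
        b = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return b

beta_hat = ista_lasso(X, y)
support = np.flatnonzero(np.abs(beta_hat) > 0.1)
print(support)   # indices of sites estimated to be accessible
```

In SpKG this sparsity estimate is combined with a Bayesian knowledge-gradient policy that chooses the next probe to test; the fit above only illustrates the frequentist half of that combination.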

    On Using Inductive Biases for Designing Deep Learning Architectures

    Recent advancements in the field of Artificial Intelligence, especially in Deep Learning (DL), have paved the way for new and improved solutions to complex problems in almost all domains. Often we have prior knowledge and beliefs about the underlying system of the problem at hand which we want to capture in the corresponding deep learning architecture. Sometimes it is not clear how to include our prior beliefs in the traditionally recommended deep architectures such as recurrent neural networks, convolutional neural networks, variational autoencoders and others. Post-hoc techniques for modifying these architectures are often not straightforward and provide little performance gain. There have been efforts on developing domain-specific architectures, but those techniques are generally not transferable to other domains. We ask the question: can we come up with generic and intuitive techniques to design deep learning architectures that take our prior knowledge of the system as an inductive bias? In this dissertation, we develop two novel approaches towards this end. The first, called `Cooperative Neural Networks', can incorporate the inductive bias from the underlying probabilistic graphical model representation of the domain. The second, called problem-dependent `Unrolled Algorithms', parameterizes the recurrent structure obtained by unrolling the iterations of an optimization algorithm for the objective function defining the problem. We found that the neural network architectures obtained from our approaches typically end up with far fewer learnable parameters and provide considerable improvement in run-time compared to other deep learning methods. We have successfully applied our techniques to Natural Language Processing tasks, sparse graph recovery, and computational biology problems such as gene regulatory network inference.
Firstly, we introduce the Cooperative Neural Networks approach, a new theoretical approach for implementing learning systems that can exploit both prior insights about the independence structure of the problem domain and the universal approximation capability of deep neural networks. Specifically, we develop the CoNN-sLDA model for the document classification task, using the popular Latent Dirichlet Allocation graphical model as its inductive bias. We demonstrate a 23% reduction in error on the challenging MultiSent data set compared to state-of-the-art, and also derive ways to make the learned representations more interpretable. Secondly, we elucidate the idea of using problem-dependent `Unrolled Algorithms' for the sparse graph recovery task. We propose a deep learning architecture, GLAD, which uses an Alternating Minimization algorithm as its inductive bias and learns the model parameters via supervised learning. We show that GLAD learns a very compact and effective model for recovering sparse graphs from data, and we provide an extensive theoretical analysis that strengthens the case for using similar approaches for other problems. Finally, we build on the proposed `Unrolled Algorithm' technique for a challenging real-world computational biology problem. To this end, we design GRNUlar, a novel deep learning framework for supervised learning of gene regulatory networks (GRNs) from single-cell RNA-sequencing data. Our framework incorporates two intertwined models. We first leverage the expressive ability of neural networks to capture complex dependencies between transcription factors and the genes they regulate by developing a multi-task learning framework. Then, in order to capture the sparsity of GRNs observed in the real world, we design an unrolled algorithm technique for our framework.
Our deep architecture requires supervision for training, for which we repurpose existing synthetic data simulators that generate scRNA-Seq data guided by an underlying GRN. Experimental results demonstrate that GRNUlar outperforms state-of-the-art methods on both synthetic and real datasets. Our work also demonstrates the novel and successful use of expression data simulators for supervised learning of GRN inference.
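The `Unrolled Algorithms' idea can be sketched in a few lines (our illustration, not the GLAD or GRNUlar code): take K iterations of an optimization update, here ISTA for sparse recovery, treat the per-iteration step size and shrinkage threshold as learnable parameters, and read the K-step loop as a K-layer network. In practice these parameters would be trained by backpropagating through the loop with an autodiff framework; below we only run a forward pass with hand-set parameters standing in for trained ones.

```python
import numpy as np

def unrolled_ista(X, y, theta):
    """Forward pass of an unrolled ISTA network.

    theta is a list of (step, threshold) pairs, one pair per 'layer';
    making these per-iteration and learnable is exactly what distinguishes
    unrolling from running a fixed solver.
    """
    b = np.zeros(X.shape[1])
    for step, thresh in theta:
        z = b - step * X.T @ (X @ b - y)                      # gradient layer
        b = np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)  # shrinkage layer
    return b

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 20))
b_true = np.zeros(20)
b_true[[2, 9]] = [1.0, -1.0]
y = X @ b_true                       # noiseless toy observations

L = np.linalg.norm(X, 2) ** 2        # classical ISTA step would be 1/L
theta = [(1.0 / L, 0.01)] * 30       # 30 'layers' with hand-set parameters
b_hat = unrolled_ista(X, y, theta)
print(b_hat[[2, 9]])                 # dominant recovered coefficients
```

The appeal, as the dissertation argues, is that the resulting network has very few parameters (here two per layer) while inheriting the structure of the underlying optimization problem.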

    Women in Science 2016

    Women in Science 2016 summarizes research done by Smith College’s Summer Research Fellowship (SURF) Program participants. Ever since its 1967 start, SURF has been a cornerstone of Smith’s science education. In 2016, 150 students participated in SURF (144 hosted on campus and nearby field sites), supervised by 56 faculty mentor-advisors drawn from the Clark Science Center and connected to its eighteen science, mathematics, and engineering departments and programs and associated centers and units. At summer’s end, SURF participants were asked to summarize their research experiences for this publication.

    Deep Learning And Uncertainty Quantification: Methodologies And Applications

    Uncertainty quantification is an emerging interdisciplinary area that leverages the power of statistical methods, machine learning models, numerical methods and data-driven approaches to provide reliable inference for quantities of interest in natural science and engineering problems. In practice, uncertainty comes from different sources: aleatoric uncertainty, where the uncertainty comes from the observations or from the stochastic nature of the problem; and epistemic uncertainty, where the uncertainty comes from inaccurate mathematical models, computational methods or model parametrization. To cope with these different types of uncertainty, a successful and scalable model for uncertainty quantification requires prior knowledge of the problem, careful design of mathematical models, cautious selection of computational tools, etc. The fast growth in deep learning and probabilistic methods, together with the large volume of data available across different research areas, enables researchers to take advantage of these recent advances to propose novel methodologies for scientific problems where uncertainty quantification plays an important role. The objective of this dissertation is to address the existing gaps and propose new methodologies for uncertainty quantification with deep learning methods, and to demonstrate their power in engineering applications. On the methodology side, we first present a generative adversarial framework to model aleatoric uncertainty in stochastic systems. Secondly, we leverage the proposed generative model together with recent advances in physics-informed deep learning to learn the uncertainty propagation in solutions of partial differential equations. Thirdly, we introduce a simple and effective approach for posterior uncertainty quantification for learning nonlinear operators. Fourthly, we consider inverse problems of physical systems, identifying unknown forms and parameters in dynamical systems from observed noisy data.
On the application side, we first propose an importance sampling approach for sequential decision making. Second, we propose a physics-informed neural network method to quantify the epistemic uncertainty in cardiac activation mapping modeling and to conduct active learning. Third, we present an auto-encoder based framework for data augmentation and generation for data that is expensive to obtain, such as single-cell RNA sequencing data.
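The aleatoric/epistemic distinction the abstract draws can be made concrete with a toy example (our illustration, not the dissertation's methods): the noise level in the observations is aleatoric, while the spread across a bootstrap ensemble of fitted models approximates epistemic uncertainty, which should grow where data is absent. All data and model choices below are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=80)
y = np.sin(np.pi * x) + 0.1 * rng.normal(size=80)  # aleatoric noise std = 0.1

x_query = np.array([0.0, 2.5])   # one point inside the data, one far outside
preds = []
for _ in range(50):              # bootstrap ensemble of cubic fits
    idx = rng.integers(0, len(x), size=len(x))
    coeffs = np.polyfit(x[idx], y[idx], deg=3)
    preds.append(np.polyval(coeffs, x_query))
preds = np.array(preds)

epistemic_std = preds.std(axis=0)
print(epistemic_std)  # much larger at x = 2.5 (extrapolation) than at x = 0.0
```

Methods such as the generative-adversarial and physics-informed frameworks above target the same two quantities with far more structure, but the qualitative behavior, confident interpolation and uncertain extrapolation, is what a good epistemic estimate must reproduce.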