363 research outputs found

    Dispensability of Escherichia coli's latent pathways

    Gene-knockout experiments on single-cell organisms have established that expression of a substantial fraction of genes is not needed for optimal growth. This problem acquired a new dimension with the recent discovery that environmental and genetic perturbations of the bacterium Escherichia coli are followed by the temporary activation of a large number of latent metabolic pathways, which suggests the hypothesis that temporarily activated reactions impact growth and hence facilitate adaptation in the presence of perturbations. Here we test this hypothesis computationally and find, surprisingly, that the availability of latent pathways consistently offers no growth advantage and in fact tends to inhibit growth after genetic perturbations. This is shown to be true even for latent pathways with a known function in alternate conditions, thus extending the significance of this adverse effect beyond apparently nonessential genes. These findings raise the possibility that latent pathway activation is in fact derivative of another, potentially suboptimal, adaptive response.

    Optimization algorithms for inference and classification of genetic profiles from undersampled measurements

    In this thesis, we tackle three different problems, all related to optimization techniques for inference and classification of genetic profiles. First, we extend the deterministic Non-negative Matrix Factorization (NMF) framework to the probabilistic case (PNMF). We apply the PNMF algorithm to cluster and classify DNA microarray data. The proposed PNMF is shown to outperform the deterministic NMF and the sparse NMF algorithms in clustering stability and classification accuracy. Second, we propose SMURC: Small-sample MUltivariate Regression with Covariance estimation. Specifically, we consider a high-dimension, low-sample-size multivariate regression problem that accounts for correlation of the response variables. We show that, in this case, the maximum likelihood approach is senseless because the likelihood diverges. We propose a normalization of the likelihood function that guarantees convergence. Simulation results show that SMURC outperforms the regularized likelihood estimator with known covariance matrix and the state-of-the-art sparse Conditional Graphical Gaussian Model (sCGGM). In the third chapter, we derive a new greedy algorithm that provides an exact sparse solution of the combinatorial ℓ0-optimization problem in exponentially less computation time. Unlike other greedy approaches, which are only approximations of the exact sparse solution, the proposed greedy approach, called Kernel reconstruction, leads to the exact optimal solution.
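
    For orientation, the deterministic NMF baseline named in the abstract above can be sketched as follows. This is a minimal illustration that assumes the standard Lee-Seung multiplicative updates for the Frobenius objective; it is not the thesis's PNMF, sparse NMF, or any code from that work. The toy matrix V, the function names, and the argmax cluster assignment are all hypothetical choices made for the example.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_b] for row in A]

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    """Frobenius norm of the residual V - W H."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2
               for row_v, row_wh in zip(V, WH)
               for v, wh in zip(row_v, row_wh)) ** 0.5

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random initialisation keeps the updates well defined.
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    errors = [frobenius_error(V, W, H)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
        errors.append(frobenius_error(V, W, H))
    return W, H, errors

# Hypothetical toy "expression" matrix: rows are genes, columns are samples.
V = [[5, 4, 0, 1],
     [4, 5, 1, 0],
     [0, 1, 5, 4],
     [1, 0, 4, 5]]

W, H, errors = nmf(V, rank=2)
# One simple (assumed) clustering rule: assign each sample to its dominant factor.
clusters = [max(range(2), key=lambda k: H[k][j]) for j in range(len(V[0]))]
print("initial error:", errors[0], "final error:", errors[-1])
print("sample clusters:", clusters)
```

    The pure-Python matrix helpers keep the sketch dependency-free. For real microarray data a numerical library would be the more practical choice.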

    Differential Equation Models and Numerical Methods for Reverse Engineering Genetic Regulatory Networks

    This dissertation develops and analyzes differential equation-based mathematical models and efficient numerical methods and algorithms for genetic regulatory network identification. The primary objectives of the dissertation are to design, analyze, and test a general variational framework, together with numerical methods for computing its approximate solutions, for reverse engineering genetic regulatory networks from microarray datasets using differential equation modeling. In the proposed variational framework, no structural assumption is made about the genetic network; instead, the network is determined solely by the microarray profile of the network components and is identified through a well-chosen variational principle which minimizes a biological energy functional. The variational principle not only serves as a selection criterion to pick out the right biological solution of the underlying differential equation model but also provides an effective mathematical characterization of the small-world property of genetic regulatory networks which has been observed in lab experiments. Five specific models within the variational framework and efficient numerical methods and algorithms for computing their solutions are proposed and analyzed in the dissertation. All five proposed variational models are validated using both synthetic network datasets and real-world subnetwork datasets of Saccharomyces cerevisiae (yeast) and E. coli, and a performance comparison with some existing genetic regulatory network identification methods is also provided. As microarray data is typically noisy, in order to take into account the noise effect in the mathematical models, we propose a new approach based on stochastic differential equation modeling and generalize the deterministic variational framework to a stochastic variational framework which relies on stochastic optimization. Numerical algorithms are also proposed for computing solutions of the stochastic variational models. To address the important issue of post-processing computed networks to reflect the small-world property of underlying genetic regulatory networks, a novel thresholding technique based on random matrix theory is proposed and tested on various synthetic network datasets.