24 research outputs found

    Genetic algorithm solution for double digest problem

    The strongly NP-hard Double Digest Problem, which asks for the physical map of a DNA sequence to be reconstructed, is now used for efficient genotyping. Most existing methods are inefficient on large instances because the search space grows factorially, as (a!)(b!), where a and b are the numbers of DNA fragments generated by the two restriction enzymes. In addition, none of the existing methods can handle erroneous data. In this paper, we develop a novel method based on a genetic algorithm for solving this problem and adapt it to handle erroneous data. Our genetic algorithm is implemented and compared with other well-known existing algorithms. The results show the efficiency (speedup) of our algorithm relative to the other methods, especially for erroneous data.
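
    To make the (a!)(b!) search space concrete, the following is a minimal brute-force sketch of the Double Digest Problem, not the genetic algorithm described in the abstract. The function names and the toy fragment lengths are assumptions chosen for illustration. Each candidate is a pair of orderings of the two single-digest fragment lists; it is accepted when the fragments implied by the combined cut sites match the observed double-digest lengths.

```python
import math
from itertools import permutations


def cut_sites(order):
    """Interior cut positions implied by an ordering of fragment lengths."""
    sites, position = [], 0
    for length in order[:-1]:
        position += length
        sites.append(position)
    return sites


def double_digest(order_a, order_b):
    """Sorted fragment lengths obtained when both enzymes cut the same molecule."""
    total = sum(order_a)
    sites = sorted(set(cut_sites(order_a)) | set(cut_sites(order_b)))
    bounds = [0] + sites + [total]
    return sorted(bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1))


def solve_ddp(a, b, c):
    """Exhaustive search over all orderings of a and b (hypothetical helper)."""
    target = sorted(c)
    solutions = []
    for order_a in set(permutations(a)):
        for order_b in set(permutations(b)):
            if double_digest(order_a, order_b) == target:
                solutions.append((order_a, order_b))
    return solutions


# Toy instance (made-up lengths): enzyme A, enzyme B, and both together.
a, b, c = [2, 3, 5], [4, 6], [1, 2, 2, 5]
search_space_size = math.factorial(len(a)) * math.factorial(len(b))
solutions = solve_ddp(a, b, c)
```

    Even this tiny instance has 3! x 2! = 12 candidate pairs, and every valid map also appears mirrored, which is why exhaustive search becomes impractical for realistic fragment counts and heuristic methods such as genetic algorithms are attractive.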

    Supervised Learning of Gene Regulatory Networks

    No full text
    Identifying the entirety of gene regulatory interactions in a biological system offers the possibility to determine the key molecular factors that affect important traits at the level of cells, tissues, and whole organisms. Despite the development of experimental approaches and technologies for identifying direct binding of transcription factors (TFs) to promoter regions of downstream target genes, computational approaches that utilize large compendia of transcriptomics data are still the predominant methods used to predict direct downstream targets of TFs, and thus to reconstruct genome-wide gene regulatory networks (GRNs). These approaches can broadly be categorized as unsupervised or supervised, based on whether data about known, experimentally verified gene regulatory interactions are used in the process of reconstructing the underlying GRN. Here, we first describe the generic steps of supervised approaches for GRN reconstruction, since they have recently been shown to improve the accuracy of the resulting networks. We also illustrate how they can be used with data from model organisms to obtain more accurate predictions of gene regulatory interactions. © 2020 The Authors. Basic Protocol 1: Construction of features used in supervised learning of gene regulatory interactions. Basic Protocol 2: Learning the non-interacting TF-gene pairs. Basic Protocol 3: Learning a classifier for gene regulatory interactions.


    Reaction lumping in metabolic networks for application with thermodynamic metabolic flux analysis

    No full text
    Thermodynamic metabolic flux analysis (TMFA) can narrow down the space of steady-state flux distributions, but requires knowledge of the standard Gibbs free energy for the modelled reactions. The latter are often not available due to unknown Gibbs free energy change of formation, $\Delta_f G^{0}$, of metabolites. To optimize the usage of data on thermodynamics in constraining a model, reaction lumping has been proposed to eliminate metabolites with unknown $\Delta_f G^{0}$. However, the lumping procedure has neither been formalized nor implemented for systematic identification of lumped reactions. Here, we propose, implement, and test a combined procedure for reaction lumping, applicable to genome-scale metabolic models. It is based on identification of groups of metabolites with unknown $\Delta_f G^{0}$ whose elimination can be conducted independently of the others via: (1) a group implementation, aiming to eliminate an entire such group, and, if this is infeasible, (2) a sequential implementation to ensure that a maximal number of metabolites with unknown $\Delta_f G^{0}$ are eliminated. Our comparative analysis with genome-scale metabolic models of Escherichia coli, Bacillus subtilis, and Homo sapiens shows that the combined procedure provides an efficient means for systematic identification of lumped reactions. We also demonstrate that TMFA applied to models with reactions lumped according to the proposed procedure leads to more precise predictions in comparison to the original models. The provided implementation thus ensures the reproducibility of the findings and their application with standard TMFA.
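
    The elementary idea behind lumping can be sketched as follows. This is only an illustration of eliminating one intermediate metabolite from two reactions, not the group or sequential procedure proposed in the paper. The reaction representation (metabolite to stoichiometric coefficient), the function names, and the formation energies are hypothetical. Once the metabolite with unknown formation energy cancels, the standard reaction energy follows from the usual sum of stoichiometric coefficients times formation energies.

```python
def lump(r1, r2, metabolite):
    """Linearly combine two reactions so that `metabolite` cancels out."""
    c1, c2 = r1.get(metabolite, 0), r2.get(metabolite, 0)
    if c1 * c2 >= 0:
        raise ValueError("metabolite must be produced by one reaction "
                         "and consumed by the other")
    # Scale each reaction by the other's coefficient so the net amount is zero.
    weights = ((r1, abs(c2)), (r2, abs(c1)))
    combined = {}
    for reaction, weight in weights:
        for met, coeff in reaction.items():
            combined[met] = combined.get(met, 0) + weight * coeff
    return {met: coeff for met, coeff in combined.items() if coeff != 0}


def standard_reaction_energy(reaction, dfG0):
    """Sum of s_i * dfG0_i; None if any formation energy is unknown."""
    if any(met not in dfG0 for met in reaction):
        return None
    return sum(coeff * dfG0[met] for met, coeff in reaction.items())


# Hypothetical example: X has no known formation energy.
r1 = {"A": -1, "X": 1}   # A -> X
r2 = {"X": -2, "B": 1}   # 2 X -> B
dfG0 = {"A": -10.0, "B": -50.0}  # made-up values, arbitrary units

lumped = lump(r1, r2, "X")                               # 2 A -> B
energy_before = standard_reaction_energy(r1, dfG0)       # unknown
energy_lumped = standard_reaction_energy(lumped, dfG0)
```

    In this toy case neither original reaction can be thermodynamically constrained, whereas the lumped reaction can, which is the motivation the abstract gives for lumping before applying TMFA.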

    Maximization of non-idle enzymes improves the coverage of the estimated maximal in vivo enzyme catalytic rates in Escherichia coli

    No full text
    Constraint-based modeling approaches allow the estimation of maximal in vivo enzyme catalytic rates that can serve as proxies for enzyme turnover numbers. Yet, genome-scale flux profiling remains a challenge in deploying these approaches to catalogue proxies for enzyme catalytic rates across organisms. Here we formulate a constraint-based approach, termed NIDLE-flux, to estimate fluxes at a genome-scale level by using the principle of efficient usage of expressed enzymes. Using proteomics data from Escherichia coli, we show that the fluxes estimated by NIDLE-flux and the existing approaches are in excellent qualitative agreement (Pearson correlation > 0.9). We also find that the maximal in vivo catalytic rates estimated by NIDLE-flux exhibit a Pearson correlation of 0.74 with in vitro enzyme turnover numbers. However, NIDLE-flux results in a 1.4-fold increase in the size of the estimated maximal in vivo catalytic rates in comparison to the contenders. Integration of the maximum in vivo catalytic rates with publicly available proteomics and metabolomics data provides a better match to fluxes estimated by NIDLE-flux. Therefore, NIDLE-flux facilitates more effective usage of proteomics data to estimate proxies for kcatomes. https://github.com/Rudan-X/NIDLE-flux-code. Supplementary data are available at Bioinformatics online.
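
    As a rough illustration of the downstream step only, the sketch below assumes the common definition in which an enzyme's apparent catalytic rate in a condition is its flux divided by its abundance, and the maximal in vivo rate is the largest such value across conditions. It does not reproduce the NIDLE-flux flux estimation itself; the function names, condition labels, and numbers are hypothetical.

```python
import math


def max_apparent_rates(fluxes, abundances):
    """Return {enzyme: max over conditions of flux / abundance}.

    fluxes and abundances are nested dicts: {condition: {enzyme: value}}.
    Enzymes with zero flux or zero abundance in a condition are skipped there.
    """
    best = {}
    for condition, flux_by_enzyme in fluxes.items():
        for enzyme, flux in flux_by_enzyme.items():
            abundance = abundances[condition].get(enzyme, 0.0)
            if flux > 0 and abundance > 0:
                best[enzyme] = max(best.get(enzyme, 0.0), flux / abundance)
    return best


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up fluxes and enzyme abundances for two growth conditions.
fluxes = {"glucose": {"E1": 4.0, "E2": 1.0}, "acetate": {"E1": 3.0, "E2": 0.0}}
abundances = {"glucose": {"E1": 2.0, "E2": 0.5}, "acetate": {"E1": 1.0, "E2": 0.5}}

rates = max_apparent_rates(fluxes, abundances)
```

    The resulting values could then be compared with in vitro turnover numbers using `pearson`, typically after a log transform, which is the kind of comparison the abstract summarizes with its correlation of 0.74.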

    Identification of flux trade-offs in metabolic networks

    No full text

    Characterization of effects of genetic variants via genome-scale metabolic modelling

    No full text
    Genome-scale metabolic networks for model plants and crops, in combination with approaches from the constraint-based modelling framework, have been used to predict metabolic traits and design metabolic engineering strategies for their manipulation. With the advances in technologies to generate large-scale genotyping data from natural diversity panels and other populations, genome-wide association and genomic selection have emerged as statistical approaches to determine genetic variants associated with and predictive of traits. Here, we review recent advances in constraint-based approaches that integrate genetic variants in genome-scale metabolic models to characterize their effects on reaction fluxes. Since some of these approaches have been applied in organisms other than plants, we provide a critical assessment of their applicability, particularly in crops. In addition, we dissect the inferred effects of genetic variants with respect to reaction rate constants, abundances of enzymes, and concentrations of metabolites, as the main determinants of reaction fluxes, and relate them to their combined effects on complex traits, like growth. Through this systematic review, we also provide a roadmap for future research to increase the predictive power of statistical approaches by coupling them with mechanistic models of metabolism.