187 research outputs found

    A new computational method to split large biochemical networks into coherent subnets

    BACKGROUND: Compared to more general networks, biochemical networks have some special features: while generally sparse, they contain a small number of highly connected metabolite nodes, and metabolite nodes can be divided into two classes: internal nodes with associated mass balance constraints and external ones without. Based on these features, reclassifying selected internal nodes (separators) as external ones can be used to divide a large complex metabolic network into simpler subnetworks. Selection of separators based on node connectivity is commonly used but affords little detailed control and tends to produce excessive fragmentation. The method proposed here (Netsplitter) allows the user to control separator selection. It combines local connection degree partitioning with global connectivity derived from random walks on the network, to produce a more even distribution of subnetwork sizes. Partitioning is performed progressively, and the interactive visual matrix presentation allows the user considerable control over the process, while incorporating special strategies to maintain network integrity and minimise the information loss due to partitioning.
    RESULTS: Partitioning of a genome-scale network of 1348 metabolites and 1468 reactions for Arabidopsis thaliana encapsulates 66% of the network into 10 medium-sized subnets. Applied to the flavonoid subnetwork extracted in this way, Netsplitter separates it naturally into four subnets with recognisable functionality, namely synthesis of lignin precursors, flavonoids, coumarin and benzenoids. A quantitative quality measure called efficacy is constructed and shows that the new method gives improved partitioning for several metabolic networks, including bacterial, plant and mammalian species.
    CONCLUSIONS: For the examples studied, the Netsplitter method is a considerable improvement on the performance of connection degree partitioning, giving a better balance of subnet sizes with the removal of fewer mass balance constraints. In addition, the user can interactively control which metabolite nodes are selected for cutting and when to stop further partitioning once the desired granularity has been reached. Finally, the blocking transformation at the heart of the procedure provides a powerful visual display of network structure that may be useful for its exploration independent of whether partitioning is required.
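The connection degree partitioning that this abstract uses as its baseline can be illustrated with a minimal sketch. This is not Netsplitter itself: the function name `degree_partition`, the `max_degree` threshold and the toy network are assumptions made for illustration. Metabolites used by more than `max_degree` reactions are reclassified as external (separators), and reactions that still share an internal metabolite are grouped into one subnet.

```python
from collections import defaultdict

def degree_partition(reactions, max_degree):
    """Split a network by treating highly connected metabolites as external.

    reactions: dict mapping a reaction id to the metabolite ids it involves.
    Returns (separators, subnets), where subnets is a sorted list of
    sorted reaction-id lists.
    """
    # Degree of a metabolite = number of reactions it takes part in.
    users = defaultdict(set)
    for rxn, mets in reactions.items():
        for met in mets:
            users[met].add(rxn)
    separators = {m for m, rxns in users.items() if len(rxns) > max_degree}

    # Union-find over reactions: join reactions sharing an internal metabolite.
    parent = {rxn: rxn for rxn in reactions}

    def find(rxn):
        while parent[rxn] != rxn:
            parent[rxn] = parent[parent[rxn]]  # path halving
            rxn = parent[rxn]
        return rxn

    for met, rxns in users.items():
        if met in separators:
            continue  # external nodes no longer connect reactions
        ordered = sorted(rxns)
        for other in ordered[1:]:
            parent[find(other)] = find(ordered[0])

    subnets = defaultdict(set)
    for rxn in reactions:
        subnets[find(rxn)].add(rxn)
    return separators, sorted(sorted(s) for s in subnets.values())

# Hypothetical toy network: ATP is the only highly connected metabolite.
toy = {
    "R1": ["A", "ATP"],
    "R2": ["A", "B", "ATP"],
    "R3": ["C", "ATP"],
    "R4": ["C", "D"],
}
print(degree_partition(toy, max_degree=2))
```

A single global threshold such as `max_degree` is the kind of coarse control the abstract criticises; the random-walk connectivity and interactive selection it describes are not reproduced here.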

    Incorporating background frequency improves entropy-based residue conservation measures

    BACKGROUND: Several entropy-based methods have been developed for scoring sequence conservation in protein multiple sequence alignments. High-scoring amino acid positions may correlate with structurally or functionally important residues. However, amino acid background frequencies are usually not taken into account in these entropy-based scoring schemes. RESULTS: We demonstrate that using a relative entropy measure that incorporates amino acid background frequency results in improved performance in identifying functional sites from protein multiple sequence alignments. CONCLUSION: Our results suggest that the application of appropriate background frequency information may lead to more biologically relevant results in many areas of bioinformatics.
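The difference between a plain entropy score and a relative entropy score can be sketched as follows. This is a minimal illustration, not the authors' implementation: the uniform default background, the pseudocount, the decision to ignore gap characters and all function names are assumptions. The relative entropy of a column with frequencies p against background q is computed as the sum over amino acids of p(a) log2(p(a) / q(a)).

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumed default background: uniform over the 20 amino acids.
UNIFORM_BACKGROUND = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}

def column_frequencies(column, pseudocount=1e-6):
    """Amino acid frequencies in one alignment column (gaps are ignored)."""
    counts = Counter(c for c in column if c in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts.get(aa, 0) + pseudocount) / total for aa in AMINO_ACIDS}

def shannon_entropy(column):
    """Plain entropy in bits: low values mean a conserved column."""
    p = column_frequencies(column)
    return -sum(v * math.log2(v) for v in p.values())

def relative_entropy(column, background=UNIFORM_BACKGROUND):
    """Relative entropy in bits: high values mean the column departs
    strongly from the background distribution."""
    p = column_frequencies(column)
    return sum(p[aa] * math.log2(p[aa] / background[aa]) for aa in AMINO_ACIDS)

# Hypothetical skewed background: leucine common, tryptophan rare.
weights = {aa: 1.0 for aa in AMINO_ACIDS}
weights["L"], weights["W"] = 5.0, 0.5
skewed = {aa: w / sum(weights.values()) for aa, w in weights.items()}

for col in ("LLLL", "WWWW"):
    print(col, shannon_entropy(col), relative_entropy(col, skewed))
```

Both toy columns are fully conserved, so plain entropy cannot tell them apart, whereas relative entropy scores the conserved rare residue higher than the conserved common one.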

    The p53HMM algorithm: using profile hidden markov models to detect p53-responsive genes

    BACKGROUND: A computational method (called p53HMM) is presented that utilizes Profile Hidden Markov Models (PHMMs) to estimate the relative binding affinities of putative p53 response elements (REs), both p53 single-sites and cluster-sites. These models incorporate a novel "Corresponded Baum-Welch" training algorithm that provides increased predictive power by exploiting the redundancy of information found in the repeated, palindromic p53-binding motif. The predictive accuracy of these new models is compared against that of other predictive models, including position specific score matrices (PSSMs, or weight matrices). We also present a new dynamic acceptance threshold, dependent upon a putative binding site's distance from the Transcription Start Site (TSS) and its estimated binding affinity. This new criterion for classifying putative p53-binding sites increases predictive accuracy by reducing the false positive rate.
    RESULTS: Training a Profile Hidden Markov Model with corresponding positions matching a combined-palindromic p53-binding motif creates the best p53-RE predictive model. The p53HMM algorithm is available online: http://tools.csb.ias.edu
    CONCLUSION: Using Profile Hidden Markov Models with training methods that exploit the redundant information of the homotetramer p53 binding site provides better predictive models than weight matrices (PSSMs). These methods may also boost performance when applied to other transcription factor binding sites.
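For context, the weight-matrix (PSSM) baseline that the abstract compares against can be sketched in a few lines. This is a generic log-odds PSSM, not p53HMM and not the authors' code: the training sites, the uniform background, the pseudocount and the function names are all assumptions for illustration.

```python
import math

BASES = "ACGT"

def build_pssm(sites, background=None, pseudocount=1.0):
    """Build a log-odds position specific score matrix from aligned sites
    of equal length."""
    background = background or {b: 0.25 for b in BASES}
    pssm = []
    for i in range(len(sites[0])):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pssm.append(
            {b: math.log2((counts[b] / total) / background[b]) for b in BASES}
        )
    return pssm

def score(pssm, window):
    """Sum of per-position log-odds scores for one window."""
    return sum(column[base] for column, base in zip(pssm, window))

def best_hit(pssm, sequence):
    """Return (score, start) of the best-scoring window in the sequence.
    Assumes the sequence is at least as long as the matrix."""
    width = len(pssm)
    return max(
        (score(pssm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )

# Hypothetical toy training sites, not real p53 response elements.
toy_sites = ["ACGT", "ACGT", "ACGA"]
toy_pssm = build_pssm(toy_sites)
print(best_hit(toy_pssm, "GGACGTGG"))
```

A PSSM scores each position independently, which is the limitation the abstract's profile HMM training is said to address by sharing information across the repeated, palindromic motif positions.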

    A systems approach to identifying correlated gene targets for the loss of colour pigmentation in plants

    BACKGROUND: The numerous diverse metabolic pathways by which plant compounds can be produced make it difficult to predict how colour pigmentation is lost for different tissues and plants. This study employs mathematical and in silico methods to identify correlated gene targets for the loss of colour pigmentation in plants from a whole-cell perspective based on the full metabolic network of Arabidopsis. This involves extracting a self-contained flavonoid subnetwork from the AraCyc database and calculating feasible metabolic routes or elementary modes (EMs) for it. Those EMs leading to anthocyanin compounds are taken to constitute the anthocyanin biosynthetic pathway (ABP), and their interplay with the rest of the EMs is used to study the minimal cut sets (MCSs), which are different combinations of reactions to block for eliminating colour pigmentation. By relating the reactions to their corresponding genes, the MCSs are used to explore the phenotypic roles of the ABP genes, their relevance to the ABP and the impact their elimination would have on other processes in the cell.
    RESULTS: Simulation and prediction results of the effect of different MCSs for eliminating colour pigmentation correspond with existing experimental observations. Two examples are: (i) two MCSs that require the simultaneous suppression of genes DFR and ANS to eliminate colour pigmentation correspond to observations of the same genes being co-regulated to eliminate floral pigmentation in Aquilegia; and (ii) the impact of another MCS requiring CHS suppression corresponds to findings where the suppression of the early gene CHS eliminated nearly all flavonoids but did not affect the production of volatile benzenoids responsible for floral scent.
    CONCLUSIONS: Of the various MCSs identified for eliminating colour pigmentation, several correlate with existing experimental observations, indicating that different MCSs are suitable for different plants, different cells and different conditions, and could also be related to regulatory genes. Being able to correlate the predictions with experimental results gives credence to the use of these mathematical and in silico analysis methods in the design of experiments. The methods could be used to prioritize target enzymes for different objectives to achieve desired outcomes, especially for less understood pathways.
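The relationship between elementary modes and minimal cut sets described above can be sketched as a minimal hitting set problem: a cut set must contain at least one reaction from every EM that leads to the target compound, and it is minimal if no proper subset does the same. The brute-force enumeration below is an illustrative assumption, not the algorithm used in the study, and the reaction names are hypothetical.

```python
from itertools import combinations

def minimal_cut_sets(target_modes, max_size=3):
    """Enumerate minimal cut sets of up to max_size reactions.

    target_modes: iterable of reaction-id collections, one per elementary
    mode that produces the target compound.
    """
    modes = [frozenset(mode) for mode in target_modes]
    candidates = sorted(set().union(*modes))
    found = []
    # Sizes are visited in increasing order, so any cut set that contains
    # an already accepted one cannot be minimal and is skipped.
    for size in range(1, max_size + 1):
        for combo in combinations(candidates, size):
            cut = frozenset(combo)
            if any(prev <= cut for prev in found):
                continue
            if all(cut & mode for mode in modes):  # every mode is blocked
                found.append(cut)
    return [sorted(cut) for cut in found]

# Hypothetical toy pathway: one shared early step, then two branches.
toy_modes = [
    {"r_early", "r_branch_a"},
    {"r_early", "r_branch_b"},
]
print(minimal_cut_sets(toy_modes))
```

In this toy case, blocking the shared early step alone is one cut set, while blocking both branch reactions together is another, mirroring the single-gene and two-gene MCS examples in the abstract. The sketch ignores the side effects on non-target EMs that the study uses to compare cut sets.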

    Surface Hardness Impairment of Quorum Sensing and Swarming for Pseudomonas aeruginosa

    The importance of rhamnolipid to swarming of the bacterium Pseudomonas aeruginosa is well established. It is frequently, but not exclusively, observed that P. aeruginosa swarms in tendril patterns; formation of these tendrils requires rhamnolipid. We sought to explain the impact of surface changes on P. aeruginosa swarm tendril development. Here we report that P. aeruginosa quorum sensing and rhamnolipid production are impaired when growing on harder semi-solid surfaces. P. aeruginosa wild-type swarms showed large variation in tendril formation with small deviations from the “standard” swarm agar concentration of 0.5%. These macroscopic differences correlated with microscopic investigation of cells close to the advancing swarm edge using fluorescent gene reporters. Tendril swarms showed significant rhlA-gfp reporter expression right up to the advancing edge of swarming cells, while swarms without tendrils (grown on harder agar) showed no rhlA-gfp reporter expression near the advancing edge. This difference in rhamnolipid gene expression can be explained by the necessity of quorum sensing for rhamnolipid production. We provide evidence that harder surfaces seem to limit induction of quorum sensing genes near the advancing swarm edge, and these localized effects were sufficient to explain the lack of tendril formation on hard agar. We were unable to artificially stimulate rhamnolipid tendril formation by adding acyl-homoserine lactone signals or by increasing the carbon nutrients. This suggests that quorum sensing on surfaces is controlled in a manner that is not solely population dependent.

    Large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans

    It has long been suspected that the rate of mutation varies across the human genome at a large scale based on the divergence between humans and other species. However, it is now possible to directly investigate this question using the large number of de novo mutations (DNMs) that have been discovered in humans through the sequencing of trios. We investigate a number of questions pertaining to the distribution of mutations using more than 130,000 DNMs from three large datasets. We demonstrate that the amount and pattern of variation differs between datasets at the 1MB and 100KB scales, probably as a consequence of differences in sequencing technology and processing. In particular, datasets show different patterns of correlation to genomic variables such as replication time. Nevertheless, there are many commonalities between datasets, which likely represent true patterns. We show that there is variation in the mutation rate at the 100KB, 1MB and 10MB scales that cannot be explained by variation at smaller scales; however, the level of this variation is modest at large scales: at the 1MB scale we infer that ~90% of regions have a mutation rate within 50% of the mean. Different types of mutation show similar levels of variation and appear to vary in concert, which suggests the pattern of mutation is relatively constant across the genome. We demonstrate that variation in the mutation rate does not generate large-scale variation in GC-content, and hence that mutation bias does not maintain the isochore structure of the human genome. We find that genomic features explain less than 40% of the explainable variance in the rate of DNM. As expected, the rate of divergence between species is correlated to the rate of DNM. However, the correlations are weaker than expected if all the variation in divergence was due to variation in the mutation rate. We provide evidence that this is due to the effect of biased gene conversion on the probability that a mutation will become fixed. In contrast to divergence, we find that most of the variation in diversity can be explained by variation in the mutation rate. Finally, we show that the correlation between divergence and DNM density declines as increasingly divergent species are considered.

    Stoichiometric representation of gene-protein-reaction associations leverages constraint-based analysis from reaction to gene-level phenotype prediction

    Genome-scale metabolic reconstructions are currently available for hundreds of organisms. Constraint-based modeling enables the analysis of the phenotypic landscape of these organisms, predicting the response to genetic and environmental perturbations. However, since constraint-based models can only describe the metabolic phenotype at the reaction level, understanding the mechanistic link between genotype and phenotype is still hampered by the complexity of gene-protein-reaction associations. We implement a model transformation that enables constraint-based methods to be applied at the gene level by explicitly accounting for the individual fluxes of enzymes (and subunits) encoded by each gene. We show how this can be applied to different kinds of constraint-based analysis: flux distribution prediction, gene essentiality analysis, random flux sampling, elementary mode analysis, transcriptomics data integration, and rational strain design. In each case we demonstrate how this approach can lead to improved phenotype predictions and a deeper understanding of the genotype-to-phenotype link. In particular, we show that a large fraction of reaction-based designs obtained by current strain design methods are not actually feasible, and show how our approach allows using the same methods to obtain feasible gene-based designs. We also show, by extensive comparison with experimental 13C-flux data, how simple reformulations of different simulation methods with gene-wise objective functions result in improved prediction accuracy. The model transformation proposed in this work enables existing constraint-based methods to be used at the gene level without modification. This automatically leverages phenotype analysis from reaction to gene level, improving the biological insight that can be obtained from genome-scale models.
    DM was supported by the Portuguese Foundation for Science and Technology through a post-doc fellowship (ref: SFRH/BPD/111519/2015). This study was supported by the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of UID/BIO/04469/2013 unit and COMPETE2020 (POCI-01-0145-FEDER-006684) and BioTecNorte operation (NORTE-01-0145FEDER-000004) funded by European Regional Development Fund under the scope of Norte2020 - Programa Operacional Regional do Norte. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 686070. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.