25 research outputs found

    Path2Models: large-scale generation of computational models from biochemical pathway maps

    Get PDF
    Background: Systems biology projects and omics technologies have led to a growing number of biochemical pathway models and reconstructions. However, the majority of these models are still created de novo, based on literature mining and the manual processing of pathway data. Results: To increase the efficiency of model creation, the Path2Models project has automatically generated mathematical models from pathway representations using a suite of freely available software. Data sources include KEGG, BioCarta, MetaCyc and SABIO-RK. Depending on the source data, three types of models are provided: kinetic, logical and constraint-based. Models from over 2 600 organisms are encoded consistently in SBML, and are made freely available through BioModels Database at http://www.ebi.ac.uk/biomodels-main/path2models. Each model contains the list of participants, their interactions, the relevant mathematical constructs, and initial parameter values. Most models are also available as easy-to-understand graphical SBGN maps. Conclusions: To date, the project has resulted in more than 140 000 freely available models. Such a resource can tremendously accelerate the development of mathematical models by providing initial starting models for simulation and analysis, which can be subsequently curated and further parameterized

    SBML qualitative models: a model representation format and infrastructure to foster interactions between qualitative modelling formalisms and tools

    Get PDF
    Background: Qualitative frameworks, especially those based on the logical discrete formalism, are increasingly used to model regulatory and signalling networks. A major advantage of these frameworks is that they do not require precise quantitative data, and that they are well-suited for studies of large networks. While numerous groups have developed specific computational tools that provide original methods to analyse qualitative models, a standard format to exchange qualitative models has been missing. Results: We present the Systems Biology Markup Language (SBML) Qualitative Models Package (“qual”), an extension of the SBML Level 3 standard designed for computer representation of qualitative models of biological networks. We demonstrate the interoperability of models via SBML qual through the analysis of a specific signalling network by three independent software tools. Furthermore, the collective effort to define the SBML qual format paved the way for the development of LogicalModel, an open-source model library, which will facilitate the adoption of the format as well as the collaborative development of algorithms to analyse qualitative models. Conclusions: SBML qual allows the exchange of qualitative models among a number of complementary software tools. SBML qual has the potential to promote collaborative work on the development of novel computational approaches, as well as on the specification and the analysis of comprehensive qualitative models of regulatory and signalling networks

    Linking the Epigenome to the Genome: Correlation of Different Features to DNA Methylation of CpG Islands

    Get PDF
    DNA methylation of CpG islands plays a crucial role in the regulation of gene expression. More than half of all human promoters contain CpG islands with a tissue-specific methylation pattern in differentiated cells. Still today, the whole process of how DNA methyltransferases determine which region should be methylated is not completely revealed. There are many hypotheses of which genomic features are correlated to the epigenome that have not yet been evaluated. Furthermore, many explorative approaches of measuring DNA methylation are limited to a subset of the genome and thus, cannot be employed, e.g., for genome-wide biomarker prediction methods. In this study, we evaluated the correlation of genetic, epigenetic and hypothesis-driven features to DNA methylation of CpG islands. To this end, various binary classifiers were trained and evaluated by cross-validation on a dataset comprising DNA methylation data for 190 CpG islands in HEPG2, HEK293, fibroblasts and leukocytes. We achieved an accuracy of up to 91% with an MCC of 0.8 using ten-fold cross-validation and ten repetitions. With these models, we extended the existing dataset to the whole genome and thus, predicted the methylation landscape for the given cell types. The method used for these predictions is also validated on another external whole-genome dataset. Our results reveal features correlated to DNA methylation and confirm or disprove various hypotheses of DNA methylation related features. This study confirms correlations between DNA methylation and histone modifications, DNA structure, DNA sequence, genomic attributes and CpG island properties. Furthermore, the method has been validated on a genome-wide dataset from the ENCODE consortium. The developed software, as well as the predicted datasets and a web-service to compare methylation states of CpG islands are available at http://www.cogsys.cs.uni-tuebingen.de/software/dna-methylation/

    Integrative Pathway-Based Approach for Genome-Wide Association Studies: Identification of New Pathways for Rheumatoid Arthritis and Type 1 Diabetes

    Get PDF
    Genome-wide association studies (GWAS) led to the identification of numerous novel loci for a number of complex diseases. Pathway-based approaches using genotypic data provide tangible leads which cannot be identified by single marker approaches as implemented in GWAS. The available pathway analysis approaches mainly differ in the employed databases and in the applied statistics for determining the significance of the associated disease markers. So far, pathway-based approaches using GWAS data failed to consider the overlapping of genes among different pathways or the influence of protein–interactions. We performed a multistage integrative pathway (MIP) analysis on three common diseases - Crohn's disease (CD), rheumatoid arthritis (RA) and type 1 diabetes (T1D) - incorporating genotypic, pathway, protein- and domain-interaction data to identify novel associations between these diseases and pathways. Additionally, we assessed the sensitivity of our method by studying the influence of the most significant SNPs on the pathway analysis by removing those and comparing the corresponding pathway analysis results. Apart from confirming many previously published associations between pathways and RA, CD and T1D, our MIP approach was able to identify three new associations between disease phenotypes and pathways. This includes a relation between the influenza-A pathway and RA, as well as a relation between T1D and the phagosome and toxoplasmosis pathways. These results provide new leads to understand the molecular underpinnings of these diseases. The developed software herein used is available at http://www.cogsys.cs.uni-tuebingen.de/software/GWASPathwayIdentifier/index.htm

    BIOINFORMATICS ORIGINAL PAPER doi:10.1093/bioinformatics/bts508 Systems biology Advance Access publication August 24, 2012 Qualitative translation of relations from BioPAX to SBML qual

    No full text
    Motivation: The biological pathway exchange language (BioPAX) and the systems biology markup language (SBML) belong to the most popular modeling and data exchange languages in systems biology. The focus of SBML is quantitative modeling and dynamic simulation of models, whereas the BioPAX specification concentrates mainly on visualization and qualitative analysis of pathway maps. BioPAX describes reactions and relations. In contrast, SBML core exclusively describes quantitative processes such as reactions. With the SBML qualitative models extension (qual), it has recently also become possible to describe relations in SBML. Before the development of SBML qual, relations could not be properly translated into SBML. Until now, there exists no BioPAX to SBML converter that is fully capable of translating both reactions and relations. Results: The entire nature pathway interaction database has been converted from BioPAX (Level 2 and Level 3) into SBML (Level 3 Version 1) including both reactions and relations by using the new qual extension package. Additionally, we present the new webtool BioPAX2SBML for further BioPAX to SBML conversions. Compared with previous conversion tools, BioPAX2SBML is more comprehensive, more robust and more exact. Availability: BioPAX2SBML is freely available a

    Consistently significant pathways.

    No full text
    <p>To avoid that a pathway is only significant due to a small number of significant SNPs, we performed the multistage integrative pathway (MIP) analysis pipeline four times with different constraints. In the first MIP run, all SNPs are included. In the next three runs, only those having a <i>p</i>-value smaller than a threshold of 10<sup>−3</sup>, 10<sup>−4</sup>, and 10<sup>−5</sup>, respectively, are included. This table lists all pathways that had a <i>p</i>-value smaller than 0.05 during all four MIP runs. The complete results of each disease are shown in supplementary <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0078577#pone.0078577.s003" target="_blank">Tables S2</a>, <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0078577#pone.0078577.s004" target="_blank">S3</a> and <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0078577#pone.0078577.s005" target="_blank">S4</a>, and a more detailed overview of all four MIP runs which also includes a comparison to the literature is depicted in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0078577#pone.0078577.s006" target="_blank">Table S5</a>.</p

    Single feature class performance.

    No full text
    <p>Comparison of predictive performances of single feature classes. All values are taken from SVM predictions with feature files that only contain features belonging to the given class. Each prediction is an average of a ten-fold cross-validation with ten repetitions. The table shows the accuracy (ACC) and Matthews correlation coefficient (MCC) for each cell type and each feature class and is sorted by average MCC. Please note that the underlying data is imbalanced (because CpG islands tend to be unmethylated) and the average accuracy when assigning all CpG islands the unmethylated state is 0.71.</p
    corecore