845 research outputs found

    Uberon: towards a comprehensive multi-species anatomy ontology

    The lack of a single unified species-neutral ontology covering the anatomy of a variety of metazoans is a hindrance to translating model organism research to human health. We have developed an Uber-anatomy ontology to fill this need, filling the gap between the CARO upper-level ontology and species-specific anatomical ontologies.

    Analysis of the human diseasome reveals phenotype modules across common, genetic, and infectious diseases

    Phenotypes are the observable characteristics of an organism arising from its response to the environment. Phenotypes associated with engineered and natural genetic variation are widely recorded using phenotype ontologies in model organisms, as are signs and symptoms of human Mendelian diseases in databases such as OMIM and Orphanet. Exploiting these resources, several computational methods have been developed for integration and analysis of phenotype data to identify the genetic etiology of diseases or suggest plausible interventions. A similar resource would be highly useful not only for rare and Mendelian diseases, but also for common, complex and infectious diseases. We apply a semantic text-mining approach to identify the phenotypes (signs and symptoms) associated with over 8,000 diseases. We demonstrate that our method generates phenotypes that correctly identify known disease-associated genes in mice and humans with high accuracy. Using a phenotypic similarity measure, we generate a human disease network in which diseases that share signs and symptoms cluster together, and we use this network to identify phenotypic disease modules.

    The role of ontologies in biological and biomedical research: a functional perspective.

    Ontologies are widely used in biological and biomedical research. Their success lies in their combination of four main features present in almost all ontologies: provision of standard identifiers for classes and relations that represent the phenomena within a domain; provision of a vocabulary for a domain; provision of metadata that describes the intended meaning of the classes and relations in ontologies; and the provision of machine-readable axioms and definitions that enable computational access to some aspects of the meaning of classes and relations. While each of these features enables applications that facilitate data integration, data access and analysis, a great potential lies in the possibility of combining these four features to support integrative analysis and interpretation of multimodal data. Here, we provide a functional perspective on ontologies in biology and biomedicine, focusing on what ontologies can do and describing how they can be used in support of integrative research. We also outline perspectives for using ontologies in data-driven science, in particular their application in structured data mining and machine learning applications. This is the final version of the article. It first appeared from Oxford University Press via http://dx.doi.org/10.1093/bib/bbv01

    Ontology-based cross-species integration and analysis of Saccharomyces cerevisiae phenotypes

    Ontologies are widely used in the biomedical community for annotation and integration of databases. Formal definitions can relate classes from different ontologies and thereby integrate data across different levels of granularity, domains and species. We have applied this methodology to the Ascomycete Phenotype Ontology (APO), enabling the reuse of various orthogonal ontologies and we have converted the phenotype associated data found in the SGD following our proposed patterns. We have integrated the resulting data in the cross-species phenotype network PhenomeNET, and we make both the cross-species integration of yeast phenotypes and a similarity-based comparison of yeast phenotypes across species available in the PhenomeBrowser. Furthermore, we utilize our definitions and the yeast phenotype annotations to suggest novel functional annotations of gene products in yeast.

    Engineering polymer informatics: Towards the computer-aided design of polymers

    The computer-aided design of polymers is one of the holy grails of modern chemical informatics and of significant interest for a number of communities in polymer science. The paper outlines a vision for the in silico design of polymers and presents an information model for polymers based on modern semantic web technologies, thus laying the foundations for achieving the vision.

    A computational framework for inferring species dynamics and interactions with applications in microbiota ecology

    We present MBPert, a generic computational framework for inferring species interactions and predicting dynamics in time-evolving ecosystems from perturbation and time-series data. In this work, we contextualize the framework in microbial ecosystem modeling by coupling a modified generalized Lotka-Volterra formulation with machine learning optimization. Unlike traditional methods that rely on gradient matching, MBPert leverages numerical solutions of differential equations and iterative parameter estimation to robustly capture microbial dynamics. The framework is assessed within the context of two experimental scenarios: (i) paired before-and-after measurements under targeted perturbations, and (ii) longitudinal time-series data with time-dependent perturbations. Extensive simulation studies, benchmarking on standardized MTIST datasets, and application to Clostridium difficile infection in mice and repeated antibiotic perturbations of human gut microbiota demonstrate that MBPert accurately recapitulates species interactions and predicts system dynamics. Our results highlight MBPert as a powerful and flexible tool for mechanistic insight into microbiota ecology, with broad potential applicability to other complex dynamical systems.
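
To make the "numerical solutions of differential equations" idea concrete, here is a minimal sketch of forward-simulating a generalized Lotka-Volterra system with an external perturbation. It is not MBPert's code: the abstract does not give the modified formulation, so the textbook form dx_i/dt = x_i (r_i + sum_j A_ij x_j + eps_i u(t)) is assumed, and the names `glv_rhs`, `rk4_step`, `simulate`, `r`, `A`, `eps` and `u` are all illustrative. Parameter estimation, which the framework performs iteratively, is not shown.

```python
def glv_rhs(x, t, r, A, eps, u):
    """Assumed gLV right-hand side: x_i * (r_i + sum_j A_ij x_j + eps_i * u(t))."""
    n = len(x)
    return [
        x[i] * (r[i] + sum(A[i][j] * x[j] for j in range(n)) + eps[i] * u(t))
        for i in range(n)
    ]


def rk4_step(x, t, dt, rhs):
    """Advance the state x by one classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return [xi + h * ki for xi, ki in zip(x, k)]

    k1 = rhs(x, t)
    k2 = rhs(shifted(k1, 0.5 * dt), t + 0.5 * dt)
    k3 = rhs(shifted(k2, 0.5 * dt), t + 0.5 * dt)
    k4 = rhs(shifted(k3, dt), t + dt)
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


def simulate(x0, r, A, eps, u, t_end, dt=0.01):
    """Integrate from t = 0 to t_end and return the list of visited states."""
    x, t = list(x0), 0.0
    trajectory = [list(x)]
    for _ in range(round(t_end / dt)):
        x = rk4_step(x, t, dt, lambda y, s: glv_rhs(y, s, r, A, eps, u))
        t += dt
        trajectory.append(list(x))
    return trajectory
```

With a single species, growth rate 1 and self-interaction -1, the system reduces to logistic growth and the simulated abundance settles at the carrying capacity of 1. Under the assumed form, a constant perturbation with a negative `eps` lowers that equilibrium.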

    Analysis of translesion polymerases in colorectal cancer cells following cetuximab treatment: A network perspective

    Introduction: Adaptive mutagenesis observed in colorectal cancer (CRC) cells upon exposure to EGFR inhibitors contributes to the development of resistance and recurrence. Multiple investigations have indicated a parallel between cancer cells and bacteria in terms of exhibiting adaptive mutagenesis. This phenomenon entails a transient and coordinated escalation of error-prone translesion synthesis polymerases (TLS polymerases), resulting in mutagenesis of a magnitude sufficient to drive the selection of resistant phenotypes. Methods: In this study, we conducted a comprehensive pan-transcriptome analysis of the regulatory framework within CRC cells, with the objective of identifying potential transcriptome modules encompassing certain translesion polymerases and the associated transcription factors (TFs) that govern them. Our sampling strategy involved the collection of transcriptomic data from tumors treated with cetuximab, an EGFR inhibitor, untreated CRC tumors, and colorectal-derived cell lines, resulting in a diverse dataset. Subsequently, we identified co-regulated modules using weighted correlation network analysis with a minKMEtostay threshold set at 0.5 to minimize false-positive module identifications and mapped the modules to STRING annotations. Furthermore, we explored the putative TFs influencing these modules using KBoost, a kernel PCA regression model. Results: Our analysis did not reveal a distinct transcriptional profile specific to cetuximab treatment. Moreover, we elucidated co-expression modules housing genes such as POLK, POLI, POLQ, REV1, POLN, and POLM. Specifically, POLK, POLI, and POLQ were assigned to the “blue” module, which also encompassed critical DNA damage response enzymes, for example BRCA1, BRCA2, MSH6, and MSH2. To delineate the transcriptional control of this module, we investigated associated TFs, highlighting the roles of prominent cancer-associated TFs, such as CENPA, HNF1A, and E2F7. Conclusion: We found that translesion polymerases are co-regulated with DNA mismatch repair and cell cycle-associated factors. We did not, however, identify any networks specific to cetuximab treatment, indicating that the response to EGFR inhibitors relates to a general stress response mechanism.

    The RICORDO approach to semantic interoperability for biomedical data and models: strategy, standards and solutions.

    BACKGROUND: The practice and research of medicine generates considerable quantities of data and model resources (DMRs). Although in principle biomedical resources are re-usable, in practice few can currently be shared. In particular, the clinical communities in physiology and pharmacology research, as well as medical education, (i.e. PPME communities) are facing considerable operational and technical obstacles in sharing data and models. FINDINGS: We outline the efforts of the PPME communities to achieve automated semantic interoperability for clinical resource documentation in collaboration with the RICORDO project. Current community practices in resource documentation and knowledge management are overviewed. Furthermore, requirements and improvements sought by the PPME communities to current documentation practices are discussed. The RICORDO plan and effort in creating a representational framework and associated open software toolkit for the automated management of PPME metadata resources is also described. CONCLUSIONS: RICORDO is providing the PPME community with tools to effect, share and reason over clinical resource annotations. This work is contributing to the semantic interoperability of DMRs through ontology-based annotation by (i) supporting more effective navigation and re-use of clinical DMRs, as well as (ii) sustaining interoperability operations based on the criterion of biological similarity. Operations facilitated by RICORDO will range from automated dataset matching to model merging and managing complex simulation workflows. In effect, RICORDO is contributing to community standards for resource sharing and interoperability. RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.

    PhenomeNET: a whole-phenome approach to disease gene discovery

    Phenotypes are investigated in model organisms to understand and reveal the molecular mechanisms underlying disease. Phenotype ontologies were developed to capture and compare phenotypes within the context of a single species. Recently, these ontologies were augmented with formal class definitions that may be utilized to integrate phenotypic data and enable the direct comparison of phenotypes between different species. We have developed a method to transform phenotype ontologies into a formal representation, combine phenotype ontologies with anatomy ontologies, and apply a measure of semantic similarity to construct the PhenomeNET cross-species phenotype network. We demonstrate that PhenomeNET can identify orthologous genes, genes involved in the same pathway and gene–disease associations through the comparison of mutant phenotypes. We provide evidence that the Adam19 and Fgf15 genes in mice are involved in the tetralogy of Fallot, and, using zebrafish phenotypes, propose the hypothesis that the mammalian homologs of Cx36.7 and Nkx2.5 lie in a pathway controlling cardiac morphogenesis and electrical conductivity which, when defective, cause the tetralogy of Fallot phenotype. Our method implements a whole-phenome approach toward disease gene discovery and can be applied to prioritize genes for rare and orphan diseases for which the molecular basis is unknown.

    A novel generative adversarial networks modelling for the class imbalance problem in high dimensional omics data

    Class imbalance remains a large problem in high-throughput omics analyses, causing bias towards the over-represented class when training machine learning-based classifiers. Oversampling is a common method used to balance classes, allowing for better generalization of the training data. More naive approaches can introduce other biases into the data, being especially sensitive to inaccuracies in the training data, a problem considering the characteristically noisy data obtained in healthcare. This is especially a problem with high-dimensional data. A generative adversarial network-based method is proposed for creating synthetic samples from small, high-dimensional data, to improve upon other more naive generative approaches. The method was compared with ‘synthetic minority over-sampling technique’ (SMOTE) and ‘random oversampling’ (RO). Generative methods were validated by training classifiers on the balanced data.
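
For orientation, the simplest of the baselines named above, random oversampling, can be sketched in a few lines. This is only an illustration of the general technique: minority samples are duplicated at random until every class matches the largest one. It is neither the proposed GAN-based method nor SMOTE, and the function name `random_oversample` and its `seed` parameter are assumptions made for this example.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until all classes are equal in size."""
    rng = random.Random(seed)  # fixed seed so the resampling is reproducible
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # Only original samples are drawn from, so no new feature values are created.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels
```

Because the added rows are exact copies, any noise in a minority sample is copied along with it, which is the sensitivity to inaccurate training data that the abstract attributes to naive approaches.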