393 research outputs found

    Improving ontologies by automatic reasoning and evaluation of logical definitions

    BACKGROUND: Ontologies are widely used to represent knowledge in biomedicine. Systematic approaches for detecting errors and disagreements are needed for large ontologies with hundreds or thousands of terms and semantic relationships. A recent approach of defining terms using logical definitions is now increasingly being adopted as a method for quality control as well as for facilitating interoperability and data integration. RESULTS: We show how automated reasoning over logical definitions of ontology terms can be used to improve ontology structure. We provide the Java software package GULO (Getting an Understanding of LOgical definitions), which allows fast and easy evaluation for any kind of logically decomposed ontology by generating a composite OWL ontology from appropriate subsets of the referenced ontologies and comparing the inferred relationships with the relationships asserted in the target ontology. As a case study we show how to use GULO to evaluate the logical definitions that have been developed for the Mammalian Phenotype Ontology (MPO). CONCLUSIONS: Logical definitions of terms from biomedical ontologies represent an important resource for error and disagreement detection. GULO gives ontology curators a fast and simple tool for validation of their work
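
The comparison step described in this abstract (inferred relationships versus asserted relationships) can be illustrated with a minimal sketch. This is not GULO's code or API: the term names are hypothetical toy identifiers, and the "inferred" set is typed in by hand where a real workflow would obtain it from an OWL reasoner.

```python
def transitive_closure(edges):
    """Return all (descendant, ancestor) pairs implied by the given edges."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def compare_relationships(asserted, inferred):
    """Split the disagreements between asserted and inferred subclass links.

    missing:     inferred links that the asserted hierarchy does not imply
    unsupported: asserted links that the inferred hierarchy does not imply
    """
    missing = set(inferred) - transitive_closure(asserted)
    unsupported = set(asserted) - transitive_closure(inferred)
    return missing, unsupported


# Hypothetical toy data: (child, parent) subclass links.
asserted = {
    ("TOY:small_ear", "TOY:abnormal_ear"),
    ("TOY:abnormal_ear", "TOY:abnormal_head"),
    ("TOY:short_tail", "TOY:abnormal_head"),
}
# Assumed to come from a reasoner run over the logical definitions.
inferred = {
    ("TOY:small_ear", "TOY:abnormal_ear"),
    ("TOY:abnormal_ear", "TOY:abnormal_head"),
    ("TOY:small_ear", "TOY:abnormal_head"),
    ("TOY:short_tail", "TOY:abnormal_tail"),
}

missing, unsupported = compare_relationships(asserted, inferred)
print("inferred but not asserted:", sorted(missing))
print("asserted but not inferred:", sorted(unsupported))
```

Both lists are only candidates for curator review: a disagreement may point to an error in the asserted hierarchy or to an error in a logical definition.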

    An ontological foundation for ocular phenotypes and rare eye diseases.

    BACKGROUND: The optical accessibility of the eye and technological advances in ophthalmic diagnostics have put ophthalmology at the forefront of data-driven medicine. The focus of this study is rare eye disorders, a group of conditions whose clinical heterogeneity and geographic dispersion make data-driven, evidence-based practice particularly challenging. Inter-institutional collaboration and information sharing is crucial but the lack of standardised terminology poses an important barrier. Ontologies are computational tools that include sets of vocabulary terms arranged in hierarchical structures. They can be used to provide robust terminology standards and to enhance data interoperability. Here, we discuss the development of the ophthalmology-related component of two well-established biomedical ontologies, the Human Phenotype Ontology (HPO; includes signs, symptoms and investigation findings) and the Orphanet Rare Disease Ontology (ORDO; includes rare disease nomenclature/nosology). METHODS: A variety of approaches were used including automated matching to existing resources and extensive manual curation. To achieve the latter, a study group including clinicians, patient representatives and ontology developers from 17 countries was formed. A broad range of terms was discussed and validated during a dedicated workshop attended by 60 members of the group. RESULTS: A comprehensive, structured and well-defined set of terms has been agreed on including 1106 terms relating to ocular phenotypes (HPO) and 1202 terms relating to rare eye disease nomenclature (ORDO). These terms and their relevant annotations can be accessed in http://www.human-phenotype-ontology.org/ and http://www.orpha.net/ ; comments, corrections, suggestions and requests for new terms can be made through these websites. This is an ongoing, community-driven endeavour and both HPO and ORDO are regularly updated. 
    CONCLUSIONS: To our knowledge, this is the first effort of such scale to provide terminology standards for the rare eye disease community. We hope that this work will not only improve coding and standardise information exchange in clinical care and research, but also it will catalyse the transition to an evidence-based precision ophthalmology paradigm

    The Monarch Initiative in 2019: an integrative data and analytic platform connecting phenotypes to genotypes across species.

    In biology and biomedicine, relating phenotypic outcomes with genetic variation and environmental factors remains a challenge: patient phenotypes may not match known diseases, candidate variants may be in genes that haven't been characterized, research organisms may not recapitulate human or veterinary diseases, environmental factors affecting disease outcomes are unknown or undocumented, and many resources must be queried to find potentially significant phenotypic associations. The Monarch Initiative (https://monarchinitiative.org) integrates information on genes, variants, genotypes, phenotypes and diseases in a variety of species, and allows powerful ontology-based search. We develop many widely adopted ontologies that together enable sophisticated computational analysis, mechanistic discovery and diagnostics of Mendelian diseases. Our algorithms and tools are widely used to identify animal models of human disease through phenotypic similarity, for differential diagnostics and to facilitate translational research. Launched in 2015, Monarch has grown with regards to data (new organisms, more sources, better modeling); new API and standards; ontologies (new Mondo unified disease ontology, improvements to ontologies such as HPO and uPheno); user interface (a redesigned website); and community development. Monarch data, algorithms and tools are being used and extended by resources such as GA4GH and NCATS Translator, among others, to aid mechanistic discovery and diagnostics

    Phenotypic overlap in the contribution of individual genes to CNV pathogenicity revealed by cross-species computational analysis of single-gene mutations in humans, mice and zebrafish

    SUMMARY Numerous disease syndromes are associated with regions of copy number variation (CNV) in the human genome and, in most cases, the pathogenicity of the CNV is thought to be related to altered dosage of the genes contained within the affected segment. However, establishing the contribution of individual genes to the overall pathogenicity of CNV syndromes is difficult and often relies on the identification of potential candidates through manual searches of the literature and online resources. We describe here the development of a computational framework to comprehensively search phenotypic information from model organisms and single-gene human hereditary disorders, and thus speed the interpretation of the complex phenotypes of CNV disorders. There are currently more than 5000 human genes about which nothing is known phenotypically but for which detailed phenotypic information for the mouse and/or zebrafish orthologs is available. Here, we present an ontology-based approach to identify similarities between human disease manifestations and the mutational phenotypes in characterized model organism genes; this approach can therefore be used even in cases where there is little or no information about the function of the human genes. We applied this algorithm to detect candidate genes for 27 recurrent CNV disorders and identified 802 gene-phenotype associations, approximately half of which involved genes that were previously reported to be associated with individual phenotypic features and half of which were novel candidates. A total of 431 associations were made solely on the basis of model organism phenotype data. Additionally, we observed a striking, statistically significant tendency for individual disease phenotypes to be associated with multiple genes located within a single CNV region, a phenomenon that we denote as pheno-clustering. 
    Many of the clusters also display statistically significant similarities in protein function or vicinity within the protein-protein interaction network. Our results provide a basis for understanding previously un-interpretable genotype-phenotype correlations in pathogenic CNVs and for mobilizing the large amount of model organism phenotype data to provide insights into human genetic disorders

    Exact score distribution computation for ontological similarity searches

    BACKGROUND: Semantic similarity searches in ontologies are an important component of many bioinformatic algorithms, e.g., finding functionally related proteins with the Gene Ontology or phenotypically similar diseases with the Human Phenotype Ontology (HPO). We have recently shown that the performance of semantic similarity searches can be improved by ranking results according to the probability of obtaining a given score at random rather than by the scores themselves. However, to date, there are no algorithms for computing the exact distribution of semantic similarity scores, which is necessary for computing the exact P-value of a given score. RESULTS: In this paper we consider the exact computation of score distributions for similarity searches in ontologies, and introduce a simple null hypothesis which can be used to compute a P-value for the statistical significance of similarity scores. We concentrate on measures based on Resnik's definition of ontological similarity. A new algorithm is proposed that collapses subgraphs of the ontology graph and thereby allows fast score distribution computation. The new algorithm is several orders of magnitude faster than the naive approach, as we demonstrate by computing score distributions for similarity searches in the HPO. It is shown that exact P-value calculation improves clinical diagnosis using the HPO compared to approaches based on sampling. CONCLUSIONS: The new algorithm enables for the first time exact P-value calculation via exact score distribution computation for ontology similarity searches. The approach is applicable to any ontology for which the annotation-propagation rule holds and can improve any bioinformatic method that makes only use of the raw similarity scores. The algorithm was implemented in Java, supports any ontology in OBO format, and is available for non-commercial and academic usage under: https://compbio.charite.de/svn/hpo/trunk/src/tools/significance/
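
To make the idea of an exact P-value for a Resnik-based score concrete, here is a minimal sketch on a toy ontology. It is the naive full-enumeration approach that the abstract contrasts with, not the subgraph-collapsing algorithm of the paper. The ontology, the annotations, the averaged best-match scoring function and the null model (every term set of the query's size is equally likely) are all assumptions made for illustration.

```python
import math
from itertools import combinations

# Hypothetical toy ontology: term -> list of parent terms.
PARENTS = {
    "root": [], "A": ["root"], "B": ["root"],
    "A1": ["A"], "A2": ["A"], "B1": ["B"],
}
# Hypothetical annotated items (e.g. diseases) and their direct terms.
ANNOTATIONS = {
    "item1": {"A1"}, "item2": {"A2", "B1"},
    "item3": {"B1"}, "item4": {"A1", "B"},
}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen, stack = {term}, [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log(fraction of items annotated to t or a descendant of t).

    Assumes every term covers at least one item (true for the toy data).
    """
    n = len(ANNOTATIONS)
    ic = {}
    for term in PARENTS:
        covered = sum(
            1 for terms in ANNOTATIONS.values()
            if any(term in ancestors(t) for t in terms)
        )
        ic[term] = -math.log(covered / n)
    return ic


IC = information_content()


def resnik(t1, t2):
    """Resnik similarity: IC of the most informative common ancestor."""
    return max(IC[a] for a in ancestors(t1) & ancestors(t2))


def score(query, target):
    """Average, over query terms, of the best match among the target terms."""
    return sum(max(resnik(q, t) for t in target) for q in query) / len(query)


def exact_p_value(query, target):
    """Fraction of all same-size term sets scoring at least as high as query."""
    observed = score(query, target)
    scores = [score(q, target) for q in combinations(sorted(PARENTS), len(query))]
    return sum(s >= observed for s in scores) / len(scores)


print(exact_p_value(("A1",), ANNOTATIONS["item1"]))
```

The enumeration grows combinatorially with the query size and the number of terms, which is why a faster exact method matters for an ontology the size of the HPO.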
