660 research outputs found

    Leveraging Terminological Resources for Mapping between Rare

    Biomedical ontology alignment: An approach based on representation learning

    While representation learning techniques have shown great promise in application to a number of different NLP tasks, they have had little impact on the problem of ontology matching. Unlike past work that has focused on feature engineering, we present a novel representation learning approach that is tailored to the ontology matching task. Our approach is based on embedding ontological terms in a high-dimensional Euclidean space. This embedding is derived on the basis of a novel phrase retrofitting strategy through which semantic similarity information becomes inscribed onto fields of pre-trained word vectors. The resulting framework also incorporates a novel outlier detection mechanism based on a denoising autoencoder that is shown to improve performance. An ontology matching system derived using the proposed framework achieved an F-score of 94% on an alignment scenario involving the Adult Mouse Anatomical Dictionary and the Foundational Model of Anatomy ontology (FMA) as targets. This compares favorably with the best performing systems on the Ontology Alignment Evaluation Initiative anatomy challenge. We performed additional experiments on aligning FMA to NCI Thesaurus and to SNOMED CT based on a reference alignment extracted from the UMLS Metathesaurus. Our system obtained overall F-scores of 93.2% and 89.2% for these experiments, thus achieving state-of-the-art results.
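
    As a rough illustration of how an alignment is scored against a reference alignment, the sketch below computes precision, recall and a balanced F-score over sets of mapped term pairs. The abstract does not say which F-measure variant was used, so the balanced F1 is an assumption here; the `f_score` helper and all term identifiers are hypothetical.

```python
def f_score(predicted, reference):
    """Return (precision, recall, F1) for two collections of (source, target) pairs."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall (assumed variant).
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical identifiers, for illustration only.
reference = {("mouse:1", "human:1"), ("mouse:2", "human:2"),
             ("mouse:3", "human:3"), ("mouse:4", "human:4")}
predicted = {("mouse:1", "human:1"), ("mouse:2", "human:2"),
             ("mouse:3", "human:9")}

precision, recall, f1 = f_score(predicted, reference)
```

    In this toy case two of the three predicted mappings are correct and two of the four reference mappings are recovered, so precision is 2/3 and recall is 1/2.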

    A Simple Standard for Sharing Ontological Mappings (SSSOM).

    Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases poses a major impediment to data integration and interoperability. Mappings often lack the metadata needed to be correctly interpreted and applied. For example, are two terms equivalent or merely related? Are they narrow or broad matches? Or are they associated in some other way? Such relationships between the mapped terms are often not documented, which leads to incorrect assumptions and makes them hard to use in scenarios that require a high degree of precision (such as diagnostics or risk prediction). Furthermore, the lack of descriptions of how mappings were done makes it hard to combine and reconcile mappings, particularly curated and automated ones. We have developed the Simple Standard for Sharing Ontological Mappings (SSSOM) which addresses these problems by: (i) Introducing a machine-readable and extensible vocabulary to describe metadata that makes imprecision, inaccuracy and incompleteness in mappings explicit. (ii) Defining an easy-to-use simple table-based format that can be integrated into existing data science pipelines without the need to parse or query ontologies, and that integrates seamlessly with Linked Data principles. (iii) Implementing open and community-driven collaborative workflows that are designed to evolve the standard continuously to address changing requirements and mapping practices. (iv) Providing reference tools and software libraries for working with the standard. In this paper, we present the SSSOM standard, describe several use cases in detail and survey some of the existing work on standardizing the exchange of mappings, with the goal of making mappings Findable, Accessible, Interoperable and Reusable (FAIR). The SSSOM specification can be found at http://w3id.org/sssom/spec. 
Database URL: http://w3id.org/sssom/spec
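
    To illustrate why a table-based format fits ordinary data pipelines, here is a minimal sketch that reads a small SSSOM-style TSV with Python's standard `csv` module and keeps only the exact matches, without touching any ontology. The column names (`subject_id`, `predicate_id`, `object_id`, `mapping_justification`, `confidence`) are intended to follow the SSSOM specification and should be checked against it; the rows, the identifiers and the `exact_matches` helper are made up for illustration.

```python
import csv
import io

# Made-up rows for illustration; all identifiers are hypothetical.
SSSOM_TSV = (
    "subject_id\tpredicate_id\tobject_id\tmapping_justification\tconfidence\n"
    "EX:0001\tskos:exactMatch\tOTHER:1001\tsemapv:ManualMappingCuration\t0.95\n"
    "EX:0002\tskos:broadMatch\tOTHER:1002\tsemapv:LexicalMatching\t0.70\n"
    "EX:0003\tskos:exactMatch\tOTHER:1003\tsemapv:LexicalMatching\t0.60\n"
)


def exact_matches(tsv_text, min_confidence=0.0):
    """Yield (subject, object) pairs whose predicate is skos:exactMatch."""
    # Skip '#'-prefixed lines; this sketch does not parse embedded metadata.
    lines = (line for line in io.StringIO(tsv_text) if not line.startswith("#"))
    for row in csv.DictReader(lines, delimiter="\t"):
        if (row["predicate_id"] == "skos:exactMatch"
                and float(row["confidence"]) >= min_confidence):
            yield row["subject_id"], row["object_id"]


pairs = list(exact_matches(SSSOM_TSV, min_confidence=0.8))
```

    Because the predicate is an explicit column, a consumer that needs high precision can filter on it directly instead of assuming every mapping is an equivalence.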

    Knowledge-based Biomedical Data Science 2019

    Knowledge-based biomedical data science (KBDS) involves the design and implementation of computer systems that act as if they knew about biomedicine. Such systems depend on formally represented knowledge in computer systems, often in the form of knowledge graphs. Here we survey the progress in the last year in systems that use formally represented knowledge to address data science problems in both clinical and biological domains, as well as on approaches for creating knowledge graphs. Major themes include the relationships between knowledge graphs and machine learning, the use of natural language processing, and the expansion of knowledge-based approaches to novel domains, such as Chinese Traditional Medicine and biodiversity.
    Comment: Manuscript 43 pages with 3 tables; Supplemental material 43 pages with 3 tables

    iASiS Open Data Graph: Automated Semantic Integration of Disease-Specific Knowledge

    In biomedical research, unified access to up-to-date domain-specific knowledge is crucial, as such knowledge is continuously accumulated in scientific literature and structured resources. Identifying and extracting specific information is a challenging task, and computational analysis of knowledge bases can be valuable in this direction. However, for disease-specific analyses researchers often need to compile their own datasets, integrating knowledge from different resources, or reuse existing datasets that can be out of date. In this study, we propose a framework to automatically retrieve and integrate disease-specific knowledge into an up-to-date semantic graph, the iASiS Open Data Graph. This disease-specific semantic graph provides access to knowledge relevant to specific concepts and their individual aspects, in the form of concept relations and attributes. The proposed approach is implemented as an open-source framework and applied to three diseases (Lung Cancer, Dementia, and Duchenne Muscular Dystrophy). Exemplary queries are presented, investigating the potential of this automatically generated semantic graph as a basis for retrieval and analysis of disease-specific knowledge.
    Comment: 6 pages, 2 figures, accepted in IEEE 33rd International Symposium on Computer Based Medical Systems (CBMS2020)

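    Intégration de ressources en recherche translationnelle : une approche unificatrice en support des systèmes de santé "apprenants" [Integrating resources in translational research: a unifying approach in support of "learning" health systems]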

    Learning health systems (LHS) are gradually emerging and propose a complementary approach to translational research challenges by implementing close coupling of health care delivery, research and knowledge translation. To support coherent knowledge sharing, the system needs to rely on an integrated and efficient data integration platform. The framework and its theoretical foundations presented here aim at addressing this challenge. Data integration approaches are analysed in light of the requirements derived from LHS activities, and data mediation emerges as the one best suited to an LHS. The semantics of clinical data found in biomedical sources can only be fully derived by taking into account not only information from the structural models (field X of table Y), but also terminological information (e.g. International Classification of Disease 10th revision) used to encode facts. The unified framework proposed here takes this into account. The platform has been implemented and tested in the context of the TRANSFoRm endeavour, a European project funded by the European Commission. It aims at developing an LHS including clinical activities in primary care. The mediation model developed for the TRANSFoRm project, the Clinical Data Integration Model, is presented and discussed. Results from TRANSFoRm use-cases are presented. They illustrate how a unified data sharing platform can support and enhance prospective research activities in the context of an LHS. In the end, the unified mediation framework presented here allows sufficient expressiveness for the TRANSFoRm needs. It is flexible and modular, and the CDIM mediation model supports the requirements of a primary care LHS.
Learning health systems (LHS) offer a complementary, emerging approach to the problems of translational research by closely coupling health care, research and knowledge transfer. To enable a coherent and optimized flow of information, the system must be equipped with an integrated data sharing platform. The work presented here aims to propose a unified data sharing approach for LHS. The main data integration approaches are analysed with respect to the LHS. The semantics of the clinical information available in biomedical sources results from knowledge of the sources' structural models and also from knowledge of the terminological models used to encode the information. The mechanisms of the unified platform that take this interdependence into account are described. The platform was implemented and tested within the TRANSFoRm project, a European project that aims to develop an LHS. The instantiation of the mediation model for the TRANSFoRm project, the Clinical Data Integration Model, is analysed. Results from one of the TRANSFoRm use cases supporting research are also presented, to give a concrete view of the platform's impact on how the LHS operates. In the end, the unified integration platform proposed here provides a level of expressiveness sufficient for the needs of TRANSFoRm. The system is flexible and modular, and the CDIM mediation model covers the requirements expressed for supporting the activities of an LHS such as TRANSFoRm.