    Biomedical ontology alignment: An approach based on representation learning

    While representation learning techniques have shown great promise in a number of NLP tasks, they have had little impact on the problem of ontology matching. Unlike past work that has focused on feature engineering, we present a novel representation learning approach that is tailored to the ontology matching task. Our approach is based on embedding ontological terms in a high-dimensional Euclidean space. This embedding is derived on the basis of a novel phrase retrofitting strategy through which semantic similarity information becomes inscribed onto fields of pre-trained word vectors. The resulting framework also incorporates a novel outlier detection mechanism based on a denoising autoencoder that is shown to improve performance. An ontology matching system derived using the proposed framework achieved an F-score of 94% on an alignment scenario involving the Adult Mouse Anatomical Dictionary and the Foundational Model of Anatomy ontology (FMA) as targets. This compares favorably with the best performing systems on the Ontology Alignment Evaluation Initiative anatomy challenge. We performed additional experiments on aligning FMA to NCI Thesaurus and to SNOMED CT based on a reference alignment extracted from the UMLS Metathesaurus. Our system obtained overall F-scores of 93.2% and 89.2% for these experiments, thus achieving state-of-the-art results.

    Visualization for biomedical ontologies alignment

    Master's thesis, Bioinformatics and Computational Biology (Bioinformatics), Universidade de Lisboa, Faculdade de Ciências, 2016. Since the beginning of the century, biomedical research and clinical practice have resulted in the accumulation of very large amounts of information, e.g. data from genomic sequencing or medical records. Ontologies provide a structured model to represent knowledge and have been quite successful in the biomedical domain at improving interoperability and sharing. The disconnected development of biomedical ontologies has led to the creation of models that have overlapping or even identical domains. Ontology matching techniques were developed to establish meaningful connections between the classes of the ontologies, in other words, to create alignments. To achieve an optimal alignment, it is important not only to improve the matching techniques but also to create the tools needed for human intervention, particularly in visualization. Despite the importance of user intervention and visualization in ontology matching, few systems support them, especially for large and complex ontologies such as those in the biomedical domain, and specifically in the context of alignment revision and the explanation of logical incoherences. The central objective of this thesis was to investigate the main ontology visualization paradigms in the context of biomedical ontology matching, and to develop visualization and interaction approaches that address those challenges. The work led to the creation of a new visualization module for a state-of-the-art ontology matching system that supports alignment review, and to the construction of an online tool that aims to help the user understand the conflicts found in the alignments, both based on a subgraph visualization approach. Both contributions were evaluated on a small scale through user tests, which revealed the relevance of subgraph visualization compared with the tree visualization more common in the biomedical domain.

    Dividing the Ontology Alignment Task with Semantic Embeddings and Logic-based Modules

    Large ontologies still pose serious challenges to state-of-the-art ontology alignment systems. In this paper we present an approach that combines a neural embedding model and logic-based modules to accurately divide an input ontology matching task into smaller and more tractable matching (sub)tasks. We have conducted a comprehensive evaluation using the datasets of the Ontology Alignment Evaluation Initiative. The results are encouraging and suggest that the proposed method is adequate in practice and can be integrated within the workflow of systems unable to cope with very large ontologies.

    Building an effective and efficient background knowledge resource to enhance ontology matching

    Ontology matching is critical for data integration and interoperability. Original ontology matching approaches relied solely on the content of the ontologies to align. However, these approaches are less effective when equivalent concepts have dissimilar labels and are structured with different modeling views. To overcome this semantic heterogeneity, the community has turned to the use of external background knowledge resources. Several methods have been proposed to select ontologies, other than the ones to align, as background knowledge to enhance a given ontology-matching task. However, these methods return a set of complete ontologies, while, in most cases, only fragments of the returned ontologies are effective for discovering new mappings. In this article, we propose an approach to select and build a background knowledge resource with just the right concepts chosen from a set of ontologies, which improves efficiency without loss of effectiveness. The use of background knowledge in ontology matching is a double-edged sword: while it may increase recall (i.e., retrieve more correct mappings), it may lower precision (i.e., produce more incorrect mappings). Therefore, we propose two methods to select the most relevant mappings from the candidate ones: (1) a selection based on a set of rules and (2) a selection based on supervised machine learning. Our experiments, conducted on two Ontology Alignment Evaluation Initiative (OAEI) datasets, confirm the effectiveness and efficiency of our approach. Moreover, the F-measure values obtained with our approach are competitive with those of the state-of-the-art matchers exploiting background knowledge resources.
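
The precision/recall trade-off described in this abstract can be illustrated with a minimal sketch. It assumes an alignment is represented as a set of (source, target) pairs and uses the standard definitions of precision, recall, and F-measure; the function name and the toy mappings are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(predicted, reference):
    """Score a predicted alignment against a reference alignment.

    Both arguments are iterables of (source_concept, target_concept) pairs.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data: four correct mappings in the reference alignment.
reference = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}

# A matcher using only the ontologies' own content finds two correct mappings.
direct = {("a1", "b1"), ("a2", "b2")}

# Background knowledge adds one correct and one incorrect mapping.
with_background = direct | {("a3", "b3"), ("a4", "b9")}

print(precision_recall_f1(direct, reference))
print(precision_recall_f1(with_background, reference))
```

In this toy case the added mappings raise recall from 0.5 to 0.75 while lowering precision from 1.0 to 0.75, which is the situation a mapping-selection step is meant to mitigate.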