
    Complex matching of RDF datatype properties

    Property mapping is a fundamental component of ontology matching, and yet there is little support that goes beyond the identification of single property matches. Real data often requires some degree of composition, ranging from trivial cases, such as mapping "first name" and "last name" to "full name", to complex matchings, such as parsing and pairing symbol/digit strings to SSNs, at the other end of the spectrum. In this paper, we propose a two-phase instance-based technique for complex datatype property matching. Phase 1 computes the Estimated Mutual Information matrix of the property values to (1) find simple, 1:1 matches, and (2) compute a list of possible complex matches. Phase 2 applies Genetic Programming to the much reduced search space of candidate matches to find complex matches. We conclude with experimental results that illustrate how the technique works. Furthermore, we show that the proposed technique greatly improves results over those obtained if the Estimated Mutual Information matrix or the Genetic Programming techniques were used independently. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-40285-2_18
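    The Phase 1 idea can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that property values are already aligned by instance, uses the plain empirical mutual information (in bits) as the matrix entry, and flags a target property as a candidate when its score reaches 90% of the source property's own entropy. The function names, the toy data, and the 0.9 threshold are all hypothetical choices made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (bits) between two aligned value lists."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # Sum over observed pairs of p(x,y) * log2(p(x,y) / (p(x) * p(y))).
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def mi_matrix(source, target):
    """Score every (source property, target property) pair."""
    return {(s, t): mutual_information(sv, tv)
            for s, sv in source.items()
            for t, tv in target.items()}


def candidates(source, target, ratio=0.9):
    """Keep target properties whose score is close to the source's entropy.

    The ratio is an assumed cut-off, not a value taken from the paper.
    """
    matrix = mi_matrix(source, target)
    result = {}
    for s, sv in source.items():
        entropy = mutual_information(sv, sv)  # I(X; X) equals H(X)
        result[s] = [t for t in target if matrix[(s, t)] >= ratio * entropy]
    return result


# Toy data: each list position is one instance (hypothetical values).
source = {
    "first": ["Ann", "Bob", "Ann", "Cid"],
    "last": ["Lee", "Lee", "Ray", "Ray"],
}
target = {
    "given": ["Ann", "Bob", "Ann", "Cid"],
    "family": ["Lee", "Lee", "Ray", "Ray"],
    "full": ["Ann Lee", "Bob Lee", "Ann Ray", "Cid Ray"],
}

found = candidates(source, target)
print(found)
```

    In this toy data, "full" scores highly against both "first" and "last", which is the kind of signal that could mark it as a candidate for a complex match to be resolved in a second phase. Working out the actual composition (here, concatenation) is left to that second phase and is not attempted in the sketch.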

    Compound matching for multiple ontologies

    Master's thesis, Bioinformatics and Computational Biology (Bioinformatics), Universidade de Lisboa, Faculdade de Ciências, 2019. Multi-domain areas, such as the biomedical field, routinely employ networks of ontologies to support applications such as data annotation, integration, search and analysis. These networks can be built using ontology matching techniques; however, most existing approaches are limited to matches between two ontologies, the large majority being simple equivalences. In multi-domain scenarios, there is a need to discover more complex mappings that may involve multiple ontologies, i.e., compound mappings. This thesis proposes a novel compound matching algorithm, able to compose mappings between a source class and a class expression relating multiple classes from multiple target ontologies. It addresses the limitations of previous approaches, which only considered two target classes from two target ontologies. The algorithm is based on the efficient lexical matching approaches in AgreementMakerLight. An automatic evaluation was carried out against partial reference alignments using both classical evaluation metrics and novel ones more suited to compound alignment evaluation. Although results with classical metrics are rather poor (a fact not helped by the incompleteness of the reference alignments), the novel evaluation metrics, designed to measure the usefulness of a mapping in an interactive alignment scenario, are promising, with lower precision but recall values in the 80-98% range.

    Methods for Matching of Linked Open Social Science Data

    In recent years, the concept of Linked Open Data (LOD) has gained popularity and acceptance across various communities and domains. Science policy bodies and organizations claim that semantic technologies, and data exposed in this manner, have the potential to support and enhance research processes and the infrastructures that provide research information and services. In this thesis, we investigate whether these expectations can be met in the domain of the social sciences. In particular, we analyse and develop methods for matching social scientific data that is published as Linked Data, which we introduce as Linked Open Social Science Data. Based on expert interviews and a prototype application, we investigate the current consumption of LOD in the social sciences and its requirements. Following these insights, we first focus on the complete publication of Linked Open Social Science Data by extending and developing domain-specific ontologies for representing research communities, research data and thesauri. In the second part, methods for matching Linked Open Social Science Data are developed that address particular patterns and characteristics of the data typically used in social research. The results of this work contribute towards enabling a meaningful application of Linked Data in a scientific domain.

    [en] COMPLEX MATCHING OF RDF DATATYPE PROPERTIES

    No full text