69 research outputs found
Recommended from our members
Results of the ontology alignment evaluation initiative 2019
The Ontology Alignment Evaluation Initiative (OAEI) aims at comparing ontology matching systems on precisely defined test cases. These test cases can be based on ontologies of different levels of complexity (from simple thesauri to expressive OWL ontologies) and use different evaluation modalities (e.g., blind evaluation, open evaluation, or consensus). The OAEI 2019 campaign offered 11 tracks with 29 test cases, and was attended by 20 participants. This paper is an overall presentation of that campaign
LogMap family participation in the OAEI2018
We present the participation of LogMap and its variants in the OAEI 2018 campaign. The LogMap project started in January 2011 with the objective of developing a scalable and logic-based ontology matching system. This is our eight participation in the OAEI and the experience has so far been very positive. LogMap is one of the few systems that participates in (almost) all OAEI tracks
Recommended from our members
SemTab 2019: Resources to Benchmark Tabular Data to Knowledge Graph Matching Systems
Tabular data to Knowledge Graph matching is the process of assigning semantic tags from knowledge graphs (e.g., Wikidata or DBpedia) to the elements of a table. This task is a challenging problem for various reasons, including the lack of metadata (e.g., table and column names), the noisiness, heterogeneity, incompleteness and ambiguity in the data. The results of this task provide significant insights about potentially highly valuable tabular data, as recent works have shown, enabling a new family of data analytics and data science applications. Despite significant amount of work on various flavors of this problem, there is a lack of a common framework to conduct a systematic evaluation of state-of-the-art systems. The creation of the Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab) aims at filling this gap. In this paper, we report about the datasets, infrastructure and lessons learned from the first edition of the SemTab challenge
Alinhamento de vocabulário de domínio utilizando os sistemas AML e LogMap
Introduction: In the context of the Semantic Web, interoperability among
heterogeneous ontologies is a challenge due to several factors, among which semantic ambiguity and redundancy stand out. To overcome these challenges, systems and algorithms are adopted to align different ontologies. In this study, it is understood that controlled vocabularies are a particular form of ontology.
Objective: to obtain a vocabulary resulting from the alignment and fusion of the Vocabularies Scientific Domains and Scientific Areas of the Foundation for Science and Technology, - FCT, European Science Vocabulary - EuroSciVoc and United Nations Educational, Scientific and Cultural Organization - UNESCO nomenclature for fields of Science and Technology, in the Computing Sciences domain, to be used
in the IViSSEM project. Methodology: literature review on systems/algorithms for
ontology alignment, using the Preferred Reporting Items for Systematic Reviews
and Meta-Analyses - PRISMA methodology; alignment of the three vocabularies;
and validation of the resulting vocabulary by means of a Delphi study. Results: we
proceeded to analyze the 25 ontology alignment systems and variants that
participated in at least one track of the Ontology Alignment Evaluation Initiative
competition between 2018 and 2019. From these systems, Agreement Maker Light
and Log Map were selected to perform the alignment of the three vocabularies,
making a cut to the area of Computer Science. Conclusion: The vocabulary was
obtained from Agreement Maker Light for having presented a better performance.
At the end, a vocabulary with 98 terms was obtained in the Computer Science
domain to be adopted by the IViSSEM project. The alignment resulted from the
vocabularies used by FCT (Portugal), with the one adopted by the European Union
(EuroSciVoc) and another one from the domain of Science & Technology
(UNESCO). This result is beneficial to other universities and projects, as well as to
FCT itself.Introdução: No contexto da Web Semântica, a interoperabilidade entre ontologias heterogêneas é um desafio devido a diversos fatores entre os quais se destacam a ambiguidade e a redundância semântica. Para superar tais desafios, adota-se sistemas e algoritmos para alinhamento de diferentes ontologias. Neste estudo, entende-se que vocabulários controlados são uma forma particular de ontologias.
Objetivo: obter um vocabulário resultante do alinhamento e fusão dos vocabulários
Domínios Científicos e Áreas Científicas da Fundação para Ciência e Tecnologia, - FCT, European Science Vocabulary - EuroSciVoc e Organização das Nações Unidas para a Educação, a Ciência e a Cultura - UNESCO nomenclature for fields of Science and
Technology, no domínio Ciências da Computação, para ser usado no âmbito do projeto IViSSEM. Metodologia: revisão da literatura sobre sistemas/algoritmos para
alinhamento de ontologias, utilizando a metodologia Preferred Reporting Items for Systematic Reviews and Meta-Analyses - PRISMA; alinhamento dos três
vocabulários; e validação do vocabulário resultante por meio do estudo Delphi.
Resultados: procedeu-se à análise dos 25 sistemas de alinhamento de ontologias e
variantes que participaram de pelo menos uma track da competição Ontology
Alignment Evaluation Iniciative entre 2018 e 2019. Destes sistemas foram
selecionados Agreement Maker Light e LogMap para realizar o alinhamento dos três
vocabulários, fazendo um recorte para a área da Ciência da Computação.
Conclusão: O vocabulário foi obtido a partir do Agreement Maker Light por ter
apresentado uma melhor performance. Ao final foi obtido o vocabulário, com 98
termos, no domínio da Ciência da Computação a ser adotado pelo projeto IViSSEM.
O alinhamento resultou dos vocabulários utilizados pela FCT (Portugal), com o
adotado pela União Europeia (EuroSciVoc) e outro do domínio da
Ciência&Tecnologia (UNESCO). Esse resultado é proveitoso para outras
universidades e projetos, bem como para a própria FCT
Recommended from our members
Matching disease and phenotype ontologies in the ontology alignment evaluation initiative
Background: The disease and phenotype track was designed to evaluate the relative performance of ontology matching systems that generate mappings between source ontologies. Disease and phenotype ontologies are important for applications such as data mining, data integration and knowledge management to support translational science in drug discovery and understanding the genetics of disease.
Results: Eleven systems (out of 21 OAEI participating systems) were able to cope with at least one of the tasks in the Disease and Phenotype track. AML, FCA-Map, LogMap(Bio) and PhenoMF systems produced the top results for ontology matching in comparison to consensus alignments. The results against manually curated mappings proved to be more difficult most likely because these mapping sets comprised mostly subsumption relationships rather than equivalence. Manual assessment of unique equivalence mappings showed that AML, LogMap(Bio) and PhenoMF systems have the highest precision results.
Conclusions: Four systems gave the highest performance for matching disease and phenotype ontologies. These systems coped well with the detection of equivalence matches, but struggled to detect semantic similarity. This deserves more attention in the future development of ontology matching systems. The findings of this evaluation show that such systems could help to automate equivalence matching in the workflow of curators, who maintain ontology mapping services in numerous domains such as disease and phenotype
Recommended from our members
Results of the ontology alignment evaluation initiative 2017
Ontology matching consists of finding correspondences between semantically related entities of different ontologies. The Ontology Alignment Evaluation Initiative (OAEI) aims at comparing ontology matching systems on precisely defined test cases. These test cases can be based on ontologies of different levels of complexity (from simple thesauri to expressive OWL ontologies) and use different evaluation modalities (e.g., blind evaluation, open evaluation, or consensus). The OAEI 2017 campaign offered 9 tracks with 23 test cases, and was attended by 21 participants. This paper is an overall presentation of that campaign
Investigating semantic similarity for biomedical ontology alignment
Tese de mestrado, Bioinformática e Biologia Computacional (Bioinformática) Universidade de Lisboa, Faculdade de Ciências, 2017A heterogeneidade dos dados biomédicos e o crescimento exponencial da informação dentro desse domínio tem levado à utilização de ontologias, que codificam o conhecimento de forma computacionalmente tratável. O desenvolvimento de uma ontologia decorre, em geral, com base nos requisitos da equipa que a desenvolve, podendo levar à criação de ontologias diferentes e potencialmente incompatíveis por várias equipas de investigação. Isto implica que as várias ontologias existentes para codificar conhecimento biomédico possam, entre elas, sofrer de heterogeneidade: mesmo quando o domínio por elas codificado é idêntico, os conceitos podem ser representados de formas diferentes, com diferente especificidade e/ou granularidade. Para minimizar estas diferenças e criar representações mais standard e aceites pela comunidade, foram desenvolvidos algoritmos (matchers) que encontrassem pontes de conhecimento (mappings) entre as ontologias de forma a alinharem-nas. O tipo de algoritmos mais utilizados no Alinhamento de Ontologias (AO) são os que utilizam a informação léxica (isto é, os nomes, sinónimos e descrições dos conceitos) para calcular as semelhanças entre os conceitos a serem mapeados. Uma abordagem complementar a esses algoritmos é a utilização de Background Knowledge (BK) como forma de aumentar o número de sinónimos usados e assim aumentar a cobertura do alinhamento produzido. Uma alternativa aos algoritmos léxicos são os algoritmos estruturais que partem do pressuposto que as ontologias foram desenvolvidas com pontos de vista semelhantes – realidade pouco comum. Surge então o tema desta dissertação onde toma-se partido da Semelhança Semântica (SS) para o desenvolvimento de novos algoritmos de AO. É de salientar que até ao momento a utilização de SS no Alinhamento de Ontologias é cingida à verificação de mappings e não à sua procura. Esta dissertação apresenta o desenvolvimento, implementação e avaliação de dois algoritmos que utilizam SS, ambos usados como forma de estender alinhamentos produzidos previamente, um para encontrar mappings de equivalências e o outro de subsunção (onde um conceito de uma ontologia é mapeado como sendo descendente do conceito proveniente de outra ontologia). Os algoritmos propostos foram implementados no AML que é um sistema topo de gama em Alinhamento de Ontologias. O algoritmo de equivalência demonstrou uma melhoria de até 0.2% em termos de F-measure em comparação com o alinhamento âncora utilizado; e um aumento de até 11.3% quando comparado a outro sistema topo de gama (LogMapLt) que não utiliza BK. É importante referir que, dentro do espaço de procura do algoritmo o Recall variou entre 66.7% e 100%. Já o algoritmo de subsunção apresentou precisão entre 75.9% e 95% (avaliado manualmente).The heterogeneity of biomedical data and the exponential growth of the information within this domain has led to the usage of ontologies, which encode knowledge in a computationally tractable way. Usually, the ontology’s development is based on the requirements of the research team, which means that ontologies of the same domain can be different and potentially incompatible among several research teams. This fact implies that the various existing ontologies encoding biomedical knowledge can, among them, suffer from heterogeneity: even when the encoded domain is identical, the concepts may be represented in different ways, with different specificity and/or granularity. To minimize these differences and to create representations that are more standard and accepted by the community, algorithms (known as matchers) were developed to search for bridges of knowledge (known as mappings) between the ontologies, in order to align them. The most commonly used type of matchers in Ontology Matching (OM) are the ones taking advantage of the lexical information (names, synonyms and textual description of the concepts) to calculate the similarities between the concepts to be mapped. A complementary approach to those algorithms is the usage of Background Knowledge (BK) as a way to increase the number of synonyms used, and further increase of the coverage of the produced alignment. An alternative to lexical algorithms are the structural ones which assume that the ontologies were developed with similar points of view - an unusual reality. The theme of this dissertation is to take advantage of Semantic Similarity (SS) for the development of new OM algorithms. It is important to emphasize that the use of SS in Ontology Alignment has, until now, been limited to the verification of mappings and not to its search. This dissertation presents the development, implementation, and evaluation of two algorithms that use SS. Both algorithms were used to extend previously produced alignments, one to search for equivalence and the other for subsumption mappings (where a concept of an ontology is mapped as descendant from a concept from another ontology). The proposed algorithms were implemented in AML, which is a top performing system in Ontology Matching. The equivalence algorithm showed an improvement in F-measure up to 0.2% when compared to the anchor alignment; and an increase of up to 11.3% when compared to another high-end system (LogMapLt) which lacks the usage of BK. It is important to note that, within the search space of the algorithm, the Recall ranged from 66.7% to 100%. On the other hand, the subsumption algorithm presented an accuracy between 75.9% and 95% (manually evaluated)
Contextualized Structural Self-supervised Learning for Ontology Matching
Ontology matching (OM) entails the identification of semantic relationships
between concepts within two or more knowledge graphs (KGs) and serves as a
critical step in integrating KGs from various sources. Recent advancements in
deep OM models have harnessed the power of transformer-based language models
and the advantages of knowledge graph embedding. Nevertheless, these OM models
still face persistent challenges, such as a lack of reference alignments,
runtime latency, and unexplored different graph structures within an end-to-end
framework. In this study, we introduce a novel self-supervised learning OM
framework with input ontologies, called LaKERMap. This framework capitalizes on
the contextual and structural information of concepts by integrating implicit
knowledge into transformers. Specifically, we aim to capture multiple
structural contexts, encompassing both local and global interactions, by
employing distinct training objectives. To assess our methods, we utilize the
Bio-ML datasets and tasks. The findings from our innovative approach reveal
that LaKERMap surpasses state-of-the-art systems in terms of alignment quality
and inference time. Our models and codes are available here:
https://github.com/ellenzhuwang/lakermap
Building an effective and efficient background knowledge resource to enhance ontology matching
International audienceOntology matching is critical for data integration and interoperability. Original ontology matching approaches relied solely on the content of the ontologies to align. However, these approaches are less effective when equivalent concepts have dissimilar labels and are structured with different modeling views. To overcome this semantic heterogeneity, the community has turned to the use of external background knowledge resources. Several methods have been proposed to select ontologies, other than the ones to align, as background knowledge to enhance a given ontology-matching task. However, these methods return a set of complete ontologies, while, in most cases, only fragments of the returned ontologies are effective for discovering new mappings. In this article, we propose an approach to select and build a background knowledge resource with just the right concepts chosen from a set of ontologies, which improves efficiency without loss of effectiveness. The use of background knowledge in ontology matching is a double-edged sword: while it may increase recall (i.e., retrieve more correct mappings), it may lower precision (i.e., produce more incorrect mappings). Therefore, we propose two methods to select the most relevant mappings from the candidate ones: (1)~a selection based on a set of rules and (2)~a selection based on supervised machine learning. Our experiments, conducted on two Ontology Alignment Evaluation Initiative (OAEI) datasets, confirm the effectiveness and efficiency of our approach. Moreover, the F-measure values obtained with our approach are very competitive to those of the state-of-the-art matchers exploiting background knowledge resources
- …