
12 research outputs found

Comparison of ontology alignment systems across single matching task via the McNemar's test

Author: Atashin Amir Ahooye
Hofman Wout
Mohammadi Majid
Tan Yao-Hua
Publication date: 01/01/2018

Ontology alignment is widely used to find correspondences between different ontologies in diverse fields. After the alignments are discovered, several performance scores are available to evaluate them. The scores typically require the identified alignment and a reference containing the actual underlying correspondences of the given ontologies. The current trend in alignment evaluation is to put forward a new score (e.g., precision, weighted precision) and to compare alignments by juxtaposing the obtained scores. However, selecting one measure among others for comparison is contentious. Moreover, a claim that one system performs better than another cannot be substantiated solely by comparing two scalars. In this paper, we propose statistical procedures that allow us to favor one system over another on theoretical grounds. McNemar's test is the statistical means by which two ontology alignment systems are compared over one matching task. The test applies to a 2x2 contingency table, which can be constructed from the alignments in two different ways, each with its own merits and pitfalls. The ways of constructing the contingency table and the relevant statistics of McNemar's test are elaborated in detail. When more than two alignment systems are compared, the family-wise error rate becomes a concern; ways of controlling it are therefore also discussed. A directed graph visualizes the outcome of McNemar's test in the presence of multiple alignment systems. From this graph, it is readily understood whether one system is better than another or whether their differences are imperceptible. The proposed statistical methodologies are applied to the systems that participated in the OAEI 2016 anatomy track, and are also used to compare several well-known similarity metrics on the same matching problem.
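To make the idea in this abstract concrete, here is a minimal Python sketch of McNemar's test applied to two alignments over one matching task. It is not the paper's procedure: the abstract mentions two ways of building the 2x2 table without describing them, so the construction below (judging each system as right or wrong on every correspondence in the union of both alignments and the reference) is an assumption, as are the function names and the toy data. Bonferroni is used only as one common way to control the family-wise error rate; the abstract does not say which corrections the authors discuss.

```python
from math import comb, erfc, sqrt


def discordant_counts(align_a, align_b, reference):
    """Count correspondences on which exactly one system agrees with the reference.

    Assumed construction: a system is 'right' about a correspondence when it
    includes it iff the reference includes it.
    """
    only_a_right = only_b_right = 0
    for corr in align_a | align_b | reference:
        a_ok = (corr in align_a) == (corr in reference)
        b_ok = (corr in align_b) == (corr in reference)
        if a_ok and not b_ok:
            only_a_right += 1
        elif b_ok and not a_ok:
            only_b_right += 1
    return only_a_right, only_b_right


def mcnemar_exact(b, c):
    """Two-sided exact (binomial) McNemar p-value from the discordant counts."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def mcnemar_chi2(b, c):
    """Continuity-corrected chi-square statistic and p-value (1 degree of freedom)."""
    n = b + c
    if n == 0:
        return 0.0, 1.0
    stat = (abs(b - c) - 1) ** 2 / n
    # Survival function of chi-square with 1 d.o.f. is erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2))


def bonferroni(p_values, alpha=0.05):
    """Flag which pairwise tests stay significant after Bonferroni correction."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


if __name__ == "__main__":
    # Hypothetical correspondence IDs, purely for illustration.
    reference = {1, 2, 3, 4, 5, 6}
    system_a = {1, 2, 3, 4, 5, 7}
    system_b = {1, 2, 8, 9}
    b, c = discordant_counts(system_a, system_b, reference)
    print("discordant counts:", b, c)
    print("exact p-value:", mcnemar_exact(b, c))
    print("chi-square (stat, p):", mcnemar_chi2(b, c))
```

Only the discordant pairs enter the test, which is why two systems with similar precision or recall can still turn out to be statistically indistinguishable.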

arXiv.org e-Print Archive

TU Delft Repository


Matching disease and phenotype ontologies in the ontology alignment evaluation initiative

Author: Alam-Faruque Y.
Harrow I.
Jimenez-Ruiz E.
Koch M.
Malone J.
Markel S.
Romacker M.
Splendiani A.
Waaler A.
Woollard P.
Publication venue: BMC
Publication date: 01/01/2017

Background: The disease and phenotype track was designed to evaluate the relative performance of ontology matching systems that generate mappings between source ontologies. Disease and phenotype ontologies are important for applications such as data mining, data integration and knowledge management to support translational science in drug discovery and understanding the genetics of disease. Results: Eleven systems (out of 21 OAEI participating systems) were able to cope with at least one of the tasks in the Disease and Phenotype track. The AML, FCA-Map, LogMap(Bio) and PhenoMF systems produced the top results for ontology matching in comparison to consensus alignments. Matching against manually curated mappings proved more difficult, most likely because these mapping sets comprised mostly subsumption relationships rather than equivalence. Manual assessment of unique equivalence mappings showed that the AML, LogMap(Bio) and PhenoMF systems have the highest precision. Conclusions: Four systems gave the highest performance for matching disease and phenotype ontologies. These systems coped well with the detection of equivalence matches, but struggled to detect semantic similarity. This deserves more attention in the future development of ontology matching systems. The findings of this evaluation show that such systems could help to automate equivalence matching in the workflow of curators, who maintain ontology mapping services in numerous domains such as disease and phenotype.

City Research Online

ZENODO

Directory of Open Access Journals

The Novartis Repository

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

NORA - Norwegian Open Research Archives

Results of the Ontology Alignment Evaluation Initiative 2015

Author: Cheatham Michelle
Dragisic Zlatan
Euzenat Jérôme
Faria Daniel
Ferrara Alfio
Flouris Giorgos
Fundulaki Irini
Granada Roger
Ivanova Valentina
Jiménez-Ruiz Ernesto
Lambrix Patrick
Montanelli Stefano
Pesquita Catia
Saveta Tzanina
Shvaiko Pavel
Solimando Alessandro
Trojahn dos Santos Cassia
Zamazal Ondrej
Publication venue: No commercial editor.
Publication date: 01/01/2015

Ontology matching consists of finding correspondences between semantically related entities of two ontologies. OAEI campaigns aim at comparing ontology matching systems on precisely defined test cases. These test cases can use ontologies of different natures (from simple thesauri to expressive OWL ontologies) and use different modalities, e.g., blind evaluation, open evaluation and consensus. OAEI 2015 offered 8 tracks with 15 test cases followed by 22 participants. Since 2011, the campaign has been using a new evaluation modality which provides more automation to the evaluation. This paper is an overall presentation of the OAEI 2015 campaign.

HAL-CentraleSupelec

Scientific Publications of the University of Toulouse II Le Mirail

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Open Archive Toulouse Archive Ouverte

Hal-Diderot

HAL-Rennes 1

Integrating phenotype ontologies with PhenomeNET

Author: Gkoutos Georgios
Hoehndorf Robert
Rodríguez-García Miguel Ángel
Schofield Paul N.
Publication date: 22/12/2016

Background: Integration and analysis of phenotype data from humans and model organisms is a key challenge in building our understanding of normal biology and pathophysiology. However, the range of phenotypes and anatomical details being captured in clinical and model organism databases presents complex problems when attempting to match classes across species and across phenotypes as diverse as behaviour and neoplasia. We have previously developed PhenomeNET, a system for disease gene prioritization that includes as one of its components an ontology designed to integrate phenotype ontologies. While not applicable to matching arbitrary ontologies, PhenomeNET can be used to identify related phenotypes in different species, including human, mouse, zebrafish, nematode worm, fruit fly, and yeast. Results: Here, we apply PhenomeNET to identify related classes from two phenotype and two disease ontologies using automated reasoning. We demonstrate that we can identify a large number of mappings, some of which require automated reasoning and cannot easily be identified through lexical approaches alone. Combining automated reasoning with lexical matching further improves results in aligning ontologies. Conclusions: PhenomeNET can be used to align and integrate phenotype ontologies. The results can be utilized for biomedical analyses in which phenomena observed in model organisms are used to identify causative genes and mutations underlying human disease.

Aberystwyth Research Portal

University of Birmingham Research Portal

Directory of Open Access Journals

Apollo (Cambridge)

Proceedings of The Tenth International Workshop on Ontology Matching (OM-2015)

Author: Cheatham Michelle
Euzenat Jérôme
Hassanzadeh Oktie
Ichise Ryutaro
Jiménez-Ruiz Ernesto
Shvaiko Pavel
Publication venue: No commercial editor.
Publication date: 01/01/2016

No abstract available.

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Integrating phenotype ontologies with PhenomeNET

Author: A Goldfain
AN Tigrine
C Mungall
C Pesquita
CJ Mungall
CL Smith
D Faria
D Sardana
E Jiménez-Ruiz
E Santos
GV Gkoutos
I Boudellioua
J Amberger
JP Balhoff
KL McGary
LM Schriml
M Ashburner
M Zhao
NF Noy
O Bodenreider
OC Lorena
P Resnik
PN Robinson
PN Schofield
R Hoehndorf
S Harispe
S Köhler
S Sarntivijai
SM Bello
T Fawcett
WA Kibbe
WE Djeddi
WM Dahdul
Y Kazakov
Publication venue: Springer Science and Business Media LLC

Crossref

Investigating semantic similarity for biomedical ontology alignment

Author: Mott Isabela
Publication date: 01/01/2017

Master's thesis, Bioinformatics and Computational Biology (Bioinformatics), Universidade de Lisboa, Faculdade de Ciências, 2017.

The heterogeneity of biomedical data and the exponential growth of information within this domain have led to the use of ontologies, which encode knowledge in a computationally tractable way. An ontology is usually developed according to the requirements of the team that builds it, which means that different research teams may create different and potentially incompatible ontologies for the same domain. As a result, the various existing ontologies encoding biomedical knowledge can suffer from heterogeneity: even when the encoded domain is identical, the concepts may be represented in different ways, with different specificity and/or granularity. To minimize these differences and to create representations that are more standard and accepted by the community, algorithms (known as matchers) were developed to search for bridges of knowledge (known as mappings) between the ontologies, in order to align them.

The most commonly used matchers in Ontology Matching (OM) are those that use lexical information (names, synonyms and textual descriptions of the concepts) to calculate the similarities between the concepts to be mapped. A complementary approach is the use of Background Knowledge (BK) to increase the number of synonyms used and thereby the coverage of the produced alignment. An alternative to lexical algorithms is structural algorithms, which assume that the ontologies were developed with similar points of view, which is rarely the case. This dissertation takes advantage of Semantic Similarity (SS) to develop new OM algorithms. Until now, the use of SS in Ontology Alignment has been limited to the verification of mappings rather than their discovery. This dissertation presents the development, implementation, and evaluation of two algorithms that use SS. Both algorithms were used to extend previously produced alignments, one to search for equivalence mappings and the other for subsumption mappings (where a concept of one ontology is mapped as a descendant of a concept from another ontology). The proposed algorithms were implemented in AML, a top-performing system in Ontology Matching. The equivalence algorithm showed an improvement in F-measure of up to 0.2% when compared to the anchor alignment, and an increase of up to 11.3% when compared to another top-performing system (LogMapLt) that does not use BK. Within the search space of the algorithm, the recall ranged from 66.7% to 100%. The subsumption algorithm achieved a precision between 75.9% and 95% (manually evaluated).
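The abstract above reports precision, recall and F-measure without defining them. As a reminder of how these scores are conventionally computed for an alignment against a reference, here is a minimal Python sketch. The function name and the toy mappings are hypothetical and are not taken from the thesis or from AML; the F-measure shown is the balanced (F1) variant, which is an assumption about the one reported.

```python
def evaluate_alignment(alignment, reference):
    """Return (precision, recall, f_measure) of an alignment against a reference.

    Both arguments are sets of mappings; a mapping counts as correct only
    when it also appears in the reference.
    """
    true_positives = len(alignment & reference)
    precision = true_positives / len(alignment) if alignment else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical mappings between concepts of two ontologies.
    reference = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4"), ("a5", "b5")}
    alignment = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a9", "b9")}
    p, r, f = evaluate_alignment(alignment, reference)
    print(f"precision={p:.3f} recall={r:.3f} f-measure={f:.3f}")
```

Extending an anchor alignment with new mappings, as the thesis describes, raises recall when the additions are correct and lowers precision when they are not, which is why the combined F-measure is the figure usually compared.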

Universidade de Lisboa: Repositório.UL

Exploiting general-purpose background knowledge for automated schema matching

Author: Portisch Jan
Publication date: 01/01/2022

The schema matching task is an integral part of the data integration process. It is usually the first step in integrating data. Schema matching is typically very complex and time-consuming. It is, therefore, largely carried out by humans. One reason for the low degree of automation is that schemas are often defined with deep background knowledge that is not itself present within the schemas. Overcoming the problem of missing background knowledge is a core challenge in automating the data integration process. In this dissertation, the task of matching semantic models, so-called ontologies, with the help of external background knowledge is investigated in depth in Part I. Throughout this thesis, the focus lies on large, general-purpose resources, since domain-specific resources are rarely available for most domains. Besides new knowledge resources, this thesis also explores new strategies to exploit such resources. A technical base for the development and comparison of matching systems is presented in Part II. The framework introduced here allows for simple and modularized matcher development (with background knowledge sources) and for extensive evaluations of matching systems. Among the largest structured sources of general-purpose background knowledge are knowledge graphs, which have grown significantly in size in recent years. However, exploiting such graphs is not trivial. In Part III, knowledge graph embeddings are explored, analyzed, and compared. Multiple improvements to existing approaches are presented. In Part IV, numerous concrete matching systems which exploit general-purpose background knowledge are presented. Furthermore, exploitation strategies and resources are analyzed and compared. This dissertation closes with a perspective on real-world applications.

MAnnheim DOCument Server

Ontology Matching: OM-2018: Proceedings of the ISWC Workshop

Author: Cheatham Michelle
Euzenat Jérôme
Hassanzadeh Oktie
Jiménez-Ruiz Ernesto
Shvaiko Pavel
Publication venue: No commercial editor.
Publication date: 01/01/2018

No abstract available.

INRIA a CCSD electronic archive server