
113 research outputs found

Evaluating ontology matchers on real-world financial services data models

Author: Hladik Michael
Paulheim Heiko
Portisch Jan
Publication venue: RWTH
Publication date: 01/01/2019

Financial data in enterprises is often stored using different data models, yet it needs to be integrated in order to foster comprehensive evaluations. Conceptually, each of those data models can be understood as an ontology, and automated ontology matching can be applied as a first step towards data integration. In this paper, we analyze the performance of existing ontology matching tools for matching financial data models. The data has been provided by SAP SE and consists of real data schemas that are used in the financial services area and mappings between them. We have created five data sets by translating enterprise data schemas to ontologies and expert mappings to ontology alignment gold standards. We evaluate state-of-the-art ontology matchers on our newly created data sets. Our experiments show that current matching systems struggle to handle enterprise data sets and achieve significantly lower scores compared to data sets of other evaluation initiatives.

MAnnheim DOCument Server

Automatic schema matching utilizing hypernymy relations extracted from the web

Author: Portisch Jan
Publication date: 01/01/2018

This thesis explores how a large corpus of Is-a statements can be exploited for the task of schema matching.

MAnnheim DOCument Server

Structural Graph-based Metamodel Matching

Author: Voigt Konrad
Publication date: 02/11/2011

Data integration has been, and still is, a challenge for applications processing multiple heterogeneous data sources. Across the domains of schemas, ontologies, and metamodels, this imposes the need for mapping specifications, i.e. the task of discovering semantic correspondences between elements. Support for the development of such mappings has been researched, producing matching systems that automatically propose mapping suggestions. However, especially in the context of metamodel matching, the result quality of state-of-the-art matching techniques leaves room for improvement. Although the traditional approach of pair-wise element comparison works on smaller data sets, its quadratic complexity leads to poor runtime and memory performance and eventually to the inability to match when applied to real-world data. The work presented in this thesis seeks to address these shortcomings by taking advantage of the graph structure of metamodels. We derive a planar graph edit distance as a metamodel similarity metric and a mining-based matching that makes use of redundant information. We also propose a planar graph-based partitioning to cope with large-scale matching. These techniques are then evaluated using real-world mappings from SAP business integration scenarios and the MDA community. The results demonstrate improved quality as well as manageable runtime and memory consumption for large-scale metamodel matching.

Technische Universität Dresden: Qucosa

Uncertainty in Automated Ontology Matching: Lessons Learned from an Empirical Experimentation

Author: Osman Inès
Pileggi Salvatore F.
Yahia Sadok Ben
Publication date: 18/10/2023

Data integration is considered a classic research field and a pressing need within the information science community. Ontologies play a critical role in such a process by providing well-consolidated support to link and semantically integrate datasets via interoperability. This paper approaches data integration from an application perspective, looking at techniques based on ontology matching. An ontology-based process may only be considered adequate by assuming manual matching of different sources of information. However, since the approach becomes unrealistic once the system scales up, automation of the matching process becomes a compelling need. Therefore, we have conducted experiments on actual data with the support of existing tools for automatic ontology matching from the scientific community. Even considering a relatively simple case study (i.e., the spatio-temporal alignment of global indicators), outcomes clearly show significant uncertainty resulting from errors and inaccuracies along the automated matching process. More concretely, this paper aims to test on real-world data a bottom-up knowledge-building approach, discuss the lessons learned from the experimental results of the case study, and draw conclusions about uncertainty and uncertainty management in an automated ontology matching process. While the most common evaluation metrics clearly demonstrate the unreliability of fully automated matching solutions, properly designed semi-supervised approaches seem to be mature for a more generalized application.

arXiv.org e-Print Archive

Predicting the content of peer-to-peer interactions

Author: Besana Paolo
Publication venue: The University of Edinburgh
Publication date: 01/01/2009

Software agents interact to solve tasks, the details of which need to be described in a language understandable by all the actors involved. Ontologies provide a formalism for defining both the domain of the task and the terminology used to describe it. However, finding a shared ontology has proved difficult: different institutions and developers have different needs and formalise them in different ontologies. In a closed environment it is possible to force all the participants to share the same ontology, while in open and distributed environments ontology mapping can provide interoperability between heterogeneous interacting actors. However, conventional mapping systems focus on acquiring static information, and on mapping whole ontologies, which is infeasible in open systems. This thesis shows a different approach to the problem of heterogeneity. It starts from the intuitive idea that when similar situations arise, similar interactions are performed. If the interactions between actors are specified in formal scripts, shared by all the participants, then when the same situation arises, the same script is used. The main hypothesis that this thesis aims to demonstrate is that by analysing different runs of these scripts it is possible to create a statistical model of the interactions that reflects the frequency of terms in messages and of ontological relations between terms in different messages. The model is then used during a run of a known interaction to compute the probability distribution for terms in received messages. The probability distribution provides additional information, contextual to the interaction, that a traditional ontology matcher can use to improve efficiency, by reducing the comparisons to the most likely ones given the context, and possibly to improve both recall and precision, in particular by helping disambiguation.
The ability to create a model that reflects real phenomena in this sort of environment is evaluated by analysing the quality of the predictions, in particular verifying how various features of the interactions, such as their non-stationarity, affect the predictions. The actual improvements to a matcher we developed are also evaluated. The overall results are very promising, as using the predictor can reduce the overall computation time for matching by a factor of ten, while maintaining or in some cases improving recall and precision.

Edinburgh Research Archive

Proceedings of the 15th ISWC workshop on Ontology Matching (OM 2020)

Author: Euzenat Jérôme
Hassanzadeh Oktie
Jiménez-Ruiz Ernesto
Shvaiko Pavel
Trojahn dos Santos Cassia
Publication venue: CEUR.org
Publication date: 01/01/2020

15th International Workshop on Ontology Matching, co-located with the 19th International Semantic Web Conference (ISWC 2020).

INRIA a CCSD electronic archive server

Exploiting general-purpose background knowledge for automated schema matching

Author: Portisch Jan
Publication date: 01/01/2022

The schema matching task is an integral part of the data integration process. It is usually the first step in integrating data. Schema matching is typically very complex and time-consuming. It is, therefore, for the most part carried out by humans. One reason for the low degree of automation is the fact that schemas are often defined with deep background knowledge that is not itself present within the schemas. Overcoming the problem of missing background knowledge is a core challenge in automating the data integration process. In this dissertation, the task of matching semantic models, so-called ontologies, with the help of external background knowledge is investigated in depth in Part I. Throughout this thesis, the focus lies on large, general-purpose resources, since domain-specific resources are rarely available for most domains. Besides new knowledge resources, this thesis also explores new strategies to exploit such resources. A technical base for the development and comparison of matching systems is presented in Part II. The framework introduced here allows for simple and modularized matcher development (with background knowledge sources) and for extensive evaluations of matching systems. Knowledge graphs, which have grown significantly in size in recent years, are among the largest structured sources of general-purpose background knowledge. However, exploiting such graphs is not trivial. In Part III, knowledge graph embeddings are explored, analyzed, and compared. Multiple improvements to existing approaches are presented. In Part IV, numerous concrete matching systems which exploit general-purpose background knowledge are presented. Furthermore, exploitation strategies and resources are analyzed and compared. This dissertation closes with a perspective on real-world applications.

MAnnheim DOCument Server

Semantic Systems. The Power of AI and Knowledge Graphs

Publication venue: Springer Science and Business Media LLC

This open access book constitutes the refereed proceedings of the 15th International Conference on Semantic Systems, SEMANTiCS 2019, held in Karlsruhe, Germany, in September 2019. The 20 full papers and 8 short papers presented in this volume were carefully reviewed and selected from 88 submissions. They cover topics such as: web semantics and linked (open) data; machine learning and deep learning techniques; semantic information management and knowledge integration; terminology, thesaurus and ontology management; data mining and knowledge discovery; semantics in blockchain and distributed ledger technologies.

OAPEN Library

WEB recommendations for E-commerce websites

Author: Golovin Mykola
Publication date: 02/03/2010

In this part of the thesis we have investigated how navigation utilizing web recommendations can be implemented on e-commerce websites based on integrated data sources. Integrated e-commerce websites are an interesting use case for web recommendations. One reason for this interest is that many modern, large and economically successful e-commerce websites follow the integrated approach. Another reason is that, especially in the integrated environment, the lack of pre-defined semantic connections between the data makes web recommendations an important means of enabling user navigation. In this chapter we have presented an architecture, named EC-Fuice, for websites based on integrated data sources. We have also presented a prototypical implementation of our architecture, which serves as a proof of concept, and investigated the challenges of creating navigation on an integrated website. The following issues were addressed in this part of the thesis: Combination of several state-of-the-art tools and techniques in the fields of databases, data integration, ontology matching and web engineering into one generic architecture for creating integrated websites. Comparative experiments with several techniques for instance matching (also known as record linkage or duplicate detection). Investigation of using ontology matching to facilitate instance matching. Comparative experiments with several techniques for ontology matching. Investigation of instance-based ontology matching and the possibilities for combining it with other techniques for ontology matching. Investigation of the possibilities to improve user navigation in the integrated data environment with different types of web recommendations. Review of the related work in the fields of data integration and ontology matching and discussion of the contact points between the research described here and other related projects.
The main contributions of the research described in this part of the thesis are the EC-Fuice architecture, the novel method for matching e-commerce ontologies based on a combination of instance information and metadata information, the experimental results of ontology and instance matching performed by different matching algorithms, and the classification of the types of recommendations which can be used on an integrated e-commerce website.

Qucosa - Publikationsserver der Universität Leipzig