Industrial-strength schema matching

Author: Christoph Quix
Do H. H.
Madhavan J.
Melnik S.
Michalis Petropoulos
Miller R. J.
Mork P.
Papakonstantinou Y.
Philip A. Bernstein
Sergey Melnik
Publication venue: 'Association for Computing Machinery (ACM)'

Using Element Clustering to Increase the Efficiency of XML Schema Matching

Author: Jonker Willem
Keulen Maurice van
Smiljanic Marko
Publication date: 01/01/2006

Schema matching attempts to discover semantic mappings between elements of two schemas. Elements are cross-compared using various heuristics (e.g., name, data-type, and structure similarity). Seen from a broader perspective, schema matching is a combinatorial problem with exponential complexity, which makes naive matching algorithms prohibitively inefficient for large schemas. In this paper we propose a clustering-based technique for improving the efficiency of large-scale schema matching. The technique inserts clustering as an intermediate step into existing schema matching algorithms. Clustering partitions the schemas, reduces the overall matching load, and makes it possible to trade efficiency against effectiveness. The technique can be used in addition to other optimization techniques. In the paper we describe the technique, validate the performance of one implementation of it, and open directions for future research.
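
The idea of inserting clustering as an intermediate step can be illustrated with a toy sketch. This is not the authors' algorithm: the element names, the first-letter clustering rule (`cluster_key`), the string-similarity heuristic, and the threshold are all assumptions chosen only to show how partitioning cuts the number of pairwise comparisons.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; a stand-in for real matching heuristics."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def naive_match(source, target, threshold=0.7):
    """Compare every source element with every target element."""
    comparisons = 0
    matches = []
    for s in source:
        for t in target:
            comparisons += 1
            if name_similarity(s, t) >= threshold:
                matches.append((s, t))
    return matches, comparisons


def cluster_key(name: str) -> str:
    """Hypothetical clustering rule: bucket elements by lowercased first letter."""
    return name[0].lower()


def clustered_match(source, target, threshold=0.7):
    """Partition both schemas first, then compare only within each cluster."""
    buckets = defaultdict(lambda: ([], []))
    for s in source:
        buckets[cluster_key(s)][0].append(s)
    for t in target:
        buckets[cluster_key(t)][1].append(t)

    comparisons = 0
    matches = []
    for src_group, tgt_group in buckets.values():
        for s in src_group:
            for t in tgt_group:
                comparisons += 1
                if name_similarity(s, t) >= threshold:
                    matches.append((s, t))
    return matches, comparisons


# Made-up schema element names, for illustration only.
source = ["customerName", "customerAddress", "orderDate", "price"]
target = ["custName", "address", "order_date", "productPrice"]

naive_matches, naive_cost = naive_match(source, target)
clustered_matches, clustered_cost = clustered_match(source, target)
print(f"naive comparisons: {naive_cost}, clustered comparisons: {clustered_cost}")
```

In this sketch the clustered variant can only return pairs that the naive variant would also return, but it never compares elements that land in different clusters (for example `customerAddress` and `address`). That is the efficiency versus effectiveness trade-off the abstract refers to.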

Crossref

University of Twente Research Information

Nonparametric Bayesian Modeling for Automated Database Schema Matching

Author: Ferragut Erik M.
Laska Jason
Publication date: 06/07/2015

The problem of merging databases arises in many government and commercial applications. Schema matching, a common first step, identifies equivalent fields between databases. We introduce a schema matching framework that builds nonparametric Bayesian models for each field and compares them by computing the probability that a single model could have generated both fields. Our experiments show that our method is more accurate and faster than the existing instance-based matching algorithms, in part because of the use of nonparametric Bayesian models.
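
The comparison "could a single model have generated both fields?" can be sketched with a much simpler model than the one in the paper. The example below swaps in a finite Dirichlet-multinomial model (parametric, not nonparametric) and scores two fields by the log ratio between the marginal likelihood of one shared model and that of two separate models. The function names, the symmetric prior `alpha`, and the sample field values are assumptions for illustration.

```python
from collections import Counter
from math import lgamma


def log_marginal(counts, vocabulary, alpha=1.0):
    """Log marginal likelihood of a sequence of categorical values under a
    symmetric Dirichlet(alpha) prior over the given vocabulary."""
    k = len(vocabulary)
    n = sum(counts[v] for v in vocabulary)
    result = lgamma(k * alpha) - lgamma(k * alpha + n)
    for v in vocabulary:
        result += lgamma(alpha + counts[v]) - lgamma(alpha)
    return result


def same_model_score(field_a, field_b, alpha=1.0):
    """Log Bayes factor: one shared model versus two independent models.
    Positive values favor the hypothesis that the fields match."""
    counts_a, counts_b = Counter(field_a), Counter(field_b)
    vocabulary = sorted(set(counts_a) | set(counts_b))
    joint = log_marginal(counts_a + counts_b, vocabulary, alpha)
    separate = (log_marginal(counts_a, vocabulary, alpha)
                + log_marginal(counts_b, vocabulary, alpha))
    return joint - separate


# Made-up field samples: two state-code columns and one color column.
states_a = ["NY", "CA", "NY", "TX", "CA", "NY"]
states_b = ["CA", "NY", "NY", "TX", "NY", "CA"]
colors = ["red", "blue", "green", "red", "blue", "green"]

score_similar = same_model_score(states_a, states_b)
score_different = same_model_score(states_a, colors)
print(f"states vs states: {score_similar:.3f}")
print(f"states vs colors: {score_different:.3f}")
```

With these samples the two state-code columns receive a positive score and the state/color pair a negative one. A nonparametric model, as in the paper, would not require fixing the vocabulary in advance.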

arXiv.org e-Print Archive

Crossref

XML Matchers: approaches and challenges

Author: Agreste Santa
De Meo Pasquale
Ferrara Emilio
Ursino Domenico
Publication venue: 'Elsevier BV'
Publication date: 10/07/2014

Schema Matching, i.e., the process of discovering semantic correspondences between concepts adopted in different data source schemas, has been a key topic in Database and Artificial Intelligence research for many years. In the past, it was investigated mainly for classical database models (e.g., E/R schemas, relational databases, etc.). In recent years, however, the widespread adoption of XML in the most disparate application fields has pushed a growing number of researchers to design XML-specific Schema Matching approaches, called XML Matchers, which aim to find semantic matchings between concepts defined in DTDs and XSDs. XML Matchers do not just take well-known techniques originally designed for other data models and apply them to DTDs/XSDs; they exploit specific XML features (e.g., the hierarchical structure of a DTD/XSD) to improve the performance of the Schema Matching process. The design of XML Matchers is currently a well-established research area. The main goal of this paper is to provide a detailed description and classification of XML Matchers. We first describe to what extent the specificities of DTDs/XSDs affect the Schema Matching task. Then we introduce a template, called XML Matcher Template, that describes the main components of an XML Matcher, their role, and their behavior. We illustrate how each of these components has been implemented in some popular XML Matchers. We consider our XML Matcher Template a baseline for objectively comparing approaches that, at first glance, might appear unrelated. The introduction of this template can be useful in the design of future XML Matchers. Finally, we analyze commercial tools implementing XML Matchers and introduce two challenging issues strictly related to this topic, namely XML source clustering and uncertainty management in XML Matchers.

Comment: 34 pages, 8 tables, 7 figures

arXiv.org e-Print Archive

IRIS Università Politecnica delle Marche

A Large Scale Dataset for the Evaluation of Ontology Matching Systems

Author: Avesani Paolo
Giunchiglia Fausto
Shvaiko Pavel
Yatskevich Mikalai
Publication date: 01/01/2008

Recently, the number of ontology matching techniques and systems has increased significantly. This makes the issue of their evaluation and comparison more pressing. One of the challenges in ontology matching evaluation is building large-scale evaluation datasets. In fact, the number of possible correspondences between two ontologies grows quadratically with respect to the number of entities in these ontologies. This often makes the manual construction of evaluation datasets demanding to the point of being infeasible for large-scale matching tasks. In this paper we present an ontology matching evaluation dataset composed of thousands of matching tasks, called TaxME2. It was built semi-automatically out of the Google, Yahoo and Looksmart web directories. We evaluated TaxME2 by exploiting the results of almost two dozen state-of-the-art ontology matching systems. The experiments indicate that the dataset possesses the desired key properties, namely it is error-free, incremental, discriminative, monotonic, and hard for state-of-the-art ontology matching systems. The paper has been accepted for publication in "The Knowledge Engineering Review", Cambridge University Press (ISSN: 0269-8889, EISSN: 1469-8005).

CiteSeerX

Archivio della ricerca - Fondazione Bruno Kessler

Unitn-eprints Research

Advanced content-based semantic scene analysis and information retrieval: the SCHEMA project

Author: E. Izquierdo
I. Kompatsiaris
J. R. Casas
M. G. Strintzis
Noel E. O'
P. Migliorati
R. Leonardi
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/01/2003

The aim of the SCHEMA Network of Excellence is to bring together a critical mass of universities, research centers, industrial partners and end users, in order to design a reference system for content-based semantic scene analysis, interpretation and understanding. Relevant research areas include content-based multimedia analysis and automatic annotation of semantic multimedia content, combined textual and multimedia information retrieval, the semantic web, the MPEG-7 and MPEG-21 standards, user interfaces, and human factors. In this paper, recent advances in content-based analysis, indexing and retrieval of digital media within the SCHEMA Network are presented. These advances will be integrated in the SCHEMA module-based, expandable reference system.

CiteSeerX

Archivio istituzionale della ricerca - Università di Brescia

DCU Online Research Access Service

Category Theory and Model-Driven Engineering: From Formal Semantics to Design Patterns and Beyond

Author: Antkiewicz
Batory
Bernstein
Bohannon
Boronat
Czarnecki
Diskin
Ehrig
Fiadeiro
Foster
Goguen
Hermann
Hofmann
Johnson
José Fiadeiro
Jurack
Liang
Makkai
Matsuda
Pottinger
Rossini
Rutle
Selic
Shaw
Spaccapietra
Stevens
Thomas Soboll
Tom Maibaum
Ulrike Golas
Xiong
Zinovy Diskin
Publication venue: 'Open Publishing Association'
Publication date: 01/08/2012

There is a hidden intrigue in the title. CT is one of the most abstract mathematical disciplines, sometimes nicknamed "abstract nonsense". MDE is a recent trend in software development, industrially supported by standards, tools, and the status of a new "silver bullet". Surprisingly, categorical patterns turn out to be directly applicable to the mathematical modeling of structures appearing in everyday MDE practice. Model merging, transformation, synchronization, and other important model management scenarios can be seen as executions of categorical specifications. Moreover, the paper aims to elucidate the claim that relationships between CT and MDE are more complex and richer than is normally assumed for "applied mathematics". CT provides a toolbox of design patterns and structural principles of real practical value for MDE. We will present examples of how an elementary categorical arrangement of a model management scenario reveals deficiencies in the architecture of modern tools automating the scenario.

Comment: In Proceedings ACCAT 2012, arXiv:1208.430

arXiv.org e-Print Archive

CiteSeerX

Crossref

Directory of Open Access Journals