4,953 research outputs found

XML Matchers: approaches and challenges

Author: Agreste Santa
De Meo Pasquale
Ferrara Emilio
Ursino Domenico
Publication venue: 'Elsevier BV'
Publication date: 10/07/2014

Schema Matching, i.e., the process of discovering semantic correspondences between concepts adopted in different data source schemas, has been a key topic in Database and Artificial Intelligence research for many years. In the past, it was investigated mainly for classical database models (e.g., E/R schemas and relational databases). In recent years, however, the widespread adoption of XML in the most disparate application fields has pushed a growing number of researchers to design XML-specific Schema Matching approaches, called XML Matchers, which aim to find semantic matchings between concepts defined in DTDs and XSDs. XML Matchers do not simply take well-known techniques originally designed for other data models and apply them to DTDs/XSDs; they exploit specific XML features (e.g., the hierarchical structure of a DTD/XSD) to improve the performance of the Schema Matching process. The design of XML Matchers is currently a well-established research area. The main goal of this paper is to provide a detailed description and classification of XML Matchers. We first describe to what extent the specificities of DTDs/XSDs affect the Schema Matching task. Then we introduce a template, called XML Matcher Template, that describes the main components of an XML Matcher, their role and behavior. We illustrate how each of these components has been implemented in some popular XML Matchers. We consider our XML Matcher Template as the baseline for objectively comparing approaches that, at first glance, might appear unrelated. The introduction of this template can be useful in the design of future XML Matchers. Finally, we analyze commercial tools implementing XML Matchers and introduce two challenging issues strictly related to this topic, namely XML source clustering and uncertainty management in XML Matchers.

Comment: 34 pages, 8 tables, 7 figures

arXiv.org e-Print Archive

IRIS Università Politecnica delle Marche

Information Integration - the process of integration, evolution and versioning

Author: Keijzer Ander de
Keulen Maurice van
Publication venue: University of Twente, Centre for Telematics and Information Technology (CTIT)
Publication date: 01/01/2005

At present, many information sources are available wherever you are. Most of the time, the information needed is spread across several of those sources. Gathering this information is a tedious and time-consuming job, and automating the process would assist users in their task. Integrating the information sources provides a single global source in which all the needed information is present. All of these information sources also change over time. With each change to an information source, the schema of that source may change as well. The data contained in the source, however, cannot be changed every time, because of the huge amount of data that would have to be converted to conform to the most recent schema. In this report we describe the current methods for information integration, evolution and versioning. We distinguish between integration of schemas and integration of the actual data. We also show some key issues that arise when integrating XML data sources.

University of Twente Research Information

Automated schema matching techniques: an exploratory study

Author: Rose Ellen
Sun Xiao Long
Publication venue: 'Massey University'
Publication date: 01/01/2003

Manual schema matching is a problem for many database applications that use multiple data sources, including data warehousing and e-commerce applications. Current research attempts to address this problem by developing algorithms to automate aspects of the schema-matching task. In this paper, an approach using an external dictionary facilitates automated discovery of the semantic meaning of database schema terms. An experimental study was conducted to evaluate the performance and accuracy of five schema-matching techniques with the proposed approach, called SemMA. The proposed approach and results are compared with two existing semi-automated schema-matching approaches, and suggestions for future research are made.

Massey Research Online

Ontology-based data access mapping generation using data, schema, query, and mapping knowledge

Author: Dimou Anastasia
Heyvaert Pieter
Mannens Erik
Verborgh Ruben
Publication venue
Publication date: 01/01/2017

Ghent University Academic Bibliography

Data integration through service-based mediation for web-enabled information systems

Author: Pahl Claus
Zhu Yaoling
Publication venue: 'IGI Global'
Publication date: 01/01/2008

The Web and its underlying platform technologies have often been used to integrate existing software and information systems. Traditional techniques for data representation and transformations between documents are not sufficient to support a flexible and maintainable data integration solution that meets the requirements of modern complex Web-enabled software and information systems. The difficulty arises from the high degree of complexity of data structures, for example in business and technology applications, and from the constant change of data and its representation. In the Web context, where the Web platform is used to integrate different organisations or software systems, the problem of heterogeneity arises as well. We introduce a specific data integration solution for Web applications such as Web-enabled information systems. Our contribution is an integration technology framework for Web-enabled information systems comprising, firstly, a data integration technique based on the declarative specification of transformation rules and the construction of connectors that handle the integration and, secondly, a mediator architecture based on information services and the constructed connectors to handle the integration process.

Irish Universities

DCU Online Research Access Service

XML Schema Clustering with Semantic and Hierarchical Similarity Measures

Author: Iryadi Wina
Nayak Richi
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007

With the growing popularity of XML as a data representation language, collections of XML data have exploded in number. Methods are required to manage these collections and to discover useful information in them for improved document handling. We present a schema clustering process that organises heterogeneous XML schemas into groups. The methodology considers not only the linguistic and contextual similarity of the elements but also their hierarchical structural similarity. We support our findings with experiments and analysis.

Crossref

Queensland University of Technology ePrints Archive

Putting Context into Schema Matching

Author: Bohannon Philip
Elnahrawy Eiman
Fan Wenfei
Flaster Michael
Publication venue
Publication date: 01/01/2006

Edinburgh Research Explorer

Design of the shared Environmental Information System (SEIS) and development of a web-based GIS interface

Author: Bonazountas Marc
Camilleri Tim
Martirano Giacomo
Trypitsidis Anestis
Publication venue: University of Malta. Faculty of Social Wellbeing
Publication date: 01/01/2014

Chapter 5. The Shared Environmental Information System (SEIS) is a collaborative initiative of the European Commission (EC) and the European Environment Agency (EEA) aimed at establishing, together with the Member States, an integrated and shared EU-wide environmental information system. SEIS presents the European vision on environmental information interoperability. It is a set of high-level principles and workflow processes that organise the collection, exchange, and use of environmental data and information, aimed to:
• Modernise the way in which information required by environmental legislation is made available to member states or EC instruments;
• Streamline reporting processes and repeal overlaps or obsolete reporting obligations;
• Stimulate similar developments at international conventions;
• Standardise according to INSPIRE when possible; and
• Introduce the SDI (spatial database infrastructure) principle EU-wide.
SEIS is a system and workflow of operations that offers technical capabilities geared to meet concept expectations. In that respect, SEIS shows the way and sets up the workflow effectively in a standardised way (e.g., INSPIRE) to:
• Collect data from spatial databases, in situ sensors, statistical databases, earth observation readings (e.g., EOS, GMES), and marine observation, using standard data transfer protocols (ODBC, SOS, ftp, etc.).
• Harmonise collected data (including data check/data integrity) according to best practices proven to perform well, according to the INSPIRE Directive 2007/2/EC Annexes I, II and III, plus INSPIRE Implementation Rules for data not specified in the above-mentioned Annexes.
• Harmonise collected data according to WISE (Water Information System for Europe) or Ozone-web.
• Process, aggregate, and harmonise data so as to extract information in a format understandable by wider audiences (e.g., Eurostat, enviro-indicators).
• Document information to fulfil national reporting obligations towards EU bodies (e.g., the JRC, EEA, DGENV, Eurostat).
• Store and publish information for authorised end-users (e.g., citizens, institutions).
This paper presents the development and integration of the SEIS-Malta Geoportal. The first section outlines EU Regulations on INSPIRE and Aarhus Directives. The second covers the architecture and the implementation of the SEIS-Malta Geoportal. The third discusses the results and successful implementation of the Geoportal.

peer-reviewed

OAR@UM