Validation of mappings between schemas
Mappings between schemas are key elements in several contexts, such as data exchange, data integration, and peer data management systems. In all these contexts, the process of designing a mapping requires the participation of a mapping designer, who needs a way to validate the mapping being defined, i.e., to check whether the mapping is in fact what the designer intended. However, to date, very little work has directly focused on the effective validation of schema mappings. In this paper, we present a new approach for validating schema mappings that allows the mapping designer to ask whether certain desirable properties of these mappings hold. We consider four properties of mappings: mapping satisfiability, mapping inference, query answerability and mapping losslessness. We reformulate these properties in terms of the problem of checking the liveliness of a derived predicate. We emphasize that this approach is independent of any particular method for liveliness checking and, to show the feasibility of our approach, we use an implementation of the CQC Method and provide some experimental results.
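The satisfiability property mentioned above can be illustrated with a small brute-force sketch. This is not the CQC Method, and all relation, attribute and value names here are invented for illustration: a mapping is satisfiable if some non-empty source instance, chased through the mapping, yields a target instance that respects the target constraints.

```python
from itertools import product

# Toy setting: source relation Emp(name, dept), target relation Worker(name).
# Mapping assertion: Emp(n, d) -> Worker(n).
NAMES, DEPTS = ["ann", "bob"], ["sales", "hr"]

def satisfiable(target_allowed_names):
    """Brute-force 'liveliness' test over a tiny finite domain: does some
    non-empty Emp instance exist whose chased target instance satisfies the
    target constraint that every Worker name is in target_allowed_names?"""
    tuples = list(product(NAMES, DEPTS))
    # Enumerate every non-empty subset of possible Emp tuples by bitmask.
    for mask in range(1, 2 ** len(tuples)):
        emp = [t for i, t in enumerate(tuples) if mask >> i & 1]
        worker = {n for n, _ in emp}              # tuples forced by the mapping
        if worker <= set(target_allowed_names):   # target constraint holds
            return True
    return False

print(satisfiable(["ann", "bob"]))  # True: the mapping is satisfiable
print(satisfiable(["carl"]))        # False: the constraint blocks every instance
```

A real validator reasons symbolically over the schemas rather than enumerating instances; the sketch only shows what the property asserts.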
Validation of schema mappings with nested queries
With the emergence of the Web and the wide use of XML for representing data, the ability to map not only flat relational but also nested data has become crucial. The design of schema mappings is a semi-automatic process: a human designer is needed to guide the process, choose among mapping candidates, and successively refine the mapping. The designer needs a way to figure out whether the mapping is what was intended. Our approach to mapping validation allows the designer to check whether the mapping satisfies certain desirable properties. In this paper, we focus on the validation of mappings between nested relational schemas, in which the mapping assertions are either inclusions or equalities of nested queries. We focus on the nested relational setting since most XML Document Type Definitions (DTDs) can be represented in this model. We perform the validation by reasoning on the schemas and the mapping definition, taking into account the integrity constraints defined on both the source and the target schema.
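A mapping assertion of the inclusion kind, Q_src ⊆ Q_tgt, can be tested on a concrete nested instance, as in the following sketch. The data layout and query definitions are invented for illustration, and this is not the paper's reasoning procedure: a found counterexample disproves the inclusion, while passing on one instance proves nothing in general.

```python
def q_src(db):
    # Nested query over the source side: for each department,
    # the set of its project names.
    return {(d["name"], frozenset(p["pname"] for p in d["projects"]))
            for d in db["depts"]}

def q_tgt(db):
    # Target-side query over a flat projection kept in the same database.
    return {(dept, frozenset(ps)) for dept, ps in db["dept_projects"].items()}

def check_inclusion(db):
    """Test Q_src ⊆ Q_tgt on one instance; return (holds, counterexamples)."""
    missing = q_src(db) - q_tgt(db)
    return not missing, missing

db = {
    "depts": [{"name": "it", "projects": [{"pname": "vista"}]}],
    "dept_projects": {"it": ["vista"]},
}
ok, _ = check_inclusion(db)
print(ok)  # True on this instance
```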
UK utility data integration: overcoming schematic heterogeneity
In this paper we discuss syntactic, semantic and schematic issues which inhibit the integration of utility data in the UK. We then focus on the techniques employed within the VISTA project to overcome schematic heterogeneity. A Global Schema based architecture is employed. Although automated approaches to Global Schema definition were attempted, the heterogeneities of the sector were too great, so a manual approach to Global Schema definition was employed. The techniques used to define and subsequently map source utility data models to this schema are discussed in detail. In order to ensure a coherent integrated model, sub- and cross-domain validation issues are then highlighted. Finally, the proposed framework and data flow for schematic integration are introduced.
Constraint-based Query Distribution Framework for an Integrated Global Schema
Distributed heterogeneous data sources need to be queried uniformly through a global schema. A query on the global schema is reformulated so that it can be executed on the local data sources. Constraints in the global schema and mappings are used for source selection, query optimization, and querying partitioned and replicated data sources. The system is entirely XML-based: queries are posed in XML form, and local results are transformed and integrated into an XML document. Contributions include the use of constraints in our existing global schema, which help in source selection and query optimization, and a global query distribution framework for querying distributed heterogeneous data sources.
Comment: The Proceedings of the 13th INMIC, Dec. 14-15, 2009, Islamabad, Pakistan, pages 1-6. Print ISBN: 978-1-4244-4872-2. INSPEC Accession Number: 11072575. Date of Current Version: 15 January 201
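Constraint-based source selection, as described in the abstract, can be sketched roughly as follows. The source names, the partitioning attribute and its ranges are invented for illustration, and the sketch ignores the paper's XML machinery: each source declares a range constraint on a partitioning attribute, and a query predicate prunes every source whose range cannot overlap it.

```python
# Hypothetical horizontal partitioning constraints declared per source.
SOURCES = {
    "src_a": {"attr": "year", "min": 1990, "max": 1999},
    "src_b": {"attr": "year", "min": 2000, "max": 2009},
    "src_c": {"attr": "year", "min": 2010, "max": 2019},
}

def select_sources(attr, lo, hi):
    """Return the sources whose declared range overlaps the query range
    [lo, hi]; only these need to receive the reformulated query."""
    return sorted(name for name, c in SOURCES.items()
                  if c["attr"] == attr and c["min"] <= hi and lo <= c["max"])

print(select_sources("year", 2005, 2012))  # ['src_b', 'src_c']
```

Only two of the three sources are contacted, which is where the optimization pays off.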
A framework for utility data integration in the UK
In this paper we investigate various factors which prevent utility knowledge from being fully exploited and suggest that integration techniques can be applied to improve the quality of utility records. The paper suggests a framework which supports knowledge and data integration. The framework supports utility integration at two levels: the schema level and the data level. Schema-level integration ensures that a single, integrated geospatial data set is available for utility enquiries. Data-level integration improves utility data quality by reducing inconsistency, duplication and conflicts. Moreover, the framework is designed to preserve the autonomy and distribution of utility data. The ultimate aim of the research is to produce an integrated representation of underground utility infrastructure in order to gain more accurate knowledge of the buried services. It is hoped that this approach will enable us to understand various problems associated with utility data, and to suggest some potential techniques for resolving them.
Save up to 99% of your time in mapping validation
Identifying semantic correspondences between different vocabularies has been recognized as a fundamental step towards achieving interoperability. Several manual and automatic techniques have recently been proposed. Fully manual approaches are very precise, but extremely costly. Conversely, automatic approaches tend to fail when domain-specific background knowledge is needed; consequently, they typically require a manual validation step. Yet, when the number of computed correspondences is very large, the validation phase can be very expensive. To reduce these problems, we propose to compute the minimal set of correspondences, which we call the minimal mapping, that is sufficient to compute all the others. We show that by concentrating on such correspondences we can save up to 99% of the manual checks required for validation.
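The redundancy idea behind minimal mappings can be sketched as follows (the taxonomies and names are invented, and this is only the core entailment rule, not the paper's algorithm): a subsumption correspondence a ⊑ b is redundant when some other correspondence c ⊑ d already entails it through the two hierarchies, i.e. a ⊑ c on the source side and d ⊑ b on the target side.

```python
# Parent maps for two toy taxonomies (child -> parent, root -> None).
SRC_PARENT = {"car": "vehicle", "vehicle": None}
TGT_PARENT = {"auto": "transport", "transport": None}

def ancestors(node, parent):
    """The node itself plus all of its ancestors, root included."""
    out = [node]
    while parent[node] is not None:
        node = parent[node]
        out.append(node)
    return out

def minimal(mapping):
    """Drop every subsumption correspondence entailed by another one."""
    keep = []
    for (a, b) in mapping:
        entailed = any((c, d) != (a, b)
                       and c in ancestors(a, SRC_PARENT)   # a ⊑ c holds
                       and b in ancestors(d, TGT_PARENT)   # d ⊑ b holds
                       for (c, d) in mapping)
        if not entailed:
            keep.append((a, b))
    return keep

# ("car" ⊑ "transport") follows from ("vehicle" ⊑ "auto"), since
# car ⊑ vehicle and auto ⊑ transport; only the latter needs manual checking.
print(minimal([("vehicle", "auto"), ("car", "transport")]))
# [('vehicle', 'auto')]
```

Every dropped correspondence is one manual check saved, which is where the large savings reported in the abstract come from.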
Assessing and refining mappings to RDF to improve dataset quality
RDF dataset quality assessment is currently performed primarily after data is published. However, there is neither a systematic way to incorporate its results into the dataset nor to incorporate the assessment into the publishing workflow. Adjustments are applied manually, but rarely. Moreover, the root of the violations, which often derives from the mappings that specify how the RDF dataset is generated, is not identified. We suggest an incremental, iterative and uniform validation workflow for RDF datasets stemming originally from (semi-)structured data (e.g., CSV, XML, JSON). In this work, we focus on assessing and improving their mappings. We incorporate (i) a test-driven approach for assessing the mappings instead of the RDF dataset itself, as mappings reflect how the dataset will be formed when generated; and (ii) semi-automatic mapping refinements based on the results of the quality assessment. The proposed workflow is applied to diverse cases, e.g., large, crowdsourced datasets such as DBpedia, or newly generated ones, such as iLastic. Our evaluation indicates the efficiency of our workflow, as it significantly improves the overall quality of an RDF dataset in the observed cases.
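The key point of assessing mappings rather than generated triples can be sketched minimally. The rule structure below is invented for illustration (real workflows use mapping languages such as RML and test suites in the style of RDFUnit): quality tests inspect the mapping rules themselves, so violations are caught once, at the root, before any RDF is generated.

```python
# Hypothetical flattened mapping rules: each produces one predicate from
# one source column, optionally with a datatype.
MAPPING_RULES = [
    {"predicate": "ex:birthDate", "datatype": "xsd:date", "source_column": "dob"},
    {"predicate": "ex:name", "datatype": None, "source_column": "name"},
]

def violations(rules, columns):
    """Return (predicate, issue) pairs found by inspecting the rules alone,
    without generating a single triple."""
    out = []
    for r in rules:
        if r["source_column"] not in columns:
            out.append((r["predicate"], "unknown source column"))
        if r["predicate"].endswith("Date") and r["datatype"] != "xsd:date":
            out.append((r["predicate"], "date predicate without xsd:date"))
    return out

# The source data exposes only a 'name' column, so the birthDate rule
# is flagged before generation.
print(violations(MAPPING_RULES, {"name"}))
# [('ex:birthDate', 'unknown source column')]
```

Fixing the one flagged rule repairs every triple it would have produced, which is why assessing mappings scales better than assessing the dataset.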
Using Element Clustering to Increase the Efficiency of XML Schema Matching
Schema matching attempts to discover semantic mappings between the elements of two schemas. Elements are cross-compared using various heuristics (e.g., name, data-type, and structure similarity). Seen from a broader perspective, schema matching is a combinatorial problem with exponential complexity, which makes naive matching algorithms prohibitively inefficient for large schemas. In this paper we propose a clustering-based technique for improving the efficiency of large-scale schema matching. The technique inserts clustering as an intermediate step into existing schema matching algorithms. Clustering partitions the schemas, reduces the overall matching load, and creates the possibility of trading effectiveness for efficiency. The technique can be used in addition to other optimization techniques. In the paper we describe the technique, validate the performance of one implementation of it, and outline directions for future research.
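The effect of clustering on the matching load can be sketched with deliberately crude heuristics (first-letter clustering and string similarity, both invented for illustration, not the paper's algorithm): elements are grouped, and cross-comparison happens only inside corresponding cluster pairs instead of across the full cross product.

```python
from collections import defaultdict
from difflib import SequenceMatcher

def clusters(elems):
    """Partition element names by first letter (a stand-in for a real
    clustering heuristic)."""
    groups = defaultdict(list)
    for e in elems:
        groups[e[0].lower()].append(e)
    return groups

def match(schema1, schema2, threshold=0.8):
    """Match only within overlapping clusters; count pairwise comparisons."""
    c1, c2 = clusters(schema1), clusters(schema2)
    pairs, comparisons = [], 0
    for key in sorted(c1.keys() & c2.keys()):
        for a in c1[key]:
            for b in c2[key]:
                comparisons += 1
                if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold:
                    pairs.append((a, b))
    return pairs, comparisons

s1 = ["Address", "Amount", "Buyer", "Price"]
s2 = ["address", "buyer_id", "price", "product"]
pairs, n = match(s1, s2)
print(pairs, n)  # 5 comparisons instead of the naive 4 * 4 = 16
```

The trade-off from the abstract is visible here: comparisons drop, but a true correspondence split across clusters (or, as with `Buyer`/`buyer_id`, falling below the threshold) can be missed.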
A schema-only approach to validate XML schema mappings
Since the emergence of the Web, the ability to map XML data between different data sources has become crucial. Defining a mapping is, however, not a fully automatic process: the designer needs to figure out whether the mapping is what was intended. Our approach to this validation consists of defining and checking certain desirable properties of mappings. We translate the XML schemas and the mapping into a first-order logic formalism and apply a reasoning mechanism to check the desirable properties automatically, without assuming any particular instantiation of the schemas.
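The logic-based flavour of such a check can be sketched over a tiny fixed domain (the predicates and sentences are invented; the paper's translation covers full XML schemas and richer constraints): encode the mapping and a target constraint as first-order sentences over unary predicates and search for a model. A bounded search like this can witness that a property holds, but unlike the instance-independent reasoning described above, it cannot refute it in general.

```python
DOMAIN = [0, 1]

def subsets(dom):
    """All subsets of dom, enumerated by bitmask."""
    for mask in range(2 ** len(dom)):
        yield {x for i, x in enumerate(dom) if mask >> i & 1}

def has_model():
    """Is the mapping sentence  ∀x S(x) -> T(x)  consistent with the target
    constraint  ∀x T(x) -> V(x)  in some model where S is non-empty?"""
    for S in subsets(DOMAIN):
        for T in subsets(DOMAIN):
            for V in subsets(DOMAIN):
                if S and S <= T and T <= V:
                    return True
    return False

print(has_model())  # True: the mapping is satisfiable in this toy encoding
```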