4,258 research outputs found

UK utility data integration: overcoming schematic heterogeneity

Author: Beck A.R.
Bennett B.
Cohn AG
Fu G.
Ramage S.
Sanderson M.
Stell J.G.
Tagg C
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 31/10/2008

In this paper we discuss syntactic, semantic and schematic issues which inhibit the integration of utility data in the UK. We then focus on the techniques employed within the VISTA project to overcome schematic heterogeneity. A Global Schema based architecture is employed. Although automated approaches to Global Schema definition were attempted, the heterogeneities of the sector were too great, so a manual approach to Global Schema definition was employed instead. The techniques used to define this schema, and subsequently to map source utility data models to it, are discussed in detail. In order to ensure a coherent integrated model, sub and cross domain validation issues are then highlighted. Finally, the proposed framework and data flow for schematic integration are introduced.

Crossref

White Rose Research Online

Bioinformatics service reconciliation by heterogeneous schema transformation

Author: Martin Nigel
Poulovassilis Alexandra
Zamboulis Lucas
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/06/2007

This paper focuses on the problem of bioinformatics service reconciliation in a generic and scalable manner so as to enhance interoperability in a rapidly evolving field. Using XML as a common representation format, but also supporting existing flat-file representation formats, we propose an approach for the scalable semi-automatic reconciliation of services, possibly invoked from within a scientific workflow tool. Service reconciliation may use the AutoMed heterogeneous data integration system as an intermediary service, or may use AutoMed to produce services that mediate between services. We discuss the application of our approach to the reconciliation of services in an example bioinformatics workflow. The main contribution of this research is an architecture for the scalable reconciliation of bioinformatics services.

Birkbeck Institutional Research Online

Semantic Web Based Relational Database Access With Conflict Resolution

Author: Khazalah Fayez
Publication venue: DigitalCommons@WayneState
Publication date: 01/01/2015

This thesis focuses on (1) accessing relational databases through Semantic Web technologies and (2) resolving conflicts that usually arise when integrating data from heterogeneous source schemas and/or instances. In the first part of the thesis, we present an approach to access relational databases using Semantic Web technologies. Our approach is built on top of the Ontop framework for Ontology Based Data Access. It extracts both Ontop mappings and an equivalent OWL ontology from an existing database schema. The end users can then access the underlying data source through SPARQL queries. The proposed approach takes into consideration the different relationships between the entities of the database schema when it extracts the mapping and the equivalent ontology. Instead of extracting a flat ontology that is an exact copy of the database schema, it extracts a rich ontology. The extracted ontology can also be used as an intermediary between a domain ontology and the underlying database schema. Our approach covers independent or master entities that do not have foreign references, dependent or detailed entities that have some foreign keys that reference other entities, recursive entities that contain some self references, binary join entities that relate two entities together, and n-ary join entities that map two or more entities in an n-ary relation. The implementation results indicate that the extracted Ontop mappings and ontology are accurate, i.e., end users can query all data (using SPARQL) from the underlying database source in the same way as if they had written SQL queries. In the second part, we present an overview of the conflict resolution approaches in both conventional data integration systems and collaborative data sharing communities. We focus on the latter as it supports the needs of scientific communities for data sharing and collaboration. We first introduce the purpose of the study, and present a brief overview of data integration.
Next, we discuss the problem of inconsistent data in conventional integration systems, and we summarize the conflict handling strategies used to handle such inconsistent data. Then we focus on the problem of conflict resolution in collaborative data sharing communities. A collaborative data sharing community is a group of users who agree to share a common database instance, such that all users have access to the shared instance and can add to, update, and extend it. We discuss related works that adopt different conflict resolution strategies in the area of collaborative data sharing, and we provide a comparison between them. We find that a Collaborative Data Sharing System (CDSS) can best support the needs of certain communities such as scientific communities. We then discuss some open research opportunities to improve the efficiency and performance of the CDSS. Finally, we summarize our work so far towards achieving these open research directions.

Digital Commons@Wayne State University

Data Management in the APPA System

Author: Akbarinia Reza
Martins Vidal
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

The combination of Grid and P2P technologies can be exploited to provide high-level data sharing in large-scale distributed environments. However, this combination must deal with two hard problems: the scale of the network and the dynamic behavior of the nodes. In this paper, we present our solution in APPA (Atlas Peer-to-Peer Architecture), a data management system with high-level services for building large-scale distributed applications. We focus on data availability and data discovery, which are two main requirements for implementing large-scale Grids. We have validated APPA's services through a combination of experimentation over Grid5000, which is a very large Grid experimental platform, and simulation using SimJava. The results show very good performance in terms of communication cost and response time.

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Team Integration

Author: Do Son Thanh
Publication venue
Publication date: 03/04/2016

We leverage theoretical advances and the multi-user nature of argumentation. The overall contributions of our work are as follows. We model the schema matching network and the reconciliation process, where we relate the experts' assertions and the constraints of the matching network to an argumentation framework. Our representation not only captures the experts' beliefs and their explanations, but also enables reasoning about these captured inputs. On top of this representation, we develop support techniques for experts to detect conflicts in a set of their assertions. Then we guide the conflict resolution by offering two primitives: conflict-structure interpretation and what-if analysis. While the former presents meaningful interpretations for the conflicts and various heuristic metrics, the latter can greatly help the experts to understand the consequences of their own decisions as well as those of others. Last but not least, we implement an argumentation-based negotiation support tool for schema matching (ArgSM), which realizes our methods to help the experts in the collaborative task.

Infoscience - École polytechnique fédérale de Lausanne

Report of the Stanford Linked Data Workshop

Author: Calter Mimi
Glaser Hugh
Keller Michael A
Persons Jerry
Publication venue: Council on Library and Information Resources
Publication date: 01/10/2011

The Stanford University Libraries and Academic Information Resources (SULAIR) with the Council on Library and Information Resources (CLIR) conducted a week-long workshop on the prospects for a large scale, multi-national, multi-institutional prototype of a Linked Data environment for discovery of and navigation among the rapidly, chaotically expanding array of academic information resources. As preparation for the workshop, CLIR sponsored a survey by Jerry Persons, Chief Information Architect emeritus of SULAIR, that was published originally for workshop participants as background to the workshop and is now publicly available. The original intention of the workshop was to devise a plan for such a prototype. However, such was the diversity of knowledge, experience, and views of the potential of Linked Data approaches that the workshop participants turned to two more fundamental goals: building common understanding and enthusiasm on the one hand and identifying opportunities and challenges to be confronted in the preparation of the intended prototype and its operation on the other. In pursuit of those objectives, the workshop participants produced: (1) a value statement addressing the question of why a Linked Data approach is worth prototyping; (2) a manifesto for Linked Libraries (and Museums and Archives and …); (3) an outline of the phases in a life cycle of Linked Data approaches; (4) a prioritized list of known issues in generating, harvesting and using Linked Data; (5) a workflow with notes for converting library bibliographic records and other academic metadata to URIs; (6) examples of potential “killer apps” using Linked Data; and (7) a list of next steps and potential projects. This report includes a summary of the workshop agenda, a chart showing the use of Linked Data in cultural heritage venues, and short biographies and statements from each of the participants.

Southampton (e-Prints Soton)