Search CORE

6,754 research outputs found

Information Integration - the process of integration, evolution and versioning

Author: Keijzer Ander de
Keulen Maurice van
Publication venue: University of Twente, Centre for Telematica and Information Technology (CTIT)
Publication date: 01/01/2005
Field of study

At present, many information sources are available wherever you are. Most of the time, the information needed is spread across several of those information sources. Gathering this information is a tedious and time consuming job. Automating this process would assist the user in its task. Integration of the information sources provides a global information source with all information needed present. All of these information sources also change over time. With each change of the information source, the schema of this source can be changed as well. The data contained in the information source, however, cannot be changed every time, due to the huge amount of data that would have to be converted in order to conform to the most recent schema.\ud In this report we describe the current methods to information integration, evolution and versioning. We distinguish between integration of schemas and integration of the actual data. We also show some key issues when integrating XML data sources

University of Twente Research Information

DataHub: Collaborative Data Science & Dataset Version Management at Scale

Author: Bhardwaj Anant
Bhattacherjee Souvik
Chavan Amit
Deshpande Amol
Elmore Aaron J.
Madden Samuel
Parameswaran Aditya G.
Publication venue
Publication date: 02/09/2014
Field of study

Relational databases have limited support for data collaboration, where teams collaboratively curate and analyze large datasets. Inspired by software version control systems like git, we propose (a) a dataset version control system, giving users the ability to create, branch, merge, difference and search large, divergent collections of datasets, and (b) a platform, DataHub, that gives users the ability to perform collaborative data analysis building on this version control system. We outline the challenges in providing dataset version control at scale.Comment: 7 page

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

Model Matching Challenge: Benchmarks for Ecore and BPMN Diagrams

Author: Müller Klaus
Pietsch Pit
Rumpe Bernhard
Publication venue
Publication date: 01/01/2013
Field of study

In the last couple of years, Model Driven Engineering (MDE) gained a prominent role in the context of software engineering. In the MDE paradigm, models are considered first level artifacts which are iteratively developed by teams of programmers over a period of time. Because of this, dedicated tools for versioning and management of models are needed. A central functionality within this group of tools is model comparison and differencing. In two disjunct research projects, we identified a group of general matching problems where state-of-the-art comparison algorithms delivered low quality results. In this article, we will present five edit operations which are the cause for these low quality results. The reasons why the algorithms fail, as well as possible solutions, are also discussed. These examples can be used as benchmarks by model developers to assess the quality and applicability of a model comparison tool for a given model type.Comment: 7 pages, 7 figure

arXiv.org e-Print Archive

Publikationsserver der RWTH Aachen University

OntoMaven: Maven-based Ontology Development and Management of Distributed Ontology Repositories

Author: Paschke Adrian
Publication venue
Publication date: 27/09/2013
Field of study

In collaborative agile ontology development projects support for modular reuse of ontologies from large existing remote repositories, ontology project life cycle management, and transitive dependency management are important needs. The Apache Maven approach has proven its success in distributed collaborative Software Engineering by its widespread adoption. The contribution of this paper is a new design artifact called OntoMaven. OntoMaven adopts the Maven-based development methodology and adapts its concepts to knowledge engineering for Maven-based ontology development and management of ontology artifacts in distributed ontology repositories.Comment: Pre-print submission to 9th International Workshop on Semantic Web Enabled Software Engineering (SWESE2013). Berlin, Germany, December 2-5, 201

arXiv.org e-Print Archive

CiteSeerX

Guidelines for a Dynamic Ontology - Integrating Tools of Evolution and Versioning in Ontology

Author: Cruz Christophe
Nicolle Christophe
Pittet Perrine
Publication venue
Publication date: 26/10/2011
Field of study

Ontologies are built on systems that conceptually evolve over time. In addition, techniques and languages for building ontologies evolve too. This has led to numerous studies in the field of ontology versioning and ontology evolution. This paper presents a new way to manage the lifecycle of an ontology incorporating both versioning tools and evolution process. This solution, called VersionGraph, is integrated in the source ontology since its creation in order to make it possible to evolve and to be versioned. Change management is strongly related to the model in which the ontology is represented. Therefore, we focus on the OWL language in order to take into account the impact of the changes on the logical consistency of the ontology like specified in OWL DL

arXiv.org e-Print Archive

HAL-uB

A logic programming framework for modeling temporal objects

Author: Kesim FN
Sergot M
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1996
Field of study

Published versio

Crossref

Bilkent University Institutional Repository

Spiral - Imperial College Digital Repository

Distributed Management of Massive Data: an Efficient Fine-Grain Data Access Scheme

Author: A. Bassi
A. Thomasian
B. Allcock
B.S. White
G. Antoniu
G. Antoniu
G. Antoniu
K. Douglas
M. Nicola
M.A. Casey
O. Tatebe
P. Honeyman
P.Z. Kunszt
R. Bolze
R. Jin
R.C. Merkle
S. Rhea
Publication venue
Publication date: 01/01/2008
Field of study

This paper addresses the problem of efficiently storing and accessing massive data blocks in a large-scale distributed environment, while providing efficient fine-grain access to data subsets. This issue is crucial in the context of applications in the field of databases, data mining and multimedia. We propose a data sharing service based on distributed, RAM-based storage of data, while leveraging a DHT-based, natively parallel metadata management scheme. As opposed to the most commonly used grid storage infrastructures that provide mechanisms for explicit data localization and transfer, we provide a transparent access model, where data are accessed through global identifiers. Our proposal has been validated through a prototype implementation whose preliminary evaluation provides promising results

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

HAL-Rennes 1