Information Integration - the process of integration, evolution and versioning
Many information sources are now available wherever you are. Most of the time, the information needed is spread across several of those sources. Gathering this information is a tedious and time-consuming job, and automating the process would assist users in their task. Integrating the information sources yields a global information source that contains all the information needed. All of these information sources also change over time, and with each change the schema of the source can change as well. The data contained in the information source, however, cannot be converted every time, because of the huge amount of data that would have to be converted to conform to the most recent schema.
In this report we describe the current methods for information integration, evolution and versioning. We distinguish between integration of schemas and integration of the actual data. We also show some key issues when integrating XML data sources.
DataHub: Collaborative Data Science & Dataset Version Management at Scale
Relational databases have limited support for data collaboration, where teams
collaboratively curate and analyze large datasets. Inspired by software version
control systems like git, we propose (a) a dataset version control system,
giving users the ability to create, branch, merge, difference and search large,
divergent collections of datasets, and (b) a platform, DataHub, that gives
users the ability to perform collaborative data analysis building on this
version control system. We outline the challenges in providing dataset version
control at scale. Comment: 7 pages
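The operations named in the abstract above (create, branch, merge, difference) can be illustrated with a toy in-memory store. This is only a sketch under stated assumptions: the class `DatasetRepo` and its methods are hypothetical names, not DataHub's API, a dataset version is modeled as a set of row tuples, and the merge is a naive union with no conflict handling.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRepo:
    """Toy dataset version store (hypothetical; not DataHub's actual API)."""
    versions: dict = field(default_factory=dict)  # version id -> frozenset of rows
    parents: dict = field(default_factory=dict)   # version id -> parent version id
    branches: dict = field(default_factory=dict)  # branch name -> head version id
    _next_id: int = 0

    def commit(self, branch, rows):
        """Store rows as a new version and move the branch head to it."""
        vid = self._next_id
        self._next_id += 1
        self.versions[vid] = frozenset(rows)
        self.parents[vid] = self.branches.get(branch)  # None for a first commit
        self.branches[branch] = vid
        return vid

    def branch(self, new, source):
        """Create a branch that starts at the head of an existing branch."""
        self.branches[new] = self.branches[source]

    def diff(self, a, b):
        """Return (rows only in b, rows only in a) for two branch heads."""
        rows_a = self.versions[self.branches[a]]
        rows_b = self.versions[self.branches[b]]
        return sorted(rows_b - rows_a), sorted(rows_a - rows_b)

    def merge(self, target, source):
        """Naive merge: commit the union of both heads onto the target branch."""
        merged = self.versions[self.branches[target]] | self.versions[self.branches[source]]
        return self.commit(target, merged)


repo = DatasetRepo()
repo.commit("main", [("alice", 30), ("bob", 25)])
repo.branch("cleanup", "main")
repo.commit("cleanup", [("alice", 30), ("bob", 26)])
added, removed = repo.diff("main", "cleanup")
repo.merge("main", "cleanup")
```

A real system would have to avoid storing a full copy of every version and would need a merge policy for conflicting rows; the abstract names versioning at scale as the open challenge.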
Model Matching Challenge: Benchmarks for Ecore and BPMN Diagrams
In the last couple of years, Model Driven Engineering (MDE) has gained a
prominent role in the context of software engineering. In the MDE paradigm,
models are considered first-class artifacts which are iteratively developed by
teams of programmers over a period of time. Because of this, dedicated tools
for versioning and management of models are needed. A central functionality
within this group of tools is model comparison and differencing. In two
separate research projects, we identified a group of general matching problems
where state-of-the-art comparison algorithms delivered low-quality results. In
this article, we present five edit operations that cause
these low-quality results. The reasons why the algorithms fail, as well as
possible solutions, are also discussed. These examples can be used as
benchmarks by model developers to assess the quality and applicability of a
model comparison tool for a given model type. Comment: 7 pages, 7 figures
OntoMaven: Maven-based Ontology Development and Management of Distributed Ontology Repositories
In collaborative agile ontology development projects, support for modular
reuse of ontologies from large existing remote repositories, ontology project
life cycle management, and transitive dependency management are important
needs. The Apache Maven approach has proven its success in distributed
collaborative Software Engineering by its widespread adoption. The contribution
of this paper is a new design artifact called OntoMaven. OntoMaven adopts the
Maven-based development methodology and adapts its concepts to knowledge
engineering for Maven-based ontology development and management of ontology
artifacts in distributed ontology repositories. Comment: Pre-print submission to 9th International Workshop on Semantic Web
Enabled Software Engineering (SWESE2013). Berlin, Germany, December 2-5, 2013
Guidelines for a Dynamic Ontology - Integrating Tools of Evolution and Versioning in Ontology
Ontologies are built on systems that conceptually evolve over time. In
addition, techniques and languages for building ontologies evolve too. This has
led to numerous studies in the field of ontology versioning and ontology
evolution. This paper presents a new way to manage the lifecycle of an ontology
that incorporates both versioning tools and the evolution process. This solution,
called VersionGraph, is integrated into the source ontology from its creation so
that the ontology can evolve and be versioned. Change management is
strongly related to the model in which the ontology is represented. Therefore,
we focus on the OWL language in order to take into account the impact of the
changes on the logical consistency of the ontology, as specified in OWL DL.
A logic programming framework for modeling temporal objects
Published version
Distributed Management of Massive Data: an Efficient Fine-Grain Data Access Scheme
This paper addresses the problem of efficiently storing and accessing massive
data blocks in a large-scale distributed environment, while providing efficient
fine-grain access to data subsets. This issue is crucial in the context of
applications in the field of databases, data mining and multimedia. We propose
a data sharing service based on distributed, RAM-based storage of data, while
leveraging a DHT-based, natively parallel metadata management scheme. As
opposed to the most commonly used grid storage infrastructures that provide
mechanisms for explicit data localization and transfer, we provide a
transparent access model, where data are accessed through global identifiers.
Our proposal has been validated through a prototype implementation whose
preliminary evaluation provides promising results.
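The access model described above, where a client reads a subset of a large data block through a global identifier without knowing where the data lives, can be sketched as follows. This is a simplified illustration, not the paper's design: `PagedStore`, `PAGE_SIZE` and the hash-modulo placement are assumptions made for the example, and the placement rule merely stands in for a real DHT.

```python
import hashlib

PAGE_SIZE = 4  # bytes per page; kept tiny so the example is easy to follow


class PagedStore:
    """Toy RAM-based store that spreads fixed-size pages over several nodes."""

    def __init__(self, num_nodes):
        # Each "node" is just a dict held in memory.
        self.nodes = [dict() for _ in range(num_nodes)]

    def _node_for(self, blob_id, page_no):
        # Hash the (global id, page number) pair to choose a node.
        key = f"{blob_id}:{page_no}".encode()
        digest = hashlib.sha256(key).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]

    def write(self, blob_id, data):
        """Split data into pages and store each page on its hashed node."""
        for offset in range(0, len(data), PAGE_SIZE):
            page_no = offset // PAGE_SIZE
            node = self._node_for(blob_id, page_no)
            node[(blob_id, page_no)] = data[offset:offset + PAGE_SIZE]

    def read(self, blob_id, offset, size):
        """Fetch only the pages that overlap the requested byte range."""
        if size <= 0:
            return b""
        first = offset // PAGE_SIZE
        last = (offset + size - 1) // PAGE_SIZE
        pages = [self._node_for(blob_id, p)[(blob_id, p)] for p in range(first, last + 1)]
        start = offset - first * PAGE_SIZE
        return b"".join(pages)[start:start + size]


store = PagedStore(num_nodes=3)
store.write("blob-1", b"abcdefghijkl")
subset = store.read("blob-1", offset=3, size=5)
```

The caller names only the blob and a byte range; which node holds each page is resolved inside the store, which is the transparency the abstract contrasts with explicit data localization and transfer.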