
    Information Integration - the process of integration, evolution and versioning

    At present, many information sources are available wherever you are. Most of the time, the information needed is spread across several of those sources. Gathering this information is a tedious and time-consuming job, and automating the process would assist users in their task. Integrating the information sources provides a global information source in which all the needed information is present. All of these information sources also change over time. With each change to an information source, its schema can change as well. The data contained in the source, however, cannot be converted every time, because of the huge amount of data that would have to be converted to conform to the most recent schema. In this report we describe the current approaches to information integration, evolution and versioning. We distinguish between integration of schemas and integration of the actual data. We also show some key issues that arise when integrating XML data sources.

    DataHub: Collaborative Data Science & Dataset Version Management at Scale

    Relational databases have limited support for data collaboration, where teams collaboratively curate and analyze large datasets. Inspired by software version control systems like git, we propose (a) a dataset version control system, giving users the ability to create, branch, merge, difference and search large, divergent collections of datasets, and (b) a platform, DataHub, that gives users the ability to perform collaborative data analysis building on this version control system. We outline the challenges in providing dataset version control at scale. Comment: 7 pages

    Model Matching Challenge: Benchmarks for Ecore and BPMN Diagrams

    In the last couple of years, Model Driven Engineering (MDE) has gained a prominent role in software engineering. In the MDE paradigm, models are considered first-level artifacts that are developed iteratively by teams of programmers over a period of time. Because of this, dedicated tools for versioning and managing models are needed. A central functionality within this group of tools is model comparison and differencing. In two separate research projects, we identified a group of general matching problems where state-of-the-art comparison algorithms delivered low-quality results. In this article, we present five edit operations that cause these low-quality results. The reasons why the algorithms fail, as well as possible solutions, are also discussed. These examples can be used as benchmarks by model developers to assess the quality and applicability of a model comparison tool for a given model type. Comment: 7 pages, 7 figures

    OntoMaven: Maven-based Ontology Development and Management of Distributed Ontology Repositories

    In collaborative agile ontology development projects, support for modular reuse of ontologies from large existing remote repositories, ontology project life cycle management, and transitive dependency management are important needs. The Apache Maven approach has proven its success in distributed collaborative software engineering through its widespread adoption. The contribution of this paper is a new design artifact called OntoMaven. OntoMaven adopts the Maven-based development methodology and adapts its concepts to knowledge engineering for Maven-based ontology development and management of ontology artifacts in distributed ontology repositories. Comment: Pre-print submission to the 9th International Workshop on Semantic Web Enabled Software Engineering (SWESE2013). Berlin, Germany, December 2-5, 2013

    Guidelines for a Dynamic Ontology - Integrating Tools of Evolution and Versioning in Ontology

    Ontologies are built on systems that conceptually evolve over time. In addition, the techniques and languages for building ontologies evolve too. This has led to numerous studies in the field of ontology versioning and ontology evolution. This paper presents a new way to manage the lifecycle of an ontology that incorporates both versioning tools and an evolution process. This solution, called VersionGraph, is integrated into the source ontology from its creation so that the ontology can evolve and be versioned. Change management is strongly related to the model in which the ontology is represented. Therefore, we focus on the OWL language in order to take into account the impact of changes on the logical consistency of the ontology, as specified in OWL DL.

    A logic programming framework for modeling temporal objects

    Published version

    Distributed Management of Massive Data: an Efficient Fine-Grain Data Access Scheme

    This paper addresses the problem of efficiently storing and accessing massive data blocks in a large-scale distributed environment, while providing efficient fine-grain access to data subsets. This issue is crucial for applications in the fields of databases, data mining and multimedia. We propose a data sharing service based on distributed, RAM-based storage of data, while leveraging a DHT-based, natively parallel metadata management scheme. As opposed to the most commonly used grid storage infrastructures, which provide mechanisms for explicit data localization and transfer, we provide a transparent access model in which data are accessed through global identifiers. Our proposal has been validated through a prototype implementation whose preliminary evaluation provides promising results.