
    Schema Migration from Relational Databases to NoSQL Databases with Graph Transformation and Selective Denormalization

    We have witnessed a dramatic increase in the volume, variety and velocity of data, leading to the era of big data. The structure of data has become highly flexible, leading to the development of many storage systems that differ from traditional structured relational databases, where data is stored in “tables,” with columns representing the lowest granularity of data. Although relational databases are still predominant in industry, there has been a major shift towards alternative database systems that support unstructured data with better scalability, leading to the popularity of “Not Only SQL” (NoSQL). Migration from relational databases to NoSQL databases has become a significant area of interest when it involves enormous volumes of data and a large number of concurrent users. Many migration methodologies have been proposed, each focusing on a specific NoSQL family. This paper proposes a heuristics-based graph transformation method for migrating a relational database to MongoDB, called Graph Transformation with Selective Denormalization, and compares it with a table-level denormalization method. Although this paper focuses on MongoDB, the heuristic algorithm is general enough to be applied to other NoSQL families. Experimental evaluation with TPC-H shows that the Graph Transformation with Selective Denormalization migration method has lower query execution times and a smaller hardware footprint (lower space requirements, disk I/O and CPU utilization) than table-level denormalization.
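The contrast between table-level and selective denormalization can be illustrated with a minimal sketch. This is not the paper's heuristic algorithm, which the abstract does not spell out. It only assumes that table-level denormalization embeds the whole referenced row in each document, while a selective approach embeds a chosen subset of columns. The column names follow the TPC-H `customer` and `orders` tables; the sample values and both function names are hypothetical.

```python
# Hypothetical sample rows using TPC-H column names.
customers = {
    1: {"c_custkey": 1, "c_name": "Customer#1", "c_address": "12 Elm St", "c_phone": "555-0100"},
}
orders = [
    {"o_orderkey": 10, "o_custkey": 1, "o_totalprice": 99.5},
    {"o_orderkey": 11, "o_custkey": 1, "o_totalprice": 12.0},
]


def denormalize_table_level(orders, customers):
    """Embed the full referenced customer row in every order document."""
    return [{**o, "customer": dict(customers[o["o_custkey"]])} for o in orders]


def denormalize_selectively(orders, customers, keep):
    """Embed only the customer columns listed in `keep`.

    The foreign key o_custkey stays in the document, so the remaining
    customer columns can still be looked up by reference.
    """
    docs = []
    for o in orders:
        customer = customers[o["o_custkey"]]
        docs.append({**o, "customer": {col: customer[col] for col in keep}})
    return docs


full_docs = denormalize_table_level(orders, customers)
slim_docs = denormalize_selectively(orders, customers, keep=["c_name"])
```

In this toy case every order document in `full_docs` repeats all four customer columns, whereas `slim_docs` repeats only `c_name`, which is the kind of space saving the abstract attributes to selective denormalization.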

    An epistemic approach to model uncertainty in data-graphs

    Graph databases are becoming widely successful as data models that allow complex relationships among various types of data to be represented and processed effectively. As with any other type of data repository, graph databases may suffer from errors and discrepancies with respect to the real-world data they intend to represent. In this work we explore the notion of probabilistic unclean graph databases, previously proposed for relational databases, in order to capture the idea that the observed (unclean) graph database is actually a noisy version of a clean one that correctly models the world but that we know only partially. As many factors may be involved in the observation, e.g., all the different types of clerical errors or unintended transformations of the data, we assume a probabilistic model that describes the distribution over all possible ways in which the clean (uncertain) database could have been polluted. Based on this model we define two computational problems, data cleaning and probabilistic query answering, and study the complexity of each when the transformation of the database can be caused by either removing (subset) or adding (superset) nodes and edges. Comment: 25 pages, 3 figures
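As a rough illustration of probabilistic query answering in the subset case, the sketch below enumerates possible clean graphs by brute force. It is not the paper's model: it assumes that the observed graph lost some edges, that each candidate lost edge belonged to the clean graph independently with a given probability, and that the query is simple reachability. The function names and the example data are hypothetical.

```python
from itertools import combinations


def reachable(edges, source, target):
    """Depth-first search over directed edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def query_probability(observed, candidates, source, target):
    """Probability that target is reachable from source in the clean graph.

    `candidates` maps each possibly-removed edge to the (assumed independent)
    probability that it was part of the clean graph. Every subset of restored
    edges is one possible clean graph, so the cost is exponential in the
    number of candidates.
    """
    edges = list(candidates)
    total = 0.0
    for k in range(len(edges) + 1):
        for restored in combinations(edges, k):
            weight = 1.0
            for e in edges:
                weight *= candidates[e] if e in restored else 1.0 - candidates[e]
            if reachable(set(observed) | set(restored), source, target):
                total += weight
    return total


observed = {("a", "b")}
candidates = {("b", "c"): 0.5, ("a", "c"): 0.5}
p = query_probability(observed, candidates, "a", "c")
```

The exhaustive enumeration is only workable for a handful of candidate edges, which hints at why the complexity of these problems is worth studying.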

    A framework for integrating and transforming between ontologies and relational databases

    Bridging the gap between ontologies, expressed in the Web Ontology Language (OWL), and relational databases is a necessity for realising the Semantic Web vision. Relational databases are considered a good solution for storing and processing ontologies with a large amount of data. Moreover, the vast majority of current websites store data in relational databases, and therefore being able to generate ontologies from such databases is important to support the development of the Semantic Web. Most of the work on this topic has either (1) extracted from an existing relational database an OWL ontology that represents the relational schema as exactly as possible, using a limited range of OWL modelling constructs, or (2) extracted from an existing OWL ontology a relational database that represents the ontology as closely as possible. By contrast, this thesis proposes a general framework for transforming and mapping between ontologies and databases, via an intermediate low-level Hyper-graph Data Model. The transformation between relational and OWL schemas is expressed using directional Both-As-View mappings, allowing a precise definition of the equivalence between the two schemas, so that data can be mapped back and forth between them. In particular, for a given OWL ontology, we interpret the expressive axioms either as triggers, conforming to the Open-World Assumption, that perform a forward-chaining materialisation of inferred data, or as constraints, conforming to the Closed-World Assumption, that perform consistency checking. With regard to extracting ontologies from relational databases, we transform a relational database into an exact OWL ontology, then enhance it with rich OWL 2 axioms, using a combination of schema and data analysis. We then apply machine learning algorithms to rank the suggested axioms based on past users’ relevance. A proof-of-concept tool, OWLRel, has been implemented, and a number of well-known ontologies and databases have been used to evaluate the approach and the OWLRel tool.
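The two readings of an axiom can be sketched on a single kind of axiom, the subclass relation. The code below is a minimal illustration, not the OWLRel implementation: it assumes that class memberships are stored as a mapping from individual to asserted classes, and the function names and example data are hypothetical. Read as a trigger, a subclass axiom adds the missing superclass memberships until nothing changes; read as a constraint, it reports the individuals whose stored data lacks them.

```python
def materialise(types, subclass_of):
    """Trigger reading: forward-chain subclass axioms to a fixpoint."""
    inferred = {ind: set(classes) for ind, classes in types.items()}
    changed = True
    while changed:
        changed = False
        for classes in inferred.values():
            for sub, sup in subclass_of:
                if sub in classes and sup not in classes:
                    classes.add(sup)  # inferred membership
                    changed = True
    return inferred


def violations(types, subclass_of):
    """Constraint reading: list (individual, subclass, superclass) triples
    where the stored data asserts the subclass but not the superclass."""
    return sorted(
        (ind, sub, sup)
        for ind, classes in types.items()
        for sub, sup in subclass_of
        if sub in classes and sup not in classes
    )


types = {"rex": {"Dog"}}
axioms = [("Dog", "Mammal"), ("Mammal", "Animal")]

closed_world_errors = violations(types, axioms)
open_world_facts = materialise(types, axioms)
```

The same stored data is therefore either extended (open world) or rejected as inconsistent (closed world), depending on which interpretation of the axiom is chosen.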