    Migrating relational databases into object-based and XML databases

    Rapid changes in information technology, the emergence of object-based and WWW applications, and the interest of organisations in securing benefits from new technologies have made information systems re-engineering in general, and database migration in particular, an active research area. In order to improve the functionality and performance of existing systems, the re-engineering process requires identifying and understanding all of the components of such systems. An underlying database is one of the most important components of an information system. A considerable body of data is stored in relational databases (RDBs), yet RDBs offer limited support for the complex structures and user-defined data types provided by relatively recent databases such as object-based and XML databases. Instead of throwing away the large amount of data stored in RDBs, it is more appropriate to enrich and convert such data to be used by new systems. Most research on the migration of RDBs into object-based/XML databases has concentrated on schema translation and on accessing and publishing RDB data using newer technology, while little attention has been paid to the conversion of data and the preservation of data semantics, e.g., inheritance and integrity constraints. In addition, existing work does not appear to provide a solution for more than one target database. Thus, research on the migration of RDBs is not fully developed. We propose a solution that offers automatic migration of an RDB as a source into the recent database technologies as targets based on available standards such as ODMG 3.0, SQL4 and XML Schema. A canonical data model (CDM) is proposed to bridge the semantic gap between an RDB and the target databases. The CDM preserves and enhances the metadata of existing RDBs to fit in with the essential characteristics of the target databases. The adoption of standards is essential for increased portability, flexibility and constraints preservation.
This thesis contributes a solution for migrating RDBs into object-based and XML databases. The solution takes an existing RDB as input, enriches its metadata representation with the required explicit semantics, and constructs an enhanced relational schema representation (RSR). Based on the RSR, a CDM is generated, which is enriched with the RDB's constraints and with data semantics that may not have been explicitly expressed in the RDB metadata. The CDM so obtained facilitates both schema translation and data conversion. We design sets of rules for translating the CDM into each of the three target schemas, and provide algorithms for converting RDB data into the target formats based on the CDM. A prototype of the solution has been implemented, which generates the three target databases. An experimental study has been conducted to evaluate the prototype. The experimental results show that the target schemas produced by the prototype are comparable to those generated by existing manual mapping techniques. We have also shown that the source and target databases were equivalent, and demonstrated that the solution, conceptually and practically, is feasible, efficient and correct.

    Converting relational databases into object relational databases

    This paper proposes an approach for migrating existing Relational DataBases (RDBs) into Object-Relational DataBases (ORDBs). The approach is superior to existing proposals as it can generate not only the target schema but also the data instances. The solution takes an existing RDB as input, enriches its metadata representation with the required semantics, and generates an enhanced canonical data model, which captures the essential characteristics of the target ORDB and is suitable for migration. A prototype has been developed, which successfully migrates RDBs into ORDBs (Oracle 11g) based on the canonical model. The experimental results were very encouraging, demonstrating that the proposed approach is feasible, efficient and correct.

    Database independent Migration of Objects into an Object-Relational Database

    This paper reports on the CERN-based WISDOM project, which is studying the serialisation and deserialisation of data to/from an object database (Objectivity) and ORACLE 9i. Comment: 26 pages, 18 figures; CMS CERN Conference Report cr02_01

    The Sloan Digital Sky Survey Science Archive: Migrating a Multi-Terabyte Astronomical Archive from Object to Relational DBMS

    The Sloan Digital Sky Survey Science Archive is the first in a series of multi-Terabyte digital archives in Astronomy and other data-intensive sciences. To facilitate data mining in the SDSS archive, we adapted a commercial database engine and built specialized tools on top of it. Originally we chose an object-oriented database management system due to its data organization capabilities, platform independence, query performance and conceptual fit to the data. However, after we had used the object database for the first couple of years of the project, it began to fall short in terms of its query support and data mining performance. This was as much due to the inability of the database vendor to respond to our demands for features and bug fixes as it was due to their failure to keep up with the rapid improvements in hardware performance, particularly faster RAID disk systems. In the end, we were forced to abandon the object database and migrate our data to a relational database. We describe below the technical issues that we faced with the object database and how and why we migrated to relational technology.

    Managing Schema Change in an Heterogeneous Environment

    Change is inevitable even for persistent information. Effectively managing change of persistent information, which includes the specification, execution and maintenance of any derived information, is critical and must be addressed by all database systems. Today, for every data model there exists a well-defined set of change primitives that can alter both the structure (the schema) and the data. Several proposals also exist for incrementally propagating a primitive change to any derived information (or view). However, existing support is lacking in two ways. First, change primitives as presented in the literature are very limited in their capabilities, allowing users only to add or remove schema elements. More complex types of changes, such as the merging or splitting of schema elements, are not supported in a principled manner. Second, algorithms for maintaining derived information often do not account for the potential heterogeneity between the source and the target. The goal of this dissertation is to provide solutions that address these two key issues. The first part of this dissertation addresses the challenge of expressing a rich, complex set of changes. We propose the SERF (Schema Evolution through an Extensible, Re-usable and Flexible) framework, which allows users to perform a wide range of complex user-defined schema transformations. Our approach combines existing schema evolution primitives using OQL (Object Query Language) as the glue logic. Within the context of this work, we look at the different domains in which SERF can be applied, including web site management. To further enrich our framework, we also investigate the optimization and verification of SERF transformations. The second part of this dissertation addresses the problem of maintaining views in the face of source changes when the source and the view are not in the same data model.
With today's increasing heterogeneity in information structure, it is critical that view maintenance works across data model boundaries. However, view definitions that go across data models are limited to hard-coded algorithms, thereby making it difficult to develop general maintenance algorithms. We provide a two-step solution for this problem. We have developed a cross algebra that defines views without any restriction forcing the view and the source data models to be the same. We then define update propagation algorithms that can propagate changes from source to target irrespective of the exact translation and the data models. We validate our ideas by applying them to translation and change propagation between the XML and relational data models.

    Subsumption between queries to object-oriented databases

    Most work on query optimization in relational and object-oriented databases has concentrated on tuning algebraic expressions and the physical access to the database contents. The attention to semantic query optimization, however, has been restricted due to its inherent complexity. We take a second look at semantic query optimization in object-oriented databases and find that reasoning techniques for concept languages developed in Artificial Intelligence apply to this problem, because concept languages have been tailored for efficiency and their semantics is compatible with class and query definitions in object-oriented databases. We propose a query optimizer that recognizes subset relationships between a query and a view (a simpler query whose answer is stored) in polynomial time.
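
    The subset relationship mentioned in this abstract can be illustrated with a deliberately simplified sketch. It is not the concept-language reasoning the paper proposes: it assumes a query is only a class name plus a set of equality conditions, and that the class hierarchy is an acyclic single-inheritance map. The names `Query`, `is_subclass`, `subsumes` and `parent_of` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """A toy query: all objects of class `cls` meeting every condition."""
    cls: str
    conditions: frozenset  # set of (attribute, value) equality constraints


def is_subclass(cls, ancestor, parent_of):
    """Walk up the (assumed acyclic) hierarchy looking for `ancestor`."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = parent_of.get(cls)
    return False


def subsumes(view, query, parent_of):
    """Return True if every answer to `query` must also be an answer to `view`.

    Sufficient condition used here: the query ranges over the view's class
    or one of its subclasses, and the query imposes at least the view's
    conditions. The check is sound for this toy model but incomplete.
    """
    return (is_subclass(query.cls, view.cls, parent_of)
            and view.conditions <= query.conditions)


# Hypothetical hierarchy: Student is a subclass of Person.
parent_of = {"Student": "Person"}
view = Query("Person", frozenset({("city", "Paris")}))
query = Query("Student", frozenset({("city", "Paris"), ("year", 2)}))
```

    In this toy setting `subsumes(view, query, parent_of)` holds, so the stored answer of the view could be filtered instead of scanning the whole database. The check runs in time linear in the hierarchy depth plus the number of conditions.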

    Translating Relational Conceptual Schema to Object-Oriented Schema

    A multidatabase is a confederation of preexisting distributed, heterogeneous, and autonomous database systems. The integration process is essential in the effort of forming a distributed, heterogeneous database system. This process generally consists of two main phases: a conceptual schema translation phase followed by an integration phase. In our research, we have proposed an alternative translation approach to convert a relational database schema to an object-oriented database schema. The translation approach consists of a set of translation rules, which are based on inclusion dependencies, key attributes and types of attributes. A database schema translation tool prototype, called RETOO (Relational-to-Object-Oriented), is then developed based on the proposed translation approach. RETOO receives a relational database schema as input and generates an object-oriented database schema as output. RETOO operates semi-automatically, especially in the process of identifying operations for each class. This is because the relational data model does not provide behavioural information for every entity. The translation approach and the RETOO prototype not only maintain the semantics of the relational database schema, but also enhance the semantics of the translated object-oriented schema via object-oriented data modelling concepts.
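
    To make the idea of key-based translation rules concrete, here is a minimal sketch. It does not reproduce RETOO's actual rule set, which the abstract does not list. It assumes three commonly used rules: a primary key that is also a foreign key signals a subclass, any other foreign key becomes an object reference, and the remaining columns become plain attributes. The `Table`, `ClassDef` and `translate` names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Table:
    """Relational table metadata (hypothetical structure for illustration)."""
    name: str
    columns: dict[str, str]            # column name -> SQL type
    primary_key: tuple[str, ...]
    # foreign-key columns -> referenced table (an inclusion dependency)
    foreign_keys: dict[tuple[str, ...], str] = field(default_factory=dict)


@dataclass
class ClassDef:
    """Object-oriented class produced by the translation."""
    name: str
    attributes: dict[str, str] = field(default_factory=dict)
    references: dict[str, str] = field(default_factory=dict)  # attr -> class
    superclass: Optional[str] = None


def translate(tables: list[Table]) -> dict[str, ClassDef]:
    classes: dict[str, ClassDef] = {}
    for t in tables:
        cls = ClassDef(name=t.name)
        fk_columns: set[str] = set()
        for cols, target in t.foreign_keys.items():
            fk_columns.update(cols)
            if set(cols) == set(t.primary_key):
                # Assumed rule 1: the key is included in the target's key,
                # so the table is treated as a subclass of the target.
                cls.superclass = target
            else:
                # Assumed rule 2: any other foreign key is an object reference.
                cls.references["_".join(cols)] = target
        for col, sql_type in t.columns.items():
            # Assumed rule 3: columns outside any foreign key stay attributes.
            if col not in fk_columns:
                cls.attributes[col] = sql_type
        classes[t.name] = cls
    return classes


person = Table("Person", {"id": "INTEGER", "name": "VARCHAR"}, ("id",))
student = Table(
    "Student",
    {"id": "INTEGER", "advisor_id": "INTEGER", "year": "INTEGER"},
    ("id",),
    {("id",): "Person", ("advisor_id",): "Person"},
)
schema = translate([person, student])
```

    Here `Student` comes out as a subclass of `Person` with a reference `advisor_id` to `Person` and a single own attribute `year`. Class operations are not derived, which mirrors the abstract's point that behavioural information is absent from a relational schema and has to be supplied by a person.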

    Protein Structure Data Management System

    With advances in laboratory instruments and experimental techniques, the volume of protein data is growing explosively. How to store, retrieve and modify protein data efficiently is therefore becoming a challenging issue that most biological scientists have to face. Traditional data models such as the relational model lack support for complex data types, which is a serious problem for protein data applications. Hence many scientists have switched to object-oriented databases (OODBs), since the object-oriented nature of life science data matches the architecture of object-oriented databases well, but many problems still need to be solved in order to apply OODB methodologies to protein data management. One major problem is that general-purpose OODBs have no built-in data types for biological research and no built-in biological domain-specific operations. In this dissertation, we present an application system with built-in data types and built-in biological domain-specific operations that extends an OODB system by adding the domain-specific layers Protein-QL, Protein Algebra Architecture and Protein-OODB above the OODB to manage protein structure data. The system is composed of three parts: 1) a client API that provides easy usage for different users; 2) middleware, including Protein-QL, Protein Algebra Architecture and Protein-OODB, that implements a protein domain-specific query language and optimizes complex queries, and that encapsulates the implementation details so that users can easily understand and master Protein-QL; 3) data storage that holds the protein data. The system targets the protein domain, but it can be easily extended to other biological domains to build a bio-OODBMS.
In this system, protein, primary, secondary, and tertiary structures are defined as internal data types to simplify queries in Protein-QL, so that domain scientists can easily master the query language and formulate data requests, and EyeDB is used as the underlying OODB to communicate with Protein-OODB. In addition, protein data is usually stored in PDB format, which is old, ambiguous, and inadequate; PDB data curation is therefore discussed in detail in the dissertation.