19 research outputs found

    Migrating relational databases into object-based and XML databases

    Get PDF
    Rapid changes in information technology, the emergence of object-based and WWW applications, and the interest of organisations in securing benefits from new technologies have made information systems re-engineering in general and database migration in particular an active research area. In order to improve the functionality and performance of existing systems, the re-engineering process requires identifying and understanding all of the components of such systems. An underlying database is one of the most important component of information systems. A considerable body of data is stored in relational databases (RDBs), yet they have limitations to support complex structures and user-defined data types provided by relatively recent databases such as object-based and XML databases. Instead of throwing away the large amount of data stored in RDBs, it is more appropriate to enrich and convert such data to be used by new systems. Most researchers into the migration of RDBs into object-based/XML databases have concentrated on schema translation, accessing and publishing RDB data using newer technology, while few have paid attention to the conversion of data, and the preservation of data semantics, e.g., inheritance and integrity constraints. In addition, existing work does not appear to provide a solution for more than one target database. Thus, research on the migration of RDBs is not fully developed. We propose a solution that offers automatic migration of an RDB as a source into the recent database technologies as targets based on available standards such as ODMG 3.0, SQL4 and XML Schema. A canonical data model (CDM) is proposed to bridge the semantic gap between an RDB and the target databases. The CDM preserves and enhances the metadata of existing RDBs to fit in with the essential characteristics of the target databases. The adoption of standards is essential for increased portability, flexibility and constraints preservation. This thesis contributes a solution for migrating RDBs into object-based and XML databases. The solution takes an existing RDB as input, enriches its metadata representation with the required explicit semantics, and constructs an enhanced relational schema representation (RSR). Based on the RSR, a CDM is generated which is enriched with the RDB's constraints and data semantics that may not have been explicitly expressed in the RDB metadata. The CDM so obtained facilitates both schema translation and data conversion. We design sets of rules for translating the CDM into each of the three target schemas, and provide algorithms for converting RDB data into the target formats based on the CDM. A prototype of the solution has been implemented, which generates the three target databases. Experimental study has been conducted to evaluate the prototype. The experimental results show that the target schemas resulting from the prototype and those generated by existing manual mapping techniques were comparable. We have also shown that the source and target databases were equivalent, and demonstrated that the solution, conceptually and practically, is feasible, efficient and correct

    The mediated data integration (MeDInt) : An approach to the integration of database and legacy systems

    Get PDF
    The information required for decision making by executives in organizations is normally scattered across disparate data sources including databases and legacy systems. To gain a competitive advantage, it is extremely important for executives to be able to obtain one unique view of information in an accurate and timely manner. To do this, it is necessary to interoperate multiple data sources, which differ structurally and semantically. Particular problems occur when applying traditional integration approaches, for example, the global schema needs to be recreated when the component schema has been modified. This research investigates the following heterogeneities between heterogeneous data sources: Data Model Heterogeneities, Schematic Heterogeneities and Semantic Heterogeneities. The problems of existing integration approaches are reviewed and solved by introducing and designing a new integration approach to logically interoperate heterogeneous data sources and to resolve three previously classified heterogeneities. The research attempts to reduce the complexity of the integration process by maximising the degree of automation. Mediation and wrapping techniques are employed in this research. The Mediated Data Integration (MeDint) architecture has been introduced to integrate heterogeneous data sources. Three major elements, the MeDint Mediator, wrappers, and the Mediated Data Model (MDM) play important roles in the integration of heterogeneous data sources. The MeDint Mediator acts as an intermediate layer transforming queries to sub-queries, resolving conflicts, and consolidating conflict-resolved results. Wrappers serve as translators between the MeDint Mediator and data sources. Both the mediator and wrappers arc well-supported by MDM, a semantically-rich data model which can describe or represent heterogeneous data schematically and semantically. Some organisational information systems have been tested and evaluated using the MeDint architecture. The results have addressed all the research questions regarding the interoperability of heterogeneous data sources. In addition, the results also confirm that the Me Dint architecture is able to provide integration that is transparent to users and that the schema evolution does not affect the integration

    Migrating relational databases into object-based and XML databases

    Get PDF
    Rapid changes in information technology, the emergence of object-based and WWW applications, and the interest of organisations in securing benefits from new technologies have made information systems re-engineering in general and database migration in particular an active research area. In order to improve the functionality and performance of existing systems, the re-engineering process requires identifying and understanding all of the components of such systems. An underlying database is one of the most important component of information systems. A considerable body of data is stored in relational databases (RDBs), yet they have limitations to support complex structures and user-defined data types provided by relatively recent databases such as object-based and XML databases. Instead of throwing away the large amount of data stored in RDBs, it is more appropriate to enrich and convert such data to be used by new systems. Most researchers into the migration of RDBs into object-based/XML databases have concentrated on schema translation, accessing and publishing RDB data using newer technology, while few have paid attention to the conversion of data, and the preservation of data semantics, e.g., inheritance and integrity constraints. In addition, existing work does not appear to provide a solution for more than one target database. Thus, research on the migration of RDBs is not fully developed. We propose a solution that offers automatic migration of an RDB as a source into the recent database technologies as targets based on available standards such as ODMG 3.0, SQL4 and XML Schema. A canonical data model (CDM) is proposed to bridge the semantic gap between an RDB and the target databases. The CDM preserves and enhances the metadata of existing RDBs to fit in with the essential characteristics of the target databases. The adoption of standards is essential for increased portability, flexibility and constraints preservation. This thesis contributes a solution for migrating RDBs into object-based and XML databases. The solution takes an existing RDB as input, enriches its metadata representation with the required explicit semantics, and constructs an enhanced relational schema representation (RSR). Based on the RSR, a CDM is generated which is enriched with the RDB's constraints and data semantics that may not have been explicitly expressed in the RDB metadata. The CDM so obtained facilitates both schema translation and data conversion. We design sets of rules for translating the CDM into each of the three target schemas, and provide algorithms for converting RDB data into the target formats based on the CDM. A prototype of the solution has been implemented, which generates the three target databases. Experimental study has been conducted to evaluate the prototype. The experimental results show that the target schemas resulting from the prototype and those generated by existing manual mapping techniques were comparable. We have also shown that the source and target databases were equivalent, and demonstrated that the solution, conceptually and practically, is feasible, efficient and correct.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    Converting relational databases into object relational databases

    Get PDF
    This paper proposes an approach for migrating existing Relational DataBases (RDBs) into Object-Relational DataBases (ORDBs). The approach is superior to existing proposals as it can generate not only the target schema but also the data instances. The solution takes an existing RDB as input, enriches its metadata representation with required semantics, and generates an enhanced canonical data model, which captures essential characteristics of the target ORDB, and is suitable for migration. A prototype has been developed, which migrates successfully RDBs into ORDBs (Oracle 11g) based on the canonical model. The experimental results were very encouraging, demonstrating that the proposed approach is feasible, efficient and correct

    Semiautomatic generation of CORBA interfaces for databases in molecular biology

    Get PDF
    The amount and complexity of genome related data is growing quickly. This highly interrelated data is distributed at many different sites, stored in numerous different formats, and maintained by independent data providers. CORBA, the industry standard for distributed computing, offers the opportunity to make implementation differences and distribution transparent and thereby helps to combine disparate data sources and application programs. In this thesis, the different aspects of CORBA access to molecular biology data are examined in detail. The work is motivated by a concrete application for distributed genome maps. Then, the different design issues relevant to the implementation of CORBA access layers are surveyed and evaluated. The most important of these issues is the question of how to represent data in a CORBA environment using the interface definition language IDL. Different representations have different advantages and disadvantages and the best representation is highly application specific. It is therefore in general impossible to generate a CORBA wrapper automatically for a given database. On the other hand, coding a server for each application manually is tedious and error prone. Therefore, a method is presented for the semiautomatic generation of CORBA wrappers for relational databases. A declarative language is described, which is used to specify the mapping between relations and IDL constructs. Using a set of such mapping rules, a CORBA server can be generated automatically. Additionally, the declarative mapping language allows for the support of ad-hoc queries, which are based on the IDL definitions

    Melis: an incremental method for the lexical annotation of domain ontologies

    Get PDF
    In this paper, we present MELIS (Meaning Elicitation and Lexical Integration System), a method and a software tool for enabling an incremental process of automatic annotation of local schemas (e.g. relational database schemas, directory trees) with lexical information. The distinguishing and original feature of MELIS is the incremental process: the higher the number of schemas which are processed, the more background/domain knowledge is cumulated in the system (a portion of domain ontology is learned at every step), the better the performance of the systems on annotating new schemas.MELIS has been tested as component of MOMIS-Ontology Builder, a framework able to create a domain ontology representing a set of selected data sources, described with a standard W3C language wherein concepts and attributes are annotated according to the lexical reference database.We describe the MELIS component within the MOMIS-Ontology Builder framework and provide some experimental results of ME LIS as a standalone tool and as a component integrated in MOMIS

    The Integration of Product Data with Workflow Management Systems Through a Common Data Model

    Get PDF
    Traditionally product models, and their definitions, have been handled separately from process models and their definitions. In industry, each has been managed by database systems defined for their specific domain, e.g. Product Data Management (PDM) for product definitions and Workflow Management System (WfM) for process definitions. There is little or no overlap between these two views of systems even though product and process information interact over the complete life cycle from design to production..

    Managing complex taxonomic data in an object-oriented database.

    Get PDF
    This thesis addresses the problem of multiple overlapping classifications in object-oriented databases through the example of plant taxonomy. These multiple overlapping classifications are independent simple classifications that share information (nodes and leaves), therefore overlap. Plant taxonomy was chosen as the motivational application domain because taxonomic classifications are especially complex and have changed over long periods of time, therefore overlap in a significant manner. This work extracts basic requirements for the support of multiple overlapping classifications in general, and in the context of plant taxonomy in particular. These requirements form the basis on which a prototype is defmed and built. The prototype, an extended object-oriented database, is extended from an object-oriented model based on ODMG through the provision of a relationship management mechanism. These relationships form the main feature used to build classifications. This emphasis on relationships allows the description of classifications orthogonal to the classified data (for reuse and integration of the mechanism with existing databases and for classification of non co-operating data), and allows an easier and more powerful management of semantic data (both within and without a classification). Additional mechanisms such as integrity constraints are investigated and implemented. Finally, the implementation of the prototype is presented and is evaluated, from the point of view of both usability and expressiveness (using plant taxonomy as an application), and its performance as a database system. This evaluation shows that the prototype meets the needs of taxonomists
    corecore