45 research outputs found

    Migrating relational databases into object-based and XML databases

    Rapid changes in information technology, the emergence of object-based and WWW applications, and the interest of organisations in securing benefits from new technologies have made information systems re-engineering in general, and database migration in particular, an active research area. In order to improve the functionality and performance of existing systems, the re-engineering process requires identifying and understanding all of the components of such systems. The underlying database is one of the most important components of an information system. A considerable body of data is stored in relational databases (RDBs), yet they are limited in their support for the complex structures and user-defined data types provided by relatively recent databases such as object-based and XML databases. Instead of throwing away the large amount of data stored in RDBs, it is more appropriate to enrich and convert such data to be used by new systems. Most research into the migration of RDBs into object-based/XML databases has concentrated on schema translation and on accessing and publishing RDB data using newer technology, while little attention has been paid to the conversion of data and the preservation of data semantics, e.g., inheritance and integrity constraints. In addition, existing work does not appear to provide a solution for more than one target database. Thus, research on the migration of RDBs is not fully developed. We propose a solution that offers automatic migration of an RDB as a source into the recent database technologies as targets based on available standards such as ODMG 3.0, SQL4 and XML Schema. A canonical data model (CDM) is proposed to bridge the semantic gap between an RDB and the target databases. The CDM preserves and enhances the metadata of existing RDBs to fit in with the essential characteristics of the target databases. The adoption of standards is essential for increased portability, flexibility and constraints preservation.
This thesis contributes a solution for migrating RDBs into object-based and XML databases. The solution takes an existing RDB as input, enriches its metadata representation with the required explicit semantics, and constructs an enhanced relational schema representation (RSR). Based on the RSR, a CDM is generated which is enriched with the RDB's constraints and data semantics that may not have been explicitly expressed in the RDB metadata. The CDM so obtained facilitates both schema translation and data conversion. We design sets of rules for translating the CDM into each of the three target schemas, and provide algorithms for converting RDB data into the target formats based on the CDM. A prototype of the solution has been implemented, which generates the three target databases. An experimental study has been conducted to evaluate the prototype. The experimental results show that the target schemas resulting from the prototype and those generated by existing manual mapping techniques were comparable. We have also shown that the source and target databases were equivalent, and demonstrated that the solution, conceptually and practically, is feasible, efficient and correct.

    An object query language for multimedia federations

    The Fischlar system provides a large centralised repository of multimedia files. As expansion is difficult in centralised systems and as different user groups have a requirement to define their own schemas, the EGTV (Efficient Global Transactions for Video) project was established to examine how the distribution of this database could be managed. The federated database approach is advocated, where a global schema is designed in a top-down approach, while all multimedia and textual data is stored in object-oriented (O-O) and object-relational (O-R) compliant databases. This thesis investigates queries and updates on large multimedia collections organised in the database federation. The goal of this research is to provide a generic query language capable of interrogating global and local multimedia database schemas. Therefore, a new query language, EQL, is defined to facilitate the querying of object-oriented and object-relational database schemas in a database- and platform-independent manner, and acts as a canonical language for database federations. A new canonical language was required as the existing query language standards (SQL:1999 and OQL) are generally incompatible and translation between them is not trivial. EQL is supported with a formally defined object algebra and specified semantics for query evaluation. The ability to capture and store metadata of multiple database schemas is essential when constructing and querying a federated schema. Therefore we also present a new platform-independent metamodel for specifying multimedia schemas stored in both object-oriented and object-relational databases. This metadata information is later used for the construction of global schemas, and during the evaluation of local and global queries. Another important feature of any federated system is the ability to unambiguously define database schemas.
The schema definition language for an EGTV database federation must be capable of specifying both object-oriented and object-relational schemas in a database-independent format. As XML represents a standard for encoding and distributing data across various platforms, a language based upon XML has been developed as a part of our research. The ODLx (Object Definition Language XML) language specifies a set of XML-based structures for defining complex database schemas capable of representing different multimedia types. The language is fully integrated with the EGTV metamodel, through which ODLx schemas can be mapped to O-O and O-R databases.

    MOMIS: Exploiting agents to support information integration

    The information overload caused by the large amount of data spread over the Internet must be addressed in an appropriate way. The dynamism and uncertainty of the Internet, along with the heterogeneity of the sources of information, are the two main challenges for today's technologies related to information management. In the area of information integration, this paper proposes an approach based on mobile software agents integrated in the MOMIS (Mediator envirOnment for Multiple Information Sources) infrastructure, which enables semi-automatic integration and querying of multiple, heterogeneous information sources (relational, object, XML and semi-structured sources). The exploitation of mobile agents in MOMIS can significantly increase the flexibility of the system. In fact, their characteristics of autonomy and adaptability suit distributed and open environments, such as the Internet, well. The aim of this paper is to show the advantages of introducing intelligent and mobile software agents into the MOMIS infrastructure for the autonomous management and coordination of integration and query processing over heterogeneous data sources.

    The exploration of a category theory-based virtual Geometrical product specification system for design and manufacturing

    In order to ensure quality of products and to facilitate global outsourcing, almost all the so-called “world-class” manufacturing companies nowadays are applying various tools and methods to maintain the consistency of a product’s characteristics throughout its manufacturing life cycle. Among these, for ensuring the consistency of the geometric characteristics, a tolerancing language, the Geometrical Product Specification (GPS), has been widely adopted to precisely transform the functional requirements from customers into manufactured workpieces expressed as tolerance notes in technical drawings. Although commonly acknowledged by industrial users as one of the most successful efforts in integrating existing manufacturing life-cycle standards, current GPS implementations and software packages suffer from several drawbacks in their practical use, possibly the most significant being the difficulty of inferring the data for the “best” solutions. The problem stemmed from the foundation of data structures and knowledge-based system design. This indicates that there needs to be a “new” software system to facilitate GPS applications. The presented thesis introduced an innovative knowledge-based system − the VirtualGPS − that provides an integrated GPS knowledge platform based on a stable and efficient database structure with knowledge generation and accessing facilities. The system focuses on solving the intrinsic product design and production problems by acting as a virtual domain expert, translating GPS standards and rules into the form of computerized expert advice and warnings. Furthermore, this system can be used as a training tool for young and new engineers to understand the huge amount of GPS standards in a relatively “quicker” manner. The thesis started with a detailed discussion of the proposed categorical modelling mechanism, which has been devised based on Category Theory.
It provided a unified mechanism for knowledge acquisition and representation, knowledge-based system design, and database schema modelling. As a core part for assessing this knowledge-based system, the implementation of the categorical Database Management System (DBMS) is also presented in this thesis. The focus then moved on to demonstrate the design and implementation of the proposed VirtualGPS system. The tests and evaluations of this system were illustrated in Chapter 6. Finally, the thesis summarized the contributions to knowledge in Chapter 7. After thoroughly reviewing the project, the conclusion reached is that the entire VirtualGPS system was designed and implemented to conform to Category Theory and object-oriented programming rules. The initial tests and performance analyses show that the system facilitates geometric product manufacturing operations and benefits manufacturers and engineers alike, from functional design to manufacturing and verification.

    The mediated data integration (MeDInt): An approach to the integration of database and legacy systems

    The information required for decision making by executives in organizations is normally scattered across disparate data sources including databases and legacy systems. To gain a competitive advantage, it is extremely important for executives to be able to obtain one unique view of information in an accurate and timely manner. To do this, it is necessary to interoperate multiple data sources, which differ structurally and semantically. Particular problems occur when applying traditional integration approaches; for example, the global schema needs to be recreated when a component schema has been modified. This research investigates the following heterogeneities between data sources: Data Model Heterogeneities, Schematic Heterogeneities and Semantic Heterogeneities. The problems of existing integration approaches are reviewed and solved by introducing and designing a new integration approach to logically interoperate heterogeneous data sources and to resolve the three previously classified heterogeneities. The research attempts to reduce the complexity of the integration process by maximising the degree of automation. Mediation and wrapping techniques are employed in this research. The Mediated Data Integration (MeDInt) architecture has been introduced to integrate heterogeneous data sources. Three major elements, the MeDInt Mediator, wrappers, and the Mediated Data Model (MDM), play important roles in the integration of heterogeneous data sources. The MeDInt Mediator acts as an intermediate layer transforming queries to sub-queries, resolving conflicts, and consolidating conflict-resolved results. Wrappers serve as translators between the MeDInt Mediator and data sources. Both the mediator and wrappers are well supported by MDM, a semantically rich data model which can describe or represent heterogeneous data schematically and semantically. Some organisational information systems have been tested and evaluated using the MeDInt architecture.
The results have addressed all the research questions regarding the interoperability of heterogeneous data sources. In addition, the results also confirm that the MeDInt architecture is able to provide integration that is transparent to users and that schema evolution does not affect the integration.

    K2/Kleisli and GUS: Experiments in Integrated Access to Genomic Data Sources

    The integration of heterogeneous data sources and software systems is a major issue in the biomedical community and several approaches have been explored: linking databases, on-the-fly integration through views, and integration through warehousing. In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: an integration system called K2, which has primarily been used to provide views over multiple external data sources and software systems; and a data warehouse called GUS, which downloads, cleans, integrates and annotates data from multiple external data sources. Although the view and warehouse approaches each have their advantages, there is no clear winner. Therefore, users must consider how the data is to be used, what the performance guarantees must be, and how much programmer time and expertise is available in order to choose the best strategy for a particular application.

    APPLICATIONS OF GRAPH THEORY FOR REUSE OF MODEL BASED SYSTEMS ENGINEERING DESIGN DATA

    This dissertation contributes to systems engineering (SE) by introducing and demonstrating a novel graph-based design repository (GBDR) tool. GBDR enables engineers to leverage system design information from a heterogeneous set of system models created using multiple model-based systems engineering (MBSE) software tools as an integrated body of knowledge. Specifically, the research provides a set of approaches that allow the use of system models described in Systems Modeling Language and Lifecycle Modeling Language as an integrated body of design information. The coalesced body of system design information serves to support concept ideation and analysis within SE. The research accomplishes this by using a graph database to store system model information imported from digital artifacts created by MBSE tools and applying principles from graph theory and semantic web technologies to identify likely connections and equivalent concepts across system models, modeling languages, and metamodels. The research demonstrates that the presented tool can import, store, synthesize, search, display, distribute, and export information from multiple MBSE tools. As a practical demonstration, feasible subsystem design alternatives for a small unmanned aircraft system government reference architecture are identified from within a set of existing system models. OSD CAPE. Civilian, Office of the Secretary of Defense. Approved for public release. Distribution is unlimited.