3,126 research outputs found

    Supporting Complex Scientific Database Schemas in a Grid Middleware

    Get PDF
    “This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder." “Copyright IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.” DOI: 10.1109/AINA.2009.129The volume of digital scientific data has increased considerably with advancing technologies of computing devices and scientific instruments. We are exploring the use of emerging Grid technologies for the management and manipulation of very large distributed scientific datasets. Taking as an example a terabyte-size scientific database with complex database schema, this paper focuses on the potential of a well-known Grid middleware - OGSA-DQP - for distributing such datasets. In particular, we investigate and extend the data type support in this system to handle a complex schema of a real scientific database - the Sloan Digital Sky Survey database

    Transforming N-ary relationships to database schemas: an old and forgotten problem

    Get PDF
    The N-ary relationships, have been traditionally a source of confusion and still are. One important source of confusion is that the term cardinality in a relationship has several interpretations, two of them being very popular. But none of the two approaches, nor the two together, allow us to express all the possible cardinality patterns. The transformations from all the possible relationships to database schemas have never been described by the existing literature. Using the 14 ternary patterns as example, we discuss these transformations particularly the transformations from the patterns ignored in the literature.Postprint (published version

    Virtual Mutation Analysis of Relational Database Schemas

    Get PDF
    Relational databases are a vital component of many modern soft- ware applications. Key to the definition of the database schema — which specifies what types of data will be stored in the database and the structure in which the data is to be organized — are integrity constraints. Integrity constraints are conditions that protect and preserve the consistency and validity of data in the database, preventing data values that violate their rules from being admitted into database tables. They encode logic about the application concerned, and like any other component of a software application, need to be properly tested. Mutation analysis is a technique that has been successfully applied to integrity constraint testing, seeding database schema faults of both omission and commission. Yet, as for traditional mutation analysis for program testing, it is costly to perform, since the test suite under analysis needs to be run against each individual mutant to establish whether or not it exposes the fault. One overhead incurred by database schema mutation is the cost of communicating with the database management system (DBMS). In this paper, we seek to eliminate this cost by performing mutation analysis virtually on a local model of the DBMS, rather than on an actual, running instance hosting a real database. We present an empirical evaluation of our virtual technique revealing that, across all of the studied DBMSs and schemas, the virtual method yields an average time saving of 51% over the baseline

    Automated Software Testing of Relational Database Schemas

    Get PDF
    Relational databases are critical for many software systems, holding the most valuable data for organisations. Data engineers build relational databases using schemas to specify the structure of the data within a database and defining integrity constraints. These constraints protect the data's consistency and coherency, leading industry experts to recommend testing them. Since manual schema testing is labour-intensive and error-prone, automated techniques enable the generation of test data. Although these generators are well-established and effective, they use default values and often produce many, long, and similar tests --- this results in decreasing fault detection and increasing regression testing time and testers inspection efforts. It raises the following questions: How effective is the optimised random generator at generating tests and its fault detection compared to prior methods? What factors make tests understandable for testers? How to reduce tests while maintaining effectiveness? How effectively do testers inspect differently reduced tests? To answer these questions, the first contribution of this thesis is to evaluate a new optimised random generator against well-established methods empirically. Secondly, identifying understandability factors of schema tests using a human study. Thirdly, evaluating a novel approach that reduces and merge tests against traditional reduction methods. Finally, studying testers' inspection efforts with differently reduced tests using a human study. The results show that the optimised random method efficiently generates effective tests compared to well-established methods. Testers reported that many NULLs and negative numbers are confusing, and they prefer simple repetition of unimportant values and readable strings. The reduction technique with merging is the most effective at minimising tests and producing efficient tests while maintaining effectiveness compared to traditional methods. The merged tests showed an increase in inspection efficiency with a slight accuracy decrease compared to only reduced tests. Therefore, these techniques and investigations can help practitioners adopt these generators in practice

    Classifier System Learning of Good Database Schema

    Get PDF
    This thesis presents an implementation of a learning classifier system which learns good database schema. The system is implemented in Java using the NetBeans development environment, which provides a good control for the GUI components. The system contains four components: a user interface, a rule and message system, an apportionment of credit system, and genetic algorithms. The input of the system is a set of simple database schemas and the objective for the classifier system is to keep the good database schemas which are represented by classifiers. The learning classifier system is given some basic knowledge about database concepts or rules. The result showed that the system could decrease the bad schemas and keep the good ones
    corecore