
    SODA: Generating SQL for Business Users

    The purpose of data warehouses is to enable business analysts to make better decisions. Over the years the technology has matured and data warehouses have become extremely successful. As a consequence, more and more data has been added to the data warehouses and their schemas have become increasingly complex. These systems still work well for generating pre-canned reports. However, with their current complexity, they tend to be a poor match for non-tech-savvy business analysts who need answers to ad-hoc queries that were not anticipated. This paper describes the design, implementation, and experience of the SODA system (Search over DAta Warehouse). SODA bridges the gap between the business needs of analysts and the technical complexity of current data warehouses. SODA enables a Google-like search experience for data warehouses by taking keyword queries of business users and automatically generating executable SQL. The key idea is to use a graph pattern matching algorithm that uses the metadata model of the data warehouse. Our results with real data from a global player in the financial services industry show that SODA produces queries with high precision and recall, and makes it much easier for business users to interactively explore highly complex data warehouses. Comment: VLDB201

    Mutation Analysis of Relational Database Schemas

    The schema is the key artefact used to describe the structure of a relational database, specifying how data will be stored and the integrity constraints used to ensure it is valid. It is therefore surprising that to date little work has addressed the problem of schema testing, which aims to identify mistakes in the schema early in software development. Failure to do so may lead to critical faults, which may cause data loss or degradation of data quality, remaining undetected until later when they will prove much more costly to fix. This thesis explores how mutation analysis, a technique commonly used in software testing to evaluate test suite quality, can be applied to evaluate data generated to exercise the integrity constraints of a relational database schema. Injecting faults into the constraints, modelling both faults of omission and commission, enables the fault-finding capability of test suites generated by different techniques to be compared. This is essential to empirically evaluate further schema testing research, providing a means of assessing the effectiveness of proposed techniques. To mutate the integrity constraints of a schema, a collection of novel mutation operators is proposed and their implementation described. These allow an empirical evaluation of an existing data generation approach, demonstrating the effectiveness of the mutation analysis technique and identifying a configuration that killed 94% of mutants on average. Cost-effective algorithms for automatically removing equivalent mutants and other ineffective mutants are then proposed and evaluated, revealing a third of mutation scores to be mutation adequate and reducing time taken by an average of 7%. Finally, the execution cost problem is confronted, with a range of optimisation strategies being applied that consistently improve efficiency, reducing the time taken by several hours in the best case and by as much as 99% on average for one DBMS.

    06472 Abstracts Collection - XQuery Implementation Paradigms

    From 19.11.2006 to 22.11.2006, the Dagstuhl Seminar 06472 "XQuery Implementation Paradigms" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

    Emergent relational schemas for RDF
