33 research outputs found

    Fairness in Data Wrangling

    The VADA Architecture for Cost-Effective Data Wrangling

    Data wrangling, the multi-faceted process by which the data required by an application is identified, extracted, cleaned and integrated, is often cumbersome and labor intensive. In this paper, we present an architecture that supports a complete data wrangling lifecycle, orchestrates components dynamically, builds on automation wherever possible, is informed by whatever data is available, refines automatically produced results in the light of feedback, takes into account the user’s priorities, and supports data scientists with diverse skill sets. The architecture is demonstrated in practice for wrangling property sales and open government data.

    Spatio-Temporal Databases: Contentions, Components and Consolidation

    Spatio-temporal databases have been the focus of considerable research activity over a significant period. However, there are as yet very few prototypes of complete systems, let alone products, that provide effective support for applications tracking changes to spatial and aspatial data over time. We contend that this is because much of the activity in spatio-temporal databases has focused on specific parts of the problem, at the expense of a more holistic view of database systems design and development. It is probably also the case that the database research community has been inclined to undervalue integration or consolidation activities. This paper outlines some contentions relating to spatio-temporal databases, with a view to pruning the space of possible paths that consolidation activities might follow. Suggestions are also made as to what areas are most likely to present challenges to a consolidation activity, in the light of a model architecture for a spatio-temporal database.

    MOVIE: An incremental maintenance system for materialized object views

    View materialization is an important technique for high performance query processing, data integration and replication. Solutions to the problem of incrementally maintaining materialized views are very relevant. So far, most work on this problem has been confined to relational settings and solutions have not been comprehensively evaluated. This paper describes MOVIE, a complete, implemented and evaluated solution to the problem of incrementally maintaining materialized OQL views in ODMG-compliant object databases. The evaluation sheds light on how the effectiveness of incremental maintenance is affected by issues such as database size, and the complexity and selectivity of views.

    Extending ROCK & ROLL with Spatial Data Types: Part 1

    The ROCK & ROLL deductive object-oriented database system has been used to develop applications that involve the querying and manipulation of spatial data. The approach to the development of these applications has hitherto required that a suitable set of spatial data types is defined and handed over to applications as a class library for reuse. While this approach is functionally adequate, it leaves open the way to potential inconsistencies in the treatment of geometries and to computational inefficiency, especially due to the absence of built-in spatial-indexing facilities and spatial-query optimization. Part 1 of this paper describes the embedding of a spatial algebra into the imperative language of ROCK & ROLL. This provides users with geometrically consistent, computationally efficient built-in support for spatial data types and operations, thereby lessening the burden associated with the development of complete spatial data handling applications in ROCK & ROLL. In Part 2, the same..