4 research outputs found

    Design and Implementation of a Peer-to-Peer Data Quality Broker

    Data quality is becoming an increasingly important issue in environments characterized by extensive data replication. Among such environments, this paper focuses on Cooperative Information Systems (CISs), for which it is very important to declare and access the quality of data. Indeed, a system in the CIS will not readily exchange data with another system without knowledge of its quality, and cooperation becomes difficult without data exchanges. Also, when poor-quality data are exchanged, there is a progressive deterioration of the quality of data stored in the whole CIS. In this paper, we describe the detailed design and implementation of a peer-to-peer service for exchanging and improving data quality in CISs. Such a service provides access to data and the related quality information distributed in the CIS, and improves the quality of data by comparing different copies of the same data. Experiments on real data show the effectiveness of the service and its performance behavior.

    Incorporating Domain-Specific Information Quality Constraints into Database Queries

    The range of information now available in queryable repositories opens up a host of possibilities for new and valuable forms of data analysis. Database query languages such as SQL and XQuery offer a concise and high-level means by which such analyses can be implemented, facilitating the extraction of relevant data subsets into either generic or bespoke data analysis environments. Unfortunately, the quality of data in these repositories is often highly variable. The data is still useful, but only if the consumer is aware of the data quality problems and can work around them. Standard query languages offer little support for this aspect of data management. In principle, however, it should be possible to embed constraints describing the consumer’s data quality requirements into the query directly, so that the query evaluator can take over responsibility for enforcing them during query processing. Most previous attempts to incorporate information quality constraints into database queries have been based around a small number of highly generic quality measures, which are defined and computed by the information provider. This is a useful approach in some application areas but, in practice, quality criteria are more commonly determined by the user of the information, not by the provider. In this paper, we explore an approach to incorporating quality constraints into databas