
    Incorporating Domain-Specific Information Quality Constraints into Database Queries

    The range of information now available in queryable repositories opens up a host of possibilities for new and valuable forms of data analysis. Database query languages such as SQL and XQuery offer a concise and high-level means by which such analyses can be implemented, facilitating the extraction of relevant data subsets into either generic or bespoke data analysis environments. Unfortunately, the quality of data in these repositories is often highly variable. The data is still useful, but only if the consumer is aware of the data quality problems and can work around them. Standard query languages offer little support for this aspect of data management. In principle, however, it should be possible to embed constraints describing the consumer’s data quality requirements directly into the query, so that the query evaluator can take over responsibility for enforcing them during query processing. Most previous attempts to incorporate information quality constraints into database queries have been based on a small number of highly generic quality measures, which are defined and computed by the information provider. This is a useful approach in some application areas but, in practice, quality criteria are more commonly determined by the user of the information, not by the provider. In this paper, we explore an approach to incorporating quality constraints into databas

    Towards the management of information quality in proteomics

    We outline the application of a framework for managing information quality (IQ) in proteomics. The approach allows scientists to define the quality characteristics that are of importance in their particular domain by extending a generic ontology of IQ concepts. Two quality indicators are defined for proteomic experiments: hit ratio and mass coverage. We describe how our framework allows experiments marked up in a standard format (e.g. PEDRo) to be annotated with these computed indicators, and how the annotations can be viewed using a convenient plugin to the commonly used PEDRo data entry tool.
