
    DMN for Data Quality Measurement and Assessment

    Data quality assessment is aimed at evaluating the suitability of a dataset for an intended task. The extensive literature on data quality describes various methodologies for assessing data quality by means of data profiling techniques applied to whole datasets. Our investigation aims to address the need to automatically assess the quality level of the individual records of a dataset, where data profiling tools do not provide an adequate level of information. Since it is often easier to describe when a record has sufficient quality than to calculate a qualitative indicator, we propose a semi-automatic, business-rule-guided data quality assessment methodology for every record. This involves first listing the business rules that describe the data (data requirements), then those describing how to produce measures (business rules for data quality measurements), and finally those defining how to assess the level of data quality of a dataset (business rules for data quality assessment). The main contribution of this paper is the adoption of the OMG standard DMN (Decision Model and Notation) to support the description of data quality requirements and their automatic assessment using existing DMN engines. Funding: Ministerio de Ciencia y Tecnología RTI2018-094283-B-C33; Ministerio de Ciencia y Tecnología RTI2018-094283-B-C31; European Regional Development Fund SBPLY/17/180501/00029.
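
    To make the rule-guided, per-record assessment concrete, the following is a minimal Python sketch (not the paper's DMN model or a DMN engine; the rules, thresholds, and sample record are hypothetical) that mimics a DMN-style decision table: measurement rules map a record to pass/fail results, and an assessment rule aggregates them into a verdict.

```python
# Hypothetical sketch of business-rule-guided, per-record data quality assessment,
# loosely mimicking a DMN decision table. Rules, thresholds, and the sample
# record are illustrative assumptions, not the paper's actual model.

# Business rules for data quality measurement: each returns True if the record
# satisfies the corresponding data requirement.
MEASUREMENT_RULES = {
    "email_present": lambda r: bool(r.get("email")),
    "age_in_range": lambda r: r.get("age") is not None and 0 <= r["age"] <= 120,
    "country_known": lambda r: r.get("country") in {"ES", "IT", "DE", "FR"},
}

def measure(record):
    """Apply every measurement rule to one record (business rules for measurement)."""
    return {name: rule(record) for name, rule in MEASUREMENT_RULES.items()}

def assess(measures, min_passed=2):
    """Business rule for assessment: the record is 'acceptable' if enough rules pass."""
    passed = sum(measures.values())
    return "acceptable" if passed >= min_passed else "rejected"

record = {"email": "ana@example.org", "age": 34, "country": "ES"}
m = measure(record)
print(m, "->", assess(m))  # all three rules pass -> acceptable
```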

    Open Data Quality Measurement Framework: Definition and Application to Open Government Data

    The diffusion of Open Government Data (OGD) has kept a very fast pace in recent years. However, evidence from practitioners shows that disclosing data without proper quality control may jeopardize dataset reuse and negatively affect civic participation. Current approaches to the problem in the literature lack a comprehensive theoretical framework. Moreover, most evaluations concentrate on open data platforms rather than on datasets. In this work, we address these two limitations and set up a framework of indicators to measure the quality of Open Government Data on a series of data quality dimensions at the most granular level of measurement. We validated the evaluation framework by applying it to compare two cases of Italian OGD datasets: an internationally recognized good example of OGD, with centralized disclosure and extensive data quality controls, and samples of OGD from decentralized data disclosure (municipality level), with no possibility of extensive quality controls as in the former case, and hence with presumably lower quality. Starting from measurements based on the quality framework, we were able to verify the difference in quality: the measures showed a few common good practices and weaknesses, and a set of discriminating factors that pertain to the type of dataset and the overall approach. On the basis of this evaluation, we also provided technical and policy guidelines to overcome the weaknesses observed in the decentralized release policy, addressing specific quality aspects.
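
    As an illustration of what dimension-level indicators computed at the dataset (rather than platform) level might look like, here is a small sketch; the dimensions, column names, thresholds, and sample rows are invented for illustration and are not the indicators defined in the paper.

```python
from datetime import date

# Hypothetical indicators for two data quality dimensions of an open dataset.
# The columns, dimensions, and formulas are illustrative assumptions only.
rows = [
    {"name": "Park A", "address": "Via Roma 1",   "updated": date(2023, 5, 1)},
    {"name": "Park B", "address": None,           "updated": date(2019, 1, 10)},
    {"name": "Park C", "address": "Corso Italia", "updated": None},
]

def completeness(rows, fields):
    """Share of non-missing values over all required cells."""
    cells = [r.get(f) for r in rows for f in fields]
    return sum(v is not None for v in cells) / len(cells)

def timeliness(rows, field="updated", max_age_days=365, today=date(2024, 1, 1)):
    """Share of rows updated within the last max_age_days."""
    fresh = [r[field] is not None and (today - r[field]).days <= max_age_days for r in rows]
    return sum(fresh) / len(rows)

print(f"completeness = {completeness(rows, ['name', 'address', 'updated']):.2f}")  # 0.78
print(f"timeliness   = {timeliness(rows):.2f}")                                    # 0.33
```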

    Data Quality Measurement Based on Domain-Specific Information

    Over the past decades, the topic of data quality has become extremely important in various application fields. Originally developed for data warehouses, it received a strong push from the big data concept and artificial intelligence systems. In this chapter, we look at traditional data quality dimensions, which are mainly technical in nature. However, we concentrate mostly on the idea of defining a single data quality determinant, which does not replace the dimensions but allows us to look at data quality from the point of view of users and particular applications. We consider this approach, known as a fit-to-use indicator, in two domains. The first is test data for complex multi-component software systems, using a stock exchange as an example. The second is scientific research, using the validation of handwriting psychology as an example. We demonstrate how the fit-to-use determinant of data quality can be defined and formalized, and what benefit it can bring to the improvement of data quality.
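
    The idea of a single, application-dependent determinant built on top of the classic dimensions can be sketched as a weighted aggregation; the dimension scores, weights, and threshold below are illustrative assumptions, not the formalization given in the chapter.

```python
# Hypothetical sketch of a single "fit-to-use" determinant built on top of the
# classic dimensions. Dimension scores, weights, and the threshold are
# illustrative assumptions, not the chapter's formalization.
dimension_scores = {"completeness": 0.92, "accuracy": 0.85, "timeliness": 0.60}

# Each application weighs the dimensions differently; the weights encode what
# "fit for this use" means for that application.
app_weights = {
    "stock_exchange_test_data": {"completeness": 0.2, "accuracy": 0.3, "timeliness": 0.5},
    "handwriting_study":        {"completeness": 0.5, "accuracy": 0.4, "timeliness": 0.1},
}

def fit_to_use(scores, weights, threshold=0.75):
    """Weighted average of dimension scores, mapped to a fit/unfit verdict."""
    value = sum(scores[d] * w for d, w in weights.items())
    return value, value >= threshold

for app, weights in app_weights.items():
    value, fit = fit_to_use(dimension_scores, weights)
    print(f"{app}: {value:.2f} -> {'fit' if fit else 'not fit'} for use")
```

    The same dimension scores yield different verdicts per application (0.74, not fit, for the latency-sensitive test data; 0.86, fit, for the study data), which is the point of a user- and application-centered determinant.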

    Data quality management and evolution of information systems

    Information systems have been rapidly evolving from monolithic/transactional to network/service-based systems. The issue of data quality is becoming increasingly important, since information in new information systems is ubiquitous, diverse, and uncontrolled. In this paper we examine data quality from the point of view of the dimensions and methodologies proposed for data quality measurement and improvement. Dimensions and methodologies are examined in their relationship with the different types of data, from structured to unstructured, with the evolution of information systems, and with the diverse application areas. The past and the future of information systems: 1976-2006 and beyond. Red de Universidades con Carreras en Informática (RedUNCI).
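
    As a small illustration of how the measurement of one dimension depends on the type of data, the sketch below computes completeness for a structured record and a crude completeness proxy for an unstructured text; the fields, keywords, and sample values are hypothetical, not taken from the paper.

```python
# Hypothetical illustration of one dimension (completeness) measured differently
# for structured and unstructured data. Field lists and keywords are assumptions.

def completeness_structured(record, required_fields):
    """Structured data: share of required fields that are present and non-empty."""
    return sum(bool(record.get(f)) for f in required_fields) / len(required_fields)

def completeness_unstructured(text, required_topics):
    """Unstructured data: crude proxy -- share of expected topics mentioned in the text."""
    lowered = text.lower()
    return sum(topic in lowered for topic in required_topics) / len(required_topics)

customer = {"name": "ACME", "vat": None, "address": "Main St. 1"}
report = "The incident affected billing and shipping; no impact on inventory."

print(completeness_structured(customer, ["name", "vat", "address"]))                       # 0.67
print(completeness_unstructured(report, ["billing", "shipping", "inventory", "payroll"]))  # 0.75
```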

    Taxonomy for trade-off problem in distributed telemedicine systems

    The increasing amount of data and the demand for new features are challenging for Web systems. To serve these requirements, distributed systems are gaining ground; however, they bring problems along. These issues are already present in telemedicine. Since telemedicine is a broad discipline, various phenomena have different effects on data. Availability and consistency both play important roles in telemedicine, but, as the CAP and PACELC theorems formulate this trade-off, no one can guarantee both capabilities simultaneously. Our research aims to get a closer view of the problem by considering real-world telemedicine use cases, and to offer an easily tunable system with a taxonomy that helps in designing telemedicine systems. Model checking verifies, and data quality measurement proves, the correctness of our model. In the course of the measurements, we revealed an interesting occurrence and consequence that we call hypothetical-zero-latency.
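
    The availability/consistency tuning addressed by CAP and PACELC is often exposed through quorum parameters in replicated data stores. The sketch below is a simplified illustration of that trade-off (replica latencies and quorum settings are made-up numbers, not the paper's model): larger read/write quorums give strong consistency at the cost of waiting for slower replicas.

```python
# Hypothetical quorum-tuning sketch in the spirit of the PACELC trade-off:
# stronger consistency (larger read/write quorums) costs latency, while weaker
# quorums answer faster but may return stale data. Numbers are made up.

def is_strongly_consistent(n, r, w):
    """Read and write quorums overlap iff R + W > N."""
    return r + w > n

def expected_read_latency(replica_latencies_ms, r):
    """A quorum read waits for the r-th fastest replica to respond."""
    return sorted(replica_latencies_ms)[r - 1]

replicas_ms = [5, 40, 120]  # e.g. local node, same-region node, remote node
n = len(replicas_ms)

for r, w in [(1, 1), (2, 2), (3, 1)]:
    mode = "consistent" if is_strongly_consistent(n, r, w) else "eventually consistent"
    print(f"N={n} R={r} W={w}: {mode}, read waits ~{expected_read_latency(replicas_ms, r)} ms")
```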

    Toward a framework for data quality in cloud-based health information system

    Cloud computing is a promising platform for health information systems, as it can reduce costs and improve accessibility. Cloud computing represents a shift away from computing being purchased as a product towards being a service delivered over the Internet to customers. The cloud computing paradigm is becoming a popular IT infrastructure for facilitating Electronic Health Record (EHR) integration and sharing. An EHR is defined as a repository of patient data in digital form. This record is stored and exchanged securely and is accessible to different levels of authorized users. Its key purpose is to support the continuity of care and to allow the exchange and integration of medical information for a patient. However, this cannot be achieved without ensuring the quality of the data populated in healthcare clouds, as data quality can have a great impact on the overall effectiveness of any system. Assuring the quality of the data used in healthcare systems is a pressing need to support the continuity and quality of care. Identifying data quality dimensions in healthcare clouds is a challenging issue, as the data quality of cloud-based health information systems raises issues such as appropriateness of use and provenance. Some research has proposed frameworks of data quality dimensions without taking into consideration the nature of cloud-based healthcare systems. In this paper, we propose an initial framework of data quality attributes. This framework reflects the main elements of cloud-based healthcare systems and the functionality of the EHR.
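
    As one way such attributes could be operationalized, the sketch below attaches simple per-record checks for completeness, provenance, and timeliness to an EHR-like record; the attribute set, record fields, and thresholds are hypothetical and are not the framework proposed in the paper.

```python
from datetime import datetime, timezone

# Hypothetical operationalization of a few data quality attributes for a
# cloud-hosted EHR record. Attribute names, record fields, and checks are
# illustrative assumptions, not the framework proposed in the paper.
ehr_record = {
    "patient_id": "P-1029",
    "diagnosis": "I10",
    "source_system": "clinic-A-emr",  # provenance: which system produced the data
    "recorded_at": datetime(2024, 3, 2, tzinfo=timezone.utc),
}

QUALITY_CHECKS = {
    "completeness": lambda r: all(r.get(f) for f in ("patient_id", "diagnosis")),
    "provenance":   lambda r: bool(r.get("source_system")),
    "timeliness":   lambda r: r.get("recorded_at") is not None
                    and (datetime(2024, 6, 1, tzinfo=timezone.utc) - r["recorded_at"]).days <= 180,
}

results = {name: check(ehr_record) for name, check in QUALITY_CHECKS.items()}
print(results)  # {'completeness': True, 'provenance': True, 'timeliness': True}
```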

    Testing Patterns for Syphilis and Other Sexually Transmitted Infections in Pregnant Women Presenting to Emergency Departments

    Following an initial decrease in the incidence of congenital syphilis from 2008-2012, the rate of congenital syphilis rose by 38% across the United States between 2012-2014 (2). This trend followed a 22% rise in primary and secondary syphilis cases in women during the same period (1). Vertical transmission of syphilis is a significant public health concern, contributing to stillbirth, infant mortality, and neurologic and skeletal morbidities in survivors (2). The Centers for Disease Control and Prevention (CDC) recommends that all pregnant women be screened for sexually transmitted infections (STIs), including HIV, syphilis, and hepatitis B, at the first prenatal visit regardless of prior testing. The American College of Obstetricians and Gynecologists (ACOG) and the U.S. Preventive Services Task Force (USPSTF) support similar recommendations. Yet a CDC investigation into this epidemic revealed that 21% of women whose infants were diagnosed with congenital syphilis had no prenatal care, and of those who had at least one prenatal visit, 43% received no treatment for syphilis during pregnancy and 30% received inadequate treatment (2, 3). Little is understood about the factors associated with low STI screening during pregnancy in the US. In a 2014 study, Cha et al. evaluated factors affecting the likelihood of STI screening in pregnant women in Guam. They found that the biggest barriers to STI testing were lack of prenatal care and insurance. Even women with access to prenatal care were not routinely screened for syphilis before 24 weeks’ gestation. Despite a 93.5% overall rate of screening for syphilis at any time during pregnancy, the authors found much lower screening rates for other STIs, including 31% for HIV, 25.3% for chlamydia, and 25.7% for gonorrhea (8). This suggests a potential disparity in testing practices based on risk perception by providers or patients.