35 research outputs found

    Data Accuracy in Medical Record Abstraction

    Clinical Research Data Quality Literature Review and Pooled Analysis

    We present a literature review and secondary analysis of data accuracy in clinical research and related secondary data uses. A total of 93 papers meeting our inclusion criteria were categorized according to the data processing methods used. Quantitative data accuracy information was abstracted from the articles and pooled. Our analysis demonstrates that the accuracy associated with data processing methods varies widely, with error rates ranging from 2 to 5,019 errors per 10,000 fields. Medical record abstraction was associated with the highest error rates (70–5,019 errors per 10,000 fields). Data entered and processed at healthcare facilities had error rates comparable to data processed at central data processing centers, and error rates for data processed with single entry in the presence of on-screen checks were comparable to those for double-entered data. While data processing and cleaning methods may explain a significant amount of the variability in data accuracy, additional factors not resolvable here likely exist. (A short sketch of how such per-10,000-field rates can be pooled follows this group of abstracts.)

    Defining Data Quality for Clinical Research: A Concept Analysis

    Despite notable previous attempts by experts to define data quality, the concept remains ambiguous and subject to the vagaries of natural language. This lack of clarity continues to hamper research related to data quality issues. We present a formal concept analysis of data quality that builds on and synthesizes previously published work. We further posit that discipline-level specificity may be required to achieve the desired definitional clarity. To this end, we combine work from the clinical research domain with findings from the general data quality literature to produce a discipline-specific definition and operationalization of data quality in clinical research. While the results are most immediately helpful to clinical research, the methodology of concept analysis may be useful in other fields to clarify data quality attributes and to achieve operational definitions.

    Medical Record Abstractor's Perceptions of Factors Impacting the Accuracy of Abstracted Data

    Medical record abstraction (MRA) is known to be a significant source of data errors in secondary data uses, yet the factors impacting the accuracy of abstracted data are not reported consistently in the literature. Two Delphi processes were conducted with experienced medical record abstractors to assess abstractors' perceptions of these factors. The Delphi process identified 9 factors not found in the literature and differed from the literature on 5 of the factors in the top 25%. The Delphi results also refuted seven factors reported in the literature as impacting the quality of abstracted data. The results provide insight into, and indicate content validity for, a significant number of the factors reported in the literature. Further, the results indicate general consistency between the perceptions of clinical research medical record abstractors and those of registry and quality improvement abstractors.

    Distributed Cognition Artifacts on Clinical Research Data Collection Forms

    Medical record abstraction, a primary mode of data collection in secondary data use, is associated with high error rates. Distributed cognition in medical record abstraction has not previously been studied as a possible explanation for abstraction errors. We employed the theory of distributed representation and representational analysis to systematically evaluate the cognitive demands of medical record abstraction and the extent of external cognitive support provided by a sample of clinical research data collection forms. We show that the cognitive load required for abstraction was high for 61% of the sampled data elements and exceedingly high for 9%. Further, the data collection forms did not support external cognition for the most complex data elements. High working memory demands are a possible explanation for the association of data errors with data elements requiring abstractor interpretation, comparison, mapping, or calculation. The representational analysis used here can identify data elements with high cognitive demands.
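    The following is a minimal sketch of the per-10,000-field normalization and pooling referenced in the pooled-analysis abstract above. All audit names and counts are hypothetical, and the fields-weighted pooling shown here is an assumption for illustration, not the pooling method used in the paper.

```python
# Minimal sketch: per-10,000-field error rates and a simple pooled rate.
# Audit names and counts are hypothetical; weighting by fields inspected
# is an assumption, not the paper's pooling method.

def error_rate_per_10k(errors: int, fields_inspected: int) -> float:
    """Error rate expressed as errors per 10,000 data fields."""
    return 10_000 * errors / fields_inspected

# (errors found, fields inspected) for three hypothetical audits
audits = {
    "abstraction_site_A": (210, 30_000),
    "abstraction_site_B": (95, 12_500),
    "single_entry_site_C": (4, 20_000),
}

for name, (errors, fields) in audits.items():
    print(f"{name}: {error_rate_per_10k(errors, fields):.1f} per 10,000 fields")

# Pool by summing errors and fields across audits, so larger audits
# contribute proportionally more to the pooled estimate.
total_errors = sum(e for e, _ in audits.values())
total_fields = sum(f for _, f in audits.values())
print(f"pooled: {error_rate_per_10k(total_errors, total_fields):.1f} per 10,000 fields")
```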

    Can prospective usability evaluation predict data errors?

    Increasing amounts of clinical research data are collected by manual data entry into electronic source systems and directly from research subjects. For such manually entered source data, common data cleaning methods such as post-entry identification and resolution of discrepancies and double data entry are not feasible. However, the data error rates achieved without these mechanisms may be higher than desired for a particular research use. We evaluated a heuristic usability method for its utility as a tool to independently and prospectively identify data collection form questions associated with data errors. The method evaluated had a promising sensitivity of 64% and a specificity of 67%. It was used as described in the usability literature, with no further adaptation or specialization for predicting data errors. We conclude that usability evaluation methodology should be further investigated for use in data quality assurance.
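    As an illustration of how the sensitivity and specificity figures above are computed, here is a small sketch. The confusion-matrix counts are hypothetical, chosen only so that they reproduce the reported 64% and 67%; they are not the study's actual counts.

```python
# Sketch: sensitivity and specificity of a prospective predictor of data errors.
# Counts are hypothetical, picked only to land on 64% / 67%; the study's real
# counts are not reported in this abstract.

def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of error-producing questions that were flagged
    specificity = tn / (tn + fp)  # share of error-free questions left unflagged
    return sensitivity, specificity

# Hypothetical: 25 form questions later produced errors (16 flagged),
# 60 did not (40 correctly left unflagged).
sens, spec = sensitivity_specificity(tp=16, fn=9, tn=40, fp=20)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```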

    The Human Studies Database Project: Federating Human Studies Design Data Using the Ontology of Clinical Research

    Human studies, encompassing interventional and observational studies, are the most important source of evidence for advancing our understanding of health, disease, and treatment options. To promote discovery, the design and results of these studies should be made machine-readable for large-scale data mining, synthesis, and re-analysis. The Human Studies Database Project aims to define and implement an informatics infrastructure for institutions to share the design of their human studies. We have developed the Ontology of Clinical Research (OCRe) to model study features such as design type, interventions, and outcomes to support scientific query and analysis. We are using OCRe as the reference semantics for federated data sharing of human studies over caGrid, and are piloting this implementation with several Clinical and Translational Science Award (CTSA) institutions.
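    As a rough sketch of what a machine-readable study-design record could look like, the class and field names below are hypothetical illustrations of the kinds of features the abstract mentions (design type, interventions, outcomes); they are not OCRe classes or the Human Studies Database schema.

```python
# Hypothetical sketch of a machine-readable study-design record.
# Names are illustrative only; they are not OCRe properties.

from dataclasses import dataclass, field

@dataclass
class StudyDesignRecord:
    study_id: str
    design_type: str                          # e.g. "parallel-group randomized trial"
    interventions: list[str] = field(default_factory=list)
    outcomes: list[str] = field(default_factory=list)

record = StudyDesignRecord(
    study_id="STUDY-0001",                    # placeholder identifier
    design_type="parallel-group randomized trial",
    interventions=["drug A", "placebo"],
    outcomes=["change in symptom score at 12 weeks"],
)
print(record)
```

    A structured record of this kind can be serialized and queried across institutions, which is the sort of large-scale synthesis the project aims to support.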

    Quantifying data quality for clinical trials using electronic data capture.

    BACKGROUND: Historically, only partial assessments of data quality have been performed in clinical trials, for which the most common method of measuring database error rates has been to compare the case report form (CRF) to database entries and count discrepancies. Importantly, errors arising from medical record abstraction and transcription are rarely evaluated as part of such quality assessments. Electronic Data Capture (EDC) technology has had a further impact, as the paper CRFs typically leveraged for quality measurement are not used in EDC processes. METHODS AND PRINCIPAL FINDINGS: The National Institute on Drug Abuse Treatment Clinical Trials Network has developed, implemented, and evaluated methodology for holistically assessing data quality on EDC trials. We characterize the average source-to-database error rate (14.3 errors per 10,000 fields) for the first year of use of the new evaluation method. This error rate was significantly lower than the average of published error rates for source-to-database audits, and was similar to CRF-to-database error rates reported in the published literature. We attribute this largely to an absence of medical record abstraction on the trials we examined, and to an outpatient setting characterized by less acute patient conditions. CONCLUSIONS: Historically, medical record abstraction has been the most significant source of error by an order of magnitude, and should be measured and managed during the course of clinical trials. Source-to-database error rates are highly dependent on the amount of structured data collection in the clinical setting and on the complexity of the medical record, dependencies that should be considered when developing data quality benchmarks.
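    The sketch below shows the arithmetic behind a source-to-database error rate and one simple way to compare it against a benchmark. The audit counts are hypothetical (chosen so the rate lands near the 14.3 quoted above purely for illustration), and the normal-approximation bound is an assumption, not the paper's method.

```python
import math

# Hypothetical audit counts; the 14.3 errors per 10,000 fields quoted above
# is the paper's result, not derived from these numbers.
errors_found = 43        # discrepancies between source documents and the database
fields_audited = 30_000  # data fields compared during the audit

p_hat = errors_found / fields_audited
print(f"observed: {10_000 * p_hat:.1f} errors per 10,000 fields")

# One simple way to compare against a benchmark (e.g. 50 per 10,000):
# a one-sided 95% upper bound using the normal approximation.
upper = p_hat + 1.645 * math.sqrt(p_hat * (1 - p_hat) / fields_audited)
print(f"one-sided 95% upper bound: {10_000 * upper:.1f} errors per 10,000 fields")
```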

    Source-to-Database Audit Error Rates for CTN EDC Trials 1–4.

    The first source-to-database audit (“early”) was performed when 20%–30% of expected subject enrollment was reached; the second (“late”) audit was performed when 70%–80% of expected enrollment was reached.

    Sample size curves: 95% confidence intervals (Formula 1), one-tailed.

    Intersection of the vertical and horizontal lines shows the sample size needed to achieve a one-sided CI given an acceptance criterion of 50 errors per 10,000 data fields, an underlying expected error rate of 30 errors per 10,000 fields, and a desired CI width of 20 errors per 10,000 fields (Fleiss J, Levin BL, Paik M. Statistical Methods for Rates and Proportions. 3rd ed. New York, NY: Wiley; 2003).
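    The paper's Formula 1 is not reproduced in this listing, so the sketch below uses the textbook normal-approximation sample-size calculation for a one-sided confidence bound on a proportion as one plausible reading of the caption; the numeric inputs are the ones stated above.

```python
import math

# Sketch: fields to audit so that a one-sided 95% confidence bound on the
# error rate has a given width. This is the standard normal-approximation
# formula, offered as one plausible reading of "Formula 1", not the paper's
# exact formula.

def fields_to_audit(expected_rate: float, ci_width: float, z: float = 1.645) -> int:
    return math.ceil(z**2 * expected_rate * (1 - expected_rate) / ci_width**2)

expected_rate = 30 / 10_000  # underlying expected error rate
ci_width = 20 / 10_000       # acceptance criterion (50) minus expected rate (30)

print(fields_to_audit(expected_rate, ci_width), "fields to audit")  # roughly 2,000
```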

    Sample size curves: Hypothesis testing method (Formula 3) at 80% power and α of 0.05 (two-tailed), showing sample sizes needed to distinguish among groups for given baseline error rates and assumed differences (Fleiss J, Levin BL, Paik M. Statistical Methods for Rates and Proportions. 3rd ed. New York, NY: Wiley; 2003).

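    Similarly, the sketch below uses the standard normal-approximation formula for comparing two independent proportions (as in Fleiss, Levin, and Paik) as a stand-in for Formula 3; the paper's formula may include refinements such as a continuity correction, and the example rates are illustrative only.

```python
import math

# Sketch: per-group sample size to distinguish two error rates with a
# two-sided test at alpha = 0.05 and 80% power, using the standard
# normal-approximation formula for two independent proportions. A stand-in
# for the paper's Formula 3, not a reproduction of it.

def n_per_group(p1: float, p2: float) -> int:
    z_alpha = 1.96   # two-sided alpha = 0.05
    z_beta = 0.8416  # 80% power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# Illustrative comparison: 30 errors per 10,000 fields vs. 50 per 10,000
print(n_per_group(30 / 10_000, 50 / 10_000), "fields per group")
```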