24 research outputs found

    Bias in trials comparing paired continuous tests can cause researchers to choose the wrong screening modality

    Abstract
    Background: To compare the diagnostic accuracy of two continuous screening tests, a common approach is to test the difference between the areas under the receiver operating characteristic (ROC) curves. After study participants are screened with both screening tests, the disease status is determined as accurately as possible, either by an invasive, sensitive and specific secondary test, or by a less invasive, but less sensitive approach. For most participants, disease status is approximated through the less sensitive approach. The invasive test must be limited to the fraction of the participants whose results on either or both screening tests exceed a threshold of suspicion, or who develop signs and symptoms of the disease after the initial screening tests.
    The limitations of this study design lead to a bias in the ROC curves that we call paired screening trial bias. This bias reflects the synergistic effects of inappropriate reference standard bias, differential verification bias, and partial verification bias. The absence of a gold reference standard leads to inappropriate reference standard bias. Using different reference standards to ascertain disease status creates differential verification bias. When only suspicious screening test scores trigger a sensitive and specific secondary test, the result is a form of partial verification bias.
    Methods: For paired screening tests with bivariate normally distributed scores, we give formulas and programs to quantify the effect of paired screening trial bias on a paired comparison of areas under the curves. We fix the prevalence of disease, and the chance a diseased subject manifests signs and symptoms. We derive the formulas for true sensitivity and specificity, and those for the sensitivity and specificity observed by the study investigator.
    Results: The observed areas under the ROC curves are quite different from the true areas under the ROC curves. The typical direction of the bias is a strong inflation in sensitivity, paired with a concomitant slight deflation of specificity.
    Conclusion: In paired trials of screening tests, when area under the ROC curve is used as the metric, bias may lead researchers to make the wrong decision as to which screening test is better.
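For intuition about the "true" quantities in the normal-score setting described above, here is a minimal sketch for a single test. It assumes scores are normally distributed in the non-diseased and diseased groups; the function names and all parameter values are hypothetical and are not taken from the paper. It uses the standard binormal result AUC = Φ((μ1 − μ0) / sqrt(σ0² + σ1²)) and does not model the verification process or the observed (biased) quantities.

```python
from math import sqrt
from statistics import NormalDist


def binormal_auc(mu0, sd0, mu1, sd1):
    """True AUC when scores are N(mu0, sd0) without disease and N(mu1, sd1) with disease."""
    return NormalDist().cdf((mu1 - mu0) / sqrt(sd0 ** 2 + sd1 ** 2))


def true_sens_spec(threshold, mu0, sd0, mu1, sd1):
    """True sensitivity and specificity when a score above `threshold` counts as positive."""
    sensitivity = 1.0 - NormalDist(mu1, sd1).cdf(threshold)
    specificity = NormalDist(mu0, sd0).cdf(threshold)
    return sensitivity, specificity


# Hypothetical parameters, chosen only for illustration.
auc = binormal_auc(mu0=0.0, sd0=1.0, mu1=1.0, sd1=1.0)
sens, spec = true_sens_spec(threshold=0.5, mu0=0.0, sd0=1.0, mu1=1.0, sd1=1.0)
print(f"AUC={auc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

With identical score distributions in both groups the function returns an AUC of 0.5, which is a quick sanity check on the formula.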

    Statistical methods to correct for verification bias in diagnostic studies are inadequate when there are few false negatives: a simulation study

    Abstract
    Background: A common feature of diagnostic research is that results for a diagnostic gold standard are available primarily for patients who are positive for the test under investigation. Data from such studies are subject to what has been termed "verification bias". We evaluated statistical methods for verification bias correction when there are few false negatives.
    Methods: A simulation study was conducted of a screening study subject to verification bias. We compared estimates of the area-under-the-curve (AUC) corrected for verification bias varying both the rate and mechanism of verification.
    Results: In a single simulated data set, varying false negatives from 0 to 4 led to verification bias corrected AUCs ranging from 0.550 to 0.852. Excess variation associated with low numbers of false negatives was confirmed in simulation studies and by analyses of published studies that incorporated verification bias correction. The 2.5th to 97.5th centile range constituted as much as 60% of the possible range of AUCs for some simulations.
    Conclusion: Screening programs are designed such that there are few false negatives. Standard statistical methods for verification bias correction are inadequate in this circumstance.
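To illustrate why a handful of false negatives can dominate a corrected estimate, here is a minimal sketch using the Begg-Greenes correction for sensitivity. This is a simpler stand-in, not the AUC correction evaluated in the abstract, and it assumes verification depends only on the test result. The function name and all counts are hypothetical.

```python
def begg_greenes_sensitivity(n_pos, n_neg,
                             verified_pos, diseased_pos,
                             verified_neg, diseased_neg):
    """Verification-bias-corrected sensitivity (Begg-Greenes).

    n_pos, n_neg: all screened participants by test result.
    verified_*:   how many in each group received the gold standard.
    diseased_*:   how many of those verified were found diseased.
    """
    p_pos = n_pos / (n_pos + n_neg)                 # P(T+)
    p_dis_given_pos = diseased_pos / verified_pos   # P(D | T+), from verified only
    p_dis_given_neg = diseased_neg / verified_neg   # P(D | T-), from verified only
    numerator = p_pos * p_dis_given_pos
    return numerator / (numerator + (1.0 - p_pos) * p_dis_given_neg)


# Hypothetical screening study: 100 test-positives, 900 test-negatives;
# 80 positives verified (40 diseased) and only 45 negatives verified.
for false_negatives in range(5):
    sens = begg_greenes_sensitivity(100, 900, 80, 40, 45, false_negatives)
    print(false_negatives, round(sens, 3))
```

In this hypothetical setup, each verified false negative stands in for many unverified test-negatives, so moving from 0 to 4 false negatives moves the corrected sensitivity from 1.0 to below 0.4.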

    Bio+mine project: empowering the community to develop a site-specific system for the rehabilitation of a legacy mine

    The rehabilitation of legacy mines remains a major challenge because of the difficulty of returning them to safe and stable conditions and of ensuring that the mined-out areas become productive enough to support the economic activity of the host community. Previous efforts have often focused on purely technical and environmental aspects, leading to resistance from local communities excluded from the rehabilitation process. To address the issues associated with legacy mines and the lack of community participation, we have developed a project, Biodiversity Positive Mining For The Net Zero Challenge (Bio + Mine), focusing on the abandoned Sto. Niño copper mine (Benguet, Philippines). The mine was closed in 1982 without a plan involving local stakeholders, leaving a significant ongoing negative legacy. Using the social-ecological-technological system framework, we will explore the intersections of the structure and functions of socio-economic-demographic, ecological, and technological data useful in devising a more inclusive mitigation strategy for the reconstruction of the supporting ecosystem. We aim to develop a site-specific system, underpinned by the local community's knowledge and practices, that can be a model for wider implementation in other legacy and active mines worldwide.

    Visual inspection with acetic acid as a cervical cancer test: accuracy validated using latent class analysis

    Abstract
    Background: The purpose of this study was to validate the accuracy of an alternative cervical cancer test, visual inspection with acetic acid (VIA), by addressing possible imperfections in the gold standard through latent class analysis (LCA). The data were originally collected at peri-urban health clinics in Zimbabwe.
    Methods: Conventional accuracy (sensitivity/specificity) estimates for VIA and two other screening tests using colposcopy/biopsy as the reference standard were compared to LCA estimates based on results from all four tests. For conventional analysis, negative colposcopy was accepted as a negative outcome when biopsy was not available as the reference standard. With LCA, local dependencies between tests were handled through adding direct effect parameters or additional latent classes to the model.
    Results: Two models yielded good fit to the data, a 2-class model with two adjustments and a 3-class model with one adjustment. The definition of latent disease associated with the latter was more stringent, backed by three of the four tests. Under that model, sensitivity for VIA (abnormal+) was 0.74 compared to 0.78 with conventional analyses. Specificity was 0.639 versus 0.568, respectively. By contrast, the LCA-derived sensitivity for colposcopy/biopsy was 0.63.
    Conclusion: VIA sensitivity and specificity with the 3-class LCA model were within the range of published data and relatively consistent with conventional analyses, thus validating the original assessment of test accuracy. LCA probably yielded more likely estimates of the true accuracy than did conventional analysis with in-country colposcopy/biopsy as the reference standard. Colposcopy with biopsy can be problematic as a study reference standard and LCA offers the possibility of obtaining estimates adjusted for referent imperfections.

    Evaluation of appendicitis risk prediction models in adults with suspected appendicitis

    Background: Appendicitis is the most common general surgical emergency worldwide, but its diagnosis remains challenging. The aim of this study was to determine whether existing risk prediction models can reliably identify patients presenting to hospital in the UK with acute right iliac fossa (RIF) pain who are at low risk of appendicitis.
    Methods: A systematic search was completed to identify all existing appendicitis risk prediction models. Models were validated using UK data from an international prospective cohort study that captured consecutive patients aged 16–45 years presenting to hospital with acute RIF pain in March to June 2017. The main outcome was best achievable model specificity (proportion of patients who did not have appendicitis correctly classified as low risk) whilst maintaining a failure rate below 5 per cent (proportion of patients identified as low risk who actually had appendicitis).
    Results: Some 5345 patients across 154 UK hospitals were identified, of whom two-thirds (3613 of 5345, 67·6 per cent) were women. Women were more than twice as likely to undergo surgery with removal of a histologically normal appendix (272 of 964, 28·2 per cent) than men (120 of 993, 12·1 per cent) (relative risk 2·33, 95 per cent c.i. 1·92 to 2·84; P < 0·001). Of 15 validated risk prediction models, the Adult Appendicitis Score performed best (cut-off score 8 or less, specificity 63·1 per cent, failure rate 3·7 per cent). The Appendicitis Inflammatory Response Score performed best for men (cut-off score 2 or less, specificity 24·7 per cent, failure rate 2·4 per cent).
    Conclusion: Women in the UK had a disproportionate risk of admission without surgical intervention and had high rates of normal appendicectomy. Risk prediction models to support shared decision-making by identifying adults in the UK at low risk of appendicitis were identified.
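The two outcome definitions in this abstract (specificity and failure rate) and the reported relative risk can be made concrete with a minimal sketch. The function names are hypothetical, and the counts passed to `low_risk_metrics` are invented for illustration. Only the counts 272 of 964 and 120 of 993 come from the abstract.

```python
def low_risk_metrics(low_risk_without, total_without, low_risk_with):
    """Specificity and failure rate as defined in the abstract.

    specificity:  share of patients WITHOUT appendicitis classified as low risk.
    failure rate: share of patients classified as low risk who HAD appendicitis.
    """
    specificity = low_risk_without / total_without
    failure_rate = low_risk_with / (low_risk_without + low_risk_with)
    return specificity, failure_rate


def relative_risk(events_a, total_a, events_b, total_b):
    """Risk in group A divided by risk in group B."""
    return (events_a / total_a) / (events_b / total_b)


# Hypothetical counts, for illustration only.
spec, fail = low_risk_metrics(low_risk_without=630, total_without=1000,
                              low_risk_with=20)
print(f"specificity={spec:.3f} failure_rate={fail:.3f}")

# Normal appendicectomy counts reported in the abstract (women vs men).
print(round(relative_risk(272, 964, 120, 993), 2))  # 2.33
```

A model meets the study's criterion only if the failure rate stays below 0.05, so the two quantities have to be read together rather than maximizing specificity alone.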