
    Interval Estimation for the Difference in Paired Areas under the ROC Curves in the Absence of a Gold Standard Test

    Receiver operating characteristic (ROC) curves can be used to assess the accuracy of tests measured on ordinal or continuous scales. The most commonly used measure for the overall diagnostic accuracy of diagnostic tests is the area under the ROC curve (AUC). A gold standard test on the true disease status is required to estimate the AUC. However, a gold standard test may sometimes be too expensive or infeasible. Therefore, in many medical research studies, the true disease status of the subjects may remain unknown. Under the normality assumption on test results from each disease group of subjects, using the expectation-maximization (EM) algorithm in conjunction with a bootstrap method, we propose a maximum likelihood based procedure for construction of confidence intervals for the difference in paired areas under ROC curves in the absence of a gold standard test. Simulation results show that the proposed interval estimation procedure yields satisfactory coverage probabilities and interval lengths. The proposed method is illustrated with two examples
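Under the normality assumption stated in this abstract, the AUC has a closed form, AUC = Φ((μ_D − μ_H) / √(σ_H² + σ_D²)). The sketch below only illustrates that formula and the paired difference for two tests with known parameters. It does not reproduce the EM or bootstrap steps of the proposed procedure, and the function names and parameter values are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binormal_auc(mu_healthy, sd_healthy, mu_diseased, sd_diseased):
    """AUC when test results are normal in each disease group.

    Assumes larger test values indicate disease.
    """
    pooled_sd = math.sqrt(sd_healthy ** 2 + sd_diseased ** 2)
    return normal_cdf((mu_diseased - mu_healthy) / pooled_sd)


# Hypothetical parameters for two tests measured on the same subjects.
auc_test_a = binormal_auc(0.0, 1.0, 1.5, 1.0)
auc_test_b = binormal_auc(0.0, 1.0, 1.0, 1.0)
auc_difference = auc_test_a - auc_test_b
```

Without a gold standard, the group means and standard deviations would themselves have to be estimated from the unlabeled mixture, which is the part the abstract assigns to the EM algorithm.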

    Nonparametric and semiparametric methods in medical diagnostics

    In medical diagnostics, biomarkers are used as the basis for detecting or predicting disease. There has been an increased interest in using the Receiver Operating Characteristic (ROC) curve to assess the accuracy of biomarkers. In many situations, a single biomarker is not sufficient for the desired level of accuracy; furthermore, newly discovered biomarkers can provide additional information for a specific disease. Even though numerous methods have been developed to evaluate a single biomarker, few statistical methods exist to accommodate multiple biomarkers simultaneously. The first paper proposes a semiparametric transformation model for multiple biomarkers in ROC analysis to optimize classification accuracy. This model assumes that some unknown and marker-specific transformations of biomarkers follow a multivariate normal distribution; it incorporates random effects to account for within-subject correlation among biomarkers. Nonparametric maximum likelihood estimation is used for inference, and the parameter estimators are shown to be asymptotically normal and semiparametrically efficient. The proposed method is applied to analyze brain tumor imaging data and prostate cancer data. In the second paper, we focus on assessing the accuracy of biomarkers by adjusting for covariates that can influence the performance of biomarkers. Therefore, we develop an accelerated ROC model in which the effect of covariates relates to rescaling the original ROC curve. The proposed model generalizes the usual accelerated failure time model in the survival context to the ROC analysis. An innovative method is developed to construct estimating equations for parameter estimation. The bootstrapping method is used for inference, and the parameter estimators are shown to be asymptotically normal. We apply the proposed method to data from a prostate cancer study. 
The paired-reader, paired-patient design is commonly used in reader studies when evaluating the diagnostic performance of radiological imaging systems. In this design, multiple readers interpret all test results of patients who undergo multiple diagnostic tests under study. In the third paper, we develop a method to estimate and compare accuracies of diagnostic tests in a paired-reader, paired-patient design by introducing a latent model for test results. The asymptotic property of the proposed test statistics is derived based on the theory of U-statistics. Furthermore, a method for correcting an imperfect gold standard bias and sample size formula are presented. The proposed method is applied to comparing the diagnostic performance of digital mammography and screen-film mammography in discriminating breast tumors

    Chapter 8: Meta-analysis of Test Performance When There is a “Gold Standard”

    Synthesizing information on test performance metrics such as sensitivity, specificity, predictive values and likelihood ratios is often an important part of a systematic review of a medical test. Because many metrics of test performance are of interest, the meta-analysis of medical tests is more complex than the meta-analysis of interventions or associations. Sometimes, a helpful way to summarize medical test studies is to provide a “summary point”, a summary sensitivity and a summary specificity. Other times, when the sensitivity or specificity estimates vary widely or when the test threshold varies, it is more helpful to synthesize data using a “summary line” that describes how the average sensitivity changes with the average specificity. Choosing the most helpful summary is subjective, and in some cases both summaries provide meaningful and complementary information. Because sensitivity and specificity are not independent across studies, the meta-analysis of medical tests is fundamentally a multivariate problem, and should be addressed with multivariate methods. More complex analyses are needed if studies report results at multiple thresholds for positive tests. At the same time, quantitative analyses are used to explore and explain any observed dissimilarity (heterogeneity) in the results of the examined studies. This can be performed in the context of proper (multivariate) meta-regressions

    Effect of HIV-1 infection on T-Cell-based and skin test detection of tuberculosis infection

    RATIONALE: Two forms of the IFN-gamma release assay (IFNGRA) to detect tuberculosis infection are available, but neither has been evaluated in comparable HIV-infected and uninfected persons in a high tuberculosis incidence environment. OBJECTIVE: To compare the ability of the T-SPOT.TB (Oxford Immunotec, Abingdon, UK), QuantiFERON-TB Gold (Cellestis, Melbourne, Australia), and Mantoux tests to identify latent tuberculosis in HIV-infected and uninfected persons. METHODS: A cross-sectional study of 160 healthy adults without active tuberculosis attending a voluntary counseling and testing center for HIV infection in Khayelitsha, a deprived urban South African community with an HIV antenatal seroprevalence of 33% and a tuberculosis incidence of 1,612 per 100,000. MEASUREMENTS AND MAIN RESULTS: One hundred and sixty (74 HIV(+) and 86 HIV(-)) persons were enrolled. A lower proportion of Mantoux results was positive in HIV-infected subjects compared with HIV-uninfected subjects (p < 0.01). By contrast, the proportion of positive IFNGRAs was not significantly different in HIV-infected persons for the T-SPOT.TB test (52 vs. 59%; p = 0.41) or the QuantiFERON-TB Gold test (43 and 46%; p = 0.89). Fair agreement between the Mantoux test (5- and 10-mm cutoffs) and the IFNGRA was seen in HIV-infected people (kappa = 0.52-0.6). By contrast, poor agreement between the Mantoux and QuantiFERON-TB Gold tests was observed in the HIV-uninfected group (kappa = 0.07-0.30, depending on the Mantoux cutoff). The pattern was similar for T-SPOT.TB (kappa = 0.18-0.24). Interpretation: IFNGRA sensitivity appears relatively unimpaired by moderately advanced HIV infection. However, agreement between the tests and with the Mantoux test varied from poor to fair. This highlights the need for prospective studies to determine which test may predict the subsequent risk of tuberculosis
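The agreement figures in this abstract are kappa statistics. As a minimal sketch of how Cohen's kappa is computed for two binary tests, the example below uses a 2x2 table of invented counts. The counts are hypothetical and are not taken from the study.

```python
def cohens_kappa(both_pos, only_a_pos, only_b_pos, both_neg):
    """Cohen's kappa for two binary tests applied to the same subjects."""
    n = both_pos + only_a_pos + only_b_pos + both_neg
    # Proportion of subjects on whom the two tests agree.
    observed = (both_pos + both_neg) / n
    # Marginal positive rates of each test.
    a_pos = (both_pos + only_a_pos) / n
    b_pos = (both_pos + only_b_pos) / n
    # Agreement expected by chance if the tests were independent.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only.
kappa = cohens_kappa(both_pos=40, only_a_pos=10, only_b_pos=5, both_neg=45)
```

Kappa is 1 for perfect agreement and 0 when agreement is no better than chance.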

    Optimal cutoff points for classification in diagnostic studies: new contributions and software development

    Continuous diagnostic tests (biomarkers or risk markers) are often used to discriminate between healthy and diseased populations. For the clinical application of such tests, the key aspect is how to select an appropriate cutpoint or discrimination value c that defines positive and negative test results. In general, individuals with a diagnostic test value smaller than c are classified as healthy and otherwise as diseased. In the literature, several methods have been proposed to select the threshold value c in terms of different specific criteria of optimality. Among others, one of the methods most used in clinical practice is the Symmetry point that maximizes simultaneously both types of correct classifications. From a graphical viewpoint, the Symmetry point is associated with the operating point on the Receiver Operating Characteristic (ROC) curve that intersects the diagonal line passing through the points (0,1) and (1,0). However, this cutpoint is actually valid only when the error of misclassifying a diseased patient has the same severity as the error of misclassifying a healthy patient. Since this may not be the case in practice, an important issue in order to assess the clinical effectiveness of a biomarker is to take into account the costs associated with the decisions taken when selecting the threshold value. Moreover, to facilitate the task of selecting the optimal cut-off point in clinical practice, it is essential to have software that implements the existing optimal criteria in a user-friendly environment. Another interesting issue appears when the marker shows an irregular distribution, with a dominance of diseased subjects in noncontiguous regions. Using a single cutpoint, as is common practice in traditional ROC analysis, would not be appropriate for these scenarios because it would lead to erroneous conclusions, not taking full advantage of the intrinsic classificatory capacity of the marker
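The Symmetry point described in this abstract is the cutoff at which sensitivity equals specificity. A minimal empirical sketch, assuming two small samples of marker values, is to scan the observed values and keep the cutoff where the two rates are closest. The function name and data are hypothetical, and this is not the software the abstract refers to.

```python
def symmetry_point(healthy, diseased):
    """Return the observed value c that minimizes |sensitivity - specificity|.

    Values below c are classified as healthy, values at or above c as diseased.
    """
    candidates = sorted(set(healthy) | set(diseased))
    best_cutoff, best_gap = None, float("inf")
    for c in candidates:
        specificity = sum(v < c for v in healthy) / len(healthy)
        sensitivity = sum(v >= c for v in diseased) / len(diseased)
        gap = abs(sensitivity - specificity)
        if gap < best_gap:
            best_cutoff, best_gap = c, gap
    return best_cutoff


# Hypothetical marker values for illustration only.
healthy = [1.0, 2.0, 3.0, 4.0]
diseased = [3.5, 5.0, 6.0, 7.0]
cutoff = symmetry_point(healthy, diseased)
```

With these values the scan selects a cutoff of 4.0, where sensitivity and specificity are both 0.75. This criterion weights both misclassification errors equally, which is the limitation the abstract goes on to discuss.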

    Evaluation of Intelligent Medical Systems

    This thesis presents novel, robust, analytic and algorithmic methods for calculating Bayesian posterior intervals of receiver operating characteristic (ROC) curves and confusion matrices used for the evaluation of intelligent medical systems tested with small amounts of data. Intelligent medical systems are potentially important in encapsulating rare and valuable medical expertise and making it more widely available. The evaluation of intelligent medical systems must make sure that such systems are safe and cost effective. To ensure systems are safe and perform at expert level, they must be tested against human experts. Human experts are rare and busy, which often severely restricts the number of test cases that may be used for comparison. The performance of expert human or machine can be represented objectively by ROC curves or confusion matrices. ROC curves and confusion matrices are complex representations and it is sometimes convenient to summarise them as a single value. In the case of ROC curves, this is given as the Area Under the Curve (AUC), and for confusion matrices by kappa, or weighted kappa statistics. While there is extensive literature on the statistics of ROC curves and confusion matrices, the existing methods are not applicable to the measurement of intelligent systems when tested with small data samples, particularly when the AUC or kappa statistic is high. A fundamental Bayesian study has been carried out, and new methods devised, to provide better statistical measures for ROC curves and confusion matrices at low sample sizes. They enable exact Bayesian posterior intervals to be produced for: (1) the individual points on a ROC curve; (2) comparison between matching points on two uncorrelated curves; (3) the AUC of a ROC curve, using both parametric and nonparametric assumptions; (4) the parameters of a parametric ROC curve; and (5) the weight of a weighted confusion matrix. 
These new methods have been implemented in software to provide a powerful and accurate tool for developers and evaluators of intelligent medical systems in particular, and to a much wider audience using ROC curves and confusion matrices in general. This should enhance the ability to prove intelligent medical systems safe and effective and should lead to their widespread deployment. The mathematical and computational methods developed in this thesis should also provide the basis for future research into determination of posterior intervals for other statistics at small sample sizes

    Confidence Interval Estimation of the Area under the Receiver Operating Characteristic Curve in the Presence of Measurement Error

    Diagnosis of diseases is often based on biomarkers with continuous measurements. The discriminative ability of a biomarker can be depicted by a receiver operating characteristic (ROC) curve, which shows simultaneously the proportions of both abnormal and normal subjects correctly diagnosed at various cutoff points in the marker values. The area (A) under the ROC curve is commonly used to measure the ability of the marker to distinguish between two populations. Many biomarkers are subject to measurement error, which must be taken into account in statistical inference for A to avoid misleading results. Assuming a normal distribution for biomarker values, this thesis developed a confidence interval procedure for A adjusted for random measurement error that can be quantified by an external reliability study. The basis of the new procedure is the method of variance estimates recovery. Simulation results show that this procedure outperformed the one based on the Delta method. The methodology is illustrated by a data set from a study using thiobarbituric acid reaction substance to diagnose cardiovascular disease

    Under-Sampled Reconstruction Techniques for Accelerated Magnetic Resonance Imaging

    Due to physical and biological constraints and requirements on the minimum resolution and SNR, the acquisition time is relatively long in magnetic resonance imaging (MRI). Consequently, a limited number of pulse sequences can be run in a clinical MRI session because of constraints on the total acquisition time due to patient comfort and cost considerations. Therefore, it is strongly desired to reduce the acquisition time without compromising the reconstruction quality. This thesis concerns under-sampled reconstruction techniques for acceleration of MRI acquisitions, i.e., parallel imaging and compressed sensing. While compressed sensing MRI reconstructions are commonly regularized by penalizing the decimated wavelet transform coefficients, it is shown in this thesis that the visual artifacts, associated with the lack of translation-invariance of the wavelet basis in the decimated form, can be avoided by penalizing the undecimated wavelet transform coefficients, i.e., the stationary wavelet transform (SWT). An iterative SWT thresholding algorithm for combined SWT-regularized compressed sensing and parallel imaging reconstruction is presented. Additionally, it is shown that in MRI applications involving multiple sequential acquisitions, e.g., quantitative T1/T2 mapping, the correlation between the successive acquisitions can be incorporated as an additional constraint for joint under-sampled reconstruction, resulting in improved reconstruction performance. While quantitative measures of quality, e.g., reconstruction error with respect to the fully-sampled reference, are commonly used for performance evaluation and comparison of under-sampled reconstructions, this thesis shows that such quantitative measures do not necessarily correlate with the subjective quality of reconstruction as perceived by radiologists and other expert end users. 
Therefore, unless accompanied by subjective evaluations, quantitative quality measurements/comparisons will be of limited clinical impact. The results of experiments aimed at subjective evaluation/comparison of different under-sampled reconstructions for specific clinical neuroimaging MRI applications are presented in this thesis. One motivation behind the current work was to reduce the acquisition time for relaxation mapping techniques DESPOT1 and DESPOT2. This work also includes a modification to the Driven Equilibrium Single Pulse Observation of T1 with high-speed incorporation of RF field inhomogeneities (DESPOT1-HIFI), resulting in more accurate estimation of T1 values at high strength (3T and higher) magnetic fields

    The epidemiology of malignant catarrhal fever viruses in bison

    A competitive inhibition enzyme-linked immunosorbent assay was validated and used to estimate the prevalence of MCF-group virus infections in a sample of Alberta farmed bison. Prevalence among 995 slaughterhouse specimens was 21.8%. Among 953 samples from a serum bank the prevalence was 23.9%, and among 646 samples from five bison herds the prevalence was 23.4%. Test results from samples from one isolated bison herd collected over a period of six years provided evidence that an MCF-group virus was being transmitted across generations in the absence of exposure to any other ruminant sources of MCF-group viruses. Study of an outbreak of MCF in bison following a brief exposure to sheep provided very strong evidence that transmission of MCF among bison does not occur. Culturing lymphocytes from 12 healthy adult bison in the presence of the phorbol ester 12-O-tetradecanoyl phorbol-13-acetate and testing of DNA extracted from these cultures with consensus herpesvirus PCR allowed the identification of viruses very similar to ovine herpesvirus two in five of the bison samples