92 research outputs found

    Estimating Sensitivity and Specificity from a Phase 2 Biomarker Study that Allows for Early Termination

    Development of a disease screening biomarker involves several phases. In phase 2 its sensitivity and specificity are compared with established thresholds for minimally acceptable performance. Since we anticipate that most candidate markers will not prove to be useful and availability of specimens and funding is limited, early termination of a study is appropriate if accumulating data indicate that the marker is inadequate. Yet, for markers that complete phase 2, we seek estimates of sensitivity and specificity to proceed with the design of subsequent phase 3 studies. We suggest early stopping criteria and estimation procedures that adjust for bias caused by the early termination option. A novel aspect of our approach is to focus on properties of estimates conditional on reaching full study enrollment. We propose the conditional-UMVUE and contrast it with other estimates, including naive estimators, the well-studied unconditional-UMVUE and the mean and median Whitehead adjusted estimators. The conditional-UMVUE appears to be a very good choice.

    Insights into Latent Class Analysis

    Latent class analysis is a popular statistical technique for estimating disease prevalence and test sensitivity and specificity. It is used when a gold standard assessment of disease is not available but results of multiple imperfect tests are. We derive analytic expressions for the parameter estimates in terms of the raw data, under the conditional independence assumption. These expressions indicate explicitly how observed two- and three-way associations between test results are used to infer disease prevalence and test operating characteristics. Although reasonable if the conditional independence model holds, the estimators have no basis when it fails. We therefore caution against using the latent class approach in practice
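For orientation, the conditional independence model the abstract refers to can be written in its standard form. The notation here is an assumption and is not taken from the paper: $\rho$ is disease prevalence, and $s_k$ and $c_k$ are the sensitivity and specificity of test $k$, with $K$ binary tests $T_1, \dots, T_K$.

```latex
P(T_1 = t_1, \dots, T_K = t_K)
  = \rho \prod_{k=1}^{K} s_k^{\,t_k} (1 - s_k)^{1 - t_k}
  + (1 - \rho) \prod_{k=1}^{K} (1 - c_k)^{\,t_k} c_k^{\,1 - t_k},
  \qquad t_k \in \{0, 1\}.
```

With $K = 3$ tests there are $2^3 - 1 = 7$ free cell probabilities and $2K + 1 = 7$ parameters, which is the smallest setting in which the parameters can generally be recovered from the observed cell frequencies.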

    Semiparametric methods for evaluating the covariate-specific predictiveness of continuous markers in matched case-control studies

    To assess the value of a continuous marker in predicting the risk of a disease, a graphical tool called the predictiveness curve has been proposed. It characterizes the marker's predictiveness, or capacity to risk stratify the population by displaying the population distribution of risk endowed by the marker. Methods for making inference about the curve and for comparing curves in a general population have been developed. However, knowledge about a marker's performance in the general population only is not enough. Since a marker's effect on the risk model and its distribution can both differ across subpopulations, its predictiveness may vary when applied to different subpopulations. Moreover, knowledge about the predictiveness of a marker conditional on baseline covariates is valuable for individual decision making about having the marker measured or not. Therefore, to fully realize the usefulness of a risk prediction marker, it is important to study its performance conditional on covariates. In this article, we propose semiparametric methods for estimating covariate-specific predictiveness curves for a continuous marker. Unmatched and matched case-control study designs are accommodated. We illustrate application of the methodology by evaluating serum creatinine as a predictor of risk of renal artery stenosis.
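As a rough illustration of the predictiveness curve described above, the sketch below computes the empirical quantiles of risk in a population sample. It assumes the risks have already been estimated from some risk model and that the sample represents the population of interest; it does not implement the semiparametric, covariate-specific, or case-control estimators proposed in the article. The function names are hypothetical.

```python
import numpy as np


def predictiveness_curve(risks, v):
    """Empirical predictiveness curve R(v): the v-th quantile of risk.

    `risks` are risk estimates (probabilities) for a population sample,
    assumed here to be already available from a fitted risk model.
    """
    risks = np.asarray(risks, dtype=float)
    return np.quantile(risks, v)


def fraction_at_or_below(risks, threshold):
    """Proportion of the sample whose risk is at or below `threshold`,
    i.e. the inverse of the curve evaluated at a risk threshold."""
    risks = np.asarray(risks, dtype=float)
    return float((risks <= threshold).mean())


if __name__ == "__main__":
    example_risks = [0.1, 0.2, 0.3, 0.4, 0.5]  # made-up values for illustration
    print(predictiveness_curve(example_risks, [0.0, 0.5, 1.0]))
    print(fraction_at_or_below(example_risks, 0.2))
```

A flat curve would indicate that the marker assigns nearly everyone the same risk, while a steep curve indicates wide risk stratification.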

    The Analysis of Placement Values for Evaluating Discriminatory Measures

    The idea of using measurements such as biomarkers, clinical data, or molecular biology assays for classification and prediction is popular in modern medicine. The scientific evaluation of such measures includes assessing the accuracy with which they predict the outcome of interest. Receiver operating characteristic curves are commonly used for evaluating the accuracy of diagnostic tests. They can be applied more broadly, indeed to any problem involving classification to two states or populations (D = 0 or D = 1). We show that the ROC curve can be interpreted as a cumulative distribution function for the discriminatory measure Y in the affected population (D = 1) after Y has been standardized to the distribution in the reference population (D = 0). The standardized values are called placement values. If the placement values have a uniform (0, 1) distribution, then Y is not discriminatory, because its distribution in the affected population is the same as that in the reference population. The degree to which the distribution of the standardized measure differs from uniform (0, 1) is a natural way to characterize the discriminatory capacity of Y and provides a non-traditional interpretation for the ROC curve. Statistical methods for making inference about distribution functions therefore motivate new approaches to making inference about ROC curves. We demonstrate this by considering the ROC-GLM regression model and observing that it is equivalent to a regression model for the distribution of placement values. The likelihood of the placement values provides a new approach to parameter estimation that appears to be more efficient than previously proposed methods. The methods are applied to two datasets, one that involves comparing two biomarkers for pancreatic cancer and one that evaluates a pulmonary function measure in cystic fibrosis patients to predict the occurrence of severe acute pulmonary infection requiring hospitalization. Finally, we note the relationship between regression models for the mean placement value and recently proposed models for the area under the ROC curve, which is the classic summary index of discrimination.
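The placement-value interpretation can be illustrated with a small empirical sketch. It assumes that larger values of Y are more indicative of the affected state and uses the empirical survivor function of the reference sample; the function names and the toy data are hypothetical, and this is not the likelihood-based estimation procedure of the article.

```python
import numpy as np


def placement_values(y_ref, y_aff):
    """Standardize each affected-population value to the reference population.

    The placement value of y is the fraction of reference values at or
    above y (assumes larger Y indicates the affected state, D = 1).
    """
    y_ref = np.asarray(y_ref, dtype=float)
    y_aff = np.asarray(y_aff, dtype=float)
    return np.array([(y_ref >= y).mean() for y in y_aff])


def empirical_roc(pv, t):
    """ROC(t) as the empirical CDF of the placement values at t."""
    return float((np.asarray(pv, dtype=float) <= t).mean())


def auc_from_placement_values(pv):
    """AUC as one minus the mean placement value (no ties assumed)."""
    return 1.0 - float(np.mean(pv))


if __name__ == "__main__":
    reference = [1.0, 2.0, 3.0, 4.0]   # made-up D = 0 measurements
    affected = [2.5, 3.5, 5.0]         # made-up D = 1 measurements
    pv = placement_values(reference, affected)
    print(pv, empirical_roc(pv, 0.25), auc_from_placement_values(pv))
```

If the placement values looked uniform on (0, 1), the empirical ROC curve would lie close to the diagonal and the mean placement value would be close to 0.5.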

    Semiparametric Receiver Operating Characteristic Analysis to Evaluate Biomarkers for Disease

    The receiver operating characteristic (ROC) curve is a popular method for characterizing the accuracy of diagnostic tests when test results are not binary. Various methodologies for estimating and comparing ROC curves have been developed. One approach, due to Pepe, uses a parametric regression model with the baseline function specified up to a finite-dimensional parameter. In this article we extend the regression models by allowing arbitrary nonparametric baseline functions. We also provide asymptotic distribution theory and procedures for making statistical inference. We illustrate our approach with a dataset from a prostate cancer biomarker study. Simulation studies suggest that the extra flexibility inherent in the semiparametric method is gained with little loss in statistical efficiency.

    Calibrating Observed Differential Gene Expression for the Multiplicity of Genes on the Array

    In a gene expression array study, the expression levels of thousands of genes are monitored simultaneously across various biological conditions on a small set of subjects. One goal of such studies is to explore a large pool of genes in order to select a subset of genes that appear to be differently expressed for further investigation. Of particular interest here is how to select the top k genes once genes are ranked based on their evidence for differential expression in two tissue types. We consider statistical methods that provide a more rigorous and intuitively appealing selection process for k. We propose to choose genes based on adjusted p-values (AP values). The AP values are calculated with a resampling based algorithm assuming that no genes are truly differentially expressed, and take into account the multiplicity and dependence encountered in microarray data. Using both simulated data and real microarray data, we assess and compare the performance of our new method with existing methods. The intuitive basis for the AP values and the fact that our procedure has operating characteristics at least as good as existing procedures make it attractive for practical application

    Adjusting for Covariate Effects on Classification Accuracy Using the Covariate-Adjusted ROC Curve

    Recent scientific and technological innovations have produced an abundance of potential markers which are being investigated for their use in disease screening and diagnosis. In evaluating these markers, it is often necessary to account for covariates which are associated with the marker of interest. These covariates may include subject characteristics, expertise of the test operator, test procedures, or aspects of specimen handling. In this paper, we propose the AROC, a covariate-adjusted measure of the classification accuracy. The AROC is the common covariate-specific ROC curve, when the covariate does not affect discrimination, and a weighted average of covariate-specific ROC curves, when the covariate does affect discrimination. We propose non-parametric and semiparametric estimators for the AROC, provide asymptotic distribution theory for these estimators, and investigate their finite sample performance. We illustrate our methods using data from the Physicians’ Health Study. The AROC is used to characterize the age-adjusted discriminatory accuracy of prostate-specific antigen as a biomarker for prostate cancer.

    Comparing the Predictive Values of Diagnostic Tests: Sample Size and Analysis for Paired Study Designs

    In this paper we consider the design and analysis of studies comparing the positive and negative predictive values of two diagnostic tests that are measured on all subjects. Although statistical methodology is well developed for comparing diagnostic tests in terms of their sensitivities and specificities, comparative inference about predictive values is not. We derive analytic variance expressions for the relative predictive values. Sample size formulas for study design ensue. In addition, two new methods for analyzing the resulting data are presented and compared with an existing marginal regression methodology
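For readers less familiar with predictive values, the sketch below shows how positive and negative predictive values follow from sensitivity, specificity, and prevalence via Bayes' rule, and how a relative predictive value is formed as a ratio between two tests. The function names and numeric inputs are hypothetical; the variance expressions, sample size formulas, and paired-design analysis methods of the paper are not reproduced here.

```python
def predictive_values(sens, spec, prev):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    # P(D = 1 | T = 1): true positives over all positives
    ppv = sens * prev / (sens * prev + (1.0 - spec) * (1.0 - prev))
    # P(D = 0 | T = 0): true negatives over all negatives
    npv = spec * (1.0 - prev) / (spec * (1.0 - prev) + (1.0 - sens) * prev)
    return ppv, npv


def relative_predictive_values(test_a, test_b, prev):
    """Return (rPPV, rNPV) for test A relative to test B.

    Each test is a (sensitivity, specificity) pair evaluated at the same
    prevalence, as when both tests are measured on all subjects.
    """
    ppv_a, npv_a = predictive_values(test_a[0], test_a[1], prev)
    ppv_b, npv_b = predictive_values(test_b[0], test_b[1], prev)
    return ppv_a / ppv_b, npv_a / npv_b


if __name__ == "__main__":
    # Made-up operating characteristics for illustration only.
    print(predictive_values(0.9, 0.8, 0.1))
    print(relative_predictive_values((0.9, 0.8), (0.8, 0.9), 0.1))
```

Unlike sensitivity and specificity, both predictive values depend on prevalence, which is one reason comparative inference about them needs separate treatment.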

    A Linear Regression Framework for Receiver Operating Characteristic (ROC) Curve Analysis

    In the field of medical diagnostic testing, the receiver operating characteristic (ROC) curve has long been used as a standard statistical tool to assess the accuracy of tests that yield continuous results. Although previous research in this area focused mostly on estimating the ROC curve, recently it has been recognized that the accuracy of a given test may fluctuate depending on certain factors, which motivates modelling covariate effects on the ROC curve. Comparing the corresponding ROC curves between two or more tests is a special case of covariate effect modelling. In this manuscript, we introduce a linear regression framework to model covariate effects on the ROC curve. We assume the ROC curve takes a specific parametric form for each covariate level and that the covariate effect is reflected in the parameters of the curves. The new method provides a unified approach for ROC curve analysis, and it is intuitive and easy to apply. Two real data sets are used to illustrate the new approach.