
    Multiple testing procedures under confounding

    While multiple testing procedures have been the focus of much statistical research, an important facet of the problem is how to deal with possible confounding. Procedures have been developed by authors in genetics and statistics. In this chapter, we relate these proposals. We propose two new multiple testing approaches within this framework. The first combines sensitivity analysis methods with false discovery rate estimation procedures. The second involves construction of shrinkage estimators that utilize the mixture model for multiple testing. The procedures are illustrated with applications to a gene expression profiling experiment in prostate cancer. Comment: Published in the IMS Collections (http://www.imstat.org/publications/imscollections.htm) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/193940307000000176.

    Nonparametric and semiparametric inference for models of tumor size and metastasis

    There has been some recent work in the statistical literature on modelling the relationship between the size of primary cancers and the occurrence of metastases. While nonparametric methods have been proposed for estimation of the tumor size distribution at which metastatic transition occurs, their asymptotic properties have not been studied. In addition, no testing or regression methods are available that allow adjustment for potential confounders and prognostic factors. In this article, we develop a unified approach to nonparametric and semiparametric analysis of tumor size-metastasis data. An equivalence is established between the models considered by previous authors and survival data structures. Based on this relationship, we develop nonparametric testing procedures and semiparametric regression methodology for modelling the effect of tumor size on the probability at which metastatic transitions occur in two situations. Asymptotic properties of these estimators are provided. Procedures that achieve the semiparametric information bound are also considered. The proposed methodology is applied to data from a screening study in lung cancer.

    Semiparametric methods for the binormal model with multiple biomarkers

    In diagnostic medicine, there is great interest in developing strategies for combining biomarkers in order to optimize classification accuracy. A popular model that has been used when one biomarker is available is the binormal model. Extension of the model to accommodate multiple biomarkers has not been considered in this literature. Here, we consider a multivariate binormal framework for combining biomarkers using copula functions that leads to a natural multivariate extension of the binormal model. Estimation in this model is done using rank-based procedures. We also discuss adjustment for covariates in this class of models and provide a simple two-stage estimation procedure that can be fit using standard software packages. Some analytical comparisons between analyses using the proposed model and univariate biomarker analyses are given. In addition, the techniques are applied to simulated data as well as data from two cancer biomarker studies.

    Semiparametric models and estimation procedures for binormal ROC curves with multiple biomarkers

    In diagnostic medicine, there is great interest in developing strategies for combining biomarkers in order to optimize classification accuracy. A popular model that has been used for receiver operating characteristic (ROC) curve modelling when one biomarker is available is the binormal model. Extension of the model to accommodate multiple biomarkers has not been considered in this literature. Here, we consider a multivariate binormal framework for combining biomarkers using copula functions that leads to a natural multivariate extension of the binormal model. Estimation in this model is done using rank-based procedures. We show that the Van der Waerden rank score coefficient estimation procedure can be used for the multivariate binormal model. We also discuss adjustment for covariates in this class of models. We provide a simple two-stage estimation procedure that can be fit using standard software packages. Asymptotic results for the proposed methods are given. The techniques are applied to data from two cancer biomarker studies.
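
For orientation, the single-biomarker binormal model that the abstract above refers to is usually written as follows. This is the conventional textbook form, not the paper's multivariate extension, and the symbols (mu_D, sigma_D for diseased subjects, mu_N, sigma_N for non-diseased subjects, a, b) are notation assumed here rather than taken from the source.

```latex
% Standard binormal ROC model for one biomarker (assumed notation).
% After some monotone transformation, the marker is N(\mu_D, \sigma_D^2) in
% diseased subjects and N(\mu_N, \sigma_N^2) in non-diseased subjects.
\[
  \mathrm{ROC}(t) = \Phi\bigl(a + b\,\Phi^{-1}(t)\bigr), \qquad 0 < t < 1,
\]
\[
  a = \frac{\mu_D - \mu_N}{\sigma_D}, \qquad b = \frac{\sigma_N}{\sigma_D},
\]
% The area under this curve has the closed form
\[
  \mathrm{AUC} = \Phi\!\left(\frac{a}{\sqrt{1 + b^{2}}}\right).
\]
```

Because the ROC curve is invariant to monotone transformations of the marker, only the ranks of the observations matter, which is consistent with the rank-based estimation the abstract describes.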

    Simultaneous estimation procedures and multiple testing: a decision-theoretic framework

    There has recently been tremendous interest in statistical methods regarding the false discovery rate (FDR). Two classes of literature on this topic exist. In the first, authors have proposed sequential testing procedures that control the false discovery rate. In the second, authors have studied procedures involving FDR in a univariate mixture model setting. We consider a decision-theoretic approach to the assessment of FDR-based methods. In particular, we attempt to reconcile the current literature on false discovery rate procedures with more classical simultaneous estimation procedures. Formulation of the link will allow us to apply results from decision theory; we can then traverse between the two literatures. In particular, we propose double shrinkage estimators for the location parameter in the multiple testing problem for false discovery rates and provide conditions for obtaining minimaxity. We also describe a double shrinkage estimation procedure for p-values. Simulation studies are used to explore the risk properties of existing statistical methods and the potential gains of shrinkage. We then develop a procedure for calculating double shrinkage estimators from observed data. The procedures are applied to data from a gene expression profiling study in prostate cancer.
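
As a rough illustration of the first class of methods the abstract above mentions (sequential procedures that control the FDR), the sketch below implements the well-known Benjamini-Hochberg step-up rule. It is not the double shrinkage estimator proposed in the paper; the function name `benjamini_hochberg` and the example p-values are hypothetical and chosen only for demonstration.

```python
import numpy as np

def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a boolean array marking hypotheses rejected by the
    Benjamini-Hochberg step-up rule at nominal FDR level alpha."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)                          # indices sorting p ascending
    thresholds = alpha * np.arange(1, m + 1) / m   # i * alpha / m for i = 1..m
    below = p[order] <= thresholds                 # compare p_(i) to its threshold
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])           # largest i with p_(i) <= i*alpha/m
        reject[order[:k + 1]] = True               # reject every hypothesis up to k
    return reject

# Hypothetical p-values, for illustration only.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(example_p, alpha=0.05))
```

The rule is "step-up" in the sense that a p-value above its own threshold can still be rejected when a larger p-value falls below a later threshold, which is why the code searches for the largest qualifying index rather than stopping at the first failure.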

    Mixture models for assessing differential expression in complex tissues using microarray data

    The use of DNA microarrays has become quite popular in many scientific and medical disciplines, such as cancer research. One common goal of these studies is to determine which genes are differentially expressed between cancer and healthy tissue, or more generally, between two experimental conditions. A major complication in the molecular profiling of tumors using gene expression data is that the data represent a combination of tumor and normal cells. Much of the methodology developed for assessing differential expression with microarray data has assumed that tissue samples are homogeneous. In this article, we outline a general framework for determining differential expression in the presence of mixed cell populations. We consider study designs in which paired tissues and unpaired tissues are available. A hierarchical mixture model is used for modelling the data; a combination of method-of-moments procedures and the expectation-maximization (EM) algorithm is used to estimate the model parameters. Links with the false discovery rate are discussed. The methods are applied to two microarray datasets from cancer studies as well as to simulated data.