
    Handling Covariates in the Design of Clinical Trials

    There has been a split in the statistics community about the need for taking covariates into account in the design phase of a clinical trial. There are many advocates of using stratification and covariate-adaptive randomization to promote balance on certain known covariates. However, balance does not always promote efficiency or ensure more patients are assigned to the better treatment. We describe these procedures, including model-based procedures, for incorporating covariates into the design of clinical trials, and give examples where balance, efficiency and ethical considerations may be in conflict. We advocate a new class of procedures, covariate-adjusted response-adaptive (CARA) randomization procedures, that attempt to optimize both efficiency and ethical considerations, while maintaining randomization. We review all these procedures, present a few new simulation studies, and conclude with our philosophy. Comment: Published at http://dx.doi.org/10.1214/08-STS269 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Combining isotonic regression and EM algorithm to predict genetic risk under monotonicity constraint

    In certain genetic studies, clinicians and genetic counselors are interested in estimating the cumulative risk of a disease for individuals with and without a rare deleterious mutation. Estimating the cumulative risk is difficult, however, when the estimates are based on family history data. Often, the genetic mutation status in many family members is unknown; instead, only estimated probabilities of a patient having a certain mutation status are available. Also, ages of disease-onset are subject to right censoring. Existing methods to estimate the cumulative risk using such family-based data only provide estimation at individual time points, and are not guaranteed to be monotonic or nonnegative. In this paper, we develop a novel method that combines Expectation-Maximization and isotonic regression to estimate the cumulative risk across the entire support. Our estimator is monotonic, satisfies self-consistent estimating equations and has high power in detecting differences between the cumulative risks of different populations. Application of our estimator to a Parkinson's disease (PD) study provides the age-at-onset distribution of PD in PARK2 mutation carriers and noncarriers, and reveals a significant difference between the distribution in compound heterozygous carriers compared to noncarriers, but not between heterozygous carriers and noncarriers. Comment: Published at http://dx.doi.org/10.1214/14-AOAS730 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
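
The monotonicity constraint mentioned in the abstract above is commonly enforced with the pool-adjacent-violators algorithm (PAVA) for isotonic regression. The following is only a minimal sketch of that generic step, not the authors' estimator: it omits the EM part, censoring, and uncertain mutation status, and the function name `pava` and its arguments are hypothetical.

```python
def pava(values, weights=None):
    """Weighted least-squares fit of a nondecreasing sequence (generic PAVA).

    Illustrative only: this is the textbook isotonic-regression step,
    not the EM-based estimator described in the abstract.
    """
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block holds [weighted mean, total weight, point count]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge adjacent blocks while they violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted
```

Applied to pointwise risk estimates that dip between neighbouring ages, the fit replaces each violating run with its weighted average, so the resulting curve never decreases.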

    Direction-Projection-Permutation for High Dimensional Hypothesis Tests

    Motivated by the prevalence of high dimensional low sample size datasets in modern statistical applications, we propose a general nonparametric framework, Direction-Projection-Permutation (DiProPerm), for testing high dimensional hypotheses. The method is aimed at rigorous testing of whether lower dimensional visual differences are statistically significant. Theoretical analysis under the non-classical asymptotic regime of dimension going to infinity for fixed sample size reveals that certain natural variations of DiProPerm can have very different behaviors. An empirical power study both confirms the theoretical results and suggests DiProPerm is a powerful test in many settings. Finally, DiProPerm is applied to a high dimensional gene expression dataset.
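
The three steps named in the acronym (direction, projection, permutation) can be sketched as below. This is a hedged illustration rather than the paper's implementation: the mean-difference direction and the difference-of-projected-means statistic are assumed choices (the abstract notes that different variations can behave very differently), and the function name `diproperm` and its parameters are hypothetical.

```python
import numpy as np

def diproperm(X, Y, n_perm=500, seed=0):
    """Two-sample permutation test on 1-D projections (illustrative sketch).

    X, Y: arrays of shape (n_samples, n_features), one per class.
    Returns the observed statistic and a permutation p-value.
    """
    rng = np.random.default_rng(seed)

    def statistic(A, B):
        # Direction: normalized mean-difference vector (an assumed choice).
        d = A.mean(axis=0) - B.mean(axis=0)
        norm = np.linalg.norm(d)
        if norm == 0:
            return 0.0
        d = d / norm
        # Projection: scores along d; statistic: gap between projected means.
        return float((A @ d).mean() - (B @ d).mean())

    observed = statistic(X, Y)
    pooled = np.vstack([X, Y])
    n = len(X)
    exceed = 0
    for _ in range(n_perm):
        # Permutation: relabel, then recompute direction and statistic.
        idx = rng.permutation(len(pooled))
        if statistic(pooled[idx[:n]], pooled[idx[n:]]) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)
```

Recomputing the direction inside every permutation matters: fitting it once on the real labels and reusing it would make an apparent separation look significant even when it is an artifact of high dimension.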

    Minimum Sparsity of Unobservable Power Network Attacks

    Physical security of power networks under power injection attacks that alter generation and loads is studied. The system operator employs Phasor Measurement Units (PMUs) for detecting such attacks, while attackers devise attacks that are unobservable by such PMU networks. It is shown that, given the PMU locations, the solution to finding the sparsest unobservable attacks has a simple form with probability one, namely, $\kappa(G^M) + 1$, where $\kappa(G^M)$ is defined as the vulnerable vertex connectivity of an augmented graph. The constructive proof allows one to find the entire set of the sparsest unobservable attacks in polynomial time. Furthermore, a notion of the potential impact of unobservable attacks is introduced. With optimized PMU deployment, the sparsest unobservable attacks and their potential impact as functions of the number of PMUs are evaluated numerically for the IEEE 30, 57, 118 and 300-bus systems and the Polish 2383, 2737 and 3012-bus systems. It is observed that, as more PMUs are added, the maximum potential impact among all the sparsest unobservable attacks drops quickly until it reaches the minimum sparsity. Comment: submitted to IEEE Transactions on Automatic Control

    Modeling Persistent Trends in Distributions

    We present a nonparametric framework to model a short sequence of probability distributions that vary both due to underlying effects of sequential progression and confounding noise. To distinguish between these two types of variation and estimate the sequential-progression effects, our approach leverages an assumption that these effects follow a persistent trend. This work is motivated by the recent rise of single-cell RNA-sequencing experiments over a brief time course, which aim to identify genes relevant to the progression of a particular biological process across diverse cell populations. While classical statistical tools focus on scalar-response regression or order-agnostic differences between distributions, it is desirable in this setting to consider both the full distributions as well as the structure imposed by their ordering. We introduce a new regression model for ordinal covariates where responses are univariate distributions and the underlying relationship reflects consistent changes in the distributions over increasing levels of the covariate. This concept is formalized as a "trend" in distributions, which we define as an evolution that is linear under the Wasserstein metric. Implemented via a fast alternating projections algorithm, our method exhibits numerous strengths in simulations and analyses of single-cell gene expression data. Comment: To appear in: Journal of the American Statistical Association
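
For univariate distributions, the 2-Wasserstein distance reduces to the L2 distance between quantile functions, which is what makes a "linear" trend tractable in this setting. The sketch below computes that distance between two empirical samples. It assumes equal sample sizes for simplicity, and the function name `wasserstein2_1d` is hypothetical; it shows only the metric, not the paper's regression model or its alternating projections algorithm.

```python
import numpy as np

def wasserstein2_1d(x, y):
    """Empirical 2-Wasserstein distance between two 1-D samples.

    Sorting each sample gives its empirical quantiles, so the distance
    is the root-mean-square gap between matched order statistics.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        # Assumption of this sketch: both samples have the same size.
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.sqrt(np.mean((x - y) ** 2)))
```

Shifting a sample by a constant `c` moves it a distance of exactly `|c|` under this metric, which gives a quick sanity check.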