Handling Covariates in the Design of Clinical Trials
There has been a split in the statistics community about the need for taking
covariates into account in the design phase of a clinical trial. There are many
advocates of using stratification and covariate-adaptive randomization to
promote balance on certain known covariates. However, balance does not always
promote efficiency or ensure more patients are assigned to the better
treatment. We describe these procedures, including model-based procedures, for
incorporating covariates into the design of clinical trials, and give examples
where balance, efficiency and ethical considerations may be in conflict. We
advocate a new class of procedures, covariate-adjusted response-adaptive (CARA)
randomization procedures that attempt to optimize both efficiency and ethical
considerations, while maintaining randomization. We review all these
procedures, present a few new simulation studies, and conclude with our
philosophy.
Comment: Published at http://dx.doi.org/10.1214/08-STS269 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
Combining isotonic regression and EM algorithm to predict genetic risk under monotonicity constraint
In certain genetic studies, clinicians and genetic counselors are interested
in estimating the cumulative risk of a disease for individuals with and without
a rare deleterious mutation. Estimating the cumulative risk is difficult,
however, when the estimates are based on family history data. Often, the
genetic mutation status in many family members is unknown; instead, only
estimated probabilities of a patient having a certain mutation status are
available. Also, ages of disease-onset are subject to right censoring. Existing
methods to estimate the cumulative risk using such family-based data only
provide estimation at individual time points, and are not guaranteed to be
monotonic or nonnegative. In this paper, we develop a novel method that
combines Expectation-Maximization and isotonic regression to estimate the
cumulative risk across the entire support. Our estimator is monotonic,
satisfies self-consistent estimating equations and has high power in detecting
differences between the cumulative risks of different populations. Application
of our estimator to a Parkinson's disease (PD) study provides the age-at-onset
distribution of PD in PARK2 mutation carriers and noncarriers, and reveals a
significant difference between the distribution in compound heterozygous
carriers compared to noncarriers, but not between heterozygous carriers and
noncarriers.
Comment: Published at http://dx.doi.org/10.1214/14-AOAS730 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
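The monotonicity constraint mentioned in this abstract can be illustrated with plain isotonic regression. The block below is a minimal sketch of the pool-adjacent-violators algorithm in standard-library Python, assuming a list of noisy pointwise risk estimates; it is not the authors' estimator, which additionally uses an EM step for unknown mutation status and censoring. The names `pava` and `monotone_risk` are hypothetical.

```python
def pava(values, weights=None):
    """Weighted least-squares fit of a nondecreasing sequence to `values`
    using the pool-adjacent-violators algorithm."""
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block holds [weighted mean, total weight, point count]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def monotone_risk(estimates):
    """Project pointwise cumulative-risk estimates onto monotone values in [0, 1]."""
    return [min(1.0, max(0.0, r)) for r in pava(estimates)]


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    print(monotone_risk([-0.02, 0.10, 0.30, 0.20, 0.50]))
```

Clipping after the isotonic fit keeps the sequence nondecreasing, because clipping to an interval is itself a monotone map.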
Direction-Projection-Permutation for High Dimensional Hypothesis Tests
Motivated by the prevalence of high dimensional low sample size datasets in
modern statistical applications, we propose a general nonparametric framework,
Direction-Projection-Permutation (DiProPerm), for testing high dimensional
hypotheses. The method is aimed at rigorous testing of whether lower
dimensional visual differences are statistically significant. Theoretical
analysis under the non-classical asymptotic regime of dimension going to
infinity for fixed sample size reveals that certain natural variations of
DiProPerm can have very different behaviors. An empirical power study both
confirms the theoretical results and suggests DiProPerm is a powerful test in
many settings. Finally, DiProPerm is applied to a high dimensional gene
expression dataset.
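The three steps in the method's name can be sketched in a few lines. The block below is a minimal illustration in standard-library Python, assuming the mean-difference direction and the difference of projected means as the test statistic; the abstract notes that different variations can behave very differently, so this is one possible choice and not a reference implementation. The function names and the permutation count are hypothetical.

```python
import random


def mean_vector(rows):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def md_statistic(x, y):
    """Direction: unit mean-difference vector. Projection: inner products.
    Statistic: absolute difference of the projected group means."""
    mx, my = mean_vector(x), mean_vector(y)
    direction = [a - b for a, b in zip(mx, my)]
    norm = sum(d * d for d in direction) ** 0.5
    if norm == 0.0:
        return 0.0
    direction = [d / norm for d in direction]
    px = [sum(r * d for r, d in zip(row, direction)) for row in x]
    py = [sum(r * d for r, d in zip(row, direction)) for row in y]
    return abs(sum(px) / len(px) - sum(py) / len(py))


def diproperm_pvalue(x, y, n_perm=200, seed=0):
    """Permutation: relabel the pooled sample, recompute the direction and
    the statistic each time, and compare with the observed value."""
    rng = random.Random(seed)
    observed = md_statistic(x, y)
    pooled = list(x) + list(y)
    n_x = len(x)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if md_statistic(pooled[:n_x], pooled[n_x:]) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

Recomputing the direction inside every permutation matters: a direction fitted to the observed labels will separate almost any high dimensional, low sample size data, so the null distribution has to go through the same fitting step.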
Minimum Sparsity of Unobservable Power Network Attacks
Physical security of power networks under power injection attacks that alter
generation and loads is studied. The system operator employs Phasor Measurement
Units (PMUs) for detecting such attacks, while attackers devise attacks that
are unobservable by such PMU networks. It is shown that, given the PMU
locations, the solution to finding the sparsest unobservable attacks has a
simple form with probability one, expressed in terms of the vulnerable vertex
connectivity of an augmented graph. The constructive proof allows one to find
the entire set of the sparsest
unobservable attacks in polynomial time. Furthermore, a notion of the potential
impact of unobservable attacks is introduced. With optimized PMU deployment,
the sparsest unobservable attacks and their potential impact as functions of
the number of PMUs are evaluated numerically for the IEEE 30, 57, 118 and
300-bus systems and the Polish 2383, 2737 and 3012-bus systems. It is observed
that, as more PMUs are added, the maximum potential impact among all the
sparsest unobservable attacks drops quickly until it reaches the minimum
sparsity.
Comment: Submitted to IEEE Transactions on Automatic Control.
Modeling Persistent Trends in Distributions
We present a nonparametric framework to model a short sequence of probability
distributions that vary both due to underlying effects of sequential
progression and confounding noise. To distinguish between these two types of
variation and estimate the sequential-progression effects, our approach
leverages an assumption that these effects follow a persistent trend. This work
is motivated by the recent rise of single-cell RNA-sequencing experiments over
a brief time course, which aim to identify genes relevant to the progression of
a particular biological process across diverse cell populations. While
classical statistical tools focus on scalar-response regression or
order-agnostic differences between distributions, it is desirable in this
setting to consider both the full distributions as well as the structure
imposed by their ordering. We introduce a new regression model for ordinal
covariates where responses are univariate distributions and the underlying
relationship reflects consistent changes in the distributions over increasing
levels of the covariate. This concept is formalized as a "trend" in
distributions, which we define as an evolution that is linear under the
Wasserstein metric. Implemented via a fast alternating projections algorithm,
our method exhibits numerous strengths in simulations and analyses of
single-cell gene expression data.
Comment: To appear in the Journal of the American Statistical Association.
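For univariate distributions, the 2-Wasserstein distance equals the L2 distance between quantile functions, so an evolution that is linear under this metric moves each quantile linearly. The block below is a minimal standard-library Python sketch of that fact for equal-size samples; it is not the paper's regression model or its alternating projections algorithm, and the function names are hypothetical.

```python
def wasserstein2(xs, ys):
    """2-Wasserstein distance between two equal-size empirical samples,
    computed by matching sorted values (empirical quantiles)."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    a, b = sorted(xs), sorted(ys)
    return (sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)) ** 0.5


def interpolate(xs, ys, t):
    """Sample at position t in [0, 1] along the straight line between the
    two samples in quantile space."""
    a, b = sorted(xs), sorted(ys)
    return [(1 - t) * p + t * q for p, q in zip(a, b)]


if __name__ == "__main__":
    start, end = [0.0, 1.0, 2.0], [3.0, 4.0, 5.0]
    mid = interpolate(start, end, 0.5)
    # Distances add up along the path: d(start, mid) + d(mid, end) = d(start, end).
    print(wasserstein2(start, mid), wasserstein2(mid, end), wasserstein2(start, end))
```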