Pooled Association Tests for Rare Genetic Variants: A Review and Some New Results
In the search for genetic factors that are associated with complex heritable
human traits, considerable attention is now being focused on rare variants that
individually have small effects. In response, numerous recent papers have
proposed testing strategies to assess association between a group of rare
variants and a trait, with competing claims about the performance of various
tests. The power of a given test in fact depends on the nature of any
association and on the rareness of the variants in question. We review such
tests within a general framework that covers a wide range of genetic models and
types of data. We study the performance of specific tests through exact or
asymptotic power formulas and through novel simulation studies of over 10,000
different models. The tests considered are also applied to real sequence data
from the 1000 Genomes Project, as provided by GAW17. We recommend a testing
strategy, but our results show that power to detect association in plausible
genetic scenarios is low for studies of medium size unless a high proportion of
the chosen variants are causal. Consequently, considerable attention must be
given to relevant biological information that can guide the selection of
variants for testing.
Comment: Published at http://dx.doi.org/10.1214/13-STS456 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
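As an illustration of what a pooled test can look like, the following is a minimal sketch of a burden-style statistic, one simple member of this family and not necessarily the strategy the paper recommends. It assumes genotypes coded as minor-allele counts (0, 1, or 2) and a quantitative trait; the function names `burden_scores` and `burden_z_statistic` are hypothetical.

```python
import math


def burden_scores(genotypes):
    """Collapse each individual's rare-variant counts into one burden score."""
    return [sum(row) for row in genotypes]


def burden_z_statistic(genotypes, trait):
    """Score-type statistic sqrt(n) * r, where r is the Pearson correlation
    between burden score and trait. Under no association it is roughly
    standard normal for large n (an assumption of this sketch)."""
    scores = burden_scores(genotypes)
    n = len(scores)
    if n != len(trait) or n < 3:
        raise ValueError("need matching genotype rows and trait values, n >= 3")
    mean_s = sum(scores) / n
    mean_t = sum(trait) / n
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(scores, trait))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_t = sum((t - mean_t) ** 2 for t in trait)
    if var_s == 0 or var_t == 0:
        raise ValueError("burden scores and trait must both vary")
    r = cov / math.sqrt(var_s * var_t)
    return math.sqrt(n) * r
```

Because the score simply sums allele counts, non-causal variants in the pooled set add noise to it, which is consistent with the abstract's point that power is low unless a high proportion of the chosen variants are causal.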
SLOPE - Adaptive variable selection via convex optimization
We introduce a new estimator for the vector of coefficients $\beta$ in the
linear model $y = X\beta + z$, where $X$ has dimensions $n \times p$ with $p$
possibly larger than $n$. SLOPE, short for Sorted L-One Penalized Estimation,
is the solution to
\[ \min_{b \in \mathbb{R}^p} \frac{1}{2} \| y - Xb \|_{\ell_2}^2 + \lambda_1 |b|_{(1)} + \lambda_2 |b|_{(2)} + \cdots + \lambda_p |b|_{(p)}, \]
where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$
and $|b|_{(1)} \ge |b|_{(2)} \ge \cdots \ge |b|_{(p)}$ are the
decreasing absolute values of the entries of $b$. This is a convex program and
we demonstrate a solution algorithm whose computational complexity is roughly
comparable to that of classical $\ell_1$ procedures such as the Lasso. Here,
the regularizer is a sorted $\ell_1$ norm, which penalizes the regression
coefficients according to their rank: the higher the rank - that is, stronger
the signal - the larger the penalty. This is similar to the Benjamini and
Hochberg [J. Roy. Statist. Soc. Ser. B 57 (1995) 289-300] procedure (BH) which
compares more significant $p$-values with more stringent thresholds. One
notable choice of the sequence $\{\lambda_i\}$ is given by the BH critical
values $\lambda_{\mathrm{BH}}(i) = z(1 - i \cdot q/(2p))$, where $q \in (0,1)$ and
$z(\alpha)$ is the $\alpha$th quantile of a standard normal distribution. SLOPE aims to
provide finite sample guarantees on the selected model; of special interest is
the false discovery rate (FDR), defined as the expected proportion of
irrelevant regressors among all selected predictors. Under orthogonal designs,
SLOPE with $\lambda_{\mathrm{BH}}$ provably controls FDR at level $q$.
Moreover, it also appears to have appreciable inferential properties under more
general designs $X$ while having substantial power, as demonstrated in a series
of experiments run on both simulated and real data.
Comment: Published at http://dx.doi.org/10.1214/15-AOAS842 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
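To make the penalty concrete, here is a minimal sketch that evaluates the BH critical values, the sorted L1 penalty, and the SLOPE objective for a given coefficient vector. It only evaluates the objective and does not implement the paper's solution algorithm; the function names are hypothetical, and the design matrix is assumed to be a list of rows.

```python
from statistics import NormalDist


def bh_lambdas(p, q):
    """BH critical values: lambda_i = z(1 - i*q/(2p)) for i = 1..p."""
    if p < 1 or not 0 < q < 1:
        raise ValueError("need p >= 1 and 0 < q < 1")
    normal = NormalDist()  # standard normal distribution
    return [normal.inv_cdf(1 - i * q / (2 * p)) for i in range(1, p + 1)]


def sorted_l1_penalty(b, lambdas):
    """Sum of lambda_i * |b|_(i): the largest |b| gets the largest lambda."""
    if len(b) != len(lambdas):
        raise ValueError("b and lambdas must have the same length")
    abs_sorted = sorted((abs(x) for x in b), reverse=True)
    return sum(lam * a for lam, a in zip(lambdas, abs_sorted))


def slope_objective(y, X, b, lambdas):
    """0.5 * ||y - Xb||^2 plus the sorted L1 penalty (X is a list of rows)."""
    residuals = [
        yi - sum(xij * bj for xij, bj in zip(row, b)) for yi, row in zip(y, X)
    ]
    return 0.5 * sum(r * r for r in residuals) + sorted_l1_penalty(b, lambdas)
```

For example, `bh_lambdas(p, q)` yields a strictly decreasing, positive sequence, so the coefficient with the largest magnitude is always paired with the largest penalty weight regardless of its position in `b`.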
Change-Point Testing and Estimation for Risk Measures in Time Series
We investigate methods of change-point testing and confidence interval
construction for nonparametric estimators of expected shortfall and related
risk measures in weakly dependent time series. A key aspect of our work is the
ability to detect general multiple structural changes in the tails of time
series marginal distributions. Unlike extant approaches for detecting tail
structural changes using quantities such as tail index, our approach does not
require parametric modeling of the tail and detects more general changes in the
tail. Additionally, our methods are based on the recently introduced
self-normalization technique for time series, allowing for statistical analysis
without the issues of consistent standard error estimation. The theoretical
foundations for our methods are functional central limit theorems, which we
develop under weak assumptions. An empirical study of S&P 500 returns and US
30-Year Treasury bonds illustrates the practical use of our methods in
detecting and quantifying market instability via the tails of financial time
series during times of financial crisis.
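For orientation, a nonparametric estimate of expected shortfall can be sketched as the average of the observations at or beyond an empirical quantile. The code below follows one simple convention among several in use and is not the paper's estimator or its change-point procedure; it assumes the input is a sample of losses (for returns, the negated returns), and the function name `empirical_var_es` is hypothetical.

```python
import math


def empirical_var_es(losses, level=0.95):
    """Return (VaR, ES) at the given level from a sample of losses.

    VaR is the ceil(level * n)-th smallest loss; ES is the mean of all
    losses at or above that value.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0 < level < 1:
        raise ValueError("level must lie strictly between 0 and 1")
    ordered = sorted(losses)
    n = len(ordered)
    k = min(max(math.ceil(level * n), 1), n)  # 1-based order-statistic index
    var = ordered[k - 1]
    tail = [x for x in ordered if x >= var]  # never empty: contains var itself
    return var, sum(tail) / len(tail)
```

Because the estimate depends only on the tail observations, applying it to windows of a series before and after a candidate date gives a rough, informal view of a tail change; formal inference of the kind described in the abstract requires the self-normalized tests.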