Pooled Association Tests for Rare Genetic Variants: A Review and Some New Results

Author: Derkach Andriy
Lawless Jerry F.
Sun Lei
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 09/09/2014

In the search for genetic factors that are associated with complex heritable human traits, considerable attention is now being focused on rare variants that individually have small effects. In response, numerous recent papers have proposed testing strategies to assess association between a group of rare variants and a trait, with competing claims about the performance of various tests. The power of a given test in fact depends on the nature of any association and on the rareness of the variants in question. We review such tests within a general framework that covers a wide range of genetic models and types of data. We study the performance of specific tests through exact or asymptotic power formulas and through novel simulation studies of over 10,000 different models. The tests considered are also applied to real sequence data from the 1000 Genomes project and provided by the GAW17. We recommend a testing strategy, but our results show that power to detect association in plausible genetic scenarios is low for studies of medium size unless a high proportion of the chosen variants are causal. Consequently, considerable attention must be given to relevant biological information that can guide the selection of variants for testing.

Comment: Published at http://dx.doi.org/10.1214/13-STS456 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref

SLOPE - Adaptive variable selection via convex optimization

Author: Berg Ewout van den
Bogdan Małgorzata
Candès Emmanuel J.
Sabatti Chiara
Su Weijie
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2015

We introduce a new estimator for the vector of coefficients $\beta$ in the linear model $y=X\beta+z$, where $X$ has dimensions $n\times p$ with $p$ possibly larger than $n$. SLOPE, short for Sorted L-One Penalized Estimation, is the solution to

$$\min_{b\in\mathbb{R}^p}\frac{1}{2}\Vert y-Xb\Vert_{\ell_2}^2+\lambda_1\vert b\vert_{(1)}+\lambda_2\vert b\vert_{(2)}+\cdots+\lambda_p\vert b\vert_{(p)},$$

where $\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_p\ge0$ and $\vert b\vert_{(1)}\ge\vert b\vert_{(2)}\ge\cdots\ge\vert b\vert_{(p)}$ are the decreasing absolute values of the entries of $b$. This is a convex program and we demonstrate a solution algorithm whose computational complexity is roughly comparable to that of classical $\ell_1$ procedures such as the Lasso. Here, the regularizer is a sorted $\ell_1$ norm, which penalizes the regression coefficients according to their rank: the higher the rank (that is, the stronger the signal), the larger the penalty. This is similar to the Benjamini and Hochberg [J. Roy. Statist. Soc. Ser. B 57 (1995) 289-300] procedure (BH), which compares more significant $p$-values with more stringent thresholds. One notable choice of the sequence $\{\lambda_i\}$ is given by the BH critical values $\lambda_{\mathrm{BH}}(i)=z(1-i\cdot q/2p)$, where $q\in(0,1)$ and $z(\alpha)$ is the $\alpha$th quantile of a standard normal distribution. SLOPE aims to provide finite sample guarantees on the selected model; of special interest is the false discovery rate (FDR), defined as the expected proportion of irrelevant regressors among all selected predictors. Under orthogonal designs, SLOPE with $\lambda_{\mathrm{BH}}$ provably controls FDR at level $q$. Moreover, it also appears to have appreciable inferential properties under more general designs $X$ while having substantial power, as demonstrated in a series of experiments running on both simulated and real data.

Comment: Published at http://dx.doi.org/10.1214/15-AOAS842 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
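The sorted penalty and the BH critical values described in this abstract can be illustrated with a small sketch. It only evaluates the SLOPE objective for a given candidate vector; it is not the solution algorithm the authors refer to. The function names (`bh_lambdas`, `sorted_l1_norm`, `slope_objective`) are hypothetical and chosen for illustration, and `z` is taken to be the standard normal quantile function.

```python
from statistics import NormalDist


def bh_lambdas(p, q):
    """BH critical values lambda_BH(i) = z(1 - i*q/(2p)) for i = 1..p.

    z is assumed to be the standard normal quantile function.
    """
    z = NormalDist().inv_cdf
    return [z(1 - i * q / (2 * p)) for i in range(1, p + 1)]


def sorted_l1_norm(b, lambdas):
    """Sorted L1 penalty: sum_i lambda_i * |b|_(i).

    |b|_(1) >= ... >= |b|_(p) are the absolute entries of b in decreasing
    order, so the largest entry is paired with the largest lambda.
    """
    abs_sorted = sorted((abs(x) for x in b), reverse=True)
    return sum(lam * a for lam, a in zip(lambdas, abs_sorted))


def slope_objective(y, X, b, lambdas):
    """Value of 0.5 * ||y - Xb||_2^2 + sorted L1 penalty at the point b.

    X is a list of rows; this evaluates the objective and does not minimize it.
    """
    residuals = [
        yi - sum(xij * bj for xij, bj in zip(row, b))
        for yi, row in zip(y, X)
    ]
    return 0.5 * sum(r * r for r in residuals) + sorted_l1_norm(b, lambdas)
```

With all $\lambda_i$ equal, the penalty reduces to a multiple of the ordinary $\ell_1$ norm, which is one way to see the Lasso as a special case of this objective.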

arXiv.org e-Print Archive

PubMed Central

ScholarlyCommons@Penn

Change-Point Testing and Estimation for Risk Measures in Time Series

Author: Fan Lin
Glynn Peter W.
Pelger Markus
Publication date: 07/09/2018

We investigate methods of change-point testing and confidence interval construction for nonparametric estimators of expected shortfall and related risk measures in weakly dependent time series. A key aspect of our work is the ability to detect general multiple structural changes in the tails of time series marginal distributions. Unlike extant approaches for detecting tail structural changes using quantities such as tail index, our approach does not require parametric modeling of the tail and detects more general changes in the tail. Additionally, our methods are based on the recently introduced self-normalization technique for time series, allowing for statistical analysis without the issues of consistent standard error estimation. The theoretical foundations for our methods are functional central limit theorems, which we develop under weak assumptions. An empirical study of S&P 500 returns and US 30-Year Treasury bonds illustrates the practical use of our methods in detecting and quantifying market instability via the tails of financial time series during times of financial crisis.
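For readers unfamiliar with the quantity being monitored, a nonparametric expected shortfall estimate can be sketched as the average of the worst observations in a sample. The convention below (lower tail of returns, worst `ceil(alpha * n)` observations) is an assumption made for illustration and is not necessarily the estimator used by the authors; the function name `expected_shortfall` is hypothetical. The sketch covers only the point estimate, not the change-point test or the self-normalization step.

```python
import math


def expected_shortfall(returns, alpha):
    """Empirical expected shortfall at tail level alpha.

    Assumed convention: the mean of the worst ceil(alpha * n) returns,
    i.e. the average over the lower tail of the empirical distribution.
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    k = max(1, math.ceil(alpha * len(returns)))
    worst = sorted(returns)[:k]  # the k smallest (most negative) returns
    return sum(worst) / k
```

Because the estimate is a plain tail average, it makes no parametric assumption about the tail, which matches the nonparametric setting described in the abstract.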

arXiv.org e-Print Archive