8,470 research outputs found
Resampling-based confidence regions and multiple tests for a correlated random vector
We derive non-asymptotic confidence regions for the mean of a random vector
whose coordinates have an unknown dependence structure. The random vector is
assumed to be either Gaussian or to have a symmetric bounded distribution, and
we observe i.i.d. copies of it. The confidence regions are built using a
data-dependent threshold based on a weighted bootstrap procedure. We consider
two approaches: the first based on a concentration argument and the second on a
direct bootstrapped quantile. The first allows us to handle a very large class
of resampling weights, while our results for the second are restricted to
Rademacher weights; however, the second method seems more accurate in practice.
Our results are motivated by multiple testing problems, and we show on
simulations that our procedures outperform the Bonferroni procedure (union
bound) as soon as the observed vector has sufficiently correlated coordinates.
Comment: submitted to COL
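The general idea above can be sketched in a few lines: resample a sup-norm statistic with Rademacher weights to get a data-dependent threshold, and compare it with the Bonferroni threshold. This is a minimal illustrative simulation, not the authors' exact procedure; the sample sizes, correlation level, and weight centering are assumptions for the sketch.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)

# Toy data: n i.i.d. copies of a K-dimensional Gaussian vector with
# strongly correlated coordinates (illustrative, not from the paper).
n, K, rho = 50, 200, 0.9
cov = rho * np.ones((K, K)) + (1 - rho) * np.eye(K)
X = rng.multivariate_normal(np.zeros(K), cov, size=n)

mean = X.mean(axis=0)
B = 1000  # number of resampling draws
alpha = 0.05

# Rademacher-weighted bootstrap of the centered sample: the resampled
# statistic is the sup-norm of the weighted mean of centered observations.
sup_stats = np.empty(B)
for b in range(B):
    w = rng.choice([-1.0, 1.0], size=n)  # Rademacher weights
    sup_stats[b] = np.abs((w[:, None] * (X - mean)).mean(axis=0)).max()

t_boot = np.quantile(sup_stats, 1 - alpha)

# Bonferroni (union bound) threshold for comparison, Gaussian case:
sigma = X.std(axis=0, ddof=1).max()
t_bonf = sigma * NormalDist().inv_cdf(1 - alpha / (2 * K)) / np.sqrt(n)

print(f"bootstrap threshold:  {t_boot:.4f}")
print(f"Bonferroni threshold: {t_bonf:.4f}")
```

With strongly correlated coordinates the resampled threshold adapts to the dependence and comes out noticeably smaller than the Bonferroni one, which treats the K coordinates as if they were independent.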
Some nonasymptotic results on resampling in high dimension, I: Confidence regions, II: Multiple tests
We study generalized bootstrap confidence regions for the mean of a random
vector whose coordinates have an unknown dependency structure. The random
vector is supposed to be either Gaussian or to have a symmetric and bounded
distribution. The dimensionality of the vector can possibly be much larger than
the number of observations and we focus on a nonasymptotic control of the
confidence level, following ideas inspired by recent results in learning
theory. We consider two approaches, the first based on a concentration
principle (valid for a large class of resampling weights) and the second on a
resampled quantile, specifically using Rademacher weights. Several intermediate
results established in the approach based on concentration principles are of
interest in their own right. We also discuss the question of accuracy when
using Monte Carlo approximations of the resampled quantities.
Comment: Published at http://dx.doi.org/10.1214/08-AOS667 and
http://dx.doi.org/10.1214/08-AOS668 in the Annals of Statistics
(http://www.imstat.org/aos/) by the Institute of Mathematical Statistics
(http://www.imstat.org)
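The "generalized bootstrap" with a large class of resampling weights can be sketched generically: draw exchangeable weights, center them, and resample the sup-norm of the weighted mean of centered data. The three weight schemes below (Rademacher, Gaussian, Efron/multinomial) are standard illustrative choices, not the paper's exact conditions on the weights.

```python
import numpy as np

rng = np.random.default_rng(1)
n, K, B = 40, 100, 500
X = rng.standard_normal((n, K))  # toy data; coordinates independent here
Xc = X - X.mean(axis=0)

def resampled_sup(weights_fn):
    """Sup-norm statistics of the weight-resampled mean of centered data."""
    stats = np.empty(B)
    for b in range(B):
        w = weights_fn(n)
        w = w - w.mean()  # center the weights so the statistic is location-free
        stats[b] = np.abs((w[:, None] * Xc).mean(axis=0)).max()
    return stats

# Illustrative resampling-weight schemes (assumed for this sketch):
schemes = {
    "rademacher": lambda n: rng.choice([-1.0, 1.0], size=n),
    "gaussian":   lambda n: rng.standard_normal(n),
    "efron":      lambda n: rng.multinomial(n, np.full(n, 1.0 / n)).astype(float),
}

quantiles = {}
for name, fn in schemes.items():
    quantiles[name] = np.quantile(resampled_sup(fn), 0.95)
    print(f"{name:10s} 95% resampled quantile: {quantiles[name]:.4f}")
```

In practice the resampled quantile itself is only approximated by B Monte Carlo draws, which is exactly the accuracy question the abstract raises.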
Small sample sizes: A big data problem in high-dimensional data analysis
Acknowledgements: The authors are grateful to the Editor, Associate Editor and three anonymous referees for their helpful suggestions, which greatly improved the manuscript. Funding: The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: the research is supported by the German Science Foundation awards number DFG KO 4680/3-2 and PA 2409/3-2.
Accelerating Permutation Testing in Voxel-wise Analysis through Subspace Tracking: A new plugin for SnPM
Permutation testing is a non-parametric method for obtaining the max null
distribution used to compute corrected p-values that provide strong control
of false positives. In neuroimaging, however, the computational burden of
running such an algorithm can be significant. We find that by viewing the
permutation testing procedure as the construction of a very large permutation
testing matrix, one can exploit structural properties derived from the
data and the test statistics to reduce the runtime under certain conditions. In
particular, we see that this matrix is low-rank plus a low-variance residual.
This makes it a good candidate for low-rank matrix completion, where only a
very small number of its entries (a small fraction of all entries in our
experiments) have to be computed to obtain a good estimate. Based on this
observation, we present RapidPT, an algorithm that efficiently recovers the max
null distribution commonly obtained through regular permutation testing in
voxel-wise analysis. We present an extensive validation on a synthetic dataset
and four datasets of varying size against two baselines: Statistical
NonParametric Mapping (SnPM13) and a standard permutation testing
implementation (referred to as NaivePT). We find that RapidPT achieves its best
runtime performance on medium-sized datasets, with speedups of 1.5x-38x
(vs. SnPM13) and 20x-1000x (vs. NaivePT). For larger datasets RapidPT
outperforms NaivePT (6x-200x) on all datasets, and provides large speedups over
SnPM13 when more than 10000 permutations are needed (2x-15x). The
implementation is a standalone toolbox and is also integrated within SnPM13,
able to leverage multi-core architectures when available.
Comment: 36 pages, 16 figures
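The baseline that RapidPT accelerates can be sketched as follows: each permutation of the group labels yields one row of voxel-wise test statistics, and the maximum over voxels contributes one sample to the max null distribution. This is a minimal stand-in for what the abstract calls NaivePT, on toy two-group data; the sizes and the t statistic are illustrative assumptions, and the matrix-completion acceleration itself is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy "voxel-wise" two-group data (illustrative sizes, not a real dataset):
# (n1 + n2) subjects x V voxels, with group labels to permute.
n1, n2, V = 20, 20, 500
data = rng.standard_normal((n1 + n2, V))
labels = np.array([0] * n1 + [1] * n2)

def max_t(data, labels):
    """Maximum absolute two-sample t statistic across voxels."""
    a, b = data[labels == 0], data[labels == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs((a.mean(axis=0) - b.mean(axis=0)) / se).max()

observed = max_t(data, labels)

# Naive permutation testing: each permutation contributes one sample of the
# max null distribution. RapidPT instead recovers this distribution after
# computing only a small fraction of the permutation-by-voxel statistic
# matrix, via low-rank matrix completion.
P = 1000
max_null = np.array([max_t(data, rng.permutation(labels)) for _ in range(P)])

# Corrected p-value via the max statistic (multiplicity-adjusted):
p_corr = (1 + np.sum(max_null >= observed)) / (P + 1)
print(f"corrected p-value: {p_corr:.3f}")
```

The cost of the naive loop is P full passes over all voxels, which is why the permutation-by-voxel matrix view, and completing it from few entries, pays off at neuroimaging scale.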