Cramér type moderate deviation theorems for self-normalized processes
Cramér type moderate deviation theorems quantify the accuracy of the
relative error of the normal approximation and provide theoretical
justifications for many commonly used methods in statistics. In this paper, we
develop a new randomized concentration inequality and establish a Cramér type
moderate deviation theorem for general self-normalized processes which include
many well-known Studentized nonlinear statistics. In particular, a sharp
moderate deviation theorem under optimal moment conditions is established for
Studentized U-statistics.
Comment: Published at http://dx.doi.org/10.3150/15-BEJ719 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
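For orientation, the display below sketches what a Cramér type moderate deviation statement looks like in the simplest self-normalized setting. It is a classical illustration, not the theorem of this paper: the symbols S_n, V_n and the i.i.d. assumption with mean zero and finite third moment are assumptions introduced here.

```latex
% Assumed setting: X_1, ..., X_n i.i.d., E X_1 = 0, E|X_1|^3 < \infty,
% S_n = \sum_{i=1}^n X_i, \quad V_n^2 = \sum_{i=1}^n X_i^2.
\[
  \frac{P\left(S_n / V_n \ge x\right)}{1 - \Phi(x)} \longrightarrow 1
  \quad \text{uniformly in } 0 \le x \le o\!\left(n^{1/6}\right),
\]
% where \Phi is the standard normal distribution function.
```

The ratio form is what "relative error of the normal approximation" refers to: the tail probability is compared with the normal tail multiplicatively, which matters when both are very small.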
Are Discoveries Spurious? Distributions of Maximum Spurious Correlations and Their Applications
Over the last two decades, many exciting variable selection methods have been
developed for finding a small group of covariates that are associated with the
response from a large pool. Can the discoveries from these data mining
approaches be spurious due to high dimensionality and limited sample size? Can
our fundamental assumptions about the exogeneity of the covariates needed for
such variable selection be validated with the data? To answer these questions,
we need to derive the distributions of the maximum spurious correlations given
a certain number of predictors, namely, the distribution of the correlation of
a response variable Y with the best s linear combinations of p covariates X,
even when X and Y are independent. When the covariance matrix of X possesses
the restricted eigenvalue property, we derive such distributions for both a
finite s and a diverging s, using Gaussian approximation and empirical process
techniques. However, such a distribution depends on the unknown covariance
matrix of X. Hence,
we use the multiplier bootstrap procedure to approximate the unknown
distributions and establish the consistency of such a simple bootstrap
approach. The results are further extended to the situation where the residuals
are from regularized fits. Our approach is then used to construct the upper
confidence limit for the maximum spurious correlation and to test the
exogeneity of the covariates. The former provides a baseline for guarding
against false discoveries and the latter tests whether our fundamental
assumptions for high-dimensional model selection are statistically valid. Our
techniques and results are illustrated with both numerical examples and real
data analysis.
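The phenomenon described in this abstract can be illustrated with a small simulation. The sketch below is not the paper's multiplier bootstrap procedure: it assumes the simplest case of a single selected covariate (s = 1) and independent standard Gaussian data, and the function names are hypothetical. It draws covariates that are independent of the response and records the largest absolute sample correlation found among them.

```python
import numpy as np


def max_abs_correlation(X, y):
    """Largest absolute sample correlation between y and any single column of X."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Pearson correlation of every column of X with y, computed in one pass.
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return float(np.max(np.abs(corr)))


def simulate_null_max_correlation(n, p, n_rep=200, seed=0):
    """Monte Carlo draws of the maximum spurious correlation (s = 1 case).

    Assumption for illustration only: X and y are independent standard
    Gaussian, so every observed correlation is spurious by construction.
    """
    rng = np.random.default_rng(seed)
    draws = np.empty(n_rep)
    for r in range(n_rep):
        X = rng.standard_normal((n, p))
        y = rng.standard_normal(n)
        draws[r] = max_abs_correlation(X, y)
    return draws


if __name__ == "__main__":
    draws = simulate_null_max_correlation(n=50, p=1000)
    print("median of max |corr|:", np.median(draws))
    print("empirical 95% quantile:", np.quantile(draws, 0.95))
```

An upper quantile of these draws plays the role of a baseline: an observed correlation below it is not distinguishable from a spurious one under the assumed design. With a non-identity covariance matrix of X the distribution changes, which is why the abstract turns to a bootstrap approximation.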
Cramér-type moderate deviations for Studentized two-sample U-statistics with applications
Two-sample U-statistics are widely used in a broad range of applications,
including those in the fields of biostatistics and econometrics. In this paper,
we establish sharp Cramér-type moderate deviation theorems for Studentized
two-sample U-statistics in a general framework, including the two-sample
t-statistic and Studentized Mann-Whitney test statistic as prototypical
examples. In particular, a refined moderate deviation theorem with second-order
accuracy is established for the two-sample t-statistic. These results extend
the applicability of the existing statistical methodologies from the one-sample
t-statistic to more general nonlinear statistics. Applications to two-sample
large-scale multiple testing problems with false discovery rate control and the
regularized bootstrap method are also discussed.
Comment: Published at http://dx.doi.org/10.1214/15-AOS1375 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
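As a concrete reference point for the two-sample t-statistic named in this abstract, the sketch below computes a Studentized difference of means and the normal upper-tail probability used to calibrate it. It assumes the unpooled (Welch) form of the statistic; the function names are hypothetical and the abstract does not specify this exact form.

```python
import math

import numpy as np


def two_sample_t(x, y):
    """Studentized difference of sample means, unpooled variance (assumed form)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standard error built from the two unbiased sample variances.
    se = math.sqrt(x.var(ddof=1) / x.size + y.var(ddof=1) / y.size)
    return (x.mean() - y.mean()) / se


def normal_upper_tail(t):
    """1 - Phi(t) for the standard normal, via the complementary error function."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(40) + 0.5
    y = rng.standard_normal(60)
    t = two_sample_t(x, y)
    print("t-statistic:", t)
    print("normal-approximation one-sided p-value:", normal_upper_tail(t))
```

In large-scale multiple testing the p-values of interest are very small, so what matters is the relative accuracy of `normal_upper_tail(t)` far out in the tail. That is the regime a moderate deviation theorem addresses.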