Stability
Reproducibility is imperative for any scientific discovery. More often than
not, modern scientific findings rely on statistical analysis of
high-dimensional data. At a minimum, reproducibility manifests itself in
stability of statistical results relative to "reasonable" perturbations to data
and to the model used. Jackknife, bootstrap, and cross-validation are based on
perturbations to data, while robust statistics methods deal with perturbations
to models. In this article, a case is made for the importance of stability in
statistics. Firstly, we motivate the necessity of stability for interpretable
and reliable encoding models from brain fMRI signals. Secondly, we find strong
evidence in the literature to demonstrate the central role of stability in
statistical inference, such as sensitivity analysis and effect detection.
Thirdly, a smoothing parameter selector based on estimation stability (ES),
ES-CV, is proposed for Lasso, in order to bring stability to bear on
cross-validation (CV). ES-CV is then utilized in the encoding models to reduce
the number of predictors by 60% with almost no loss (1.3%) of prediction
performance across over 2,000 voxels. Last, a novel "stability" argument is
seen to drive new results that shed light on the intriguing interactions
between sample to sample variability and heavier tail error distribution (e.g.,
double-exponential) in high-dimensional regression models with p predictors
and n independent samples. In particular, when the ratio p/n converges to a
constant in (0.3, 1) and the error distribution is double-exponential, the
Ordinary Least Squares (OLS) is a better estimator than
the Least Absolute Deviation (LAD) estimator.
Comment: Published at http://dx.doi.org/10.3150/13-BEJSP14 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
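To make the ES-CV idea concrete, the following is a minimal Python sketch of
one plausible way to combine an estimation-stability statistic with
cross-validation when selecting the Lasso penalty. It is not the paper's
implementation: the function name es_cv_lasso, the fold-based stability
statistic, and the rule of minimizing that statistic among penalties at least
as strong as the CV choice are all assumptions made for illustration.

```python
# Hypothetical sketch, not the paper's code: one plausible way to combine an
# estimation-stability (ES) statistic with cross-validation (CV) when choosing
# the Lasso penalty.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold

def es_cv_lasso(X, y, alphas, n_splits=5, seed=0):
    """Return a penalty from `alphas` (X, y assumed to be numpy arrays)."""
    alphas = np.asarray(alphas)
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    cv_err = np.zeros(len(alphas))
    es_stat = np.zeros(len(alphas))
    for i, a in enumerate(alphas):
        fold_fits, fold_errs = [], []
        for train, test in kf.split(X):
            model = Lasso(alpha=a, max_iter=10000).fit(X[train], y[train])
            fold_errs.append(np.mean((y[test] - model.predict(X[test])) ** 2))
            fold_fits.append(model.predict(X))      # fitted values on all samples
        fits = np.array(fold_fits)
        mean_fit = fits.mean(axis=0)
        cv_err[i] = np.mean(fold_errs)
        # instability of the fit across data perturbations, relative to its size
        es_stat[i] = fits.var(axis=0).sum() / max(np.sum(mean_fit ** 2), 1e-12)
    # keep prediction performance: only consider penalties at least as strong
    # as the CV minimizer, then pick the most stable fit among them
    a_cv = alphas[np.argmin(cv_err)]
    candidates = np.where(alphas >= a_cv)[0]
    return alphas[candidates[np.argmin(es_stat[candidates])]]

# toy usage with synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = X[:, :5] @ rng.normal(size=5) + rng.normal(size=200)
print(es_cv_lasso(X, y, alphas=np.logspace(-3, 0, 20)))
```

The intent mirrors the abstract: stay close to the CV optimum in prediction
error while preferring the more stable, and typically sparser, fit.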
Comment: Monitoring Networked Applications With Incremental Quantile
Estimation [arXiv:0708.0302]
Comment: Published at http://dx.doi.org/10.1214/088342306000000628 in the
Statistical Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
The shuffle estimator for explainable variance in fMRI experiments
In computational neuroscience, it is important to estimate well the
proportion of signal variance in the total variance of neural activity
measurements. This explainable variance measure helps neuroscientists assess
the adequacy of predictive models that describe how images are encoded in the
brain. Complicating the estimation problem are strong noise correlations, which
may confound the neural responses corresponding to the stimuli. If not properly
taken into account, the correlations could inflate the explainable variance
estimates and suggest prediction accuracies that are not actually attainable.
We propose a novel
method to estimate the explainable variance in functional MRI (fMRI) brain
activity measurements when there are strong correlations in the noise. Our
shuffle estimator is nonparametric, unbiased, and built upon the random effect
model reflecting the randomization in the fMRI data collection process.
Leveraging symmetries in the measurements, our estimator is obtained by
appropriately permuting the measurement vector in such a way that the noise
covariance structure is intact but the explainable variance is changed after
the permutation. This difference is then used to estimate the explainable
variance. We validate the properties of the proposed method in simulation
experiments. For the image-fMRI data, we show that the shuffle estimates can
explain the variation in prediction accuracy for voxels within the primary
visual cortex (V1) better than alternative parametric methods.
Comment: Published at http://dx.doi.org/10.1214/13-AOAS681 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
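As a rough illustration of the permutation idea in this abstract, the sketch
below applies a shuffle-style correction in a toy setting: responses arranged
as repeats by stimuli, with noise assumed exchangeable within each repeat so
that permuting stimulus labels preserves the noise distribution while breaking
the alignment of the signal across repeats. The function names, the
permutation scheme, and the finite-repeat correction factor are assumptions
for this toy setting, not the estimator developed in the paper.

```python
# Hypothetical sketch, not the paper's estimator: a shuffle-style estimate of
# signal (explainable) variance for responses of shape (n_repeats, n_stimuli),
# assuming the noise within each repeat is exchangeable so that permuting
# stimulus labels leaves its distribution unchanged.
import numpy as np

def between_stimulus_var(Y):
    # variance across stimuli of the response averaged over repeats
    return np.var(Y.mean(axis=0), ddof=1)

def shuffle_explainable_variance(Y, n_shuffles=200, seed=0):
    rng = np.random.default_rng(seed)
    n_repeats = Y.shape[0]
    observed = between_stimulus_var(Y)                 # signal plus noise contribution
    shuffled = np.empty(n_shuffles)
    for k in range(n_shuffles):
        Ys = np.array([rng.permutation(row) for row in Y])  # relabel stimuli within each repeat
        shuffled[k] = between_stimulus_var(Ys)         # noise contribution kept, signal decoupled
    # under the exchangeability assumption the difference recovers roughly
    # (1 - 1/n_repeats) of the signal variance, hence the correction factor
    return (observed - shuffled.mean()) * n_repeats / (n_repeats - 1)

# toy usage: 10 repeats of 120 stimuli with unit-variance noise
rng = np.random.default_rng(1)
signal = rng.normal(size=120)
Y = signal[None, :] + rng.normal(size=(10, 120))
print(shuffle_explainable_variance(Y))   # roughly np.var(signal, ddof=1)
```

The fMRI setting in the paper is more involved, with noise correlated over
time within runs and permutations chosen to respect that structure, but the
core idea is the same: the permutation preserves the noise contribution while
removing the signal alignment, and the difference isolates the explainable
variance.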
Number of paths versus number of basis functions in American option pricing
An American option grants the holder the right to select the time at which to
exercise the option, so pricing an American option entails solving an optimal
stopping problem. Difficulties in applying standard numerical methods to
complex pricing problems have motivated the development of techniques that
combine Monte Carlo simulation with dynamic programming. One class of methods
approximates the option value at each time using a linear combination of basis
functions, and combines Monte Carlo with backward induction to estimate optimal
coefficients in each approximation. We analyze the convergence of such a method
as both the number of basis functions and the number of simulated paths
increase. We get explicit results when the basis functions are polynomials and
the underlying process is either Brownian motion or geometric Brownian motion.
We show that the number of paths required for worst-case convergence grows
exponentially in the degree of the approximating polynomials in the case of
Brownian motion and faster in the case of geometric Brownian motion.
Comment: Published at http://dx.doi.org/10.1214/105051604000000846 in the
Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute
of Mathematical Statistics (http://www.imstat.org)
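The class of methods analyzed here (regression of continuation values on
basis functions, combined with backward induction over simulated paths)
includes the well-known Longstaff-Schwartz least-squares Monte Carlo
algorithm. The sketch below prices a Bermudan put under geometric Brownian
motion with a polynomial basis; it is an illustration of the general
technique, not the estimator studied in the paper, and every parameter value
is made up.

```python
# Illustrative sketch of the class of methods described above (least-squares
# Monte Carlo in the style of Longstaff-Schwartz): Bermudan put under
# geometric Brownian motion with a polynomial basis. Parameter values are
# made up.
import numpy as np

def lsm_put_price(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0,
                  n_steps=50, n_paths=50_000, degree=3, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    disc = np.exp(-r * dt)
    # simulate geometric Brownian motion paths at the exercise dates
    z = rng.standard_normal((n_paths, n_steps))
    S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt
                              + sigma * np.sqrt(dt) * z, axis=1))
    cash = np.maximum(K - S[:, -1], 0.0)         # payoff if held to maturity
    # backward induction: regress the discounted continuation value on basis
    # functions of the current price, and exercise where it is beaten
    for t in range(n_steps - 2, -1, -1):
        cash *= disc                             # value of future cashflows at step t
        itm = (K - S[:, t]) > 0                  # regress on in-the-money paths only
        if not np.any(itm):
            continue
        x = S[itm, t]
        basis = np.vander(x, degree + 1)         # polynomial basis 1, x, ..., x^degree
        coef, *_ = np.linalg.lstsq(basis, cash[itm], rcond=None)
        continuation = basis @ coef
        exercise = K - x
        stop = exercise > continuation
        cash[np.where(itm)[0][stop]] = exercise[stop]
    return disc * cash.mean()                    # discount from the first exercise date to time 0

print(lsm_put_price())
```

Consistent with the convergence result in the abstract, raising the polynomial
degree enlarges the basis and requires a rapidly growing number of simulated
paths before the regression coefficients, and hence the price estimate,
stabilize.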