Stability
Reproducibility is imperative for any scientific discovery. More often than
not, modern scientific findings rely on statistical analysis of
high-dimensional data. At a minimum, reproducibility manifests itself in
stability of statistical results relative to "reasonable" perturbations to data
and to the model used. Jackknife, bootstrap, and cross-validation are based on
perturbations to data, while robust statistics methods deal with perturbations
to models. In this article, a case is made for the importance of stability in
statistics. Firstly, we motivate the necessity of stability for interpretable
and reliable encoding models from brain fMRI signals. Secondly, we find strong
evidence in the literature to demonstrate the central role of stability in
statistical inference, such as sensitivity analysis and effect detection.
Thirdly, a smoothing parameter selector based on estimation stability (ES),
ES-CV, is proposed for Lasso, in order to bring stability to bear on
cross-validation (CV). ES-CV is then utilized in the encoding models to reduce
the number of predictors by 60% with almost no loss (1.3%) of prediction
performance across over 2,000 voxels. Lastly, a novel "stability" argument is
seen to drive new results that shed light on the intriguing interactions
between sample to sample variability and heavier tail error distribution (e.g.,
double-exponential) in high-dimensional regression models with p predictors
and n independent samples. In particular, when the ratio p/n converges to a
sufficiently large constant less than one and the error distribution is
double-exponential, Ordinary Least Squares (OLS) is a better estimator than
the Least Absolute Deviation (LAD) estimator.
Comment: Published in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the
International Statistical Institute/Bernoulli Society
(http://isi.cbs.nl/BS/bshome.htm) at http://dx.doi.org/10.3150/13-BEJSP14
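The estimation-stability idea behind ES-CV can be sketched in a few lines: fit the Lasso on several data splits, measure how much the fitted signal varies across splits relative to its average size, and prefer penalties for which this ratio is small. The solver, split scheme, and penalty grid below are illustrative choices, not the paper's exact ES-CV procedure, which combines this statistic with cross-validation.

```python
import numpy as np

rng = np.random.default_rng(0)

def lasso_cd(X, y, lam, n_iter=100):
    """Tiny coordinate-descent solver for 0.5*||y - X b||^2 + lam*||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            resid = y - X @ beta + X[:, j] * beta[j]  # residual excluding feature j
            rho = X[:, j] @ resid
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return beta

def es_statistic(X, y, lam, n_splits=4):
    """ES(lam): variance of the fitted signal across data splits,
    normalised by the squared norm of the average fit."""
    folds = np.array_split(rng.permutation(len(y)), n_splits)
    fits = []
    for hold in folds:
        keep = np.setdiff1d(np.arange(len(y)), hold)
        beta = lasso_cd(X[keep], y[keep], lam)
        fits.append(X @ beta)  # fitted signal on the full design
    fits = np.array(fits)
    mbar = fits.mean(axis=0)
    return fits.var(axis=0).sum() / max(mbar @ mbar, 1e-12)

# Synthetic sparse problem: 3 true predictors out of 20 (made-up sizes).
n, p = 80, 20
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + rng.standard_normal(n)

lams = [0.5, 2.0, 8.0, 32.0]
scores = {lam: es_statistic(X, y, lam) for lam in lams}
best = min(scores, key=scores.get)  # most estimation-stable penalty on this grid
```

In the full ES-CV rule the search is restricted so that the chosen penalty is at least as large as the cross-validation choice, which is what yields the reported 60% reduction in predictors at almost no cost in prediction.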
Detecting Unspecified Structure in Low-Count Images
Unexpected structure in images of astronomical sources often presents itself
upon visual inspection of the image, but such apparent structure may either
correspond to true features in the source or be due to noise in the data. This
paper presents a method for testing whether inferred structure in an image with
Poisson noise represents a significant departure from a baseline (null) model
of the image. To infer image structure, we conduct a Bayesian analysis of a
full model that uses a multiscale component to allow flexible departures from
the posited null model. As a test statistic, we use a tail probability of the
posterior distribution under the full model. This choice of test statistic
allows us to estimate a computationally efficient upper bound on a p-value that
enables us to draw strong conclusions even when there are limited computational
resources that can be devoted to simulations under the null model. We
demonstrate the statistical performance of our method on simulated images.
Applying our method to an X-ray image of the quasar 0730+257, we find
significant evidence against the null model of a single point source and
uniform background, lending support to the claim of an X-ray jet.
Searching dark-matter halos in the GaBoDS survey
We apply the linear filter for the weak-lensing signal of dark-matter halos
developed in Maturi et al. (2005) to the cosmic-shear data extracted from the
Garching-Bonn-Deep-Survey (GaBoDS). We wish to search for dark-matter halos
through weak-lensing signatures which are significantly above the random and
systematic noise level caused by intervening large-scale structures. We employ
a linear matched filter which maximises the signal-to-noise ratio by minimising
the number of spurious detections caused by the superposition of large-scale
structures (LSS). This is achieved by suppressing those spatial frequencies
dominated by the LSS contamination. We confirm the improved stability and
reliability of the detections achieved with our new filter compared to the
commonly-used aperture mass (Schneider, 1996; Schneider et al., 1998) and to
the aperture mass based on the shear profile expected for NFW haloes (see e.g.
Schirmer et al., 2004; Hennawi & Spergel, 2005). Schirmer et al. (2006)
achieved results comparable to our filter, but probably only because of the low
average redshift of the background sources in GaBoDS, which keeps the LSS
contamination low. For deeper data, the difference will be more important, as
shown by Maturi et al. (2005). We detect fourteen halos on about eighteen
square degrees selected from the survey. Five are known clusters, two are
associated with over-densities of galaxies visible in the GaBoDS image, and
seven have no known optical or X-ray counterparts.
Comment: 8 pages, 4 figures, accepted by A&A
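The construction of such a linear matched filter can be illustrated in one dimension: the filter is the signal template divided by the noise power spectrum, so frequencies where the LSS-like noise dominates are automatically down-weighted. The Gaussian template and the noise power model below are stand-ins, not the NFW-based template or the measured LSS spectrum used for the survey.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 256
k = np.fft.rfftfreq(n)  # spatial frequencies in cycles per pixel

# Halo template: a Gaussian bump, periodically centred at pixel 0
# (illustrative; the survey filter uses the expected halo shear profile).
x = np.arange(n)
template = np.exp(-0.5 * (np.minimum(x, n - x) / 5.0) ** 2)
tau_k = np.fft.rfft(template)  # template in Fourier space

# Noise power model: flat shape noise plus a red component standing in for
# large-scale-structure (LSS) contamination at low spatial frequencies.
p_noise = 1.0 + 50.0 / (1.0 + (k / 0.02) ** 2)

# Matched filter: Psi(k) proportional to tau(k) / P_N(k).  Dividing by the
# noise power suppresses the LSS-dominated low frequencies, which is what
# reduces spurious detections from projected large-scale structure.
psi_k = np.conj(tau_k) / p_noise

def filter_response(data):
    """Un-normalised linear filter output for a 1-D data vector."""
    return np.sum(psi_k * np.fft.rfft(data)).real

# Noisy realisation containing the halo signal.
data = 3.0 * template + rng.standard_normal(n)
response = filter_response(data)
```

Relative to the raw template, the filter's weight at the lowest (LSS-dominated) frequencies is much smaller than at intermediate frequencies, which is the suppression property the abstract describes.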
Segmentation of skin lesions in 2D and 3D ultrasound images using a spatially coherent generalized Rayleigh mixture model
This paper addresses the problem of jointly estimating the statistical distribution and segmenting lesions in multiple-tissue high-frequency skin ultrasound images. The distribution of multiple-tissue images is modeled as a spatially coherent finite mixture of heavy-tailed Rayleigh distributions. The spatial coherence inherent to biological tissues is modeled by enforcing local dependence between the mixture components. An original Bayesian algorithm combined with a Markov chain Monte Carlo method is then proposed to jointly estimate the mixture parameters and a label vector associating each voxel with a tissue. More precisely, a hybrid Metropolis-within-Gibbs sampler is used to draw samples that are asymptotically distributed according to the posterior distribution of the Bayesian model. The Bayesian estimators of the model parameters are then computed from the generated samples. Simulations are conducted on synthetic data to illustrate the performance of the proposed estimation strategy. The method is then successfully applied to the segmentation of in vivo skin tumors in high-frequency 2-D and 3-D ultrasound images.
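The hybrid sampler can be sketched on a toy one-dimensional version: Gibbs updates for the pixel labels alternate with Metropolis random-walk updates for the component scales. The sketch uses standard (not heavy-tailed) Rayleigh components, equal mixture weights, a flat prior on the scales, and no spatial coupling, all of which the paper's model adds on top of this skeleton.

```python
import numpy as np

rng = np.random.default_rng(3)

def log_rayleigh(r, sigma):
    """Log-density of the Rayleigh distribution with scale sigma."""
    return np.log(r) - 2 * np.log(sigma) - r ** 2 / (2 * sigma ** 2)

# Synthetic envelope data from two tissue classes with different scales.
r = np.concatenate([rng.rayleigh(1.0, 300), rng.rayleigh(4.0, 300)])

sigma = np.array([0.5, 2.0])          # initial scale per component
labels = rng.integers(0, 2, r.size)

for it in range(500):
    # Gibbs step: resample each pixel's label from its conditional posterior
    # (equal mixture weights; the paper adds spatial coherence at this point).
    logp = np.stack([log_rayleigh(r, s) for s in sigma], axis=1)
    d = np.clip(logp[:, 0] - logp[:, 1], -50, 50)
    prob1 = 1.0 / (1.0 + np.exp(d))
    labels = (rng.random(r.size) < prob1).astype(int)

    # Metropolis step: random-walk proposal on the log of each scale, standing
    # in for the paper's Metropolis-within-Gibbs parameter updates.
    for c in range(2):
        idx = labels == c
        if not idx.any():
            continue
        prop = sigma[c] * np.exp(0.1 * rng.standard_normal())
        delta = (log_rayleigh(r[idx], prop).sum()
                 - log_rayleigh(r[idx], sigma[c]).sum()
                 + np.log(prop / sigma[c]))  # Jacobian of the log-scale move
        if np.log(rng.random()) < delta:
            sigma[c] = prop
```

After a short burn-in the two scales settle near the values used to generate the data, up to label switching; point estimates would then be formed by averaging post-burn-in samples, as in the paper.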