The Poisson transform for unnormalised statistical models
Contrary to standard statistical models, unnormalised statistical models
specify the likelihood function only up to a constant. While such models are
natural and popular, the lack of normalisation makes inference much more
difficult. Here we show that inferring the parameters of an unnormalised model
on a given space can be mapped onto an equivalent problem of estimating the
intensity of a Poisson point process on the same space. The unnormalised
statistical model now specifies an intensity function that does not need to be
normalised.
Effectively, the normalisation constant may now be inferred as just another
parameter, at no loss of information. The result can be extended to cover
non-IID models, including for example unnormalised models for sequences of
graphs (dynamical graphs) or for sequences of binary vectors. As a
consequence, we prove that unnormalised parametric inference in non-IID models
can be turned into a semi-parametric estimation problem. Moreover, we show that
the noise-contrastive divergence of Gutmann & Hyvärinen (2012) can be
understood as an approximation of the Poisson transform, and extended to
non-IID settings. We use our results to fit spatial Markov chain models of eye
movements, where the Poisson transform allows us to turn a highly non-standard
model into vanilla semi-parametric logistic regression.
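The reduction to logistic regression can be illustrated with a minimal noise-contrastive estimation sketch in which the log normalising constant is fitted as a free parameter. This is a generic toy model (a one-parameter unnormalised Gaussian), not the authors' eye-movement application; all names and settings below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy unnormalised model: log f(x; theta, c) = -0.5 * theta * x**2 + c,
# where c absorbs the (unknown) log normalising constant as a free parameter.
theta_true = 2.0                                     # precision of the data law
n = 20000
x = rng.normal(0.0, 1.0 / np.sqrt(theta_true), n)    # data
y = rng.normal(0.0, 1.0, n)                          # noise samples, N(0, 1)

def log_noise(u):
    return -0.5 * u**2 - 0.5 * np.log(2 * np.pi)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Noise-contrastive estimation is logistic regression on
# G(u) = log f(u) - log p_noise(u), which is linear in (theta, c)
# given the features (-0.5 u^2, 1): a concave problem.
theta, c = 1.0, -1.0
lr = 0.5
for _ in range(2000):
    gx = -0.5 * theta * x**2 + c - log_noise(x)
    gy = -0.5 * theta * y**2 + c - log_noise(y)
    wx = 1.0 - sigmoid(gx)           # residual weights for data points
    wy = sigmoid(gy)                 # residual weights for noise points
    theta += lr * (np.mean(wx * (-0.5 * x**2)) - np.mean(wy * (-0.5 * y**2)))
    c += lr * (np.mean(wx) - np.mean(wy))

# c should approach -log Z(theta) = -0.5 * log(2 * pi / theta)
print(theta, c)
```

The point of the sketch is that the intractable normaliser never has to be computed: it is estimated as the intercept of the logistic regression.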
Power spectrum and intermittency of the transmitted flux of QSOs Ly-alpha absorption spectra
Using a set of 28 high resolution, high signal to noise ratio (S/N) QSO
Ly-alpha absorption spectra, we investigate the non-Gaussian features of the
transmitted flux fluctuations, and their effect upon the power spectrum of this
field. We find that the spatial distribution of the local power of the
transmitted flux on scales k >= 0.05 s/km is highly spiky or intermittent. The
probability distribution functions (PDFs) of the local power are long-tailed.
The power on small scales is dominated by small probability events, and
consequently, the uncertainty in the power spectrum of the transmitted flux
field is generally large. This uncertainty arises due to the slow convergence
of an intermittent field to a Gaussian limit required by the central limit
theorem (CLT). To reduce this uncertainty, it is common to estimate the error
of the power spectrum by selecting subsamples with an "optimal" size. We show
that this conventional method actually does not calculate the variance of the
original intermittent field but of a Gaussian field. Based on the analysis of
intermittency, we propose an algorithm to calculate the error. It is based on a
bootstrap re-sampling among all independent local power modes. This estimation
does not require any extra parameter, such as the size of the subsamples, and is
sensitive to the intermittency of the fields. This method effectively reduces
the uncertainty in the power spectrum when the number of independent modes
satisfies the CLT convergence condition.
Comment: 26 pages (incl. figures). Accepted for publication in MNRAS.
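The proposed error estimate amounts to bootstrap re-sampling over the independent local power modes. A minimal sketch follows; the lognormal "local power" field is an illustrative stand-in for an intermittent, long-tailed distribution, not real QSO data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative long-tailed field of local power modes (not real spectra).
n_modes = 1000
local_power = rng.lognormal(mean=0.0, sigma=1.5, size=n_modes)

# Bootstrap re-sampling among the independent local power modes:
# the error bar on the band power is the spread of the resampled means.
n_boot = 2000
boot_means = np.empty(n_boot)
for b in range(n_boot):
    sample = rng.choice(local_power, size=n_modes, replace=True)
    boot_means[b] = sample.mean()

power_estimate = local_power.mean()
power_error = boot_means.std(ddof=1)
print(power_estimate, power_error)
```

For independent modes this reproduces the usual s/sqrt(N) error bar; its advantage is that it needs no subsample-size parameter and responds directly to the long tail of the local power distribution.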
Fast matrix computations for functional additive models
It is common in functional data analysis to look at a set of related
functions: a set of learning curves, a set of brain signals, a set of spatial
maps, etc. One way to express relatedness is through an additive model, whereby
each individual function is assumed to be a variation
around some shared mean function. Gaussian processes provide an elegant way of
constructing such additive models, but suffer from computational difficulties
arising from the matrix operations that need to be performed. Recently Heersink
& Furrer have shown that functional additive models give rise to covariance
matrices that have a specific form they called quasi-Kronecker (QK), whose
inverses are relatively tractable. We show that under additional assumptions
the two-level additive model leads to a class of matrices we call restricted
quasi-Kronecker, which enjoy many interesting properties. In particular, we
formulate matrix factorisations whose complexity scales only linearly in the
number of functions in the latent field, an enormous improvement over the
cubic scaling of naïve approaches. We describe how to leverage the properties
of rQK matrices for inference in Latent Gaussian Models.
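The linear-in-n scaling can be illustrated on a generic two-level additive covariance, Sigma = I_n (x) K_g + 11^T (x) K_mu, solved via the Woodbury identity. This is a standard textbook sketch under simplifying assumptions, not the rQK factorisation of the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 5, 40   # n functions, each observed at m points

# Two-level additive covariance (illustrative):
#   Sigma = I_n (x) K_g  +  11^T (x) K_mu
# K_mu: covariance of the shared mean, K_g: covariance of the deviations.
def random_spd(m):
    G = rng.standard_normal((m, m))
    return G @ G.T / m + np.eye(m)       # well-conditioned SPD matrix

K_g, K_mu = random_spd(m), random_spd(m)
Y = rng.standard_normal((n, m))          # one row per function

# Woodbury: Sigma^{-1} = A^{-1} - A^{-1} U (K_mu^{-1} + n K_g^{-1})^{-1} U^T A^{-1}
# with A = I_n (x) K_g and U = 1 (x) I_m.  Cost O(n m^3): linear in n.
Z = np.linalg.solve(K_g, Y.T).T          # block solves K_g z_i = y_i
s = Z.sum(axis=0)                        # (1^T (x) I_m) A^{-1} y
M = np.linalg.inv(K_mu) + n * np.linalg.inv(K_g)
t = np.linalg.solve(K_g, np.linalg.solve(M, s))
X = Z - t                                # x_i = z_i - K_g^{-1} M^{-1} s

# Check against the naive O((n m)^3) dense solve.
Sigma = np.kron(np.eye(n), K_g) + np.kron(np.ones((n, n)), K_mu)
X_dense = np.linalg.solve(Sigma, Y.ravel()).reshape(n, m)
print(np.allclose(X, X_dense))
```

The dense solve touches an nm-by-nm matrix, whereas the structured solve only ever factorises m-by-m blocks, which is what makes large collections of functions tractable.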
Replica theory for learning curves for Gaussian processes on random graphs
Statistical physics approaches can be used to derive accurate predictions for
the performance of inference methods learning from potentially noisy data, as
quantified by the learning curve defined as the average error versus number of
training examples. We analyse a challenging problem in the area of
non-parametric inference where an effectively infinite number of parameters has
to be learned, specifically Gaussian process regression. When the inputs are
vertices on a random graph and the outputs noisy function values, we show that
replica techniques can be used to obtain exact performance predictions in the
limit of large graphs. The covariance of the Gaussian process prior is defined
by a random walk kernel, the discrete analogue of squared exponential kernels
on continuous spaces. Conventionally this kernel is normalised only globally,
so that the prior variance can differ between vertices; as a more principled
alternative we consider local normalisation, where the prior variance is
uniform across vertices.
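Global versus local normalisation of a random walk kernel can be sketched as follows; the specific parametrisation K = ((1-g) I + g W)^p and the toy graph generator are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 60

# Random graph: a ring (to guarantee no isolated vertices) plus random edges.
A = (rng.random((n, n)) < 0.1).astype(float)
A = np.triu(A, 1); A = A + A.T
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

deg = A.sum(axis=1)
Ds = np.diag(1.0 / np.sqrt(deg))
W = Ds @ A @ Ds          # normalised adjacency, eigenvalues in [-1, 1]

# Random walk kernel: p-step lazy random walk, the graph analogue of a
# squared exponential kernel (illustrative parametrisation).
g, p = 0.4, 3
K = np.linalg.matrix_power((1 - g) * np.eye(n) + g * W, p)

# Global normalisation: one overall scale; prior variance K_ii still varies.
K_glob = K / np.mean(np.diag(K))
# Local normalisation: K_ii = 1 at every vertex, i.e. uniform prior variance.
d = np.sqrt(np.diag(K))
K_loc = K / np.outer(d, d)

print(np.diag(K_glob).std(), np.diag(K_loc).std())
```

On an irregular graph the globally normalised diagonal varies from vertex to vertex (the prior "trusts" high-degree vertices differently), while the locally normalised kernel assigns every vertex the same prior variance.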
A multiscale regularized restoration algorithm for XMM-Newton data
We introduce a new multiscale restoration algorithm for images with few
photon counts, and apply it to denoising XMM data. We threshold the wavelet
coefficients so as to remove the noise contribution at each scale while
preserving the multiscale information of the signal. Unlike other
algorithms, the signal restoration process is the same regardless of the
signal-to-noise ratio. Thresholds adapted to Poisson noise are
computed analytically at each scale thanks to the use of the unnormalized Haar
wavelet transform. Promising preliminary results are obtained on X-ray data for
Abell 2163 with the computation of a temperature map.Comment: To appear in the Proceedings of `Galaxy Clusters and the High
Redshift Universe Observed in X-rays', XXIth Moriond Astrophysics Meeting
(March 2001), Eds. Doris Neumann et al.
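The reason the unnormalised Haar transform is convenient here is that its detail coefficients are differences of Poisson counts, whose variance is estimated by the corresponding approximation coefficient. A 1-D toy sketch follows, using a Gaussian-approximation threshold k*sqrt(a) as an illustrative stand-in for the paper's exact analytic Poisson thresholds.

```python
import numpy as np

rng = np.random.default_rng(4)

def haar_poisson_denoise(x, levels, k=3.0):
    """Multiscale thresholding with the unnormalised Haar transform.

    Detail d = x_even - x_odd is a difference of Poisson counts, so
    Var(d) ~ x_even + x_odd = a; we keep only details with |d| > k*sqrt(a)
    (a Gaussian approximation to the exact Poisson threshold)."""
    x = np.asarray(x, dtype=float)
    if levels == 0 or x.size < 2:
        return x
    a = x[0::2] + x[1::2]                 # unnormalised approximation
    d = x[0::2] - x[1::2]                 # unnormalised detail
    d = np.where(np.abs(d) > k * np.sqrt(np.maximum(a, 1.0)), d, 0.0)
    a = haar_poisson_denoise(a, levels - 1, k)   # recurse: sums stay Poisson
    out = np.empty_like(x)
    out[0::2] = 0.5 * (a + d)             # exact inverse of the transform
    out[1::2] = 0.5 * (a - d)
    return out

# Toy 1-D signal with few counts per pixel (not XMM data).
t = np.linspace(0, 4 * np.pi, 256)
lam = 10.0 + 8.0 * np.sin(t)
counts = rng.poisson(lam)
denoised = haar_poisson_denoise(counts, levels=3)

print(np.mean((counts - lam) ** 2), np.mean((denoised - lam) ** 2))
```

Because the transform is applied to raw counts, the same thresholding rule works at every flux level, mirroring the abstract's point that the restoration process is independent of the signal-to-noise ratio.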
A Bayesian approach to discrete object detection in astronomical datasets
A Bayesian approach is presented for detecting and characterising the signal
from discrete objects embedded in a diffuse background. The approach centres
around the evaluation of the posterior distribution for the parameters of the
discrete objects, given the observed data, and defines the
theoretically-optimal procedure for parametrised object detection. Two
alternative strategies are investigated: the simultaneous detection of all the
discrete objects in the dataset, and the iterative detection of objects. In
both cases, the parameter space characterising the object(s) is explored using
Markov chain Monte Carlo sampling. For the iterative detection of objects,
another approach is to locate the global maximum of the posterior at each
iteration using a simulated annealing downhill simplex algorithm. The
techniques are applied to a two-dimensional toy problem consisting of Gaussian
objects embedded in uncorrelated pixel noise. A cosmological illustration of
the iterative approach is also presented, in which the thermal and kinetic
Sunyaev-Zel'dovich effects from clusters of galaxies are detected in microwave
maps dominated by emission from primordial cosmic microwave background
anisotropies.
Comment: 20 pages, 12 figures, accepted by MNRAS; contains some additional
material in response to the referee's comments.
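Posterior exploration for a single parametrised object can be sketched with a toy random-walk Metropolis sampler over the position and amplitude of one Gaussian object in uncorrelated pixel noise. This is a generic illustration of the toy problem described above, not the authors' pipeline; priors, proposal scales, and the fixed object width are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy data: one circular Gaussian object in unit-variance white pixel noise.
size, width, sigma = 32, 3.0, 1.0
yy, xx = np.mgrid[0:size, 0:size]
true_x, true_y, true_amp = 16.0, 12.0, 5.0

def model(x0, y0, amp):
    return amp * np.exp(-((xx - x0) ** 2 + (yy - y0) ** 2) / (2 * width**2))

data = model(true_x, true_y, true_amp) + rng.normal(0.0, sigma, (size, size))

def log_post(x0, y0, amp):
    # Gaussian likelihood; flat prior inside the image, positive amplitude.
    if not (0 <= x0 < size and 0 <= y0 < size and amp > 0):
        return -np.inf
    return -0.5 * np.sum((data - model(x0, y0, amp)) ** 2) / sigma**2

# Random-walk Metropolis over the object parameters (x0, y0, amplitude).
state = np.array([size / 2, size / 2, 1.0])
lp = log_post(*state)
chain = []
for it in range(6000):
    prop = state + rng.normal(0, [0.5, 0.5, 0.2])
    lp_prop = log_post(*prop)
    if np.log(rng.random()) < lp_prop - lp:    # Metropolis acceptance
        state, lp = prop, lp_prop
    chain.append(state.copy())

chain = np.array(chain[2000:])       # discard burn-in
print(chain.mean(axis=0))            # posterior mean of (x0, y0, amplitude)
```

The posterior mean recovers the injected object's position and amplitude; iterating this scheme, subtracting each detected object before re-running, corresponds to the iterative strategy described in the abstract.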
Is infinity that far? A Bayesian nonparametric perspective of finite mixture models
Mixture models are one of the most widely used statistical tools when dealing with data from heterogeneous populations. Following a Bayesian nonparametric perspective, we introduce a new class of priors: the Normalized Independent Point Process. We investigate the probabilistic properties of this new class and present many special cases. In particular, we provide an explicit formula for the distribution of the implied partition, as well as the posterior characterization of the new process in terms of the superposition of two discrete measures. We also provide consistency results. Moreover, we design both a marginal and a conditional algorithm for finite mixture models with a random number of components. These schemes are based on an auxiliary variable MCMC, which allows handling the otherwise intractable posterior distribution and overcomes the challenges associated with the Reversible Jump algorithm. We illustrate the performance and the potential of our model in a simulation study and on real data applications.