
    The Poisson transform for unnormalised statistical models

    Contrary to standard statistical models, unnormalised statistical models only specify the likelihood function up to a constant. While such models are natural and popular, the lack of normalisation makes inference much more difficult. Here we show that inferring the parameters of an unnormalised model on a space Ω can be mapped onto an equivalent problem of estimating the intensity of a Poisson point process on Ω. The unnormalised statistical model now specifies an intensity function that does not need to be normalised. Effectively, the normalisation constant may now be inferred as just another parameter, at no loss of information. The result can be extended to cover non-IID models, including, for example, unnormalised models for sequences of graphs (dynamical graphs) or for sequences of binary vectors. As a consequence, we prove that unnormalised parametric inference in non-IID models can be turned into a semi-parametric estimation problem. Moreover, we show that the noise-contrastive divergence of Gutmann & Hyvärinen (2012) can be understood as an approximation of the Poisson transform, and extended to non-IID settings. We use our results to fit spatial Markov chain models of eye movements, where the Poisson transform allows us to turn a highly non-standard model into vanilla semi-parametric logistic regression.
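
    As a rough illustration of the transform (not the authors' code), the sketch below fits a one-parameter unnormalised Gaussian by treating the log-normalising constant nu as a free parameter and maximising the Poisson point-process log-likelihood, with the integral over Ω approximated by quadrature on a grid; the helper name `unnorm_logpdf` and the toy model are our assumptions.

```python
# Minimal 1-D sketch of the Poisson-transform idea: maximise the Poisson
# point-process log-likelihood with log-intensity log f_theta(x) + nu,
# treating nu (the log-normalising constant) as just another parameter.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=1.0, size=500)   # toy sample

def unnorm_logpdf(x, theta):
    """Unnormalised Gaussian log-density with unknown mean theta (toy model)."""
    return -0.5 * (x - theta) ** 2                # normalising constant omitted

grid = np.linspace(-6.0, 8.0, 2001)               # quadrature grid over Omega
dx = grid[1] - grid[0]

def neg_poisson_loglik(params):
    theta, nu = params
    term_data = np.sum(unnorm_logpdf(data, theta) + nu)          # sum over points
    term_int = np.sum(np.exp(unnorm_logpdf(grid, theta) + nu)) * dx  # integral of intensity
    return -(term_data - term_int)

res = minimize(neg_poisson_loglik, x0=[0.0, 0.0])
theta_hat, nu_hat = res.x
# Profiling out nu recovers the usual MLE in theta; exp(nu_hat) estimates
# n divided by the (unknown) normalising constant.
print(theta_hat, nu_hat)
```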

    Power spectrum and intermittency of the transmitted flux of QSOs Ly-alpha absorption spectra

    Using a set of 28 high resolution, high signal-to-noise ratio (S/N) QSO Ly-alpha absorption spectra, we investigate the non-Gaussian features of the transmitted flux fluctuations and their effect upon the power spectrum of this field. We find that the spatial distribution of the local power of the transmitted flux on scales k >= 0.05 s/km is highly spiky or intermittent. The probability distribution functions (PDFs) of the local power are long-tailed. The power on small scales is dominated by small-probability events, and consequently, the uncertainty in the power spectrum of the transmitted flux field is generally large. This uncertainty arises from the slow convergence of an intermittent field to the Gaussian limit required by the central limit theorem (CLT). To reduce this uncertainty, it is common to estimate the error of the power spectrum by selecting subsamples with an "optimal" size. We show that this conventional method actually calculates the variance not of the original intermittent field but of a Gaussian field. Based on the analysis of intermittency, we propose an algorithm to calculate the error, based on bootstrap re-sampling among all independent local power modes. This estimate does not require any extra parameter, such as the size of the subsamples, and is sensitive to the intermittency of the field. The method effectively reduces the uncertainty in the power spectrum when the number of independent modes satisfies the condition of CLT convergence. Comment: 26 pages (incl. figures). Accepted for publication in MNRAS.
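
    The proposed error estimate can be sketched as follows, assuming the independent local power estimates of one band have already been computed; the long-tailed stand-in data below is purely illustrative.

```python
# Illustrative sketch (not the authors' pipeline): estimate the error bar on a
# band power by bootstrap re-sampling among independent local power modes,
# instead of averaging over subsamples of a fixed "optimal" size.
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for the local power estimates of one band: a long-tailed
# (intermittent) field, e.g. log-normal, rather than a Gaussian one.
local_power = rng.lognormal(mean=0.0, sigma=1.5, size=4096)

n_boot = 2000
means = np.empty(n_boot)
for b in range(n_boot):
    idx = rng.integers(0, local_power.size, size=local_power.size)
    means[b] = local_power[idx].mean()          # resampled band power

band_power = local_power.mean()
sigma_boot = means.std(ddof=1)                  # bootstrap error on the band power
print(band_power, sigma_boot)
```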

    Fast matrix computations for functional additive models

    It is common in functional data analysis to look at a set of related functions: a set of learning curves, a set of brain signals, a set of spatial maps, etc. One way to express relatedness is through an additive model, whereby each individual function g_i(x) is assumed to be a variation around some shared mean f(x). Gaussian processes provide an elegant way of constructing such additive models, but suffer from computational difficulties arising from the matrix operations that need to be performed. Recently Heersink & Furrer have shown that functional additive models give rise to covariance matrices that have a specific form they called quasi-Kronecker (QK), whose inverses are relatively tractable. We show that under additional assumptions the two-level additive model leads to a class of matrices we call restricted quasi-Kronecker (rQK), which enjoy many interesting properties. In particular, we formulate matrix factorisations whose complexity scales only linearly in the number of functions in the latent field, an enormous improvement over the cubic scaling of naïve approaches. We describe how to leverage the properties of rQK matrices for inference in Latent Gaussian Models.
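
    To see why such structure helps, suppose (an illustrative assumption, not necessarily the paper's exact rQK definition) that the two-level model yields a covariance Sigma = J_n ⊗ K_f + I_n ⊗ A, where J_n = 1 1^T couples the n functions through the shared mean and A collects the per-function covariance plus noise. A solve against Sigma then splits into one m-by-m solve for the across-function mean and one for the zero-sum deviations, so the cost is linear in n:

```python
# Sketch of a linear-in-n solve for a two-level additive covariance
# Sigma = J_n (x) K_f + I_n (x) A, exploiting (Sigma x)_i = (A + n K_f) u + A v_i
# for the decomposition x_i = u + v_i with the v_i summing to zero.
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def two_level_solve(K_f, A, B):
    """Solve (J_n kron K_f + I_n kron A) x = b, with b given as the rows of B (n x m)."""
    n = B.shape[0]
    b_bar = B.mean(axis=0)
    u = cho_solve(cho_factor(A + n * K_f), b_bar)    # shared-mean component
    V = cho_solve(cho_factor(A), (B - b_bar).T).T    # zero-sum deviations
    return u + V                                     # rows are x_i = u + v_i

# Quick correctness check against the dense O((n m)^3) solve:
rng = np.random.default_rng(2)
n, m = 50, 40
Q = rng.normal(size=(m, m)); K_f = Q @ Q.T / m + 1e-6 * np.eye(m)
Q = rng.normal(size=(m, m)); A = Q @ Q.T / m + 0.1 * np.eye(m)
B = rng.normal(size=(n, m))
Sigma = np.kron(np.ones((n, n)), K_f) + np.kron(np.eye(n), A)
x_dense = np.linalg.solve(Sigma, B.ravel()).reshape(n, m)
assert np.allclose(two_level_solve(K_f, A, B), x_dense, atol=1e-6)
```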

    Replica theory for learning curves for Gaussian processes on random graphs

    Statistical physics approaches can be used to derive accurate predictions for the performance of inference methods learning from potentially noisy data, as quantified by the learning curve, defined as the average error versus the number of training examples. We analyse a challenging problem in the area of non-parametric inference where an effectively infinite number of parameters has to be learned, specifically Gaussian process regression. When the inputs are vertices on a random graph and the outputs are noisy function values, we show that replica techniques can be used to obtain exact performance predictions in the limit of large graphs. The covariance of the Gaussian process prior is defined by a random walk kernel, the discrete analogue of squared exponential kernels on continuous spaces. Conventionally this kernel is normalised only globally, so that the prior variance can differ between vertices; as a more principled alternative we consider local normalisation, where the prior variance is uniform.
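
    A minimal sketch of the random walk kernel and the two normalisations being contrasted, under the common parameterisation K ∝ (I − L/a)^p with L the normalised graph Laplacian; the paper's exact parameter choices may differ.

```python
# Random walk kernel on a graph, with global vs local normalisation.
# K ~ (I - L/a)^p, a >= 2, where p sets the kernel's length-scale.
import numpy as np

rng = np.random.default_rng(3)

# A sparse random (Erdos-Renyi) graph as a symmetric adjacency matrix.
n, p_edge = 200, 3.0 / 200
A = (rng.random((n, n)) < p_edge).astype(float)
A = np.triu(A, 1); A = A + A.T
deg = np.maximum(A.sum(axis=1), 1.0)
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
L = np.eye(n) - D_inv_sqrt @ A @ D_inv_sqrt        # normalised Laplacian

a, p = 2.0, 10
K = np.linalg.matrix_power(np.eye(n) - L / a, p)   # random walk kernel

# Global normalisation: one constant; prior variance still varies by vertex.
K_global = K / K.diagonal().mean()

# Local normalisation: rescale so every vertex has unit prior variance.
d = np.sqrt(K.diagonal())
K_local = K / np.outer(d, d)

print(K_global.diagonal().std(), K_local.diagonal().std())  # nonzero vs ~0
```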

    A multiscale regularized restoration algorithm for XMM-Newton data

    We introduce a new multiscale restoration algorithm for images with few photon counts and its use for denoising XMM data. We threshold in wavelet space so as to remove the noise contribution at each scale while preserving the multiscale information of the signal. Contrary to other algorithms, the signal restoration process is the same whatever the signal-to-noise ratio. Thresholds appropriate to a Poisson noise process are computed analytically at each scale thanks to the use of the unnormalized Haar wavelet transform. Promising preliminary results are obtained on X-ray data for Abell 2163 with the computation of a temperature map. Comment: To appear in the Proceedings of `Galaxy Clusters and the High Redshift Universe Observed in X-rays', XXIth Moriond Astrophysics Meeting (March 2001), Eds. Doris Neumann et al.
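
    A one-dimensional sketch of the idea, not the paper's exact thresholds: with the unnormalized Haar transform, a detail coefficient is a difference of two Poisson counts, and its null variance given their sum s is exactly s, so a simple k-sigma cut per scale stands in here for the analytical Poisson thresholds.

```python
# 1-D sketch of Haar-domain denoising of Poisson counts (illustrative).
# Unnormalized Haar: sums s = a + b and differences d = a - b. Under pure
# noise (equal rates), Var(d | s) = s, motivating the cut |d| <= k*sqrt(s);
# the paper derives exact Poisson thresholds per scale instead.
import numpy as np

def haar_poisson_denoise(counts, k=3.0):
    approx = counts.astype(float)        # assume len(counts) is a power of two
    details = []
    while approx.size > 1:
        a, b = approx[0::2], approx[1::2]
        s, d = a + b, a - b              # unnormalized Haar step
        d = np.where(np.abs(d) > k * np.sqrt(np.maximum(s, 1.0)), d, 0.0)
        details.append(d)
        approx = s
    for d in reversed(details):          # inverse transform
        s = approx
        a, b = (s + d) / 2.0, (s - d) / 2.0
        approx = np.empty(2 * s.size); approx[0::2] = a; approx[1::2] = b
    return approx

rng = np.random.default_rng(4)
x = np.linspace(0, 1, 256)
intensity = 0.5 + 4.0 * np.exp(-((x - 0.5) / 0.05) ** 2)   # faint source
counts = rng.poisson(intensity)
print(np.abs(haar_poisson_denoise(counts) - intensity).mean())
```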

    A Bayesian approach to discrete object detection in astronomical datasets

    A Bayesian approach is presented for detecting and characterising the signal from discrete objects embedded in a diffuse background. The approach centres on the evaluation of the posterior distribution for the parameters of the discrete objects, given the observed data, and defines the theoretically optimal procedure for parametrised object detection. Two alternative strategies are investigated: the simultaneous detection of all the discrete objects in the dataset, and the iterative detection of objects. In both cases, the parameter space characterising the object(s) is explored using Markov chain Monte Carlo sampling. For the iterative detection of objects, another approach is to locate the global maximum of the posterior at each iteration using a simulated annealing downhill simplex algorithm. The techniques are applied to a two-dimensional toy problem consisting of Gaussian objects embedded in uncorrelated pixel noise. A cosmological illustration of the iterative approach is also presented, in which the thermal and kinetic Sunyaev-Zel'dovich effects from clusters of galaxies are detected in microwave maps dominated by emission from primordial cosmic microwave background anisotropies. Comment: 20 pages, 12 figures, accepted by MNRAS; contains some additional material in response to referee's comments.
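
    A toy version of the paper's test problem, with our assumptions flagged in the comments: a single Gaussian-shaped object in white pixel noise, a flat box prior (our choice), and a plain random-walk Metropolis sampler over the object parameters (X, Y, A, R).

```python
# Illustrative single-object detection: data = A*exp(-r^2/(2 R^2)) + noise,
# posterior over (X, Y, A, R) explored by random-walk Metropolis.
import numpy as np

rng = np.random.default_rng(5)
N, sigma_pix = 64, 1.0
yy, xx = np.mgrid[0:N, 0:N].astype(float)

def model(X, Y, A, R):
    return A * np.exp(-((xx - X) ** 2 + (yy - Y) ** 2) / (2 * R ** 2))

truth = (30.0, 40.0, 2.0, 3.0)
data = model(*truth) + rng.normal(scale=sigma_pix, size=(N, N))

lo = np.array([0.0, 0.0, 0.0, 1.0])     # flat prior box: our assumption
hi = np.array([float(N), float(N), 10.0, 10.0])

def log_post(p):
    if np.any(p < lo) or np.any(p > hi):
        return -np.inf
    resid = data - model(*p)
    return -0.5 * np.sum(resid ** 2) / sigma_pix ** 2   # Gaussian likelihood

p = np.array([N / 2, N / 2, 1.0, 2.0])
lp = log_post(p)
step = np.array([0.5, 0.5, 0.1, 0.1])   # proposal scales, tuned by hand
chain = []
for it in range(20000):
    q = p + step * rng.normal(size=4)
    lq = log_post(q)
    if np.log(rng.random()) < lq - lp:  # Metropolis accept/reject
        p, lp = q, lq
    chain.append(p.copy())
chain = np.array(chain[5000:])          # drop burn-in
print(chain.mean(axis=0), chain.std(axis=0))
```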

    Is infinity that far? A Bayesian nonparametric perspective of finite mixture models

    Mixture models are one of the most widely used statistical tools for dealing with data from heterogeneous populations. Following a Bayesian nonparametric perspective, we introduce a new class of priors: the Normalized Independent Point Process. We investigate the probabilistic properties of this new class and present many special cases. In particular, we provide an explicit formula for the distribution of the implied partition, as well as a posterior characterization of the new process in terms of the superposition of two discrete measures. We also provide consistency results. Moreover, we design both a marginal and a conditional algorithm for finite mixture models with a random number of components. These schemes are based on an auxiliary-variable MCMC, which allows handling the otherwise intractable posterior distribution and overcomes the challenges associated with the Reversible Jump algorithm. We illustrate the performance and the potential of our model in a simulation study and on real data applications.
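
    The paper's samplers for the new prior are not reproduced here; the sketch below is the standard conditional Gibbs sweep for a finite Gaussian mixture with K fixed, Dirichlet weights, and known unit variance, i.e. the baseline that the proposed conditional algorithm (with a random number of components) generalises.

```python
# Standard conditional Gibbs sampler for a finite Gaussian mixture (K fixed,
# unit observation variance) -- not the paper's algorithm, just the baseline.
import numpy as np

rng = np.random.default_rng(6)
y = np.concatenate([rng.normal(-2, 1, 150), rng.normal(3, 1, 100)])
K, alpha, tau2 = 3, 1.0, 10.0          # components, Dirichlet and prior scales
mu = rng.normal(0, np.sqrt(tau2), K)
w = np.full(K, 1.0 / K)

for sweep in range(2000):
    # 1. allocations z_i | w, mu  (categorical, computed in log space)
    logp = np.log(w) - 0.5 * (y[:, None] - mu[None, :]) ** 2
    prob = np.exp(logp - logp.max(axis=1, keepdims=True))
    prob /= prob.sum(axis=1, keepdims=True)
    z = (prob.cumsum(axis=1) > rng.random((y.size, 1))).argmax(axis=1)
    # 2. weights w | z ~ Dirichlet(alpha + counts)
    counts = np.bincount(z, minlength=K)
    w = rng.dirichlet(alpha + counts)
    # 3. means mu_k | z, y  (conjugate normal update, unit data variance)
    for k in range(K):
        yk = y[z == k]
        prec = 1.0 / tau2 + yk.size
        mu[k] = rng.normal(yk.sum() / prec, np.sqrt(1.0 / prec))

print(np.sort(mu), np.round(w, 3))
```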