Time-frequency transforms of white noises and Gaussian analytic functions
A family of Gaussian analytic functions (GAFs) has recently been linked to
the Gabor transform of white Gaussian noise [Bardenet et al., 2017]. This
answered pioneering work by Flandrin [2015], who observed that the zeros of the
Gabor transform of white noise had a very regular distribution and proposed
filtering algorithms based on the zeros of a spectrogram. The mathematical link
with GAFs provides a wealth of probabilistic results to inform the design of
such signal processing procedures. In this paper, we study in a systematic way
the link between GAFs and a class of time-frequency transforms of Gaussian
white noises on Hilbert spaces of signals. Our main observation is a conceptual
correspondence between pairs (transform, GAF) and generating functions for
classical orthogonal polynomials. This correspondence covers some classical
time-frequency transforms, such as the Gabor transform and the Daubechies-Paul
analytic wavelet transform. It also unveils new windowed discrete Fourier
transforms, which map white noises to fundamental GAFs. All these transforms
may thus be of interest to the research program 'filtering with zeros'. We also
identify the GAF whose zeros are the extrema of the Gabor transform of the
white noise and derive their first intensity. Moreover, we discuss important
subtleties in defining a white noise and its transform on infinite dimensional
Hilbert spaces. Finally, we provide quantitative estimates concerning the finite-dimensional approximations of these white noises, which are of practical interest when it comes to implementing signal processing algorithms based on GAFs.
Comment: to appear in Applied and Computational Harmonic Analysis.
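The following is a minimal illustrative sketch, not taken from the paper: it computes a Gabor-type spectrogram of a realization of discrete white Gaussian noise with a Gaussian window, and flags deep local minima as candidate spectrogram zeros. The window length, overlap, and detection threshold are arbitrary assumptions made only for illustration.

```python
import numpy as np
from scipy.signal import stft

# Minimal sketch (not from the paper): spectrogram of white Gaussian noise with
# a Gaussian window, plus a crude search for candidate zeros via local minima.
rng = np.random.default_rng(0)
n = 4096
noise = rng.standard_normal(n)              # discrete white Gaussian noise

# Gaussian window, as in the Gabor transform; length and width are arbitrary.
win_len = 256
t = np.arange(win_len) - win_len / 2
window = np.exp(-np.pi * (t / (win_len / 8)) ** 2)

_, _, Z = stft(noise, window=window, nperseg=win_len, noverlap=win_len - 8)
S = np.abs(Z) ** 2                          # spectrogram values

# Zeros of the continuous transform show up as deep local minima of S;
# the relative threshold below is an arbitrary illustration choice.
candidates = []
for i in range(1, S.shape[0] - 1):
    for j in range(1, S.shape[1] - 1):
        patch = S[i - 1:i + 2, j - 1:j + 2]
        if S[i, j] == patch.min() and S[i, j] < 1e-3 * S.max():
            candidates.append((i, j))
print(f"found {len(candidates)} candidate zeros")
```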
Concentration inequalities for sampling without replacement
Concentration inequalities quantify the deviation of a random variable from a
fixed value. In spite of numerous applications, such as opinion surveys or
ecological counting procedures, few concentration results are known for the
setting of sampling without replacement from a finite population. Until now,
the best general concentration inequality has been a Hoeffding inequality due
to Serfling [Ann. Statist. 2 (1974) 39-48]. In this paper, we first improve on this fundamental result of Serfling, and further extend it to obtain a Bernstein concentration bound for sampling without replacement. We then derive an empirical version of our bound that does not require the variance to be known to the user.
Comment: Published at http://dx.doi.org/10.3150/14-BEJ605 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
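For reference, the Hoeffding-type bound of Serfling that serves as the paper's baseline is commonly stated as follows; the exact convention for the sampling fraction should be checked against the paper.

```latex
% Hoeffding--Serfling bound (one common statement; conventions may differ
% slightly from the paper). Sample X_1,...,X_n without replacement from a
% finite population {x_1,...,x_N} \subset [a,b] with mean \mu.
\[
  \mathbb{P}\!\left( \frac{1}{n}\sum_{i=1}^{n} X_i - \mu \;\ge\; \varepsilon \right)
  \;\le\;
  \exp\!\left( - \frac{2 n \varepsilon^{2}}{\bigl(1 - \tfrac{n-1}{N}\bigr)(b-a)^{2}} \right).
\]
% The factor (1 - (n-1)/N) reflects the variance reduction specific to sampling
% without replacement: the bound tightens as n approaches N.
```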
Inference for determinantal point processes without spectral knowledge
Determinantal point processes (DPPs) are point process models that naturally
encode diversity between the points of a given realization, through a positive definite kernel K. DPPs possess desirable properties, such as exact sampling or analyticity of the moments, but learning the parameters of the kernel K through likelihood-based inference is not straightforward. First, the kernel that appears in the likelihood is not K, but another kernel L related to K through an often intractable spectral decomposition. This issue is typically bypassed in machine learning by directly parametrizing the kernel L, at the price of some interpretability of the model parameters. We follow this approach here. Second, the likelihood has an intractable normalizing
constant, which takes the form of a large determinant in the case of a DPP over
a finite set of objects, and the form of a Fredholm determinant in the case of
a DPP over a continuous domain. Our main contribution is to derive bounds on
the likelihood of a DPP, both for finite and continuous domains. Unlike
previous work, our bounds are cheap to evaluate since they do not rely on
approximating the spectrum of a large matrix or an operator. Through usual
arguments, these bounds thus yield cheap variational inference and moderately
expensive exact Markov chain Monte Carlo inference methods for DPPs.
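For the finite-set case, the normalizing constant mentioned above is easy to exhibit. Below is a minimal sketch, not the paper's bounds: it evaluates the exact log-likelihood of an L-ensemble DPP, where the costly term is the determinant det(L + I) over the whole ground set. The RBF kernel, lengthscale, and observed subset are arbitrary choices for illustration.

```python
import numpy as np

# Minimal sketch (not the paper's bounds): exact log-likelihood of a DPP over a
# finite ground set, parametrized directly by an L-ensemble kernel L, so that
#   log p(A) = log det(L_A) - log det(L + I).
rng = np.random.default_rng(1)
n = 200
points = rng.uniform(size=(n, 2))

# Hypothetical RBF likelihood kernel; the lengthscale is an arbitrary assumption.
sq_dists = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
L = np.exp(-sq_dists / (2 * 0.1 ** 2))

def dpp_loglik(L, A):
    """Exact log-likelihood of observing the index set A under the L-ensemble."""
    _, logdet_A = np.linalg.slogdet(L[np.ix_(A, A)])
    _, logdet_Z = np.linalg.slogdet(L + np.eye(len(L)))
    return logdet_A - logdet_Z   # the O(n^3) determinant is the costly part

A = [3, 17, 42, 88]              # an arbitrary observed subset
print(dpp_loglik(L, A))
```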
Learning from DPPs via Sampling: Beyond HKPV and symmetry
Determinantal point processes (DPPs) have become a significant tool for
recommendation systems, feature selection, or summary extraction, harnessing
the intrinsic ability of these probabilistic models to facilitate sample
diversity. The ability to sample from DPPs is paramount to the empirical
investigation of these models. Most exact samplers are variants of a spectral
meta-algorithm due to Hough, Krishnapur, Peres and Virág (henceforth HKPV),
which is in general time and resource intensive. For DPPs with symmetric
kernels, scalable HKPV samplers have been proposed that either first downsample
the ground set of items, or force the kernel to be low-rank, using e.g.
Nystr\"om-type decompositions.
In the present work, we contribute an approach that differs radically from HKPV.
Exploiting the fact that many statistical and learning objectives can be
effectively accomplished by only sampling certain key observables of a DPP
(so-called linear statistics), we invoke an expression for the Laplace
transform of such an observable as a single determinant, which holds in
complete generality. Combining traditional low-rank approximation techniques
with Laplace inversion algorithms from numerical analysis, we show how to
directly approximate the distribution function of a linear statistic of a DPP.
This distribution function can then be used in hypothesis testing or to
actually sample the linear statistic, as per requirement. Our approach is
scalable and applies to very general DPPs, beyond traditional symmetric
kernels.
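As an illustration of the determinantal identity alluded to, here is a sketch of my own, not the paper's algorithm: for a finite DPP with correlation kernel K, the Laplace transform of a linear statistic can be evaluated as a single determinant, E[exp(-s Σ_{x∈X} f(x))] = det(I - diag(1 - e^{-s f}) K). The symmetric rank-10 projection kernel and test function f below are arbitrary stand-ins, although the identity itself is not restricted to symmetric kernels.

```python
import numpy as np

# Illustration (mine, not the paper's algorithm): Laplace transform of a linear
# statistic of a finite DPP with correlation kernel K as a single determinant.
rng = np.random.default_rng(2)
n = 50

# Hypothetical symmetric projection kernel, just to have a valid K with
# eigenvalues in [0, 1]; the determinant identity does not require symmetry.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
K = Q[:, :10] @ Q[:, :10].T          # rank-10 projection kernel

f = rng.uniform(size=n)              # test function defining the linear statistic

def laplace_transform(s):
    D = np.diag(1.0 - np.exp(-s * f))
    return np.linalg.det(np.eye(n) - D @ K)

# A few values of the transform; a numerical Laplace inversion routine could be
# applied to such evaluations to approximate the distribution function.
print([round(laplace_transform(s), 4) for s in (0.0, 0.5, 1.0, 2.0)])
```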
Adaptive MCMC with online relabeling
When targeting a distribution that is artificially invariant under some
permutations, Markov chain Monte Carlo (MCMC) algorithms face the
label-switching problem, rendering marginal inference particularly cumbersome.
Such a situation arises, for example, in the Bayesian analysis of finite
mixture models. Adaptive MCMC algorithms such as adaptive Metropolis (AM),
which self-calibrates its proposal distribution using an online estimate of the
covariance matrix of the target, are no exception. To address the
label-switching issue, relabeling algorithms associate a permutation to each
MCMC sample, trying to obtain reasonable marginals. In the case of adaptive
Metropolis (Bernoulli 7 (2001) 223-242), an online relabeling strategy is
required. This paper is devoted to the AMOR algorithm, a provably consistent
variant of AM that can cope with the label-switching problem. The idea is to
nest relabeling steps within the MCMC algorithm based on the estimation of a
single covariance matrix that is used both for adapting the covariance of the
proposal distribution in the Metropolis algorithm step and for online
relabeling. We compare the behavior of AMOR to similar relabeling methods. In
the case of compactly supported target distributions, we prove a strong law of
large numbers for AMOR and its ergodicity. To our knowledge, these are the first consistency results for an online relabeling algorithm. The proof underlines latent relations between relabeling and vector quantization.
Comment: Published at http://dx.doi.org/10.3150/13-BEJ578 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
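To convey the flavor of nesting relabeling inside adaptive Metropolis, here is a rough sketch, a simplification of my own and not the paper's AMOR pseudocode: each iteration makes a random-walk proposal calibrated by the running covariance estimate, relabels the accepted sample by choosing the permutation closest to the running mean in Mahalanobis distance, and then updates that single mean/covariance pair online.

```python
import numpy as np

rng = np.random.default_rng(3)

def amor_style_step(x, mean, cov, log_target, n_seen, perms, scale=2.38):
    """One simplified AMOR-flavored iteration (illustration only).

    x: current state; mean, cov: running estimates used both for the proposal
    and for relabeling; perms: callables applying each allowed permutation.
    """
    d = len(x)
    # Adaptive-Metropolis proposal using the online covariance estimate.
    prop_cov = (scale ** 2 / d) * cov + 1e-6 * np.eye(d)
    y = rng.multivariate_normal(x, prop_cov)
    if np.log(rng.uniform()) < log_target(y) - log_target(x):
        x = y
    # Relabeling: pick the permutation closest to the running Gaussian summary
    # in Mahalanobis distance.
    prec = np.linalg.inv(cov + 1e-6 * np.eye(d))
    x = min((p(x) for p in perms),
            key=lambda z: (z - mean) @ prec @ (z - mean))
    # Online update of the single mean/covariance used for both purposes.
    n_seen += 1
    delta = x - mean
    mean = mean + delta / n_seen
    cov = cov + (np.outer(delta, x - mean) - cov) / n_seen
    return x, mean, cov, n_seen

# Tiny demo: an artificially label-symmetric 2D log-density (not a real mixture
# posterior), with the two labels swappable by reversing the coordinates.
log_target = lambda z: -0.5 * min(np.sum((z - np.array([-1.0, 1.0])) ** 2),
                                  np.sum((z - np.array([1.0, -1.0])) ** 2))
perms = (lambda z: z, lambda z: z[::-1].copy())
x, mean, cov, n_seen = np.zeros(2), np.zeros(2), np.eye(2), 1
for _ in range(1000):
    x, mean, cov, n_seen = amor_style_step(x, mean, cov, log_target, n_seen, perms)
```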