DPPy: Sampling Determinantal Point Processes with Python
Determinantal point processes (DPPs) are specific probability distributions over clouds of points that are used as models and computational tools across physics, probability, statistics, and more recently machine learning. Sampling from DPPs is a challenge, and we therefore present DPPy, a Python toolbox that gathers known exact and approximate sampling algorithms. The project is hosted on GitHub and comes with extensive documentation. This documentation takes the form of a short survey of DPPs and relates each mathematical property to the corresponding DPPy objects.
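As a point of orientation, here is a minimal usage sketch; the constructor and sampler calls follow DPPy's documented API, while the toy projection kernel is my own illustration, not taken from the paper.

```python
# Minimal DPPy usage sketch (pip install dppy); the toy projection kernel
# K = U U^T is an illustration, not an example from the paper.
import numpy as np
from dppy.finite_dpps import FiniteDPP

rng = np.random.RandomState(0)
N, rank = 10, 4
U, _ = np.linalg.qr(rng.randn(N, rank))   # orthonormal columns
K = U @ U.T                               # projection correlation kernel

dpp = FiniteDPP('correlation', projection=True, K=K)
dpp.sample_exact(random_state=rng)        # spectral (HKPV-type) exact sampler
print(dpp.list_of_samples)                # list of sampled index subsets
```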
Learning from DPPs via Sampling: Beyond HKPV and symmetry
Determinantal point processes (DPPs) have become a significant tool for
recommendation systems, feature selection, or summary extraction, harnessing
the intrinsic ability of these probabilistic models to facilitate sample
diversity. The ability to sample from DPPs is paramount to the empirical
investigation of these models. Most exact samplers are variants of a spectral
meta-algorithm due to Hough, Krishnapur, Peres and Virág (henceforth HKPV),
which is in general time and resource intensive. For DPPs with symmetric
kernels, scalable HKPV samplers have been proposed that either first downsample
the ground set of items, or force the kernel to be low-rank, using e.g.
Nyström-type decompositions.
In the present work, we contribute an approach radically different from HKPV.
Exploiting the fact that many statistical and learning objectives can be
effectively accomplished by only sampling certain key observables of a DPP
(so-called linear statistics), we invoke an expression for the Laplace
transform of such an observable as a single determinant, which holds in
complete generality. Combining traditional low-rank approximation techniques
with Laplace inversion algorithms from numerical analysis, we show how to
directly approximate the distribution function of a linear statistic of a DPP.
This distribution function can then be used in hypothesis testing or to
actually sample the linear statistic, as per requirement. Our approach is
scalable and applies to very general DPPs, beyond traditional symmetric
kernels.
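To make the central object concrete, the sketch below (my illustration, not the authors' code) evaluates this Laplace transform as a single determinant for a finite ground set; the paper pairs such evaluations with numerical Laplace inversion.

```python
# Laplace transform of the linear statistic S = sum_{i in X} f(i) of a finite
# DPP with correlation kernel K, via the standard determinantal identity
#     E[exp(-s S)] = det(I + K (diag(exp(-s f)) - I)),
# which requires no symmetry assumption on K. Toy inputs are illustrative.
import numpy as np

def laplace_transform(K, f, s):
    """E[exp(-s * sum_{i in X} f(i))] for a DPP with correlation kernel K."""
    N = K.shape[0]
    return np.linalg.det(np.eye(N) + K @ np.diag(np.exp(-s * f) - 1.0))

# A numerical Laplace inversion routine (e.g. Talbot-type quadrature over
# complex s) would then recover the distribution function of S.
```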
Asymptotic Equivalence of Fixed-size and Varying-size Determinantal Point Processes
Determinantal Point Processes (DPPs) are popular models for point processes
with repulsion. They appear in numerous contexts, from physics to graph theory,
and display appealing theoretical properties. On the more practical side of
things, since DPPs tend to select sets of points that are some distance apart
(repulsion), they have been advocated as a way of producing random subsets with
high diversity. DPPs come in two variants: fixed-size and varying-size. A
sample from a varying-size DPP is a subset of random cardinality, while in
fixed-size "k-DPPs" the cardinality is fixed. The latter makes more sense in
many applications, but unfortunately their computational properties are less
attractive, since, among other things, inclusion probabilities are harder to
compute. In this work we show that as the size of the ground set grows,
k-DPPs and DPPs become equivalent, meaning that their inclusion probabilities
converge. As a by-product, we obtain saddlepoint formulas for inclusion
probabilities in k-DPPs. These turn out to be extremely accurate, and suffer
less from numerical difficulties than exact methods do. Our results further
suggest that k-DPPs and DPPs have equivalent maximum likelihood
estimators. Finally, we obtain results on asymptotic approximations of
elementary symmetric polynomials, which may be of independent interest.
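For background, the elementary symmetric polynomials in question are the k-DPP normalization constants; the standard exact recurrence below is the computation the saddlepoint approximations are meant to supplant (a textbook sketch, not the paper's method).

```python
# Exact O(N * k_max) recurrence for the elementary symmetric polynomials
# e_k(lambda_1, ..., lambda_N), which normalize k-DPPs: P(|X| = k) is
# proportional to e_k of the L-kernel eigenvalues.
import numpy as np

def elementary_symmetric(lams, k_max):
    e = np.zeros(k_max + 1)
    e[0] = 1.0
    for lam in lams:
        # sweep k downwards so e[k-1] still holds the previous iteration's value
        for k in range(k_max, 0, -1):
            e[k] += lam * e[k - 1]
    return e  # e[k] = e_k(lams)
```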
Fast sampling from β-ensembles
We study sampling algorithms for β-ensembles with time complexity less
than cubic in the cardinality of the ensemble. Following Dumitriu & Edelman
(2002), we see the ensemble as the eigenvalues of a random tridiagonal matrix,
namely a random Jacobi matrix. First, we provide a unifying and elementary
treatment of the tridiagonal models associated to the three classical Hermite,
Laguerre and Jacobi ensembles. For this purpose, we use simple changes of
variables between successive reparametrizations of the coefficients defining
the tridiagonal matrix. Second, we derive an approximate sampler for the
simulation of β-ensembles, and illustrate how fast it can be for
polynomial potentials. This method combines a Gibbs sampler on Jacobi matrices
and the diagonalization of these matrices. In practice, even for large
ensembles, only a few Gibbs passes suffice for the marginal distribution of the
eigenvalues to fit the expected theoretical distribution. When the conditionals
in the Gibbs sampler can be simulated exactly, the same fast empirical
convergence is observed for the fluctuations of the largest eigenvalue. Our
experimental results support a conjecture by Krishnapur et al. (2016), that the
Gibbs chain on Jacobi matrices of size N mixes in O(log N).
Comment: 37 pages, 8 figures, code at https://github.com/guilgautier/DPPy
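For context, this is the classical tridiagonal construction the paper starts from, sketched for the Hermite case; it is the standard Dumitriu & Edelman (2002) model, not the paper's Gibbs sampler.

```python
# Dumitriu-Edelman tridiagonal model: the eigenvalues of this random Jacobi
# matrix follow the Hermite beta-ensemble, with joint density proportional to
#     prod_{i<j} |l_i - l_j|^beta * exp(-sum_i l_i^2 / 2).
import numpy as np
from scipy.linalg import eigh_tridiagonal

def hermite_beta_sample(n, beta, rng=None):
    rng = rng or np.random.default_rng()
    diag = rng.standard_normal(n)            # N(0, 2) / sqrt(2)
    dof = beta * np.arange(n - 1, 0, -1)     # chi degrees of freedom
    off = np.sqrt(rng.chisquare(dof) / 2.0)  # chi_{beta*(n-k)} / sqrt(2)
    # tridiagonal eigensolve: well below the cubic cost of dense models
    return eigh_tridiagonal(diag, off, eigvals_only=True)

eigs = hermite_beta_sample(1000, beta=2.0)
```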
Probabilistic Latent Factor Model for Collaborative Filtering with Bayesian Inference
The Latent Factor Model (LFM) is one of the most successful methods for
collaborative filtering (CF) in recommender systems, in which both users
and items are projected into a joint latent factor space. Based on matrix
factorization, as commonly applied in pattern recognition, LFM models
user-item interactions as inner products of user and item factor vectors in
that space, and can be efficiently solved by least-squares methods that yield
optimal point estimates. However, such optimal estimation methods are prone to
overfitting due to the extreme sparsity of user-item interactions. In this
paper, we propose a Bayesian treatment for LFM, named the Bayesian Latent
Factor Model (BLFM). Based on observed user-item interactions, we build a
probabilistic factor model in which regularization is introduced by placing
prior constraints on the latent factors, and the likelihood function is
established over observations and parameters. We then draw samples of latent
factors from the posterior distribution with Variational Inference (VI) to
predict expected values. We further extend BLFM to BLFMBias, which incorporates
user-dependent and item-dependent biases into the model to enhance
performance. Extensive experiments on a movie rating dataset show the
effectiveness of our proposed models compared with several strong baselines.
Comment: 8 pages, 5 figures, ICPR 2020 conference
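For readers new to the area, here is a minimal sketch of the non-Bayesian latent factor baseline that BLFM builds on; at the MAP level, BLFM's Gaussian priors reduce to the ridge term below. This is my illustration, not the authors' code.

```python
# Regularized latent factor model: rating r_ui is approximated by the inner
# product p_u . q_i. One alternating-least-squares pass over user factors;
# the ridge term `reg` is the MAP counterpart of a Gaussian prior on factors.
import numpy as np

def update_user_factors(R, observed, P, Q, reg):
    """R: (n_users, n_items) ratings; observed: same-shape boolean mask."""
    d = P.shape[1]
    for u in range(R.shape[0]):
        idx = observed[u]                        # items rated by user u
        A = Q[idx].T @ Q[idx] + reg * np.eye(d)  # normal equations + ridge
        b = Q[idx].T @ R[u, idx]
        P[u] = np.linalg.solve(A, b)
    return P
```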
Zonotope hit-and-run for efficient sampling from projection DPPs
Determinantal point processes (DPPs) are distributions over sets of items that model diversity using kernels. Their applications in machine learning include summary extraction and recommendation systems. Yet, the cost of sampling from a DPP is prohibitive in large-scale applications, which has triggered an effort towards efficient approximate samplers. We build a novel MCMC sampler that combines ideas from combinatorial geometry, linear programming, and Monte Carlo methods to sample from DPPs with a fixed sample cardinality, also called projection DPPs. Our sampler leverages the ability of the hit-and-run MCMC kernel to move efficiently across convex bodies. Previous theoretical results yield a fast mixing time for our chain when targeting a distribution that is close to a projection DPP, but not a DPP in general. Our empirical results demonstrate that this extends to sampling projection DPPs, i.e., our sampler is more sample-efficient than previous approaches, which in turn translates to faster convergence when dealing with costly-to-evaluate functions, such as summary extraction in our experiments.
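As a concrete picture of the MCMC kernel involved, here is a generic hit-and-run step on a polytope {x : Ax <= b}; the paper's chain runs this inside a zonotope and adds a linear-programming step to map continuous points back to DPP samples, which is omitted here.

```python
# One hit-and-run step on a bounded polytope {x : A x <= b}: draw a uniform
# random direction, intersect the resulting line with the body, and sample
# uniformly on the chord. Generic kernel only; the zonotope construction and
# the Metropolis correction toward the projection DPP are not shown.
import numpy as np

def hit_and_run_step(x, A, b, rng=None):
    rng = rng or np.random.default_rng()
    d = rng.standard_normal(x.shape)
    d /= np.linalg.norm(d)
    Ad, slack = A @ d, b - A @ x       # need t * Ad <= slack componentwise
    t_hi = np.min(slack[Ad > 0] / Ad[Ad > 0])
    t_lo = np.max(slack[Ad < 0] / Ad[Ad < 0])
    return x + rng.uniform(t_lo, t_hi) * d
```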