Latent Gaussian processes for distribution estimation of multivariate categorical data
Multivariate categorical data occur in many applications of machine learning.
One of the main difficulties with these vectors of categorical variables is
sparsity. The number of possible observations grows exponentially with vector
length, but dataset diversity might be poor in comparison. Recent models have
achieved significant improvements in supervised tasks with this data. These models
embed observations in a continuous space to capture similarities between them.
Building on these ideas we propose a Bayesian model for the unsupervised task
of distribution estimation of multivariate categorical data. We model vectors
of categorical variables as generated from a non-linear transformation of a
continuous latent space. Non-linearity captures multi-modality in the
distribution. The continuous representation addresses sparsity. Our model ties
together many existing models, linking the linear categorical latent Gaussian
model, the Gaussian process latent variable model, and Gaussian process
classification. We derive inference for our model based on recent developments
in sampling based variational inference. We show empirically that the model
outperforms its linear and discrete counterparts in imputation tasks of sparse
data. YG is supported by the Google European fellowship in Machine Learning. This is the final version of the article. It first appeared from Microtome Publishing via http://jmlr.org/proceedings/papers/v37/gala15.htm
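The generative story in the abstract above (a continuous latent point, a non-linear map, then one categorical distribution per variable) can be sketched as follows. This is an illustration only, not the authors' implementation: a small random tanh network stands in for the Gaussian process mapping, and the function name, the `hidden` width, and all sizes are assumptions made for the example.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

def sample_categorical_vectors(n, latent_dim, n_vars, n_cats, hidden=16, seed=0):
    """Draw n vectors of n_vars categorical variables, each with n_cats values."""
    rng = np.random.default_rng(seed)
    # Continuous latent points with a standard normal prior.
    x = rng.standard_normal((n, latent_dim))
    # ASSUMPTION: a random tanh network replaces the GP-distributed non-linear map.
    w1 = rng.standard_normal((latent_dim, hidden))
    w2 = rng.standard_normal((hidden, n_vars * n_cats))
    logits = (np.tanh(x @ w1) @ w2).reshape(n, n_vars, n_cats)
    # One softmax per categorical variable gives its category probabilities.
    probs = softmax(logits)
    # Inverse-CDF sampling of one category per variable.
    u = rng.random((n, n_vars, 1))
    y = (u > probs.cumsum(axis=-1)).sum(axis=-1)
    y = np.minimum(y, n_cats - 1)  # guard against round-off in the cumulative sum
    return x, probs, y

x, probs, y = sample_categorical_vectors(n=5, latent_dim=2, n_vars=3, n_cats=4)
```

Nearby latent points produce similar category probabilities, which is the sense in which a continuous representation can share statistical strength across sparse observations.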
The supervised IBP: neighbourhood preserving infinite latent feature models
We propose a probabilistic model to infer supervised latent variables in the Hamming space from observed data. Our model allows simultaneous inference of the number of binary latent variables, and their values. The latent variables preserve neighbourhood structure of the data in the sense that objects in the same semantic concept have similar latent values, and objects in different concepts have dissimilar latent values. We formulate the supervised infinite latent variable problem based on an intuitive principle of pulling objects together if they are of the same type, and pushing them apart if they are not. We then combine this principle with a flexible Indian Buffet Process prior on the latent variables. We show that the inferred supervised latent variables can be directly used to perform a nearest neighbour search for the purpose of retrieval. We introduce a new application of dynamically extending hash codes, and show how to effectively couple the structure of the hash codes with the continuously growing structure of the neighbourhood preserving infinite latent feature space.
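The retrieval step mentioned in the abstract above, nearest neighbour search over binary latent codes, can be sketched as below. This shows only the generic Hamming-distance lookup; the codes are made-up toy values rather than inferred latent features, and `hamming_nearest` is a hypothetical helper name.

```python
import numpy as np

def hamming_nearest(query, codes, k=1):
    """Return indices and Hamming distances of the k codes closest to query."""
    # Hamming distance = number of positions where the bits differ.
    dists = np.count_nonzero(codes != query, axis=1)
    # Stable sort keeps the original order among ties.
    order = np.argsort(dists, kind="stable")[:k]
    return order, dists[order]

# Toy binary codes, one row per stored object (illustrative values only).
codes = np.array([[0, 0, 1, 1],
                  [1, 0, 1, 1],
                  [1, 1, 0, 0]], dtype=np.uint8)
query = np.array([0, 0, 1, 0], dtype=np.uint8)

idx, dist = hamming_nearest(query, codes, k=2)
```

If new binary features are appended as the latent space grows, the same lookup applies unchanged to the longer codes, provided the query and the stored codes have the same length.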
Gibbs Max-margin Topic Models with Data Augmentation
Max-margin learning is a powerful approach to building classifiers and
structured output predictors. Recent work on max-margin supervised topic models
has successfully integrated it with Bayesian topic models to discover
discriminative latent semantic structures and make accurate predictions for
unseen testing data. However, the resulting learning problems are usually hard
to solve because of the non-smoothness of the margin loss. Existing approaches
to building max-margin supervised topic models rely on an iterative procedure
to solve multiple latent SVM subproblems with additional mean-field assumptions
on the desired posterior distributions. This paper presents an alternative
approach by defining a new max-margin loss. Namely, we present Gibbs max-margin
supervised topic models, a latent variable Gibbs classifier to discover hidden
topic representations for various tasks, including classification, regression
and multi-task learning. Gibbs max-margin supervised topic models minimize an
expected margin loss, which is an upper bound of the existing margin loss
derived from an expected prediction rule. By introducing augmented variables
and integrating out the Dirichlet variables analytically by conjugacy, we
develop simple Gibbs sampling algorithms with no restricting assumptions and no
need to solve SVM subproblems. Furthermore, each step of the
"augment-and-collapse" Gibbs sampling algorithms has an analytical conditional
distribution, from which samples can be easily drawn. Experimental results
demonstrate significant improvements in time efficiency. The classification
performance is also significantly improved over competitors on binary,
multi-class and multi-label classification tasks. Comment: 35 pages
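The upper-bound claim in the abstract above follows from the convexity of the hinge loss: the average hinge loss over sampled classifiers is at least the hinge loss of the averaged prediction (Jensen's inequality). The sketch below checks this numerically for a single example. The weight samples, feature vector, and unit cost are illustrative assumptions, not quantities from the paper.

```python
import numpy as np

def hinge(margin, cost=1.0):
    # Hinge loss for a signed margin y * f(x).
    return np.maximum(0.0, cost - margin)

def margin_losses(y, features, weight_samples, cost=1.0):
    """Compare the loss of the expected prediction with the expected loss."""
    scores = weight_samples @ features  # one discriminant value per sampled classifier
    loss_of_expected_rule = hinge(y * scores.mean(), cost)
    expected_loss = hinge(y * scores, cost).mean()
    return loss_of_expected_rule, expected_loss

rng = np.random.default_rng(0)
# ASSUMPTION: Gaussian samples stand in for draws from a posterior over classifier weights.
weight_samples = rng.normal(loc=0.5, scale=1.0, size=(1000, 3))
features = np.array([0.2, 0.5, 0.3])  # hypothetical topic proportions for one document

loss_avg_rule, loss_gibbs = margin_losses(+1, features, weight_samples)
# By Jensen's inequality, loss_gibbs is never smaller than loss_avg_rule.
```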