Infinite Mixtures of Multivariate Gaussian Processes
This paper presents a new model called infinite mixtures of multivariate
Gaussian processes, which can be used to learn vector-valued functions and
applied to multitask learning. As an extension of the single multivariate
Gaussian process, the mixture model has the advantages of modeling multimodal
data and alleviating the cubic computational complexity of the multivariate
Gaussian process. A Dirichlet process prior is adopted to allow the (possibly
infinite) number of mixture components to be automatically inferred from
training data, and Markov chain Monte Carlo sampling techniques are used for
parameter and latent variable inference. Preliminary experimental results on
multivariate regression show the feasibility of the proposed model.
Comment: Proceedings of the International Conference on Machine Learning and
Cybernetics, 2013, pages 1011-101
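As a rough illustration of how a Dirichlet process prior lets the number of mixture components be inferred rather than fixed, the sketch below draws cluster labels from a Chinese restaurant process, the standard sequential view of such a prior. It is not the paper's sampler: the function name `crp_assignments`, the concentration value `alpha=1.0`, and the seed are assumptions chosen only for illustration.

```python
import random

def crp_assignments(n_points, alpha, seed=0):
    """Draw cluster labels for n_points from a Chinese restaurant process.

    Hypothetical helper for illustration; alpha is the DP concentration.
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of points currently in cluster k
    labels = []
    for _ in range(n_points):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels.append(k)
    return labels, counts

labels, counts = crp_assignments(n_points=200, alpha=1.0)
print("clusters used:", len(counts))
```

The number of occupied clusters is an outcome of the draw, not an input; larger values of `alpha` tend to open more clusters.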
Infinite factorization of multiple non-parametric views
Combined analysis of multiple data sources is of increasing applied interest, in particular for distinguishing shared and source-specific aspects. We extend this rationale of classical canonical correlation analysis into a flexible, generative and non-parametric clustering setting, by introducing a novel non-parametric hierarchical mixture model. The lower level of the model describes each source with a flexible non-parametric mixture, and the top level combines these to describe commonalities of the sources. The lower-level clusters arise from hierarchical Dirichlet processes, inducing an infinite-dimensional contingency table between the views. The commonalities between the sources are modeled by an infinite block model of the contingency table, interpretable as non-negative factorization of infinite matrices, or as a prior for infinite contingency tables. With Gaussian mixture components plugged in for continuous measurements, the model is applied to two views of genes, mRNA expression and abundance of the produced proteins, to expose groups of genes that are co-regulated in either or both of the views.
Cluster analysis of co-expression is a standard, simple way of screening for co-regulation, and the two-view analysis extends the approach to distinguishing between pre- and post-translational regulation.
The Infinite Mixture of Infinite Gaussian Mixtures
The Dirichlet process mixture of Gaussians (DPMG) has been used in the literature for clustering and density estimation problems. However, many real-world data sets exhibit cluster distributions that cannot be captured by a single Gaussian. Modeling such data sets by DPMG creates several extraneous clusters even when clusters are relatively well-defined. Herein, we present the infinite mixture of infinite Gaussian mixtures (I2GMM) for more flexible modeling of data sets with skewed and multi-modal cluster distributions. Instead of using a single Gaussian for each cluster as in the standard DPMG model, the generative model of I2GMM uses a single DPMG for each cluster. The individual DPMGs are linked together through centering of their base distributions at the atoms of a higher-level DP prior. Inference is performed by a collapsed Gibbs sampler that also enables partial parallelization. Experimental results on several artificial and real-world data sets suggest the proposed I2GMM model can predict clusters more accurately than existing variational Bayes and Gibbs sampler versions of DPMG.
Multidimensional Membership Mixture Models
We present the multidimensional membership mixture (M3) models where every
dimension of the membership represents an independent mixture model and each
data point is generated from the selected mixture components jointly. This is
helpful when the data has a certain shared structure. For example, three unique
means and three unique variances can effectively form a Gaussian mixture model
with nine components, while requiring only six parameters to fully describe it.
In this paper, we present three instantiations of M3 models (together with the
learning and inference algorithms): infinite, finite, and hybrid, depending on
whether the number of mixtures is fixed or not. They are built upon Dirichlet
process mixture models, latent Dirichlet allocation, and a combination
respectively. We then consider two applications: topic modeling and learning 3D
object arrangements. Our experiments show that our M3 models achieve better
performance using fewer topics than many classic topic models. We also observe
that topics from the different dimensions of M3 models are meaningful and
orthogonal to each other.
Comment: 9 pages, 7 figures
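The counting argument in this abstract (three means and three variances giving nine Gaussian components from six parameters) can be sketched as follows. This is only a toy illustration, not the M3 learning or inference algorithm: the numeric values, the per-dimension membership weights, and the assumption that the two membership dimensions are chosen independently are all made up for the example.

```python
import math
from itertools import product

# Hypothetical shared parameters: 3 means + 3 variances = 6 values.
means = [-5.0, 0.0, 5.0]
variances = [0.5, 1.0, 2.0]

# Hypothetical membership weights, one mixture per membership dimension.
mean_weights = [1 / 3, 1 / 3, 1 / 3]
var_weights = [0.2, 0.3, 0.5]

# Each data point picks one mean and one variance jointly, so the
# effective Gaussian mixture has 3 x 3 = 9 components.
components = [
    (wm * wv, m, v)
    for (wm, m), (wv, v) in product(
        zip(mean_weights, means), zip(var_weights, variances)
    )
]

def density(x):
    """Density of the induced nine-component Gaussian mixture at x."""
    return sum(
        w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
        for w, m, v in components
    )

print(len(components), "components from", len(means) + len(variances), "parameters")
```

A conventional nine-component Gaussian mixture would need nine means and nine variances; sharing them across membership dimensions is what reduces the count.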
A non-parametric hierarchical clustering model
© 2015 IEEE. We present a novel non-parametric hierarchical clustering model (NHCM) based on Gaussian mixture models. NHCM uses a novel Dirichlet process (DP) prior allowing for more flexible modeling of the data, where the base distribution of the DP is itself an infinite mixture of Gaussian conjugate priors. NHCM can be thought of as a hierarchical clustering model, in which the low-level base prior governs the distribution of the data points forming sub-clusters, and the higher-level prior governs the distribution of the sub-clusters forming clusters. Using this hierarchical configuration, we can maintain low complexity of the model and allow for clustering skewed complex data. To perform inference, we propose a Gibbs sampling algorithm. Empirical investigations have been carried out to analyse the efficiency of the proposed clustering model.
A nonparametric Bayesian approach toward robot learning by demonstration
In recent years, many authors have considered applying machine learning methodologies to robot learning by demonstration. Gaussian mixture regression (GMR) is one of the most successful methodologies used for this purpose. A major limitation of GMR models concerns automatic selection of the proper number of model states, i.e., the number of model component densities. Existing methods, including likelihood- or entropy-based criteria, usually tend to yield noisy model size estimates while imposing heavy computational requirements. Recently, Dirichlet process (infinite) mixture models have emerged at the core of nonparametric Bayesian statistics as promising candidates for clustering applications where the number of clusters is unknown a priori. Under this motivation, to resolve the aforementioned issues of GMR-based methods for robot learning by demonstration, in this paper we introduce a nonparametric Bayesian formulation for the GMR model, the Dirichlet process GMR model. We derive an efficient variational Bayesian inference algorithm for the proposed model, and we experimentally investigate its efficacy as a robot learning by demonstration methodology, considering a number of demanding robot learning by demonstration scenarios.
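For readers unfamiliar with Gaussian mixture regression, the sketch below shows the standard prediction step for a scalar input and output: each joint Gaussian component contributes its conditional mean of y given x, weighted by how responsible that component is for x. It is a minimal illustration with a fixed, hand-written mixture, not the Dirichlet process GMR model or its variational inference; the function names and the component tuple layout are assumptions made for the example.

```python
import math

def normal_pdf(x, mean, var):
    """Univariate Gaussian density."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def gmr_predict(x, components):
    """Conditional mean E[y | x] under a mixture of joint Gaussians on (x, y).

    Each component is a hypothetical tuple
    (weight, mean_x, mean_y, var_x, cov_xy).
    """
    # Unnormalised responsibility of each component for the input x.
    resp = [w * normal_pdf(x, mx, vx) for w, mx, _, vx, _ in components]
    total = sum(resp)
    # Mix the per-component conditional means mean_y + cov_xy / var_x * (x - mean_x).
    return sum(
        (r / total) * (my + cxy / vx * (x - mx))
        for r, (_, mx, my, vx, cxy) in zip(resp, components)
    )

# Two hypothetical components with equal weight.
mixture = [(0.5, -1.0, 0.0, 1.0, 0.5), (0.5, 1.0, 2.0, 1.0, 0.5)]
print(gmr_predict(0.0, mixture))
```

The number of components is fixed by hand here; choosing it automatically is the problem the abstract addresses.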
- …