Infinite Mixtures of Multivariate Gaussian Processes
This paper presents a new model called infinite mixtures of multivariate
Gaussian processes, which can be used to learn vector-valued functions and
applied to multitask learning. As an extension of the single multivariate
Gaussian process, the mixture model has the advantages of modeling multimodal
data and alleviating the computationally cubic complexity of the multivariate
Gaussian process. A Dirichlet process prior is adopted to allow the (possibly
infinite) number of mixture components to be automatically inferred from
training data, and Markov chain Monte Carlo sampling techniques are used for
parameter and latent variable inference. Preliminary experimental results on
multivariate regression show the feasibility of the proposed model.
Comment: Proceedings of the International Conference on Machine Learning and
Cybernetics, 2013, pages 1011-101
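As a rough illustration of how a Dirichlet process prior lets the number of mixture components grow with the data, the sketch below samples cluster assignments from a Chinese restaurant process, a standard representation of that prior. It is not the paper's sampler; the function name `sample_crp_assignments` and the concentration parameter `alpha` are assumptions made for this example.

```python
import numpy as np

def sample_crp_assignments(num_points, alpha, seed=0):
    """Draw cluster labels from a Chinese restaurant process (illustrative only)."""
    rng = np.random.default_rng(seed)
    assignments = []  # cluster label for each point, in arrival order
    counts = []       # current size of each existing cluster
    for _ in range(num_points):
        # Existing clusters are chosen in proportion to their size;
        # a new cluster is opened with weight alpha.
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = int(rng.choice(len(probs), p=probs))
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        assignments.append(k)
    return assignments, counts

labels, sizes = sample_crp_assignments(num_points=100, alpha=1.0)
print(len(sizes), "clusters for 100 points")
```

Larger values of `alpha` tend to produce more clusters; in a full mixture model each cluster would carry its own Gaussian process component.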
Sparse covariance estimation in heterogeneous samples
Standard Gaussian graphical models (GGMs) implicitly assume that the
conditional independence among variables is common to all observations in the
sample. However, in practice, observations are usually collected from
heterogeneous populations where such an assumption is not satisfied, leading in
turn to nonlinear relationships among variables. To tackle these problems we
explore mixtures of GGMs; in particular, we consider both infinite mixture
models of GGMs and infinite hidden Markov models with GGM emission
distributions. Such models allow us to divide a heterogeneous population into
homogeneous groups, with each cluster having its own conditional independence
structure. The main advantage of considering infinite mixtures is that they
allow us to easily estimate the number of subpopulations in the
sample. As an illustration, we study the trends in exchange rate fluctuations
in the pre-Euro era. This example demonstrates that the models are very
flexible while providing extremely interesting insights into
real-life applications.
Multiplying a Gaussian Matrix by a Gaussian Vector
We provide a new and simple characterization of the multivariate generalized
Laplace distribution. In particular, this result implies that the product of a
Gaussian matrix with independent and identically distributed columns by an
independent isotropic Gaussian vector follows a symmetric multivariate
generalized Laplace distribution.
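One way to see where the heavy tails come from: conditional on the vector z, the product Gz is Gaussian with covariance ||z||^2 I when the entries of G are standard normal, so the product is a Gaussian scale mixture with a chi-squared mixing variance. Under that assumption each coordinate has kurtosis 3(1 + 2/n), where n is the length of z; for n = 2 this is 6, the kurtosis of the classical Laplace distribution. The sketch below checks this by simulation. The standard normal entries, the function names, and the sample sizes are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def sample_matrix_vector_product(num_samples, d, n, seed=0):
    """Draw samples of G @ z with G (d x n) and z (n,) independent standard normal."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((num_samples, d, n))  # one Gaussian matrix per sample
    z = rng.standard_normal((num_samples, n))     # one isotropic Gaussian vector per sample
    # Batched matrix-vector product: result has shape (num_samples, d).
    return np.einsum("sij,sj->si", G, z)

def sample_kurtosis(v):
    """Non-excess kurtosis: fourth central moment over squared variance."""
    v = v - v.mean()
    return np.mean(v**4) / np.mean(v**2) ** 2

n = 2
x = sample_matrix_vector_product(num_samples=200_000, d=3, n=n)
print("empirical kurtosis:", sample_kurtosis(x[:, 0]))
print("scale-mixture value:", 3 * (1 + 2 / n))
```

A Gaussian coordinate would have kurtosis 3, so the gap between the two gives a quick sanity check that the product is not itself Gaussian.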
Identifying Mixtures of Mixtures Using Bayesian Estimation
The use of a finite mixture of normal distributions in model-based clustering
allows non-Gaussian data clusters to be captured. However, identifying the clusters
from the normal components is challenging and is generally achieved either by
imposing constraints on the model or by using post-processing procedures.
Within the Bayesian framework we propose a different approach based on sparse
finite mixtures to achieve identifiability. We specify a hierarchical prior
where the hyperparameters are carefully selected so that they reflect the
intended cluster structure. In addition, this prior allows the model to be
estimated using standard MCMC sampling methods. In combination with a
post-processing approach which resolves the label switching issue and results
in an identified model, our approach makes it possible to simultaneously (1) determine the
number of clusters, (2) flexibly approximate the cluster distributions in a
semi-parametric way using finite mixtures of normals and (3) identify
cluster-specific parameters and classify observations. The proposed approach is
illustrated in two simulation studies and on benchmark data sets.
Comment: 49 pages
On approximating copulas by finite mixtures
Copulas are now frequently used to approximate or estimate multivariate
distributions because of their ability to take into account the multivariate
dependence of the variables while controlling the approximation properties of
the marginal densities. Copula-based multivariate models can often also be more
parsimonious than fitting a flexible multivariate model, such as a mixture of
normals model, directly to the data. However, to be effective, it is imperative
that the family of copula models considered is sufficiently flexible. Although
finite mixtures of copulas have been used to construct flexible families of
copulas, their approximation properties are not well understood and we show
that natural candidates such as mixtures of elliptical copulas and mixtures of
Archimedean copulas cannot approximate a general copula arbitrarily well. Our
article develops fundamental tools for approximating a general copula
arbitrarily well by a mixture and proposes a family of finite mixtures that can
do so. We illustrate empirically on a financial data set that our approach for
estimating a copula can be much more parsimonious and results in a better fit
than approximating the copula by a mixture of normal copulas.
Comment: 26 pages, 1 figure, and 2 tables
Beta-Product Poisson-Dirichlet Processes
Time series data may exhibit clustering over time and, in a multiple time
series context, the clustering behavior may differ across the series. This
paper is motivated by the Bayesian nonparametric modeling of the dependence
between the clustering structures and the distributions of different time
series. We follow a Dirichlet process mixture approach and introduce a new
class of multivariate dependent Dirichlet processes (DDP). The proposed DDP are
represented in terms of a vector of stick-breaking processes with dependent
weights. The weights are beta random vectors that determine different and
dependent clustering effects along the dimension of the DDP vector. We discuss
some theoretical properties and provide an efficient Markov chain Monte Carlo
algorithm for posterior computation. The effectiveness of the method is
illustrated with a simulation study and an application to the United States and
the European Union industrial production indexes.
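For readers unfamiliar with stick-breaking, the sketch below generates truncated weights for a single, ordinary Dirichlet process: each weight is a Beta(1, alpha) fraction of whatever stick length remains. It does not implement the dependent, beta-product construction proposed in the paper; the function name `stick_breaking_weights`, the truncation level, and `alpha` are assumptions for this example.

```python
import numpy as np

def stick_breaking_weights(alpha, num_sticks, seed=0):
    """Truncated stick-breaking weights for a standard Dirichlet process."""
    rng = np.random.default_rng(seed)
    # Fraction of the remaining stick broken off at each step.
    v = rng.beta(1.0, alpha, size=num_sticks)
    # Stick length remaining before each break: 1, (1-v1), (1-v1)(1-v2), ...
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    return v * remaining

w = stick_breaking_weights(alpha=2.0, num_sticks=50)
print("mass captured by 50 sticks:", w.sum())
```

In a dependent version such as the one described above, several series would each have their own weight sequence, with the beta fractions coupled across series.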