An Introduction to the Practical and Theoretical Aspects of Mixture-of-Experts Modeling
Mixture-of-experts (MoE) models are a powerful paradigm for modeling data
arising from complex data generating processes (DGPs). In this article, we
demonstrate how different MoE models can be constructed to approximate the
underlying DGPs of arbitrary types of data. Due to the probabilistic nature of
MoE models, we propose the maximum quasi-likelihood (MQL) estimator as a method
for estimating MoE model parameters from data, and we provide conditions under
which MQL estimators are consistent and asymptotically normal. The blockwise
minorization-maximization (blockwise-MM) algorithm framework is proposed as an
all-purpose method for constructing algorithms for obtaining MQL estimators. An
example derivation of a blockwise-MM algorithm is provided. We then present a
method for constructing information criteria for estimating the number of
components in MoE models and provide justification for the classic Bayesian
information criterion (BIC). We explain how MoE models can be used to conduct
classification, clustering, and regression, and we illustrate these applications
via a pair of worked examples.
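The abstract proposes information criteria, and in particular BIC, for estimating the number of components. A minimal sketch of BIC-based selection, assuming the maximized log-likelihoods for each candidate model are already available (the numbers below are illustrative, not from the article):

```python
import math

def bic(log_likelihood, n_params, n_obs):
    # BIC = k * ln(n) - 2 * ln(L); smaller is better
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Hypothetical maximized log-likelihoods for MoE fits with g = 1..4 components
fits = {1: -1250.0, 2: -1180.0, 3: -1172.0, 4: -1169.0}
n_obs = 500
params_per_component = 5  # assumed parameter count per expert/gate pair

scores = {g: bic(ll, g * params_per_component, n_obs) for g, ll in fits.items()}
best_g = min(scores, key=scores.get)
print(best_g)
```

The fit improves monotonically with more components, but the ln(n) penalty stops the selection once the gain in log-likelihood no longer justifies the extra parameters.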
Hidden Truncation Hyperbolic Distributions, Finite Mixtures Thereof, and Their Application for Clustering
A hidden truncation hyperbolic (HTH) distribution is introduced and finite
mixtures thereof are applied for clustering. A stochastic representation of the
HTH distribution is given and a density is derived. A hierarchical
representation is described, which aids in parameter estimation. Finite
mixtures of HTH distributions are presented and their identifiability is
proved. The convexity of the HTH distribution is discussed, which is important
in clustering applications, and some theoretical results in this direction are
presented. The relationship between the HTH distribution and other skewed
distributions in the literature is discussed. Illustrations are provided, both
of the HTH distribution and of the application of finite mixtures thereof for
clustering.
A Multivariate Poisson-Log Normal Mixture Model for Clustering Transcriptome Sequencing Data
High-dimensional data of a discrete and skewed nature are commonly encountered
in high-throughput sequencing studies. Analyzing the gene network itself, or
the interplay between genes, in this type of data continues to present many
challenges. As data visualization techniques become cumbersome for higher
dimensions and unconvincing when there is no clear separation between
homogeneous subgroups within the data, cluster analysis provides an intuitive
alternative. The aim of applying mixture model-based clustering in this context
is to discover groups of co-expressed genes, which can shed light on biological
functions and pathways of gene products. A mixture of multivariate Poisson-Log
Normal (MPLN) model is proposed for clustering of high-throughput transcriptome
sequencing data. The MPLN model is able to fit a wide range of correlation and
overdispersion situations, and is ideal for modeling multivariate count data
from RNA sequencing studies. Parameter estimation is carried out via a Markov
chain Monte Carlo expectation-maximization algorithm (MCMC-EM), and information
criteria are used for model selection.
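The key property motivating the Poisson-log-normal family for RNA-seq counts is overdispersion: variance exceeding the mean, which a plain Poisson cannot capture. A minimal univariate simulation of the hierarchical construction, with illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hierarchical Poisson-log-normal draw:
# lambda_i = exp(mu + sigma * Z_i), then Y_i | lambda_i ~ Poisson(lambda_i)
mu, sigma, n = 1.0, 0.8, 20000
rates = np.exp(mu + sigma * rng.standard_normal(n))
counts = rng.poisson(rates)

# The log-normal mixing inflates the variance above the mean;
# for a plain Poisson the two would be equal.
print(counts.mean(), counts.var())
```

In the multivariate MPLN model, the latent log-rates follow a multivariate normal, so correlation between genes enters through its covariance matrix.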
A Mixture of Generalized Hyperbolic Distributions
We introduce a mixture of generalized hyperbolic distributions as an
alternative to the ubiquitous mixture of Gaussian distributions as well as
their near relatives, of which mixtures of multivariate t and skew-t
distributions are predominant. The mathematical development of our mixture of
generalized hyperbolic distributions model relies on its relationship with the
generalized inverse Gaussian distribution. The latter is reviewed before our
mixture models are presented along with details of the aforesaid reliance.
Parameter estimation is outlined within the expectation-maximization framework
before the clustering performance of our mixture models is illustrated via
applications on simulated and real data. In particular, the ability of our
models to recover parameters for data from underlying Gaussian and skew-t
distributions is demonstrated. Finally, the role of generalized hyperbolic
mixtures within the wider model-based clustering, classification, and density
estimation literature is discussed.
Model-Based Multiple Instance Learning
While Multiple Instance (MI) data are point patterns -- sets or multi-sets of
unordered points -- appropriate statistical point pattern models have not been
used in MI learning. This article proposes a framework for model-based MI
learning using point process theory. Likelihood functions for point pattern
data derived from point process theory enable principled yet conceptually
transparent extensions of learning tasks, such as classification, novelty
detection and clustering, to point pattern data. Furthermore, tractable point
pattern models as well as solutions for learning and decision making from point
pattern data are developed.
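A point-pattern likelihood of the kind the abstract describes can be sketched with the simplest i.i.d.-cluster model: bag cardinality is Poisson, and the points are drawn i.i.d. from a class-specific density. Classification then compares point-process log-likelihoods across classes. The class parameters below are hypothetical, chosen only to make the example concrete:

```python
import math

def poisson_pp_loglik(points, rho, log_density):
    # i.i.d.-cluster model: |X| ~ Poisson(rho), points i.i.d. with density f.
    # log p(X) = -rho + n log(rho) - log(n!) + sum_i log f(x_i)
    n = len(points)
    return (-rho + n * math.log(rho) - math.lgamma(n + 1)
            + sum(log_density(x) for x in points))

def log_gauss(mean, sd):
    # 1-D Gaussian log-density as a stand-in feature model
    return lambda x: (-0.5 * ((x - mean) / sd) ** 2
                      - math.log(sd * math.sqrt(2 * math.pi)))

# Two hypothetical classes: A emits ~3 points near 0, B emits ~10 near 5
classes = {
    "A": (3.0, log_gauss(0.0, 1.0)),
    "B": (10.0, log_gauss(5.0, 1.0)),
}

def classify(bag):
    return max(classes, key=lambda c: poisson_pp_loglik(bag, *classes[c]))

print(classify([0.1, -0.3, 0.2]))
print(classify([4.8, 5.1, 5.3, 4.9, 5.0, 5.2, 4.7, 5.4]))
```

Note that the likelihood depends on both the number of points and their locations, which is exactly what distinguishes a principled point-process treatment of MI bags from treating each instance independently.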
Dirichlet Process Parsimonious Mixtures for clustering
The parsimonious Gaussian mixture models, which exploit an eigenvalue
decomposition of the group covariance matrices of the Gaussian mixture, have
shown their success in particular in cluster analysis. Their estimation is in
general performed by maximum likelihood estimation and has also been considered
from a parametric Bayesian perspective. We propose new Dirichlet Process
Parsimonious mixtures (DPPM) which represent a Bayesian nonparametric
formulation of these parsimonious Gaussian mixture models. The proposed DPPM
models allow one to
simultaneously infer the model parameters, the optimal number of mixture
components and the optimal parsimonious mixture structure from the data. We
develop a Gibbs sampling technique for maximum a posteriori (MAP) estimation of
the developed DPPM models and provide a Bayesian model selection framework by
using Bayes factors. We apply them to cluster simulated data and real data
sets, and compare them to the standard parsimonious mixture models. The
obtained results highlight the effectiveness of the proposed nonparametric
parsimonious mixture models as a good nonparametric alternative to the
parametric parsimonious models.
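The Bayesian nonparametric ingredient that lets DPPM infer the number of components is the Dirichlet process, whose induced prior over partitions is the Chinese restaurant process. A minimal sketch of a CRP draw (the concentration value is illustrative):

```python
import random

def crp(n, alpha, seed=0):
    # Chinese restaurant process: the partition prior underlying a
    # Dirichlet process mixture. alpha controls the expected number of
    # clusters, which is random rather than fixed in advance.
    rng = random.Random(seed)
    tables = []  # occupancy count per cluster
    for i in range(n):
        # join an existing table with prob proportional to its occupancy,
        # or open a new one with prob alpha / (i + alpha)
        weights = tables + [alpha]
        r = rng.uniform(0, i + alpha)
        acc = 0.0
        for k, w in enumerate(weights):
            acc += w
            if r <= acc:
                break
        if k == len(tables):
            tables.append(1)
        else:
            tables[k] += 1
    return tables

tables = crp(200, alpha=1.0)
print(len(tables), sorted(tables, reverse=True))
```

In a Gibbs sampler for a DP mixture, each sweep reassigns observations with these same conditional weights, multiplied by the component likelihoods, so the number of occupied clusters grows and shrinks as sampling proceeds.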
A Mixture of Generalized Hyperbolic Factor Analyzers
Model-based clustering imposes a finite mixture modelling structure on data
for clustering. Finite mixture models assume that the population density is a
convex combination of a finite number of component densities; the distribution
assumed within each component is a basic choice of each particular model. Among all
distributions that have been tried, the generalized hyperbolic distribution has
the advantage that it is a generalization of several other distributions, such
as the Gaussian distribution and the skew t-distribution. With specific parameters,
it can represent either a symmetric or a skewed distribution. While its
inherent flexibility is an advantage in many ways, it entails estimating more
parameters than its special and limiting cases require. The aim of this work is to
propose a mixture of generalized hyperbolic factor analyzers to introduce
parsimony and extend the method to high dimensional data. This work can be seen
as an extension of the mixture of factor analyzers model to generalized
hyperbolic mixtures. The performance of our generalized hyperbolic factor
analyzers is illustrated on real data, where it performs favourably compared to
its Gaussian analogue.
Extending mixtures of factor models using the restricted multivariate skew-normal distribution
The mixture of factor analyzers (MFA) model provides a powerful tool for
analyzing high-dimensional data as it can reduce the number of free parameters
through its factor-analytic representation of the component covariance
matrices. This paper extends the MFA model to incorporate a restricted version
of the multivariate skew-normal distribution to model the distribution of the
latent component factors, called mixtures of skew-normal factor analyzers
(MSNFA). The proposed MSNFA model relaxes the normality assumption on the
latent factors in order to accommodate skewness in the
observed data. The MSNFA model thus provides an approach to model-based density
estimation and clustering of high-dimensional data exhibiting asymmetric
characteristics. A computationally feasible ECM algorithm is developed for
computing the maximum likelihood estimates of the parameters. Model selection
can be made on the basis of three commonly used information-based criteria. The
potential of the proposed methodology is exemplified through applications to
two real examples, and the results are compared with those obtained from
fitting the MFA model.
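The parameter reduction that both factor-analyzer abstracts rely on comes from the structure Sigma = Lambda Lambda^T + Psi for each component covariance. A quick count of free parameters shows the parsimony (the dimensions below are arbitrary examples):

```python
def full_cov_params(p):
    # unconstrained symmetric p x p covariance: p(p + 1)/2 free parameters
    return p * (p + 1) // 2

def factor_cov_params(p, q):
    # factor-analytic structure Sigma = Lambda Lambda^T + Psi:
    # p*q loadings + p uniquenesses, minus q(q - 1)/2 for the
    # rotational invariance of the loading matrix
    return p * q + p - q * (q - 1) // 2

p, q = 50, 3  # e.g. 50 observed variables, 3 latent factors
print(full_cov_params(p), factor_cov_params(p, q))
```

The gap widens quadratically in p, which is why the factor-analytic representation is the standard route to fitting mixtures in high dimensions.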
Multiple Scaled Contaminated Normal Distribution and Its Application in Clustering
The multivariate contaminated normal (MCN) distribution represents a simple
heavy-tailed generalization of the multivariate normal (MN) distribution to
model elliptically contoured scatters in the presence of mild outliers, referred
to as "bad" points. The MCN can also automatically detect bad points. The price
of these advantages is two additional parameters, both with specific and useful
interpretations: proportion of good observations and degree of contamination.
However, points may be bad in some dimensions but good in others. The use of an
overall proportion of good observations and of an overall degree of
contamination is limiting. To overcome this limitation, we propose a multiple
scaled contaminated normal (MSCN) distribution with a proportion of good
observations and a degree of contamination for each dimension. Once the model
is fitted, each observation has a posterior probability of being good with
respect to each dimension. Thanks to this probability, we have a method for
simultaneous directional robust estimation of the parameters of the MN
distribution based on down-weighting and for the automatic directional
detection of bad points by means of maximum a posteriori probabilities. The
term "directional" is added to specify that the method works separately for
each dimension. Mixtures of MSCN distributions are also proposed as an
application of the proposed model for robust clustering. An extension of the EM
algorithm is used for parameter estimation based on the maximum likelihood
approach. Real and simulated data are used to show the usefulness of our
mixture with respect to well-established mixtures of symmetric distributions
with heavy tails.
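The posterior probability of an observation being "good", central to the contaminated-normal machinery above, is easy to illustrate in one dimension: the density is a two-component mixture sharing the same mean, with the "bad" component's variance inflated by a factor eta. The parameter values below are illustrative:

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def prob_good(x, mu=0.0, var=1.0, alpha=0.95, eta=10.0):
    # contaminated normal: f(x) = alpha*N(mu, var) + (1 - alpha)*N(mu, eta*var)
    # alpha = proportion of good points, eta = degree of contamination.
    # Returns the posterior probability that x came from the good component.
    good = alpha * normal_pdf(x, mu, var)
    bad = (1.0 - alpha) * normal_pdf(x, mu, eta * var)
    return good / (good + bad)

# a point near the center vs. a point far in the tail
print(prob_good(0.5), prob_good(6.0))
```

In the multiple scaled (MSCN) extension, one such probability is computed per dimension, which is what allows a point to be flagged as bad in some coordinates and good in others.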
A Mixture of SDB Skew-t Factor Analyzers
Mixtures of skew-t distributions offer a flexible choice for model-based
clustering. A mixture model of this sort can be implemented using a variety of
formulations of the skew-t distribution. Herein we develop a mixture of skew-t
factor analyzers model for clustering of high-dimensional data using a flexible
formulation of the skew-t distribution. Methodological details of our approach,
which represents an extension of the mixture of factor analyzers model to a
flexible skew-t distribution, are outlined and details of parameter estimation
are provided. Clustering results are illustrated and compared to an alternative
formulation of the mixture of skew-t factor analyzers model as well as the
mixture of factor analyzers model.