Mixture of Bilateral-Projection Two-dimensional Probabilistic Principal Component Analysis
Probabilistic principal component analysis (PPCA) is built upon a global
linear mapping, which is insufficient to model complex data variation.
This paper proposes a mixture of bilateral-projection probabilistic principal
component analysis model (mixB2DPPCA) for 2D data. With multiple components in
the mixture, this model can be seen as a soft clustering algorithm and is
capable of modeling data with complex structures. A Bayesian inference scheme
is proposed, based on the variational EM (Expectation-Maximization) approach,
for learning the model parameters. Experiments on several publicly available
databases show that mixB2DPPCA performs substantially better, yielding lower
reconstruction errors and higher recognition rates than existing PCA-based
algorithms.
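As a rough sketch of the model family this abstract refers to, the standard PPCA generative model and a mixture of such models can be written as follows. The symbols (W, z, \mu, \sigma, \pi_c, and the bilateral factors L, R, M, E) do not appear in the abstract and are assumptions chosen for illustration. The bilateral form in the last equation is one plausible reading of "bilateral projection" and may differ from the authors' exact parameterization.

```latex
% PPCA: one global linear map from a q-dimensional latent z to a d-dimensional x
\begin{align}
  x &= W z + \mu + \varepsilon,
  \qquad z \sim \mathcal{N}(0, I_q),
  \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I_d), \\
  p(x) &= \mathcal{N}\!\left(x \mid \mu,\; W W^{\top} + \sigma^2 I_d\right).
\end{align}

% Mixture of PPCA components: each component c has its own local linear map,
% and the responsibilities p(c | x) give the soft cluster assignment
\begin{equation}
  p(x) = \sum_{c=1}^{C} \pi_c \,
  \mathcal{N}\!\left(x \mid \mu_c,\; W_c W_c^{\top} + \sigma_c^2 I_d\right),
  \qquad \sum_{c=1}^{C} \pi_c = 1 .
\end{equation}

% Assumed bilateral (2D) form: a matrix-valued observation X is projected
% from the left and from the right instead of being vectorized
\begin{equation}
  X = L Z R^{\top} + M + E .
\end{equation}
```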
The interplay of microscopic and mesoscopic structure in complex networks
Not all nodes in a network are created equal. Differences and similarities
exist at both individual node and group levels. Disentangling single node from
group properties is crucial for network modeling and structural inference.
Based on unbiased generative probabilistic exponential random graph models and
employing distributive message passing techniques, we present an efficient
algorithm that allows one to separate the contributions of individual nodes and
groups of nodes to the network structure. This leads to improved detection
accuracy of latent class structure in real world data sets compared to models
that focus on group structure alone. Furthermore, the inclusion of hitherto
neglected group specific effects in models used to assess the statistical
significance of small subgraph (motif) distributions in networks may be
sufficient to explain most of the observed statistics. We show the predictive
power of such generative models in forecasting putative gene-disease
associations in the Online Mendelian Inheritance in Man (OMIM) database. The
approach is suitable for both directed and undirected unipartite as well as
for bipartite networks.
Tied factor analysis for face recognition across large pose differences
Face recognition algorithms perform very unreliably when the pose of the probe face is different from the gallery face: typical feature vectors vary more with pose than with identity. We propose a generative model that creates a one-to-many mapping from an idealized “identity” space to the observed data space. In identity space, the representation for each individual does not vary with pose. We model the measured feature vector as being generated by a pose-contingent linear transformation of the identity variable in the presence of Gaussian noise. We term this model “tied” factor analysis. The choice of linear transformation (factors) depends on the pose, but the loadings are constant (tied) for a given individual. We use the EM algorithm to estimate the linear transformations and the noise parameters from training data.
We propose a probabilistic distance metric that allows a full posterior over possible matches to be established. We introduce a novel feature extraction process and investigate recognition performance by using the FERET, XM2VTS, and PIE databases. Recognition performance compares favorably with contemporary approaches.
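The generative model described in this abstract can be sketched in one equation. The notation is an assumption made for illustration: i indexes the individual, k the pose, h_i is the identity variable, F_k the pose-specific factor matrix, and the pose-specific offset m_k and noise covariance \Sigma_k are added here as plausible details that the abstract does not state.

```latex
% Tied factor analysis, sketched: the identity variable h_i is shared ("tied")
% across all poses of individual i, while the linear map depends on pose k
\begin{equation}
  x_{ik} = F_k h_i + m_k + \varepsilon_{ik},
  \qquad h_i \sim \mathcal{N}(0, I),
  \qquad \varepsilon_{ik} \sim \mathcal{N}(0, \Sigma_k).
\end{equation}
```

Under this reading, two images match when a single h_i explains both, which is what lets a posterior over possible matches be computed.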
High-Dimensional Regression with Gaussian Mixtures and Partially-Latent Response Variables
In this work we address the problem of approximating high-dimensional data
with a low-dimensional representation. We make the following contributions. We
propose an inverse regression method which exchanges the roles of input and
response, such that the low-dimensional variable becomes the regressor, and
which is tractable. We introduce a mixture model of locally linear
probabilistic mappings that first estimates the parameters of the inverse
regression and then infers closed-form solutions for the forward parameters of
the high-dimensional regression problem of interest. Moreover, we introduce a
partially-latent paradigm, such that the vector-valued response variable is
composed of both observed and latent entries, thus being able to deal with data
contaminated by experimental artifacts that cannot be explained with noise
models. The proposed probabilistic formulation could be viewed as a
latent-variable augmentation of regression. We devise expectation-maximization
(EM) procedures based on a data augmentation strategy which facilitates the
maximum-likelihood search over the model parameters. We propose two
augmentation schemes and we describe in detail the associated EM inference
procedures that may well be viewed as generalizations of a number of EM
regression, dimension reduction, and factor analysis algorithms. The proposed
framework is validated with both synthetic and real data. We provide
experimental evidence that our method outperforms several existing regression
techniques.
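The inverse-then-forward idea in this abstract can be illustrated with a mixture of locally affine Gaussian mappings. This is a minimal sketch under assumed notation, not the authors' exact formulation: x is the high-dimensional input, y the low-dimensional response, Z the component label, and A_k, b_k, \Sigma_k, c_k, \Gamma_k, \pi_k are hypothetical parameter names. The partially-latent part of the response is omitted.

```latex
% Inverse regression: the low-dimensional y acts as the regressor
\begin{align}
  p(x \mid y, Z = k) &= \mathcal{N}\!\left(x \mid A_k y + b_k,\; \Sigma_k\right), \\
  p(y \mid Z = k)    &= \mathcal{N}\!\left(y \mid c_k,\; \Gamma_k\right),
  \qquad p(Z = k) = \pi_k .
\end{align}

% Forward regression in closed form, obtained by Gaussian conditioning
\begin{align}
  p(y \mid x) &= \sum_{k=1}^{K} w_k(x)\,
    \mathcal{N}\!\left(y \mid A_k^{*} x + b_k^{*},\; \Sigma_k^{*}\right), \\
  \Sigma_k^{*} &= \left(\Gamma_k^{-1} + A_k^{\top} \Sigma_k^{-1} A_k\right)^{-1}, \\
  A_k^{*} &= \Sigma_k^{*} A_k^{\top} \Sigma_k^{-1}, \\
  b_k^{*} &= \Sigma_k^{*} \left(\Gamma_k^{-1} c_k - A_k^{\top} \Sigma_k^{-1} b_k\right), \\
  w_k(x) &= \frac{\pi_k\, \mathcal{N}\!\left(x \mid A_k c_k + b_k,\;
      \Sigma_k + A_k \Gamma_k A_k^{\top}\right)}
    {\sum_{j=1}^{K} \pi_j\, \mathcal{N}\!\left(x \mid A_j c_j + b_j,\;
      \Sigma_j + A_j \Gamma_j A_j^{\top}\right)} .
\end{align}
```

The inverse direction has one D-by-L matrix A_k per component, with L much smaller than D, which is the tractability argument the abstract makes for exchanging the roles of input and response.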