2,493 research outputs found
An Open Source C++ Implementation of Multi-Threaded Gaussian Mixture Models, k-Means and Expectation Maximisation
Modelling of multivariate densities is a core component in many signal
processing, pattern recognition and machine learning applications. The
modelling is often done via Gaussian mixture models (GMMs), which use
computationally expensive and potentially unstable training algorithms. We
provide an overview of a fast and robust implementation of GMMs in the C++
language, employing multi-threaded versions of the Expectation Maximisation
(EM) and k-means training algorithms. Multi-threading is achieved through
reformulation of the EM and k-means algorithms into a MapReduce-like framework.
Furthermore, the implementation uses several techniques to improve numerical
stability and modelling accuracy. We demonstrate that the multi-threaded
implementation achieves a speedup of an order of magnitude on a recent 16 core
machine, and that it can achieve higher modelling accuracy than a previously
well-established publically accessible implementation. The multi-threaded
implementation is included as a user-friendly class in recent releases of the
open source Armadillo C++ linear algebra library. The library is provided under
the permissive Apache~2.0 license, allowing unencumbered use in commercial
products
Statistical Compressed Sensing of Gaussian Mixture Models
A novel framework of compressed sensing, namely statistical compressed
sensing (SCS), that aims at efficiently sampling a collection of signals that
follow a statistical distribution, and achieving accurate reconstruction on
average, is introduced. SCS based on Gaussian models is investigated in depth.
For signals that follow a single Gaussian model, with Gaussian or Bernoulli
sensing matrices of O(k) measurements, considerably smaller than the O(k
log(N/k)) required by conventional CS based on sparse models, where N is the
signal dimension, and with an optimal decoder implemented via linear filtering,
significantly faster than the pursuit decoders applied in conventional CS, the
error of SCS is shown tightly upper bounded by a constant times the best k-term
approximation error, with overwhelming probability. The failure probability is
also significantly smaller than that of conventional sparsity-oriented CS.
Stronger yet simpler results further show that for any sensing matrix, the
error of Gaussian SCS is upper bounded by a constant times the best k-term
approximation with probability one, and the bound constant can be efficiently
calculated. For Gaussian mixture models (GMMs), that assume multiple Gaussian
distributions and that each signal follows one of them with an unknown index, a
piecewise linear estimator is introduced to decode SCS. The accuracy of model
selection, at the heart of the piecewise linear decoder, is analyzed in terms
of the properties of the Gaussian distributions and the number of sensing
measurements. A maximum a posteriori expectation-maximization algorithm that
iteratively estimates the Gaussian models parameters, the signals model
selection, and decodes the signals, is presented for GMM-based SCS. In real
image sensing applications, GMM-based SCS is shown to lead to improved results
compared to conventional CS, at a considerably lower computational cost
On Quantifying Qualitative Geospatial Data: A Probabilistic Approach
Living in the era of data deluge, we have witnessed a web content explosion,
largely due to the massive availability of User-Generated Content (UGC). In
this work, we specifically consider the problem of geospatial information
extraction and representation, where one can exploit diverse sources of
information (such as image and audio data, text data, etc), going beyond
traditional volunteered geographic information. Our ambition is to include
available narrative information in an effort to better explain geospatial
relationships: with spatial reasoning being a basic form of human cognition,
narratives expressing such experiences typically contain qualitative spatial
data, i.e., spatial objects and spatial relationships.
To this end, we formulate a quantitative approach for the representation of
qualitative spatial relations extracted from UGC in the form of texts. The
proposed method quantifies such relations based on multiple text observations.
Such observations provide distance and orientation features which are utilized
by a greedy Expectation Maximization-based (EM) algorithm to infer a
probability distribution over predefined spatial relationships; the latter
represent the quantified relationships under user-defined probabilistic
assumptions. We evaluate the applicability and quality of the proposed approach
using real UGC data originating from an actual travel blog text corpus. To
verify the quality of the result, we generate grid-based maps visualizing the
spatial extent of the various relations
Speaker verification using sequence discriminant support vector machines
This paper presents a text-independent speaker verification system using support vector machines (SVMs) with score-space kernels. Score-space kernels generalize Fisher kernels and are based on underlying generative models such as Gaussian mixture models (GMMs). This approach provides direct discrimination between whole sequences, in contrast with the frame-level approaches at the heart of most current systems. The resultant SVMs have a very high dimensionality since it is related to the number of parameters in the underlying generative model. To address problems that arise in the resultant optimization we introduce a technique called spherical normalization that preconditions the Hessian matrix. We have performed speaker verification experiments using the PolyVar database. The SVM system presented here reduces the relative error rates by 34% compared to a GMM likelihood ratio system
Lip segmentation using adaptive color space training
In audio-visual speech recognition (AVSR), it is beneficial
to use lip boundary information in addition to texture-dependent
features. In this paper, we propose an automatic lip segmentation
method that can be used in AVSR systems. The algorithm
consists of the following steps: face detection, lip corners extraction,
adaptive color space training for lip and non-lip regions
using Gaussian mixture models (GMMs), and curve evolution
using level-set formulation based on region and image
gradients fields. Region-based fields are obtained using adapted
GMM likelihoods. We have tested the proposed algorithm on a
database (SU-TAV) of 100 facial images and obtained objective
performance results by comparing automatic lip segmentations
with hand-marked ground truth segmentations. Experimental
results are promising and much work has to be done to improve
the robustness of the proposed method
- ā¦