    Parallel implementation of Expectation-Maximisation algorithm for the training of Gaussian Mixture Models

    Most machine learning algorithms need to handle large data sets, which often imposes limits on processing time and memory. Expectation-Maximization (EM) is one such algorithm; it is used to train one of the most common parametric statistical models, the Gaussian Mixture Model (GMM). All steps of the algorithm are potentially parallelizable, since they iterate over the entire data set. In this study, we propose a parallel implementation of EM for training GMMs using CUDA. Experiments on a UCI dataset show a speedup of 7 over the sequential version. We also modified the code to improve global memory access and shared memory usage. We reached an achieved occupancy of up to 56.4%, regardless of the number of Gaussians considered in the set of experiments.
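
To make concrete why every EM step is a pass over the whole data set, here is a minimal sequential sketch for a one-dimensional mixture. It is not the CUDA implementation described above; the function names, the equal-weight and unit-variance initialisation, and the variance floor are assumptions made for illustration.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def em_step(data, weights, means, variances):
    """One EM iteration for a 1-D Gaussian mixture; returns updated parameters."""
    k = len(weights)
    # E-step: responsibilities, computed independently for every data point.
    resp = []
    for x in data:
        joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
        total = sum(joint)
        resp.append([p / total for p in joint])
    # M-step: each sum below is a reduction over the whole data set.
    new_w, new_m, new_v = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
        new_w.append(nj / len(data))
        new_m.append(mean)
        new_v.append(max(var, 1e-6))  # variance floor (an assumption) to avoid collapse
    return new_w, new_m, new_v

def fit_gmm(data, means, iterations=50):
    """Run EM from the given initial means with equal weights and unit variances."""
    k = len(means)
    weights, variances = [1.0 / k] * k, [1.0] * k
    for _ in range(iterations):
        weights, means, variances = em_step(data, weights, means, variances)
    return weights, means, variances
```

The per-point loop in the E-step and the sums in the M-step are the parts a parallel version would distribute across threads, the former as independent work items and the latter as reductions.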

    An Open Source C++ Implementation of Multi-Threaded Gaussian Mixture Models, k-Means and Expectation Maximisation

    Modelling of multivariate densities is a core component in many signal processing, pattern recognition and machine learning applications. The modelling is often done via Gaussian mixture models (GMMs), which use computationally expensive and potentially unstable training algorithms. We provide an overview of a fast and robust implementation of GMMs in the C++ language, employing multi-threaded versions of the Expectation Maximisation (EM) and k-means training algorithms. Multi-threading is achieved through reformulation of the EM and k-means algorithms into a MapReduce-like framework. Furthermore, the implementation uses several techniques to improve numerical stability and modelling accuracy. We demonstrate that the multi-threaded implementation achieves a speedup of an order of magnitude on a recent 16-core machine, and that it can achieve higher modelling accuracy than a previously well-established publicly accessible implementation. The multi-threaded implementation is included as a user-friendly class in recent releases of the open source Armadillo C++ linear algebra library. The library is provided under the permissive Apache 2.0 license, allowing unencumbered use in commercial products.
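
The MapReduce-like reformulation mentioned above can be sketched for k-means as follows. This is an illustrative Python sketch, not the Armadillo C++ code; the function names, the chunking scheme, and the handling of empty clusters are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def map_chunk(chunk, centroids):
    """Map: assign each point in a chunk to its nearest centroid; return partial sums and counts."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in chunk:
        dists = [sum((p - c) ** 2 for p, c in zip(point, cen)) for cen in centroids]
        j = dists.index(min(dists))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += point[d]
    return sums, counts

def reduce_stats(partials, centroids):
    """Reduce: add the per-chunk statistics and compute the new centroids."""
    dim = len(centroids[0])
    new_centroids = []
    for j, old in enumerate(centroids):
        count = sum(counts[j] for _, counts in partials)
        if count == 0:
            new_centroids.append(list(old))  # assumption: an empty cluster keeps its centroid
            continue
        total = [sum(sums[j][d] for sums, _ in partials) for d in range(dim)]
        new_centroids.append([t / count for t in total])
    return new_centroids

def kmeans(points, centroids, n_chunks=4, iterations=10):
    """Run k-means with the map phase spread over a thread pool."""
    size = max(1, (len(points) + n_chunks - 1) // n_chunks)
    chunks = [points[i:i + size] for i in range(0, len(points), size)]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        for _ in range(iterations):
            partials = list(pool.map(lambda ch: map_chunk(ch, centroids), chunks))
            centroids = reduce_stats(partials, centroids)
    return centroids
```

Each worker only touches its own chunk and returns small accumulators, so no locking is needed during the map phase. In CPython the global interpreter lock prevents a real speedup for this pure-Python arithmetic; the sketch shows the structure only.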

    Plane-extraction from depth-data using a Gaussian mixture regression model

    We propose a novel algorithm for unsupervised extraction of piecewise planar models from depth-data. Among other applications, such models are a good way of enabling autonomous agents (robots, cars, drones, etc.) to effectively perceive their surroundings and to navigate in three dimensions. We propose to do this by fitting the data with a piecewise-linear Gaussian mixture regression model whose components are skewed over planes, making them flat in appearance rather than ellipsoidal; by embedding an outlier-trimming process that is formally incorporated into the proposed expectation-maximization algorithm; and by selectively fusing contiguous, coplanar components. Part of our motivation is to achieve more accurate plane extraction by allowing each model component to make use of all available data through probabilistic clustering. The algorithm is thoroughly evaluated against a standard benchmark and is shown to rank among the best of the existing state-of-the-art methods. Comment: 11 pages, 2 figures, 1 table

    Statistical models for noise-robust speech recognition

    A standard way of improving the robustness of speech recognition systems to noise is model compensation. This replaces a speech recogniser's distributions over clean speech by ones over noise-corrupted speech. For each clean speech component, model compensation techniques usually approximate the corrupted speech distribution with a diagonal-covariance Gaussian distribution. This thesis looks into improving on this approximation in two ways: firstly, by estimating full-covariance Gaussian distributions; secondly, by approximating corrupted-speech likelihoods without any parameterised distribution. The first part of this work is about compensating for within-component feature correlations under noise. For this, the covariance matrices of the computed Gaussians should be full instead of diagonal. The estimation of off-diagonal covariance elements turns out to be sensitive to approximations. A popular approximation is the one that state-of-the-art compensation schemes, like VTS compensation, use for dynamic coefficients: the continuous-time approximation. Standard speech recognisers contain both per-time slice, static, coefficients, and dynamic coefficients, which represent signal changes over time, and are normally computed from a window of static coefficients. To remove the need for the continuous-time approximation, this thesis introduces a new technique. It first compensates a distribution over the window of statics, and then applies the same linear projection that extracts dynamic coefficients. It introduces a number of methods that address the correlation changes that occur in noise within this framework. The next problem is decoding speed with full covariances. This thesis re-analyses the previously-introduced predictive linear transformations, and shows how they can model feature correlations at low and tunable computational cost. The second part of this work removes the Gaussian assumption completely. 
It introduces a sampling method that, given speech and noise distributions and a mismatch function, in the limit calculates the corrupted speech likelihood exactly. For this, it transforms the integral in the likelihood expression, and then applies sequential importance resampling. Though it is too slow to use for recognition, it enables a more fine-grained assessment of compensation techniques, based on the KL divergence to the ideal compensation for one component. The KL divergence proves to predict the word error rate well. This technique also makes it possible to evaluate the impact of approximations that standard compensation schemes make. This work was supported by Toshiba Research Europe Ltd., Cambridge Research Laboratory.

    Single camera pose estimation using Bayesian filtering and Kinect motion priors

    Traditional approaches to upper body pose estimation using monocular vision rely on complex body models and a large variety of geometric constraints. We argue that this is not ideal and somewhat inelegant as it results in large processing burdens, and instead attempt to incorporate these constraints through priors obtained directly from training data. A prior distribution covering the probability of a human pose occurring is used to incorporate likely human poses. This distribution is obtained offline, by fitting a Gaussian mixture model to a large dataset of recorded human body poses, tracked using a Kinect sensor. We combine this prior information with a random walk transition model to obtain an upper body model, suitable for use within a recursive Bayesian filtering framework. Our model can be viewed as a mixture of discrete Ornstein-Uhlenbeck processes, in that states behave as random walks, but drift towards a set of typically observed poses. This model is combined with measurements of the human head and hand positions, using recursive Bayesian estimation to incorporate temporal information. Measurements are obtained using face detection and a simple skin colour hand detector, trained using the detected face. The suggested model is designed with analytical tractability in mind and we show that the pose tracking can be Rao-Blackwellised using the mixture Kalman filter, allowing for computational efficiency while still incorporating bio-mechanical properties of the upper body. In addition, the use of the proposed upper body model allows reliable three-dimensional pose estimates to be obtained indirectly for a number of joints that are often difficult to detect using traditional object recognition strategies. Comparisons with Kinect sensor results and the state of the art in 2D pose estimation highlight the efficacy of the proposed approach. Comment: 25 pages, Technical report, related to Burke and Lasenby, AMDO 2014 conference paper.
Code sample: https://github.com/mgb45/SignerBodyPose Video: https://www.youtube.com/watch?v=dJMTSo7-uF
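
The "random walk that drifts towards typical poses" idea in the abstract above can be illustrated with a small simulation. This is a rough sketch of one possible discrete Ornstein-Uhlenbeck mixture, not the authors' model: the update rule, the per-step sampling of an attractor, and all names and parameter values are assumptions.

```python
import random

def ou_mixture_step(state, attractors, weights, drift, noise_std, rng):
    """One transition: pick an attractor pose, drift towards it, add random-walk noise."""
    target = rng.choices(attractors, weights=weights, k=1)[0]
    return [x + drift * (m - x) + rng.gauss(0.0, noise_std)
            for x, m in zip(state, target)]

def simulate(state, attractors, weights, steps, drift=0.1, noise_std=0.05, seed=0):
    """Simulate a trajectory; returns the list of visited states, including the start."""
    rng = random.Random(seed)
    path = [list(state)]
    for _ in range(steps):
        state = ou_mixture_step(state, attractors, weights, drift, noise_std, rng)
        path.append(state)
    return path
```

With `drift=0` the process reduces to a plain random walk, and with `noise_std=0` and a single attractor it converges geometrically to that attractor, which are the two behaviours the abstract combines.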

    4-D Tomographic Inference: Application to SPECT and MR-driven PET

    Emission tomographic imaging is framed in the Bayesian and information theoretic framework. The first part of the thesis is inspired by the new possibilities offered by PET-MR systems, formulating models and algorithms for 4-D tomography and for the integration of information from multiple imaging modalities. The second part of the thesis extends the models described in the first part, focusing on the imaging hardware. Three key aspects for the design of new imaging systems are investigated: criteria and efficient algorithms for the optimisation and real-time adaptation of the parameters of the imaging hardware; learning the characteristics of the imaging hardware; exploiting the rich information provided by depth-of-interaction (DOI) and energy resolving devices. The document concludes with the description of the NiftyRec software toolkit, developed to enable 4-D multi-modal tomographic inference.

    Robust and Optimal Methods for Geometric Sensor Data Alignment

    Geometric sensor data alignment - the problem of finding the rigid transformation that correctly aligns two sets of sensor data without prior knowledge of how the data correspond - is a fundamental task in computer vision and robotics. It is inconvenient then that outliers and non-convexity are inherent to the problem and present significant challenges for alignment algorithms. Outliers are highly prevalent in sets of sensor data, particularly when the sets overlap incompletely. Despite this, many alignment objective functions are not robust to outliers, leading to erroneous alignments. In addition, alignment problems are highly non-convex, a property arising from the objective function and the transformation. While finding a local optimum may not be difficult, finding the global optimum is a hard optimisation problem. These key challenges have not been fully and jointly resolved in the existing literature, and so there is a need for robust and optimal solutions to alignment problems. Hence the objective of this thesis is to develop tractable algorithms for geometric sensor data alignment that are robust to outliers and not susceptible to spurious local optima. This thesis makes several significant contributions to the geometric alignment literature, founded on new insights into robust alignment and the geometry of transformations. Firstly, a novel discriminative sensor data representation is proposed that has better viewpoint invariance than generative models and is time and memory efficient without sacrificing model fidelity. Secondly, a novel local optimisation algorithm is developed for nD-nD geometric alignment under a robust distance measure. It manifests a wider region of convergence and a greater robustness to outliers and sampling artefacts than other local optimisation algorithms. Thirdly, the first optimal solution for 3D-3D geometric alignment with an inherently robust objective function is proposed. 
It outperforms other geometric alignment algorithms on challenging datasets due to its guaranteed optimality and outlier robustness, and has an efficient parallel implementation. Fourthly, the first optimal solution for 2D-3D geometric alignment with an inherently robust objective function is proposed. It outperforms existing approaches on challenging datasets, reliably finding the global optimum, and has an efficient parallel implementation. Finally, another optimal solution is developed for 2D-3D geometric alignment, using a robust surface alignment measure. Ultimately, robust and optimal methods, such as those in this thesis, are necessary to reliably find accurate solutions to geometric sensor data alignment problems.

    MAINT.Data: modelling and analysing interval data in R

    We present the CRAN R package MAINT.Data for the modelling and analysis of multivariate interval data, i.e., where units are described by variables whose values are intervals of ℝ, representing intrinsic variability. Parametric inference methodologies based on probabilistic models for interval variables have been developed, where each interval is represented by its midpoint and log-range, for which multivariate Normal and Skew-Normal distributions are assumed. The intrinsic nature of the interval variables leads to special structures of the variance-covariance matrix, which are represented by four different possible configurations. MAINT.Data implements the proposed methodologies in the S4 object system, introducing a specific data class for representing interval data. It includes functions and methods for modelling and analysing interval data, in particular maximum likelihood estimation, statistical tests for the different configurations, (M)ANOVA and Discriminant Analysis. For the Gaussian model, Model-based Clustering, robust estimation, outlier detection and Robust Discriminant Analysis are also available.
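
The midpoint and log-range representation described above amounts to a simple change of variables. The following Python sketch is only an illustration and is not part of the MAINT.Data R API; the function names are hypothetical and the natural logarithm is assumed.

```python
import math

def to_midpoint_logrange(lower, upper):
    """Map an interval [lower, upper] to (midpoint, log of its range)."""
    if upper <= lower:
        raise ValueError("interval must have positive width")
    return (lower + upper) / 2.0, math.log(upper - lower)

def from_midpoint_logrange(midpoint, log_range):
    """Inverse map: recover the interval bounds from (midpoint, log-range)."""
    half = math.exp(log_range) / 2.0
    return midpoint - half, midpoint + half
```

Taking the logarithm of the range makes the second coordinate unbounded, so both coordinates can take any real value, which is what allows a Normal or Skew-Normal model to be placed on them.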