Dimensionality reduction of clustered data sets
We present a novel probabilistic latent variable model to perform linear dimensionality reduction on data sets which contain clusters. We prove that the maximum likelihood solution of the model is an unsupervised generalisation of linear discriminant analysis. This provides a completely new approach to one of the most established and widely used classification algorithms. The performance of the model is then demonstrated on a number of real and artificial data sets.
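The classical, supervised counterpart that the paper's maximum-likelihood solution generalises can be illustrated directly. The sketch below uses scikit-learn's LDA on the Iris data purely as a reference point; the data set and dimensions are illustrative, and the paper's own model is unsupervised and is not reproduced here.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Classical (supervised) LDA: linear dimensionality reduction that maximises
# between-class separation. The paper proves its latent-variable model's
# maximum-likelihood solution is an unsupervised generalisation of this.
X, y = load_iris(return_X_y=True)
lda = LinearDiscriminantAnalysis(n_components=2)
Z = lda.fit_transform(X, y)  # project 4-D measurements onto 2 discriminant axes
print(Z.shape)  # (150, 2)
```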
Improving "bag-of-keypoints" image categorisation: Generative Models and PDF-Kernels
In this paper we propose two distinct enhancements to the basic "bag-of-keypoints" image categorisation scheme proposed in [4]. In this approach images are represented as a variable-sized set of local image features (keypoints). Thus, we require machine learning tools which can operate on sets of vectors. In [4] this is achieved by representing the set as a histogram over bins found by k-means. We show how this approach can be improved and generalised using Gaussian Mixture Models (GMMs). Alternatively, the set of keypoints can be represented directly as a probability density function, over which a kernel can be defined. This approach is shown to give state-of-the-art categorisation performance.
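The GMM generalisation of the k-means histogram can be sketched as a soft-assignment descriptor: instead of counting hard bin assignments, average the posterior responsibilities of each mixture component. The 2-D "keypoints" below are synthetic placeholders (real systems would use e.g. 128-D SIFT descriptors), so this is a minimal illustration of the idea, not the paper's full method.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical local descriptors pooled from training images.
train_keypoints = rng.normal(size=(500, 2))

# Fit a GMM "visual vocabulary" in place of the k-means codebook of [4].
gmm = GaussianMixture(n_components=8, random_state=0).fit(train_keypoints)

def soft_bag_of_keypoints(keypoints, gmm):
    """Soft-assignment histogram: mean posterior responsibility per component,
    replacing hard k-means bin counts with GMM responsibilities."""
    return gmm.predict_proba(keypoints).mean(axis=0)

image = rng.normal(size=(37, 2))      # one image's variable-sized keypoint set
h = soft_bag_of_keypoints(image, gmm)
print(h.shape)  # an 8-bin descriptor that sums to 1
```

A fixed-length vector like `h` can then be fed to any standard classifier, regardless of how many keypoints the image contained.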
Speaker verification using sequence discriminant support vector machines
This paper presents a text-independent speaker verification system using support vector machines (SVMs) with score-space kernels. Score-space kernels generalize Fisher kernels and are based on underlying generative models such as Gaussian mixture models (GMMs). This approach provides direct discrimination between whole sequences, in contrast with the frame-level approaches at the heart of most current systems. The resultant SVMs have a very high dimensionality, since the feature space dimension is related to the number of parameters in the underlying generative model. To address problems that arise in the resultant optimization we introduce a technique called spherical normalization that preconditions the Hessian matrix. We have performed speaker verification experiments using the PolyVar database. The SVM system presented here reduces the relative error rate by 34% compared to a GMM likelihood ratio system.
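The score-space idea can be sketched minimally: map a whole variable-length sequence of frames to one fixed-length vector by taking the gradient of the average log-likelihood of a GMM, here with respect to the means only (the full score space also includes weights and covariances). The synthetic frames, the mean-only restriction, and normalizing onto the unit sphere are simplifying assumptions for illustration, not the paper's exact construction.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)

# Background GMM fit on pooled (synthetic) 3-D acoustic frames.
gmm = GaussianMixture(n_components=4, covariance_type='diag', random_state=0)
gmm.fit(rng.normal(size=(300, 3)))

def score_vector(frames, gmm):
    """Gradient of the average log-likelihood w.r.t. the GMM means:
    a fixed-length representation of an entire variable-length sequence."""
    resp = gmm.predict_proba(frames)                       # (T, K) responsibilities
    diff = frames[:, None, :] - gmm.means_[None, :, :]     # (T, K, D)
    grad = (resp[:, :, None] * diff / gmm.covariances_[None, :, :]).mean(axis=0)
    return grad.ravel()                                    # (K*D,)

def spherically_normalize(v):
    # Projecting score vectors onto the unit sphere is a simple way to
    # condition the resulting kernel, in the spirit of the paper's technique.
    return v / np.linalg.norm(v)

seq = rng.normal(size=(50, 3))         # one utterance: 50 frames
phi = spherically_normalize(score_vector(seq, gmm))
print(phi.shape)  # (12,)
```

A linear SVM trained on vectors like `phi` then discriminates between whole sequences rather than individual frames.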
Time Series Cluster Kernel for Learning Similarities between Multivariate Time Series with Missing Data
Similarity-based approaches represent a promising direction for time series analysis. However, many such methods rely on parameter tuning, and some have shortcomings if the time series are multivariate (MTS), due to dependencies between attributes, or if the time series contain missing data. In this paper, we address these challenges within the powerful context of kernel methods by proposing the robust \emph{time series cluster kernel} (TCK). The approach leverages the missing-data handling properties of Gaussian mixture models (GMMs) augmented with informative prior distributions. An ensemble learning approach is exploited to ensure robustness to parameters by combining the clustering results of many GMMs to form the final kernel. We evaluate the TCK on synthetic and real data and compare to other state-of-the-art techniques. The experimental results demonstrate that the TCK is robust to parameter choices, provides competitive results for MTS without missing data and outstanding results for MTS with missing data.
Compositional Model based Fisher Vector Coding for Image Classification
Deriving from the gradient vector of a generative model of local features, Fisher vector coding (FVC) has been identified as an effective coding method for image classification. Most, if not all, FVC implementations employ the Gaussian mixture model (GMM) to depict the generation process of local features. However, the representative power of the GMM could be limited because it essentially assumes that local features can be characterized by a fixed number of feature prototypes, and the number of prototypes is usually small in FVC. To handle this limitation, in this paper we break the convention which assumes that a local feature is drawn from one of a few Gaussian distributions. Instead, we adopt a compositional mechanism which assumes that a local feature is drawn from a Gaussian distribution whose mean vector is composed as the linear combination of multiple key components, where the combination weight is a latent random variable. In this way, we can greatly enhance the representative power of the generative model of FVC. To implement our idea, we designed two particular generative models with such a compositional mechanism. Appearing in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).
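For context, the conventional GMM-based Fisher vector that the paper's compositional models generalise can be sketched as follows. This minimal version keeps only the gradients with respect to the means (standard FVC also uses covariance gradients); the synthetic features and dimensions are illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)

# GMM "vocabulary" of feature prototypes, fit on pooled local features.
gmm = GaussianMixture(n_components=5, covariance_type='diag', random_state=0)
gmm.fit(rng.normal(size=(400, 8)))

def fisher_vector(features, gmm):
    """Baseline GMM Fisher vector (mean gradients only): responsibility-weighted,
    variance-normalized deviations of local features from each prototype."""
    resp = gmm.predict_proba(features)                           # (N, K)
    diff = (features[:, None, :] - gmm.means_) / np.sqrt(gmm.covariances_)
    fv = (resp[:, :, None] * diff).sum(axis=0)                   # (K, D)
    fv /= features.shape[0] * np.sqrt(gmm.weights_)[:, None]
    return fv.ravel()                                            # (K*D,)

feats = rng.normal(size=(60, 8))       # local features of one image
fv = fisher_vector(feats, gmm)
print(fv.shape)  # (40,)
```

The paper's objection is visible in the structure of `gmm.means_`: each local feature is explained by one of only `K` fixed prototypes, which the compositional mechanism replaces with latent combinations of key components.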
Mixtures of Gaussian distributions under linear dimensionality reduction
High dimensional spaces pose a serious challenge to the learning process. The combination of a limited number of samples and high dimensionality places many problems under the "curse of dimensionality", which severely restricts the practical application of density estimation. Many techniques have been proposed in the past to discover embedded, locally-linear manifolds of lower dimensionality, including the mixture of Principal Component Analyzers, the mixture of Probabilistic Principal Component Analyzers and the mixture of Factor Analyzers. In this paper, we present a mixture model for reducing dimensionality based on a linear transformation which is not restricted to be orthogonal. Two methods are proposed for the learning of all the transformations and mixture parameters: the first method is based on an iterative maximum-likelihood approach and the second is based on random transformations and fixed (non-iterative) probability functions. For experimental validation, we have used the proposed model for maximum-likelihood classification of five "hard" data sets including data sets from the UCI repository and the authors' own. Moreover, we compared the classification performance of the proposed method with that of other popular classifiers including the mixture of Probabilistic Principal Component Analyzers and the Gaussian mixture model. In all cases but one, the accuracy achieved by the proposed method proved the highest, with increases with respect to the runner-up ranging from 0.2% to 5.2%.
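A rough analogue of the second (random-transformation) learning method can be sketched: project data through a random, non-orthogonal linear map, fit a class-conditional GMM in the reduced space, and classify by maximum likelihood. The data, dimensions, and single random projection are illustrative assumptions; the paper learns or samples the transformations per mixture component rather than using one shared map.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)

# Two synthetic classes in 10-D.
X0 = rng.normal(-2, 0.5, size=(200, 10))
X1 = rng.normal(2, 0.5, size=(200, 10))

A = rng.normal(size=(10, 3))           # linear map, not constrained to be orthogonal
g0 = GaussianMixture(n_components=2, random_state=0).fit(X0 @ A)
g1 = GaussianMixture(n_components=2, random_state=0).fit(X1 @ A)

# Maximum-likelihood classification in the 3-D reduced space.
X_test = np.vstack([rng.normal(-2, 0.5, size=(50, 10)),
                    rng.normal(2, 0.5, size=(50, 10))])
y_test = np.array([0] * 50 + [1] * 50)
pred = (g1.score_samples(X_test @ A) > g0.score_samples(X_test @ A)).astype(int)
acc = (pred == y_test).mean()
print(acc)
```

Dropping the orthogonality constraint on `A` is the key freedom the paper exploits relative to PCA-based mixture models.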