Search CORE

2,313 research outputs found

EM Algorithms for Weighted-Data Clustering with Application to Audio-Visual Scene Analysis

Author: Alameda-Pineda Xavier
Forbes Florence
Gebru Israel D.
Horaud Radu
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 25/01/2016
Field of study

Data clustering has received a lot of attention and numerous methods, algorithms and software packages are available. Among these techniques, parametric finite-mixture models play a central role due to their interesting mathematical properties and to the existence of maximum-likelihood estimators based on expectation-maximization (EM). In this paper we propose a new mixture model that associates a weight with each observed point. We introduce the weighted-data Gaussian mixture and we derive two EM algorithms. The first one considers a fixed weight for each observation. The second one treats each weight as a random variable following a gamma distribution. We propose a model selection method based on a minimum message length criterion, provide a weight initialization strategy, and validate the proposed algorithms by comparing them with several state of the art parametric and non-parametric clustering techniques. We also demonstrate the effectiveness and robustness of the proposed clustering technique in the presence of heterogeneous data, namely audio-visual scene analysis.Comment: 14 pages, 4 figures, 4 table

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Model-based understanding of facial expressions

Author: Sauer Patrick
Publication venue
Publication date: 01/08/2014
Field of study

Tensor Regression

Author: Liu Jiani
Liu Yipeng
Long Zhen
Zhu Ce
Publication venue
Publication date: 22/08/2023
Field of study

Regression analysis is a key area of interest in the field of data analysis and machine learning which is devoted to exploring the dependencies between variables, often using vectors. The emergence of high dimensional data in technologies such as neuroimaging, computer vision, climatology and social networks, has brought challenges to traditional data representation methods. Tensors, as high dimensional extensions of vectors, are considered as natural representations of high dimensional data. In this book, the authors provide a systematic study and analysis of tensor-based regression models and their applications in recent years. It groups and illustrates the existing tensor-based regression methods and covers the basics, core ideas, and theoretical characteristics of most tensor-based regression methods. In addition, readers can learn how to use existing tensor-based regression methods to solve specific regression tasks with multiway data, what datasets can be selected, and what software packages are available to start related work as soon as possible. Tensor Regression is the first thorough overview of the fundamentals, motivations, popular algorithms, strategies for efficient implementation, related applications, available datasets, and software resources for tensor-based regression analysis. It is essential reading for all students, researchers and practitioners of working on high dimensional data.Comment: 187 pages, 32 figures, 10 table

arXiv.org e-Print Archive

A survey on face detection in the wild: past, present and future

Author: Zafeiriou S
Zhang C
Zhang Z
Publication venue: 'Elsevier BV'
Publication date: 27/03/2015
Field of study

Spiral - Imperial College Digital Repository

Covariate dimension reduction for survival data via the Gaussian process latent variable model

Author: Cox
Curtis
Ek
Ek
Eleftheriadis
Engler
Gal
Gao
Goeman
Ibrahim
Ishwaran
Ishwaran
Ishwaran
Ishwaran
Lawrence
Lu
Martino
Park
Rasmussen
Shon
Sohn
Tibshirani
Uno
Urtasun
Vanhatalo
Witten
Publication venue
Publication date: 01/11/2015
Field of study

The analysis of high dimensional survival data is challenging, primarily due to the problem of overfitting which occurs when spurious relationships are inferred from data that subsequently fail to exist in test data. Here we propose a novel method of extracting a low dimensional representation of covariates in survival data by combining the popular Gaussian Process Latent Variable Model (GPLVM) with a Weibull Proportional Hazards Model (WPHM). The combined model offers a flexible non-linear probabilistic method of detecting and extracting any intrinsic low dimensional structure from high dimensional data. By reducing the covariate dimension we aim to diminish the risk of overfitting and increase the robustness and accuracy with which we infer relationships between covariates and survival outcomes. In addition, we can simultaneously combine information from multiple data sources by expressing multiple datasets in terms of the same low dimensional space. We present results from several simulation studies that illustrate a reduction in overfitting and an increase in predictive performance, as well as successful detection of intrinsic dimensionality. We provide evidence that it is advantageous to combine dimensionality reduction with survival outcomes rather than performing unsupervised dimensionality reduction on its own. Finally, we use our model to analyse experimental gene expression data and detect and extract a low dimensional representation that allows us to distinguish high and low risk groups with superior accuracy compared to doing regression on the original high dimensional data

arXiv.org e-Print Archive

King's Research Portal