Clustering in high-dimensional spaces is a recurrent problem in many
scientific domains, but it remains a difficult task in terms of both
clustering accuracy and interpretability of the results. This paper presents a
discriminative latent mixture (DLM) model which fits the data in a latent
orthonormal discriminative subspace with an intrinsic dimension lower than the
dimension of the original space. By constraining model parameters within and
between groups, a family of 12 parsimonious DLM models is obtained, which
can accommodate various situations. An estimation algorithm, called the
Fisher-EM algorithm, is also proposed for estimating both the mixture
parameters and the discriminative subspace. Experiments on simulated and real
datasets show that the proposed approach performs better than existing
clustering methods while providing a useful representation of the clustered
data. The method is also applied to the clustering of mass spectrometry
data.

English

arXiv

Bouveyron, Charles

Brunet, Camille

HAL Evry

Simultaneous model-based clustering and visualization in the Fisher discriminative subspace

Crossref

HAL-Paris1

https://hal-paris1.archives-ouvertes.fr/hal-00492406
