
    Model-based clustering via linear cluster-weighted models

    A novel family of twelve mixture models with random covariates, nested in the linear t cluster-weighted model (CWM), is introduced for model-based clustering. The linear t CWM was recently presented as a robust alternative to the better-known linear Gaussian CWM. The proposed family of models provides a unified framework that also includes the linear Gaussian CWM as a special case. Maximum likelihood parameter estimation is carried out within the EM framework, and both the BIC and the ICL are used for model selection. A simple and effective hierarchical random initialization is also proposed for the EM algorithm. The novel model-based clustering technique is illustrated in applications to real data. Finally, a simulation study evaluating the performance of the BIC and the ICL is presented.
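    As a concrete illustration of the estimate-then-select loop described in this abstract, the sketch below fits the linear Gaussian special case over a grid of component counts and keeps the model with the lowest BIC. The t CWM itself is not available in scikit-learn, and the toy data, component range, and multi-start scheme are assumptions made for the example, with random restarts standing in for the paper's hierarchical random initialization.

```python
# Minimal sketch: EM-fitted mixtures with BIC-based model selection,
# using the linear Gaussian special case mentioned in the abstract.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy data: two clusters, each with its own linear regression y = a*x + b.
x = np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(2, 0.5, 200)])
y = np.concatenate([1.5 * x[:200] + rng.normal(0, 0.3, 200),
                    -1.0 * x[200:] + 3 + rng.normal(0, 0.3, 200)])
X = np.column_stack([x, y])  # CWMs model the joint distribution of (X, y)

best_bic, best_model = np.inf, None
for k in range(1, 6):              # candidate numbers of clusters
    for seed in range(5):          # crude multi-start in place of the
        gm = GaussianMixture(      # paper's hierarchical initialization
            n_components=k, covariance_type="full", random_state=seed
        ).fit(X)
        if gm.bic(X) < best_bic:
            best_bic, best_model = gm.bic(X), gm

print(f"selected k = {best_model.n_components}, BIC = {best_bic:.1f}")
```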

    Socially Constrained Structural Learning for Groups Detection in Crowd

    Modern crowd theories agree that collective behavior is the result of the underlying interactions among small groups of individuals. In this work, we propose a novel algorithm for detecting social groups in crowds by means of a Correlation Clustering procedure on people's trajectories. The affinity between crowd members is learned through an online formulation of the Structural SVM framework and a set of specifically designed features characterizing both their physical and social identity, inspired by Proxemic theory, Granger causality, DTW, and heat-maps. To adhere to sociological observations, we introduce a loss function (G-MITRE) able to deal with the complexity of evaluating group detection performance. We show that our algorithm achieves state-of-the-art results when relying on both ground-truth trajectories and tracklets previously extracted by available detector/tracker systems.
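    Of the pairwise affinity cues named in this abstract, DTW is the most self-contained, so the sketch below computes a plain dynamic-time-warping distance between two 2-D trajectories. The learned Structural SVM weighting and the G-MITRE loss are not reproduced here; the trajectory format (arrays of (x, y) points) and the toy paths are assumptions for illustration.

```python
# Minimal sketch: DTW distance as a pairwise trajectory-affinity feature.
import numpy as np

def dtw_distance(traj_a: np.ndarray, traj_b: np.ndarray) -> float:
    """DTW distance between two trajectories of shape (T, 2)."""
    n, m = len(traj_a), len(traj_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(traj_a[i - 1] - traj_b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

# Two people walking together: similar paths give a small DTW distance.
t = np.linspace(0, 1, 50)
p1 = np.column_stack([t, np.sin(2 * np.pi * t)])
p2 = np.column_stack([t, np.sin(2 * np.pi * t) + 0.1])
print(f"DTW(p1, p2) = {dtw_distance(p1, p2):.3f}")
```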

    Inference and Evaluation of the Multinomial Mixture Model for Text Clustering

    In this article, we investigate the use of a probabilistic model for unsupervised clustering in text collections. Unsupervised clustering has become a basic module for many intelligent text processing applications, such as information retrieval, text classification, or information extraction. The model considered in this contribution consists of a mixture of multinomial distributions over the word counts, each component corresponding to a different theme. We present and contrast various estimation procedures, which apply in both supervised and unsupervised contexts. In supervised learning, this work suggests a criterion for evaluating the posterior odds of new documents that is more statistically sound than the "naive Bayes" approach. In an unsupervised context, we propose measures to set up a systematic evaluation framework and start by examining the Expectation-Maximization (EM) algorithm as the basic tool for inference. We discuss the importance of initialization and the influence of other features, such as the smoothing strategy or the size of the vocabulary, thereby illustrating the difficulties incurred by the high dimensionality of the parameter space. We also propose a heuristic algorithm based on iterative EM with vocabulary reduction to solve this problem. Using the fact that the latent variables can be analytically integrated out, we finally show that the Gibbs sampling algorithm is tractable and compares favorably to the basic expectation-maximization approach.
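    The EM procedure this abstract builds on is compact enough to show directly: the sketch below runs EM for a mixture of multinomials over a toy word-count matrix, with responsibilities computed in log space for numerical stability and a smoothing constant in the M-step. The smoothing value, vocabulary size, and corpus are assumptions for the example; the paper's vocabulary-reduction heuristic and collapsed Gibbs sampler are not shown.

```python
# Minimal sketch: EM for a mixture of multinomials over word counts.
import numpy as np

def em_multinomial_mixture(X, k, n_iter=100, alpha=1e-2, seed=0):
    """X: (docs, vocab) word-count matrix; k: number of themes."""
    rng = np.random.default_rng(seed)
    n, v = X.shape
    pi = np.full(k, 1.0 / k)                     # mixing proportions
    theta = rng.dirichlet(np.ones(v), size=k)    # per-theme word probabilities
    for _ in range(n_iter):
        # E-step: posterior responsibilities, computed in log space.
        log_r = np.log(pi) + X @ np.log(theta).T    # shape (n, k)
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate proportions and smoothed word distributions.
        pi = r.mean(axis=0)
        counts = r.T @ X + alpha                 # Laplace-style smoothing
        theta = counts / counts.sum(axis=1, keepdims=True)
    return pi, theta, r

# Toy corpus: 6 documents over a 4-word vocabulary with two obvious themes.
X = np.array([[5, 4, 0, 0], [6, 3, 1, 0], [4, 5, 0, 1],
              [0, 1, 5, 6], [1, 0, 4, 5], [0, 0, 6, 4]])
pi, theta, r = em_multinomial_mixture(X, k=2)
print("cluster assignments:", r.argmax(axis=1))
```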