8,411 research outputs found

Model-based Clustering with Copulas

Author: Ernst Dominik
Publication date: 01/01/2010

Effect fusion using model-based clustering

Author: Malsiner-Walli Gertraud
Pauger Daniela
Wagner Helga
Publication date: 22/03/2017

In social and economic studies, many of the collected variables are measured on a nominal scale, often with a large number of categories. The definition of categories is usually ambiguous, and different classification schemes using either a finer or a coarser grid are possible. Categorisation has an impact when such a variable is included as a covariate in a regression model: too fine a grid will result in imprecise estimates of the corresponding effects, whereas with too coarse a grid important effects will be missed, resulting in biased effect estimates and poor predictive performance. To achieve automatic grouping of levels with essentially the same effect, we adopt a Bayesian approach and specify the prior on the level effects as a location mixture of spiky normal components. Fusion of level effects is induced by a prior on the mixture weights which encourages empty components. Model-based clustering of the effects during MCMC sampling makes it possible to simultaneously detect categories which have essentially the same effect size and identify variables with no effect at all. The properties of this approach are investigated in simulation studies. Finally, the method is applied to analyse effects of high-dimensional categorical predictors on income in Austria.

arXiv.org e-Print Archive

Elektronische Publikationen der Wirtschaftsuniversität Wien

Model-based clustering for populations of networks

Author: Signorelli Mirko
Wit Ernst
Publication venue: 'SAGE Publications'
Publication date: 01/01/2020

Until recently, data on populations of networks were rarely available. However, with the advancement of automatic monitoring devices and the growing social and scientific interest in networks, such data have become more widely available. From sociological experiments involving cognitive social structures to fMRI scans revealing large-scale brain networks of groups of patients, there is a growing awareness that we urgently need tools to analyse populations of networks and particularly to model the variation between networks due to covariates. We propose a model-based clustering method based on mixtures of generalized linear (mixed) models that can be employed to describe the joint distribution of a population of networks in a parsimonious manner and to identify subpopulations of networks that share certain topological properties of interest (degree distribution, community structure, effect of covariates on the presence of an edge, etc.). Maximum likelihood estimation for the proposed model can be efficiently carried out with an implementation of the EM algorithm. We assess the performance of this method on simulated data and conclude with an example application on advice networks in a small business.

Comment: The final (published) version of the article can be downloaded for free (Open Access) from the editor's website (click on the DOI link below).

arXiv.org e-Print Archive

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Leiden University Scholarly Publications

Dissertations of the University of Groningen
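
The abstract above mentions maximum likelihood estimation of a mixture model via the EM algorithm. As a rough illustration only, the following minimal sketch runs EM on a much simpler stand-in: a mixture of binomials in which each network is summarised by its edge count and each cluster has a single edge probability (roughly an intercept-only model without covariates or random effects). The function name `em_binomial_mixture`, the starting values and the toy data are assumptions made for this example and are not taken from the article.

```python
import math


def em_binomial_mixture(edge_counts, n_dyads, init_probs, n_iter=50):
    """EM for a mixture of binomials over per-network edge counts.

    Hypothetical simplification: every network has n_dyads possible edges,
    and cluster g is described by one edge probability probs[g].
    """
    eps = 1e-12  # keeps logarithms and divisions finite
    k = len(init_probs)
    probs = [min(max(p, eps), 1.0 - eps) for p in init_probs]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior cluster probabilities, computed on the log scale.
        # The binomial coefficient is identical across clusters and cancels.
        resp = []
        for m in edge_counts:
            logs = [
                math.log(weights[g])
                + m * math.log(probs[g])
                + (n_dyads - m) * math.log(1.0 - probs[g])
                for g in range(k)
            ]
            top = max(logs)
            unnorm = [math.exp(v - top) for v in logs]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: update mixing proportions and cluster edge densities.
        for g in range(k):
            mass = max(sum(r[g] for r in resp), eps)
            weights[g] = max(mass / len(edge_counts), eps)
            density = sum(r[g] * m for r, m in zip(resp, edge_counts)) / (mass * n_dyads)
            probs[g] = min(max(density, eps), 1.0 - eps)
    # Assign each network to its most probable cluster.
    labels = [max(range(k), key=lambda g: r[g]) for r in resp]
    return weights, probs, labels


# Toy data: six networks on 10 nodes (45 undirected dyads), three sparse, three dense.
counts = [4, 5, 6, 40, 42, 38]
weights, probs, labels = em_binomial_mixture(counts, n_dyads=45, init_probs=[0.2, 0.8])
```

Working on the log scale in the E-step avoids underflow when the number of dyads is large; a model with covariates would replace the closed-form M-step with a weighted GLM fit.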

Model Based Clustering for Mixed Data: clustMD

Author: Gormley Isobel Claire
McParland Damien
Publication date: 05/11/2015

A model-based clustering procedure for data of mixed type, clustMD, is developed using a latent variable model. It is proposed that a latent variable, following a mixture of Gaussian distributions, generates the observed data of mixed type. The observed data may be any combination of continuous, binary, ordinal or nominal variables. clustMD employs a parsimonious covariance structure for the latent variables, leading to a suite of six clustering models that vary in complexity and provide an elegant and unified approach to clustering mixed data. An expectation maximisation (EM) algorithm is used to estimate clustMD; in the presence of nominal data a Monte Carlo EM algorithm is required. The clustMD model is illustrated by clustering simulated mixed-type data and prostate cancer patients, on whom mixed data have been recorded.

arXiv.org e-Print Archive

Crossref

Research Repository UCD

Irish Universities
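
The clustMD abstract describes observed mixed-type data as generated by a latent Gaussian mixture. One common way to obtain an ordinal variable from a latent continuous one is to cut it at fixed thresholds; the sketch below assumes that mechanism, which the abstract itself does not spell out. The names `to_ordinal` and `sample_mixed_row`, the thresholds, and the choice of independent latent dimensions with a shared standard deviation are all assumptions for illustration, not the clustMD implementation.

```python
import bisect
import random


def to_ordinal(latent_value, thresholds):
    """Map a latent continuous value to an ordinal level 1..len(thresholds)+1."""
    return bisect.bisect_right(thresholds, latent_value) + 1


def sample_mixed_row(weights, means, sd, thresholds, rng):
    """Draw one (cluster, continuous, ordinal) observation from a latent Gaussian mixture.

    Assumption: the two latent dimensions are independent within a cluster
    and share the standard deviation sd.
    """
    g = rng.choices(range(len(weights)), weights=weights)[0]
    continuous = rng.gauss(means[g][0], sd)  # observed directly
    latent = rng.gauss(means[g][1], sd)      # observed only through its level
    return g, continuous, to_ordinal(latent, thresholds)


rng = random.Random(0)
rows = [
    sample_mixed_row([0.5, 0.5], [(-2.0, -1.0), (2.0, 1.5)], 1.0, [0.0, 1.0], rng)
    for _ in range(5)
]
```

In such a setup only the ordinal level is observed, so estimation has to integrate over the latent value, which is where an EM-type algorithm comes in.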

Model-based clustering via linear cluster-weighted models

Author: Salvatore Ingrassia
Simona C. Minotti
Antonio Punzo
Publication venue: 'Elsevier BV'
Publication date: 09/03/2015

A novel family of twelve mixture models with random covariates, nested in the linear t cluster-weighted model (CWM), is introduced for model-based clustering. The linear t CWM was recently presented as a robust alternative to the better known linear Gaussian CWM. The proposed family of models provides a unified framework that also includes the linear Gaussian CWM as a special case. Maximum likelihood parameter estimation is carried out within the EM framework, and both the BIC and the ICL are used for model selection. A simple and effective hierarchical random initialization is also proposed for the EM algorithm. The novel model-based clustering technique is illustrated in some applications to real data. Finally, a simulation study for evaluating the performance of the BIC and the ICL is presented.

arXiv.org e-Print Archive

Crossref
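
The cluster-weighted model abstract names the BIC and the ICL as model selection criteria. The sketch below shows how the two might be computed from a fitted mixture, assuming the "larger is better" BIC convention and the entropy-penalised approximation of the ICL. Sign and scaling conventions differ between papers, so this is not necessarily the exact variant used in the article; the function names and arguments are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """BIC in the 'larger is better' form: 2*logL - n_params*log(n_obs)."""
    return 2.0 * log_likelihood - n_params * math.log(n_obs)


def icl(log_likelihood, n_params, n_obs, posteriors):
    """ICL approximated as BIC minus twice the entropy of the posterior memberships.

    posteriors is a list of rows, one per observation, each holding the
    posterior probabilities of cluster membership.
    """
    entropy = -sum(z * math.log(z) for row in posteriors for z in row if z > 0.0)
    return bic(log_likelihood, n_params, n_obs) - 2.0 * entropy
```

When every observation is assigned to a cluster with posterior probability one, the entropy term vanishes and the two criteria coincide; the more the clusters overlap, the more the ICL penalises the model relative to the BIC.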