Local Statistical Modeling via Cluster-Weighted Approach with Elliptical Distributions
Cluster-weighted modeling (CWM) is a mixture approach to modeling the joint probability of data coming from a heterogeneous population. Under Gaussian assumptions, we investigate statistical properties of CWM from both the theoretical and the numerical point of view; in particular, we show that CWM includes mixtures of distributions and mixtures of regressions as special cases. Further, we introduce CWM based on Student-t distributions, which provides more robust fitting for groups of observations with longer-than-normal tails or atypical observations. The theoretical results are illustrated through empirical studies on both real and simulated data.
Keywords: Cluster-Weighted Modeling, Mixture Models, Model-Based Clustering
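The joint-probability decomposition behind CWM can be made concrete with a small sketch. Assuming a linear Gaussian CWM with univariate covariate and response (parameter names here are illustrative, not taken from the paper), the joint density is a weighted sum of component products, p(x, y) = Σ_g π_g N(y | β0_g + β1_g x, σ²_g) N(x | μ_g, τ²_g):

```python
import numpy as np

def gauss_pdf(z, mean, var):
    """Univariate normal density."""
    return np.exp(-0.5 * (z - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

def cwm_joint_density(x, y, weights, beta0, beta1, sigma2, mu, tau2):
    """Joint density of a linear Gaussian CWM with G components:
    p(x, y) = sum_g pi_g * N(y | beta0_g + beta1_g * x, sigma2_g) * N(x | mu_g, tau2_g).
    All parameter arrays are length-G vectors (illustrative names)."""
    dens = 0.0
    for g in range(len(weights)):
        cond_y = gauss_pdf(y, beta0[g] + beta1[g] * x, sigma2[g])  # response given covariate
        marg_x = gauss_pdf(x, mu[g], tau2[g])                      # covariate marginal
        dens += weights[g] * cond_y * marg_x
    return dens

# With a single component, the joint density reduces to a plain product of normals
dens = cwm_joint_density(0.0, 0.0, [1.0], [0.0], [0.0], [1.0], [0.0], [1.0])
```

Dropping the covariate factor N(x | μ_g, τ²_g) leaves a mixture of regressions, and dropping the conditional factor leaves a mixture of distributions, which is how the special cases mentioned in the abstract arise.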
flexCWM: A Flexible Framework for Cluster-Weighted Models
Cluster-weighted models (CWMs) are mixtures of regression models with random covariates. However, despite having recently become rather popular in statistics and data mining, CWMs still lack support within the most popular statistical suites. In this paper, we introduce flexCWM, an R package specifically conceived for fitting CWMs. The package supports modeling the conditional response variable by means of the most common distributions of the exponential family and by the t distribution. Covariates are allowed to be of mixed type, and parsimonious modeling of multivariate normal covariates, based on the eigenvalue decomposition of the component covariance matrices, is supported. Furthermore, either the response or the covariate distribution can be omitted, yielding mixtures of distributions and mixtures of regression models with fixed covariates, respectively. The expectation-maximization (EM) algorithm is used to obtain maximum-likelihood estimates of the parameters, and likelihood-based information criteria are adopted to select the number of groups and/or a parsimonious model. For the component regression coefficients, standard errors and significance tests are also provided. Parallel computation can be used on multicore PCs and computer clusters when several models have to be fitted. To exemplify the use of the package, applications to artificial and real datasets, included in the package, are presented.
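The model-selection step the abstract mentions, choosing the number of groups via a likelihood-based information criterion, can be sketched in a few lines. flexCWM itself is an R package; the Python snippet below only illustrates the BIC comparison with hypothetical log-likelihood values and parameter counts:

```python
import numpy as np

def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion in the lower-is-better convention:
    BIC = -2 * loglik + n_params * log(n_obs)."""
    return -2.0 * loglik + n_params * np.log(n_obs)

# Hypothetical candidates: (number of groups G, maximized log-likelihood, parameter count)
candidates = [(1, -520.3, 5), (2, -480.1, 11), (3, -478.9, 17)]
n = 200  # hypothetical sample size
scores = {G: bic(ll, m, n) for G, ll, m in candidates}
best_G = min(scores, key=scores.get)
```

The penalty term m·log(n) grows with the parameter count, so the three-group model's marginal gain in log-likelihood is not enough to justify its six extra parameters here.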
The joint role of trimming and constraints in robust estimation for mixtures of Gaussian factor analyzers.
Mixtures of Gaussian factors are powerful tools for modeling an unobserved heterogeneous population, offering, at the same time, dimension reduction and model-based clustering. The high prevalence of spurious solutions and the disturbing effects of outlying observations in maximum-likelihood estimation may cause biased or misleading inferences. Restrictions on the component covariances are considered in order to avoid spurious solutions, and trimming is also adopted to provide robustness against violations of the normality assumptions on the underlying latent factors. A detailed AECM algorithm for this new approach is presented. Simulation results and an application to the AIS dataset show the aim and effectiveness of the proposed methodology.
Funding: Ministerio de Economía y Competitividad and FEDER (grant MTM2014-56235-C2-1-P), Consejería de Educación de la Junta de Castilla y León (grant VA212U13), grant FAR 2015 from the University of Milano-Bicocca, and grant FIR 2014 from the University of Catania.
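The trimming idea used for robustness can be sketched simply: at each iteration, a fixed proportion α of the observations with the smallest current mixture density is set aside before the parameters are updated, so outliers cannot distort the estimates. The function below is an illustrative sketch of that single step, not the authors' AECM implementation:

```python
import numpy as np

def trim_observations(log_density, alpha):
    """Impartial-trimming step (illustrative sketch): flag the proportion
    `alpha` of observations with the smallest current mixture log-density,
    so they are excluded from the next parameter-update step.
    Returns a boolean mask that is True for the retained observations."""
    n = len(log_density)
    n_trim = int(np.floor(alpha * n))              # number of observations to discard
    keep = np.ones(n, dtype=bool)
    if n_trim > 0:
        lowest = np.argsort(log_density)[:n_trim]  # indices of least plausible points
        keep[lowest] = False
    return keep

# Example: with alpha = 0.25 and 8 points, the 2 least plausible ones are trimmed
mask = trim_observations(np.array([-1.0, -9.0, -1.2, -0.8, -7.5, -1.1, -0.9, -1.3]), 0.25)
```

Combined with the eigenvalue restrictions on the component covariances, this keeps both outliers and degenerate fits from driving the likelihood maximization.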
Model-based clustering via linear cluster-weighted models
A novel family of twelve mixture models with random covariates, nested in the linear t cluster-weighted model (CWM), is introduced for model-based clustering. The linear t CWM was recently presented as a robust alternative to the better-known linear Gaussian CWM. The proposed family of models provides a unified framework that also includes the linear Gaussian CWM as a special case. Maximum-likelihood parameter estimation is carried out within the EM framework, and both the BIC and the ICL are used for model selection. A simple and effective hierarchical random initialization for the EM algorithm is also proposed. The novel model-based clustering technique is illustrated in some applications to real data. Finally, a simulation study evaluating the performance of the BIC and the ICL is presented.
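The EM machinery referred to above can be illustrated on the simplest member of this model class, a mixture of linear regressions with fixed covariates. The sketch below alternates weighted least-squares M-steps with responsibility E-steps; it is a minimal didactic version, not the authors' implementation, and omits the random-covariate part and the hierarchical initialization:

```python
import numpy as np

def em_mixture_of_regressions(x, y, n_components=2, n_iter=50, seed=0):
    """Minimal EM sketch for a mixture of simple linear regressions
    (one of the special cases nested in the CWM family). Illustrative only."""
    rng = np.random.default_rng(seed)
    n = len(x)
    resp = rng.dirichlet(np.ones(n_components), size=n)  # random responsibilities as init
    X = np.column_stack([np.ones(n), x])                 # design matrix with intercept
    for _ in range(n_iter):
        pis, betas, sig2 = [], [], []
        for g in range(n_components):
            # M-step: weighted least squares per component (small ridge for stability)
            w = resp[:, g]
            A = X.T @ (w[:, None] * X) + 1e-8 * np.eye(2)
            beta = np.linalg.solve(A, X.T @ (w * y))
            res = y - X @ beta
            s2 = max((w * res ** 2).sum() / max(w.sum(), 1e-12), 1e-8)
            pis.append(w.mean()); betas.append(beta); sig2.append(s2)
        # E-step: responsibilities proportional to component densities
        dens = np.array([
            pis[g] * np.exp(-0.5 * (y - X @ betas[g]) ** 2 / sig2[g])
            / np.sqrt(2 * np.pi * sig2[g])
            for g in range(n_components)
        ]).T
        dens = np.clip(dens, 1e-300, None)  # avoid all-zero rows from underflow
        resp = dens / dens.sum(axis=1, keepdims=True)
    return pis, betas, sig2, resp

# Tiny synthetic run: points drawn around two different lines
xs = np.concatenate([np.linspace(0, 1, 20)] * 2)
ys = np.concatenate([2 * np.linspace(0, 1, 20), -2 * np.linspace(0, 1, 20) + 1])
pis, betas, sig2, resp = em_mixture_of_regressions(xs, ys)
```

A full CWM would additionally model the distribution of x within each component, and the t-based members of the family would replace the normal densities with Student-t ones.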
Eigenvalues and constraints in mixture modeling: geometric and computational issues
This paper presents a review of the use of eigenvalue restrictions for constrained parameter estimation in mixtures of elliptical distributions under the likelihood approach. These restrictions serve a twofold purpose: to avoid convergence to degenerate solutions and to reduce the onset of uninteresting (spurious) maximizers related to complex likelihood surfaces. The paper shows how the constraints may play a key role in the theory of Euclidean data clustering. The aim here is to provide a reasoned review of the constraints and their applications, following the contributions of many authors and spanning the literature of the last thirty years.
Funding: Spanish Ministerio de Economía y Competitividad (grant MTM2017-86061-C2-1-P); Junta de Castilla y León - Fondo Europeo de Desarrollo Regional (grants VA005P17 and VA002G18).
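A common form of the restrictions reviewed here bounds the ratio between the largest and smallest eigenvalues of the component covariance matrices by a constant c, which rules out the degenerate (near-singular) solutions mentioned above. The sketch below enforces the bound by a simple truncation of the eigenvalues; the truncation level chosen here is illustrative, since actual constrained estimators pick it to maximize the likelihood:

```python
import numpy as np

def constrain_eigenvalue_ratio(cov_matrices, c):
    """Illustrative eigenvalue-ratio constraint: truncate the eigenvalues of
    all component covariance matrices so that (global max) / (global min) <= c.
    Here the floor is simply max / c; real estimators optimize the threshold."""
    decomps = [np.linalg.eigh(S) for S in cov_matrices]
    top = max(vals.max() for vals, _ in decomps)   # largest eigenvalue overall
    floor = top / c                                # smallest allowed eigenvalue
    constrained = []
    for vals, vecs in decomps:
        t = np.clip(vals, floor, top)              # truncated spectrum
        constrained.append(vecs @ np.diag(t) @ vecs.T)
    return constrained

# Example: ratio 100/1 across components exceeds c = 10, so small eigenvalues are lifted
out = constrain_eigenvalue_ratio([np.diag([100.0, 1.0]), np.diag([4.0, 2.0])], c=10.0)
```

After truncation no component covariance can shrink toward singularity, so the likelihood stays bounded and degenerate maximizers disappear.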
Geometrical Aspects of Discrimination by Multilayer Perceptrons
We investigate some geometrical aspects of discriminant functions of the kind f_p(x) = Σ_{k=1}^{p} c_k τ(a_k' x), for suitable constants a_k, c_k, where τ is a sigmoidal transformation. This function is realized by a multilayer perceptron with one hidden layer. These results are applied in the analysis of the discriminating power of f_p. In particular, we prove that the class of finite populations Ω₁ and Ω₂ that can be distinguished by f_p is monotonically increasing in p, and we give a minimal sufficient p leading to a complete separation between Ω₁ and Ω₂.
Keywords: Discrimination, multilayer perceptron, sigmoidal function.
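The discriminant function above is just a one-hidden-layer perceptron with p hidden units and a linear output. The sketch below evaluates it using the logistic sigmoid as one common choice of τ (the abstract allows any sigmoidal transformation); the weight values are hypothetical:

```python
import numpy as np

def f_p(x, A, c):
    """One-hidden-layer perceptron discriminant f_p(x) = sum_k c_k * tau(a_k' x),
    with tau the logistic sigmoid. A holds one hidden-unit weight vector a_k
    per row; c holds the output coefficients c_k."""
    tau = lambda z: 1.0 / (1.0 + np.exp(-z))  # logistic sigmoid
    return c @ tau(A @ x)

# Discriminate by the sign of f_p: points with f_p(x) > 0 go to one population
A = np.array([[1.0, -1.0], [0.5, 2.0]])  # hypothetical hidden-layer weights (p = 2)
c = np.array([2.0, -1.0])                # hypothetical output coefficients
score = f_p(np.array([1.0, 0.0]), A, c)
```

Increasing p adds terms to the sum, which is why the class of population pairs Ω₁, Ω₂ separable by f_p can only grow with p.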