Search CORE

1,277 research outputs found

Regularization in regression: comparing Bayesian and frequentist methods in a poorly informative situation

Author: Anbari Mohammed El
Celeux Gilles
Marin Jean-Michel
Robert Christian P.
Publication date: 15/11/2011

Using a collection of simulated and real benchmarks, we compare Bayesian and frequentist regularization approaches under a low informative constraint when the number of variables is almost equal to the number of observations. This comparison includes new global noninformative approaches for Bayesian variable selection built on Zellner's g-priors that are similar to Liang et al. (2008). The interest of those calibration-free proposals is discussed. The numerical experiments we present highlight the appeal of Bayesian regularization methods when compared with non-Bayesian alternatives. They dominate frequentist methods in the sense that they provide smaller prediction errors while selecting the most relevant variables in a parsimonious way.

arXiv.org e-Print Archive

Base de publications de l'université Paris-Dauphine

INRIA a CCSD electronic archive server

Regularized Gaussian Discriminant Analysis Through Eigenvalue Decomposition

Author: Gilles Celeux
Halima Bensmail
Publication venue: 'JSTOR'
Publication date: 01/01/2006

Crossref

Enhancing the selection of a model-based clustering with external qualitative variables

Author: Amorim Maria José
Baudry Jean-Patrick
Cardoso Margarida
Celeux Gilles
Ferreira Ana Sousa
Publication date: 31/10/2012

In cluster analysis, it can be useful to interpret the partition built from the data in the light of external categorical variables that were not directly involved in clustering the data. An approach is proposed in the model-based clustering context to select a model and a number of clusters which both fit the data well and take advantage of the potential illustrative ability of the external variables. This approach makes use of the integrated joint likelihood of the data and the partitions at hand, namely the model-based partition and the partitions associated with the external variables. It is noteworthy that each mixture model is fitted by the maximum likelihood methodology to the data, excluding the external variables, which are used only to select a relevant mixture model. Numerical experiments illustrate the promising behaviour of the derived criterion.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Hal-Diderot

Exact and Monte Carlo calculations of integrated likelihoods for the latent class model

Author: Aitkin
Biernacki
C. Biernacki
Celeux
Dempster
Fraley
Frühwirth-Schnatter
G. Celeux
G. Govaert
Goodman
McLachlan
Nadif
Rand
Robert
Schwarz
Spiegelhalter
Stephens
Publication venue: 'Elsevier BV'

Crossref

Mixtures of Regression Models for Time-Course Gene Expression Data: Evaluation of Initialization and Random Effects

Author: Bar-Joseph
Bettina Grün
Biernacki
Celeux
Cho
Dempster
Diebolt
Fraley
Friedrich Leisch
Grün
Handl
Hubert
Karatzoglou
Leisch
Luan
Ma
Ng
R Development Core Team
Ramoni
Scharl
Thalamuthu
Theresa Scharl
Wehrens
Publication date: 01/01/2009

Finite mixture models are routinely applied to time-course microarray data. Due to the complexity and size of this type of data, the choice of good starting values plays an important role. So far, initialization strategies have only been investigated for data from a mixture of multivariate normal distributions. In this work, several initialization procedures are evaluated for mixtures of regression models with and without random effects in an extensive simulation study on different artificial datasets. Finally, these procedures are also applied to a real dataset from E. coli.

Crossref

Open Access LMU

Research Online

Some discussions on the Read Paper "Beyond subjective and objective in statistics" by A. Gelman and C. Hennig

Author: Celeux Gilles
Jewson Jack
Josse Julie
Marin Jean-Michel
Robert Christian
Robert Christian P.
Publication date: 01/01/2017

This note is a collection of several discussions of the paper "Beyond subjective and objective in statistics", read by A. Gelman and C. Hennig to the Royal Statistical Society on April 12, 2017, and to appear in the Journal of the Royal Statistical Society, Series A

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

HAL-Polytechnique

Identifiability of a Switching Markov State-Space Model

Author: Celeux Gilles
Kalawoun Jana
Pamphile Patrick
Publication venue: HAL CCSD
Publication date: 08/09/2015

While switching Markov state-space models arise in many applied fields such as signal processing and bioinformatics, it is often difficult to establish their identifiability, which is essential for parameter estimation. This paper discusses the simple case in which the unknown continuous state and the observations are scalars. We demonstrate that if prior information relating the observations to the unknown continuous state at a time t0 is available, and if the Markov chain is irreducible and aperiodic, the set of model parameters is "globally structurally identifiable". In addition, we show that under these constraints, the model parameters can be efficiently estimated by an EM algorithm.

INRIA a CCSD electronic archive server

HAL-CEA

Latent class analysis was accurate but sensitive in data simulations

Author: Celeux
Collins
Dziak
Green
Michael J. Green
Muthén
Nylund
Schwarz
Skardhamar
Tein
Twisk
Vermunt
Publication venue: 'Elsevier BV'
Publication date: 01/10/2014

Objectives: Latent class methods are increasingly being used in analysis of developmental trajectories. A recent simulation study by Twisk and Hoekstra (2012) suggested caution in use of these methods because they failed to accurately identify developmental patterns that had been artificially imposed on a real data set. This article tests whether existing developmental patterns within the data set used might have obscured the imposed patterns. Study Design and Setting: Data were simulated to match the latent class pattern in the previous article, but with varying levels of randomly generated variance, rather than variance carried over from a real data set. Latent class analysis (LCA) was then used to see if the latent class structure could be accurately identified. Results: LCA performed very well at identifying the simulated latent class structure, even when the level of variance was similar to that reported in the previous study, although misclassification began to be more problematic with considerably higher levels of variance. Conclusion: The failure of LCA to replicate the imposed patterns in the previous study may have been because it was sensitive enough to detect residual patterns of population heterogeneity within the altered data. LCA performs well at classifying developmental trajectories.

Elsevier - Publisher Connector

Crossref

PubMed Central

Enlighten