
    Robust improper maximum likelihood: tuning, computation, and a comparison with other methods for robust Gaussian clustering

    The two main topics of this paper are the introduction of the "optimally tuned robust improper maximum likelihood estimator" (OTRIMLE) for robust clustering based on the multivariate Gaussian model for clusters, and a comprehensive simulation study comparing the OTRIMLE to maximum likelihood in Gaussian mixtures with and without a noise component, mixtures of t-distributions, and the TCLUST approach for trimmed clustering. The OTRIMLE uses an improper constant density to model outliers and noise. Its level can be chosen optimally so that the non-noise part of the data looks as close to a Gaussian mixture as possible; some deviation from Gaussianity can be traded in for lowering the estimated noise proportion. Covariance matrix constraints and computation of the OTRIMLE are also treated. In the simulation study, all methods are confronted with setups in which their model assumptions are not exactly fulfilled, and, in order to evaluate the experiments in a standardized way by misclassification rates, a new model-based definition of "true clusters" is introduced that deviates from the usual identification of mixture components with clusters. In the study, every method turns out to be superior in one or more setups, but the OTRIMLE achieves the most satisfactory overall performance. The methods are also applied to two real datasets, one without and one with known "true" clusters.
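    The pseudo-likelihood idea behind the (OT)RIMLE, a Gaussian mixture plus an improper constant density absorbing noise, can be sketched in a few lines. This is a minimal illustration assuming numpy and scipy, not the authors' implementation; the function name and arguments are hypothetical:

```python
import numpy as np
from scipy.stats import multivariate_normal

def rimle_pseudo_loglik(X, weights, means, covs, delta, noise_prop):
    """Pseudo log-likelihood of a Gaussian mixture augmented with an
    improper constant density delta that catches outliers/noise.
    weights, means, covs describe the Gaussian components;
    noise_prop is the proportion assigned to the improper component."""
    dens = noise_prop * delta  # improper "uniform" noise component
    for w, m, c in zip(weights, means, covs):
        dens = dens + w * multivariate_normal(m, c).pdf(X)
    return float(np.sum(np.log(dens)))
```

    Tuning delta (the OTRIMLE's "optimal tuning") then amounts to choosing the constant so that the non-noise part of the fit looks as Gaussian as possible.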

    Consistency, breakdown robustness, and algorithms for robust improper maximum likelihood clustering

    The robust improper maximum likelihood estimator (RIMLE) is a new method for robust multivariate clustering that finds approximately Gaussian clusters. It maximizes a pseudo-likelihood defined by adding to a Gaussian mixture a component with improper constant density that accommodates outliers. A special case of the RIMLE is the MLE for multivariate finite Gaussian mixture models. In this paper we treat existence, consistency, and breakdown theory for the RIMLE comprehensively. Existence of the RIMLE is proved under non-smooth covariance matrix constraints, and it is shown that these can be implemented via a computationally feasible expectation-conditional-maximization (ECM) algorithm. Comment: the title of this paper was originally "A consistent and breakdown robust model-based clustering method".
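    One iteration of an ECM scheme of the kind described above can be sketched as follows. This is an illustrative simplification, not the paper's algorithm: the eigenvalue clipping at the end is a crude stand-in for the paper's non-smooth covariance constraints, and all names are hypothetical:

```python
import numpy as np
from scipy.stats import multivariate_normal

def ecm_step(X, pi0, delta, pis, means, covs, eigen_ratio=100.0):
    """One ECM iteration for a RIMLE-type pseudo-likelihood.
    pi0: noise proportion; delta: improper constant noise density."""
    n, p = X.shape
    # E-step: responsibilities for the noise component and each Gaussian
    comp = np.column_stack(
        [pi0 * delta * np.ones(n)]
        + [pi * multivariate_normal(m, c).pdf(X)
           for pi, m, c in zip(pis, means, covs)])
    tau = comp / comp.sum(axis=1, keepdims=True)
    # CM-steps: update proportions, then means, then covariances
    nk = tau.sum(axis=0)
    pi0_new, pis_new = nk[0] / n, nk[1:] / n
    means_new, covs_new = [], []
    for j in range(len(pis)):
        t = tau[:, j + 1]
        m = (t[:, None] * X).sum(axis=0) / nk[j + 1]
        D = X - m
        S = (t[:, None] * D).T @ D / nk[j + 1]
        # crude covariance constraint: bound the eigenvalue ratio
        w, V = np.linalg.eigh(S)
        w = np.clip(w, w.max() / eigen_ratio, None)
        means_new.append(m)
        covs_new.append(V @ np.diag(w) @ V.T)
    return pi0_new, pis_new, means_new, covs_new
```

    Iterating this map to convergence, with the constraint applied in each CM-step, is what makes the procedure computationally feasible.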

    A theory of decidability: entropy and choices under uncertainty (original title: "Una teoria della decidibilità: entropia e scelte in condizioni di incertezza")

    This work presents a new model of choice under uncertainty. After introducing a characterization of the concept of uncertainty, it is shown on an axiomatic basis how the entropy function can be interpreted as a weak measure of uncertainty. The concepts of entropy and expected utility are then used to construct, again axiomatically, a new function, the decidability function, which is able to order preferences over the space of lotteries. Finally, it is shown that this model can rationalize both the Allais and the Ellsberg paradoxes.
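    As a small illustration of the entropy-as-uncertainty idea, the Shannon entropy of a lottery (a probability vector over outcomes) can be computed as below. The paper's decidability function, which combines entropy with expected utility on an axiomatic basis, has a specific form not reproduced here:

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a lottery (probability vector), in nats.
    Degenerate lotteries (all mass on one outcome) have entropy 0,
    the uniform lottery has maximal entropy."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # 0 * log(0) is taken as 0
    return float(-np.sum(p * np.log(p)))
```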

    An adequacy approach for deciding the number of clusters for OTRIMLE robust Gaussian mixture-based clustering

    We introduce a new approach to deciding the number of clusters. The approach is applied to Optimally Tuned Robust Improper Maximum Likelihood Estimation (OTRIMLE; Coretto & Hennig, Journal of the American Statistical Association, 111, 1648-1659) of a Gaussian mixture model allowing for observations to be classified as 'noise', but it can be applied to other clustering methods as well. The quality of a clustering is assessed by a statistic Q that measures how close the within-cluster distributions are to elliptical unimodal distributions that have their only mode at the mean. This non-parametric measure allows for non-Gaussian clusters as long as they have a good quality according to Q. The simplicity of a model is assessed by a measure S that prefers a smaller number of clusters unless additional clusters can reduce the estimated noise proportion substantially. The simplest model is then chosen that is adequate for the data in the sense that its observed value of Q is not significantly larger than what is expected for data truly generated from the fitted model, as can be assessed by parametric bootstrap. The approach is compared with model-based clustering using the Bayesian information criterion (BIC) and the integrated complete likelihood (ICL) in a simulation study and on two real data sets.
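    The parametric-bootstrap adequacy check can be sketched generically. Everything below is an illustrative skeleton: fit, simulate, and Q are user-supplied stand-ins, not the paper's specific statistic or estimator:

```python
import numpy as np

def bootstrap_adequacy(X, fit, simulate, Q, B=200, rng=None):
    """Generic parametric-bootstrap adequacy check.
    fit(X) -> fitted model; simulate(model, n, rng) -> bootstrap sample;
    Q(model, X) -> quality statistic (larger = worse fit).
    Returns the observed Q and a bootstrap p-value: the model is
    'adequate' when Q_obs is not significantly larger than bootstrap Q."""
    rng = np.random.default_rng(rng)
    model = fit(X)
    q_obs = Q(model, X)
    q_boot = []
    for _ in range(B):
        Xb = simulate(model, len(X), rng)       # data from fitted model
        q_boot.append(Q(fit(Xb), Xb))           # refit and recompute Q
    p = (1 + sum(q >= q_obs for q in q_boot)) / (B + 1)
    return q_obs, p
```

    In the paper's setting one would run this for increasing numbers of clusters and pick the simplest model whose p-value does not reject adequacy.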

    Nonparametric consistency for maximum likelihood estimation and clustering based on mixtures of elliptically-symmetric distributions

    The consistency of the maximum likelihood estimator for mixtures of elliptically-symmetric distributions, as an estimator of its population version, is shown, where the underlying distribution P is nonparametric and does not necessarily belong to the class of mixtures on which the estimator is based. In a situation where P is a mixture of well enough separated but nonparametric distributions, it is shown that the components of the population version of the estimator correspond to the well-separated components of P. This provides some theoretical justification for the use of such estimators for cluster analysis in case P has well-separated subpopulations, even if these subpopulations differ from what the mixture model assumes.

    Selecting the number of clusters, clustering models, and algorithms. A unifying approach based on the quadratic discriminant score

    Cluster analysis requires many decisions: the clustering method and the implied reference model, the number of clusters and, often, several hyper-parameters and algorithm tunings. In practice, one produces several partitions, and a final one is chosen based on validation or selection criteria. There exists an abundance of validation methods that, implicitly or explicitly, assume a certain clustering notion. Moreover, they are often restricted to operating on partitions obtained from a specific method. In this paper, we focus on groups that can be well separated by quadratic or linear boundaries. The reference cluster concept is defined through the quadratic discriminant score function and parameters describing clusters' size, center, and scatter. We develop two cluster-quality criteria called quadratic scores. We show that these criteria are consistent with groups generated from a general class of elliptically-symmetric distributions. The quest for this type of group is common in applications. The connection with likelihood theory for mixture models and model-based clustering is investigated. Based on bootstrap resampling of the quadratic scores, we propose a selection rule that allows choosing among many clustering solutions. The proposed method has the distinctive advantage that it can compare partitions that cannot be compared with other state-of-the-art methods. Extensive numerical experiments and the analysis of real data show that, even if some competing methods turn out to be superior in some setups, the proposed methodology achieves a better overall performance. Comment: supplemental materials are included at the end of the paper.
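    A quadratic discriminant score of the kind referred to above can be sketched as follows: for each cluster with proportion, center, and scatter, score a point by the log of the proportion times the Gaussian density (up to an additive constant), and assign the point to the highest-scoring cluster. A minimal sketch assuming numpy; names are illustrative, not from the paper's software:

```python
import numpy as np

def quadratic_score(x, pi, mean, cov):
    """Quadratic discriminant score of point x for a cluster with
    proportion pi, center mean, and scatter matrix cov: log(pi) plus
    the Gaussian log-density at x, dropping the -p/2*log(2*pi) constant.
    Linear boundaries arise as the special case of equal scatter."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return float(np.log(pi) - 0.5 * logdet
                 - 0.5 * diff @ np.linalg.solve(cov, diff))
```

    Averaging such scores over a partition (or resampling them, as in the paper's bootstrap-based selection rule) yields a criterion for comparing clustering solutions.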

    A simulation study to compare robust clustering methods based on mixtures

    The following mixture model-based clustering methods are compared in a simulation study with one-dimensional data, a fixed number of clusters, and a focus on outliers and uniform "noise": an ML estimator (MLE) for Gaussian mixtures, an MLE for a mixture of Gaussians and a uniform distribution (interpreted as a "noise component" to catch outliers), an MLE for a mixture of Gaussian distributions where a uniform distribution over the range of the data is fixed (Fraley and Raftery in Comput J 41:578-588, 199

    Identifiability for mixtures of distributions from a location-scale family with uniforms

    In this paper we study the identifiability of a class of mixture models where a finite number of one-dimensional location-scale distributions is mixed with a finite number of uniform distributions on an interval. We define identifiability and show that, under certain conditions, the aforementioned class of distributions is identifiable.