    EM Algorithms for Weighted-Data Clustering with Application to Audio-Visual Scene Analysis

    Data clustering has received a lot of attention, and numerous methods, algorithms and software packages are available. Among these techniques, parametric finite-mixture models play a central role due to their interesting mathematical properties and to the existence of maximum-likelihood estimators based on expectation-maximization (EM). In this paper we propose a new mixture model that associates a weight with each observed point. We introduce the weighted-data Gaussian mixture and we derive two EM algorithms. The first one considers a fixed weight for each observation. The second one treats each weight as a random variable following a gamma distribution. We propose a model selection method based on a minimum message length criterion, provide a weight initialization strategy, and validate the proposed algorithms by comparing them with several state-of-the-art parametric and non-parametric clustering techniques. We also demonstrate the effectiveness and robustness of the proposed clustering technique in the presence of heterogeneous data, namely in audio-visual scene analysis. Comment: 14 pages, 4 figures, 4 tables
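The fixed-weight variant mentioned in the abstract can be illustrated with a minimal univariate sketch. The abstract does not give the form of the model, so the sketch assumes that a weight w_i scales the precision of its observation, i.e. x_i given component k follows N(mu_k, var_k / w_i). The function name `weighted_gmm_em_1d`, the farthest-point initialization, and the numerical floors are illustrative choices, not the authors' implementation or their weight initialization strategy.

```python
import math


def weighted_gmm_em_1d(x, w, k, n_iter=50):
    """EM for a 1-D Gaussian mixture with one fixed positive weight per point.

    Assumed model (not stated in the abstract): x[i] | component j
    ~ N(mu[j], var[j] / w[i]), so a larger weight means a more precise point.
    """
    n = len(x)
    # Deterministic farthest-point initialization of the means (illustrative).
    mu = [x[0]]
    while len(mu) < k:
        mu.append(max(x, key=lambda xi: min((xi - m) ** 2 for m in mu)))
    mean = sum(x) / n
    var = [sum((xi - mean) ** 2 for xi in x) / n + 1e-9] * k
    pi = [1.0 / k] * k
    resp = []

    for _ in range(n_iter):
        # E-step: responsibilities under the weight-scaled variance var[j] / w[i].
        resp = []
        for xi, wi in zip(x, w):
            logp = [
                math.log(max(pi[j], 1e-300))
                - 0.5 * (math.log(2 * math.pi * var[j] / wi)
                         + wi * (xi - mu[j]) ** 2 / var[j])
                for j in range(k)
            ]
            top = max(logp)  # subtract the max for numerical stability
            p = [math.exp(v - top) for v in logp]
            total = sum(p)
            resp.append([v / total for v in p])

        # M-step: the weights enter the mean and variance updates, not the priors.
        for j in range(k):
            rj = [r[j] for r in resp]
            wr = [wi * r for wi, r in zip(w, rj)]
            pi[j] = sum(rj) / n
            mu[j] = sum(a * xi for a, xi in zip(wr, x)) / sum(wr)
            var[j] = sum(a * (xi - mu[j]) ** 2 for a, xi in zip(wr, x)) / sum(rj) + 1e-9
    return pi, mu, var, resp
```

With all weights equal to 1 the updates reduce to standard EM for a Gaussian mixture. Giving a point a small weight inflates its assumed variance, so it pulls less on the component means.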

    Techniques for clustering gene expression data

    Many clustering techniques have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the choice of suitable method(s) for a given experimental dataset is not straightforward. Common approaches do not translate well and fail to take account of the data profile. This review paper surveys state-of-the-art applications that recognise these limitations and implement procedures to overcome them. It provides a framework for the evaluation of clustering in gene expression analyses. The nature of microarray data is discussed briefly. Selected examples are presented for the clustering methods considered.

    Modelling Spatial Regimes in Farms Technologies

    We exploit the information derived from geographical coordinates to endogenously identify spatial regimes in technologies that are the result of a variety of complex, dynamic interactions among site-specific environmental variables and farmer decision making about technology, which are often not observed at the farm level. Controlling for unobserved heterogeneity is a fundamental challenge in empirical research, as failing to do so can produce model misspecification and preclude causal inference. In this article, we adopt a two-step procedure to deal with unobserved spatial heterogeneity, while accounting for spatial dependence in a cross-sectional setting. The first step of the procedure explicitly takes unobserved spatial heterogeneity into account to endogenously identify subsets of farms that follow a similar local production econometric model, i.e. spatial production regimes. The second step consists of the specification of a spatial autoregressive model with autoregressive disturbances and spatial regimes. The method is applied to two regional samples of olive-growing farms in Italy. The main finding is that the identification of spatial regimes can help draw a more detailed picture of the production environment and provide more accurate information to guide extension services and policy makers.