EM Algorithms for Weighted-Data Clustering with Application to Audio-Visual Scene Analysis
Data clustering has received a lot of attention and numerous methods,
algorithms and software packages are available. Among these techniques,
parametric finite-mixture models play a central role due to their interesting
mathematical properties and to the existence of maximum-likelihood estimators
based on expectation-maximization (EM). In this paper we propose a new mixture
model that associates a weight with each observed point. We introduce the
weighted-data Gaussian mixture and we derive two EM algorithms. The first one
considers a fixed weight for each observation. The second one treats each
weight as a random variable following a gamma distribution. We propose a model
selection method based on a minimum message length criterion, provide a weight
initialization strategy, and validate the proposed algorithms by comparing them
with several state-of-the-art parametric and non-parametric clustering
techniques. We also demonstrate the effectiveness and robustness of the
proposed clustering technique in the presence of heterogeneous data, namely
audio-visual scene analysis.
Comment: 14 pages, 4 figures, 4 tables
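The fixed-weight variant described in this abstract can be illustrated with a minimal one-dimensional sketch. The abstract does not give the update equations, so the code assumes one possible formalization: each weight scales the precision of its observation, meaning point i has variance variances[k] / ws[i] under component k. The function name `weighted_gmm_em` and all parameter names are hypothetical, and the gamma-distributed-weight variant and the minimum message length model selection are not covered.

```python
import math


def weighted_gmm_em(xs, ws, means, variances, priors, n_iter=50):
    """EM for a 1-D Gaussian mixture in which point i carries a fixed weight ws[i].

    Assumption (not stated in the abstract): the weight scales the precision,
    so point i is modelled with variance variances[k] / ws[i] under component k.
    """
    n, k = len(xs), len(means)
    means, variances, priors = list(means), list(variances), list(priors)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x, w in zip(xs, ws):
            dens = []
            for j in range(k):
                var = variances[j] / w
                pdf = math.exp(-0.5 * (x - means[j]) ** 2 / var)
                pdf /= math.sqrt(2.0 * math.pi * var)
                dens.append(priors[j] * pdf)
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: weights enter the mean and variance updates, not the priors.
        for j in range(k):
            r_sum = sum(resp[i][j] for i in range(n))
            wr_sum = sum(ws[i] * resp[i][j] for i in range(n))
            means[j] = sum(ws[i] * resp[i][j] * xs[i] for i in range(n)) / wr_sum
            scatter = sum(
                ws[i] * resp[i][j] * (xs[i] - means[j]) ** 2 for i in range(n)
            )
            # Small floor keeps the variance strictly positive.
            variances[j] = max(scatter / r_sum, 1e-6)
            priors[j] = r_sum / n
    return means, variances, priors


# Two well-separated groups; the central point of each group has weight 2.
xs = [-0.2, 0.0, 0.2, 9.8, 10.0, 10.2]
ws = [1.0, 2.0, 1.0, 1.0, 2.0, 1.0]
means, variances, priors = weighted_gmm_em(xs, ws, [1.0, 9.0], [1.0, 1.0], [0.5, 0.5])
```

Under this assumption, a larger weight pulls the component mean toward the corresponding point, while the mixing proportions depend only on the responsibilities.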
Techniques for clustering gene expression data
Many clustering techniques have been proposed for the analysis of gene expression data obtained from microarray experiments. However, choosing a suitable method for a given experimental dataset is not straightforward. Common approaches do not translate well and fail to take account of the data profile. This review paper surveys state-of-the-art applications that recognise these limitations and implement procedures to overcome them. It provides a framework for the evaluation of clustering in gene expression analyses. The nature of microarray data is discussed briefly. Selected examples are presented for the clustering methods considered.
Modelling Spatial Regimes in Farms Technologies
We exploit the information derived from geographical coordinates to
endogenously identify spatial regimes in technologies that are the result of a
variety of complex, dynamic interactions among site-specific environmental
variables and farmer decision making about technology, which are often not
observed at the farm level. Controlling for unobserved heterogeneity is a
fundamental challenge in empirical research, as failing to do so can produce
model misspecification and preclude causal inference. In this article, we adopt
a two-step procedure to deal with unobserved spatial heterogeneity, while
accounting for spatial dependence in a cross-sectional setting. The first step
of the procedure explicitly takes unobserved spatial heterogeneity into account
to endogenously identify subsets of farms that follow a similar local
production econometric model, i.e. spatial production regimes. The second step
consists in the specification of a spatial autoregressive model with
autoregressive disturbances and spatial regimes. The method is applied to two
regional samples of olive growing farms in Italy. The main finding is that the
identification of spatial regimes can help draw a more detailed picture of
the production environment and provide more accurate information to guide
extension services and policy makers.