Detection of elliptical shapes via cross-entropy clustering
We consider the problem of finding elliptical shapes in an image and discuss a
solution based on cross-entropy clustering. The proposed method allows
searching for ellipses with predefined sizes and positions in space. Moreover,
it works well for the search for ellipsoids in higher dimensions.
A local Gaussian filter and adaptive morphology as tools for completing partially discontinuous curves
This paper presents a method for the extraction and analysis of curve-type
structures which consist of disconnected components. Such structures are found
in electron-microscopy (EM) images of metal nanograins, which are widely used
in the field of nanosensor technology.
The topography of metal nanograins in compound nanomaterials is crucial to
nanosensor characteristics. The method of completing such partially
discontinuous curves consists of three steps. In the first step, a local
Gaussian filter is applied with different weights for each neighborhood. In the
second step, an adaptive morphology operation detects the endpoints of curve
segments and connects them. In the last step, pruning is employed to extract
the curve which optimally fits the template.
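The endpoint-detection core of the second step can be illustrated in isolation. The sketch below is a minimal stand-in, not the paper's adaptive morphology: it finds endpoints of a binary skeleton as pixels with exactly one 8-neighbour, using NumPy and SciPy (assumed dependencies), on a made-up broken curve.

```python
import numpy as np
from scipy.ndimage import convolve

def skeleton_endpoints(skel):
    """Endpoints of a binary skeleton: foreground pixels with exactly
    one foreground pixel among their eight neighbours."""
    skel = skel.astype(np.uint8)
    kernel = np.ones((3, 3), dtype=np.uint8)
    kernel[1, 1] = 0                     # count neighbours, not the pixel itself
    neighbours = convolve(skel, kernel, mode="constant", cval=0)
    return (skel == 1) & (neighbours == 1)

# A broken curve: two horizontal segments separated by a one-pixel gap.
img = np.zeros((5, 9), dtype=np.uint8)
img[2, 1:4] = 1   # left segment
img[2, 5:8] = 1   # right segment
ends = np.argwhere(skeleton_endpoints(img))
print(ends)       # the four segment tips, two of them facing the gap
```

Once the facing endpoints are located, a gap-closing step could connect the nearest pairs; the paper's adaptive morphology does this in a direction-aware way that this sketch does not attempt.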
Cross-Entropy Clustering
We construct a cross-entropy clustering (CEC) theory which finds the optimal
number of clusters by automatically removing groups which carry no information.
Moreover, our theory gives a simple and efficient criterion for verifying
cluster validity.
Although CEC can be built on an arbitrary family of densities, in the most
important case of Gaussian CEC:
-- the division into clusters is affine invariant;
-- the clustering tends to divide the data into ellipsoid-type shapes;
-- the approach is computationally efficient, as the Hartigan approach can be
applied.
We also study with particular attention clustering based on spherical Gaussian
densities and on Gaussian densities with covariance s·I. In the latter case we
show that as s converges to zero we obtain the classical k-means clustering.
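The limiting behaviour in the last sentence can be checked numerically. The sketch below is illustrative only (the data, centres, and cluster weights are made up): for spherical Gaussians with covariance s·I, the per-point CEC cost reduces, up to constants, to -log p_i + ||x - m_i||²/(2s), so as s shrinks the assignments coincide with nearest-centre k-means assignments even when the cluster priors p_i are unequal.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated toy clusters in the plane (synthetic data).
X = np.vstack([rng.normal([0.0, 0.0], 0.5, (50, 2)),
               rng.normal([4.0, 4.0], 0.5, (50, 2))])
centers = np.array([[0.0, 0.0], [4.0, 4.0]])
weights = np.array([0.3, 0.7])           # unequal cluster priors

def cec_assign(X, centers, weights, s):
    """Assign each point to the cluster minimising the spherical-Gaussian
    CEC cost  -log p_i + ||x - m_i||^2 / (2 s)  (shared constants dropped)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    cost = -np.log(weights)[None, :] + d2 / (2 * s)
    return cost.argmin(axis=1)

# Plain k-means assigns each point to its nearest centre.
kmeans_assign = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
for s in [10.0, 1.0, 0.01]:
    agree = np.mean(cec_assign(X, centers, weights, s) == kmeans_assign)
    print(f"s = {s:5}: agreement with k-means assignment = {agree:.2f}")
```

For large s the -log p_i term can pull boundary points toward the heavier cluster; as s → 0 the squared-distance term dominates and the two assignment rules agree.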
Identifying Mixtures of Mixtures Using Bayesian Estimation
The use of a finite mixture of normal distributions in model-based clustering
makes it possible to capture non-Gaussian data clusters. However, identifying
the clusters from the normal components is challenging, and is in general
achieved either by imposing constraints on the model or by using
post-processing procedures.
Within the Bayesian framework we propose a different approach based on sparse
finite mixtures to achieve identifiability. We specify a hierarchical prior
whose hyperparameters are carefully selected to reflect the cluster structure
aimed at. In addition, this prior allows the model to be estimated using
standard MCMC sampling methods. In combination with a post-processing approach
which resolves the label-switching issue and results in an identified model,
our approach makes it possible to simultaneously (1) determine the number of
clusters, (2) flexibly approximate the cluster distributions in a
semi-parametric way using finite mixtures of normals, and (3) identify
cluster-specific parameters and classify observations. The proposed approach is
illustrated in two simulation studies and on benchmark data sets.
Uniform Cross-entropy Clustering
Robust mixture-model approaches, which use non-normal distributions, have recently been upgraded to accommodate data with fixed bounds. In this article we propose a new method based on uniform distributions and Cross-Entropy Clustering (CEC). We combine a simple density model with a clustering method which allows us to treat groups separately and estimate parameters in each cluster individually. Consequently, we introduce an effective clustering algorithm which deals with non-normal data.
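One appeal of the uniform density model is that the per-point code length of a cluster depends only on the volume of the density's support. The sketch below illustrates this cost under a simplifying assumption not taken from the paper (the support is an axis-aligned bounding box):

```python
import numpy as np

def uniform_box_cost(points):
    """Per-point code length (in nats) of describing `points` with a uniform
    density on their axis-aligned bounding box: -log(1/volume) = log(volume)."""
    extents = points.max(axis=0) - points.min(axis=0)
    return np.log(extents).sum()

rng = np.random.default_rng(2)
tight = rng.uniform(0, 1, (200, 2))   # fills the unit square
loose = rng.uniform(0, 5, (200, 2))   # fills a 5x5 square
print(uniform_box_cost(tight))        # near log(1) = 0
print(uniform_box_cost(loose))        # near log(25)
```

In a CEC setting, each cluster would pay this volume cost plus the usual -log p_i membership term, so the algorithm favours splitting data into compact boxes rather than one large one.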
Ellipticity and circularity measuring via Kullback-Leibler divergence
Using the Kullback-Leibler divergence we provide a simple statistical measure, based only on the covariance matrix of a given set, to verify whether the set is an ellipsoid. A similar measure is provided for the verification of circles and balls. The new measure is easily computable, intuitive, and can be applied to higher-dimensional data. Experiments have been performed to illustrate that the new measure behaves in a natural way.
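The paper's exact KL-based formula is not reproduced here, but the covariance-only flavour of the idea can be illustrated with a cruder proxy for circularity: the ratio of the smallest to the largest eigenvalue of the sample covariance matrix, which is 1 for a disk and shrinks as the set stretches. This is a hypothetical stand-in, not the paper's measure.

```python
import numpy as np

def covariance_circularity(points):
    """Heuristic circularity score in (0, 1]: ratio of the smallest to the
    largest eigenvalue of the sample covariance (1.0 for a perfect disk)."""
    eigvals = np.linalg.eigvalsh(np.cov(points.T))
    return eigvals.min() / eigvals.max()

rng = np.random.default_rng(1)
theta = rng.uniform(0, 2 * np.pi, 2000)
r = np.sqrt(rng.uniform(0, 1, 2000))        # sqrt gives uniform area density
disk = np.c_[r * np.cos(theta), r * np.sin(theta)]
ellipse = disk * np.array([3.0, 1.0])       # stretch the x-axis by 3

print(covariance_circularity(disk))     # close to 1
print(covariance_circularity(ellipse))  # close to 1/9
```

Unlike this eigenvalue ratio, the paper's KL-based measure can also distinguish an ellipsoid-shaped set from other sets with the same covariance, which is what makes it a verification tool rather than just an elongation score.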
Soft clustering analysis of galaxy morphologies: A worked example with SDSS
Context: The huge and still rapidly growing number of galaxies in modern sky
surveys raises the need for an automated and objective classification method.
Unsupervised learning algorithms are of particular interest, since they
discover classes automatically. Aims: We briefly discuss the pitfalls of
oversimplified classification methods and outline an alternative approach
called "clustering analysis". Methods: We categorise different classification
methods according to their capabilities. Based on this categorisation, we
present a probabilistic classification algorithm that automatically detects the
optimal classes preferred by the data. We explore the reliability of this
algorithm in systematic tests. Using a small sample of bright galaxies from the
SDSS, we demonstrate the performance of this algorithm in practice. We are able
to disentangle the problems of classification and parametrisation of galaxy
morphologies in this case. Results: We give physical arguments that a
probabilistic classification scheme is necessary. The algorithm we present
produces reasonable morphological classes and object-to-class assignments
without any prior assumptions. Conclusions: There are sophisticated automated
classification algorithms that meet all necessary requirements, but a lot of
work is still needed on the interpretation of the results.