Hierarchical characterization of complex networks
While most approaches to the characterization of complex networks
have relied on measurements considering only the immediate neighborhood of each
network node, valuable information about the network topological properties can
be obtained by considering further neighborhoods. The current work discusses
how the concepts of hierarchical node degree and hierarchical clustering
coefficient (introduced in cond-mat/0408076), complemented by new hierarchical
measurements, can be used in order to obtain a powerful set of topological
features of complex networks. The interpretation of such measurements is
discussed, including an analytical study of the hierarchical node degree for
random networks, and the potential of the suggested measurements for the
characterization of complex networks is illustrated with respect to simulations
of random, scale-free and regular network models as well as real data
(airports, proteins and word associations). The enhanced characterization of
the connectivity provided by the set of hierarchical measurements also allows
the use of agglomerative clustering methods in order to obtain taxonomies of
relationships between nodes in a network, a possibility which is also
illustrated in the current article.
Comment: 19 pages, 23 figures
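As a rough illustration of looking beyond the immediate neighborhood, the sketch below groups nodes into rings by breadth-first distance from a reference node and counts the edges between consecutive rings. It assumes that the hierarchical node degree at level d is the number of edges joining the nodes at distance d to those at distance d+1; that definition comes from the cited earlier work and is not spelled out in this abstract. The names `rings`, `hierarchical_degree`, and `toy_graph` are hypothetical.

```python
from collections import deque


def rings(adj, source):
    """Group nodes by BFS distance from source; result[d] is the set at distance d."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    by_level = {}
    for node, d in dist.items():
        by_level.setdefault(d, set()).add(node)
    return [by_level[d] for d in range(len(by_level))]


def hierarchical_degree(adj, source, d):
    """Count edges joining ring d to ring d+1 around source (assumed definition)."""
    levels = rings(adj, source)
    if d + 1 >= len(levels):
        return 0
    outer = levels[d + 1]
    return sum(1 for u in levels[d] for v in adj[u] if v in outer)


# Hypothetical undirected toy graph, stored as adjacency lists.
toy_graph = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 3, 4],
    3: [1, 2],
    4: [2],
}

for level in range(3):
    print(level, hierarchical_degree(toy_graph, 0, level))
```

At level 0 this reduces to the ordinary node degree, so the higher levels can be read as extending that measurement outward ring by ring.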
Element-centric clustering comparison unifies overlaps and hierarchy
Clustering is one of the most universal approaches for understanding complex
data. A pivotal aspect of clustering analysis is quantitatively comparing
clusterings; clustering comparison is the basis for many tasks such as
clustering evaluation, consensus clustering, and tracking the temporal
evolution of clusters. In particular, the extrinsic evaluation of clustering
methods requires comparing the uncovered clusterings to planted clusterings or
known metadata. Yet, as we demonstrate, existing clustering comparison measures
have critical biases which undermine their usefulness, and no measure
accommodates both overlapping and hierarchical clusterings. Here we unify the
comparison of disjoint, overlapping, and hierarchically structured clusterings
by proposing a new element-centric framework: elements are compared based on
the relationships induced by the cluster structure, as opposed to the
traditional cluster-centric philosophy. We demonstrate that, in contrast to
standard clustering similarity measures, our framework does not suffer from
critical biases and naturally provides unique insights into how the clusterings
differ. We illustrate the strengths of our framework by revealing new insights
into the organization of clusters in two applications: the improved
classification of schizophrenia based on the overlapping and hierarchical
community structure of fMRI brain networks, and the disentanglement of various
social homophily factors in Facebook social networks. The universality of
clustering suggests far-reaching impact of our framework throughout all areas
of science.
Hierarchical Clustering of Complex Symbolic Data and Application for Emitter Identification
It is well known that the values of symbolic variables may take various forms, such as an interval, a set of stochastic measurements of some underlying patterns, or qualitative multi-values. However, the majority of existing work in symbolic data analysis still focuses on interval values. Although some pioneering work on stochastic-pattern-based symbolic data and mixtures of symbolic variables has been explored, it still lacks the flexibility and computational efficiency to make full use of the distinctive individual symbolic variables. Therefore, we propose a novel hierarchical clustering method with a weighted general Jaccard distance and an effective global pruning strategy for complex symbolic data, and apply it to emitter identification. Extensive experiments indicate that our method outperforms its peers in both computational efficiency and emitter identification accuracy. Peer reviewed
A non-parametric hierarchical clustering model
© 2015 IEEE. We present a novel non-parametric hierarchical clustering model (NHCM) based on a Gaussian mixture model. NHCM uses a novel Dirichlet process (DP) prior that allows more flexible modeling of the data, where the base distribution of the DP is itself an infinite mixture of Gaussian conjugate priors. NHCM can be thought of as a hierarchical clustering model in which the low-level base prior governs the distribution of the data points forming sub-clusters, and the higher-level prior governs the distribution of the sub-clusters forming clusters. Using this hierarchical configuration, we can keep the complexity of the model low while still clustering skewed, complex data. To perform inference, we propose a Gibbs sampling algorithm. Empirical investigations have been carried out to analyse the efficiency of the proposed clustering model.
The New Software Package for Dynamic Hierarchical Clustering for Circles Types of Shapes
In data mining, efforts have focused on finding methods for efficient and effective cluster analysis in
large databases. Active themes of research focus on the scalability of clustering methods, the effectiveness of
methods for clustering complex shapes and types of data, high-dimensional clustering techniques, and methods
for clustering mixed numerical and categorical data in large databases. One of the most accurate approaches,
based on dynamic modeling of cluster similarity, is called Chameleon. In this paper we present a modified
hierarchical clustering algorithm that uses the main idea of Chameleon; the effectiveness of the suggested
approach is demonstrated by the experimental results.
An Agglomerative Hierarchical Clustering with Various Distance Measurements for Ground Level Ozone Clustering in Putrajaya, Malaysia
Ground-level ozone is a common pollution problem that has a negative influence on human health. The key difficulty in ozone level analysis lies in the complex representation of such data, which takes the form of time series. Clustering is one of the common techniques used for meteorological and environmental time series data. The way a clustering technique groups similar sequences relies on a distance or similarity criterion. Several distance measures have been integrated with various types of clustering techniques. However, identifying an appropriate distance measure for a particular field is a challenging task. Since hierarchical clustering has been considered the state of the art for meteorological and climate change data, this paper proposes an agglomerative hierarchical clustering for ozone level analysis in Putrajaya, Malaysia, using three distance measures: Euclidean, Minkowski and Dynamic Time Warping. Results show that Dynamic Time Warping outperformed the other two distance measures.
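To make the role of the distance measure concrete, here is a minimal sketch of agglomerative clustering over short time series with a pluggable distance. It is not the paper's implementation: the average-linkage rule, the absolute-difference local cost inside Dynamic Time Warping, the function names, and the toy series are all assumptions chosen for illustration, and no ozone data is used.

```python
def minkowski(a, b, p=2):
    """Minkowski distance between equal-length series (p=2 gives Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def dtw(a, b):
    """Dynamic Time Warping distance with absolute difference as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def agglomerate(series, distance, k):
    """Average-linkage agglomerative clustering; merges until k clusters remain."""
    count = len(series)
    # Precompute pairwise distances once, keyed by (smaller index, larger index).
    pair = {(i, j): distance(series[i], series[j])
            for i in range(count) for j in range(i + 1, count)}

    def linkage(c1, c2):
        total = sum(pair[min(i, j), max(i, j)] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(count)]
    while len(clusters) > k:
        # Find the closest pair of clusters and merge the second into the first.
        _, x, y = min((linkage(clusters[x], clusters[y]), x, y)
                      for x in range(len(clusters))
                      for y in range(x + 1, len(clusters)))
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy series: the first two share a shape but are shifted in time.
toy_series = [
    [0, 0, 1, 2, 1, 0, 0],
    [0, 1, 2, 1, 0, 0, 0],
    [2, 2, 2, 2, 2, 2, 2],
    [2, 2, 2, 2, 2, 2, 3],
]

print(minkowski(toy_series[0], toy_series[1]))  # penalizes the time shift
print(dtw(toy_series[0], toy_series[1]))        # warps the shift away
print(agglomerate(toy_series, dtw, k=2))
```

The first two toy series differ only by a one-step shift, which a pointwise distance such as Euclidean counts as a mismatch while Dynamic Time Warping can align. This is one plausible reason a warping-based measure may suit time series whose peaks occur at slightly different times.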