4,819 research outputs found
A Topological Approach to Spectral Clustering
We propose two related unsupervised clustering algorithms which, for input,
take data assumed to be sampled from a uniform distribution supported on a
metric space , and output a clustering of the data based on the selection of
a topological model for the connected components of . Both algorithms work
by selecting a graph on the samples from a natural one-parameter family of
graphs, using a geometric criterion in the first case and an information
theoretic criterion in the second. The estimated connected components of
are identified with the kernel of the associated graph Laplacian, which allows
the algorithm to work without requiring the number of expected clusters or
other auxiliary data as input.Comment: 21 Page
Methods of Hierarchical Clustering
We survey agglomerative hierarchical clustering algorithms and discuss
efficient implementations that are available in R and other software
environments. We look at hierarchical self-organizing maps, and mixture models.
We review grid-based clustering, focusing on hierarchical density-based
approaches. Finally we describe a recently developed very efficient (linear
time) hierarchical clustering algorithm, which can also be viewed as a
hierarchical grid-based algorithm.Comment: 21 pages, 2 figures, 1 table, 69 reference
Climate Dynamics: A Network-Based Approach for the Analysis of Global Precipitation
Precipitation is one of the most important meteorological variables for defining the climate dynamics, but the spatial patterns of precipitation have not been fully investigated yet. The complex network theory, which provides a robust tool to investigate the statistical interdependence of many interacting elements, is used here to analyze the spatial dynamics of annual precipitation over seventy years (1941-2010). The precipitation network is built associating a node to a geographical region, which has a temporal distribution of precipitation, and identifying possible links among nodes through the correlation function. The precipitation network reveals significant spatial variability with barely connected regions, as Eastern China and Japan, and highly connected regions, such as the African Sahel, Eastern Australia and, to a lesser extent, Northern Europe. Sahel and Eastern Australia are remarkably dry regions, where low amounts of rainfall are uniformly distributed on continental scales and small-scale extreme events are rare. As a consequence, the precipitation gradient is low, making these regions well connected on a large spatial scale. On the contrary, the Asiatic South-East is often reached by extreme events such as monsoons, tropical cyclones and heat waves, which can all contribute to reduce the correlation to the short-range scale only. Some patterns emerging between mid-latitude and tropical regions suggest a possible impact of the propagation of planetary waves on precipitation at a global scale. Other links can be qualitatively associated to the atmospheric and oceanic circulation. To analyze the sensitivity of the network to the physical closeness of the nodes, short-term connections are broken. The African Sahel, Eastern Australia and Northern Europe regions again appear as the supernodes of the network, confirming furthermore their long-range connection structure. Almost all North-American and Asian nodes vanish, revealing that extreme events can enhance high precipitation gradients, leading to a systematic absence of long-range patterns
The Internet AS-Level Topology: Three Data Sources and One Definitive Metric
We calculate an extensive set of characteristics for Internet AS topologies
extracted from the three data sources most frequently used by the research
community: traceroutes, BGP, and WHOIS. We discover that traceroute and BGP
topologies are similar to one another but differ substantially from the WHOIS
topology. Among the widely considered metrics, we find that the joint degree
distribution appears to fundamentally characterize Internet AS topologies as
well as narrowly define values for other important metrics. We discuss the
interplay between the specifics of the three data collection mechanisms and the
resulting topology views. In particular, we show how the data collection
peculiarities explain differences in the resulting joint degree distributions
of the respective topologies. Finally, we release to the community the input
topology datasets, along with the scripts and output of our calculations. This
supplement should enable researchers to validate their models against real data
and to make more informed selection of topology data sources for their specific
needs.Comment: This paper is a revised journal version of cs.NI/050803
Different approaches to community detection
A precise definition of what constitutes a community in networks has remained
elusive. Consequently, network scientists have compared community detection
algorithms on benchmark networks with a particular form of community structure
and classified them based on the mathematical techniques they employ. However,
this comparison can be misleading because apparent similarities in their
mathematical machinery can disguise different reasons for why we would want to
employ community detection in the first place. Here we provide a focused review
of these different motivations that underpin community detection. This
problem-driven classification is useful in applied network science, where it is
important to select an appropriate algorithm for the given purpose. Moreover,
highlighting the different approaches to community detection also delineates
the many lines of research and points out open directions and avenues for
future research.Comment: 14 pages, 2 figures. Written as a chapter for forthcoming Advances in
network clustering and blockmodeling, and based on an extended version of The
many facets of community detection in complex networks, Appl. Netw. Sci. 2: 4
(2017) by the same author
- …