57 research outputs found
Multi-level algorithms for modularity clustering
Modularity is one of the most widely used quality measures for graph
clusterings. Maximizing modularity is NP-hard, and the runtime of exact
algorithms is prohibitive for large graphs. A simple and effective class of
heuristics coarsens the graph by iteratively merging clusters (starting from
singletons), and optionally refines the resulting clustering by iteratively
moving individual vertices between clusters. Several heuristics of this type
have been proposed in the literature, but little is known about their relative
performance.
This paper experimentally compares existing and new coarsening- and
refinement-based heuristics with respect to their effectiveness (achieved
modularity) and efficiency (runtime). Concerning coarsening, it turns out that
the most widely used criterion for merging clusters (modularity increase) is
outperformed by other simple criteria, and that a recent algorithm by Schuetz
and Caflisch is no improvement over simple greedy coarsening for these
criteria. Concerning refinement, a new multi-level algorithm is shown to
produce significantly better clusterings than conventional single-level
algorithms. A comparison with published benchmark results and algorithm
implementations shows that combinations of coarsening and multi-level
refinement are competitive with the best algorithms in the literature.
Comment: 12 pages, 10 figures, see
http://www.informatik.tu-cottbus.de/~rrotta/ for downloading the graph
clustering software
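The coarsening step described in this abstract can be illustrated with a minimal sketch. It assumes an unweighted, undirected graph given as an edge list, uses the standard modularity definition, and merges clusters greedily by modularity increase (the "most widely used criterion" named above). The function names and the toy graph are hypothetical and are not taken from the authors' software; the refinement phase is not shown.

```python
def modularity(edges, clusters):
    """Modularity Q = sum over clusters of e_c/m - (d_c/(2m))^2."""
    m = len(edges)
    label = {v: i for i, c in enumerate(clusters) for v in c}
    intra = [0] * len(clusters)    # e_c: edges inside cluster c
    degree = [0] * len(clusters)   # d_c: total degree of cluster c
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            intra[label[u]] += 1
    return sum(e / m - (d / (2 * m)) ** 2 for e, d in zip(intra, degree))


def greedy_coarsen(edges):
    """Start from singletons; repeatedly merge the pair with the largest
    positive modularity increase until no merge improves modularity."""
    m = len(edges)
    clusters = [{v} for v in sorted({v for e in edges for v in e})]
    while True:
        label = {v: i for i, c in enumerate(clusters) for v in c}
        degree = [0] * len(clusters)
        between = {}  # (a, b) with a < b -> number of edges between a and b
        for u, v in edges:
            a, b = label[u], label[v]
            degree[a] += 1
            degree[b] += 1
            if a != b:
                key = (min(a, b), max(a, b))
                between[key] = between.get(key, 0) + 1
        # Only connected pairs can have a positive gain.
        best_gain, best_pair = 0.0, None
        for (a, b), e_ab in between.items():
            gain = e_ab / m - degree[a] * degree[b] / (2 * m * m)
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            return clusters
        a, b = best_pair
        clusters[a] |= clusters[b]
        del clusters[b]  # safe because b > a


# Toy graph (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
result = greedy_coarsen(edges)
print(sorted(sorted(c) for c in result), modularity(edges, result))
```

On this toy graph the greedy merges recover the two triangles. The abstract reports that other merge criteria outperform modularity increase; swapping the `gain` expression is the natural place to try them.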
Modularity-Based Clustering for Network-Constrained Trajectories
We present a novel clustering approach for moving object trajectories that
are constrained by an underlying road network. The approach builds a similarity
graph based on these trajectories, then uses modularity-optimizing hierarchical
graph clustering to group trajectories with similar profiles. Our
experimental study shows the superiority of the proposed approach over classic
hierarchical clustering and gives brief insight into the visualization of the
clustering results.
Comment: 20-th European Symposium on Artificial Neural Networks, Computational
Intelligence and Machine Learning (ESANN 2012), Bruges : Belgium (2012)
Dissimilarity Clustering by Hierarchical Multi-Level Refinement
We introduce in this paper a new way of optimizing the natural extension of
the quantization error used in k-means clustering to dissimilarity data. The
proposed method is based on hierarchical clustering analysis combined with
multi-level heuristic refinement. The method is computationally efficient and
achieves better quantization errors than the
Comment: 20-th European Symposium on Artificial Neural Networks, Computational
Intelligence and Machine Learning (ESANN 2012), Bruges : Belgium (2012)
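The abstract does not spell out the "natural extension" of the quantization error. A common choice, assumed here, replaces the distance to a centroid with the sum of pairwise dissimilarities inside each cluster divided by twice the cluster size. The sketch below computes that quantity and checks that it matches the usual k-means error when the dissimilarity is the squared Euclidean distance. The function name and the data are hypothetical.

```python
def dissimilarity_quantization_error(D, clusters):
    """Sum over clusters C of (1 / (2|C|)) * sum_{i,j in C} D[i][j].

    D is a symmetric dissimilarity matrix with a zero diagonal and
    clusters is a list of lists of row indices. This formula is an
    assumption about what the abstract calls the natural extension.
    """
    total = 0.0
    for c in clusters:
        pair_sum = sum(D[i][j] for i in c for j in c)  # ordered pairs
        total += pair_sum / (2 * len(c))
    return total


# Hypothetical 1-D data, so the classical k-means error can be computed too.
points = [0.0, 2.0, 10.0, 14.0]
clusters = [[0, 1], [2, 3]]
D = [[(a - b) ** 2 for b in points] for a in points]

error = dissimilarity_quantization_error(D, clusters)

# Classical k-means error: squared distance of each point to its centroid.
kmeans_error = 0.0
for c in clusters:
    centroid = sum(points[i] for i in c) / len(c)
    kmeans_error += sum((points[i] - centroid) ** 2 for i in c)

print(error, kmeans_error)
```

The pairwise form needs only the matrix `D`, which is why it applies to data where no centroid can be computed.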
Visual Mining of Epidemic Networks
We show how an interactive graph visualization method based on maximal
modularity clustering can be used to explore a large epidemic network. The
visual representation is used to display statistical test results that expose
the relations between the propagation of HIV in a sexual contact network and
the sexual orientation of the patients.
Comment: 8 pages
Co-Clustering Network-Constrained Trajectory Data
Recently, clustering moving object trajectories has been gaining interest from
both the data mining and machine learning communities. This problem, however,
has mainly been studied in the setting where moving objects can move
freely in Euclidean space. In this paper, we study the problem of
clustering trajectories of vehicles whose movement is restricted by the
underlying road network. We model relations between these trajectories and road
segments as a bipartite graph and we try to cluster its vertices. We
demonstrate our approach on synthetic data and show how it could be useful in
inferring knowledge about the flow dynamics and the behavior of the drivers
using the road network.
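The bipartite model mentioned in this abstract can be sketched as follows. The sketch assumes each trajectory is given as a list of road-segment identifiers and weights an edge by the number of times the trajectory visits the segment; the weighting, the function names, and the sample data are all hypothetical, and the clustering step itself is not shown.

```python
from collections import Counter


def build_bipartite_graph(trajectories):
    """Return edge weights {(trajectory_id, segment_id): visit_count}.

    One vertex set holds trajectories, the other holds road segments;
    an edge links a trajectory to every segment it traverses.
    """
    weights = Counter()
    for traj_id, segments in trajectories.items():
        for seg in segments:
            weights[(traj_id, seg)] += 1
    return weights


def segment_profiles(weights):
    """Group edges by segment: {segment_id: {trajectory_id: visit_count}}."""
    profiles = {}
    for (traj_id, seg), w in weights.items():
        profiles.setdefault(seg, {})[traj_id] = w
    return profiles


# Hypothetical trajectories over hypothetical segment identifiers.
trajectories = {
    "t1": ["s1", "s2", "s3"],
    "t2": ["s2", "s3", "s3"],
    "t3": ["s7", "s8"],
}
weights = build_bipartite_graph(trajectories)
profiles = segment_profiles(weights)
print(profiles["s3"])
```

Segments with similar profiles are visited by the same trajectories, and trajectories with similar edge sets use the same segments, which is the structure a co-clustering of both vertex sets would look for.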
Graphs in machine learning: an introduction
Graphs are commonly used to characterise interactions between objects of
interest. Because they are based on a straightforward formalism, they are used
in many scientific fields from computer science to historical sciences. In this
paper, we give an introduction to some methods relying on graphs for learning.
This includes both unsupervised and supervised methods. Unsupervised learning
algorithms usually aim at visualising graphs in latent spaces and/or clustering
the nodes. Both focus on extracting knowledge from graph topologies. While most
existing techniques are only applicable to static graphs, where edges do not
evolve through time, recent developments have shown that they could be extended
to deal with evolving networks. In a supervised context, one generally aims at
inferring labels or numerical values attached to nodes using both the graph
and, when they are available, node characteristics. Balancing the two sources
of information can be challenging, especially as they can disagree locally or
globally. In both contexts, supervised and unsupervised, data can be
relational (augmented with one or several global graphs) as described above, or
graph valued. In this latter case, each object of interest is given as a full
graph (possibly completed by other characteristics). In this context, natural
tasks include graph clustering (as in producing clusters of graphs rather than
clusters of nodes in a single graph), graph classification, etc.
1 Real networks
One of the first practical studies on graphs can be dated back to the
original work of Moreno [51] in the 1930s. Since then, there has been a growing
interest in graph analysis associated with strong developments in the modelling
and the processing of these data. Graphs are now used in many scientific
fields. In Biology [54, 2, 7], for instance, metabolic networks can describe
pathways of biochemical reactions [41], while in social sciences networks are
used to represent relation ties between actors [66, 56, 36, 34]. Other examples
include power grids [71] and the web [75]. Recently, networks have also been
considered in other areas such as geography [22] and history [59, 39]. In
machine learning, networks are seen as powerful tools to model problems in
order to extract information from data and for prediction purposes. This is the
object of this paper. For more complete surveys, we refer to [28, 62, 49, 45].
In this section, we introduce notations and highlight properties shared by most
real networks. In Section 2, we then consider methods aiming at extracting
information from a unique network. We will particularly focus on clustering
methods where the goal is to find clusters of vertices. Finally, in Section 3,
techniques that take a series of networks into account, where each network i
The Core-Periphery Structure of Destination Image: An Exploration via Social Network Analysis
The core-periphery structure (C/PS) of destination image, recently proposed as an alternative image model, has scarcely been tested. Presumably, the lack of appropriate techniques plays an important role in the limited theory testing efforts. Besides, it remains unclear how people retrieve destination image from memory, when we place destination image into a C-P model. Understanding the structure of destination image could reveal insights into image retrieval. This research aims to introduce social network analysis as a new approach to test the C/PS of destination image, and identify the retrieving paths of destination image adopted by potential tourists. Image descriptions about the Shanghai Disney Resort (SHDR) were collected from 1,000 respondents, and the data was analyzed via social network techniques. Results show that the image of SHDR has both single and multiple C/PSs, and image retrieval of respondents either follows a core-to-periphery path or fluctuates between two neighboring levels of core/periphery
Laplacian Dynamics and Multiscale Modular Structure in Networks
Most methods proposed to uncover communities in complex networks rely on
their structural properties. Here we introduce the stability of a network
partition, a measure of its quality defined in terms of the statistical
properties of a dynamical process taking place on the graph. The time-scale of
the process acts as an intrinsic parameter that uncovers community structures
at different resolutions. The stability extends and unifies standard notions
for community detection: modularity and spectral partitioning can be seen as
limiting cases of our dynamic measure. Similarly, recently proposed
multi-resolution methods correspond to linearisations of the stability at short
times. The connection between community detection and Laplacian dynamics
enables us to establish dynamically motivated stability measures linked to
distinct null models. We apply our method to find multi-scale partitions for
different networks and show that the stability can be computed efficiently for
large networks with extended versions of current algorithms.
Comment: New discussions on the selection of the most significant scales and
the generalisation of stability to directed networks
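The link between stability and modularity can be checked numerically with a small sketch. It assumes the discrete-time random-walk form of stability, R(t) = sum over clusters C and i, j in C of [pi_i (M^t)_ij - pi_i pi_j], with M = D^{-1} A and pi_i = d_i / (2m). This is a simplification of the continuous-time Laplacian dynamics the abstract refers to, and the function name and toy graph are hypothetical.

```python
def stability(adj, clusters, t):
    """Discrete-time stability of a partition at integer time t >= 0.

    adj is a symmetric adjacency matrix (list of lists) with no isolated
    vertices; clusters is a list of lists of vertex indices.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    two_m = sum(deg)
    pi = [d / two_m for d in deg]  # stationary distribution of the walk
    M = [[adj[i][j] / deg[i] for j in range(n)] for i in range(n)]
    # P = M^t, starting from the identity matrix.
    P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        P = [[sum(P[i][k] * M[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    return sum(pi[i] * P[i][j] - pi[i] * pi[j]
               for c in clusters for i in c for j in c)


# Toy graph (an assumption): two triangles joined by the edge (2, 3).
edge_list = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edge_list:
    adj[u][v] = adj[v][u] = 1

partition = [[0, 1, 2], [3, 4, 5]]
for t in range(4):
    print(t, stability(adj, partition, t))
```

At t = 1 the expression reduces to the modularity of the partition, since pi_i M_ij = A_ij / (2m). Larger t favours coarser partitions, which is the sense in which time acts as a resolution parameter.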