Communication-Optimal Distributed Dynamic Graph Clustering
We consider the problem of clustering graph nodes over large-scale dynamic
graphs, such as citation networks, images and web networks, when graph updates
such as node/edge insertions/deletions are observed distributively. We propose
communication-efficient algorithms for two well-established communication
models namely the message passing and the blackboard models. Given a graph with
nodes that is observed at remote sites over time, the two proposed algorithms
have communication costs that almost match their lower bounds, up to a
polylogarithmic factor, in the message passing and the blackboard models,
respectively. More importantly, we prove that at each time point our
algorithms generate clustering quality nearly as
good as that of centralizing all updates up to that time and then applying a
standard centralized clustering algorithm. We conducted extensive experiments
on both synthetic and real-life datasets which confirmed the communication
efficiency of our approach over baseline algorithms while achieving comparable
clustering results. Comment: Accepted and to appear in AAAI'1
Visualization of large citation networks with space-efficient multi-layer optimization
This paper describes a technique for visualizing large citation networks (or bibliography networks) using a space-efficient multi-layer optimization. Our technique first uses a fast clustering algorithm to discover community structure in the bibliographic network. The clustering process partitions the entire network into relevant abstract subgroups so that the visualization can provide a clearer, less dense global view of the complete citation graph. We next use a new space-efficient visualization algorithm to optimize the graph layout within the limited display space, so that our technique can theoretically handle a very large bibliography network with several thousand elements. Our technique also employs rich graphics to convey attributes such as publication year and number of citations. Finally, the system provides an interaction technique that works together with the layout to allow users to navigate through the citation network. Animation is also implemented to preserve the users' mental maps during the interaction.
Clustering and Community Detection in Directed Networks: A Survey
Networks (or graphs) appear as dominant structures in diverse domains,
including sociology, biology, neuroscience and computer science. In most of the
aforementioned cases graphs are directed - in the sense that there is
directionality on the edges, making the semantics of the edges non-symmetric.
An interesting feature that real networks present is the clustering or
community structure property, under which the graph topology is organized into
modules commonly called communities or clusters. The essence here is that nodes
of the same community are highly similar while on the contrary, nodes across
communities present low similarity. Revealing the underlying community
structure of directed complex networks has become a crucial and
interdisciplinary topic with a plethora of applications. Therefore, naturally
there is a recent wealth of research production in the area of mining directed
graphs - with clustering being the primary method and tool for community
detection and evaluation. The goal of this paper is to offer an in-depth review
of the methods presented so far for clustering directed networks along with the
relevant necessary methodological background and also related applications. The
survey commences by offering a concise review of the fundamental concepts and
methodological base on which graph clustering algorithms capitalize. Then we
present the relevant work along two orthogonal classifications. The first one
is mostly concerned with the methodological principles of the clustering
algorithms, while the second one approaches the methods from the viewpoint
regarding the properties of a good cluster in a directed network. Further, we
present methods and metrics for evaluating graph clustering results,
demonstrate interesting application domains and provide promising future
research directions. Comment: 86 pages, 17 figures. Physics Reports Journal (To Appear
Identifying Overlapping and Hierarchical Thematic Structures in Networks of Scholarly Papers: A Comparison of Three Approaches
We implemented three recently proposed approaches to the identification of
overlapping and hierarchical substructures in graphs and applied the
corresponding algorithms to a network of 492 information-science papers coupled
via their cited sources. The thematic substructures obtained and overlaps
produced by the three hierarchical cluster algorithms were compared to a
content-based categorisation, which we based on the interpretation of titles
and keywords. We defined sets of papers dealing with three topics located on
different levels of aggregation: h-index, webometrics, and bibliometrics. We
identified these topics with branches in the dendrograms produced by the three
cluster algorithms and compared the overlapping topics they detected with one
another and with the three pre-defined paper sets. We discuss the advantages
and drawbacks of applying the three approaches to paper networks in research
fields. Comment: 18 pages, 9 figure
Multi-level algorithms for modularity clustering
Modularity is one of the most widely used quality measures for graph
clusterings. Maximizing modularity is NP-hard, and the runtime of exact
algorithms is prohibitive for large graphs. A simple and effective class of
heuristics coarsens the graph by iteratively merging clusters (starting from
singletons), and optionally refines the resulting clustering by iteratively
moving individual vertices between clusters. Several heuristics of this type
have been proposed in the literature, but little is known about their relative
performance.
This paper experimentally compares existing and new coarsening- and
refinement-based heuristics with respect to their effectiveness (achieved
modularity) and efficiency (runtime). Concerning coarsening, it turns out that
the most widely used criterion for merging clusters (modularity increase) is
outperformed by other simple criteria, and that a recent algorithm by Schuetz
and Caflisch is no improvement over simple greedy coarsening for these
criteria. Concerning refinement, a new multi-level algorithm is shown to
produce significantly better clusterings than conventional single-level
algorithms. A comparison with published benchmark results and algorithm
implementations shows that combinations of coarsening and multi-level
refinement are competitive with the best algorithms in the literature. Comment: 12 pages, 10 figures, see
http://www.informatik.tu-cottbus.de/~rrotta/ for downloading the graph
clustering softwar
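The coarsening heuristic described in the abstract above (start from singleton clusters and repeatedly merge) can be illustrated with a minimal sketch. This is not the authors' software: it assumes an undirected, unweighted graph given as an edge list, uses the standard Newman modularity definition, and merges by modularity increase, which the abstract names as the most widely used criterion (and reports as outperformed by others). The function names `modularity` and `greedy_coarsen` are hypothetical, and the sketch recomputes modularity from scratch for clarity rather than speed.

```python
from itertools import combinations


def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    label = {v: i for i, c in enumerate(communities) for v in c}
    intra = [0] * len(communities)   # edges with both endpoints in the community
    degree = [0] * len(communities)  # summed vertex degree per community
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            intra[label[u]] += 1
    return sum(e / m - (d / (2 * m)) ** 2 for e, d in zip(intra, degree))


def greedy_coarsen(edges):
    """Merge clusters, starting from singletons, while modularity increases."""
    nodes = sorted({v for e in edges for v in e})
    communities = [{v} for v in nodes]
    best_q = modularity(edges, communities)
    while len(communities) > 1:
        best = None
        # Try every pair of clusters and keep the merge with the largest gain.
        for i, j in combinations(range(len(communities)), 2):
            merged = [c for k, c in enumerate(communities) if k not in (i, j)]
            merged.append(communities[i] | communities[j])
            q = modularity(edges, merged)
            if q > best_q + 1e-12 and (best is None or q > best[0]):
                best = (q, merged)
        if best is None:  # no merge improves modularity: stop coarsening
            break
        best_q, communities = best
    return communities, best_q


# Two triangles joined by a single bridge edge (illustrative data only).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
communities, q = greedy_coarsen(edges)
print(sorted(sorted(c) for c in communities), round(q, 4))
```

On this toy graph the merging stops once each triangle forms its own cluster. A refinement phase, as the abstract describes, would then try moving individual vertices between clusters; it is omitted here.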