Considerations about multistep community detection
The problem and implications of community detection in networks have attracted
considerable attention because of their important applications in both natural and social
sciences. A number of algorithms have been developed to solve this problem,
addressing either speed optimization or the quality of the partitions
calculated. In this paper we propose a multi-step procedure bridging the
fastest but less accurate algorithms (coarse clustering) with the slowest but
most effective ones (refinement). By adopting a heuristic ranking of the nodes
and classifying a fraction of them as 'critical', a refinement step can be
restricted to this subset of the network, thus saving computational time.
Preliminary numerical results are discussed, showing improvement of the final
partition.
Comment: 12 pages
Partitioning networks into cliques: a randomized heuristic approach
In the context of community detection in social networks, the term community can be defined strictly: everybody within the community should know each other. We consider the corresponding community detection problem. We search for a partitioning of a network into the minimum number of non-overlapping cliques, such that the cliques cover all vertices. This problem is called the clique covering problem (CCP) and is one of the classical NP-hard problems. For CCP, we propose a randomized heuristic approach. To construct a high-quality solution to CCP, we present an iterated greedy (IG) algorithm. IG can also be combined with a heuristic used to determine how far the algorithm is from the optimum in the worst case. Randomized local search (RLS) for maximum independent set was proposed to find such a bound. The experimental results of IG and the bounds obtained by RLS indicate that IG is a very suitable technique for solving CCP in real-world graphs. In addition, we summarize our basic rigorous results, which were developed for analysis of IG and understanding of its behavior on several relevant graph classes.
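As a rough illustration of the clique covering problem described above, the following is a minimal sketch of a single greedy pass that partitions the vertices into cliques. It is not the IG algorithm from the abstract; the function name `greedy_clique_partition` and the adjacency-dictionary input format are assumptions made for this example.

```python
def greedy_clique_partition(adj, order=None):
    """Partition vertices into non-overlapping cliques with one greedy pass.

    adj:   dict mapping each vertex to the set of its neighbours (undirected).
    order: optional vertex order; a different order may give a different cover.
    """
    cliques = []
    for v in (order if order is not None else sorted(adj)):
        for clique in cliques:
            # v may join a clique only if it is adjacent to every member.
            if clique <= adj[v]:
                clique.add(v)
                break
        else:
            # No existing clique can take v, so it starts a new one.
            cliques.append({v})
    return cliques


# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cliques = greedy_clique_partition(adj)
```

The result depends on the vertex order, which is one reason randomized or iterated variants of such a greedy pass are attractive.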
Fast counting with tensor networks
We introduce tensor network contraction algorithms for counting satisfying
assignments of constraint satisfaction problems (#CSPs). We represent each
arbitrary #CSP formula as a tensor network, whose full contraction yields the
number of satisfying assignments of that formula, and use graph theoretical
methods to determine favorable orders of contraction. We employ our heuristics
for the solution of #P-hard counting boolean satisfiability (#SAT) problems,
namely monotone #1-in-3SAT and #Cubic-Vertex-Cover, and find that they
outperform state-of-the-art solvers by a significant margin.
Comment: v2: added results for monotone #1-in-3SAT; published version
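To make the idea of counting by contraction concrete, here is a toy sketch in plain Python for a two-clause monotone 1-in-3SAT formula. The formula, the dictionary-based tensor `T`, and the function name `count_two_clauses` are assumptions for illustration only; the sketch does not reproduce the contraction-ordering heuristics from the abstract.

```python
from itertools import product

# Clause tensor for monotone 1-in-3: an entry is 1 when exactly one input is 1.
T = {bits: int(sum(bits) == 1) for bits in product((0, 1), repeat=3)}


def count_two_clauses():
    """Count assignments satisfying 1in3(a, b, c) AND 1in3(c, d, e)."""
    pairs = list(product((0, 1), repeat=2))
    # Contract each clause tensor over its private indices first,
    # leaving one vector indexed by the shared variable c.
    left = [sum(T[a, b, c] for a, b in pairs) for c in (0, 1)]
    right = [sum(T[c, d, e] for d, e in pairs) for c in (0, 1)]
    # The final contraction over c yields the number of satisfying assignments.
    return sum(l * r for l, r in zip(left, right))


count = count_two_clauses()
```

Contracting the private indices first keeps every intermediate object a vector of length 2, whereas enumerating all five variables at once would touch 32 assignments; choosing such orders well is what matters on larger networks.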
Window-based Streaming Graph Partitioning Algorithm
In recent years, the scale of graph datasets has increased to such a
degree that a single machine is not capable of efficiently processing large
graphs. Therefore, efficient graph partitioning is necessary for those large
graph applications. Traditional graph partitioning generally loads the whole
graph data into the memory before performing partitioning; this is not only a
time consuming task but it also creates memory bottlenecks. These issues of
memory limitation and enormous time complexity can be resolved using
stream-based graph partitioning. A streaming graph partitioning algorithm reads
each vertex once and assigns it to a partition as it arrives. This is also
called a one-pass algorithm. This paper proposes an efficient window-based
streaming graph partitioning algorithm called WStream. The WStream algorithm is
an edge-cut partitioning algorithm, which distributes vertices among the
partitions. Our results suggest that the WStream algorithm is able to partition
large graph data efficiently while keeping the load balanced across different
partitions and communication to a minimum. Evaluation results with real
workloads also demonstrate the effectiveness of the proposed algorithm, which
achieves a significant reduction in load imbalance and edge-cut across a
range of datasets.
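The one-pass idea can be sketched as follows. This is not WStream (it uses no window); it is a generic greedy streaming partitioner in the spirit of the linear deterministic greedy heuristic, and the names `stream_partition`, `edge_cut`, and the `capacity` parameter are assumptions made for this example.

```python
def stream_partition(vertex_stream, k, capacity):
    """Assign each streamed vertex to one of k partitions in a single pass.

    vertex_stream: iterable of (vertex, neighbours) pairs, each seen once.
    capacity:      maximum number of vertices allowed in one partition.
    """
    assignment = {}
    sizes = [0] * k
    for v, neighbours in vertex_stream:
        best, best_key = None, None
        for p in range(k):
            if sizes[p] >= capacity:
                continue  # partition is full
            placed = sum(1 for u in neighbours if assignment.get(u) == p)
            # Already-placed neighbours, damped by how full the partition is.
            score = placed * (1 - sizes[p] / capacity)
            key = (score, -sizes[p])  # ties go to the emptier partition
            if best_key is None or key > best_key:
                best, best_key = p, key
        if best is None:
            raise ValueError("all partitions are full")
        assignment[v] = best
        sizes[best] += 1
    return assignment


def edge_cut(edges, assignment):
    """Number of edges whose endpoints lie in different partitions."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


# Two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
neighbours = {v: set() for v in range(6)}
for u, v in edges:
    neighbours[u].add(v)
    neighbours[v].add(u)
assignment = stream_partition(((v, neighbours[v]) for v in range(6)), k=2, capacity=3)
```

The capacity term is what keeps the load balanced: without it, every vertex would simply follow its neighbours into the first partition.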
A Fast and Efficient Incremental Approach toward Dynamic Community Detection
Community detection is a discovery tool used by network scientists to analyze
the structure of real-world networks. It seeks to identify natural divisions
that may exist in the input networks that partition the vertices into coherent
modules (or communities). While this problem space is rich with efficient
algorithms and software, most of this literature caters to the static use-case
where the underlying network does not change. However, many emerging real-world
use-cases give rise to a need to incorporate dynamic graphs as inputs.
In this paper, we present a fast and efficient incremental approach toward
dynamic community detection. The key contribution is a generic technique called
, which examines the most recent batch of changes made to an
input graph and selects a subset of vertices to reevaluate for potential
community (re)assignment. This technique can be incorporated into any of the
community detection methods that use modularity as their objective function for
clustering. For demonstration purposes, we incorporated the technique into two
well-known community detection tools. Our experiments demonstrate that our new
incremental approach is able to generate performance speedups without
compromising on the output quality (despite its heuristic nature). For
instance, on a real-world network with 63M temporal edges (over 12 time steps),
our approach was able to complete in 1056 seconds, yielding a 3x speedup over a
baseline implementation. In addition to demonstrating the performance benefits,
we also show how to use our approach to delineate appropriate intervals of
temporal resolutions at which to analyze an input network.
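For reference, the modularity objective mentioned above can be computed as in the following minimal sketch, which assumes an undirected, unweighted graph without self-loops. It shows only the objective function, not the incremental technique from the abstract; the function name `modularity` and the input format are assumptions for this example.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q = sum_c [ e_c / m - (d_c / (2m))^2 ].

    edges:     list of (u, v) pairs, one per undirected edge.
    community: dict mapping each vertex to its community label.
    """
    m = len(edges)
    internal = defaultdict(int)  # e_c: edges with both endpoints in community c
    degree = defaultdict(int)    # d_c: sum of vertex degrees in community c
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two triangles joined by the single edge (2, 3), one community per triangle.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q = modularity(edges, {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"})
```

Because Q is a sum of per-community terms, moving a few vertices changes only the terms of the communities involved, which is what makes re-evaluating a selected subset of vertices cheaper than recomputing from scratch.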
RASCAL: calculation of graph similarity using maximum common edge subgraphs
A new graph similarity calculation procedure is introduced for comparing labeled graphs. Given a minimum similarity threshold, the procedure consists of an initial screening process to determine whether it is possible for the measure of similarity between the two graphs to exceed the minimum threshold, followed by a rigorous maximum common edge subgraph (MCES) detection algorithm to compute the exact degree and composition of similarity. The proposed MCES algorithm is based on a maximum clique formulation of the problem and is a significant improvement over other published algorithms. It presents new approaches to both lower and upper bounding as well as vertex selection.
Hierarchical path-finding for Navigation Meshes (HNA*)
Path-finding can become an important bottleneck as both the size of the virtual environments and the number of agents navigating them increase. It is important to develop techniques that can be efficiently applied to any environment independently of its abstract representation. In this paper we present a hierarchical NavMesh representation to speed up path-finding. Hierarchical path-finding (HPA*) has been successfully applied to regular grids, but there is a need to extend the benefits of this method to polygonal navigation meshes. As opposed to regular grids, navigation meshes offer representations with higher accuracy regarding the underlying geometry, while containing a smaller number of cells. Therefore, we present a bottom-up method to create a hierarchical representation based on a multilevel k-way partitioning algorithm (MLkP), annotated with sub-paths that can be accessed online by our Hierarchical NavMesh Path-finding algorithm (HNA*). The algorithm benefits from searching in graphs with a much smaller number of cells, thus performing up to 7.7 times faster than traditional A* over the initial NavMesh. We present results of HNA* over a variety of scenarios and discuss the benefits of the algorithm together with areas for improvement.
Peer Reviewed. Postprint (author's final draft)
Cut Size Statistics of Graph Bisection Heuristics
We investigate the statistical properties of cut sizes generated by heuristic
algorithms which solve approximately the graph bisection problem. On an
ensemble of sparse random graphs, we find empirically that the distribution of
the cut sizes found by "local" algorithms becomes peaked as the number of
vertices in the graphs becomes large. Evidence is given that this distribution
tends towards a Gaussian whose mean and variance scale linearly with the
number of vertices of the graphs. Given the distribution of cut sizes
associated with each heuristic, we provide a ranking procedure which takes into
account both the quality of the solutions and the speed of the algorithms. This
procedure is demonstrated for a selection of local graph bisection heuristics.
Comment: 17 pages, 5 figures, submitted to SIAM Journal on Optimization; also
available at http://ipnweb.in2p3.fr/~martin
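To illustrate what a "local" bisection heuristic and its cut size look like, here is a minimal sketch of a pair-swap local search. It is not one of the specific heuristics studied in the paper; the names `cut_size` and `swap_local_search` and the first-improvement rule are assumptions made for this example.

```python
def cut_size(edges, side):
    """Number of edges whose endpoints lie on different sides of the bisection."""
    return sum(1 for u, v in edges if side[u] != side[v])


def swap_local_search(edges, side):
    """Apply improving pair swaps until none is left.

    Swapping one vertex from each side keeps the two halves equal in size.
    """
    side = dict(side)
    improved = True
    while improved:
        improved = False
        current = cut_size(edges, side)
        left = [v for v in side if side[v] == 0]
        right = [v for v in side if side[v] == 1]
        for a in left:
            for b in right:
                side[a], side[b] = 1, 0          # tentative swap
                if cut_size(edges, side) < current:
                    improved = True              # keep the first improving swap
                    break
                side[a], side[b] = 0, 1          # undo
            if improved:
                break
    return side


# Two triangles joined by the edge (2, 3), starting from a poor bisection.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
start = {0: 0, 1: 0, 3: 0, 2: 1, 4: 1, 5: 1}
final = swap_local_search(edges, start)
```

Running such a heuristic from many random starting bisections and recording `cut_size` for each run gives the kind of cut-size distribution the abstract analyzes.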