
    Community detection algorithms: a comparative analysis

    Uncovering the community structure exhibited by real networks is a crucial step towards an understanding of complex systems that goes beyond the local organization of their constituents. Many algorithms have been proposed so far, but none has been subjected to strict performance tests. Most of the sporadic tests performed so far involved small networks with known community structure and/or artificial graphs with a simplified structure, which is very uncommon in real systems. Here we test several methods against a recently introduced class of benchmark graphs, with heterogeneous distributions of degree and community size. The methods are also tested against the benchmark by Girvan and Newman and on random graphs. As a result of our analysis, three recent algorithms, introduced by Rosvall and Bergstrom, Blondel et al., and Ronhovde and Nussinov, respectively, show excellent performance, with the additional advantage of low computational complexity, which enables one to analyze large systems.
    Comment: 12 pages, 8 figures. The software to compute the values of our general normalized mutual information is available at http://santo.fortunato.googlepages.com/inthepress
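    Such tests hinge on an agreement score between the detected partition and the planted one. As a rough illustration (this is standard normalized mutual information for flat partitions, not the paper's generalized measure for overlapping communities; all function names are ours), a minimal Python sketch:

```python
# Minimal sketch: standard normalized mutual information (NMI)
# between two flat partitions, given as per-node community labels.
import math
from collections import Counter

def nmi(part_a, part_b):
    """NMI of two partitions over the same n nodes."""
    n = len(part_a)
    ca, cb = Counter(part_a), Counter(part_b)
    joint = Counter(zip(part_a, part_b))
    # Mutual information I(A;B) from the joint label distribution
    mi = sum((nab / n) * math.log((nab * n) / (ca[a] * cb[b]))
             for (a, b), nab in joint.items())
    # Shannon entropies H(A) and H(B)
    ha = -sum((c / n) * math.log(c / n) for c in ca.values())
    hb = -sum((c / n) * math.log(c / n) for c in cb.values())
    return 2 * mi / (ha + hb) if ha + hb > 0 else 1.0

# Identical partitions (up to relabeling) score 1.
print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))  # -> 1.0
```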

    A Logical Approach to Efficient Max-SAT solving

    Weighted Max-SAT is the optimization version of SAT, and many important problems can be naturally encoded as such. Solving weighted Max-SAT is an important problem from both a theoretical and a practical point of view. In recent years, there has been considerable interest in finding efficient solving techniques. Most of this work focuses on the computation of good-quality lower bounds to be used within a branch-and-bound DPLL-like algorithm. Most often, these lower bounds are described in a procedural way, which makes it difficult to see the {\em logic} behind them. In this paper we introduce an original framework for Max-SAT that stresses the parallelism with classical SAT. We then extend the two basic SAT solving techniques: {\em search} and {\em inference}. We show that many algorithmic {\em tricks} used in state-of-the-art Max-SAT solvers are easily expressible in {\em logical} terms within our framework in a unified manner. Besides, we introduce an original search algorithm that performs a restricted amount of {\em weighted resolution} at each visited node. We empirically compare our algorithm with a variety of solving alternatives on several benchmarks. Our experiments, which constitute, to the best of our knowledge, the most comprehensive Max-SAT evaluation ever reported, show that our algorithm is generally orders of magnitude faster than any competitor.
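    To make the branch-and-bound setting concrete, here is a minimal, illustrative Python sketch of a DPLL-style search for weighted Max-SAT. It uses only the trivial lower bound (the weight of clauses already falsified); the paper's point is that stronger bounds, such as those obtained by weighted resolution, can be expressed logically. All names are ours, not the authors' solver:

```python
# Minimal sketch: branch and bound for weighted Max-SAT with the
# trivial lower bound (weight of clauses falsified so far).
def falsified_weight(clauses, assign):
    # A clause is falsified iff every literal is assigned and false.
    # Literal v > 0 means "variable v is true"; v < 0 means "false".
    return sum(w for w, lits in clauses
               if all(abs(l) in assign and assign[abs(l)] != (l > 0)
                      for l in lits))

def max_sat(clauses, n_vars):
    """Return the minimum total weight of falsified clauses."""
    best = [sum(w for w, _ in clauses)]  # cost of falsifying everything

    def solve(v, assign):
        lb = falsified_weight(clauses, assign)
        if lb >= best[0]:                # lower bound prunes this branch
            return
        if v > n_vars:                   # complete assignment reached
            best[0] = lb
            return
        for val in (True, False):        # branch on variable v
            assign[v] = val
            solve(v + 1, assign)
            del assign[v]

    solve(1, {})
    return best[0]

# x1 and not-x1 cannot both hold; the optimum falsifies weight 1.
print(max_sat([(1, [1]), (2, [-1]), (3, [1, -1])], 1))  # -> 1
```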

    Algebraic Methods in the Congested Clique

    In this work, we use algebraic methods for studying distance computation and subgraph detection tasks in the congested clique model. Specifically, we adapt parallel matrix multiplication implementations to the congested clique, obtaining an O(n^{1-2/\omega})-round matrix multiplication algorithm, where \omega < 2.3728639 is the exponent of matrix multiplication. In conjunction with known techniques from centralised algorithmics, this gives significant improvements over the previous best upper bounds in the congested clique model. The highlight results include:
    -- triangle and 4-cycle counting in O(n^{0.158}) rounds, improving upon the O(n^{1/3})-round triangle detection algorithm of Dolev et al. [DISC 2012],
    -- a (1 + o(1))-approximation of all-pairs shortest paths in O(n^{0.158}) rounds, improving upon the \tilde{O}(n^{1/2})-round (2 + o(1))-approximation algorithm of Nanongkai [STOC 2014], and
    -- computing the girth in O(n^{0.158}) rounds, which is the first non-trivial solution in this model.
    In addition, we present a novel constant-round combinatorial algorithm for detecting 4-cycles.
    Comment: This work is a merger of arxiv:1412.2109 and arxiv:1412.266
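    The centralised idea being distributed here is classical: for a simple undirected graph with adjacency matrix A, the number of triangles equals trace(A^3)/6, so fast matrix multiplication yields fast triangle counting. A minimal sequential sketch in Python (using numpy, which of course does not model the congested clique's communication rounds):

```python
# Minimal sketch: triangle counting via matrix multiplication.
# trace(A^3) counts each triangle 6 times (3 starting vertices
# times 2 directions), hence the division by 6.
import numpy as np

def count_triangles(adj):
    a = np.asarray(adj)
    return int(np.trace(a @ a @ a)) // 6

# A 4-cycle 0-1-2-3 with chord 1-3 contains exactly two triangles.
adj = [[0, 1, 0, 1],
       [1, 0, 1, 1],
       [0, 1, 0, 1],
       [1, 1, 1, 0]]
print(count_triangles(adj))  # -> 2
```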

    Shortest Distances as Enumeration Problem

    We investigate the single source shortest distance (SSSD) and all pairs shortest distance (APSD) problems as enumeration problems (on unweighted and integer-weighted graphs), meaning that the elements (u, v, d(u, v)) -- where u and v are vertices with shortest distance d(u, v) -- are produced and listed one by one without repetition. The performance is measured in the RAM model of computation with respect to preprocessing time and delay, i.e., the maximum time that elapses between two consecutive outputs. This point of view reveals that specific types of output (e.g., excluding the non-reachable pairs (u, v, \infty), or excluding the self-distances (u, u, 0)) and the order of enumeration (e.g., sorted by distance, or sorted row-wise with respect to the distance matrix) have a huge impact on the complexity of APSD, while they appear to have no effect on SSSD. In particular, we show for APSD that enumeration without output restrictions is possible with delay in the order of the average degree. Excluding non-reachable pairs, or requesting the output to be sorted by distance, increases this delay to the order of the maximum degree. Further, for weighted graphs, a delay in the order of the average degree is also not possible without preprocessing or without considering self-distances as output. In contrast, for SSSD we find that a delay in the order of the maximum degree without preprocessing is attainable and unavoidable under any of these requirements.
    Comment: Updated version adds the study of space complexity
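    For intuition, on unweighted graphs a plain BFS already realizes SSSD enumeration sorted by distance: each triple is emitted as soon as its vertex is discovered, and the delay between consecutive outputs is bounded by the time to scan one adjacency list. A minimal generator sketch (our own illustration, not the paper's construction):

```python
# Minimal sketch: SSSD as an enumeration problem on an unweighted
# graph. Triples (u, v, d(u, v)) are yielded one by one, sorted by
# distance; non-reachable pairs are simply never produced.
from collections import deque

def enumerate_sssd(adj, source, include_self=True):
    """adj: dict mapping each vertex to an iterable of neighbours."""
    dist = {source: 0}
    queue = deque([source])
    if include_self:
        yield (source, source, 0)      # the self-distance (u, u, 0)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:          # first discovery fixes d(u, v)
                dist[v] = dist[u] + 1
                queue.append(v)
                yield (source, v, dist[v])

adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
for triple in enumerate_sssd(adj, 0):
    print(triple)   # (0,0,0), (0,1,1), (0,2,1), (0,3,2)
```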

    Improved Parallel Algorithms for Spanners and Hopsets

    We use exponential start time clustering to design faster and more work-efficient parallel graph algorithms involving distances. Previous algorithms usually rely on graph decomposition routines with strict restrictions on the diameters of the decomposed pieces. We weaken these bounds in favor of stronger local probabilistic guarantees. This allows more direct analyses of the overall process, giving:
    * linear-work parallel algorithms that construct spanners with O(k) stretch and size O(n^{1+1/k}) in unweighted graphs, and size O(n^{1+1/k} \log k) in weighted graphs, and
    * hopsets that lead to the first parallel algorithm for approximating shortest paths in undirected graphs with O(m\;\mathrm{polylog}\;n) work.
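    For context, the sequential baseline such constructions are measured against is the classic greedy spanner: scan the edges and keep one only if the spanner built so far offers no sufficiently short detour between its endpoints, giving stretch 2k-1 and O(n^{1+1/k}) edges. A minimal sketch for unweighted graphs, using networkx for the distance queries (our illustration, not the paper's parallel algorithm):

```python
# Minimal sketch: the classic sequential greedy (2k-1)-spanner for
# unweighted graphs. An edge is kept only if the current spanner has
# no path of length at most 2k-1 between its endpoints.
import networkx as nx

def greedy_spanner(g, k):
    spanner = nx.Graph()
    spanner.add_nodes_from(g.nodes)
    for u, v in g.edges:
        try:
            d = nx.shortest_path_length(spanner, u, v)
        except nx.NetworkXNoPath:
            d = float("inf")
        if d > 2 * k - 1:          # no short detour yet: keep the edge
            spanner.add_edge(u, v)
    return spanner

g = nx.complete_graph(8)
s = greedy_spanner(g, 2)           # a 3-spanner of K_8
print(g.number_of_edges(), s.number_of_edges())
```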

    Using Triangles to Improve Community Detection in Directed Networks

    In a graph, a community may be loosely defined as a group of nodes that are more closely connected to one another than to the rest of the graph. While there are a variety of metrics that can be used to specify the quality of a given community, one common theme is that flows tend to stay within communities. Hence, we expect cycles to play an important role in community detection. For undirected graphs, the importance of triangles -- undirected 3-cycles -- has long been known and can be used to improve community detection. In directed graphs, the situation is more nuanced. The smallest cycle is simply two nodes with a reciprocal connection, and using information about reciprocation has been shown to improve community detection. Our new idea is based on the four types of directed triangles that contain cycles. To identify communities in directed networks, we propose an undirected edge-weighting scheme based on the types of directed triangles in which edges are involved. We also propose a new metric for the quality of communities, based on the number of 3-cycles that are split across communities. To demonstrate the impact of our weighting, we use the standard METIS graph partitioning tool to determine communities and show experimentally that the resulting partitions cut fewer 3-cycles. The magnitude of the effect varies between a 10% and a 50% reduction, and we also find evidence that this weighting scheme improves a task where plausible ground-truth communities are known.
    Comment: 10 pages, 3 figures
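    As a rough illustration of triangle-based weighting (a simplified stand-in for the paper's scheme, which distinguishes the four directed triangle types), the sketch below weights each undirected edge by the number of directed 3-cycles passing through it; the resulting weighted symmetric graph could then be handed to a partitioner such as METIS. All names are hypothetical:

```python
# Minimal sketch: weight each undirected edge {u, v} by the number
# of directed 3-cycles that pass through it.
from itertools import combinations

def cycle_edge_weights(edges, nodes):
    """edges: set of directed (u, v) pairs. Returns {frozenset: weight}."""
    w = {}
    for a, b, c in combinations(nodes, 3):
        # Each vertex triple admits two cyclic orientations.
        for x, y, z in ((a, b, c), (a, c, b)):
            if (x, y) in edges and (y, z) in edges and (z, x) in edges:
                for e in ((x, y), (y, z), (z, x)):
                    key = frozenset(e)
                    w[key] = w.get(key, 0) + 1
    return w

# One directed 3-cycle 0 -> 1 -> 2 -> 0: each of its edges gets weight 1;
# the acyclic edge (0, 3) gets none.
edges = {(0, 1), (1, 2), (2, 0), (0, 3)}
print(cycle_edge_weights(edges, range(4)))
```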

    Partitioning networks into cliques: a randomized heuristic approach

    In the context of community detection in social networks, the term community can be defined in the strict sense that everybody within the community should know each other. We consider the corresponding community detection problem: we search for a partitioning of a network into the minimum number of non-overlapping cliques, such that the cliques cover all vertices. This problem is called the clique covering problem (CCP) and is one of the classical NP-hard problems. For CCP, we propose a randomized heuristic approach. To construct a high-quality solution to CCP, we present an iterated greedy (IG) algorithm. IG can also be combined with a heuristic that determines how far the algorithm is from the optimum in the worst case; a randomized local search (RLS) for maximum independent set is proposed to find such a bound. The experimental results of IG and the bounds obtained by RLS indicate that IG is a very suitable technique for solving CCP in real-world graphs. In addition, we summarize our basic rigorous results, which were developed for the analysis of IG and for understanding its behavior on several relevant graph classes.
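    To illustrate the greedy core that an iterated greedy (IG) heuristic repeatedly destroys and rebuilds, here is a minimal Python sketch of one constructive pass for the clique covering problem; it is our simplification, not the authors' IG or RLS:

```python
# Minimal sketch: one greedy constructive pass for clique covering.
# Grow cliques one at a time by absorbing vertices adjacent to every
# current member; an IG heuristic would iterate destruction and
# reconstruction of parts of this cover.
def greedy_clique_cover(adj):
    """adj: dict vertex -> set of neighbours. Returns a list of
    vertex-disjoint cliques (sets) covering all vertices."""
    uncovered = set(adj)
    cliques = []
    while uncovered:
        # Seed with the vertex of highest remaining degree.
        seed = max(uncovered, key=lambda v: len(adj[v] & uncovered))
        clique = {seed}
        for v in sorted(uncovered - {seed}):
            if clique <= adj[v]:   # v is adjacent to all clique members
                clique.add(v)
        uncovered -= clique
        cliques.append(clique)
    return cliques

# A triangle plus a disjoint edge is covered by exactly two cliques.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4}, 4: {3}}
print(greedy_clique_cover(adj))   # -> [{0, 1, 2}, {3, 4}]
```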