    Extremal Optimization for Graph Partitioning

    Extremal optimization is a new general-purpose method for approximating solutions to hard optimization problems. We study the method in detail by way of the NP-hard graph partitioning problem. We discuss the scaling behavior of extremal optimization, focusing on the convergence of the average run as a function of runtime and system size. The method has a single free parameter, which we determine numerically and justify using a simple argument. Our numerical results demonstrate that on random graphs, extremal optimization maintains consistent accuracy for increasing system sizes, with an approximation error decreasing over runtime roughly as a power law t^(-0.4). On geometrically structured graphs, the scaling of results from the average run suggests that these are far from optimal, with large fluctuations between individual trials. But when only the best runs are considered, results consistent with theoretical arguments are recovered.
    Comment: 34 pages, RevTex4, 1 table and 20 ps-figures included, related papers available at http://www.physics.emory.edu/faculty/boettcher
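
    The abstract above describes extremal optimization with a single free parameter but does not give the update rule. The following is a minimal sketch of one common formulation for balanced graph bipartitioning, not the authors' code. It assumes the parameter is the exponent usually written tau, that a vertex's fitness is the fraction of its edges staying inside its own partition, and that the default values (tau=1.4, steps=2000) are purely illustrative. The function names are hypothetical.

```python
import random


def cut_size(adj, side):
    """Count edges whose endpoints lie in different partitions."""
    return sum(1 for u in adj for v in adj[u] if u < v and side[u] != side[v])


def extremal_optimization(adj, tau=1.4, steps=2000, seed=0):
    """Sketch of tau-style extremal optimization for balanced bipartitioning.

    adj maps each integer vertex to the set of its neighbours.
    tau and steps are illustrative defaults (assumptions of this sketch).
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    n = len(nodes)

    # Balanced random start: alternate sides over a shuffled vertex order.
    shuffled = nodes[:]
    rng.shuffle(shuffled)
    side = {v: i % 2 for i, v in enumerate(shuffled)}

    # Rank k (1 = least fit) is selected with probability proportional to k^(-tau).
    weights = [k ** (-tau) for k in range(1, n + 1)]

    def fitness(v):
        degree = len(adj[v])
        if degree == 0:
            return 1.0  # isolated vertices cannot contribute to the cut
        good = sum(1 for w in adj[v] if side[w] == side[v])
        return good / degree

    best_side, best_cut = dict(side), cut_size(adj, side)
    for _ in range(steps):
        # Sort from least to most fit, breaking ties at random.
        ranked = sorted(nodes, key=lambda v: (fitness(v), rng.random()))
        v = rng.choices(ranked, weights=weights, k=1)[0]
        # Swap with a random vertex from the other side to keep the balance.
        w = rng.choice([x for x in nodes if side[x] != side[v]])
        side[v], side[w] = side[w], side[v]
        # The move is always accepted; only the best configuration is remembered.
        cut = cut_size(adj, side)
        if cut < best_cut:
            best_side, best_cut = dict(side), cut
    return best_side, best_cut
```

    Unlike simulated annealing, this sketch never rejects a move: the search keeps fluctuating and simply records the best cut seen so far. Recomputing all fitnesses at every step is chosen for readability, not speed.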

    Distributed Graph Embedding with Information-Oriented Random Walks

    Graph embedding maps graph nodes to low-dimensional vectors, and is widely adopted in machine learning tasks. The increasing availability of billion-edge graphs underscores the importance of learning efficient and effective embeddings on large graphs, such as link prediction on Twitter with over one billion edges. Most existing graph embedding methods fall short of reaching high data scalability. In this paper, we present a general-purpose, distributed, information-centric random walk-based graph embedding framework, DistGER, which can scale to embed billion-edge graphs. DistGER incrementally computes information-centric random walks. It further leverages a multi-proximity-aware, streaming, parallel graph partitioning strategy, simultaneously achieving high local partition quality and excellent workload balancing across machines. DistGER also improves the distributed Skip-Gram learning model to generate node embeddings by optimizing the access locality, CPU throughput, and synchronization efficiency. Experiments on real-world graphs demonstrate that compared to state-of-the-art distributed graph embedding frameworks, including KnightKing, DistDGL, and PyTorch-BigGraph, DistGER exhibits 2.33x-129x acceleration, 45% reduction in cross-machine communication, and > 10% effectiveness improvement in downstream tasks.

    Extremal Optimization of Graph Partitioning at the Percolation Threshold

    The benefits of a recently proposed method to approximate hard optimization problems are demonstrated on the graph partitioning problem. The performance of this new method, called Extremal Optimization, is compared to Simulated Annealing in extensive numerical simulations. While generally a complex (NP-hard) problem, the optimization of the graph partitions is particularly difficult for sparse graphs with average connectivities near the percolation threshold. At this threshold, the relative error of Simulated Annealing for large graphs is found to diverge relative to Extremal Optimization at equalized runtime. On the other hand, Extremal Optimization, based on the extremal dynamics of self-organized critical systems, reproduces known results about optimal partitions at this critical point quite well.
    Comment: 7 pages, RevTex, 9 ps-figures included, as to appear in Journal of Physics

    Random Geometric Graphs

    We analyse graphs in which each vertex is assigned random coordinates in a geometric space of arbitrary dimensionality and only edges between adjacent points are present. The critical connectivity is found numerically by examining the size of the largest cluster. We derive an analytical expression for the cluster coefficient which shows that the graphs are distinctly different from standard random graphs, even for infinite dimensionality. Insights relevant for graph bi-partitioning are included.
    Comment: 16 pages, 10 figures. Minor changes. Added reference
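
    The construction described above can be sketched as follows. This is an illustration under stated assumptions rather than the paper's procedure: points are drawn uniformly in the unit hypercube, "adjacent" is taken to mean Euclidean distance at most `radius`, and boundaries are not periodic. The function names are hypothetical.

```python
import random
from collections import deque


def random_geometric_graph(n, radius, dim=2, seed=0):
    """Place n points uniformly in [0, 1]^dim and link pairs within `radius`."""
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n)]
    adj = {i: set() for i in range(n)}
    r2 = radius * radius
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            if d2 <= r2:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def largest_cluster_size(adj):
    """Size of the largest connected component, found by breadth-first search."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best
```

    Sweeping `radius` and recording `largest_cluster_size(adj) / n` gives a rough numerical view of where a giant cluster appears. The all-pairs distance loop is quadratic in `n`, so a cell grid would be needed for large graphs.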

    Parallel Graph Partitioning for Complex Networks

    Processing large complex networks like social networks or web graphs has recently attracted considerable interest. In order to do this in parallel, we need to partition them into pieces of about equal size. Unfortunately, previous parallel graph partitioners originally developed for more regular mesh-like networks do not work well for these networks. This paper addresses this problem by parallelizing and adapting the label propagation technique originally developed for graph clustering. By introducing size constraints, label propagation becomes applicable for both the coarsening and the refinement phase of multilevel graph partitioning. We obtain very high quality by applying a highly parallel evolutionary algorithm to the coarsened graph. The resulting system is both more scalable and achieves higher quality than state-of-the-art systems like ParMetis or PT-Scotch. For large complex networks the performance differences are very big. For example, our algorithm can partition a web graph with 3.3 billion edges in less than sixteen seconds using 512 cores of a high performance cluster while producing a high quality partition -- none of the competing systems can handle this graph on our system.
    Comment: Review article. Parallelization of our previous approach arXiv:1402.328
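
    The idea of label propagation with a size constraint, as mentioned above, can be illustrated with a small sequential sketch. It is not the parallel system from the paper: it assumes unit node weights, unweighted edges, and a hypothetical `max_block_size` bound, and the function name is invented for illustration.

```python
import random
from collections import Counter


def size_constrained_label_propagation(adj, max_block_size, rounds=10, seed=0):
    """Sequential sketch: each node adopts the most frequent neighbouring label,
    but only if the target block still has room (unit node weights assumed)."""
    rng = random.Random(seed)
    label = {v: v for v in adj}        # every node starts in its own block
    size = Counter(label.values())     # current number of nodes per block
    order = list(adj)
    for _ in range(rounds):
        rng.shuffle(order)
        moved = False
        for v in order:
            counts = Counter(label[w] for w in adj[v])
            current = label[v]
            best, best_count = current, counts.get(current, 0)
            for block, count in counts.items():
                has_room = size[block] + 1 <= max_block_size
                # Move only for a strict improvement, which avoids oscillation.
                if block != current and has_room and count > best_count:
                    best, best_count = block, count
            if best != current:
                size[current] -= 1
                size[best] += 1
                label[v] = best
                moved = True
        if not moved:
            break                      # converged: no node changed its label
    return label
```

    In a multilevel setting, the blocks returned here would be contracted into single nodes to form the coarser graph. The size bound keeps any one block from absorbing a whole dense region, which is what makes the result usable for balanced partitioning.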