    Cut Size Statistics of Graph Bisection Heuristics

    We investigate the statistical properties of cut sizes generated by heuristic algorithms that approximately solve the graph bisection problem. On an ensemble of sparse random graphs, we find empirically that the distribution of the cut sizes found by ``local'' algorithms becomes peaked as the number of vertices in the graphs becomes large. Evidence is given that this distribution tends towards a Gaussian whose mean and variance scale linearly with the number of vertices of the graphs. Given the distribution of cut sizes associated with each heuristic, we provide a ranking procedure which takes into account both the quality of the solutions and the speed of the algorithms. This procedure is demonstrated for a selection of local graph bisection heuristics. Comment: 17 pages, 5 figures, submitted to SIAM Journal on Optimization; also available at http://ipnweb.in2p3.fr/~martin
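    The kind of experiment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the graph model (Erdos-Renyi with a fixed mean degree), the greedy pair-swap local search, and all function names are hypothetical choices made here. It samples cut sizes over an ensemble of sparse random graphs and reports their sample mean and variance.

```python
import random
import statistics

def random_sparse_graph(n, mean_degree, rng):
    """Erdos-Renyi graph with edge probability mean_degree / (n - 1), as adjacency sets."""
    p = mean_degree / (n - 1)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def cut_size(adj, side):
    """Number of edges whose endpoints lie on different sides."""
    return sum(1 for u in adj for v in adj[u] if u < v and side[u] != side[v])

def swap_gain(adj, side, a, b):
    """Reduction in cut size if a and b (on opposite sides) exchange sides."""
    def d(v):
        # external degree minus internal degree of v
        ext = sum(1 for w in adj[v] if side[w] != side[v])
        return 2 * ext - len(adj[v])
    return d(a) + d(b) - 2 * (1 if b in adj[a] else 0)

def local_bisection(adj, rng):
    """Random balanced start, then greedy pair swaps until no swap improves the cut."""
    nodes = list(adj)
    rng.shuffle(nodes)
    half = len(nodes) // 2
    side = {v: (i < half) for i, v in enumerate(nodes)}
    improved = True
    while improved:
        improved = False
        left = [v for v in adj if side[v]]
        right = [v for v in adj if not side[v]]
        for a in left:
            for b in right:
                if swap_gain(adj, side, a, b) > 0:
                    side[a], side[b] = side[b], side[a]
                    improved = True
                    break
            if improved:
                break
    return side

# Sample the cut-size distribution over a small ensemble (sizes chosen arbitrarily).
rng = random.Random(0)
cuts = []
for _ in range(30):
    g = random_sparse_graph(40, 3.0, rng)
    cuts.append(cut_size(g, local_bisection(g, rng)))
print(statistics.mean(cuts), statistics.variance(cuts))
```

    Repeating the sampling for increasing numbers of vertices would show how the mean and variance grow with graph size; the small sizes used here are only meant to keep the sketch fast.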

    Throughput-driven Partitioning of Stream Programs on Heterogeneous Distributed Systems

    This is an Open Access article. © 2015 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. Graph partitioning is an important problem in computer science and is NP-hard. In practice it is usually solved using heuristics. In this article we introduce the use of graph partitioning to partition the workload of stream programs so as to optimise throughput on heterogeneous distributed platforms. Existing graph partitioning heuristics are not adequate for this problem domain. We present two new heuristics that capture the problem space of graph partitioning for stream programs with throughput as the objective. The first algorithm is an adaptation of the well-known Kernighan-Lin algorithm, called KL-Adapted (KLA), which is relatively slow. As a second algorithm we have developed the Congestion Avoidance (CA) partitioning algorithm, which performs reconfiguration moves optimised for our problem type. We compare both KLA and CA with the generic meta-heuristic Simulated Annealing (SA). All three methods achieve similar throughput results in most cases, but with significant differences in calculation time. For small graphs KLA is faster than SA, but KLA is slower for larger graphs. CA, on the other hand, is always orders of magnitude faster than both KLA and SA, even for large graphs. This makes CA potentially useful for re-partitioning of systems during runtime. Peer reviewed. Final published version.

    Considerations about multistep community detection

    The problem of community detection in networks, and its implications, have attracted considerable attention because of important applications in both the natural and social sciences. A number of algorithms have been developed to solve this problem, addressing either speed or the quality of the partitions calculated. In this paper we propose a multi-step procedure bridging the fastest but less accurate algorithms (coarse clustering) with the slowest, most effective ones (refinement). By adopting a heuristic ranking of the nodes and classifying a fraction of them as `critical', a refinement step can be restricted to this subset of the network, thus saving computational time. Preliminary numerical results are discussed, showing improvement of the final partition. Comment: 12 pages

    Graph Partitioning: A Heuristic Procedure to Partition Network Graphs

    Graphs are mathematical structures used to model pairwise relationships between objects of a certain collection. A graph consists of a collection of vertices or “nodes” and a collection of edges that connect these nodes. Graphs can be directed from one vertex to another or undirected. In our context, a graph denotes a network with computers as nodes and communication channels as edges. These are directed graphs where each edge has a capacity which cannot be exceeded. In real-life applications, it is often essential that graphs are partitioned so as to satisfy certain conditions. For example, when placing components of an electronic circuit on circuit boards or substrates, components that are highly dependent on each other (exchanging the most information) should be placed on the same board. It is also important that the number of connections between these boards be minimized. A similar situation arises in a computer network where computer systems are distributed over a wide geographic area. This is the basis of the graph partitioning problem. The classical graph partitioning problem consists of dividing a graph into pieces such that the pieces are of about the same size and there are very few connections between them. The objective is to partition the nodes of a graph with costs on its edges into subsets so as to minimize the sum of the costs on all edges that are cut. Let G be a graph with n nodes of sizes (weights) $w_i > 0$, $i = 1, 2, \ldots, n$. Let p be a positive number such that $0 < w_i < p$ for all i. Let $C = (c_{ij})$, $i, j = 1, 2, \ldots, n$, be a weighted connectivity matrix describing the edges of G. Let k be a positive integer. A k-way partition of G is a set of disjoint subsets of G, $v_1, v_2, \ldots, v_k$, such that $\bigcup_{i=1}^{k} v_i = G$. A partition is admissible if $|v_i| \le p$ for all i, where $|v_i|$ denotes the total weight of the nodes in $v_i$. The cost of a partition is the sum of $c_{ij}$ over all i and j that belong to different subsets.
A strictly exhaustive procedure for finding the optimal partition is out of the question because graph partitioning is an NP-hard problem. For a graph with 40 nodes and 4 partitions, the number of possible partitions is of the order of $10^{36}$. Hence, any direct approach to finding an optimal solution among this many cases is not feasible, and heuristic approaches are employed instead. We use a heuristic partitioning algorithm that divides a network into 2 disjoint sets based on the distance between any two nodes. The network used is a real network, ARPANET, which is regarded as the origin of the Internet.
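    The cost and admissibility definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' heuristic: the function names `partition_cost` and `is_admissible`, the dictionary-based cost representation, and the 4-node example are assumptions made here, and admissibility is read as "the total weight of each subset is at most p".

```python
def partition_cost(cost, parts):
    """Sum c_ij over all edges (i, j) whose endpoints lie in different subsets."""
    block = {v: k for k, part in enumerate(parts) for v in part}
    return sum(c for (i, j), c in cost.items() if block[i] != block[j])

def is_admissible(weights, parts, p):
    """Assumed reading: every subset has total node weight at most p."""
    return all(sum(weights[v] for v in part) <= p for part in parts)

# Hypothetical example: undirected edge costs keyed by (i, j) with i < j.
cost = {(0, 1): 3, (1, 2): 1, (2, 3): 4, (0, 3): 2}
weights = {0: 1, 1: 1, 2: 1, 3: 1}
parts = [{0, 1}, {2, 3}]

print(partition_cost(cost, parts))         # edges (1, 2) and (0, 3) are cut
print(is_admissible(weights, parts, p=2))  # each subset has total weight 2
```

    A heuristic would search over admissible partitions for one with low cost; evaluating these two functions for every candidate is exactly what becomes infeasible under exhaustive enumeration.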

    PT-Scotch: A tool for efficient parallel graph ordering

    The parallel ordering of large graphs is a difficult problem because, on the one hand, minimum degree algorithms do not parallelize well and, on the other hand, obtaining high-quality orderings with the nested dissection algorithm requires efficient graph bipartitioning heuristics, the best sequential implementations of which are also hard to parallelize. This paper presents a set of algorithms, implemented in the PT-Scotch software package, which allows one to order large graphs in parallel, yielding orderings whose quality is only slightly worse than that of state-of-the-art sequential algorithms. Our implementation uses the classical nested dissection approach but relies on several novel features to solve the parallel graph bipartitioning problem. Thanks to these improvements, PT-Scotch produces consistently better orderings than ParMeTiS on large numbers of processors.

    The stability of a graph partition: A dynamics-based framework for community detection

    Recent years have seen a surge of interest in the analysis of complex networks, facilitated by the availability of relational data and the increasingly powerful computational resources that can be employed for their analysis. Naturally, the study of real-world systems leads to highly complex networks, and a current challenge is to extract intelligible, simplified descriptions from the network in terms of relevant subgraphs, which can provide insight into the structure and function of the overall system. Sparked by seminal work by Newman and Girvan, an interesting line of research has been devoted to investigating modular community structure in networks, revitalising the classic problem of graph partitioning. However, modular or community structure in networks has notoriously evaded rigorous definition. The most accepted notion of community is perhaps that of a group of elements which exhibit a stronger level of interaction among themselves than with the elements outside the community. This concept has resulted in a plethora of computational methods and heuristics for community detection. Nevertheless, a firm theoretical understanding of most of these methods, in terms of how they operate and what they are supposed to detect, is still lacking to date. Here, we develop a dynamical perspective on community detection that enables us to define a measure named the stability of a graph partition. It will be shown that a number of heuristics for community detection previously defined ad hoc can be seen as particular cases of our method, providing us with a dynamic reinterpretation of those measures. Our dynamics-based approach thus serves as a unifying framework to gain a deeper understanding of different aspects and problems associated with community detection and allows us to propose new dynamically-inspired criteria for community structure. Comment: 3 figures; published as book chapter
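    One way to make the dynamical viewpoint concrete is the clustered autocovariance of a random walk, sketched below. This is an illustrative simplification and an assumption on our part, not the chapter's full definition of stability: for an undirected graph with adjacency matrix A, degrees d_i, stationary distribution pi_i = d_i / 2m and transition matrix M = D^{-1} A, it sums pi_i (M^t)_{ij} - pi_i pi_j over all pairs (i, j) in the same community. At t = 1 this quantity coincides with Newman-Girvan modularity, which the example checks numerically. All function names and the example graph are hypothetical.

```python
def mat_mul(a, b):
    """Plain dense matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def autocovariance(adj, labels, t):
    """Clustered autocovariance after t random-walk steps (no isolated nodes)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    two_m = sum(deg)
    pi = [d / two_m for d in deg]                 # stationary distribution
    m = [[adj[i][j] / deg[i] for j in range(n)] for i in range(n)]
    mt = [[float(i == j) for j in range(n)] for i in range(n)]  # M^0 = identity
    for _ in range(t):
        mt = mat_mul(mt, m)
    return sum(pi[i] * mt[i][j] - pi[i] * pi[j]
               for i in range(n) for j in range(n) if labels[i] == labels[j])

def modularity(adj, labels):
    """Newman-Girvan modularity of an undirected, unweighted graph."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    two_m = sum(deg)
    return sum(adj[i][j] / two_m - deg[i] * deg[j] / two_m ** 2
               for i in range(n) for j in range(n) if labels[i] == labels[j])

# Hypothetical example: two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1
labels = [0, 0, 0, 1, 1, 1]

for t in (1, 2, 4):
    print(t, autocovariance(adj, labels, t))
print(modularity(adj, labels))
```

    Varying t acts as a resolution parameter in this sketch: short times favour fine partitions and longer times favour coarser ones.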

    Relaxation-Based Coarsening for Multilevel Hypergraph Partitioning

    Multilevel partitioning methods that are inspired by principles of multiscaling are the most powerful practical hypergraph partitioning solvers. Hypergraph partitioning has many applications in disciplines ranging from scientific computing to data science. In this paper we introduce the concept of algebraic distance on hypergraphs and demonstrate its use as an algorithmic component in the coarsening stage of multilevel hypergraph partitioning solvers. The algebraic distance is a vertex distance measure that extends hyperedge weights to capture the local connectivity of vertices, which is critical for hypergraph coarsening schemes. The practical effectiveness of the proposed measure and the corresponding coarsening scheme is demonstrated through extensive computational experiments on a diverse set of problems. Finally, we propose a benchmark of hypergraph partitioning problems to compare the quality of other solvers.