
    Kirchhoff Index As a Measure of Edge Centrality in Weighted Networks: Nearly Linear Time Algorithms

    Most previous work on centrality focuses on metrics of vertex importance and methods for identifying powerful vertices, while related work on edges is much scarcer, especially for weighted networks, due to the computational challenge. In this paper, we propose to use the well-known Kirchhoff index as the measure of edge centrality in weighted networks, called $\theta$-Kirchhoff edge centrality. The Kirchhoff index of a network is defined as the sum of effective resistances over all vertex pairs. The centrality of an edge $e$ is reflected in the increase of the Kirchhoff index of the network when the edge $e$ is partially deactivated, characterized by a parameter $\theta$. We define two equivalent measures for $\theta$-Kirchhoff edge centrality. Both are global metrics and have better discriminating power than commonly used measures based on local or partial structural information of networks, e.g. edge betweenness and spanning edge centrality. Despite the strong advantages of the Kirchhoff index as a centrality measure and its wide applications, computing the exact value of Kirchhoff edge centrality for each edge in a graph is computationally demanding. To solve this problem, for each of the $\theta$-Kirchhoff edge centrality metrics, we present an efficient algorithm to compute its $\epsilon$-approximation for all the $m$ edges in nearly linear time in $m$. The proposed $\theta$-Kirchhoff edge centrality is the first global metric of edge importance that can be provably approximated in nearly linear time. Moreover, based on the $\theta$-Kirchhoff edge centrality, we present a $\theta$-Kirchhoff vertex centrality measure, as well as a fast algorithm that can compute $\epsilon$-approximate Kirchhoff vertex centrality for all the $n$ vertices in nearly linear time in $m$.
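
The definitions in this abstract can be illustrated with a small exact computation. The sketch below is not the paper's nearly-linear-time algorithm: it is a brute-force baseline for tiny graphs, and it assumes that "partially deactivating" an edge means multiplying its weight by a factor `theta` in (0, 1), which the abstract does not spell out. The function names (`laplacian`, `inverse`, `kirchhoff_index`, `theta_edge_centrality`) are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def laplacian(n, edges):
    """Weighted graph Laplacian as a list of lists of Fractions."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v, w in edges:
        w = Fraction(w)
        L[u][u] += w
        L[v][v] += w
        L[u][v] -= w
        L[v][u] -= w
    return L


def inverse(A):
    """Gauss-Jordan inverse of a small nonsingular matrix of Fractions."""
    n = len(A)
    M = [list(row) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        # Pick any row with a nonzero pivot (fails if the matrix is singular).
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]


def kirchhoff_index(n, edges):
    """Sum of effective resistances over all vertex pairs (connected graph)."""
    L = laplacian(n, edges)
    # Ground the last vertex: drop its row and column, invert what remains.
    G = inverse([row[:-1] for row in L[:-1]])

    def g(i, j):
        return G[i][j] if i < n - 1 and j < n - 1 else Fraction(0)

    # R(u, v) = g(u, u) + g(v, v) - 2 g(u, v) for the grounded inverse g.
    return sum(g(u, u) + g(v, v) - 2 * g(u, v)
               for u, v in combinations(range(n), 2))


def theta_edge_centrality(n, edges, idx, theta):
    """Increase in Kirchhoff index when edge `idx` has its weight scaled
    by `theta` (assumed reading of 'partially deactivated')."""
    damped = list(edges)
    u, v, w = damped[idx]
    damped[idx] = (u, v, Fraction(w) * theta)
    return kirchhoff_index(n, damped) - kirchhoff_index(n, edges)


# Triangle 0-1-2 with a pendant vertex 3 attached by the bridge (2, 3).
edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1), (2, 3, 1)]
half = Fraction(1, 2)
for i, (u, v, _) in enumerate(edges):
    print((u, v), theta_edge_centrality(4, edges, i, half))
```

Under this assumption the bridge scores higher than any edge on the triangle, which matches the intuition that weakening the only connection to a vertex raises resistances across the whole network.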

    Sampling Random Spanning Trees Faster than Matrix Multiplication

    We present an algorithm that, with high probability, generates a random spanning tree from an edge-weighted undirected graph in $\tilde{O}(n^{4/3}m^{1/2}+n^{2})$ time (the $\tilde{O}(\cdot)$ notation hides $\operatorname{polylog}(n)$ factors). The tree is sampled from a distribution where the probability of each tree is proportional to the product of its edge weights. This improves upon the previous best algorithm due to Colbourn et al. that runs in matrix multiplication time, $O(n^\omega)$. For the special case of unweighted graphs, this improves upon the best previously known running time of $\tilde{O}(\min\{n^{\omega},m\sqrt{n},m^{4/3}\})$ for $m \gg n^{5/3}$ (Colbourn et al. '96, Kelner-Madry '09, Madry et al. '15). The effective resistance metric is essential to our algorithm, as in the work of Madry et al., but we eschew determinant-based and random walk-based techniques used by previous algorithms. Instead, our algorithm is based on Gaussian elimination, and the fact that effective resistance is preserved in the graph resulting from eliminating a subset of vertices (called a Schur complement). As part of our algorithm, we show how to compute $\epsilon$-approximate effective resistances for a set $S$ of vertex pairs via approximate Schur complements in $\tilde{O}(m+(n + |S|)\epsilon^{-2})$ time, without using the Johnson-Lindenstrauss lemma, which requires $\tilde{O}(\min\{(m + |S|)\epsilon^{-2}, m+n\epsilon^{-4} +|S|\epsilon^{-2}\})$ time. We combine this approximation procedure with an error correction procedure for handling edges where our estimate isn't sufficiently accurate.
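
The fact this abstract relies on, that effective resistance between the remaining vertices is preserved when other vertices are eliminated, can be checked on a small graph. The sketch below uses exact Schur complements and exact rational arithmetic; it is not the paper's approximate procedure and it does not sample spanning trees. The helper names (`laplacian`, `inverse`, `schur_complement`, `effective_resistance`) and the example graph are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def laplacian(n, edges):
    """Weighted graph Laplacian as a list of lists of Fractions."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v, w in edges:
        w = Fraction(w)
        L[u][u] += w
        L[v][v] += w
        L[u][v] -= w
        L[v][u] -= w
    return L


def inverse(A):
    """Gauss-Jordan inverse of a small nonsingular matrix of Fractions."""
    n = len(A)
    M = [list(row) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]


def schur_complement(L, keep):
    """Eliminate every vertex not in `keep`:
    S = L_KK - L_KE * inv(L_EE) * L_EK, indexed in the order of `keep`."""
    elim = [i for i in range(len(L)) if i not in keep]
    Lee_inv = inverse([[L[i][j] for j in elim] for i in elim])
    S = []
    for a in keep:
        row = []
        for b in keep:
            corr = sum(L[a][i] * Lee_inv[x][y] * L[j][b]
                       for x, i in enumerate(elim)
                       for y, j in enumerate(elim))
            row.append(L[a][b] - corr)
        S.append(row)
    return S


def effective_resistance(L, u, v):
    """R(u, v) from a Laplacian: ground v, invert the rest, read entry (u, u)."""
    idx = [i for i in range(len(L)) if i != v]
    G = inverse([[L[i][j] for j in idx] for i in idx])
    k = idx.index(u)
    return G[k][k]


# Small weighted graph on 5 vertices; eliminate vertices 1 and 3.
edges = [(0, 1, 1), (1, 2, 2), (2, 3, 1), (3, 4, 3), (4, 0, 1), (1, 3, 1)]
L = laplacian(5, edges)
keep = [0, 2, 4]
S = schur_complement(L, keep)

for a, b in combinations(range(len(keep)), 2):
    # Resistance in the reduced graph versus the original graph.
    print(keep[a], keep[b],
          effective_resistance(S, a, b),
          effective_resistance(L, keep[a], keep[b]))
```

Each printed pair shows the same value twice: the reduced matrix `S` is itself the Laplacian of a smaller weighted graph on the kept vertices, and resistances between those vertices are unchanged.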

    A design method for parallel programs

    A fast parallel algorithm for special linear systems of equations using processor arrays with reconfigurable bus systems

    A parallel algorithm using Processor Arrays with Reconfigurable Bus Systems (PARBS) has been designed to solve dense Symmetric Positive Definite (SPD) systems of equations Ax = b. The key content of this report is the parallelisation of the algorithm by Delosme & Ipson [8]. In order to design a parallel algorithm for PARBS, many procedures involved in [8] are handled in a slightly different way. The parallel time and processor complexity of each step of the algorithm are calculated. The parallel time complexity is O(n) using 2n × 2n × 5n Processing Elements.

    Static and Dynamic Scheduling for Effective Use of Multicore Systems

    Multicore systems have increasingly gained importance in high performance computers. Compared to the traditional microarchitectures, multicore architectures have a simpler design, higher performance-to-area ratio, and improved power efficiency. Although the multicore architecture has various advantages, traditional parallel programming techniques do not apply to the new architecture efficiently. This dissertation addresses how to determine optimized thread schedules to improve data reuse on shared-memory multicore systems and how to seek a scalable solution to designing parallel software on both shared-memory and distributed-memory multicore systems. We propose an analytical cache model to predict the number of cache misses on the time-sharing L2 cache on a multicore processor. The model provides an insight into the impact of cache sharing and cache contention between threads. Inspired by the model, we build the framework of affinity based thread scheduling to determine optimized thread schedules to improve data reuse on all the levels in a complex memory hierarchy. The affinity based thread scheduling framework includes a model to estimate the cost of a thread schedule, which consists of three submodels: an affinity graph submodel, a memory hierarchy submodel, and a cost submodel. Based on the model, we design a hierarchical graph partitioning algorithm to determine near-optimal solutions. We have also extended the algorithm to support threads with data dependences. The algorithms are implemented and incorporated into a feedback directed optimization prototype system. The prototype system builds upon a binary instrumentation tool and can improve program performance greatly on shared-memory multicore architectures. We also study the dynamic data-availability driven scheduling approach to designing new parallel software on distributed-memory multicore architectures. We have implemented a decentralized dynamic runtime system. The design of the runtime system is focused on the scalability metric. At any time only a small portion of a task graph exists in memory. We propose an algorithm to solve data dependences without process cooperation in a distributed manner. Our experimental results demonstrate the scalability and practicality of the approach for both shared-memory and distributed-memory multicore systems. Finally, we present a scalable nonblocking topology-aware multicast scheme for distributed DAG scheduling applications.