
    Expander $\ell_0$-Decoding

    We introduce two new algorithms, Serial-$\ell_0$ and Parallel-$\ell_0$, for solving a large underdetermined linear system of equations $y = Ax \in \mathbb{R}^m$ when it is known that $x \in \mathbb{R}^n$ has at most $k < m$ nonzero entries and that $A$ is the adjacency matrix of an unbalanced left $d$-regular expander graph. The matrices in this class are sparse and allow a highly efficient implementation. A number of algorithms have been designed to work exclusively under this setting, composing the branch of combinatorial compressed-sensing (CCS). Serial-$\ell_0$ and Parallel-$\ell_0$ iteratively minimise $\|y - A\hat x\|_0$ by successfully combining two desirable features of previous CCS algorithms: the information-preserving strategy of ER, and the parallel updating mechanism of SMP. We are able to link these elements and guarantee convergence in $\mathcal{O}(dn \log k)$ operations by assuming that the signal is dissociated, meaning that all of the $2^k$ subset sums of the support of $x$ are pairwise different. However, we observe empirically that the signal need not be exactly dissociated in practice. Moreover, we observe Serial-$\ell_0$ and Parallel-$\ell_0$ to be able to solve large scale problems with a larger fraction of nonzeros than other algorithms when the number of measurements is substantially less than the signal length; in particular, they are able to reliably solve for a $k$-sparse vector $x\in\mathbb{R}^n$ from $m$ expander measurements with $n/m=10^3$ and $k/m$ up to four times greater than what is achievable by $\ell_1$-regularization from dense Gaussian measurements. Additionally, Serial-$\ell_0$ and Parallel-$\ell_0$ are observed to be able to solve large problem sizes in substantially less time than other algorithms for compressed sensing. In particular, Parallel-$\ell_0$ is structured to take advantage of massively parallel architectures. Comment: 14 pages, 10 figures
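The dissociated-signal assumption in the abstract above (all $2^k$ subset sums of the nonzero entries pairwise different) can be checked directly for small $k$. The following is a minimal brute-force sketch, not part of the Serial-$\ell_0$ or Parallel-$\ell_0$ algorithms; the function name `is_dissociated` is hypothetical, and exact comparison is assumed, so integer or rational inputs are the safe choice.

```python
from itertools import combinations


def is_dissociated(values):
    """Return True if all 2^k subset sums of `values` are pairwise different.

    `values` holds the k nonzero entries of the signal (hypothetical input
    format). Subsets are taken by position, so repeated values are detected
    as a collision. The cost is exponential in k: for illustration only.
    """
    seen = set()
    for size in range(len(values) + 1):
        for subset in combinations(values, size):
            total = sum(subset)
            if total in seen:
                return False  # two different subsets share the same sum
            seen.add(total)
    return True


if __name__ == "__main__":
    print(is_dissociated([1, 2, 4]))  # powers of two: every subset sum is distinct
    print(is_dissociated([1, 2, 3]))  # 1 + 2 collides with 3
```

Signals drawn from a continuous distribution would pass this check with probability one, which fits the abstract's remark that exact dissociation is rarely a limitation in practice.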

    Algorithms for flows and disjoint paths in planar graphs

    In this dissertation we describe several algorithms for computing flows, connectivity, and disjoint paths in planar graphs. In all cases, the algorithms are either the first polynomial-time algorithms or are faster than all previously known algorithms. First, we describe algorithms for the maximum flow problem in directed planar graphs with integer capacities on both vertices and arcs and with multiple sources and sinks. The algorithms are the first to solve the problem in near-linear time when the number of terminals is fixed and the capacities are polynomially bounded. As a byproduct, we get the first algorithm to solve the vertex-disjoint S-T paths problem in near-linear time when the number of terminals is fixed but greater than 2. We also modify our algorithms to handle real capacities in near-linear time when there are three terminals. Second, we describe algorithms to compute element-connectivity and a related structure called the reduced graph. We show that global element-connectivity in planar graphs can be found in linear time if the terminals can be covered by O(1) faces. We also show that the reduced graph can be computed in subquadratic time in planar graphs if the number of terminals is fixed. Third, we describe algorithms for solving or approximately solving the vertex-disjoint paths problem when we want to minimize the total length of the paths. For planar graphs, we describe: (1) an exact algorithm for the case of four pairs of terminals on a single face; and (2) a k-approximation algorithm for the case of k pairs of terminals on a single face. Fourth, we describe algorithms and a hardness result for the ideal orientation problem. We show that the problem is NP-hard in planar graphs. On the other hand, we show that the problem is polynomial-time solvable in planar graphs when the number of terminals is fixed, the terminals are all on the same face, and no two of the terminal pairs cross. We also describe an algorithm for serial instances of a generalization of the ideal orientation problem called the k-min-sum orientation problem.

    A Parallel Algorithm for Exact Bayesian Structure Discovery in Bayesian Networks

    Exact Bayesian structure discovery in Bayesian networks requires exponential time and space. Using dynamic programming (DP), the fastest known sequential algorithm computes the exact posterior probabilities of structural features in $O(2(d+1)n2^n)$ time and space, if the number of nodes (variables) in the Bayesian network is $n$ and the in-degree (the number of parents) per node is bounded by a constant $d$. Here we present a parallel algorithm capable of computing the exact posterior probabilities for all $n(n-1)$ edges with optimal parallel space efficiency and nearly optimal parallel time efficiency. That is, if $p=2^k$ processors are used, the run-time reduces to $O(5(d+1)n2^{n-k}+k(n-k)^d)$ and the space usage becomes $O(n2^{n-k})$ per processor. Our algorithm is based on the observation that the subproblems in the sequential DP algorithm constitute an $n$-$D$ hypercube. We carefully coordinate the computation of correlated DP procedures so that a large amount of data exchange is avoided. Further, we develop parallel techniques for two variants of the well-known zeta transform, which have applications outside the context of Bayesian networks. We demonstrate the capability of our algorithm on datasets with up to 33 variables and its scalability on up to 2048 processors. We apply our algorithm to a biological data set for discovering the yeast pheromone response pathways. Comment: 32 pages, 12 figures
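For readers unfamiliar with the zeta transform mentioned in the abstract above, here is a minimal sequential sketch of the standard subset-sum form, $g(S) = \sum_{T \subseteq S} f(T)$, computed in $O(n 2^n)$ additions. This is the textbook version, not the paper's two variants or its parallel scheme; the function name `zeta_transform` and the bitmask encoding of subsets are assumptions made for illustration.

```python
def zeta_transform(f, n):
    """Subset zeta transform over the n-dimensional hypercube of subsets.

    f is a list of length 2**n indexed by bitmask (bit i set means element i
    is in the subset). Returns g with g[S] = sum of f[T] over all T subset of S.
    """
    if len(f) != 1 << n:
        raise ValueError("f must have length 2**n")
    g = list(f)
    for i in range(n):  # process one hypercube dimension at a time
        bit = 1 << i
        for mask in range(1 << n):
            if mask & bit:
                # fold in the partial sum from the subset without element i
                g[mask] += g[mask ^ bit]
    return g


if __name__ == "__main__":
    # n = 2: subsets are {}, {0}, {1}, {0, 1}
    print(zeta_transform([1, 2, 3, 4], 2))  # [1, 3, 4, 10]
```

The outer loop over dimensions reflects the hypercube structure the abstract refers to: each pass combines pairs of subsets that differ in exactly one element.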

    A Parallel Solver for Graph Laplacians

    Problems from graph drawing, spectral clustering, network flow and graph partitioning can all be expressed in terms of graph Laplacian matrices. There are a variety of practical approaches to solving these problems in serial. However, as problem sizes increase and single core speeds stagnate, parallelism is essential to solve such problems quickly. We present an unsmoothed aggregation multigrid method for solving graph Laplacians in a distributed memory setting. We introduce new parallel aggregation and low degree elimination algorithms targeted specifically at irregular degree graphs. These algorithms are expressed in terms of sparse matrix-vector products using generalized sum and product operations. This formulation is amenable to linear algebra using arbitrary distributions and allows us to operate on a 2D sparse matrix distribution, which is necessary for parallel scalability. Our solver outperforms the natural parallel extension of the current state of the art in an algorithmic comparison. We demonstrate scalability to 576 processes and graphs with up to 1.7 billion edges. Comment: PASC '18, Code: https://github.com/ligmg/ligm

    Forbidden Subgraphs in Connected Graphs

    Given a set $\xi=\{H_1,H_2,\ldots\}$ of connected non-acyclic graphs, a $\xi$-free graph is one which does not contain any member of $\xi$ as a copy. Define the excess of a graph as the difference between its number of edges and its number of vertices. Let $W_{k,\xi}$ be the exponential generating function (EGF for brief) of connected $\xi$-free graphs of excess equal to $k$ ($k \geq 1$). For each fixed $\xi$, a fundamental differential recurrence satisfied by the EGFs $W_{k,\xi}$ is derived. We give methods on how to solve this nonlinear recurrence for the first few values of $k$ by means of graph surgery. We also show that for any finite collection $\xi$ of non-acyclic graphs, the EGFs $W_{k,\xi}$ are always rational functions of the generating function, $T$, of Cayley's rooted (non-planar) labelled trees. From this, we prove that almost all connected graphs with $n$ nodes and $n+k$ edges are $\xi$-free, whenever $k=o(n^{1/3})$ and $|\xi| < \infty$, by means of Wright's inequalities and the saddle point method. Limiting distributions are derived for sparse connected $\xi$-free components that are present when a random graph on $n$ nodes has approximately $\frac{n}{2}$ edges. In particular, the probability distribution that it consists of trees, unicyclic components, $\ldots$, $(q+1)$-cyclic components all $\xi$-free is derived. Similar results are also obtained for multigraphs, which are graphs where self-loops and multiple edges are allowed.

    Parallel Peeling Algorithms

    The analysis of several algorithms and data structures can be framed as a peeling process on a random hypergraph: vertices with degree less than $k$ are removed until there are no vertices of degree less than $k$ left. The remaining hypergraph is known as the $k$-core. In this paper, we analyze parallel peeling processes, where in each round, all vertices of degree less than $k$ are removed. It is known that, below a specific edge density threshold, the $k$-core is empty with high probability. We show that, with high probability, below this threshold, only $\frac{\log\log n}{\log((k-1)(r-1))} + O(1)$ rounds of peeling are needed to obtain the empty $k$-core for $r$-uniform hypergraphs. Interestingly, we show that above this threshold, $\Omega(\log n)$ rounds of peeling are required to find the non-empty $k$-core. Since most algorithms and data structures aim to peel to an empty $k$-core, this asymmetry appears fortunate. We verify the theoretical results both with simulation and with a parallel implementation using graphics processing units (GPUs). Our implementation provides insights into how to structure parallel peeling algorithms for efficiency in practice. Comment: Appears in SPAA 2014. Minor typo corrections relative to previous version
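The round-based peeling process described in the abstract above can be simulated sequentially in a few lines. The following is a minimal sketch, not the paper's GPU implementation; the function name `parallel_peel`, the edge-list input format, and the convention that a removed vertex takes its incident hyperedges with it are assumptions made for illustration.

```python
def parallel_peel(num_vertices, edges, k):
    """Simulate round-synchronous peeling towards the k-core.

    edges is a list of tuples of distinct vertex ids in range(num_vertices).
    In each round, every vertex whose current degree is below k is removed
    at once, together with all hyperedges containing it.
    Returns (rounds, surviving_vertices, surviving_edge_indices).
    """
    alive_vertices = set(range(num_vertices))
    alive_edges = set(range(len(edges)))
    rounds = 0
    while True:
        # Recompute degrees from the surviving hyperedges.
        degree = {v: 0 for v in alive_vertices}
        for e in alive_edges:
            for v in edges[e]:
                degree[v] += 1
        to_remove = {v for v in alive_vertices if degree[v] < k}
        if not to_remove:
            break  # nothing left to peel: what remains is the k-core
        alive_vertices -= to_remove
        alive_edges = {
            e for e in alive_edges
            if not any(v in to_remove for v in edges[e])
        }
        rounds += 1
    return rounds, alive_vertices, alive_edges


if __name__ == "__main__":
    # A path 0-1-2-3 (2-uniform) peels to an empty 2-core in two rounds.
    print(parallel_peel(4, [(0, 1), (1, 2), (2, 3)], 2))
```

Recomputing every degree each round keeps the sketch short; an efficient implementation would update degrees incrementally as hyperedges are removed.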

    Efficient Parallel Translating Embedding For Knowledge Graphs

    Knowledge graph embedding aims to embed entities and relations of knowledge graphs into low-dimensional vector spaces. Translating embedding methods regard relations as translations from head entities to tail entities, and achieve state-of-the-art results among knowledge graph embedding methods. However, a major limitation of these methods is the time-consuming training process, which may take several days or even weeks for large knowledge graphs and makes practical application difficult. In this paper, we propose an efficient parallel framework for translating embedding methods, called ParTrans-X, which enables the methods to be parallelized without locks by utilizing the distinguished structures of knowledge graphs. Experiments on two datasets with three typical translating embedding methods, i.e., TransE [3], TransH [17], and a more efficient variant TransE-AdaGrad [10], validate that ParTrans-X can speed up the training process by more than an order of magnitude. Comment: WI 2017: 460-46