
    Connectivity-dependent properties of diluted systems in a transfer-matrix description

    We introduce a new approach to connectivity-dependent properties of diluted systems, which is based on the transfer-matrix formulation of the percolation problem. It simultaneously incorporates the connective properties reflected in non-zero matrix elements and allows one to use standard random-matrix multiplication techniques. Thus it is possible to investigate physical processes on the percolation structure with the high efficiency and precision characteristic of transfer-matrix methods, while avoiding disconnections. The method is illustrated for two-dimensional site percolation by calculating (i) the critical correlation length along the strip, and the finite-size longitudinal DC conductivity: (ii) at the percolation threshold, and (iii) very near the pure-system limit. Comment: 4 pages, no figures, RevTeX, Phys. Rev. E Rapid Communications (to be published)

    If the Current Clique Algorithms are Optimal, so is Valiant's Parser

    The CFG recognition problem is: given a context-free grammar $\mathcal{G}$ and a string $w$ of length $n$, decide if $w$ can be obtained from $\mathcal{G}$. This is the most basic parsing question and is a core computer science problem. Valiant's parser from 1975 solves the problem in $O(n^{\omega})$ time, where $\omega<2.373$ is the matrix multiplication exponent. Dozens of parsing algorithms have been proposed over the years, yet Valiant's upper bound remains unbeaten. The best combinatorial algorithms have mildly subcubic $O(n^3/\log^3{n})$ complexity. Lee (JACM'01) provided evidence that fast matrix multiplication is needed for CFG parsing, and that very efficient and practical algorithms might be hard or even impossible to obtain. Lee showed that any algorithm for a more general parsing problem with running time $O(|\mathcal{G}|\cdot n^{3-\varepsilon})$ can be converted into a surprising subcubic algorithm for Boolean Matrix Multiplication. Unfortunately, Lee's hardness result required that the grammar size be $|\mathcal{G}|=\Omega(n^6)$. Nothing was known for the more relevant case of constant size grammars. In this work, we prove that any improvement on Valiant's algorithm, even for constant size grammars, either in terms of runtime or by avoiding the inefficiencies of fast matrix multiplication, would imply a breakthrough algorithm for the $k$-Clique problem: given a graph on $n$ nodes, decide if there are $k$ that form a clique. Besides classifying the complexity of a fundamental problem, our reduction has led us to similar lower bounds for more modern and well-studied cubic time problems for which faster algorithms are highly desirable in practice: RNA Folding, a central problem in computational biology, and Dyck Language Edit Distance, answering an open question of Saha (FOCS'14).
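    To make the recognition problem concrete, the following is a minimal sketch of the textbook cubic-time dynamic program (CYK) for a grammar in Chomsky normal form. It is the classical baseline, not Valiant's matrix-multiplication-based parser, and the function name, the rule encoding, and the small balanced-parentheses grammar are illustrative assumptions rather than anything taken from the paper.

```python
from itertools import product


def cyk_recognize(word, unary_rules, binary_rules, start="S"):
    """Return True if `word` is derivable from `start` in a CNF grammar.

    unary_rules:  dict mapping a terminal a to the set of A with A -> a
    binary_rules: dict mapping a pair (B, C) to the set of A with A -> B C
    (Both encodings are assumptions made for this sketch.)
    """
    n = len(word)
    if n == 0:
        return False  # the empty string is not handled in this sketch
    # table[i][j] holds every nonterminal that derives word[i : j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][i] = set(unary_rules.get(ch, ()))
    # Three nested loops over (length, start, split point) give O(n^3) time.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):
                for b, c in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= binary_rules.get((b, c), set())
    return start in table[0][n - 1]


# Hypothetical CNF grammar for non-empty balanced parentheses:
#   S -> S S | L R | L X,   X -> S R,   L -> '(',   R -> ')'
UNARY = {"(": {"L"}, ")": {"R"}}
BINARY = {
    ("S", "S"): {"S"},
    ("L", "R"): {"S"},
    ("L", "X"): {"S"},
    ("S", "R"): {"X"},
}
```

    With these rules, `cyk_recognize("(())()", UNARY, BINARY)` accepts and `cyk_recognize("(()", UNARY, BINARY)` rejects. The triple loop over substring length, start position, and split point is the cubic cost that the abstract's lower bound concerns.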

    Flexible Communication Avoiding Matrix Multiplication on FPGA with High-Level Synthesis

    Data movement is the dominating factor affecting performance and energy in modern computing systems. Consequently, many algorithms have been developed to minimize the number of I/O operations for common computing patterns. Matrix multiplication is no exception, and lower bounds have been proven and implemented both for shared and distributed memory systems. Reconfigurable hardware platforms are a lucrative target for I/O minimizing algorithms, as they offer full control of memory accesses to the programmer. While bounds developed in the context of fixed architectures still apply to these platforms, the spatially distributed nature of their computational and memory resources requires a decentralized approach to optimize algorithms for maximum hardware utilization. We present a model to optimize matrix multiplication for FPGA platforms, simultaneously targeting maximum performance and minimum off-chip data movement, within constraints set by the hardware. We map the model to a concrete architecture using a high-level synthesis tool, maintaining a high level of abstraction, allowing us to support arbitrary data types, and enabling maintainability and portability across FPGA devices. Kernels generated from our architecture are shown to offer competitive performance in practice, scaling with both compute and memory resources. We offer our design as an open source project to encourage the open development of linear algebra and I/O minimizing algorithms on reconfigurable hardware platforms.
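    The general idea behind I/O-minimizing matrix multiplication can be illustrated with loop tiling: operands are processed in small blocks so that each block is reused many times once it has been loaded into fast memory. The sketch below is plain Python and only shows the loop structure. It is not the paper's FPGA architecture or its performance model, and the function name and `tile` parameter are assumptions made for illustration.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply list-of-lists matrices a (n x k) and b (k x m) block by block.

    `tile` is the hypothetical edge length of a block that would fit in fast
    (on-chip) memory; the result is identical to a naive triple loop.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, tile):          # block row of c
        for j0 in range(0, m, tile):      # block column of c
            for k0 in range(0, k, tile):  # block along the inner dimension
                # Everything touched below lies inside three tile x tile blocks.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c
```

    For example, `tiled_matmul([[1, 2, 3], [4, 5, 6]], [[7, 8], [9, 10], [11, 12]])` returns `[[58, 64], [139, 154]]`, the same as the untiled product. Only the order of the memory accesses changes.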

    Communication-Avoiding Optimization Methods for Distributed Massive-Scale Sparse Inverse Covariance Estimation

    Across a variety of scientific disciplines, sparse inverse covariance estimation is a popular tool for capturing the underlying dependency relationships in multivariate data. Unfortunately, most estimators are not scalable enough to handle the sizes of modern high-dimensional data sets (often on the order of terabytes), and assume Gaussian samples. To address these deficiencies, we introduce HP-CONCORD, a highly scalable optimization method for estimating a sparse inverse covariance matrix based on a regularized pseudolikelihood framework, without assuming Gaussianity. Our parallel proximal gradient method uses a novel communication-avoiding linear algebra algorithm and runs across a multi-node cluster with up to 1k nodes (24k cores), achieving parallel scalability on problems with up to ~819 billion parameters (1.28 million dimensions); even on a single node, HP-CONCORD demonstrates scalability, outperforming a state-of-the-art method. We also use HP-CONCORD to estimate the underlying dependency structure of the brain from fMRI data, and use the result to identify functional regions automatically. The results show good agreement with a clustering from the neuroscience literature. Comment: Main paper: 15 pages, appendix: 24 pages

    OverSketch: Approximate Matrix Multiplication for the Cloud

    We propose OverSketch, an approximate algorithm for distributed matrix multiplication in serverless computing. OverSketch leverages ideas from matrix sketching and high-performance computing to enable cost-efficient multiplication that is resilient to faults and straggling nodes pervasive in low-cost serverless architectures. We establish statistical guarantees on the accuracy of OverSketch and validate them empirically by solving a large-scale linear program using interior-point methods, demonstrating a 34% reduction in compute time on AWS Lambda. Comment: Published in Proc. IEEE Big Data 2018. Updated version provides details of distributed sketching and highlights other advantages of OverSketch
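    The basic matrix-sketching idea can be sketched in a few lines: a random matrix $S$ with $\mathbb{E}[SS^{T}] = I$ is inserted between the two factors, so that $(AS)(S^{T}B)$ is an unbiased estimate of $AB$. The example below uses a dense Gaussian sketch in pure Python. This is a generic textbook construction and an assumption made for illustration. It is not the sketch, the block scheme, or the straggler handling used by OverSketch, and the function name and parameters are hypothetical.

```python
import random


def sketched_matmul(a, b, sketch_dim, seed=0):
    """Approximate a @ b by (a @ s) @ (s^T @ b) for a random Gaussian sketch s.

    a is n x k, b is k x m (lists of lists). s is k x sketch_dim with i.i.d.
    N(0, 1/sketch_dim) entries, so that E[s @ s^T] is the identity.
    """
    rng = random.Random(seed)
    n, k, m = len(a), len(b), len(b[0])
    std = 1.0 / sketch_dim ** 0.5
    s = [[rng.gauss(0.0, std) for _ in range(sketch_dim)] for _ in range(k)]
    # Compress the shared inner dimension k down to sketch_dim on both sides.
    a_s = [[sum(a[i][t] * s[t][d] for t in range(k)) for d in range(sketch_dim)]
           for i in range(n)]
    st_b = [[sum(s[t][d] * b[t][j] for t in range(k)) for j in range(m)]
            for d in range(sketch_dim)]
    # Multiply the two sketched factors.
    return [[sum(a_s[i][d] * st_b[d][j] for d in range(sketch_dim))
             for j in range(m)] for i in range(n)]
```

    The estimate becomes more accurate as `sketch_dim` grows, and sketching only saves work when `sketch_dim` is much smaller than the inner dimension of the product.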

    Construction of a Large Class of Deterministic Sensing Matrices that Satisfy a Statistical Isometry Property

    Compressed Sensing aims to capture attributes of $k$-sparse signals using very few measurements. In the standard Compressed Sensing paradigm, the $m \times n$ measurement matrix $A$ is required to act as a near isometry on the set of all $k$-sparse signals (Restricted Isometry Property or RIP). Although it is known that certain probabilistic processes generate $m \times n$ matrices that satisfy RIP with high probability, there is no practical algorithm for verifying whether a given sensing matrix $A$ has this property, crucial for the feasibility of the standard recovery algorithms. In contrast, this paper provides simple criteria that guarantee that a deterministic sensing matrix satisfying these criteria acts as a near isometry on an overwhelming majority of $k$-sparse signals; in particular, most such signals have a unique representation in the measurement domain. Probability still plays a critical role, but it enters the signal model rather than the construction of the sensing matrix. We require the columns of the sensing matrix to form a group under pointwise multiplication. The construction allows recovery methods for which the expected performance is sub-linear in $n$, and only quadratic in $m$; the focus on expected performance is more typical of mainstream signal processing than the worst-case analysis that prevails in standard Compressed Sensing. Our framework encompasses many families of deterministic sensing matrices, including those formed from discrete chirps, Delsarte-Goethals codes, and extended BCH codes. Comment: 16 Pages, 2 figures, to appear in IEEE Journal of Selected Topics in Signal Processing, the special issue on Compressed Sensing

    Relations between connected and self-avoiding walks in a digraph

    Walks in a directed graph can be given a partially ordered structure that extends to possibly unconnected objects, called hikes. Studying the incidence algebra on this poset reveals unsuspected relations between walks and self-avoiding hikes. These relations are derived by considering truncated versions of the characteristic polynomial of the weighted adjacency matrix, resulting in a collection of matrices whose entries enumerate the self-avoiding hikes of length $\ell$ from one vertex to another.
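    As a small illustration of the two kinds of objects being related, the sketch below counts ordinary walks of a given length via powers of the adjacency matrix and counts self-avoiding walks (no vertex visited twice) by brute-force search. It does not implement the paper's hikes, incidence algebra, or truncated characteristic polynomials. The function names and the unweighted 0/1 adjacency encoding are assumptions made for this example.

```python
def count_walks(adj, length):
    """Entry [i][j] is the number of walks with `length` edges from i to j,
    computed as the `length`-th power of the 0/1 adjacency matrix."""
    n = len(adj)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(length):
        result = [[sum(result[i][k] * adj[k][j] for k in range(n))
                   for j in range(n)] for i in range(n)]
    return result


def count_self_avoiding(adj, length):
    """Entry [i][j] is the number of walks with `length` edges from i to j
    that never revisit a vertex, found by depth-first enumeration."""
    n = len(adj)
    counts = [[0] * n for _ in range(n)]

    def extend(start, current, visited, remaining):
        if remaining == 0:
            counts[start][current] += 1
            return
        for nxt in range(n):
            if adj[current][nxt] and nxt not in visited:
                visited.add(nxt)
                extend(start, nxt, visited, remaining - 1)
                visited.remove(nxt)

    for s in range(n):
        extend(s, s, {s}, length)
    return counts


# Hypothetical digraph with edges 0->1, 1->0, 1->2, 2->0.
ADJ = [[0, 1, 0],
       [1, 0, 1],
       [1, 0, 0]]
```

    On this digraph, `count_walks(ADJ, 2)` gives `[[1, 0, 1], [1, 1, 0], [0, 1, 0]]`, while `count_self_avoiding(ADJ, 2)` gives `[[0, 0, 1], [1, 0, 0], [0, 1, 0]]`. The difference comes from the closed walks 0->1->0 and 1->0->1, which revisit their starting vertex.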