    Community detection and stochastic block models: recent developments

    The stochastic block model (SBM) is a random graph model with planted clusters. It is widely employed as a canonical model to study clustering and community detection, and more generally provides fertile ground for studying the statistical and computational tradeoffs that arise in network and data sciences. This note surveys the recent developments that establish the fundamental limits for community detection in the SBM, with respect to both information-theoretic and computational thresholds, and for various recovery requirements such as exact, partial and weak recovery (a.k.a. detection). The main results discussed are the phase transition for exact recovery at the Chernoff-Hellinger threshold, the phase transition for weak recovery at the Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial recovery, the learning of the SBM parameters, and the gap between information-theoretic and computational thresholds. The note also covers some of the algorithms developed in the quest to achieve these limits, in particular two-round algorithms via graph-splitting, semidefinite programming, linearized belief propagation, and classical and nonbacktracking spectral methods. A few open problems are also discussed.
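
As a rough illustration of the two thresholds named above, the sketch below checks them in the special case of two symmetric, equal-sized communities. The parameterization is an assumption that does not appear in the abstract: for weak recovery the edge probabilities are taken as a/n within and b/n across communities, and for exact recovery as a·log(n)/n and b·log(n)/n. The function names are hypothetical, and the general multi-community conditions covered by the survey are not reproduced here.

```python
import math


def weak_recovery_possible(a: float, b: float) -> bool:
    """Kesten-Stigum condition, assuming two symmetric communities with
    edge probabilities a/n (within) and b/n (across)."""
    return (a - b) ** 2 > 2 * (a + b)


def exact_recovery_possible(a: float, b: float) -> bool:
    """Exact recovery condition, assuming two symmetric communities with
    edge probabilities a*log(n)/n (within) and b*log(n)/n (across).
    This is the two-community form of the Chernoff-Hellinger condition."""
    return abs(math.sqrt(a) - math.sqrt(b)) > math.sqrt(2)
```

Both checks use strict inequalities, so they only classify parameter pairs that lie strictly above the respective threshold.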

    Communication-Computation Efficient Gradient Coding

    This paper develops coding techniques to reduce the running time of distributed learning tasks. It characterizes the fundamental tradeoff in computing gradients (and more generally vector summations) in terms of three parameters: computation load, straggler tolerance and communication cost. It further gives an explicit coding scheme that achieves the optimal tradeoff based on recursive polynomial constructions, coding both across data subsets and vector components. As a result, the proposed scheme makes it possible to minimize the running time for gradient computations. Implementations are made on Amazon EC2 clusters using Python with the mpi4py package. Results show that the proposed scheme maintains the same generalization error while reducing the running time by 32% compared to uncoded schemes and 23% compared to prior coded schemes focusing only on stragglers (Tandon et al., ICML 2017).

    Polarization of the Renyi Information Dimension with Applications to Compressed Sensing

    In this paper, we show that the Hadamard matrix acts as an extractor over the reals of the Renyi information dimension (RID), in an analogous way to how it acts as an extractor of the discrete entropy over finite fields. More precisely, we prove that the RID of an i.i.d. sequence of mixture random variables polarizes to the extremal values of 0 and 1 (corresponding to discrete and continuous distributions) when transformed by a Hadamard matrix. Further, we prove that the polarization pattern of the RID admits a closed-form expression and follows exactly the Binary Erasure Channel (BEC) polarization pattern in the discrete setting. We also extend the results from the single- to the multi-terminal setting, obtaining a Slepian-Wolf counterpart of the RID polarization. We discuss applications of the RID polarization to Compressed Sensing of i.i.d. sources. In particular, we use the RID polarization to construct a family of deterministic ±1-valued sensing matrices for Compressed Sensing. We run numerical simulations to compare the performance of the resulting matrices with that of random Gaussian and random Hadamard matrices. The results indicate that the proposed matrices achieve competitive performance while being explicitly constructed. Comment: 12 pages, 2 figures
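
The BEC polarization pattern mentioned above can be sketched with the standard erasure-probability recursion, in which a value z splits into 2z - z² and z². The sketch assumes that the RID of the source plays the role of the erasure probability z; the function name `bec_polarize` is hypothetical and the code is not taken from the paper.

```python
def bec_polarize(d, levels):
    """Apply the BEC polarization recursion `levels` times, starting from d.

    Each value z is replaced by the pair (2z - z^2, z^2), so the returned
    list has 2**levels entries whose average stays equal to d.
    """
    values = [d]
    for _ in range(levels):
        nxt = []
        for z in values:
            nxt.append(2 * z - z * z)  # branch pushed towards 1
            nxt.append(z * z)          # branch pushed towards 0
        values = nxt
    return values
```

As the number of levels grows, most entries move close to 0 or 1 while their average is preserved, which is the sense in which the pattern polarizes.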

    Polynomial complexity of polar codes for non-binary alphabets, key agreement and Slepian-Wolf coding

    We consider polar codes for memoryless sources with side information and show that the blocklength, construction, encoding and decoding complexities are bounded by a polynomial in the reciprocal of the gap between the compression rate and the conditional entropy. This extends the recent results of Guruswami and Xia to a slightly more general setting, which in turn can be applied to (1) sources with non-binary alphabets, (2) key generation for discrete and Gaussian sources, and (3) Slepian-Wolf coding and multiple access. In each of these cases, the complexity scaling with respect to the number of users is also controlled. In particular, we construct coding schemes for these multi-user information theory problems which achieve optimal rates with an overall polynomial complexity. Comment: 6 pages; presented at CISS 201

    Polar Codes for the m-User MAC

    In this paper, polar codes for the m-user multiple access channel (MAC) with binary inputs are constructed. It is shown that Arıkan's polarization technique applied individually to each user transforms independent uses of an m-user binary input MAC into successive uses of extremal MACs. This transformation has a number of desirable properties: (i) the 'uniform sum rate' of the original MAC is preserved, (ii) the extremal MACs have uniform rate regions that are not only polymatroids but matroids and thus (iii) their uniform sum rate can be reached by each user transmitting either uncoded or fixed bits; in this sense they are easy to communicate over. A polar code can then be constructed with an encoding and decoding complexity of O(n log n) (where n is the block length), a block error probability of o(exp(-n^{1/2 - ε})), and capable of achieving the uniform sum rate of any binary input MAC with arbitrarily many users. An application of this polar code construction to communicating on the AWGN channel is also discussed.
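
To give a feel for where the O(n log n) encoding cost comes from, here is a minimal sketch of the single-user polar transform computed with a butterfly recursion. It assumes the standard 2×2 Arıkan kernel [[1, 0], [1, 1]] over GF(2), a block length that is a power of two, and no bit-reversal permutation; the function name `polar_transform` is hypothetical, and the per-user application to the MAC described above is not reproduced.

```python
def polar_transform(u):
    """Return u multiplied by the n-fold Kronecker power of [[1, 0], [1, 1]]
    over GF(2), for a list of bits u whose length is a power of two."""
    x = list(u)
    n = len(x)
    step = 1
    while step < n:  # log2(n) stages
        for start in range(0, n, 2 * step):
            for j in range(start, start + step):
                x[j] ^= x[j + step]  # n/2 XORs per stage
        step *= 2
    return x
```

Each of the log2(n) stages performs n/2 XOR operations. Over GF(2) the transform is its own inverse, so applying it twice returns the input.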

    High-Girth Matrices and Polarization

    The girth of a matrix is the least number of linearly dependent columns, in contrast to the rank which is the largest number of linearly independent columns. This paper considers the construction of high-girth matrices, whose probabilistic girth is close to their rank. Random matrices can be used to show the existence of high-girth matrices with constant relative rank, but the construction is non-explicit. This paper uses a polar-like construction to obtain a deterministic and efficient construction of high-girth matrices for arbitrary fields and relative ranks. Applications to coding and sparse recovery are discussed.
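
The definition of girth in the first sentence can be made concrete with a brute-force sketch over GF(2). It illustrates only the deterministic girth, not the probabilistic girth or the polar-like construction of the paper. The encoding of each column as an integer bitmask and the name `girth_gf2` are assumptions made for the example, and the search is exponential in the number of columns.

```python
from functools import reduce
from itertools import combinations


def girth_gf2(columns):
    """Least number of linearly dependent columns over GF(2).

    `columns` is a list of non-negative ints; the bits of each int are the
    entries of one column. Returns None if all columns are independent.
    """
    for k in range(1, len(columns) + 1):
        for subset in combinations(columns, k):
            # Over GF(2), a smallest dependent set must XOR to zero.
            if reduce(lambda x, y: x ^ y, subset) == 0:
                return k
    return None
```

For the columns 001, 010, 100 and 111, any three are independent but all four sum to zero, so the girth is 4 while the rank is 3.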