
    High-Dimensional Inference on Dense Graphs with Applications to Coding Theory and Machine Learning

    We are living in the era of "Big Data", an era characterized by a vast volume of available data. This volume is mainly due to the continuing advances in the computational capabilities for capturing, storing, transmitting and processing data. However, it is not always the volume of data that matters, but rather the "relevant" information that resides in it. Exactly 70 years ago, Claude Shannon, the father of information theory, was able to quantify the amount of information in a communication scenario based on a probabilistic model of the data. It turns out that Shannon's theory can be adapted to various probability-based information processing fields, ranging from coding theory to machine learning. The computation of some information-theoretic quantities, such as the mutual information, can help in setting fundamental limits and devising more efficient algorithms for many inference problems. This thesis deals with two different, yet intimately related, inference problems in the fields of coding theory and machine learning. We use Bayesian probabilistic formulations for both problems, and we analyse them in the asymptotic high-dimensional regime. The goal of our analysis is to assess the algorithmic performance on the one hand and to predict the Bayes-optimal performance on the other hand, using an information-theoretic approach. To this end, we employ powerful analytical tools from statistical physics. The first problem is a recent forward-error-correction code called the sparse superposition code. We consider the extension of this code to a large class of noisy channels by exploiting the similarity with the compressed sensing paradigm. Moreover, we show that sparse superposition codes are amenable to joint distribution matching and channel coding. In the second problem, we study symmetric rank-one matrix factorization, a prominent model in machine learning and statistics with many applications ranging from community detection to sparse principal component analysis. We provide an explicit expression for the normalized mutual information and the minimum mean-square error of this model in the asymptotic limit. This allows us to prove the optimality of a certain iterative algorithm over a large set of parameters. A common feature of the two problems is that both are represented on dense graphical models. Hence, similar message-passing algorithms and analysis tools can be adopted. Furthermore, spatial coupling, a new technique introduced in the context of low-density parity-check (LDPC) codes, can be applied to both problems. Spatial coupling is used in this thesis as a "construction technique" to boost the algorithmic performance and as a "proof technique" to compute some information-theoretic quantities. Moreover, both of our problems retain close connections with spin glass models studied in the statistical mechanics of disordered systems. This allows us to use sophisticated techniques developed in statistical physics. In this thesis, we use the potential function predicted by the replica method in order to prove the threshold saturation phenomenon associated with spatially coupled models. Moreover, one of the main contributions of this thesis is proving that the predictions given by the "heuristic" replica method are exact. Hence, our results could be of great interest for the statistical physics community as well, as they help to set a rigorous mathematical foundation for the replica predictions.
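As a rough illustration of the symmetric rank-one matrix factorization model named in the abstract above, the sketch below generates a noisy observation of a rank-one matrix and recovers the hidden vector with plain power iteration. Everything specific here is an assumption made for illustration and is not taken from the thesis: the scaling `sqrt(snr / n)`, the uniform ±1 prior on the entries, the standard Gaussian noise, and all function names. Power iteration is a simple spectral baseline swapped in for readability; it is not the iterative algorithm whose optimality the thesis proves.

```python
import math
import random


def spiked_matrix(n, snr, rng):
    """Return (x, Y) with Y = sqrt(snr / n) * x x^T + Z (assumed scaling).

    x has i.i.d. uniform +/-1 entries and Z is a symmetric matrix of
    standard Gaussian noise; both choices are illustrative assumptions.
    """
    x = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    scale = math.sqrt(snr / n)
    Y = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = scale * x[i] * x[j] + rng.gauss(0.0, 1.0)
            Y[i][j] = value
            Y[j][i] = value  # keep the observation symmetric
    return x, Y


def top_eigenvector(Y, rng, iterations=100):
    """Approximate the leading eigenvector of Y by power iteration."""
    n = len(Y)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(n)) for row in Y]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]  # renormalize to unit length
    return v


def overlap(x, v):
    """Normalized overlap |<x, v>| / sqrt(n), between 0 and 1 for unit v."""
    return abs(sum(a * b for a, b in zip(x, v))) / math.sqrt(len(x))


if __name__ == "__main__":
    rng = random.Random(0)
    x, Y = spiked_matrix(150, 4.0, rng)
    v = top_eigenvector(Y, rng)
    print("overlap with hidden vector:", round(overlap(x, v), 3))
```

The overlap is taken in absolute value because the sign of the hidden vector cannot be recovered from a symmetric observation of this form.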

    Polarization and Spatial Coupling: Two Techniques to Boost Performance

    During the last two decades we have witnessed considerable activity in building bridges between the fields of information theory/communications, computer science, and statistical physics. This is due to the realization that many fundamental concepts and notions in these fields are in fact related and that each field can benefit from the insight and techniques developed in the others. For instance, the notion of channel capacity in information theory, threshold phenomena in computer science, and phase transitions in statistical physics are all expressions of the same concept. Therefore, it would be beneficial to develop a common framework that unifies these notions and that could help to leverage knowledge in one field to make progress in the others. A particularly striking example is the celebrated belief propagation algorithm. It was independently invented in each of these fields but for very different purposes. The realization of the commonality has benefited each of the areas. We investigate polarization and spatial coupling: two techniques that were originally invented in the context of channel coding (communications), resulting for the first time in efficient capacity-achieving codes for a wide range of channels. As we will discuss, both techniques also play a fundamental role in computer science and statistical physics, and so they can be seen as further fundamental building blocks that unite all three areas. We demonstrate applications of these techniques, as well as the fundamental phenomena they provide. In more detail, this thesis consists of two parts. In the first part, we consider the technique of polarization and its resultant class of channel codes, called polar codes. Our main focus is the analysis and improvement of the behavior of polarization with respect to the most significant aspects of modern channel-coding theory: scaling laws, universality, and complexity (quantization). For each of these aspects, we derive fundamental laws that govern the behavior of polarization and polar codes. Even though we concentrate on applications in communications, the analysis that we provide is general and can be carried over to applications of polarization in computer science and statistical physics. As we will show, our investigations confirm some of the inherent strengths of polar codes, such as their robustness with respect to quantization. But they also make clear in which aspects further improvement of polar codes is needed. For example, we will explain that the scaling behavior of polar codes is quite slow compared to the optimal one. Hence, further research is required in order to enhance the scaling behavior of polar codes towards optimality. In the second part of this thesis, we investigate spatial coupling. By now, there already exists a considerable literature on spatial coupling in the realm of information theory and communications. We therefore investigate mainly the impact of spatial coupling on the fields of statistical physics and computer science. We consider two well-known models. The first is the Curie-Weiss model, which provides us with the simplest model for understanding the mechanism of spatial coupling from the perspective of statistical physics. Many fundamental features of spatial coupling can be simply explained here. In particular, we will show how the well-known Maxwell construction in statistical physics manifests itself through spatial coupling. We then focus on a much richer class of graphical models called constraint satisfaction problems (CSP) (e.g., K-SAT and Q-COL). These models are central to computer science. We follow a general framework: First, we introduce interpolation procedures for proving that the coupled and standard (un-coupled) models are fundamentally related, in that their static properties (such as their SAT/UNSAT threshold) are the same. We then use tools from spin glass theory (cavity method) to demonstrate the so-called phenomenon of threshold saturation in these coupled models. Finally, we present the algorithmic implications and argue that all these features provide a new avenue for obtaining better, provable, algorithmic lower bounds on static thresholds of the individual standard CSP models. We consider simple decimation algorithms (e.g., the unit clause propagation algorithm) for the coupled CSP models and provide a machinery to analyze these algorithms. These analyses enable us to observe that the algorithmic thresholds on the coupled model are significantly improved over the standard model. For some models (e.g., 3-SAT, 3-COL), these coupled algorithmic thresholds surpass the best lower bounds on the SAT/UNSAT threshold in the literature and provide us with a new lower bound. We conclude by pointing out that although we only considered some specific graphical models, our results are of a general nature and hence applicable to a broad set of models. In particular, a main contribution of this thesis is to firmly establish both polarization and spatial coupling in the common toolbox of information theory/communications, statistical physics, and computer science.
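As a rough illustration of the polarization phenomenon discussed in the abstract above, the sketch below tracks erasure probabilities for the binary erasure channel (BEC), where one polarization step is known to map an erasure probability z to the pair 2z - z^2 and z^2. The choice of the BEC, the function names, and the threshold `delta` are assumptions made for illustration; the thesis itself treats scaling laws, universality, and quantization in far more generality.

```python
def polarize_bec(eps, levels):
    """Erasure probabilities of the 2**levels synthetic channels of a BEC(eps)."""
    probs = [eps]
    for _ in range(levels):
        next_probs = []
        for z in probs:
            next_probs.append(2 * z - z * z)  # degraded ("minus") channel
            next_probs.append(z * z)          # upgraded ("plus") channel
        probs = next_probs
    return probs


def polarized_fraction(probs, delta=0.01):
    """Fraction of channels that are nearly perfect or nearly useless."""
    extreme = sum(1 for z in probs if z < delta or z > 1 - delta)
    return extreme / len(probs)


if __name__ == "__main__":
    for levels in (2, 6, 10, 14):
        probs = polarize_bec(0.5, levels)
        print(levels, round(polarized_fraction(probs), 3))
```

The average erasure probability is preserved at every level, while the fraction of nearly perfect or nearly useless channels tends to grow with the number of levels, which is the polarization effect.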