
    Coding against synchronisation and related errors

    In this thesis, we study aspects of coding against synchronisation errors, such as deletions and replications, and related errors. Synchronisation errors are a source of fundamental open problems in information theory, because they introduce correlations between output symbols even when input symbols are independently distributed. We focus on random errors and consider two complementary problems.

    First, we study the optimal rate of reliable information transmission through channels with synchronisation and related errors (the channel capacity). Unlike simpler error models, the capacity of such channels is unknown. We first consider the geometric sticky channel, which replicates input bits according to a geometric distribution. Previously, bounds on its capacity were known only via numerical methods, which do not aid our conceptual understanding of this quantity. We derive sharp analytical capacity upper bounds which approach, and sometimes surpass, the numerical bounds. This opens the door to a mathematical treatment of its capacity. We also consider the geometric deletion channel, combining deletions and geometric replications. We derive analytical capacity upper bounds, and notably prove that the capacity is bounded away from the maximum when the deletion probability is small, meaning that this channel behaves differently from related well-studied channels in this regime. Finally, we adapt techniques developed to handle synchronisation errors to derive improved upper bounds and structural results on the capacity of the discrete-time Poisson channel, a model of optical communication.

    Second, motivated by portable DNA-based storage and trace reconstruction, we introduce and study the coded trace reconstruction problem, where the goal is to design efficiently encodable high-rate codes whose codewords can be efficiently reconstructed from few reads corrupted by deletions. Remarkably, we design such n-bit codes with rate 1 - O(1/log n) that require exponentially fewer reads than average-case trace reconstruction algorithms.
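    The two channel models above are simple to describe operationally even though their capacities are not. The Python sketch below simulates both; it is a toy illustration of the error processes under our own parameter names (q for the replication probability, d for the deletion probability), not the parametrisation or the capacity analysis from the thesis.

    ```python
    import random

    def geometric_sticky_channel(bits, q, rng=random):
        """Emit each input bit k times, k ~ Geometric(q) on {1, 2, ...}:
        the bit is sent once, then duplicated again with probability q each
        time, so P(k replicas) = (1 - q) * q**(k - 1)."""
        out = []
        for b in bits:
            out.append(b)
            while rng.random() < q:
                out.append(b)
        return out

    def geometric_deletion_channel(bits, d, q, rng=random):
        """Delete each bit with probability d; otherwise replicate it
        geometrically as in the sticky channel (a toy combination of the
        two error types discussed above)."""
        out = []
        for b in bits:
            if rng.random() < d:
                continue  # synchronisation error: the bit leaves no trace
            out.append(b)
            while rng.random() < q:
                out.append(b)
        return out

    if __name__ == "__main__":
        x = [0, 1, 1, 0, 1, 0]
        print(geometric_sticky_channel(x, q=0.3))
        print(geometric_deletion_channel(x, d=0.2, q=0.3))
    ```

    Note how the output length is random in both cases: this is precisely why the receiver loses synchronisation, since it cannot tell which output symbols came from which input positions.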

    Optimal information storage : nonsequential sources and neural channels

    Thesis (S.M.) -- Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. This electronic version was submitted by the student author; the certified thesis is available in the Institute Archives and Special Collections (the MIT Institute Archives copy has pages 101-163 bound in reverse order). Includes bibliographical references (p. 141-163).

    Information storage and retrieval systems are communication systems from the present to the future and fall naturally into the framework of information theory. The goal of information storage is to preserve as much signal fidelity as possible under resource constraints. The information storage theorem delineates which average fidelity and average resource values are achievable and which are not. Moreover, observable properties of optimal information storage systems, and the robustness of optimal systems to parameter mismatch, may be determined. In this thesis, we study the physical properties of a neural information storage channel and also the fundamental bounds on the storage of sources that have nonsequential semantics.

    Experimental investigations have revealed that synapses in the mammalian brain possess unexpected properties. Adopting the optimization approach to biology, we cast the brain as an optimal information storage system and propose a theoretical framework that accounts for many of these physical properties. Based on previous experimental and theoretical work, we use volume as a limited resource and utilize the empirical relationship between volume and synaptic weight. Our scientific hypotheses are based on maximizing information storage capacity per unit cost. We use properties of the capacity-cost function, ε-capacity-cost approximations, and measure matching to develop optimization principles. We find that capacity-achieving input distributions not only explain existing experimental measurements but also make non-trivial predictions about the physical structure of the brain.

    Numerous information storage applications have semantics such that the order of source elements is irrelevant, so the source sequence can be treated as a multiset. We formulate fidelity criteria that consider asymptotically large multisets and give conclusive, but trivialized, results in rate-distortion theory. For fidelity criteria that consider fixed-size multisets, we give some conclusive results in high-rate quantization theory, low-rate quantization, and rate-distortion theory. We also provide bounds on the rate-distortion function for other nonsequential fidelity criteria problems. System resource consumption can be significantly reduced by recognizing the correct invariance properties and semantics of the information storage task at hand.

    by Lav R. Varshney. S.M.
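    A back-of-the-envelope calculation shows why treating a source as a multiset saves resources. Indexing all length-n sequences over an m-ary alphabet costs n·log₂(m) bits, while indexing all size-n multisets costs only log₂ C(n+m-1, m-1) bits, because only symbol counts matter (a standard stars-and-bars count, not the thesis's rate-distortion analysis). The Python snippet below computes both quantities.

    ```python
    from math import comb, log2

    def sequence_bits(n, m):
        """Bits to index any of the m**n length-n sequences
        over an m-ary alphabet."""
        return n * log2(m)

    def multiset_bits(n, m):
        """Bits to index any size-n multiset over an m-ary alphabet;
        the number of such multisets is C(n + m - 1, m - 1)."""
        return log2(comb(n + m - 1, m - 1))

    if __name__ == "__main__":
        n, m = 1000, 4
        print(f"sequence: {sequence_bits(n, m):.0f} bits")  # 2000 bits
        print(f"multiset: {multiset_bits(n, m):.0f} bits")  # about 27 bits
    ```

    For a fixed alphabet, the multiset representation grows only logarithmically in n rather than linearly, which is the invariance gain the abstract alludes to.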

    Distinguishing codes from noise : fundamental limits and applications to sparse communication

    Thesis (S.M.) -- Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. This electronic version was submitted by the student author; the certified thesis is available in the Institute Archives and Special Collections. Cataloged from the student-submitted PDF version of the thesis. Includes bibliographical references (p. 99-100).

    This thesis investigates the problem of distinguishing codes from noise. We develop a slotted channel model where, in each time slot, the channel input is either a codeword or a noise sequence. In this model, successful communication requires both correctly detecting the presence of a codeword and decoding it to the correct message. While the decoding problem has been extensively studied, the problem of distinguishing codes from noise is relatively new, and we ask the following question regarding the "distinguishability" of a channel code: given a noisy channel and a code with a certain rate, what are the fundamental limits of distinguishing this code from noise at the output of the channel?

    The problem of distinguishing codes from noise involves both detection and decoding. In our analysis, we first extend the classical channel coding problem to incorporate the requirement of detection, which admits both miss and false-alarm errors. We then investigate the fundamental limits of code distinguishing in terms of the error exponents of the miss and false-alarm error probabilities. In a scenario where the miss probability is required to vanish asymptotically, but not necessarily exponentially, we characterize the maximum false-alarm error exponent at each rate, and show that an i.i.d. codebook with typicality decoding is sufficient to achieve the maximum exponent. In another scenario, which requires a certain miss error exponent, we show that for discrete memoryless channels the i.i.d. codebook is suboptimal and a constant-composition codebook achieves the best known performance. For AWGN channels, we develop a clustered spherical codebook that achieves the best known performance in all operating regimes.

    This code distinguishability problem is strongly motivated by the synchronization problem in sparse communication, a new communication paradigm where transmissions take place intermittently and each transmission consists of a small amount of data. Our results show that, in sparse communication, the traditional approach of conducting synchronization and coding separately is suboptimal, and our approach of designing codes for joint synchronization and information transmission achieves better performance, especially at high rates. Therefore, for systems with sparse transmissions, such as sensor networks, it is beneficial to adopt the joint sync-coding architecture instead of the traditional separate sync-coding architecture.

    by Da Wang. S.M.
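    A minimal simulation makes the detection/decoding interplay of the slotted model concrete. In the sketch below, each slot carries either a codeword passed through a binary symmetric channel or pure noise; a nearest-codeword decoder declares a detection only when the best match falls within a Hamming radius around the expected flip count, a crude stand-in for typicality decoding. All parameters and the random codebook here are illustrative assumptions, not the i.i.d., constant-composition, or clustered spherical constructions analysed in the thesis.

    ```python
    import random

    def bsc(bits, p, rng):
        """Binary symmetric channel: flip each bit independently
        with probability p."""
        return [b ^ (rng.random() < p) for b in bits]

    def detect_and_decode(y, codebook, radius):
        """Find the nearest codeword; declare 'codeword present' only if it
        lies within the Hamming radius (guards against false alarms when the
        slot actually contains noise)."""
        best = min(codebook, key=lambda c: sum(a != b for a, b in zip(c, y)))
        dist = sum(a != b for a, b in zip(best, y))
        return best if dist <= radius else None

    if __name__ == "__main__":
        rng = random.Random(0)
        n, k, p = 64, 4, 0.05
        codebook = [[rng.randint(0, 1) for _ in range(n)] for _ in range(2 ** k)]
        radius = int(3 * p * n)  # three times the expected flips: crude typicality ball

        sent = codebook[7]
        noise = [rng.randint(0, 1) for _ in range(n)]
        print(detect_and_decode(bsc(sent, p, rng), codebook, radius) == sent)  # True w.h.p.
        print(detect_and_decode(noise, codebook, radius) is None)              # True w.h.p.
    ```

    The tension the thesis quantifies is visible even here: enlarging the radius reduces misses but raises false alarms, and the achievable trade-off between the two error exponents depends on how the codebook is structured.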

    Capacity and coding in digital communications

    164 pages; 24 cm.

    Noisy channels with synchronization errors : information rates and code design

    Master's thesis (Master of Engineering).