Efficient sphere-covering and converse measure concentration via generalized coding theorems
Suppose A is a finite set equipped with a probability measure P and let M be
a ``mass'' function on A. We give a probabilistic characterization of the most
efficient way in which A^n can be almost-covered using spheres of a fixed
radius. An almost-covering is a subset C_n of A^n, such that the union of the
spheres centered at the points of C_n has probability close to one with respect
to the product measure P^n. An efficient covering is one with small mass
M^n(C_n); n is typically large. With different choices for M and the geometry
on A our results give various corollaries as special cases, including Shannon's
data compression theorem, a version of Stein's lemma (in hypothesis testing),
and a new converse to some measure concentration inequalities on discrete
spaces. Under mild conditions, we generalize our results to abstract spaces and
non-product measures.
Comment: 29 pages. See also http://www.stat.purdue.edu/~yiannis
Asymptotic Estimates in Information Theory with Non-Vanishing Error Probabilities
This monograph presents a unified treatment of single- and multi-user
problems in Shannon's information theory where we depart from the requirement
that the error probability decays asymptotically in the blocklength. Instead,
the error probabilities for various problems are bounded above by a
non-vanishing constant and the spotlight is shone on achievable coding rates as
functions of the growing blocklengths. This represents the study of asymptotic
estimates with non-vanishing error probabilities.
In Part I, after reviewing the fundamentals of information theory, we discuss
Strassen's seminal result for binary hypothesis testing where the type-I error
probability is non-vanishing and the rate of decay of the type-II error
probability with growing number of independent observations is characterized.
In Part II, we use this basic hypothesis testing result to develop second-order
and sometimes even third-order asymptotic expansions for point-to-point
communication. In Part III, we consider network information theory
problems for which the second-order asymptotics are known. These problems
include some classes of channels with random state, the multiple-encoder
distributed lossless source coding (Slepian-Wolf) problem and special cases of
the Gaussian interference and multiple-access channels. Finally, we discuss
avenues for further research.
Comment: Further comments welcome
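As a rough guide to the kind of expansion Part I refers to, the hypothesis testing result can be written as below. This is a sketch in assumed notation, none of which appears in the abstract: \beta_n(\varepsilon) is the smallest type-II error probability over tests on n i.i.d. observations whose type-I error is at most \varepsilon, D(P\|Q) is the relative entropy, V(P\|Q) is the variance of the log-likelihood ratio under P (assumed positive and finite), and \Phi is the standard normal distribution function.

```latex
\[
  -\log \beta_n(\varepsilon)
  \;=\; n\,D(P\|Q) \;+\; \sqrt{n\,V(P\|Q)}\;\Phi^{-1}(\varepsilon) \;+\; O(\log n),
  \qquad
  V(P\|Q) \;=\; \operatorname{Var}_{P}\!\left[\log\frac{dP}{dQ}(X)\right].
\]
```

The first term is the exponent given by Stein's lemma; the second term is the second-order correction that survives when \varepsilon is held fixed rather than driven to zero.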
Data compression with low distortion and finite blocklength
This paper considers lossy source coding of n-dimensional continuous memoryless sources with low mean-square error distortion and shows a simple, explicit approximation to the minimum source coding rate. More precisely, a nonasymptotic version of Shannon's lower bound is presented. Lattice quantizers are shown to approach that lower bound, provided that the source density is smooth enough and the distortion is low, which implies that fine multidimensional lattice coverings are nearly optimal in the rate-distortion sense even at finite n. The achievability proof technique avoids both the usual random coding argument and the simplifying assumption of the presence of a dither signal.
Concentration of Measure Inequalities in Information Theory, Communications and Coding (Second Edition)
During the last two decades, concentration inequalities have been the subject
of exciting developments in various areas, including convex geometry,
functional analysis, statistical physics, high-dimensional statistics, pure and
applied probability theory, information theory, theoretical computer science,
and learning theory. This monograph focuses on some of the key modern
mathematical tools that are used for the derivation of concentration
inequalities, on their links to information theory, and on their various
applications to communications and coding. In addition to being a survey, this
monograph also includes various new recent results derived by the authors. The
first part of the monograph introduces classical concentration inequalities for
martingales, as well as some recent refinements and extensions. The power and
versatility of the martingale approach is exemplified in the context of codes
defined on graphs and iterative decoding algorithms, as well as codes for
wireless communication. The second part of the monograph introduces the entropy
method, an information-theoretic technique for deriving concentration
inequalities. The basic ingredients of the entropy method are discussed first
in the context of logarithmic Sobolev inequalities, which underlie the
so-called functional approach to concentration of measure, and then from a
complementary information-theoretic viewpoint based on transportation-cost
inequalities and probability in metric spaces. Some representative results on
concentration for dependent random variables are briefly summarized, with
emphasis on their connections to the entropy method. Finally, we discuss
several applications of the entropy method to problems in communications and
coding, including strong converses, empirical distributions of good channel
codes, and an information-theoretic converse for concentration of measure.
Comment: Foundations and Trends in Communications and Information Theory, vol.
10, no. 1-2, pp. 1-248, 2013. Second edition was published in October 2014.
ISBN to printed book: 978-1-60198-906-
Community detection and stochastic block models: recent developments
The stochastic block model (SBM) is a random graph model with planted
clusters. It is widely employed as a canonical model to study clustering and
community detection, and generally provides fertile ground to study the
statistical and computational tradeoffs that arise in network and data
sciences.
This note surveys the recent developments that establish the fundamental
limits for community detection in the SBM, both with respect to
information-theoretic and computational thresholds, and for various recovery
requirements such as exact, partial and weak recovery (a.k.a., detection). The
main results discussed are the phase transitions for exact recovery at the
Chernoff-Hellinger threshold, the phase transition for weak recovery at the
Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial
recovery, the learning of the SBM parameters and the gap between
information-theoretic and computational thresholds.
The note also covers some of the algorithms developed in the quest to
achieve these limits, in particular two-round algorithms via graph-splitting,
semi-definite programming, linearized belief propagation, and classical and
nonbacktracking spectral methods. A few open problems are also discussed.
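To make "a random graph model with planted clusters" concrete, here is a minimal sketch of sampling from a stochastic block model, using only the Python standard library. The function name `sample_sbm` and the parameter names `p_in` and `q_out` are illustrative assumptions, not notation from the survey, and the sketch covers only the simplest case in which the edge probability depends solely on whether two vertices share a community.

```python
import random
from itertools import combinations


def sample_sbm(labels, p_in, q_out, seed=0):
    """Sample an undirected graph from a simple stochastic block model.

    labels[i] is the planted community of vertex i. Each unordered pair
    {u, v} becomes an edge independently, with probability p_in if the
    two vertices share a community and q_out otherwise.
    (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)
    edges = set()
    for u, v in combinations(range(len(labels)), 2):
        prob = p_in if labels[u] == labels[v] else q_out
        if rng.random() < prob:
            edges.add((u, v))  # stored with u < v
    return edges


if __name__ == "__main__":
    # Two planted communities of 50 vertices each.
    labels = [0] * 50 + [1] * 50
    edges = sample_sbm(labels, p_in=0.3, q_out=0.05, seed=1)
    within = sum(labels[u] == labels[v] for u, v in edges)
    print(f"{len(edges)} edges, {within} within communities")
```

Community detection is the inverse problem: given only `edges`, recover `labels` exactly, partially, or better than chance, depending on the recovery requirement.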
Data Compression with Low Distortion and Finite Blocklength
This paper considers lossy source coding of n-dimensional memoryless sources and shows an explicit approximation to the minimum source coding rate required to keep the probability of exceeding distortion d no greater than ϵ, which is simpler than known dispersion-based approximations. Our approach takes inspiration from the celebrated classical result stating that the Shannon lower bound to the rate-distortion function becomes tight in the limit d → 0. We formulate an abstract version of the Shannon lower bound that recovers both the classical Shannon lower bound and the rate-distortion function itself as special cases. Likewise, we show that a nonasymptotic version of the abstract Shannon lower bound recovers all previously known nonasymptotic converses. A necessary and sufficient condition for the Shannon lower bound to be attained exactly is presented. It is demonstrated that whenever that condition is met, the rate-dispersion function is given simply by the varentropy of the source. Remarkably, all finite alphabet sources with balanced distortion measures satisfy that condition in the range of low distortions. Most continuous sources violate that condition. Still, we show that lattice quantizers closely approach the nonasymptotic Shannon lower bound, provided that the source density is smooth enough and the distortion is low. This implies that fine multidimensional lattice coverings are nearly optimal in the rate-distortion sense even at finite n. The achievability proof technique is based on a new bound on the output entropy of lattice quantizers in terms of the differential entropy of the source, the lattice cell size, and a smoothness parameter of the source density. The technique avoids both the usual random coding argument and the simplifying assumption of the presence of a dither signal.
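For orientation, the classical Shannon lower bound that the abstract builds on can be stated for the familiar special case of a real-valued source under mean-square error distortion. This is a sketch in assumed notation, not the paper's abstract formulation: X is a source with a density and finite differential entropy h(X), R(d) is its rate-distortion function, and d is the per-sample distortion.

```latex
\[
  R(d) \;\ge\; h(X) \;-\; \tfrac{1}{2}\log\!\left(2\pi e\, d\right),
  \qquad d > 0 .
\]
```

The right-hand side is the differential entropy of the source minus that of a Gaussian with variance d. Under regularity conditions on the density, the gap between the two sides vanishes as d → 0, which is the tightness property mentioned above.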
On the Reliability Function of Distributed Hypothesis Testing Under Optimal Detection
The distributed hypothesis testing problem with full side-information is
studied. The trade-off (reliability function) between the two types of error
exponents under limited rate is studied in the following way. First, the
problem is reduced to the problem of determining the reliability function of
channel codes designed for detection (in analogy to a similar result which
connects the reliability function of distributed lossless compression and
ordinary channel codes). Second, a single-letter random-coding bound based on a
hierarchical ensemble, as well as a single-letter expurgated bound, are derived
for the reliability of channel-detection codes. Both bounds are derived for a
system which employs the optimal detection rule. We conjecture that the
resulting random-coding bound is ensemble-tight, and consequently optimal
within the class of quantization-and-binning schemes.
Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization
The affine rank minimization problem consists of finding a matrix of minimum
rank that satisfies a given system of linear equality constraints. Such
problems have appeared in the literature of a diverse set of fields including
system identification and control, Euclidean embedding, and collaborative
filtering. Although specific instances can often be solved with specialized
algorithms, the general affine rank minimization problem is NP-hard. In this
paper, we show that if a certain restricted isometry property holds for the
linear transformation defining the constraints, the minimum rank solution can
be recovered by solving a convex optimization problem, namely the minimization
of the nuclear norm over the given affine space. We present several random
ensembles of equations where the restricted isometry property holds with
overwhelming probability. The techniques used in our analysis have strong
parallels in the compressed sensing framework. We discuss how affine rank
minimization generalizes this pre-existing concept and outline a dictionary
relating concepts from cardinality minimization to those of rank minimization.
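The two optimization problems described in this abstract can be written side by side as follows. The symbols are assumptions made for illustration: \mathcal{A} denotes the linear map defining the constraints, b the right-hand side, and \sigma_i(X) the singular values of X.

```latex
\[
  \text{(rank minimization, NP-hard in general)}\qquad
  \min_{X}\ \operatorname{rank}(X)
  \quad\text{subject to}\quad \mathcal{A}(X) = b ,
\]
\[
  \text{(convex relaxation)}\qquad
  \min_{X}\ \|X\|_{*}
  \quad\text{subject to}\quad \mathcal{A}(X) = b ,
  \qquad
  \|X\|_{*} \;=\; \sum_{i} \sigma_i(X).
\]
```

When X is restricted to diagonal matrices, the rank becomes the number of nonzero diagonal entries and the nuclear norm becomes the \ell_1 norm of the diagonal, which is the sense in which rank minimization generalizes cardinality minimization in compressed sensing.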