Generalizations of Fano's Inequality for Conditional Information Measures via Majorization Theory
Fano's inequality is one of the most elementary, ubiquitous, and important
tools in information theory. Using majorization theory, Fano's inequality is
generalized to a broad class of information measures, which contains those of
Shannon and Rényi. When specialized to these measures, it recovers and
generalizes the classical inequalities. Key to the derivation is the
construction of an appropriate conditional distribution inducing a desired
marginal distribution on a countably infinite alphabet. The construction is
based on the infinite-dimensional version of Birkhoff's theorem proven by
Révész [Acta Math. Hungar. 1962, 3, 188–198], and the
constraint of maintaining a desired marginal distribution is similar to
coupling in probability theory. Using our Fano-type inequalities for Shannon's
and Rényi's information measures, we also investigate the asymptotic
behavior of the sequence of Shannon's and Rényi's equivocations when the
error probabilities vanish. This asymptotic behavior provides a novel
characterization of the asymptotic equipartition property (AEP) via Fano's
inequality. Comment: 44 pages, 3 figures
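For reference, the classical Shannon-entropy form of Fano's inequality that the abstract generalizes can be stated as:

```latex
H(X \mid Y) \le h(P_e) + P_e \log\bigl(|\mathcal{X}| - 1\bigr),
\qquad P_e := \Pr\bigl[\hat{X}(Y) \neq X\bigr],
```

where $h(\cdot)$ is the binary entropy function and $\mathcal{X}$ is a finite alphabet. On a countably infinite alphabet the cardinality term is unbounded, which is one motivation for the majorization-based construction of conditional distributions described above.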
Divergence Measures
Data science, information theory, probability theory, statistical learning and other related disciplines greatly benefit from non-negative measures of dissimilarity between pairs of probability measures. These are known as divergence measures, and exploring their mathematical foundations and diverse applications is of significant interest. The present Special Issue, entitled “Divergence Measures: Mathematical Foundations and Applications in Information-Theoretic and Statistical Problems”, includes eight original contributions, and it is focused on the study of the mathematical properties and applications of classical and generalized divergence measures from an information-theoretic perspective. It mainly deals with two key generalizations of the relative entropy: namely, the Rényi divergence and the important class of f-divergences. It is our hope that the readers will find interest in this Special Issue, which will stimulate further research in the study of the mathematical foundations and applications of divergence measures.
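For concreteness, the two generalizations of the relative entropy named above are, for discrete probability distributions $P$ and $Q$:

```latex
D_\alpha(P \| Q) = \frac{1}{\alpha - 1}\,\log \sum_{x} P(x)^{\alpha}\, Q(x)^{1-\alpha},
\qquad
D_f(P \| Q) = \sum_{x} Q(x)\, f\!\left(\frac{P(x)}{Q(x)}\right),
```

where $\alpha \in (0,1) \cup (1,\infty)$ and $f$ is convex with $f(1) = 0$; both reduce to the relative entropy $D(P \| Q) = \sum_x P(x) \log \tfrac{P(x)}{Q(x)}$ in the limit $\alpha \to 1$ and for $f(t) = t \log t$, respectively.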
A Simple and Tighter Derivation of Achievability for Classical Communication over Quantum Channels
Achievability in information theory refers to demonstrating a coding strategy
that accomplishes a prescribed performance benchmark for the underlying task.
In quantum information theory, the crafted Hayashi-Nagaoka operator inequality
is an essential technique in proving a wealth of one-shot achievability bounds
since it effectively resembles a union bound in various problems. In this work,
we show that the pretty-good measurement naturally plays a role as the union
bound as well. A judicious application of it considerably simplifies the
derivation of one-shot achievability for classical-quantum (c-q) channel coding
via an elegant three-line proof.
The proposed analysis enjoys the following favorable features: (i) The
established one-shot bound admits a closed-form expression as in the celebrated
Holevo-Helstrom Theorem. Namely, the average error probability of sending
messages through a c-q channel is upper bounded by the error of distinguishing
the joint state between channel input and output against many products
of its marginals. (ii) Our bound directly yields asymptotic results in the
large deviation, small deviation, and moderate deviation regimes in a unified
manner. (iii) The coefficients incurred in applying the Hayashi-Nagaoka
operator inequality are no longer needed. Hence, the derived one-shot bound
sharpens existing results that rely on the Hayashi-Nagaoka operator inequality.
In particular, we obtain the tightest achievable ε-one-shot capacity
for c-q channels heretofore, and it improves the third-order coding rate in the
asymptotic scenario. (iv) Our result holds for infinite-dimensional Hilbert
space. (v) The proposed method applies to deriving one-shot bounds for data
compression with quantum side information, entanglement-assisted classical
communication over quantum channels, and various quantum network
information-processing protocols.
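As an illustrative aside (not code from the paper), the pretty-good measurement central to the analysis above is straightforward to compute for finite-dimensional states. A minimal numerical sketch, assuming a hypothetical two-state ensemble of equiprobable qubit states |0⟩⟨0| and |+⟩⟨+|:

```python
import numpy as np

# Hypothetical example (our choice, not from the paper): distinguish two
# equiprobable qubit states with the pretty-good measurement (PGM).
rho0 = np.array([[1.0, 0.0], [0.0, 0.0]], dtype=complex)
plus = np.array([[1.0], [1.0]], dtype=complex) / np.sqrt(2.0)
rho1 = plus @ plus.conj().T
priors = [0.5, 0.5]
states = [rho0, rho1]

# Average state S = sum_i p_i rho_i, and its inverse square root via a
# Hermitian eigendecomposition (S is positive definite in this example).
S = sum(p * r for p, r in zip(priors, states))
w, V = np.linalg.eigh(S)
S_inv_sqrt = V @ np.diag(w ** -0.5) @ V.conj().T

# PGM POVM elements E_i = S^{-1/2} (p_i rho_i) S^{-1/2}; they sum to I.
povm = [S_inv_sqrt @ (p * r) @ S_inv_sqrt for p, r in zip(priors, states)]

# Average success probability: sum_i p_i Tr(E_i rho_i).
p_succ = sum(
    p * np.trace(E @ r).real for p, E, r in zip(priors, povm, states)
)
print(p_succ)
```

For two equiprobable pure states the PGM is known to perform well (its error is at most twice the optimal Holevo-Helstrom error), which the computed success probability reflects.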
A Brief Introduction to Machine Learning for Engineers
This monograph aims at providing an introduction to key concepts, algorithms,
and theoretical results in machine learning. The treatment concentrates on
probabilistic models for supervised and unsupervised learning problems. It
introduces fundamental concepts and algorithms by building on first principles,
while also exposing the reader to more advanced topics with extensive pointers
to the literature, within a unified notation and mathematical framework. The
material is organized according to clearly defined categories, such as
discriminative and generative models, frequentist and Bayesian approaches,
exact and approximate inference, as well as directed and undirected models.
This monograph is meant as an entry point for researchers with a background in
probability and linear algebra. Comment: This is an expanded and improved version of the original posting.
Feedback is welcome.
Information Theory and Machine Learning
The recent successes of machine learning, especially regarding systems based on deep neural networks, have encouraged further research activities and raised a new set of challenges in understanding and designing complex machine learning algorithms. New applications require learning algorithms to be distributed, have transferable learning results, use computation resources efficiently, converge quickly in online settings, have performance guarantees, satisfy fairness or privacy constraints, incorporate domain knowledge on model structures, etc. A new wave of developments in statistical learning theory and information theory has set out to address these challenges. This Special Issue, "Machine Learning and Information Theory", aims to collect recent results in this direction reflecting a diverse spectrum of visions and efforts to extend conventional theories and develop analysis tools for these complex machine learning systems.