
    Generalizations of Fano's Inequality for Conditional Information Measures via Majorization Theory

    Fano's inequality is one of the most elementary, ubiquitous, and important tools in information theory. Using majorization theory, Fano's inequality is generalized to a broad class of information measures, which contains those of Shannon and Rényi. When specialized to these measures, it recovers and generalizes the classical inequalities. Key to the derivation is the construction of an appropriate conditional distribution inducing a desired marginal distribution on a countably infinite alphabet. The construction is based on the infinite-dimensional version of Birkhoff's theorem proven by Révész [Acta Math. Hungar. 1962, 3, 188–198], and the constraint of maintaining a desired marginal distribution is similar to coupling in probability theory. Using our Fano-type inequalities for Shannon's and Rényi's information measures, we also investigate the asymptotic behavior of the sequence of Shannon's and Rényi's equivocations when the error probabilities vanish. This asymptotic behavior provides a novel characterization of the asymptotic equipartition property (AEP) via Fano's inequality. Comment: 44 pages, 3 figures
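
    For reference, the classical finite-alphabet form of Fano's inequality that this work generalizes (a standard statement; the paper's versions extend it to countably infinite alphabets and to Rényi-type measures) reads

        H(X \mid Y) \le h_b(P_e) + P_e \log(|\mathcal{X}| - 1),

    where P_e = \Pr[\hat{X}(Y) \ne X] is the error probability of an estimator \hat{X} of X from Y, h_b denotes the binary entropy function, and \mathcal{X} is the finite alphabet of X.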

    Divergence Measures

    Data science, information theory, probability theory, statistical learning and other related disciplines greatly benefit from non-negative measures of dissimilarity between pairs of probability measures. These are known as divergence measures, and exploring their mathematical foundations and diverse applications is of significant interest. The present Special Issue, entitled “Divergence Measures: Mathematical Foundations and Applications in Information-Theoretic and Statistical Problems”, includes eight original contributions, and it is focused on the study of the mathematical properties and applications of classical and generalized divergence measures from an information-theoretic perspective. It mainly deals with two key generalizations of the relative entropy: namely, the Rényi divergence and the important class of f-divergences. We hope that readers will find this Special Issue of interest and that it will stimulate further research into the mathematical foundations and applications of divergence measures.
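
    As a concrete illustration of the two generalizations of relative entropy highlighted above (an illustrative sketch, not code from the Special Issue; the function names are ours), the relative entropy and the Rényi divergence of order alpha between discrete distributions can be computed as follows:

        import numpy as np

        def relative_entropy(p, q):
            # D(p||q) in nats; assumes q[i] > 0 wherever p[i] > 0, with 0*log 0 = 0
            p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
            mask = p > 0
            return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

        def renyi_divergence(p, q, alpha):
            # Renyi divergence D_alpha(p||q) for alpha > 0, alpha != 1;
            # it recovers the relative entropy D(p||q) in the limit alpha -> 1
            p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
            return float(np.log(np.sum(p**alpha * q**(1.0 - alpha))) / (alpha - 1.0))

        # Example: two distributions on a three-letter alphabet
        p, q = [0.5, 0.3, 0.2], [0.4, 0.4, 0.2]
        print(relative_entropy(p, q), renyi_divergence(p, q, alpha=2.0))

    The two families are linked: relative entropy is itself the f-divergence generated by f(t) = t log t.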

    A Simple and Tighter Derivation of Achievability for Classical Communication over Quantum Channels

    Achievability in information theory refers to demonstrating a coding strategy that accomplishes a prescribed performance benchmark for the underlying task. In quantum information theory, the crafted Hayashi-Nagaoka operator inequality is an essential technique in proving a wealth of one-shot achievability bounds, since it effectively resembles a union bound in various problems. In this work, we show that the pretty-good measurement naturally plays the role of a union bound as well. A judicious application of it considerably simplifies the derivation of one-shot achievability for classical-quantum (c-q) channel coding via an elegant three-line proof. The proposed analysis enjoys the following favorable features: (i) The established one-shot bound admits a closed-form expression as in the celebrated Holevo-Helstrom Theorem. Namely, the average error probability of sending M messages through a c-q channel is upper bounded by the error of distinguishing the joint state between channel input and output against (M-1)-many products of its marginals. (ii) Our bound directly yields asymptotic results in the large deviation, small deviation, and moderate deviation regimes in a unified manner. (iii) The coefficients incurred in applying the Hayashi-Nagaoka operator inequality are no longer needed. Hence, the derived one-shot bound sharpens existing results that rely on the Hayashi-Nagaoka operator inequality. In particular, we obtain the tightest achievable ε-one-shot capacity for c-q channels heretofore, and it improves the third-order coding rate in the asymptotic scenario. (iv) Our result holds for infinite-dimensional Hilbert spaces. (v) The proposed method applies to deriving one-shot bounds for data compression with quantum side information, entanglement-assisted classical communication over quantum channels, and various quantum network information-processing protocols
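
    To make the pretty-good measurement itself more tangible, here is a minimal numerical sketch (illustrative only; it constructs the standard square-root/pretty-good measurement for a small ensemble of states and estimates its success probability, not the paper's one-shot achievability bound):

        import numpy as np

        def inv_sqrt(S, tol=1e-12):
            # S^{-1/2} on the support of the Hermitian PSD matrix S, via eigendecomposition
            vals, vecs = np.linalg.eigh(S)
            inv_sqrt_vals = np.where(vals > tol, 1.0 / np.sqrt(np.clip(vals, tol, None)), 0.0)
            return vecs @ np.diag(inv_sqrt_vals) @ vecs.conj().T

        def pretty_good_measurement(priors, states):
            # POVM elements S^{-1/2} (p_i rho_i) S^{-1/2} for the ensemble {p_i, rho_i},
            # where S = sum_i p_i rho_i is the average state
            S = sum(p * rho for p, rho in zip(priors, states))
            R = inv_sqrt(S)
            return [R @ (p * rho) @ R for p, rho in zip(priors, states)]

        def success_probability(priors, states):
            # Average probability of correctly identifying the state with this measurement
            povm = pretty_good_measurement(priors, states)
            return float(np.real(sum(p * np.trace(E @ rho)
                                     for p, E, rho in zip(priors, povm, states))))

        # Example: distinguish the pure qubit states |0> and |+> with equal priors
        rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
        rho_plus = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)
        print(success_probability([0.5, 0.5], [rho0, rho_plus]))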

    A Brief Introduction to Machine Learning for Engineers

    This monograph aims at providing an introduction to key concepts, algorithms, and theoretical results in machine learning. The treatment concentrates on probabilistic models for supervised and unsupervised learning problems. It introduces fundamental concepts and algorithms by building on first principles, while also exposing the reader to more advanced topics with extensive pointers to the literature, within a unified notation and mathematical framework. The material is organized according to clearly defined categories, such as discriminative and generative models, frequentist and Bayesian approaches, exact and approximate inference, as well as directed and undirected models. This monograph is meant as an entry point for researchers with a background in probability and linear algebra. Comment: This is an expanded and improved version of the original posting. Feedback is welcome

    Information Theory and Machine Learning

    The recent successes of machine learning, especially regarding systems based on deep neural networks, have encouraged further research activities and raised a new set of challenges in understanding and designing complex machine learning algorithms. New applications require learning algorithms to be distributed, have transferable learning results, use computation resources efficiently, converge quickly in online settings, have performance guarantees, satisfy fairness or privacy constraints, incorporate domain knowledge on model structures, etc. A new wave of developments in statistical learning theory and information theory has set out to address these challenges. This Special Issue, "Machine Learning and Information Theory", aims to collect recent results in this direction, reflecting a diverse spectrum of visions and efforts to extend conventional theories and develop analysis tools for these complex machine learning systems.