101 research outputs found

    Rate distortion functions of countably infinite alphabet memoryless sources

    The Shannon lower bound approach to the evaluation of rate distortion functions R(D) for countably infinite alphabet memoryless sources is considered. Sufficient conditions, based on the Contraction Mapping Theorem, are obtained for the existence of the Shannon lower bound RL(D) to R(D) in a region of distortion [0, D1], D1 > 0. Sufficient conditions, based on the Schauder Fixed Point Theorem, are derived for the existence of a Dc > 0 such that R(D) = RL(D) for all D ∈ [0, Dc]. Explicit evaluation of R(D) is considered for a class of column-balanced distortion measures. Other results, for distortion measures with no symmetry conditions, are also discussed.
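The abstract above concerns analytic conditions on R(D); as a numerical companion, a point on the rate-distortion curve of a discrete memoryless source can be computed with the classical Blahut-Arimoto iteration. The sketch below (not from the paper; a countable alphabet would first be truncated to a finite one) illustrates this for a binary symmetric source under Hamming distortion, where R(D) = ln 2 - Hb(D) is known in closed form.

```python
import numpy as np

def blahut_arimoto(p, d, beta, n_iter=200):
    """Blahut-Arimoto iteration for one point (D, R) on the
    rate-distortion curve of a discrete memoryless source.

    p    : source distribution (length m)
    d    : m x k distortion matrix, d[i, j] = d(x_i, y_j)
    beta : Lagrange multiplier (slope parameter, beta > 0)
    Returns (D, R) in nats.
    """
    m, k = d.shape
    q = np.full(k, 1.0 / k)        # output marginal, initialised uniform
    A = np.exp(-beta * d)          # precomputed kernel exp(-beta d)
    for _ in range(n_iter):
        # conditional q(y|x) proportional to q(y) exp(-beta d(x, y))
        w = q * A
        w /= w.sum(axis=1, keepdims=True)
        q = p @ w                  # update the output marginal
    D = float(np.sum(p[:, None] * w * d))
    R = float(np.sum(p[:, None] * w * np.log(w / q)))
    return D, R

# Binary symmetric source, Hamming distortion: R(D) = ln 2 - Hb(D)
p = np.array([0.5, 0.5])
d = np.array([[0.0, 1.0], [1.0, 0.0]])
D, R = blahut_arimoto(p, d, beta=2.0)
```

By symmetry the output marginal stays uniform here, so the iteration converges immediately; the returned pair matches the closed-form curve.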

    A Generalized Typicality for Abstract Alphabets

    A new notion of typicality for arbitrary probability measures on standard Borel spaces is proposed, which encompasses the classical notions of weak and strong typicality as special cases. Useful lemmas about strong typical sets, including the conditional typicality lemma, the joint typicality lemma, and the packing and covering lemmas, which are fundamental tools for deriving many inner bounds for multi-terminal coding problems, are obtained in terms of the proposed notion. This enables many results on finite-alphabet problems to be generalized directly to problems involving abstract alphabets, without complicated additional arguments; for instance, a quantization procedure is no longer necessary to achieve such generalizations. Another fundamental lemma, the Markov lemma, is also obtained, but its scope of application is quite limited compared to the others. An alternative theory of typical sets for Gaussian measures, free from this limitation, is also developed. Some remarks on a possible generalization of the proposed notion to sources with memory are also given. Comment: 44 pages; submitted to IEEE Transactions on Information Theory

    A vector quantization approach to universal noiseless coding and quantization

    A two-stage code is a block code in which each block of data is coded in two stages: the first stage codes the identity of a block code among a collection of codes, and the second stage codes the data using the identified code. The collection of codes may be noiseless codes, fixed-rate quantizers, or variable-rate quantizers. We take a vector quantization approach to two-stage coding, in which the first stage code can be regarded as a vector quantizer that “quantizes” the input data of length n to one of a fixed collection of block codes. We apply the generalized Lloyd algorithm to the first-stage quantizer, using induced measures of rate and distortion, to design locally optimal two-stage codes. On a source of medical images, two-stage variable-rate vector quantizers designed in this way outperform standard (one-stage) fixed-rate vector quantizers by over 9 dB. The tail of the operational distortion-rate function of the first-stage quantizer determines the optimal rate of convergence of the redundancy of a universal sequence of two-stage codes. We show that there exist two-stage universal noiseless codes, fixed-rate quantizers, and variable-rate quantizers whose per-letter rate and distortion redundancies converge to zero as (k/2) n^(-1) log n, when the universe of sources has finite dimension k. This extends the achievability part of Rissanen's theorem from universal noiseless codes to universal quantizers. Further, we show that the redundancies converge as O(n^(-1)) when the universe of sources is countable, and as O(n^(-1+ε)) when the universe of sources is infinite-dimensional, under appropriate conditions.
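The generalized Lloyd algorithm mentioned above alternates a minimum-distortion partition with a centroid update. The following is a minimal one-stage sketch under squared-error distortion (the paper applies the same alternation with induced rate-distortion measures to the first-stage quantizer; all names here are illustrative):

```python
import numpy as np

def lloyd_vq(data, num_codewords, n_iter=50, seed=0):
    """Generalized Lloyd algorithm for a fixed-rate vector quantizer
    under squared-error distortion (illustrative sketch).

    data : (N, dim) array of training vectors
    Returns (codebook, mean distortion per vector).
    """
    rng = np.random.default_rng(seed)
    # initialise the codebook with distinct training vectors
    codebook = data[rng.choice(len(data), num_codewords, replace=False)]
    for _ in range(n_iter):
        # step 1: minimum-distortion (nearest-neighbour) partition
        d2 = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        idx = d2.argmin(axis=1)
        # step 2: centroid condition -- each codeword becomes the
        # mean of the training vectors assigned to its cell
        for j in range(num_codewords):
            cell = data[idx == j]
            if len(cell):
                codebook[j] = cell.mean(axis=0)
    dist = d2.min(axis=1).mean()
    return codebook, dist

rng = np.random.default_rng(1)
data = rng.normal(size=(500, 2))
cb, dist = lloyd_vq(data, num_codewords=8)
```

Each iteration can only decrease the training distortion, which is why the design converges to a locally optimal codebook.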

    Estimation of the Rate-Distortion Function

    Motivated by questions in lossy data compression and by theoretical considerations, we examine the problem of estimating the rate-distortion function of an unknown (not necessarily discrete-valued) source from empirical data. Our focus is the behavior of the so-called "plug-in" estimator, which is simply the rate-distortion function of the empirical distribution of the observed data. Sufficient conditions are given for its consistency, and examples are provided to demonstrate that in certain cases it fails to converge to the true rate-distortion function. The analysis of its performance is complicated by the fact that the rate-distortion function is not continuous in the source distribution; the underlying mathematical problem is closely related to the classical problem of establishing the consistency of maximum likelihood estimators. General consistency results are given for the plug-in estimator applied to a broad class of sources, including all stationary and ergodic ones. A more general class of estimation problems is also considered, arising in the context of lossy data compression when the allowed class of coding distributions is restricted; analogous results are developed for the plug-in estimator in that case. Finally, consistency theorems are formulated for modified (e.g., penalized) versions of the plug-in, and for estimating the optimal reproduction distribution. Comment: 18 pages, no figures [v2: removed an example with an error; corrected typos; a shortened version will appear in IEEE Trans. Inform. Theory]

    Empirical processes, typical sequences and coordinated actions in standard Borel spaces

    This paper proposes a new notion of typical sequences on a wide class of abstract alphabets (so-called standard Borel spaces), which is based on approximations of memoryless sources by empirical distributions uniformly over a class of measurable "test functions." In the finite-alphabet case, we can take all uniformly bounded functions and recover the usual notion of strong typicality (or typicality under the total variation distance). For a general alphabet, however, this function class turns out to be too large, and must be restricted. With this in mind, we define typicality with respect to any Glivenko-Cantelli function class (i.e., a function class that admits a Uniform Law of Large Numbers) and demonstrate its power by giving simple derivations of the fundamental limits on the achievable rates in several source coding scenarios, in which the relevant operational criteria pertain to reproducing empirical averages of a general-alphabet stationary memoryless source with respect to a suitable function class. Comment: 14 pages, 3 pdf figures; accepted to IEEE Transactions on Information Theory
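The definition above measures typicality by the worst-case gap between empirical and true averages over a class of test functions. A minimal finite-alphabet sketch (illustrative names, not the paper's construction): taking the indicator functions of the symbols as the test class recovers typicality under total variation.

```python
import numpy as np

def typicality_deviation(sample, pmf, test_functions):
    """Maximum deviation between empirical and true averages over a
    class of test functions (finite-alphabet illustration).

    sample         : array of symbol indices drawn i.i.d. from pmf
    pmf            : true distribution over {0, ..., m-1}
    test_functions : list of length-m arrays f, read as x -> f[x]
    """
    devs = []
    for f in test_functions:
        emp = f[sample].mean()        # empirical average of f
        true = float(np.dot(pmf, f))  # expectation of f under pmf
        devs.append(abs(emp - true))
    return max(devs)

rng = np.random.default_rng(0)
pmf = np.array([0.5, 0.3, 0.2])
sample = rng.choice(3, size=20000, p=pmf)
# indicator test functions: the deviation is the largest gap
# between empirical and true symbol frequencies
fs = [np.eye(3)[i] for i in range(3)]
dev = typicality_deviation(sample, pmf, fs)
```

By the law of large numbers the deviation shrinks as the sample grows, which is exactly the Glivenko-Cantelli property the paper requires of the test class.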

    Tight Bounds on the Rényi Entropy via Majorization with Applications to Guessing and Compression

    This paper provides tight bounds on the Rényi entropy of a function of a discrete random variable with a finite number of possible values, where the considered function is not one-to-one. To that end, a tight lower bound on the Rényi entropy of a discrete random variable with a finite support is derived as a function of the size of the support, and the ratio of the maximal to minimal probability masses. This work was inspired by the recently published paper by Cicalese et al., which is focused on the Shannon entropy, and it strengthens and generalizes the results of that paper to Rényi entropies of arbitrary positive orders. In view of these generalized bounds and the works by Arikan and Campbell, non-asymptotic bounds are derived for guessing moments and lossless data compression of discrete memoryless sources. Comment: The paper was published in the Entropy journal (special issue on Probabilistic Methods in Information Theory, Hypothesis Testing, and Coding), vol. 20, no. 12, paper no. 896, November 22, 2018. Online available at https://www.mdpi.com/1099-4300/20/12/89
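For reference, the Rényi entropy of order α that the paper bounds can be computed directly; a small sketch (not the paper's bounds, just the definition, with the Shannon entropy as the α → 1 limit) also exhibits the standard facts that it never exceeds the log of the support size and is nonincreasing in α:

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (in nats) of a pmf p.
    alpha == 1 is handled as the Shannon-entropy limit."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                       # drop zero masses
    if np.isclose(alpha, 1.0):
        return float(-np.sum(p * np.log(p)))
    return float(np.log(np.sum(p ** alpha)) / (1.0 - alpha))

p = [0.5, 0.25, 0.125, 0.125]          # dyadic pmf, support size 4
h_half = renyi_entropy(p, 0.5)
h1 = renyi_entropy(p, 1.0)             # Shannon entropy = 1.75 ln 2
h2 = renyi_entropy(p, 2.0)             # collision entropy
```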

    Joint source-channel coding with feedback

    This paper quantifies the fundamental limits of variable-length transmission of a general (possibly analog) source over a memoryless channel with noiseless feedback, under a distortion constraint. We consider excess distortion, average distortion and guaranteed distortion (d-semifaithful codes). In contrast to the asymptotic fundamental limit, a general conclusion is that allowing variable-length codes and feedback leads to a sizable improvement in the fundamental delay-distortion tradeoff. In addition, we investigate the minimum energy required to reproduce k source samples with a given fidelity after transmission over a memoryless Gaussian channel, and we show that the required minimum energy is reduced with feedback and an average (rather than maximal) power constraint. Comment: To appear in IEEE Transactions on Information Theory

    Mismatched Rate-Distortion Theory: Ensembles, Bounds, and General Alphabets

    In this paper, we consider the mismatched rate-distortion problem, in which the encoding is done using a codebook, and the encoder chooses the minimum-distortion codeword according to a mismatched distortion function that differs from the true one. For the case of discrete memoryless sources, we establish achievable rate-distortion bounds using multi-user coding techniques, namely, superposition coding and expurgated parallel coding. We give examples where these attain the matched rate-distortion trade-off but a standard ensemble with independent codewords fails to do so. On the other hand, in contrast with the channel coding counterpart, we show that there are cases where structured codebooks can perform worse than their unstructured counterparts. In addition, in view of the difficulties in adapting the existing and above-mentioned results to general alphabets, we consider a simpler i.i.d. random coding ensemble, and establish its achievable rate-distortion bounds for general alphabets.
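The mismatched encoding rule above can be made concrete in a few lines: the encoder minimizes one distortion function while performance is scored under another. The toy sketch below (illustrative names and distortions, not the paper's ensembles) uses a biased squared-error rule as the mismatched metric; since the minimum-true-distortion partition is optimal for a fixed codebook, the mismatched rule can only do worse.

```python
import numpy as np

def mismatched_vq(x, codebook, d_enc, d_true):
    """Quantize blocks x with a fixed codebook, selecting the codeword
    that minimizes the (possibly mismatched) encoding distortion d_enc,
    and report the average distortion actually incurred under d_true.

    x             : (N, dim) source blocks
    codebook      : (M, dim) codewords
    d_enc, d_true : callables (block, codeword) -> scalar
    """
    total = 0.0
    for blk in x:
        scores = [d_enc(blk, c) for c in codebook]
        j = int(np.argmin(scores))      # mismatched minimum-distortion rule
        total += d_true(blk, codebook[j])
    return total / len(x)

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 1))
codebook = np.array([[-0.8], [0.8]])
sq = lambda a, b: float(((a - b) ** 2).sum())            # true distortion
biased = lambda a, b: float(((a - 0.5 - b) ** 2).sum())  # mismatched rule

d_matched = mismatched_vq(x, codebook, sq, sq)
d_mis = mismatched_vq(x, codebook, biased, sq)
```

Here the bias shifts the decision boundary away from the true nearest-neighbour one, so d_mis is at least d_matched; the gap is the per-codebook price of mismatch.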