157 research outputs found

    About Adaptive Coding on Countable Alphabets: Max-Stable Envelope Classes

    Full text link
    In this paper, we study the problem of lossless universal source coding for stationary memoryless sources on countably infinite alphabets. This task is generally not achievable without restricting the class of sources over which universality is desired. Building on our prior work, we propose natural families of sources characterized by a common dominating envelope. We particularly emphasize the notion of adaptivity, which is the ability to perform as well as an oracle knowing the envelope, without actually knowing it. This is closely related to the notion of hierarchical universal source coding, but with the important difference that families of envelope classes are not discretely indexed and not necessarily nested. Our contribution is to extend the classes of envelopes over which adaptive universal source coding is possible, namely by including max-stable (heavy-tailed) envelopes which are excellent models in many applications, such as natural language modeling. We derive a minimax lower bound on the redundancy of any code on such envelope classes, including an oracle that knows the envelope. We then propose a constructive code that does not use knowledge of the envelope. The code is computationally efficient and is structured to use an {E}xpanding {T}hreshold for {A}uto-{C}ensoring, and we therefore dub it the \textsc{ETAC}-code. We prove that the \textsc{ETAC}-code achieves the lower bound on the minimax redundancy within a factor logarithmic in the sequence length, and can be therefore qualified as a near-adaptive code over families of heavy-tailed envelopes. For finite and light-tailed envelopes the penalty is even less, and the same code follows closely previous results that explicitly made the light-tailed assumption. Our technical results are founded on methods from regular variation theory and concentration of measure

    Universal Source Coding in the Non-Asymptotic Regime

    Get PDF
    abstract: Fundamental limits of fixed-to-variable (F-V) and variable-to-fixed (V-F) length universal source coding at short blocklengths is characterized. For F-V length coding, the Type Size (TS) code has previously been shown to be optimal up to the third-order rate for universal compression of all memoryless sources over finite alphabets. The TS code assigns sequences ordered based on their type class sizes to binary strings ordered lexicographically. Universal F-V coding problem for the class of first-order stationary, irreducible and aperiodic Markov sources is first considered. Third-order coding rate of the TS code for the Markov class is derived. A converse on the third-order coding rate for the general class of F-V codes is presented which shows the optimality of the TS code for such Markov sources. This type class approach is then generalized for compression of the parametric sources. A natural scheme is to define two sequences to be in the same type class if and only if they are equiprobable under any model in the parametric class. This natural approach, however, is shown to be suboptimal. A variation of the Type Size code is introduced, where type classes are defined based on neighborhoods of minimal sufficient statistics. Asymptotics of the overflow rate of this variation is derived and a converse result establishes its optimality up to the third-order term. These results are derived for parametric families of i.i.d. sources as well as Markov sources. Finally, universal V-F length coding of the class of parametric sources is considered in the short blocklengths regime. The proposed dictionary which is used to parse the source output stream, consists of sequences in the boundaries of transition from low to high quantized type complexity, hence the name Type Complexity (TC) code. For large enough dictionary, the Ï”\epsilon-coding rate of the TC code is derived and a converse result is derived showing its optimality up to the third-order term.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

    Comparing dynamic equilibrium economies to data

    Get PDF
    This paper studies the properties of the Bayesian approach to estimation and comparison of dynamic equilibrium economies. Both tasks can be performed even if the models are nonnested, misspecified, and nonlinear. First, the authors show that Bayesian methods have a classical interpretation: asymptotically the parameter point estimates converge to their pseudotrue values, and the best model under the Kullback-Leibler will have the highest posterior probability. Second, they illustrate the strong small sample behavior of the approach using a well-known application: the U.S. cattle cycle. Bayesian estimates outperform maximum likelihood results, and the proposed model is easily compared with a set of BVARs.Econometric models

    Asymptotic Estimates in Information Theory with Non-Vanishing Error Probabilities

    Full text link
    This monograph presents a unified treatment of single- and multi-user problems in Shannon's information theory where we depart from the requirement that the error probability decays asymptotically in the blocklength. Instead, the error probabilities for various problems are bounded above by a non-vanishing constant and the spotlight is shone on achievable coding rates as functions of the growing blocklengths. This represents the study of asymptotic estimates with non-vanishing error probabilities. In Part I, after reviewing the fundamentals of information theory, we discuss Strassen's seminal result for binary hypothesis testing where the type-I error probability is non-vanishing and the rate of decay of the type-II error probability with growing number of independent observations is characterized. In Part II, we use this basic hypothesis testing result to develop second- and sometimes, even third-order asymptotic expansions for point-to-point communication. Finally in Part III, we consider network information theory problems for which the second-order asymptotics are known. These problems include some classes of channels with random state, the multiple-encoder distributed lossless source coding (Slepian-Wolf) problem and special cases of the Gaussian interference and multiple-access channels. Finally, we discuss avenues for further research.Comment: Further comments welcom

    On the Compression of Unknown Sources

    Get PDF
    Ph.D. Thesis. University of Hawaiʻi at Mānoa 2018
    • 

    corecore