About Adaptive Coding on Countable Alphabets: Max-Stable Envelope Classes
In this paper, we study the problem of lossless universal source coding for
stationary memoryless sources on countably infinite alphabets. This task is
generally not achievable without restricting the class of sources over which
universality is desired. Building on our prior work, we propose natural
families of sources characterized by a common dominating envelope. We
particularly emphasize the notion of adaptivity, which is the ability to
perform as well as an oracle knowing the envelope, without actually knowing it.
This is closely related to the notion of hierarchical universal source coding,
but with the important difference that families of envelope classes are not
discretely indexed and not necessarily nested.
Our contribution is to extend the classes of envelopes over which adaptive
universal source coding is possible, namely by including max-stable
(heavy-tailed) envelopes which are excellent models in many applications, such
as natural language modeling. We derive a minimax lower bound on the redundancy
of any code on such envelope classes, including an oracle that knows the
envelope. We then propose a constructive code that does not use knowledge of
the envelope. The code is computationally efficient and is structured to use an
Expanding Threshold for Auto-Censoring, and we therefore dub it the
ETAC-code. We prove that the ETAC-code achieves the lower
bound on the minimax redundancy within a factor logarithmic in the sequence
length, and can therefore be qualified as a near-adaptive code over families of
heavy-tailed envelopes. For finite and light-tailed envelopes the penalty is
even smaller, and the same code closely follows previous results that explicitly
made the light-tailed assumption. Our technical results are founded on methods
from regular variation theory and concentration of measure.
Finite-Length Analysis and Asymptotic Analysis of Source Coding: An Approach Focusing on the Overflow Probability
Waseda University degree registration number: Shin 7784, Waseda University
Universal Source Coding in the Non-Asymptotic Regime
abstract: Fundamental limits of fixed-to-variable (F-V) and variable-to-fixed (V-F) length universal source coding at short blocklengths are characterized. For F-V length coding, the Type Size (TS) code has previously been shown to be optimal up to the third-order rate for universal compression of all memoryless sources over finite alphabets. The TS code assigns sequences ordered based on their type class sizes to binary strings ordered lexicographically.
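As a rough illustration of the TS code described above, the following Python sketch orders all length-n sequences by the size of their type class and maps the i-th sequence to the i-th binary string. Several details are assumptions made for the sketch and are not stated in the abstract: the function names are hypothetical, ties between sequences with equal type class size are broken lexicographically, the binary strings are enumerated shortest first (empty string, 0, 1, 00, ...), and the resulting one-to-one code is not forced to be prefix-free.

```python
from collections import Counter
from itertools import product
from math import factorial


def type_class_size(seq, alphabet):
    """Number of sequences sharing the empirical distribution (type) of seq."""
    counts = Counter(seq)
    size = factorial(len(seq))
    for symbol in alphabet:
        # Sequential division stays exact: the running value is always an integer.
        size //= factorial(counts[symbol])
    return size


def binary_string(index):
    """index-th binary string in shortest-first, then lexicographic order.

    Assumed enumeration: 0 -> "", 1 -> "0", 2 -> "1", 3 -> "00", ...
    """
    return bin(index + 1)[3:]  # drop the "0b" prefix and the leading 1


def type_size_code(n, alphabet=(0, 1)):
    """Map every length-n sequence to a binary string, smallest type classes first."""
    sequences = sorted(
        product(alphabet, repeat=n),
        # Assumed tie-break: lexicographic order within equal type class sizes.
        key=lambda s: (type_class_size(s, alphabet), s),
    )
    return {s: binary_string(i) for i, s in enumerate(sequences)}


if __name__ == "__main__":
    for seq, word in type_size_code(3).items():
        print(seq, repr(word))
```

Under these assumptions, sequences from small type classes (for example, the all-zeros and all-ones sequences) receive the shortest codewords, without any knowledge of the source distribution.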
The universal F-V coding problem for the class of first-order stationary, irreducible and aperiodic Markov sources is first considered. The third-order coding rate of the TS code for the Markov class is derived. A converse on the third-order coding rate for the general class of F-V codes is presented, which shows the optimality of the TS code for such Markov sources.
This type class approach is then generalized for compression of parametric sources. A natural scheme is to define two sequences to be in the same type class if and only if they are equiprobable under any model in the parametric class. This natural approach, however, is shown to be suboptimal. A variation of the Type Size code is introduced, where type classes are defined based on neighborhoods of minimal sufficient statistics. The asymptotics of the overflow rate of this variation are derived, and a converse result establishes its optimality up to the third-order term. These results are derived for parametric families of i.i.d. sources as well as Markov sources.
Finally, universal V-F length coding of the class of parametric sources is considered in the short-blocklength regime. The proposed dictionary, which is used to parse the source output stream, consists of sequences in the boundaries of transition from low to high quantized type complexity, hence the name Type Complexity (TC) code. For a large enough dictionary, the -coding rate of the TC code is derived, and a converse result shows its optimality up to the third-order term.
Dissertation/Thesis. Doctoral Dissertation, Electrical Engineering, 201
Comparing dynamic equilibrium economies to data
This paper studies the properties of the Bayesian approach to estimation and comparison of dynamic equilibrium economies. Both tasks can be performed even if the models are nonnested, misspecified, and nonlinear. First, the authors show that Bayesian methods have a classical interpretation: asymptotically the parameter point estimates converge to their pseudotrue values, and the best model under the Kullback-Leibler distance will have the highest posterior probability. Second, they illustrate the strong small sample behavior of the approach using a well-known application: the U.S. cattle cycle. Bayesian estimates outperform maximum likelihood results, and the proposed model is easily compared with a set of BVARs.
Econometric models
Asymptotic Estimates in Information Theory with Non-Vanishing Error Probabilities
This monograph presents a unified treatment of single- and multi-user
problems in Shannon's information theory where we depart from the requirement
that the error probability decays asymptotically in the blocklength. Instead,
the error probabilities for various problems are bounded above by a
non-vanishing constant and the spotlight is shone on achievable coding rates as
functions of the growing blocklengths. This represents the study of asymptotic
estimates with non-vanishing error probabilities.
In Part I, after reviewing the fundamentals of information theory, we discuss
Strassen's seminal result for binary hypothesis testing where the type-I error
probability is non-vanishing and the rate of decay of the type-II error
probability with growing number of independent observations is characterized.
In Part II, we use this basic hypothesis testing result to develop second- and
sometimes, even third-order asymptotic expansions for point-to-point
communication. Finally in Part III, we consider network information theory
problems for which the second-order asymptotics are known. These problems
include some classes of channels with random state, the multiple-encoder
distributed lossless source coding (Slepian-Wolf) problem and special cases of
the Gaussian interference and multiple-access channels. Finally, we discuss
avenues for further research.
Comment: Further comments welcome
On the Compression of Unknown Sources
Ph.D. Thesis. University of Hawaiʻi at Mānoa 2018