    Universally Typical Sets for Ergodic Sources of Multidimensional Data

    We lift important results about universally typical sets, typically sampled sets, and empirical entropy estimation in the theory of samplings of discrete ergodic information sources from the usual one-dimensional discrete-time setting to a multidimensional lattice setting. We use techniques of packings and coverings with multidimensional windows to construct sequences of multidimensional array sets which in the limit build the generated samples of any ergodic source of entropy rate below an $h_0$ with probability one and whose cardinality grows at most at exponential rate $h_0$. Comment: 15 pages, 1 figure. To appear in Kybernetika. This replacement corrects typos and slightly strengthens the main theorem.
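
    To make the notion of empirical entropy on a lattice concrete, the following is a minimal sketch of a plug-in estimator that counts k-by-k windows in a two-dimensional array. It is not the packing-and-covering construction of the paper; the function name `empirical_block_entropy`, the window size, and the test arrays are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def empirical_block_entropy(array, k):
    """Per-symbol empirical entropy (in bits) of the k-by-k windows of a 2D array."""
    rows, cols = len(array), len(array[0])
    # Count every (overlapping) k-by-k window, stored as a tuple of row tuples.
    counts = Counter(
        tuple(tuple(array[i + di][j:j + k]) for di in range(k))
        for i in range(rows - k + 1)
        for j in range(cols - k + 1)
    )
    total = sum(counts.values())
    block_entropy = -sum(c / total * math.log2(c / total) for c in counts.values())
    # Normalize by the number of lattice sites in one window.
    return block_entropy / (k * k)


rng = random.Random(0)
constant = [[0] * 64 for _ in range(64)]                                  # zero-entropy array
coin_flips = [[rng.randint(0, 1) for _ in range(64)] for _ in range(64)]  # fair coin per site

h_constant = empirical_block_entropy(constant, 2)
h_coin = empirical_block_entropy(coin_flips, 2)
print(h_constant, h_coin)
```

    For the constant array the estimate is zero, and for independent fair bits it comes out close to one bit per site, which is the entropy rate of that source.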

    Displacement autocorrelation functions for strong anomalous diffusion: A scaling form, universal behavior, and corrections to scaling

    Strong anomalous diffusion is characterized by asymptotic power-law growth of the moments of displacement, with exponents that do not depend linearly on the order of the moment. The exponents of low-order moments are dominated by random motion, while higher-order exponents are governed by faster trajectories, such as ballistic excursions or "light fronts." Often such a situation is characterized by two linear dependencies of the exponents on their order. Here, we introduce a simple exactly solvable model, the fly-and-die (FnD) model, that sheds light on this behavior and on the consequences of light fronts on displacement autocorrelation functions in transport processes. We present analytical expressions for the moments and derive a scaling form that expresses the long-time asymptotics of the autocorrelation function in terms of the dimensionless time difference $(t_2 - t_1)/t_1$. The scaling form provides a faithful collapse of numerical data for vastly different systems. This is demonstrated here for the Lorentz gas with infinite horizon, polygonal billiards with finite and infinite horizon, the Lévy-Lorentz gas, the slicer map, and Lévy walks. Our analysis also captures the system-specific corrections to scaling.

    Stochastic and Statistical Methods in Climate, Atmosphere, and Ocean Science

    Introduction. The behavior of the atmosphere, oceans, and climate is intrinsically uncertain. The basic physical principles that govern atmospheric and oceanic flows are well known, for example, the Navier-Stokes equations for fluid flow, thermodynamic properties of moist air, and the effects of density stratification and Coriolis force. Notwithstanding, there are major sources of randomness and uncertainty that prevent perfect prediction and complete understanding of these flows. The climate system involves a wide spectrum of space and time scales, from processes occurring on the order of microns and milliseconds, such as the formation of cloud and rain droplets, to global phenomena involving annual and decadal oscillations, such as the El Niño-Southern Oscillation (ENSO) and the Pacific Decadal Oscillation (PDO) [5]. Moreover, climate records display a spectral variability ranging from 1 cycle per month to 1 cycle per 100,000 years [23]. The complexity of the climate system stems in large part from the inherent nonlinearities of fluid mechanics and the phase changes of water substances. The atmosphere and oceans are turbulent, nonlinear systems that display chaotic behavior (e.g., [39]). The time evolutions of the same chaotic system starting from two slightly different initial states diverge exponentially fast, so that chaotic systems are marked by limited predictability. Beyond the so-called predictability horizon (on the order of 10 days for the atmosphere), initial state uncertainties (e.g., due to imperfect observations) have grown to the point that straightforward forecasts are no longer useful. Another major source of uncertainty stems from the fact that numerical models for atmospheric and oceanic flows cannot describe all relevant physical processes at once. 
These models are in essence discretized partial differential equations (PDEs), and the derivation of suitable PDEs (e.g., the so-called primitive equations) from more general ones that are less convenient for computation (e.g., the full Navier-Stokes equations) involves approximations and simplifications that introduce errors in the equations. Furthermore, as a result of spatial discretization of the PDEs, numerical models have finite resolution so that small-scale processes with length scales below the model grid scale are not resolved. These limitations are unavoidable, leading to model error and uncertainty. The uncertainties due to chaotic behavior and unresolved processes motivate the use of stochastic and statistical methods for modeling and understanding climate, atmosphere, and oceans. Models can be augmented with random elements in order to represent time-evolving uncertainties, leading to stochastic models. Weather forecasts and climate predictions are increasingly expressed in probabilistic terms, making explicit the margins of uncertainty inherent to any prediction.
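
    The exponential divergence of nearby initial states described above can be illustrated with a toy system. The sketch below uses the logistic map at parameter 4, a standard textbook chaotic map that stands in for an atmospheric model purely as an assumption of this example; the function name, the initial states, and the 0.1 threshold are likewise illustrative choices.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x) and return all visited states."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two initial states that differ by one part in 10^10 (an "imperfect observation").
a = logistic_trajectory(0.2, 60)
b = logistic_trajectory(0.2 + 1e-10, 60)

separation = [abs(x - y) for x, y in zip(a, b)]

# First step at which the two forecasts disagree by more than 0.1,
# a stand-in for the predictability horizon of this toy system.
first_large = next((n for n, d in enumerate(separation) if d > 0.1), None)
print(first_large)
```

    Because the separation can grow by at most a factor of 4 per step in this map, the horizon cannot be reached in the first few iterations, yet it is reached well within the 60 steps simulated here.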

    Asymptotic Estimates in Information Theory with Non-Vanishing Error Probabilities

    This monograph presents a unified treatment of single- and multi-user problems in Shannon's information theory where we depart from the requirement that the error probability decays asymptotically in the blocklength. Instead, the error probabilities for various problems are bounded above by a non-vanishing constant and the spotlight is shone on achievable coding rates as functions of the growing blocklengths. This represents the study of asymptotic estimates with non-vanishing error probabilities. In Part I, after reviewing the fundamentals of information theory, we discuss Strassen's seminal result for binary hypothesis testing where the type-I error probability is non-vanishing and the rate of decay of the type-II error probability with growing number of independent observations is characterized. In Part II, we use this basic hypothesis testing result to develop second- and sometimes even third-order asymptotic expansions for point-to-point communication. In Part III, we consider network information theory problems for which the second-order asymptotics are known. These problems include some classes of channels with random state, the multiple-encoder distributed lossless source coding (Slepian-Wolf) problem and special cases of the Gaussian interference and multiple-access channels. Finally, we discuss avenues for further research.
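
    As a hedged illustration of what a second-order expansion looks like in the point-to-point setting, the widely used normal approximation for a discrete memoryless channel can be written as below. The symbols are assumptions of this sketch and do not appear in the abstract: $M^*(n,\varepsilon)$ is the maximum code size at blocklength $n$ with error probability at most $\varepsilon$, $C$ is the capacity, $V$ is the channel dispersion, and $Q^{-1}$ is the inverse of the Gaussian tail function. The expansion holds under regularity conditions, for example $V > 0$.

```latex
\[
  \log M^*(n,\varepsilon) \;=\; nC \;-\; \sqrt{nV}\, Q^{-1}(\varepsilon) \;+\; O(\log n),
  \qquad 0 < \varepsilon < 1 .
\]
```

    The first-order term $nC$ is the classical capacity result; the $\sqrt{n}$ term quantifies the rate penalty at finite blocklength when the error probability is held at a fixed non-vanishing level.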

    The QCD confinement transition: hadron formation

    We review the foundations and the applications of the statistical and the quark recombination model as hadronization models. Comment: 45 pages, 16 figures, accepted for publication in Landolt-Boernstein Volume 1-23.

    Benefiting from disorder: source coding for unordered data

    The order of letters is not always relevant in a communication task. This paper discusses the implications of order irrelevance on source coding, presenting results in several major branches of source coding theory: lossless coding, universal lossless coding, rate-distortion, high-rate quantization, and universal lossy coding. The main conclusions demonstrate that there are significant rate savings when order is irrelevant. In particular, lossless coding of $n$ letters from a finite alphabet requires $\Theta(\log n)$ bits and universal lossless coding requires $n + o(n)$ bits for many countable alphabet sources. However, there are no universal schemes that can drive a strong redundancy measure to zero. Results for lossy coding include distribution-free expressions for the rate savings from order irrelevance in various high-rate quantization schemes. Rate-distortion bounds are given, and it is shown that the analogue of the Shannon lower bound is loose at all finite rates.
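
    A simple counting argument shows why a logarithmic number of bits is plausible for a finite alphabet: an unordered collection of n letters is determined by its letter counts, and there are only polynomially many count vectors. The sketch below compares the bits needed to index all ordered sequences with the bits needed to index all multisets. It gives the order of growth only, not the exact rates derived in the paper, and the function names are hypothetical.

```python
import math


def ordered_bits(n, k):
    """Bits needed to index every length-n sequence over a k-letter alphabet."""
    return n * math.log2(k)


def unordered_bits(n, k):
    """Bits needed to index every multiset of n letters from a k-letter alphabet.

    The number of multisets is the binomial coefficient C(n + k - 1, k - 1).
    """
    return math.log2(math.comb(n + k - 1, k - 1))


for n in (10, 1_000, 1_000_000):
    print(n, ordered_bits(n, 4), round(unordered_bits(n, 4), 2))
```

    For a fixed alphabet size k, the multiset count is at most (n + k - 1) to the power k - 1, so the second column grows like (k - 1) log n while the first grows linearly in n.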

    Structural and Computational Existence Results for Multidimensional Subshifts

    Symbolic dynamics is a branch of mathematics that studies the structure of infinite sequences of symbols, or in the multidimensional case, infinite grids of symbols. Classes of such sequences and grids defined by collections of forbidden patterns are called subshifts, and subshifts of finite type are defined by finitely many forbidden patterns. The simplest examples of multidimensional subshifts are sets of Wang tilings, infinite arrangements of square tiles with colored edges, where adjacent edges must have the same color. Multidimensional symbolic dynamics has strong connections to computability theory, since most of the basic properties of subshifts cannot be recognized by computer programs, but are instead characterized by some higher-level notion of computability. This dissertation focuses on the structure of multidimensional subshifts, and the ways in which it relates to their computational properties. In the first part, we study the subpattern posets and Cantor-Bendixson ranks of countable subshifts of finite type, which can be seen as measures of their structural complexity. We show, by explicitly constructing subshifts with the desired properties, that both notions are essentially restricted only by computability conditions. In the second part of the dissertation, we study different methods of defining (classes of) multidimensional subshifts, and how they relate to each other and existing methods. We present definitions that use monadic second-order logic, a more restricted kind of logical quantification called quantifier extension, and multi-headed finite state machines. Two of the definitions give rise to hierarchies of subshift classes, which are a priori infinite, but which we show to collapse into finitely many levels. 
The quantifier extension provides insight into the somewhat mysterious class of multidimensional sofic subshifts, since we prove a characterization for the class of subshifts that can extend a sofic subshift into a nonsofic one.

    Symbolic dynamics is a field of mathematics that studies the properties of infinitely long sequences of symbols or, in the multidimensional case, infinitely large grids of symbols. Subshifts are collections of such sequences or grids defined by forbidding some set of finite patterns, and subshifts of finite type are obtained by forbidding only finitely many patterns. Wang tilings are the simplest example of multidimensional subshifts. They are tilings formed from colored squares in which all adjacent edges must have the same color. Multidimensional symbolic dynamics is strongly connected to computability theory, since many basic properties of subshifts cannot be recognized by computer programs, but only by higher-level computational models. In my dissertation I study the structure of multidimensional subshifts and its relation to their computational properties. In the first part I focus on certain structural properties of subshifts of finite type: the ordering formed by finite patterns and the Cantor-Bendixson rank. By constructing subshifts of the desired kind, I show that both properties are essentially constrained by computational conditions. In the second part of the dissertation I study different ways of defining multidimensional subshifts, and how these ways compare with each other and with known classes of subshifts. I consider definitions based on second-order logic, on a restricted form of logical quantification called quantifier extension, and on multi-headed finite automata. Two of these three definitions come with separate hierarchies of subshifts, which I prove collapse to finite height. The study of quantifier extension also sheds light on the structure of the so-called sofic subshifts, which is not yet well understood: in that chapter I determine exactly which subshifts can extend a sofic subshift into a nonsofic one.
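
    The matching rule for Wang tiles is a local constraint and can be checked on any finite rectangular patch. The sketch below is a minimal illustration of that rule only: the `Tile` type, the color labels, and the function `is_valid_patch` are assumptions of this example, and checking a finite patch says nothing about whether a tile set admits a tiling of the infinite plane.

```python
from typing import NamedTuple


class Tile(NamedTuple):
    """A Wang tile described by the colors of its four edges."""
    north: str
    east: str
    south: str
    west: str


def is_valid_patch(patch):
    """Return True if all adjacent edges in a rectangular patch of tiles match.

    `patch` is a list of rows, with row 0 at the top.
    """
    rows, cols = len(patch), len(patch[0])
    for i in range(rows):
        for j in range(cols):
            tile = patch[i][j]
            # Compare with the right-hand neighbor, if there is one.
            if j + 1 < cols and tile.east != patch[i][j + 1].west:
                return False
            # Compare with the neighbor below, if there is one.
            if i + 1 < rows and tile.south != patch[i + 1][j].north:
                return False
    return True


a = Tile(north="red", east="green", south="red", west="green")
b = Tile(north="red", east="blue", south="red", west="green")

print(is_valid_patch([[a, a], [a, a]]))  # every shared edge has the same color
print(is_valid_patch([[b, a]]))          # east edge of b is blue, west edge of a is green
```

    In the language of the abstract, each mismatched pair of neighboring tiles is a forbidden pattern, and since there are finitely many such pairs the set of valid tilings is a subshift of finite type.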