Minimum Rates of Approximate Sufficient Statistics
Given a sufficient statistic for a parametric family of distributions, one
can estimate the parameter without access to the data. However, the memory or
code size for storing the sufficient statistic may nonetheless still be
prohibitive. Indeed, for $n$ independent samples drawn from a $k$-nomial
distribution with $d = k-1$ degrees of freedom, the length of the code scales as
$d \log n$. In many applications, we may not have a useful notion of
sufficient statistics (e.g., when the parametric family is not an exponential
family) and we also may not need to reconstruct the generating distribution
exactly. By adopting a Shannon-theoretic approach in which we allow a small
error in estimating the generating distribution, we construct various {\em
approximate sufficient statistics} and show that the code length can be reduced
to $\frac{d}{2}\log n$. We consider errors measured according to the
relative entropy and variational distance criteria. For the code constructions,
we leverage Rissanen's minimum description length principle, which yields a
non-vanishing error measured according to the relative entropy. For the
converse parts, we use Clarke and Barron's formula for the relative entropy of
a parametrized distribution and the corresponding mixture distribution.
However, this method only yields a weak converse for the variational distance.
We develop new techniques to achieve vanishing errors and we also prove strong
converses. The latter means that even if the code is allowed to have a
non-vanishing error, its length must still be at least $\frac{d}{2}\log n$.
Comment: To appear in the IEEE Transactions on Information Theory
A Short Introduction to Model Selection, Kolmogorov Complexity and Minimum Description Length (MDL)
The concept of overfitting in model selection is explained and demonstrated
with an example. After providing some background information on information
theory and Kolmogorov complexity, we provide a short explanation of Minimum
Description Length and error minimization. We conclude with a discussion of the
typical features of overfitting in model selection.
Comment: 20 pages, Chapter 1 of The Paradox of Overfitting, Master's thesis,
Rijksuniversiteit Groningen, 200
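The MDL idea this chapter introduces can be shown with a toy two-part code: a model that captures real regularity shortens the total description, while its parameter cost guards against overfitting. The following sketch uses the standard precision-$1/\sqrt{n}$ parameter code from the MDL literature; it is an illustration, not taken from the thesis.

```python
import math

def two_part_codelength(bits):
    """Two-part MDL code length (in bits) for a binary sequence: n*H(p_hat)
    bits for the data given the maximum-likelihood Bernoulli parameter, plus
    (1/2) log2 n bits to encode that parameter to precision 1/sqrt(n)."""
    n = len(bits)
    p = sum(bits) / n
    if p in (0.0, 1.0):
        h = 0.0
    else:
        h = -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)
    return n * h + 0.5 * math.log2(n)

biased = [1] * 90 + [0] * 10   # strongly biased sequence of 100 bits
uniform_cost = 100.0           # the "null" model spends 1 bit per symbol
print(two_part_codelength(biased), uniform_cost)
```

For the biased sequence the two-part code beats the 100-bit uniform code, so MDL selects the Bernoulli model; for an incompressible balanced sequence the parameter cost makes the richer model lose, which is exactly the anti-overfitting behavior discussed above.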
Finite-Block-Length Analysis in Classical and Quantum Information Theory
Coding technology is used in several information processing tasks. In
particular, when noise during transmission disturbs communications, coding
technology is employed to protect the information. However, there are two types
of coding technology: coding in classical information theory and coding in
quantum information theory. Although the physical media used to transmit
information ultimately obey quantum mechanics, we need to choose the type of
coding depending on the kind of information device, classical or quantum, that
is being used. In both branches of information theory, there are many elegant
theoretical results under the ideal assumption that an infinitely large system
is available. In a realistic situation, we need to account for finite size
effects. The present paper reviews finite size effects in classical and quantum
information theory with respect to various topics, including applied aspects.
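A central tool in classical finite-block-length analysis is the second-order ("normal") approximation of Polyanskiy, Poor, and Verdú, $R \approx C - \sqrt{V/n}\,Q^{-1}(\epsilon)$, which quantifies the back-off from capacity at finite $n$. The sketch below evaluates it for a binary symmetric channel; the particular parameters are illustrative, and the $\frac{1}{2n}\log_2 n$ term is the usual third-order refinement for the BSC.

```python
import math
from statistics import NormalDist

def bsc_normal_approx_rate(p, n, eps):
    """Second-order (normal) approximation to the maximal coding rate of a
    binary symmetric channel with crossover probability p at blocklength n
    and block error probability eps:
        R ~ C - sqrt(V/n) * Q^{-1}(eps) + log2(n) / (2n)."""
    h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)   # binary entropy
    capacity = 1.0 - h
    dispersion = p * (1 - p) * (math.log2((1 - p) / p)) ** 2
    q_inv = NormalDist().inv_cdf(1.0 - eps)              # Q^{-1}(eps)
    return capacity - math.sqrt(dispersion / n) * q_inv + math.log2(n) / (2 * n)

print(bsc_normal_approx_rate(0.11, 1000, 1e-3))
```

At blocklength 1000 the achievable rate is well below the capacity of about 0.5 bit/use, illustrating why finite-size corrections matter in practice; the gap shrinks like $1/\sqrt{n}$ as the blocklength grows.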
The velocity distribution of nearby stars from Hipparcos data I. The significance of the moving groups
We present a three-dimensional reconstruction of the velocity distribution of
nearby stars (≲ 100 pc) using a maximum likelihood density estimation
technique applied to the two-dimensional tangential velocities of stars. The
underlying distribution is modeled as a mixture of Gaussian components. The
algorithm reconstructs the error-deconvolved distribution function, even when
the individual stars have unique error and missing-data properties. We apply
this technique to the tangential velocity measurements from a kinematically
unbiased sample of 11,865 main sequence stars observed by the Hipparcos
satellite. We explore various methods for validating the complexity of the
resulting velocity distribution function, including criteria based on Bayesian
model selection and how accurately our reconstruction predicts the radial
velocities of a sample of stars from the Geneva-Copenhagen survey (GCS). Using
this very conservative external validation test based on the GCS, we find that
there is little evidence for structure in the distribution function beyond the
moving groups established prior to the Hipparcos mission. This is in sharp
contrast with internal tests performed here and in previous analyses, which
point consistently to maximal structure in the velocity distribution. We
quantify the information content of the radial velocity measurements and find
that the mean amount of new information gained from a radial velocity
measurement of a single star is significant. This argues for radial velocity
surveys complementary to upcoming astrometric surveys.
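The core of the method above is fitting a Gaussian mixture to velocity data by maximum likelihood. The sketch below runs plain expectation-maximization on synthetic 1-D velocities with two "moving groups"; it is a heavily simplified stand-in for the paper's algorithm, which works on 2-D tangential velocities and additionally deconvolves per-star errors and missing data. All names and numbers are illustrative.

```python
import math
import random
from statistics import pvariance

def normal_pdf(x, mu, var):
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(data, k=2, iters=100):
    """EM for a 1-D Gaussian mixture with k components."""
    lo, hi = min(data), max(data)
    mu = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]  # spread initial means
    var = [pvariance(data)] * k
    w = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for x in data:
            p = [w[j] * normal_pdf(x, mu[j], var[j]) for j in range(k)]
            s = sum(p)
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weights, means, and variances
        for j in range(k):
            nj = sum(r[j] for r in resp)
            w[j] = nj / len(data)
            mu[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var[j] = max(sum(r[j] * (x - mu[j]) ** 2
                             for r, x in zip(resp, data)) / nj, 1e-6)
    return w, mu, var

# synthetic "moving groups": two velocity clumps at -20 and +10 km/s
rng = random.Random(0)
data = ([rng.gauss(-20.0, 2.0) for _ in range(300)]
        + [rng.gauss(10.0, 3.0) for _ in range(300)])
weights, means, variances = em_gmm_1d(data, k=2)
print(sorted(means))
```

Model-selection criteria of the kind discussed in the abstract would then compare fits with different numbers of components, trading likelihood against complexity.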
D3 branes in a Melvin universe: a new realm for gravitational holography
The decoupling limit of a certain configuration of D3 branes in a Melvin
universe defines a sector of string theory known as Puff Field Theory (PFT) - a
theory with non-local dynamics but without gravity. In this work, we present a
systematic analysis of the non-local states of strongly coupled PFT using
gravitational holography, and we are led to a remarkable new holographic
dictionary. We show that the theory admits states that may be viewed as brane
protrusions from the D3 brane worldvolume. The footprint of a protrusion has
finite size - the scale of non-locality in the PFT - and corresponds to an
operator insertion in the PFT. We compute correlators of these states, and we
demonstrate that only part of the holographic bulk is explored by this
computation. We then show that the remaining space holographically encodes the
dynamics of the D3 brane tentacles. The two sectors are coupled: in this
holographic description, this is realized via quantum entanglement across a
holographic screen - a throat in the geometry - that splits the bulk into the
two regions in question. We then propose a description of PFT through a direct
product of two Fock spaces - akin to other non-local settings that employ
quantum group structures.
Comment: 44 pages, 13 figures; v2: minor corrections, citations added; v3:
typos corrected in section on local operators, some asymptotic expansions
improved and made more consistent with rest of paper in section on non-local
operators
Divergence rates of Markov order estimators and their application to statistical estimation of stationary ergodic processes
Stationary ergodic processes with finite alphabets are estimated by finite
memory processes from a sample, an n-length realization of the process, where
the memory depth of the estimator process is also estimated from the sample
using penalized maximum likelihood (PML). Under some assumptions on the
continuity rate and the assumption of non-nullness, a rate of convergence in
$\bar{d}$-distance is obtained, with explicit constants. The result requires an
analysis of the divergence of PML Markov order estimators for not necessarily
finite memory processes. This divergence problem is investigated in more
generality for three information criteria: the Bayesian information criterion
with generalized penalty term yielding the PML, and the normalized maximum
likelihood and the Krichevsky-Trofimov code lengths. Lower and upper bounds on
the estimated order are obtained. The notion of consistent Markov order
estimation is generalized for infinite memory processes using the concept of
oracle order estimates, and generalized consistency of the PML Markov order
estimator is presented.
Comment: Published at http://dx.doi.org/10.3150/12-BEJ468 in Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
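A penalized maximum likelihood order estimator of the kind studied above can be sketched with a BIC-style penalty: for each candidate order, score the maximized log-likelihood of the empirical transition counts minus half the parameter count times $\log n$. This is a simplification; the paper analyzes generalized penalty terms as well as the NML and Krichevsky-Trofimov code lengths, and the example data here are synthetic.

```python
import math
import random
from collections import defaultdict

def pml_order(x, max_order=4, alphabet=2):
    """BIC-style penalized-maximum-likelihood Markov order estimator.
    For each candidate order m, the maximized log-likelihood over the
    empirical transition counts is penalized by (1/2)|A|^m (|A|-1) log n."""
    n = len(x)
    best_score, best_m = None, 0
    for m in range(max_order + 1):
        counts = defaultdict(lambda: defaultdict(int))
        for i in range(m, n):
            counts[tuple(x[i - m:i])][x[i]] += 1
        ll = 0.0  # max log-likelihood of the last n - m symbols (edge effect negligible)
        for row in counts.values():
            tot = sum(row.values())
            for c in row.values():
                ll += c * math.log(c / tot)
        params = (alphabet ** m) * (alphabet - 1)
        score = ll - 0.5 * params * math.log(n)
        if best_score is None or score > best_score:
            best_score, best_m = score, m
    return best_m

# synthetic order-1 binary chain: the state persists with probability 0.9
rng = random.Random(42)
x, state = [], 0
for _ in range(5000):
    if rng.random() >= 0.9:
        state = 1 - state
    x.append(state)
print(pml_order(x))
```

The penalty grows exponentially in the candidate order, which is what keeps the estimator from overfitting long contexts on a finite sample; the divergence analysis in the paper makes this trade-off precise for infinite-memory sources.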