Estimating Mutual Information
We present two classes of improved estimators for mutual information
$M(X,Y)$, from samples of random points distributed according to some joint
probability density $\mu(x,y)$. In contrast to conventional estimators based on
binnings, they are based on entropy estimates from $k$-nearest neighbour
distances. This means that they are data efficient (with $k=1$ we resolve
structures down to the smallest possible scales), adaptive (the resolution is
higher where data are more numerous), and have minimal bias. Indeed, the bias
of the underlying entropy estimates is mainly due to non-uniformity of the
density at the smallest resolved scale, giving typically systematic errors
which scale as functions of $k/N$ for $N$ points. Numerically, we find that
both families become {\it exact} for independent distributions, i.e. the
estimator vanishes (up to statistical fluctuations) if $\mu(x,y)=\mu(x)\mu(y)$.
This holds for all tested marginal distributions and for all
dimensions of $x$ and $y$. In addition, we give estimators for redundancies
between more than 2 random variables. We compare our algorithms in detail with
existing algorithms. Finally, we demonstrate the usefulness of our estimators
for assessing the actual independence of components obtained from independent
component analysis (ICA), for improving ICA, and for estimating the reliability
of blind source separation. Comment: 16 pages, including 18 figures
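The $k$-nearest-neighbour idea described in this abstract can be sketched in a few lines: the distance to the $k$-th neighbour in the joint space sets an adaptive scale at each sample, and neighbour counts within that scale in each marginal space enter through digamma functions. This is a minimal illustration assuming NumPy/SciPy are available, not the authors' reference implementation; `ksg_mi` is a name chosen here.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def ksg_mi(x, y, k=3):
    """k-nearest-neighbour mutual information estimate (in nats).

    x, y : arrays of shape (N, d_x) and (N, d_y) holding joint samples.
    """
    n = len(x)
    xy = np.hstack([x, y])
    # eps[i]: max-norm distance from sample i to its k-th neighbour in the joint space
    eps = cKDTree(xy).query(xy, k=k + 1, p=np.inf)[0][:, -1]
    # nx[i], ny[i]: marginal-space neighbours strictly closer than eps[i]
    # (subtract 1 to exclude the point itself; tiny shrink enforces strictness)
    nx = cKDTree(x).query_ball_point(x, eps - 1e-12, p=np.inf, return_length=True) - 1
    ny = cKDTree(y).query_ball_point(y, eps - 1e-12, p=np.inf, return_length=True) - 1
    return digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))
```

For bivariate Gaussians with correlation $\rho$ the exact value is $-\frac{1}{2}\log(1-\rho^2)$, which the estimate approaches as $N$ grows; for independent samples it fluctuates around zero, consistent with the near-exactness for independent distributions claimed above.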
A vector quantization approach to universal noiseless coding and quantization
A two-stage code is a block code in which each block of data is coded in two stages: the first stage codes the identity of a block code among a collection of codes, and the second stage codes the data using the identified code. The collection of codes may be noiseless codes, fixed-rate quantizers, or variable-rate quantizers. We take a vector quantization approach to two-stage coding, in which the first stage code can be regarded as a vector quantizer that “quantizes” the input data of length n to one of a fixed collection of block codes. We apply the generalized Lloyd algorithm to the first-stage quantizer, using induced measures of rate and distortion, to design locally optimal two-stage codes. On a source of medical images, two-stage variable-rate vector quantizers designed in this way outperform standard (one-stage) fixed-rate vector quantizers by over 9 dB. The tail of the operational distortion-rate function of the first-stage quantizer determines the optimal rate of convergence of the redundancy of a universal sequence of two-stage codes. We show that there exist two-stage universal noiseless codes, fixed-rate quantizers, and variable-rate quantizers whose per-letter rate and distortion redundancies converge to zero as (k/2) n^{-1} log n, when the universe of sources has finite dimension k. This extends the achievability part of Rissanen's theorem from universal noiseless codes to universal quantizers. Further, we show that the redundancies converge as O(n^{-1}) when the universe of sources is countable, and as O(n^{-1+ϵ}) when the universe of sources is infinite-dimensional, under appropriate conditions.
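The two-stage design loop can be illustrated with a toy generalized-Lloyd iteration: stage 1 assigns each data block to the codebook that codes it with least distortion (the "vector quantizer over codes"), and stage 2 re-fits each codebook on the blocks it won. This is a sketch under simplifying assumptions (scalar second-stage codebooks, squared-error distortion, no rate term in the stage-1 decision); `design_two_stage` and its parameters are names invented here, not the paper's code.

```python
import numpy as np

def design_two_stage(blocks, m_codes=2, levels=4, iters=20, seed=None):
    """Toy two-stage code design: `m_codes` scalar codebooks of `levels`
    reproduction values each, alternating stage-1 assignment (pick the
    codebook with least induced squared error on the block) and stage-2
    Lloyd updates (re-fit each codebook on its assigned blocks)."""
    rng = np.random.default_rng(seed)
    books = rng.choice(blocks.ravel(), size=(m_codes, levels))
    for _ in range(iters):
        # induced distortion of coding every block with every codebook
        d = np.array([((np.abs(blocks[..., None] - b).min(-1)) ** 2).sum(-1)
                      for b in books])          # shape (m_codes, n_blocks)
        assign = d.argmin(0)                    # stage-1 "quantization" to a code
        for j in range(m_codes):                # stage-2 Lloyd update per codebook
            sel = blocks[assign == j].ravel()
            if sel.size:
                idx = np.abs(sel[:, None] - books[j]).argmin(1)
                books[j] = np.array([sel[idx == l].mean() if (idx == l).any()
                                     else books[j][l] for l in range(levels)])
    return books, assign
```

On a bimodal source, the stage-1 quantizer lets each second-stage codebook specialize to the blocks it wins, which is the mechanism behind the two-stage gain reported in the abstract.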
ADE string vacua with discrete torsion
We complete the classification of (2,2) string vacua that can be constructed
by diagonal twists of tensor products of minimal models with ADE invariants.
Using the Landau-Ginzburg framework, we compute all spectra from inequivalent models of
this type. The completeness of our results is only possible by systematically
avoiding the huge redundancies coming from permutation symmetries of tensor
products. We recover the results for (2,2) vacua of an extensive computation of
simple current invariants by Schellekens and Yankielowicz, and find 4
additional mirror pairs of spectra that were missed by their stochastic method.
For the model we observe a relation between redundant spectra and
groups that are related in a particular way. Comment: 13 pages (LaTeX), preprint CERN-TH.6931/93 and ITP-UH-20/93
(reference added)
Towards Bulk Metric Reconstruction from Extremal Area Variations
The Ryu-Takayanagi and Hubeny-Rangamani-Takayanagi formulae suggest that bulk
geometry emerges from the entanglement structure of the boundary theory. Using
these formulae, we build on a result of Alexakis, Balehowsky, and Nachman to
show that in four bulk dimensions, the entanglement entropies of boundary
regions of disk topology uniquely fix the bulk metric in any region foliated by
the corresponding HRT surfaces. More generally, for a bulk of any dimension, knowledge of the (variations of the) areas of two-dimensional
boundary-anchored extremal surfaces of disk topology uniquely fixes the bulk
metric wherever these surfaces reach. This result is covariant and not reliant
on any symmetry assumptions; its applicability thus includes regions of strong
dynamical gravity such as the early-time interior of black holes formed from
collapse. While we only show uniqueness of the metric, the approach we present
provides a clear path towards an explicit spacetime metric reconstruction. Comment: 33+4 pages, 7 figures; v2: addressed referee comments
Guessing under source uncertainty
This paper considers the problem of guessing the realization of a finite
alphabet source when some side information is provided. The only knowledge the
guesser has about the source and the correlated side information is that the
joint source is one among a family. A notion of redundancy is first defined and
a new divergence quantity that measures this redundancy is identified. This
divergence quantity shares the Pythagorean property with the Kullback-Leibler
divergence. Good guessing strategies that minimize the supremum redundancy
(over the family) are then identified. The min-sup value measures the richness
of the uncertainty set. The min-sup redundancies for two examples - the
families of discrete memoryless sources and finite-state arbitrarily varying
sources - are then determined. Comment: 27 pages, submitted to IEEE Transactions on Information Theory, March
2006, revised September 2006, contains minor modifications and restructuring
based on reviewers' comments
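As background for the redundancy notion above: when the guesser knows the source distribution exactly, asking "is it s?" in decreasing order of probability minimizes the expected number of guesses, and an order tuned to the wrong member of the family pays a penalty. A small self-contained sketch of this known-source baseline (function names are chosen here, not taken from the paper):

```python
from itertools import permutations

def expected_guesses(order, p):
    # expected number of "is it s?" questions when guessing in the given order
    return sum((i + 1) * p[s] for i, s in enumerate(order))

def best_order(p):
    # for a known source, decreasing-probability order is optimal
    return sorted(p, key=p.get, reverse=True)
```

The paper's setting replaces the known distribution with a family of possible joint sources, and asks for the strategy minimizing the worst-case (supremum) redundancy over that family.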
A Seismic Inversion Problem for an Anisotropic, Inhomogeneous Medium
In this report, we consider the propagation of seismic waves through a medium that can be subdivided into two distinct parts. The upper part is assumed to be azimuthally symmetric and linearly nonuniform with increasing depth, with a directional velocity dependence consistent with elliptical anisotropy. The lower part, which is the layer of interest, is also assumed to be azimuthally symmetric, but uniform and nonelliptically anisotropic. Despite this nonellipticity, we assume the angular dependence of the velocity can be described by a convex curve.
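The abstract gives no formulas, but "elliptical anisotropy" has a standard concrete reading: the unit-time wavefront is an ellipse whose semi-axes are the vertical and horizontal speeds, so the ray speed at angle θ from the vertical follows from the ellipse equation. A sketch with assumed parameter names `v_vert`, `v_horiz`:

```python
import math

def elliptical_velocity(theta, v_vert, v_horiz):
    """Ray speed at angle `theta` from the vertical in an elliptically
    anisotropic medium: the unit-time wavefront is the ellipse
    x^2/v_horiz^2 + z^2/v_vert^2 = 1, so along (sin t, cos t) the speed is
    1 / sqrt(sin^2 t / v_horiz^2 + cos^2 t / v_vert^2)."""
    s, c = math.sin(theta), math.cos(theta)
    return 1.0 / math.sqrt((s / v_horiz) ** 2 + (c / v_vert) ** 2)
```

At θ = 0 this reduces to the vertical speed and at θ = 90° to the horizontal speed, with a smooth convex interpolation in between; the lower layer in this report is precisely the case where the convex velocity curve is *not* of this elliptical form.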
Our goal is to produce a single-source, single-receiver model which uses modern seismic measurements to determine the elastic moduli of the lower medium. Once these are known, geoscientists could better describe the angular dependence of the velocity in the layer of interest and would also have some clues as to the actual material composing it.