Dimension, entropy, and the local distribution of measures
We present a general approach to the study of the local distribution of
measures on Euclidean spaces, based on local entropy averages. As concrete
applications, we unify, generalize, and simplify a number of recent results on
local homogeneity, porosity and conical densities of measures.
Comment: v2: 23 pages, 6 figures. Updated references. Accepted to J. London Math. Soc.
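For orientation, a schematic form of the local entropy average bound that drives this approach (stated from memory with simplified notation; the paper gives the precise hypotheses and normalization): for a measure mu on the unit cube and mu-a.e. x,

```latex
\underline{\dim}(\mu, x) \;\ge\; \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \frac{1}{\log b}\, H\!\left(\mu^{x,n};\, \mathcal{D}_b\right)
```

where mu^{x,n} is the conditional measure on the generation-n b-adic cell containing x, rescaled to the unit cube, and H(.; D_b) is Shannon entropy with respect to the partition into b-adic subcells of the next generation.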
Local Intrinsic Dimensional Entropy
Most entropy measures depend on the spread of the probability distribution
over the sample space X, and the maximum entropy achievable scales
proportionately with the sample space cardinality |X|. For a finite |X|, this
yields robust entropy measures which satisfy many important properties, such as
invariance to bijections, while the same is not true for continuous spaces
(where |X|=infinity). Furthermore, since R and R^d (d in Z+) have the same
cardinality (from Cantor's correspondence argument), cardinality-dependent
entropy measures cannot encode the data dimensionality. In this work, we
question the role of cardinality and distribution spread in defining entropy
measures for continuous spaces, which can undergo multiple rounds of
transformations and distortions, e.g., in neural networks. We find that the
average value of the local intrinsic dimension of a distribution, denoted as
ID-Entropy, can serve as a robust entropy measure for continuous spaces, while
capturing the data dimensionality. We find that ID-Entropy satisfies many
desirable properties and can be extended to conditional entropy, joint entropy
and mutual-information variants. ID-Entropy also yields new information
bottleneck principles and links to causality. In the context of deep
learning, for feedforward architectures, we show, theoretically and
empirically, that the ID-Entropy of a hidden layer directly controls the
generalization gap for both classifiers and auto-encoders, when the target
function is Lipschitz continuous. Our work primarily shows that, for continuous
spaces, taking a structural rather than a statistical approach yields entropy
measures which preserve intrinsic data dimensionality, while being relevant for
studying various architectures.
Comment: Proceedings of the AAAI Conference on Artificial Intelligence 202
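A minimal sketch of the idea of ID-Entropy as an average local intrinsic dimension, using the standard Levina-Bickel nearest-neighbour MLE as a stand-in local ID estimator (the paper may define and estimate local ID differently; the function name id_entropy and all parameters below are illustrative):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def id_entropy(points: np.ndarray, k: int = 10) -> float:
    """Average local intrinsic dimension of a sample (illustrative sketch)."""
    # Distances to the k nearest neighbours, excluding the point itself.
    dists, _ = NearestNeighbors(n_neighbors=k + 1).fit(points).kneighbors(points)
    r = dists[:, 1:]                                 # drop the zero self-distance
    # Levina-Bickel MLE of local ID at each point:
    # id(x) = (k - 1) / sum_j log(r_k(x) / r_j(x)).
    local_id = (k - 1) / np.log(r[:, -1:] / r[:, :-1]).sum(axis=1)
    return float(local_id.mean())

# Sanity check: a 2-D Gaussian embedded in R^10 has ID-Entropy near 2,
# independent of the ambient dimension, unlike cardinality-based entropies.
rng = np.random.default_rng(0)
x = np.hstack([rng.normal(size=(2000, 2)), np.zeros((2000, 8))])
print(id_entropy(x))   # ~2
```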
Local entropy averages and projections of fractal measures
We show that for families of measures on Euclidean space which satisfy an
ergodic-theoretic form of "self-similarity" under the operation of re-scaling,
the dimension of linear images of the measure behaves in a semi-continuous way.
We apply this to prove the following conjecture of Furstenberg: Let m,n be
integers which are not powers of the same integer, and let X,Y be closed
subsets of the unit interval which are invariant, respectively, under times-m
mod 1 and times-n mod 1. Then, for any non-zero t:
dim(X+tY)=min{1,dim(X)+dim(Y)}. A similar result holds for invariant measures,
and gives a simple proof of the Rudolph-Johnson theorem. Our methods also apply
to many other classes of conformal fractals and measures. As another
application, we extend and unify results of Peres, Shmerkin and Nazarov, and of
Moreira, concerning projections of products of self-similar measures and Gibbs
measures on regular Cantor sets. We show that under natural irreducibility
assumptions on the maps in the IFS, the image measure has the maximal possible
dimension under any linear projection other than the coordinate projections. We
also present applications to Bernoulli convolutions and to the images of
fractal measures under differentiable maps.
Comment: 55 pages. Version 2: Corrected an error in the proof of Thm. 4.3; some new references; various small corrections
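A quick numerical sanity check of the sumset formula on finite-level approximations (the sets, levels, and value of t below are illustrative choices, not the paper's examples): X is the middle-thirds Cantor set, invariant under times-3 mod 1 with dim = log 2/log 3, and Y is the base-5 set with digits {0,4}, invariant under times-5 mod 1 with dim = log 2/log 5; since 3 and 5 are not powers of the same integer, the result predicts dim(X+tY) = min{1, dim(X)+dim(Y)} = 1 for every non-zero t:

```python
import numpy as np

def digit_set(base, digits, level):
    """Left endpoints of the level-`level` cylinders of a digit restriction."""
    pts = np.array([0.0])
    for n in range(1, level + 1):
        pts = (pts[:, None] + np.asarray(digits) / base**n).ravel()
    return pts

def box_dim(points, eps):
    """Crude box-counting estimate log N(eps) / log(1/eps) on [0, 1]."""
    n_boxes = len(np.unique(np.floor(points / eps).astype(np.int64)))
    return np.log(n_boxes) / np.log(1.0 / eps)

X = digit_set(3, [0, 2], 12)     # 2^12 points approximating the Cantor set
Y = digit_set(5, [0, 4], 8)      # 2^8 points, dim = log2/log5
t = 1.7                          # any non-zero t should give the same answer
S = (X[:, None] + t * Y[None, :]).ravel()
S /= S.max()                     # normalize to [0, 1] for the box count
for eps in (3e-3, 1e-3, 3e-4):
    print(eps, box_dim(S, eps))  # estimates should approach 1
```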
On distance sets, box-counting and Ahlfors-regular sets
We obtain box-counting estimates for the pinned distance sets of (dense
subsets of) planar discrete Ahlfors-regular sets of exponent greater than 1. As
a corollary, we improve upon a recent result of Orponen, by showing that if A
is Ahlfors-regular of dimension greater than 1, then almost all pinned distance
sets of A have lower box-counting dimension 1. We also show that if A, B in R^2
have Hausdorff dimension greater than 1 and A is Ahlfors-regular, then the set
of distances between A and B has modified lower box-counting dimension 1,
which taking B = A improves Orponen's result in a different direction, by
lowering packing dimension to modified lower box-counting dimension. The
proofs involve ergodic-theoretic ideas, relying on
the theory of CP-processes and projections.
Comment: 22 pages, no figures. v2: added Corollary 1.5 on box dimension of pinned distance sets. v3: numerous fixes and clarifications based on referee report
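As a concrete (and purely illustrative) instance of the setting, one can take A = C x C with C the middle-thirds Cantor set, so that A is Ahlfors-regular of dimension 2 log 2/log 3 > 1, and numerically box-count a pinned distance set; the pin, levels and scales below are arbitrary choices:

```python
import numpy as np

def cantor(level):
    """Level-`level` approximation of the middle-thirds Cantor set."""
    pts = np.array([0.0])
    for n in range(1, level + 1):
        pts = (pts[:, None] + np.array([0.0, 2.0]) / 3**n).ravel()
    return pts

C = cantor(9)                                             # 2^9 points
A = np.stack(np.meshgrid(C, C), axis=-1).reshape(-1, 2)   # A = C x C
pin = A[len(A) // 3]                                      # pin at a point of A
dists = np.linalg.norm(A - pin, axis=1)                   # pinned distance set
for eps in (1e-2, 3e-3, 1e-3):
    n_boxes = len(np.unique(np.floor(dists / eps).astype(np.int64)))
    print(eps, np.log(n_boxes) / np.log(1.0 / eps))       # should approach 1
```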
Equidistribution from Fractals
We give a fractal-geometric condition for a measure on [0,1] to be supported
on points x that are normal in base n, i.e. such that the sequence x, nx,
n^2 x, ... equidistributes modulo 1. This condition is robust under C^1 coordinate
changes, and it applies also when n is a Pisot number and equidistribution is
understood with respect to the beta-map and Parry measure. As applications we
obtain new results (and strengthen old ones) about the prevalence of normal
numbers in fractal sets, and new results on measure rigidity, specifically
completing Host's theorem to multiplicatively independent integers and proving
a Rudolph-Johnson-type theorem for certain pairs of beta transformations.
Comment: 46 pages. v3: minor corrections and elaboration
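To make the notion concrete, here is a small (purely illustrative) experiment: since n^k x mod 1 is the k-fold digit shift of x in base n, orbits can be built directly from digit strings, and a generic point, which is almost surely normal in base 3, can be compared with a point of the middle-thirds Cantor set, whose orbit never enters the middle third:

```python
import numpy as np

def orbit_from_digits(digits, base, length, window=30):
    """Orbit points base^k * x mod 1 for x = 0.d1 d2 d3 ... in `base`.
    The k-th point is the shifted expansion 0.d_{k+1} d_{k+2} ...,
    evaluated on a finite window of digits."""
    w = base ** -(np.arange(window) + 1.0)
    d = np.asarray(digits, dtype=float)
    return np.array([d[k:k + window] @ w for k in range(length)])

def discrepancy(points, bins=30):
    """Max deviation of empirical bin frequencies from the uniform ones."""
    freq, _ = np.histogram(points, bins=bins, range=(0.0, 1.0))
    return np.abs(freq / len(points) - 1.0 / bins).max()

rng = np.random.default_rng(0)
N = 20_000
generic = rng.integers(0, 3, size=N + 30)      # random digits: a.s. normal
cantor = 2 * rng.integers(0, 2, size=N + 30)   # digits in {0,2}: not normal
print(discrepancy(orbit_from_digits(generic, 3, N)))  # small: equidistributed
print(discrepancy(orbit_from_digits(cantor, 3, N)))   # large: avoids (1/3, 2/3)
```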
JIDT: An information-theoretic toolkit for studying the dynamics of complex systems
Complex systems are increasingly being viewed as distributed information
processing systems, particularly in the domains of computational neuroscience,
bioinformatics and Artificial Life. This trend has resulted in a strong uptake
in the use of (Shannon) information-theoretic measures to analyse the dynamics
of complex systems in these fields. We introduce the Java Information Dynamics
Toolkit (JIDT): a Google Code project which provides a standalone (GNU GPL v3
licensed) open-source code implementation for empirical estimation of
information-theoretic measures from time-series data. While the toolkit
provides classic information-theoretic measures (e.g. entropy, mutual
information, conditional mutual information), it ultimately focusses on
implementing higher-level measures for information dynamics. That is, JIDT
focusses on quantifying information storage, transfer and modification, and the
dynamics of these operations in space and time. For this purpose, it includes
implementations of the transfer entropy and active information storage, their
multivariate extensions and local or pointwise variants. JIDT provides
implementations for both discrete and continuous-valued data for each measure,
including various types of estimator for continuous data (e.g. Gaussian,
box-kernel and Kraskov-Stoegbauer-Grassberger) which can be swapped at run-time
due to Java's object-oriented polymorphism. Furthermore, while written in Java,
the toolkit can be used directly in MATLAB, GNU Octave, Python and other
environments. We present the principles behind the code design, and provide
several examples to guide users.
Comment: 37 pages, 4 figures
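A minimal sketch of driving JIDT from Python via jpype, following the pattern of the toolkit's bundled Python demos (the jar path and the toy data are placeholders; the class and method names are JIDT's, but check them against your version):

```python
import numpy as np
from jpype import startJVM, getDefaultJVMPath, JPackage, JArray, JDouble

# Point the JVM at the JIDT jar (placeholder path).
startJVM(getDefaultJVMPath(), "-ea", "-Djava.class.path=infodynamics.jar")

# Toy coupled pair: dest follows source with a one-step lag plus noise.
rng = np.random.default_rng(0)
source = rng.normal(size=1000)
dest = np.concatenate(([0.0], 0.8 * source[:-1])) + 0.2 * rng.normal(size=1000)

# Kraskov-Stoegbauer-Grassberger transfer entropy estimator; the Gaussian and
# box-kernel estimators are different classes behind the same interface,
# which is what allows them to be swapped at run-time.
TeCalc = JPackage("infodynamics.measures.continuous.kraskov") \
    .TransferEntropyCalculatorKraskov
te = TeCalc()
te.setProperty("k", "4")        # number of nearest neighbours for KSG
te.initialise(1)                # destination history length 1
te.setObservations(JArray(JDouble, 1)(source.tolist()),
                   JArray(JDouble, 1)(dest.tolist()))
print("TE(source -> dest) = %.3f nats" % te.computeAverageLocalOfObservations())
```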