JIDT: An information-theoretic toolkit for studying the dynamics of complex systems
Complex systems are increasingly being viewed as distributed information
processing systems, particularly in the domains of computational neuroscience,
bioinformatics and Artificial Life. This trend has resulted in a strong uptake
in the use of (Shannon) information-theoretic measures to analyse the dynamics
of complex systems in these fields. We introduce the Java Information Dynamics
Toolkit (JIDT): a Google Code project which provides a standalone, open-source
(GNU GPL v3 licensed) implementation for empirical estimation of
information-theoretic measures from time-series data. While the toolkit
provides classic information-theoretic measures (e.g. entropy, mutual
information, conditional mutual information), it ultimately focusses on
implementing higher-level measures for information dynamics. That is, JIDT
focusses on quantifying information storage, transfer and modification, and the
dynamics of these operations in space and time. For this purpose, it includes
implementations of the transfer entropy and active information storage, their
multivariate extensions and local or pointwise variants. JIDT provides
implementations for both discrete and continuous-valued data for each measure,
including various types of estimator for continuous data (e.g. Gaussian,
box-kernel and Kraskov-Stoegbauer-Grassberger) which can be swapped at run-time
due to Java's object-oriented polymorphism. Furthermore, while written in Java,
the toolkit can be used directly in MATLAB, GNU Octave, Python and other
environments. We present the principles behind the code design, and provide
several examples to guide users.
Comment: 37 pages, 4 figures
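JIDT itself is a Java library, but the core quantity it estimates is easy to illustrate. Below is a minimal, self-contained plug-in estimator of transfer entropy for discrete data with history length 1, written independently of JIDT (the function name and interface are ours, not the toolkit's API):

```python
from collections import Counter
from math import log2

def transfer_entropy(source, dest):
    """Plug-in estimate of transfer entropy (in bits) from a discrete
    source series to a discrete destination series, history length 1:
    TE = sum p(y_next, y, x) * log2[ p(y_next | y, x) / p(y_next | y) ].
    Illustrative sketch only; JIDT provides bias-corrected estimators."""
    n = len(dest) - 1
    triples = Counter()   # counts of (y_next, y_prev, x_prev)
    pairs = Counter()     # counts of (y_prev, x_prev)
    y_pairs = Counter()   # counts of (y_next, y_prev)
    y_prev = Counter()    # counts of y_prev
    for t in range(n):
        triples[(dest[t + 1], dest[t], source[t])] += 1
        pairs[(dest[t], source[t])] += 1
        y_pairs[(dest[t + 1], dest[t])] += 1
        y_prev[dest[t]] += 1
    te = 0.0
    for (yn, yp, xp), c in triples.items():
        p_joint = c / n                          # p(y_next, y, x)
        p_cond_full = c / pairs[(yp, xp)]        # p(y_next | y, x)
        p_cond_dest = y_pairs[(yn, yp)] / y_prev[yp]  # p(y_next | y)
        te += p_joint * log2(p_cond_full / p_cond_dest)
    return te
```

For a random binary source copied into the destination with a one-step lag, the estimate approaches 1 bit; for independent series it approaches 0 (up to the small positive bias of the plug-in estimator).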
Optimal Rates for Regularized Conditional Mean Embedding Learning
We address the consistency of a kernel ridge regression estimate of the
conditional mean embedding (CME), which is an embedding of the conditional
distribution of Y given X into a target reproducing kernel Hilbert space
H_Y. The CME allows us to take conditional expectations of target
RKHS functions, and has been employed in nonparametric causal and Bayesian
inference. We address the misspecified setting, where the target CME is in the
space of Hilbert-Schmidt operators acting from an input interpolation space
between H_X and L_2, to H_Y. This space of operators
is shown to be isomorphic to a newly defined vector-valued interpolation space.
Using this isomorphism, we derive a novel and adaptive statistical learning
rate for the empirical CME estimator under the misspecified setting. Our
analysis reveals that our rates match the optimal rates without
assuming H_Y to be finite dimensional. We further establish a lower
bound on the learning rate, which shows that the obtained upper bound is
optimal.
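To make the object under study concrete, here is a minimal sketch of the empirical (kernel ridge regression) CME estimator, specialized to evaluating a conditional expectation E[f(Y) | X = x]. The Gaussian kernel, bandwidth, and ridge parameter below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def cme_conditional_expectation(X, Y, f, x_query, lam=1e-3, gamma=10.0):
    """Estimate E[f(Y) | X = x_query] via the empirical conditional mean
    embedding: weights w = (K + n*lam*I)^{-1} k_x, estimate = w^T f(Y).
    Gaussian kernel on the input space; parameters are illustrative."""
    n = len(X)
    K = np.exp(-gamma * (X[:, None] - X[None, :]) ** 2)  # input Gram matrix
    k_x = np.exp(-gamma * (X - x_query) ** 2)            # kernel at the query
    w = np.linalg.solve(K + n * lam * np.eye(n), k_x)    # ridge weights
    return w @ f(Y)
```

For example, with Y = X^2 plus small noise, evaluating at x = 0.5 with f the identity recovers a value close to 0.25. The ridge term n*lam*I is exactly the regularization whose rate behaviour the paper analyzes.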
Nonparametric approximation of conditional expectation operators
Given the joint distribution of two random variables X and Y on some second
countable locally compact Hausdorff space, we investigate the statistical
approximation of the L^2-operator P defined by [Pf](x) := E[f(Y) | X = x]
under minimal assumptions. By modifying its domain, we prove that P
can be arbitrarily well approximated in operator norm by Hilbert-Schmidt
operators acting on a reproducing kernel Hilbert space. This fact allows us to
estimate P uniformly by finite-rank operators over a dense subspace even when
P is not compact. In terms of modes of convergence, we thereby obtain the
superiority of kernel-based techniques over classically used parametric
projection approaches such as Galerkin methods. This also provides a novel
perspective on which limiting object the nonparametric estimate of P
converges to. As an application, we show that these results are particularly
important for a large family of spectral analysis techniques for Markov
transition operators. Our investigation also gives a new asymptotic perspective
on the so-called kernel conditional mean embedding, which is the theoretical
foundation of a wide variety of techniques in kernel-based nonparametric
inference.
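The spectral objects mentioned above are easiest to see in the finite-state special case, where the conditional expectation operator is just a transition matrix and its leading left eigenvector is the stationary distribution. The sketch below uses a naive count-based (plug-in) estimate on a finite state space; the paper's kernel-based estimators target the same spectrum on general state spaces, which this toy example does not cover:

```python
import numpy as np

def empirical_transition_matrix(chain, n_states):
    """Row-normalized count matrix: a plug-in estimate of the Markov
    transition operator on a finite state space (illustrative only)."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(chain[:-1], chain[1:]):
        counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def stationary_distribution(P):
    """Left eigenvector of P for the leading eigenvalue (which is 1 for a
    stochastic matrix), normalized to sum to 1."""
    vals, vecs = np.linalg.eig(P.T)
    pi = vecs[:, np.argmax(vals.real)].real
    return pi / pi.sum()
```

Simulating a chain with a known doubly stochastic transition matrix, the estimated stationary distribution converges to the uniform one, and the estimated matrix converges entrywise to the truth.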