27,928 research outputs found
JIDT: An information-theoretic toolkit for studying the dynamics of complex systems
Complex systems are increasingly being viewed as distributed information
processing systems, particularly in the domains of computational neuroscience,
bioinformatics and Artificial Life. This trend has resulted in a strong uptake
in the use of (Shannon) information-theoretic measures to analyse the dynamics
of complex systems in these fields. We introduce the Java Information Dynamics
Toolkit (JIDT): a Google code project which provides a standalone, (GNU GPL v3
licensed) open-source code implementation for empirical estimation of
information-theoretic measures from time-series data. While the toolkit
provides classic information-theoretic measures (e.g. entropy, mutual
information, conditional mutual information), it ultimately focusses on
implementing higher-level measures for information dynamics. That is, JIDT
focusses on quantifying information storage, transfer and modification, and the
dynamics of these operations in space and time. For this purpose, it includes
implementations of the transfer entropy and active information storage, their
multivariate extensions and local or pointwise variants. JIDT provides
implementations for both discrete and continuous-valued data for each measure,
including various types of estimator for continuous data (e.g. Gaussian,
box-kernel and Kraskov-Stoegbauer-Grassberger) which can be swapped at run-time
due to Java's object-oriented polymorphism. Furthermore, while written in Java,
the toolkit can be used directly in MATLAB, GNU Octave, Python and other
environments. We present the principles behind the code design, and provide
several examples to guide users.Comment: 37 pages, 4 figure
Statistical Inference in a Directed Network Model with Covariates
Networks are often characterized by node heterogeneity for which nodes
exhibit different degrees of interaction and link homophily for which nodes
sharing common features tend to associate with each other. In this paper, we
propose a new directed network model to capture the former via node-specific
parametrization and the latter by incorporating covariates. In particular, this
model quantifies the extent of heterogeneity in terms of outgoingness and
incomingness of each node by different parameters, thus allowing the number of
heterogeneity parameters to be twice the number of nodes. We study the maximum
likelihood estimation of the model and establish the uniform consistency and
asymptotic normality of the resulting estimators. Numerical studies demonstrate
our theoretical findings and a data analysis confirms the usefulness of our
model.Comment: 29 pages. minor revisio
Entropy inference and the James-Stein estimator, with application to nonlinear gene association networks
We present a procedure for effective estimation of entropy and mutual
information from small-sample data, and apply it to the problem of inferring
high-dimensional gene association networks. Specifically, we develop a
James-Stein-type shrinkage estimator, resulting in a procedure that is highly
efficient statistically as well as computationally. Despite its simplicity, we
show that it outperforms eight other entropy estimation procedures across a
diverse range of sampling scenarios and data-generating models, even in cases
of severe undersampling. We illustrate the approach by analyzing E. coli gene
expression data and computing an entropy-based gene-association network from
gene expression data. A computer program is available that implements the
proposed shrinkage estimator.Comment: 18 pages, 3 figures, 1 tabl
- β¦