Asymptotic Optimality of Antidictionary Codes
An antidictionary code is a lossless compression algorithm built on an
antidictionary: the set of minimal words that do not occur as substrings
of an input string. The code was proposed by Crochemore et al. in 2000, and its
asymptotic optimality has so far been proved only for a specific
information source, the balanced binary source, i.e. a binary Markov
source in which each state transition occurs with probability 1/2 or 1. In this
paper, we prove the optimality of both static and dynamic antidictionary codes
with respect to a stationary ergodic Markov source on a finite alphabet such
that a state transition occurs with probability .

Comment: 5 pages; to appear in the proceedings of the 2010 IEEE International Symposium on Information Theory (ISIT 2010).
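As a concrete illustration of the antidictionary idea (a brute-force sketch, not the linear-time construction of Crochemore et al.), a word belongs to the antidictionary exactly when it never occurs in the string while both of its maximal proper substrings do. The `max_len` search bound below is an assumption for the sketch:

```python
from itertools import product

def antidictionary(s, alphabet="01", max_len=4):
    """Brute-force antidictionary of s: minimal words over `alphabet`
    that never occur as substrings of s, although both of their
    maximal proper substrings (w[1:] and w[:-1]) do occur."""
    words = []
    for length in range(1, max_len + 1):
        for tup in product(alphabet, repeat=length):
            w = "".join(tup)
            if w not in s and w[1:] in s and w[:-1] in s:
                words.append(w)
    return words
```

For example, `antidictionary("0101", max_len=3)` yields `["00", "11"]`: neither `00` nor `11` ever occurs in `0101`, while both `0` and `1` do.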
Forbidden ordinal patterns in higher dimensional dynamics
Forbidden ordinal patterns are ordinal patterns (or `rank blocks') that
cannot appear in the orbits generated by a map taking values on a linearly
ordered space, in which case we say that the map has forbidden patterns. Once a
map has a forbidden pattern of a given length, it has forbidden patterns of
every larger length, and their number grows superexponentially with the length.
Using recent results on topological permutation entropy, we study in
this paper the existence and some basic properties of forbidden ordinal
patterns for self-maps on n-dimensional intervals. Our most applicable
conclusion is that expansive interval maps with finite topological entropy have
necessarily forbidden patterns, although we conjecture that this is also the
case under more general conditions. The theoretical results are nicely
illustrated for n=2 both using the naive counting estimator for forbidden
patterns and Chao's estimator for the number of classes in a population. The
robustness of forbidden ordinal patterns against observational white noise is
also illustrated.

Comment: 19 pages, 6 figures.
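The naive counting estimator is easy to sketch in the one-dimensional case: slide a length-L window along an orbit, record each window's ordinal pattern (the permutation that sorts it), and report the patterns never seen. Everything here, including the choice of the full logistic map as the example dynamic, is illustrative only:

```python
from itertools import permutations

def ordinal_pattern(window):
    """Rank block of a window: indices sorted by increasing value."""
    return tuple(sorted(range(len(window)), key=lambda i: window[i]))

def missing_patterns(orbit, L):
    """Length-L ordinal patterns never observed in the orbit.
    (Missing patterns estimate forbidden ones; a finite orbit may
    also miss admissible patterns.)"""
    seen = {ordinal_pattern(orbit[t:t + L]) for t in range(len(orbit) - L + 1)}
    return [p for p in permutations(range(L)) if p not in seen]

# Orbit of the logistic map x -> 4x(1-x), a standard 1-D test case.
x, orbit = 0.123, []
for _ in range(10_000):
    orbit.append(x)
    x = 4 * x * (1 - x)
```

For the full logistic map the decreasing pattern (2, 1, 0) is forbidden at length 3: two consecutive decreases would require two consecutive points above 3/4, which the map does not allow.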
Entropy-based parametric estimation of spike train statistics
We consider the evolution of a network of neurons, focusing on the asymptotic
behavior of spikes dynamics instead of membrane potential dynamics. The spike
response is not sought as a deterministic response in this context, but as a
conditional probability: "reading out the code" consists of inferring such a
probability. This probability is computed from empirical raster plots, by using
the framework of thermodynamic formalism in ergodic theory. This gives us a
parametric statistical model where the probability has the form of a Gibbs
distribution. In this respect, this approach generalizes the seminal and
profound work of Schneidman and collaborators. A minimal presentation of the
formalism is reviewed here, while a general algorithmic estimation method is
proposed yielding fast convergent implementations. It is also made explicit how
several spike observables (entropy, rate, synchronizations, correlations) are
given in closed form by the parametric estimation. This paradigm allows us not
only to estimate the spike statistics, given a design choice, but also to
compare different models, thus answering comparative questions about the
neural code such as: "Are correlations (or time synchrony, or a given set of
spike patterns, ...) significant with respect to rate coding only?" A numerical
validation of the method is proposed, and the perspectives regarding
spike-train code analysis are also discussed.

Comment: 37 pages, 8 figures, submitted.
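The entropy of spike blocks mentioned among the observables can at least be estimated naively from a raster: count the empirical frequencies of length-L binary words and plug them into the Shannon formula. This is a plug-in estimator only, a sketch and not the Gibbs-distribution machinery of the paper:

```python
from collections import Counter
from math import log2

def block_entropy(spikes, L):
    """Plug-in Shannon entropy (bits) of length-L blocks of a binary
    spike train, computed from their empirical frequencies."""
    n = len(spikes) - L + 1
    counts = Counter(tuple(spikes[t:t + L]) for t in range(n))
    return -sum((c / n) * log2(c / n) for c in counts.values())
```

A constant train has zero block entropy at every length, while a strictly alternating train has only two distinct blocks of any length, hence block entropy of at most one bit.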
Entropy of a bit-shift channel
We consider a simple transformation (coding) of an iid source called a
bit-shift channel. This simple transformation occurs naturally in magnetic or
optical data storage. The resulting process is not Markov of any order. We
discuss methods of computing the entropy of the transformed process, and study
some of its properties.

Comment: Published at http://dx.doi.org/10.1214/074921706000000293 in the IMS Lecture Notes--Monograph Series (http://www.imstat.org/publications/lecnotes.htm) by the Institute of Mathematical Statistics (http://www.imstat.org).
Universal finitary codes with exponential tails
In 1977, Keane and Smorodinsky showed that there exists a finitary
homomorphism from any finite-alphabet Bernoulli process to any other
finite-alphabet Bernoulli process of strictly lower entropy. In 1996, Serafin
proved the existence of a finitary homomorphism with finite expected coding
length. In this paper, we construct such a homomorphism in which the coding
length has exponential tails. Our construction is source-universal, in the
sense that it does not use any information on the source distribution other
than the alphabet size and a bound on the entropy gap between the source and
target distributions. We also indicate how our methods can be extended to prove
a source-specific version of the result for Markov chains.

Comment: 33 pages.
An Algorithm for Pattern Discovery in Time Series
We present a new algorithm for discovering patterns in time series and other
sequential data. We exhibit a reliable procedure for building the minimal set
of hidden, Markovian states that is statistically capable of producing the
behavior exhibited in the data -- the underlying process's causal states.
Unlike conventional methods for fitting hidden Markov models (HMMs) to data,
our algorithm makes no assumptions about the process's causal architecture (the
number of hidden states and their transition structure), but rather infers it
from the data. It starts with assumptions of minimal structure and introduces
complexity only when the data demand it. Moreover, the causal states it infers
have important predictive optimality properties that conventional HMM states
lack. We introduce the algorithm, review the theory behind it, prove its
asymptotic reliability, use large deviation theory to estimate its rate of
convergence, and compare it to other algorithms which also construct HMMs from
data. We also illustrate its behavior on an example process, and report
selected numerical results from an implementation.

Comment: 26 pages, 5 figures, 5 tables; http://www.santafe.edu/projects/CompMech. Added discussion of algorithm parameters; improved treatment of convergence and time complexity; added comparison to older methods.
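The core idea, that states are equivalence classes of histories sharing the same predictive distribution, can be caricatured in a few lines. This is a toy sketch under strong simplifications (a fixed history length and a crude tolerance test in place of the paper's statistical tests), not the authors' algorithm:

```python
from collections import defaultdict, Counter

def predictive_distributions(seq, L):
    """Empirical next-symbol distribution after each length-L history."""
    counts = defaultdict(Counter)
    for t in range(L, len(seq)):
        counts[seq[t - L:t]][seq[t]] += 1
    return {h: {s: c / sum(cnt.values()) for s, c in cnt.items()}
            for h, cnt in counts.items()}

def group_histories(dists, tol=0.05):
    """Group histories whose predictive distributions agree within tol,
    a crude stand-in for causal-state equivalence."""
    groups = []
    for h in sorted(dists):
        for g in groups:
            rep = dists[g[0]]
            keys = set(dists[h]) | set(rep)
            if all(abs(dists[h].get(k, 0) - rep.get(k, 0)) <= tol for k in keys):
                g.append(h)
                break
        else:
            groups.append([h])
    return groups
```

On the period-2 sequence "0101...", the length-1 histories "0" and "1" predict deterministically and land in two distinct groups, matching the two causal states of that process; a constant sequence yields a single group.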
A certain synchronizing property of subshifts and flow equivalence
We will study a certain synchronizing property of subshifts called
λ-synchronization. The λ-synchronizing subshifts form a large
class of irreducible subshifts containing the irreducible sofic shifts. We
prove that λ-synchronization is invariant under flow equivalence of
subshifts. The λ-synchronizing K-groups and the λ-synchronizing
Bowen-Franks groups are studied and proved to be invariant under flow
equivalence of λ-synchronizing subshifts. They are new flow equivalence
invariants for λ-synchronizing subshifts.

Comment: 28 pages.
Computational Mechanics of Input-Output Processes: Structured transformations and the ε-transducer
Computational mechanics quantifies structure in a stochastic process via its
causal states, leading to the process's minimal, optimal predictor: the
ε-machine. We extend computational mechanics to communication channels
between two processes, obtaining an analogous optimal model, the
ε-transducer, of the stochastic mapping between them. Here, we lay
the foundation of a structural analysis of communication channels, treating
joint processes and processes with input. The result is a principled structural
analysis of mechanisms that support information flow between processes. It is
the first in a series on the structural information theory of memoryful
channels, channel composition, and allied conditional information measures.

Comment: 30 pages, 19 figures; http://csc.ucdavis.edu/~cmg/compmech/pubs/et1.htm. Updated to conform to the published version, plus additional corrections and updates.