Signatures of Infinity: Nonergodicity and Resource Scaling in Prediction, Complexity, and Learning
We introduce a simple analysis of the structural complexity of
infinite-memory processes built from random samples of stationary, ergodic
finite-memory component processes. Such processes are familiar from the
well-known multi-armed bandit problem. We contrast our analysis with
computation-theoretic and statistical inference approaches to understanding
their complexity. The result is an alternative view of the relationship between
predictability, complexity, and learning that highlights the distinct ways in
which informational and correlational divergences arise in complex ergodic and
nonergodic processes. We draw out consequences for the resource divergences
that delineate the structural hierarchy of ergodic processes and for processes
that are themselves hierarchical. Comment: 8 pages, 1 figure; http://csc.ucdavis.edu/~cmg/compmech/pubs/soi.pd
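The nonergodicity described above can be made concrete with a minimal sketch, assuming i.i.d. Bernoulli component processes for simplicity (the paper treats general stationary, ergodic finite-memory components): a component is drawn once per realization, and time averages then converge to that component's mean rather than the ensemble mean.

```python
import random

def sample_nonergodic_path(biases, length, rng):
    """Pick one ergodic component at random, then generate the whole
    path from it: a simple nonergodic mixture process."""
    p = rng.choice(biases)              # hidden mixture component
    return p, [rng.random() < p for _ in range(length)]

rng = random.Random(0)
biases = (0.2, 0.8)                     # two i.i.d. Bernoulli components
p, path = sample_nonergodic_path(biases, 10_000, rng)
time_avg = sum(path) / len(path)        # converges to p, not to the
                                        # ensemble mean of 0.5
```

A learner seeing one path can estimate p well, but no single stationary ergodic model matches the process as a whole.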
Leading strategies in competitive on-line prediction
We start from a simple asymptotic result for the problem of on-line
regression with the quadratic loss function: the class of continuous
limited-memory prediction strategies admits a "leading prediction strategy",
which not only asymptotically performs at least as well as any continuous
limited-memory strategy but also satisfies the property that the excess loss of
any continuous limited-memory strategy is determined by how closely it imitates
the leading strategy. More specifically, for any class of prediction strategies
constituting a reproducing kernel Hilbert space we construct a leading
strategy, in the sense that the loss of any prediction strategy whose norm is
not too large is determined by how closely it imitates the leading strategy.
This result is extended to the loss functions given by Bregman divergences and
by strictly proper scoring rules. Comment: 20 pages; a conference version is
to appear in the ALT'2006 proceedings.
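A toy finite instance of the leading-strategy property, with constant strategies standing in for the paper's continuous limited-memory class (this is an illustration, not the paper's RKHS construction): under quadratic loss, the excess loss of any constant strategy over the best constant strategy in hindsight is exactly its squared distance from it.

```python
def cum_loss(preds, ys):
    """Cumulative quadratic loss of a prediction sequence."""
    return sum((y - p) ** 2 for p, y in zip(preds, ys))

ys = [0.3, 1.1, 0.7, 0.9, 0.5]
n = len(ys)
ybar = sum(ys) / n                  # best constant ("leading") strategy in hindsight
for a in (0.0, 0.5, 1.0):
    excess = cum_loss([a] * n, ys) - cum_loss([ybar] * n, ys)
    # excess loss of the constant strategy a over the leading one is
    # exactly n * (a - ybar)**2: determined by how closely a imitates ybar
```

This is the quadratic-loss bias-variance identity in miniature; the paper's result extends the idea to whole strategy classes via RKHS norms, Bregman divergences, and proper scoring rules.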
Dimension reduction for systems with slow relaxation
We develop reduced, stochastic models for high dimensional, dissipative
dynamical systems that relax very slowly to equilibrium and can encode long
term memory. We present a variety of empirical and first principles approaches
for model reduction, and build a mathematical framework for analyzing the
reduced models. We introduce the notions of universal and asymptotic filters to
characterize `optimal' model reductions for sloppy linear models. We illustrate
our methods by applying them to the practically important problem of modeling
evaporation in oil spills. Comment: 48 pages, 13 figures. Paper dedicated to the memory of Leo Kadanoff.
Scanning and Sequential Decision Making for Multidimensional Data -- Part II: The Noisy Case
We consider the problem of sequential decision making for random fields corrupted by noise. In this scenario, the decision maker observes a noisy version of the data yet is judged with respect to the clean data. In particular, we first consider the problem of scanning and sequentially filtering noisy random fields. In this case, the sequential filter is given the freedom to choose the path over which it traverses the random field (e.g., a noisy image or video sequence), so it is natural to ask what the best achievable performance is and how sensitive this performance is to the choice of the scan. We formally define the problem of scanning and filtering, derive a bound on the best achievable performance, and quantify the excess loss incurred when nonoptimal scanners are used, compared to optimal scanning and filtering. We then discuss the problem of scanning and prediction for noisy random fields. This setting is a natural model for applications such as restoration and coding of noisy images. We formally define the problem of scanning and prediction of a noisy multidimensional array and relate the optimal performance to the clean scandictability defined by Merhav and Weissman. Moreover, bounds on the excess loss due to suboptimal scans are derived, and a universal prediction algorithm is suggested. This paper is the second part of a two-part paper. The first part dealt with scanning and sequential decision making on noiseless data arrays.
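A minimal sketch of the noisy setting, under assumed toy parameters (a horizontally correlated binary field seen through a binary symmetric channel; the field model, channel, and both filters are illustrative choices, not the paper's): a causal filter that exploits the field's correlation along the scan beats one that simply outputs the noisy symbol, even though both are judged against the clean data.

```python
import random

rng = random.Random(1)
H = W = 64
copy_p, noise_p = 0.97, 0.15

# clean field: each pixel copies its left neighbour with prob copy_p;
# the decision maker sees it through a binary symmetric channel
clean, noisy = [], []
for _ in range(H):
    row = [int(rng.random() < 0.5)]
    for _ in range(W - 1):
        row.append(row[-1] if rng.random() < copy_p else 1 - row[-1])
    clean.append(row)
    noisy.append([b ^ int(rng.random() < noise_p) for b in row])

obs = [noisy[i][j] for i in range(H) for j in range(W)]     # raster scan
truth = [clean[i][j] for i in range(H) for j in range(W)]

# two causal filters along the scan, judged against the clean field
raw_loss = sum(z != x for z, x in zip(obs, truth))          # output the noisy symbol
maj_loss = 0
for t, x in enumerate(truth):                               # majority over the current
    win = obs[max(0, t - 2):t + 1]                          # and two previous symbols
    maj_loss += int(2 * sum(win) > len(win)) != x
```

The gap between the two losses is exactly the kind of scan-and-filter performance difference the bounds above quantify.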
Scanning and Sequential Decision Making for Multi-Dimensional Data - Part I: the Noiseless Case
We investigate the problem of scanning and prediction ("scandiction", for
short) of multidimensional data arrays. This problem arises in several aspects
of image and video processing, such as predictive coding, for example, where an
image is compressed by coding the error sequence resulting from scandicting it.
Thus, it is natural to ask what the optimal method is to scan and predict a
given image, what the resulting minimum prediction loss is, and whether there
exist specific scandiction schemes that are universal in some sense.
Specifically, we investigate the following problems: First, modeling the data
array as a random field, we wish to examine whether there exists a scandiction
scheme which is independent of the field's distribution, yet asymptotically
achieves the same performance as if this distribution was known. This question
is answered in the affirmative for the set of all spatially stationary random
fields and under mild conditions on the loss function. We then discuss the
scenario where a non-optimal scanning order is used, yet accompanied by an
optimal predictor, and derive bounds on the excess loss compared to optimal
scanning and prediction.
This paper is the first part of a two-part paper on sequential decision
making for multi-dimensional data. It deals with clean, noiseless data arrays.
The second part deals with noisy data arrays, namely, with the case where the
decision maker observes only a noisy version of the data, yet it is judged with
respect to the original, clean data. Comment: 46 pages, 2 figures. Revised
version: title changed, section 1 revised, section 3.1 added, a few
minor/technical corrections made.
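The sensitivity of prediction loss to the scan order can be sketched on an assumed toy field (horizontally correlated binary pixels; the field model and the predict-the-previous-symbol predictor are illustrative choices): a raster scan that follows the correlation structure pays far less loss than a random scan with the same predictor.

```python
import random

rng = random.Random(0)
H = W = 64
copy_p = 0.95

# synthetic field with horizontal correlation: each pixel copies its
# left neighbour with prob copy_p
field = []
for _ in range(H):
    row = [int(rng.random() < 0.5)]
    for _ in range(W - 1):
        row.append(row[-1] if rng.random() < copy_p else 1 - row[-1])
    field.append(row)

def scan_loss(order):
    """Hamming loss of the 'predict the previously scanned symbol'
    predictor along the given scan order."""
    loss, prev = 0, 0
    for i, j in order:
        loss += field[i][j] != prev
        prev = field[i][j]
    return loss

raster = [(i, j) for i in range(H) for j in range(W)]
shuffled = raster[:]
rng.shuffle(shuffled)
# the raster scan tracks the field's correlation; the random scan
# destroys it, so the same predictor pays a much larger loss
```

This is the phenomenon the excess-loss bounds above make precise: a nonoptimal scan, even paired with a sensible predictor, can be strictly worse.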
Predictive PAC Learning and Process Decompositions
We informally call a stochastic process learnable if it admits a
generalization error approaching zero in probability for any concept class with
finite VC-dimension (IID processes are the simplest example). A mixture of
learnable processes need not be learnable itself, and certainly its
generalization error need not decay at the same rate. In this paper, we argue
that it is natural in predictive PAC to condition not on the past observations
but on the mixture component of the sample path. This definition not only
matches what a realistic learner might demand, but also allows us to sidestep
several otherwise grave problems in learning from dependent data. In
particular, we give a novel PAC generalization bound for mixtures of learnable
processes with a generalization error that is not worse than that of each
mixture component. We also provide a characterization of mixtures of absolutely
regular (β-mixing) processes, of independent probability-theoretic
interest. Comment: 9 pages, accepted in NIPS 201
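Why conditioning on the mixture component is natural can be sketched with an assumed two-component Bernoulli mixture (illustrative parameters, not the paper's setting): on every sample path, an estimate formed on a prefix generalizes to the rest of that path, even though the path means do not converge to a single value across paths.

```python
import random

rng = random.Random(2)
components = (0.25, 0.75)               # two learnable (i.i.d.) processes
n_paths, n = 50, 2000

gaps, path_means = [], []
for _ in range(n_paths):
    theta = rng.choice(components)      # mixture component of this sample path
    xs = [rng.random() < theta for _ in range(2 * n)]
    train_mean = sum(xs[:n]) / n
    test_mean = sum(xs[n:]) / n
    gaps.append(abs(train_mean - test_mean))    # small on every path:
                                                # conditional on the component,
                                                # the estimate generalizes
    path_means.append(sum(xs) / (2 * n))        # across paths the mean
                                                # clusters at 0.25 or 0.75
```

Unconditionally, no single limit exists for the path means, but the conditional generalization gap is uniformly small, which is the behavior the per-component PAC bound captures.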
Universal neural field computation
Turing machines and G\"odel numbers are important pillars of the theory of
computation. Thus, any computational architecture needs to show how it could
relate to Turing machines and how stable implementations of Turing computation
are possible. In this chapter, we implement universal Turing computation in a
neural field environment. To this end, we employ the canonical symbologram
representation of a Turing machine obtained from a G\"odel encoding of its
symbolic repertoire and generalized shifts. The resulting nonlinear dynamical
automaton (NDA) is a piecewise affine-linear map acting on the unit square that
is partitioned into rectangular domains. Instead of looking at point dynamics
in phase space, we then consider the functional dynamics of probability
distribution functions (p.d.f.s) over phase space. This is generally described
by a Frobenius-Perron integral transformation that can be regarded as a neural
field equation over the unit square as feature space of a dynamic field theory
(DFT). Solving the Frobenius-Perron equation shows that uniform p.d.f.s with
rectangular support are again mapped onto uniform p.d.f.s with rectangular
support. We call the resulting representation a \emph{dynamic field
automaton}. Comment: 21 pages; 6 figures. arXiv admin note: text overlap with
arXiv:1204.546
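The rectangle-to-rectangle property can be sketched with an assumed piecewise affine map on the unit square (toy baker's-style branches, not the symbologram of any particular Turing machine): because each branch is affine on its strip, a rectangle contained in one strip maps to a rectangle, so a uniform p.d.f. supported on it stays uniform on the image.

```python
import random

# a piecewise affine map on the unit square, affine on each vertical strip
def nda_step(x, y):
    if x < 0.5:
        return 2 * x, 0.5 * y                  # left strip -> bottom half
    return 2 * x - 1, 0.5 * y + 0.5            # right strip -> top half

# a rectangle inside one strip maps to a rectangle; since the branch is
# affine there, the image is determined by two opposite corners
x0, x1, y0, y1 = 0.1, 0.4, 0.2, 0.8            # inside the left strip
(u0, v0), (u1, v1) = nda_step(x0, y0), nda_step(x1, y1)

rng = random.Random(3)
samples_inside = all(
    u0 <= u <= u1 and v0 <= v <= v1
    for u, v in (
        nda_step(x0 + rng.random() * (x1 - x0),
                 y0 + rng.random() * (y1 - y0))
        for _ in range(1000)
    )
)
```

Tracking rectangles with uniform densities, rather than individual points, is exactly the shift of perspective from point dynamics to functional dynamics of p.d.f.s described above.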