A Minimum Relative Entropy Principle for Learning and Acting
This paper proposes a method to construct an adaptive agent that is universal
with respect to a given class of experts, where each expert is an agent that
has been designed specifically for a particular environment. This adaptive
control problem is formalized as the problem of minimizing the relative entropy
of the adaptive agent from the expert that is most suitable for the unknown
environment. If the agent is a passive observer, then the optimal solution is
the well-known Bayesian predictor. However, if the agent is active, then its
past actions need to be treated as causal interventions on the I/O stream
rather than normal probability conditions. Here it is shown that the solution
to this new variational problem is given by a stochastic controller called the
Bayesian control rule, which implements adaptive behavior as a mixture of
experts. Furthermore, it is shown that under mild assumptions, the Bayesian
control rule converges to the control law of the most suitable expert.
Comment: 36 pages, 11 figure
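The abstract above describes the Bayesian control rule as a stochastic controller that mixes experts and treats past actions as interventions. The following is a minimal sketch of that idea on a toy Bernoulli bandit, not the paper's implementation. The bandit setting, the function names (`update_posterior`, `sample_expert`, `run_bandit`) and the greedy expert policies are all assumptions made for illustration. The point it tries to show is that the weights over experts are updated with observation likelihoods only, while the action is drawn by sampling an expert from the current weights.

```python
import random


def update_posterior(weights, obs_likelihoods):
    """Bayes update over experts using observation likelihoods only.

    The agent's own actions are treated as interventions, so the
    experts' action probabilities do not enter the update.
    """
    unnorm = [w * l for w, l in zip(weights, obs_likelihoods)]
    total = sum(unnorm)
    if total == 0:
        raise ValueError("observation has zero likelihood under every expert")
    return [u / total for u in unnorm]


def sample_expert(weights, rng):
    """Draw an expert index with probability equal to its current weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def run_bandit(true_probs, expert_models, steps, seed=0):
    """Toy Bernoulli bandit (illustrative assumption, not from the paper).

    expert_models[m][a] is the reward probability that expert m assumes
    for arm a; expert m's policy is to pull the arm it believes is best.
    """
    rng = random.Random(seed)
    weights = [1.0 / len(expert_models)] * len(expert_models)
    for _ in range(steps):
        m = sample_expert(weights, rng)  # act as a randomly drawn expert
        arm = max(range(len(true_probs)), key=lambda a: expert_models[m][a])
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Likelihood of the observed reward under each expert's model.
        liks = [p[arm] if reward else 1.0 - p[arm] for p in expert_models]
        weights = update_posterior(weights, liks)
    return weights
```

Calling `run_bandit([0.8, 0.2], [[0.8, 0.2], [0.2, 0.8]], steps=200)` returns the final weights over the two hypothetical experts. In this toy setting the weight of the expert whose model matches the environment should dominate, which is in the spirit of the convergence claim in the abstract.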
About Adaptive Coding on Countable Alphabets: Max-Stable Envelope Classes
In this paper, we study the problem of lossless universal source coding for
stationary memoryless sources on countably infinite alphabets. This task is
generally not achievable without restricting the class of sources over which
universality is desired. Building on our prior work, we propose natural
families of sources characterized by a common dominating envelope. We
particularly emphasize the notion of adaptivity, which is the ability to
perform as well as an oracle knowing the envelope, without actually knowing it.
This is closely related to the notion of hierarchical universal source coding,
but with the important difference that families of envelope classes are not
discretely indexed and not necessarily nested.
Our contribution is to extend the classes of envelopes over which adaptive
universal source coding is possible, namely by including max-stable
(heavy-tailed) envelopes which are excellent models in many applications, such
as natural language modeling. We derive a minimax lower bound on the redundancy
of any code on such envelope classes, including an oracle that knows the
envelope. We then propose a constructive code that does not use knowledge of
the envelope. The code is computationally efficient and is structured to use an
Expanding Threshold for Auto-Censoring, and we therefore dub it the
ETAC-code. We prove that the ETAC-code achieves the lower
bound on the minimax redundancy within a factor logarithmic in the sequence
length, and can therefore be qualified as a near-adaptive code over families of
heavy-tailed envelopes. For finite and light-tailed envelopes the penalty is
even less, and the same code closely follows previous results that explicitly
made the light-tailed assumption. Our technical results are founded on methods
from regular variation theory and concentration of measure.
Bayesian shrinkage in mixture-of-experts models: identifying robust determinants of class membership
A method for implicit variable selection in mixture-of-experts frameworks is proposed.
We introduce a prior structure where information is taken from a set of independent
covariates. Robust class membership predictors are identified using a normal gamma
prior. The resulting model setup is used in a finite mixture of Bernoulli distributions
to find homogeneous clusters of women in Mozambique based on their information
sources on HIV. Fully Bayesian inference is carried out via the implementation of a
Gibbs sampler
A control algorithm for autonomous optimization of extracellular recordings
This paper develops a control algorithm that can autonomously position an electrode so as to find and then maintain an optimal extracellular recording position. The algorithm was developed and tested in a two-neuron computational model representative of the cells found in cerebral cortex. The algorithm is based on a stochastic optimization of a suitably defined signal quality metric and is shown to be capable of finding the optimal recording position along representative sampling directions, as well as maintaining the optimal signal quality in the face of modeled tissue movements. The application of the algorithm to acute neurophysiological recording experiments and its potential implications for chronic recording electrode arrays are discussed.
Causal inference using the algorithmic Markov condition
Inferring the causal structure that links n observables is usually based upon
detecting statistical dependences and choosing simple graphs that make the
joint measure Markovian. Here we argue why causal inference is also possible
when only single observations are present.
We develop a theory of how to generate causal graphs explaining similarities
between single objects. To this end, we replace the notion of conditional
stochastic independence in the causal Markov condition with the vanishing of
conditional algorithmic mutual information and describe the corresponding
causal inference rules.
We explain why a consistent reformulation of causal inference in terms of
algorithmic complexity implies a new inference principle that also takes into
account the complexity of conditional probability densities, making it
possible to select among Markov equivalent causal graphs. This insight provides
a theoretical foundation of a heuristic principle proposed in earlier work.
We also discuss how to replace Kolmogorov complexity with decidable
complexity criteria. This can be seen as an algorithmic analog of replacing the
empirically undecidable question of statistical independence with practical
independence tests that are based on implicit or explicit assumptions on the
underlying distribution.
Comment: 16 figure
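The abstract above mentions replacing Kolmogorov complexity with decidable complexity criteria. One common stand-in, used here purely as an assumption and not taken from the paper, is the length of a compressed string. The sketch below approximates algorithmic mutual information and its conditional version with `zlib`; the function names are hypothetical, and the estimates are only as good as the compressor.

```python
import zlib


def c(data: bytes) -> int:
    """Compressed length in bytes: a computable proxy for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def approx_mutual_information(x: bytes, y: bytes) -> int:
    """Proxy for I(x : y), approximated as C(x) + C(y) - C(xy)."""
    return c(x) + c(y) - c(x + y)


def approx_conditional_mutual_information(x: bytes, y: bytes, z: bytes) -> int:
    """Proxy for I(x : y | z), approximated as C(zx) + C(zy) - C(zxy) - C(z).

    The conditioning string z is placed first so that the compressor
    can reuse it when encoding x and y.
    """
    return c(z + x) + c(z + y) - c(z + x + y) - c(z)
```

Under this proxy, a conditional estimate close to zero plays the role of the vanishing conditional algorithmic mutual information described in the abstract. Where to set the threshold for "close to zero" is a practical choice that this sketch leaves open.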
A prototypical model for tensional wrinkling in thin sheets
The buckling and wrinkling of thin films has recently seen a surge of interest among physicists, biologists, mathematicians and engineers. This has been triggered by the growing interest in developing technologies at ever decreasing scales and the resulting necessity to control the mechanics of tiny structures, as well as by the realization that morphogenetic processes, such as the tissue-shaping instabilities occurring in animal epithelia or plant leaves, often emerge from mechanical instabilities of cell sheets. While the most basic buckling instability of uniaxially compressed plates was understood by Euler more than 200 years ago, recent experiments on nanometrically thin (ultrathin) films have shown significant deviations from predictions of standard buckling theory. Motivated by this puzzle, we introduce here a theoretical model that allows for a systematic analysis of wrinkling in sheets far from their instability threshold. We focus on the simplest extension of Euler buckling that exhibits wrinkles of finite length: a sheet under axisymmetric tensile loads. This geometry, whose first study is attributed to Lamé, allows us to construct a phase diagram that demonstrates the dramatic variation of wrinkling patterns from near-threshold to far-from-threshold conditions. Theoretical arguments and comparison to experiments show that for thin sheets the far-from-threshold regime is expected to emerge under extremely small compressive loads, emphasizing the relevance of our analysis for nanomechanics applications.
Beyond Word N-Grams
We describe, analyze, and evaluate experimentally a new probabilistic model
for word-sequence prediction in natural language based on prediction suffix
trees (PSTs). By using efficient data structures, we extend the notion of PST
to unbounded vocabularies. We also show how to use a Bayesian approach based on
recursive priors over all possible PSTs to efficiently maintain tree mixtures.
These mixtures have provably and practically better performance than almost any
single model. We evaluate the model on several corpora. The low perplexity
achieved by relatively small PST mixture models suggests that they may be an
advantageous alternative, both theoretically and practically, to the widely
used n-gram models.
Comment: 15 pages, one PostScript figure, uses psfig.sty and fullname.sty.
Revised version of a paper in the Proceedings of the Third Workshop on Very
Large Corpora, MIT, 199