Learning Robust Options
Robust reinforcement learning aims to produce policies with strong guarantees even when the parameters of the environment or transition model are highly uncertain. Existing work uses value-based methods in the usual primitive-action setting. In this paper, we propose robust methods for learning temporally abstract actions in the framework of options. We present a Robust Options Policy Iteration (ROPI) algorithm with convergence guarantees, which learns options that are robust to model uncertainty. We use ROPI to learn robust options with the Robust Options Deep Q-Network (RO-DQN), which solves multiple tasks and mitigates model misspecification due to model uncertainty. Our experimental results suggest that policy iteration with linear features may have an inherent form of robustness when coarse feature representations are used. In addition, they demonstrate that robustness helps policy iteration implemented on top of deep neural networks generalize over a much broader range of dynamics than non-robust policy iteration.
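At the heart of such methods is a worst-case Bellman backup: policy evaluation minimizes over an uncertainty set of transition models before each greedy improvement step. Below is a minimal sketch of robust policy iteration over primitive actions (the option-level machinery of ROPI is not shown, and the finite list of candidate transition matrices standing in for the uncertainty set is an illustrative assumption):

```python
import numpy as np

def robust_policy_iteration(P_set, R, gamma=0.95, tol=1e-8):
    """Policy iteration with a worst-case (robust) Bellman backup.

    P_set : list of transition tensors, each of shape (A, S, S); a
            finite uncertainty set over dynamics (illustrative assumption).
    R     : reward array of shape (A, S).
    """
    A, S = R.shape
    policy = np.zeros(S, dtype=int)
    V = np.zeros(S)
    while True:
        # Robust policy evaluation: value under the worst model in the set.
        for _ in range(1000):
            Q_worst = np.min([R + gamma * (P @ V) for P in P_set], axis=0)
            V_new = Q_worst[policy, np.arange(S)]
            converged = np.max(np.abs(V_new - V)) < tol
            V = V_new
            if converged:
                break
        # Greedy improvement against the worst-case Q-values.
        Q_worst = np.min([R + gamma * (P @ V) for P in P_set], axis=0)
        new_policy = np.argmax(Q_worst, axis=0)
        if np.array_equal(new_policy, policy):
            return policy, V
        policy = new_policy
```

This is the standard rectangular robust-MDP simplification, where the adversary picks the worst model independently per state-action pair; it is a sketch of the underlying backup, not the paper's full algorithm.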
The Stable Entropy Hypothesis and Entropy-Aware Decoding: An Analysis and Algorithm for Robust Natural Language Generation
State-of-the-art language generation models can degenerate when applied to
open-ended generation problems such as text completion, story generation, or
dialog modeling. This degeneration usually shows up in the form of incoherence,
lack of vocabulary diversity, and self-repetition or copying from the context.
In this paper, we postulate that "human-like" generations usually lie in a narrow and nearly flat entropy band, and that violations of these entropy bounds correlate with degenerate behavior. Our experiments show that this stable narrow entropy zone exists across models, tasks, and domains, and they confirm the hypothesis that violations of this zone correlate with degeneration. We then use this insight to propose an entropy-aware decoding algorithm that respects these entropy bounds, resulting in less degenerate, more contextual, and more human-like language generation in open-ended text generation settings.
Information theoretic approach to interactive learning
The principles of statistical mechanics and information theory play an
important role in learning and have inspired both theory and the design of
numerous machine learning algorithms. The new aspect in this paper is a focus
on integrating feedback from the learner. A quantitative approach to
interactive learning and adaptive behavior is proposed, integrating model- and
decision-making into one theoretical framework. The paper follows a simple principle: the observer's world model and action policy should jointly yield maximal predictive power at minimal complexity. Classes of
optimal action policies and of optimal models are derived from an objective
function that reflects this trade-off between prediction and complexity. The
resulting optimal models then summarize, at different levels of abstraction,
the process's causal organization in the presence of the learner's actions. A
fundamental consequence of the proposed principle is that the learner's optimal
action policies balance exploration and control as an emerging property.
Interestingly, the explorative component is present even in the absence of policy randomness, i.e., in the optimal deterministic behavior. This is a direct result of requiring maximal predictive power in the presence of feedback.
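The trade-off that defines the objective can be written schematically as follows (an illustrative sketch only, not the paper's exact notation; I(·;·) denotes mutual information and λ ≥ 0 sets the trade-off):

```latex
% Schematic prediction-complexity trade-off (illustrative form):
% reward predictive power, penalize the complexity of what is retained.
\max \Big[
  \underbrace{I\big(\{\text{model},\text{action}\};\,\text{future}\big)}_{\text{predictive power}}
  \;-\;
  \lambda\,\underbrace{I\big(\{\text{model},\text{action}\};\,\text{past}\big)}_{\text{complexity}}
\Big]
```

Optimal models and policies are then the maximizers of this functional at each value of λ, tracing out the levels of abstraction mentioned above.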
Numerical reconstruction of brain tumours
We propose a nonlinear Landweber method for the inverse problem of locating the brain tumour source (the origin where the tumour formed), based on well-established models of reaction–diffusion type for brain tumour growth. The approach consists of recovering the initial density of the tumour cells from a later state, which can be given by a medical image, by running the model backwards. Moreover, full three-dimensional simulations of tumour source localization are presented for two types of data: the three-dimensional Shepp–Logan phantom and an MRI T1-weighted brain scan. These simulations use standard finite difference discretizations of the space and time derivatives, yielding a simple approach that performs well.
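The reconstruction loop can be illustrated with a Landweber iteration on a simplified stand-in for the forward model (a minimal 1-D sketch assuming pure linear diffusion with zero-flux boundaries and no reaction term, so the forward map is linear and self-adjoint; the paper's nonlinear reaction–diffusion setting requires the linearized adjoint instead):

```python
import numpy as np

def diffuse(c0, D=1.0, dx=0.1, dt=0.002, steps=500):
    """Forward model F: evolve initial density c0 by the 1-D heat
    equation with zero-flux boundaries, explicit finite differences.
    (A linear stand-in for the reaction-diffusion tumour model.)"""
    c = c0.copy()
    r = D * dt / dx**2              # explicit scheme stable for r <= 0.5
    for _ in range(steps):
        cp = np.pad(c, 1, mode='edge')          # Neumann boundaries
        c = c + r * (cp[2:] - 2 * c + cp[:-2])
    return c

def landweber(y, n_iter=200, omega=1.0):
    """Recover the initial density from the later state y.
    For this symmetric linear F the adjoint F* equals F itself,
    so each update is c <- c - omega * F(F(c) - y)."""
    c = np.zeros_like(y)
    for _ in range(n_iter):
        residual = diffuse(c) - y
        c = c - omega * diffuse(residual)
        c = np.maximum(c, 0.0)                  # densities are nonnegative
    return c

# Synthetic test: a localized source, blurred by diffusion, then recovered.
x = np.linspace(0, 10, 101)
true_c0 = np.exp(-((x - 4.0) ** 2) / 0.1)
observed = diffuse(true_c0)
recovered = landweber(observed)
```

Early stopping of the iteration (here via n_iter) supplies the regularization that such severely ill-posed backward problems require.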
Why highly expressed proteins evolve slowly
Much recent work has explored molecular and population-genetic constraints on
the rate of protein sequence evolution. The best predictor of evolutionary rate
is expression level, for reasons which have remained unexplained. Here, we
hypothesize that selection to reduce the burden of protein misfolding will
favor protein sequences with increased robustness to translational missense
errors. Pressure for translational robustness increases with expression level
and constrains sequence evolution. Using several sequenced yeast genomes,
global expression and protein abundance data, and sets of paralogs traceable to
an ancient whole-genome duplication in yeast, we rule out several confounding
effects and show that expression level explains roughly half the variation in
Saccharomyces cerevisiae protein evolutionary rates. We examine causes for
expression's dominant role and find that genome-wide tests favor the
translational robustness explanation over existing hypotheses that invoke
constraints on function or translational efficiency. Our results suggest that
proteins evolve at rates largely unrelated to their functions, and can explain
why highly expressed proteins evolve slowly across the tree of life.
Determinants of translation efficiency and accuracy
A given protein sequence can be encoded by an astronomical number of alternative nucleotide sequences. Recent research has revealed that this flexibility provides evolution with multiple ways to tune the efficiency and fidelity of protein translation and folding.
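The "astronomical number" follows from the degeneracy of the standard genetic code: each amino acid is encoded by one to six synonymous codons, so the number of alternative encodings of a peptide is the product of the per-residue degeneracies. A short illustration (the example peptide is arbitrary):

```python
from math import prod

# Number of synonymous codons per amino acid (standard genetic code).
DEGENERACY = {
    'A': 4, 'R': 6, 'N': 2, 'D': 2, 'C': 2, 'Q': 2, 'E': 2, 'G': 4,
    'H': 2, 'I': 3, 'L': 6, 'K': 2, 'M': 1, 'F': 2, 'P': 4, 'S': 6,
    'T': 4, 'W': 1, 'Y': 2, 'V': 4,
}

def num_encodings(peptide):
    """Count the distinct nucleotide sequences encoding this peptide."""
    return prod(DEGENERACY[aa] for aa in peptide)

# A 100-residue protein averaging ~3 codons per residue already admits
# on the order of 3**100 (about 5e47) encodings.
print(num_encodings("MKTAYIAKQR"))   # arbitrary 10-residue example
```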