Hierarchical Decomposition of Nonlinear Dynamics and Control for System Identification and Policy Distillation
The control of nonlinear dynamical systems remains a major challenge for
autonomous agents. Current trends in reinforcement learning (RL) focus on
complex representations of dynamics and policies, which have yielded impressive
results in solving a variety of hard control tasks. However, this new
sophistication and extremely over-parameterized models have come with the cost
of an overall reduction in our ability to interpret the resulting policies. In
this paper, we take inspiration from the control community and apply the
principles of hybrid switching systems in order to break down complex dynamics
into simpler components. We exploit the rich representational power of
probabilistic graphical models and derive an expectation-maximization (EM)
algorithm for learning a sequence model to capture the temporal structure of
the data and automatically decompose nonlinear dynamics into stochastic
switching linear dynamical systems. Moreover, we show how this framework of
switching models enables extracting hierarchies of Markovian and
auto-regressive locally linear controllers from nonlinear experts in an
imitation learning scenario. Comment: 2nd Annual Conference on Learning for Dynamics and Control
f-Divergence constrained policy improvement
To ensure stability of learning, state-of-the-art generalized policy
iteration algorithms augment the policy improvement step with a trust region
constraint bounding the information loss. The size of the trust region is
commonly determined by the Kullback-Leibler (KL) divergence, which not only
captures the notion of distance well but also yields closed-form solutions. In
this paper, we consider a more general class of f-divergences and derive the
corresponding policy update rules. The generic solution is expressed through
the derivative of the convex conjugate function to f and includes the KL
solution as a special case. Within the class of f-divergences, we further focus
on a one-parameter family of α-divergences to study effects of the
choice of divergence on policy improvement. Previously known as well as new
policy updates emerge for different values of α. We show that every type
of policy update comes with a compatible policy evaluation resulting from the
chosen f-divergence. Interestingly, the mean-squared Bellman error minimization
is closely related to policy evaluation with the Pearson χ²-divergence
penalty, while the KL divergence results in the soft-max policy update and a
log-sum-exp critic. We carry out asymptotic analysis of the solutions for
different values of α and demonstrate the effects of using different
divergence functions on a multi-armed bandit problem and on standard
reinforcement learning problems.
Intrinsic Motivation and Mental Replay enable Efficient Online Adaptation in Stochastic Recurrent Networks
Autonomous robots need to interact with unknown, unstructured and changing
environments, constantly facing novel challenges. Therefore, continuous online
adaptation for lifelong learning and sample-efficient mechanisms for adapting
to changes in the environment, the constraints, the tasks, or the robot
itself are crucial. In this work, we propose a novel framework for
probabilistic online motion planning with online adaptation based on a
bio-inspired stochastic recurrent neural network. By using learning signals
which mimic the intrinsic motivation signal cognitive dissonance, together
with a mental replay strategy to intensify experiences, the stochastic
recurrent network can learn from few physical interactions and adapt to novel
environments in seconds. We evaluate our online planning and adaptation
framework on an anthropomorphic KUKA LWR arm. The rapid online adaptation is
shown by learning unknown workspace constraints sample-efficiently from few
physical interactions while following given waypoints. Comment: accepted in Neural Networks
CRC 1114 - Report Membrane Deformation by N-BAR Proteins: Extraction of membrane geometry and protein diffusion characteristics from MD simulations
We describe simulations of proteins and artificial pseudo-molecules
interacting with and shaping lipid bilayer membranes. We extract protein diffusion
parameters, membrane deformation profiles and the elastic properties of the
membrane models used, in preparation for calculations based on a large-scale
continuum model.
Interactive television or enhanced television? : the Dutch users interest in applications of ITV via set-top boxes
This paper is both an analysis of the phenomenon of interactive television, drawing on background concepts of interactivity and television, and a report of an empirical investigation among Dutch users of set-top-box ITV. In the analytic part, a distinction is made between levels of interactivity in the applications of ITV. Activities labelled as selection, customisation, transaction and reaction reveal low levels of interactivity. They may be called ‘enhanced television’. They are extensions of existing television programmes that keep their linear character. Activities called production and conversation have the potential for higher interactivity. They may lead to ‘real’ interactive television as the user input makes a difference to programmes. It is suggested that so-called hybrid ITV (TV combined with telephone and email reply channels) and (broadband) Internet ITV offer better opportunities for high interactivity than set-top-box ITV.
The empirical investigation shows that the demand of subscribers to set-top-box ITV in the Netherlands matches supply. They favour the less interactive applications of selection and reaction. Other striking results are that young subscribers appreciate interactive applications more than older ones and that those with a low level of education prefer these applications more than highly educated subscribers. No significant gender differences were found.
Convergence & Competition: United Ways and Community Foundations - A National Inquiry
This U.S. report summarizes key findings of the research that was commissioned to support the active dialogue among leaders of United Ways and other community foundations about their respective roles in community philanthropy and what the options for strategic co-existence -- if not full-fledged cooperation -- will look like in the coming years.