Information driven self-organization of complex robotic behaviors
Information theory is a powerful tool to express principles to drive
autonomous systems because it is domain invariant and allows for an intuitive
interpretation. This paper studies the use of the predictive information (PI),
also called excess entropy or effective measure complexity, of the sensorimotor
process as a driving force to generate behavior. We study nonlinear and
nonstationary systems and introduce the time-local predicting information
(TiPI) which allows us to derive exact results together with explicit update
rules for the parameters of the controller in the dynamical systems framework.
In this way the information principle, formulated at the level of behavior, is
translated to the dynamics of the synapses. We underpin our results with a
number of case studies with high-dimensional robotic systems. We show the
spontaneous cooperativity in a complex physical system with decentralized
control. Moreover, a jointly controlled humanoid robot develops a high
behavioral variety depending on its physics and the environment it is
dynamically embedded into. The behavior can be decomposed into a succession of
low-dimensional modes that increasingly explore the behavior space. This is a
promising way to avoid the curse of dimensionality, which prevents learning
systems from scaling well.
Comment: 29 pages, 12 figures
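As a point of reference, the predictive information is commonly defined as the mutual information between the past and the future of a process. The notation below ($Y_{\text{past}}$, $Y_{\text{future}}$, $p$) is assumed for illustration and is not taken from the paper. The second line is the standard simplification for a first-order Markov process, not the paper's TiPI definition.

```latex
\begin{align}
  \mathrm{PI} &= I\!\left(Y_{\text{past}};\, Y_{\text{future}}\right)
    = \left\langle \ln
        \frac{p(y_{\text{past}},\, y_{\text{future}})}
             {p(y_{\text{past}})\, p(y_{\text{future}})}
      \right\rangle, \\
  \mathrm{PI}_{\text{Markov}} &= I\!\left(Y_{t};\, Y_{t+1}\right)
    = H(Y_{t+1}) - H(Y_{t+1} \mid Y_{t}).
\end{align}
```

Read this way, a high PI requires a future that is both varied (large $H(Y_{t+1})$) and predictable from the present (small $H(Y_{t+1} \mid Y_{t})$).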
Learning and Management for Internet-of-Things: Accounting for Adaptivity and Scalability
Internet-of-Things (IoT) envisions an intelligent infrastructure of networked
smart devices offering task-specific monitoring and control services. The
unique features of IoT include extreme heterogeneity, a massive number of
devices, and unpredictable dynamics partially due to human interaction. These
call for foundational innovations in network design and management. Ideally, the
design should allow efficient adaptation to changing environments and low-cost
implementation scalable to a massive number of devices, subject to stringent
latency constraints. To this end, the overarching goal of this paper is to
outline a unified framework for online learning and management policies in IoT
through joint advances in communication, networking, learning, and
optimization. From the network architecture vantage point, the unified
framework leverages a promising fog architecture that enables smart devices to
have proximity access to cloud functionalities at the network edge, along the
cloud-to-things continuum. From the algorithmic perspective, key innovations
target online approaches adaptive to different degrees of nonstationarity in
IoT dynamics, and their scalable model-free implementation under limited
feedback that motivates blind or bandit approaches. The proposed framework
aspires to offer a stepping stone that leads to systematic designs and analysis
of task-specific learning and management schemes for IoT, along with a host of
new research directions to build on.
Comment: Submitted on June 15 to Proceedings of the IEEE Special Issue on Adaptive
and Scalable Communication Networks
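To make the idea of a bandit approach that adapts to nonstationarity concrete, here is a minimal sketch in Python. It is a generic textbook technique (epsilon-greedy with a constant step size), not the framework proposed in the paper. The class `TrackingBandit`, the function `run_drifting_demo`, and all parameter values are hypothetical names chosen for illustration.

```python
import random


class TrackingBandit:
    """Epsilon-greedy bandit with a constant step size (illustrative only).

    The learner sees only the reward of the arm it pulled (bandit feedback).
    A constant step size weights recent rewards more heavily, so the value
    estimates can follow a reward process that changes over time.
    """

    def __init__(self, n_arms, epsilon=0.1, step_size=0.2, seed=0):
        self.estimates = [0.0] * n_arms
        self.epsilon = epsilon
        self.step_size = step_size
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore uniformly with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.estimates))
        return max(range(len(self.estimates)), key=self.estimates.__getitem__)

    def update(self, arm, reward):
        # Exponential recency-weighted average of the observed rewards.
        self.estimates[arm] += self.step_size * (reward - self.estimates[arm])


def run_drifting_demo(steps=2000, switch_at=1000):
    """Toy environment: arm 0 pays 1.0 before switch_at, arm 1 afterwards."""
    bandit = TrackingBandit(n_arms=2)
    for t in range(steps):
        best_arm = 0 if t < switch_at else 1
        arm = bandit.select_arm()
        reward = 1.0 if arm == best_arm else 0.0
        bandit.update(arm, reward)
    return bandit.estimates


if __name__ == "__main__":
    print(run_drifting_demo())
```

A sample-average estimator would keep favoring arm 0 long after the switch. The constant step size forgets old rewards geometrically, which is one simple way to trade accuracy in stationary phases for faster adaptation.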
Towards Continual Reinforcement Learning: A Review and Perspectives
In this article, we aim to provide a literature review of different
formulations and approaches to continual reinforcement learning (RL), also
known as lifelong or non-stationary RL. We begin by discussing our perspective
on why RL is a natural fit for studying continual learning. We then provide a
taxonomy of different continual RL formulations and mathematically characterize
the non-stationary dynamics of each setting. We go on to discuss evaluation of
continual RL agents, providing an overview of benchmarks used in the literature
and important metrics for understanding agent performance. Finally, we
highlight open problems and challenges in bridging the gap between the current
state of continual RL and findings in neuroscience. While still in its early
days, the study of continual RL has the promise to develop better incremental
reinforcement learners that can function in increasingly realistic applications
where non-stationarity plays a vital role. These include applications such as
those in the fields of healthcare, education, logistics, and robotics.
Comment: Preprint, 52 pages, 8 figures
Problem hierarchies in continual learning
Machine learning research can be seen as a quest for increasingly general learning algorithms, applicable to increasingly realistic problems. From this perspective, progress in the field can be made in two ways: by improving the algorithmic methods associated with existing problems, and by introducing new types of problems. Alongside the marked progress in machine learning methods, a wide range of new types of learning problems has also been proposed, in which the assumptions of existing problems are relaxed or generalized to better reflect real-world conditions. The field of Continual Learning is one such example, where the assumption of stationarity of the distributions encountered while training a model is relaxed, and where learning algorithms must therefore adapt to sudden or gradual changes in their environment. In this work, we introduce problem hierarchies, an application of the concept of type hierarchies from computer science to the domain of machine learning research problems. Problem hierarchies organize and structure learning problems according to their assumptions. Methods can therefore explicitly define their domain of application, allowing them to be shared and reused across different types of problems in a polymorphic manner: a method designed for a given domain can also be applied to a more specific domain, as indicated by their relationship in the problem hierarchy. We show that this system, when put into practice, brings various benefits that directly address several of the problems encountered by machine learning researchers. We demonstrate the viability of this principle with Sequoia, an open-source software framework that implements a problem hierarchy for continual learning. We hope that this new paradigm, along with its first implementation, can help unify and accelerate the various research efforts in continual learning, and encourage similar efforts in other research fields. You can help us grow the tree by visiting github.com/lebrice/Sequoia.

Research in Machine Learning (ML) can be viewed as a quest to develop increasingly general
algorithmic solutions (methods) for increasingly challenging research problems (settings).
From this perspective, progress can be realized in two ways: by introducing better methods
for current settings, or by proposing interesting new settings for the research community to
solve. Alongside recent progress in methods, a wide variety of research settings have also been
introduced, often as variants of existing settings where underlying assumptions are removed
to make the problem more realistic or general. The field of Continual Learning (CL), for
example, consists of a family of settings where the stationarity assumption is removed, and
where methods as a result have to learn from environments or data distributions that can
change over time. In this work, we introduce the concept of problem hierarchies: hierarchical
structures in which research settings are systematically organized based on their assumptions.
Methods can then explicitly state their assumptions by selecting a target setting from this
hierarchy. Most importantly, these structures make it possible to easily share and reuse
research methods across different settings using inheritance, since a method developed for a
given setting is also directly applicable to any of its children in the hierarchy. We argue
that this simple mechanism can have great implications for ML research in practice. As a
proof-of-concept of this approach, we introduce Sequoia, an open-source research framework
in which we construct a hierarchy of the settings and methods in CL. We hope that this
new paradigm and its first implementation can help unify and accelerate research in CL and
serve as inspiration for future work in other fields. You can help us grow the tree by visiting
github.com/lebrice/Sequoia
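The inheritance mechanism described in this abstract can be sketched with ordinary Python classes. The sketch below is not Sequoia's API: every class name (`BaseSetting`, `NonStationarySetting`, `TaskLabelledSetting`, `Method`, `NonStationaryMethod`) and the `is_applicable` helper are hypothetical and exist only to illustrate the idea that a child setting adds assumptions to its parent.

```python
class BaseSetting:
    """Root of a toy hierarchy: makes the fewest assumptions."""


class NonStationarySetting(BaseSetting):
    """Child setting: the data distribution may change over time."""


class TaskLabelledSetting(NonStationarySetting):
    """Grandchild: changes happen at task boundaries with known task labels."""


class Method:
    """A method declares the most general setting it was designed for."""

    target_setting = BaseSetting

    def is_applicable(self, setting):
        # A method built for a setting also applies to all of its children,
        # because children only add assumptions and never remove any.
        return isinstance(setting, self.target_setting)


class NonStationaryMethod(Method):
    """Toy method that needs only the assumptions of NonStationarySetting."""

    target_setting = NonStationarySetting


if __name__ == "__main__":
    method = NonStationaryMethod()
    for setting in (BaseSetting(), NonStationarySetting(), TaskLabelledSetting()):
        print(type(setting).__name__, method.is_applicable(setting))
```

Using `isinstance` makes the subclass relation do the work: the method is accepted by its target setting and by every descendant, and rejected by more general settings whose assumptions it cannot rely on.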
Robot Learning for Manipulation of Deformable Linear Objects
Deformable Object Manipulation (DOM) is a challenging problem in robotics. Until recently there has been limited research on the subject, with most robotic manipulation methods being developed for rigid objects. Part of the challenge in DOM is that non-rigid objects require solutions capable of generalizing to changes in shape and mechanical properties. Recently, Machine Learning (ML) has proven successful in other fields where generalization is important, such as computer vision, encouraging the application of ML to robotics as well. Notably, Reinforcement Learning (RL) has shown promise in finding control policies for manipulation of rigid objects. However, RL requires large amounts of data, a need that is more easily met in simulation, while deformable objects are inherently more difficult to model and simulate. This thesis presents ReForm, a simulation sandbox for robotic manipulation of Deformable Linear Objects (DLOs) such as cables, ropes, and wires. DLO manipulation is an interesting problem for a variety of applications throughout manufacturing, agriculture, and medicine. Currently, this sandbox includes six shape control tasks, which are classified as explicit when a precise shape is to be achieved, or implicit when the deformation is just a consequence of a more abstract goal, e.g. wrapping a DLO around another object. The proposed simulation environments aim to facilitate comparison and reproducibility of robot learning research. To that end, an RL algorithm is tested on each simulated task, providing initial benchmarking results. ReForm is one of the first three frameworks, developed concurrently, to support DOM problems. This thesis also addresses the problem of DLO state representation for an explicit shape control problem. Moreover, the effects of elastoplastic properties on the RL reward definition are investigated.
From a control perspective, DLOs with these properties are particularly challenging to manipulate due to their nonlinear behavior: they act elastically up to a yield point, after which they become permanently deformed. A low-dimensional representation from discrete differential geometry is proposed, offering more descriptive shape information than a simple point cloud while avoiding the need for curve fitting. Empirical results show that this representation leads to a better goal description in the presence of elastoplasticity, preventing the RL algorithm from converging to local minima that correspond to incorrect shapes of the DLO.
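The abstract does not say which quantity from discrete differential geometry the thesis uses. As one plausible illustration, the sketch below describes a planar DLO, sampled as a polyline, by the signed turning angle at each interior vertex, and compares two shapes through those angles. The functions `turning_angles` and `shape_distance` are hypothetical names, and the choice of turning angles is an assumption, not the thesis's representation.

```python
import math


def turning_angles(points):
    """Signed turning angle (radians) at each interior vertex of a 2D polyline."""
    if len(points) < 3:
        raise ValueError("need at least three points to define a turning angle")
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        ax, ay = x1 - x0, y1 - y0          # incoming segment
        bx, by = x2 - x1, y2 - y1          # outgoing segment
        cross = ax * by - ay * bx          # proportional to sin(angle)
        dot = ax * bx + ay * by            # proportional to cos(angle)
        angles.append(math.atan2(cross, dot))
    return angles


def shape_distance(current, goal):
    """Mean absolute difference between two turning-angle descriptors."""
    a, b = turning_angles(current), turning_angles(goal)
    if len(a) != len(b):
        raise ValueError("both polylines must have the same number of points")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    bent = [(0, 0), (1, 0), (1, 1)]
    shifted = [(5, 5), (6, 5), (6, 6)]
    print(turning_angles(bent), shape_distance(bent, shifted))
```

A descriptor of this kind needs only the sampled points, so no curve has to be fitted, and it is unchanged by translating or rotating the whole object. A distance on raw point coordinates does not have that property, which is one reason such a descriptor might give a cleaner goal description for shape control.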