
    Information driven self-organization of complex robotic behaviors

    Information theory is a powerful tool for expressing principles that drive autonomous systems because it is domain invariant and allows for an intuitive interpretation. This paper studies the use of the predictive information (PI), also called excess entropy or effective measure complexity, of the sensorimotor process as a driving force to generate behavior. We study nonlinear and nonstationary systems and introduce the time-local predictive information (TiPI), which allows us to derive exact results together with explicit update rules for the parameters of the controller in the dynamical systems framework. In this way the information principle, formulated at the level of behavior, is translated to the dynamics of the synapses. We underpin our results with a number of case studies on high-dimensional robotic systems. We show spontaneous cooperativity in a complex physical system with decentralized control. Moreover, a jointly controlled humanoid robot develops a wide behavioral variety depending on its physics and the environment it is dynamically embedded in. The behavior can be decomposed into a succession of low-dimensional modes that increasingly explore the behavior space. This is a promising way to avoid the curse of dimensionality, which prevents learning systems from scaling well.
    Comment: 29 pages, 12 figures
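    For intuition about the quantity being maximized, the sketch below estimates predictive information for a one-dimensional, discretized sensor stream as the mutual information between successive time steps. This is only an illustration of the objective, not the paper's TiPI derivation (which uses a time-local Gaussian approximation to obtain explicit parameter update rules); the function names and parameters here are hypothetical.

```python
# Minimal sketch: empirical predictive information of a discretized sensor
# stream, estimated as the mutual information I(past; future) between
# successive time steps. Illustrative only; not the paper's TiPI machinery.
import numpy as np

def predictive_information(x, n_bins=8):
    """Estimate I(x_t; x_{t+1}) in bits for a 1-D sensor time series."""
    # Discretize the signal into n_bins equally spaced bins.
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    s = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    past, future = s[:-1], s[1:]

    # Empirical joint and marginal distributions.
    joint = np.zeros((n_bins, n_bins))
    for p, f in zip(past, future):
        joint[p, f] += 1
    joint /= joint.sum()
    p_past = joint.sum(axis=1)
    p_future = joint.sum(axis=0)

    # I(past; future) = sum_{p,f} P(p,f) log2 [P(p,f) / (P(p) P(f))].
    mi = 0.0
    for i in range(n_bins):
        for j in range(n_bins):
            if joint[i, j] > 0:
                mi += joint[i, j] * np.log2(joint[i, j] / (p_past[i] * p_future[j]))
    return mi

# Example: a noisy oscillation carries more predictive information than pure noise.
t = np.linspace(0, 20 * np.pi, 5000)
print(predictive_information(np.sin(t) + 0.1 * np.random.randn(t.size)))
print(predictive_information(np.random.randn(t.size)))
```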

    Learning and Management for Internet-of-Things: Accounting for Adaptivity and Scalability

    Internet-of-Things (IoT) envisions an intelligent infrastructure of networked smart devices offering task-specific monitoring and control services. The unique features of IoT include extreme heterogeneity, a massive number of devices, and unpredictable dynamics, partially due to human interaction. These call for foundational innovations in network design and management. Ideally, the network should allow efficient adaptation to changing environments and low-cost implementation scalable to a massive number of devices, subject to stringent latency constraints. To this end, the overarching goal of this paper is to outline a unified framework for online learning and management policies in IoT through joint advances in communication, networking, learning, and optimization. From the network architecture vantage point, the unified framework leverages a promising fog architecture that enables smart devices to have proximity access to cloud functionalities at the network edge, along the cloud-to-things continuum. From the algorithmic perspective, key innovations target online approaches adaptive to different degrees of nonstationarity in IoT dynamics, and their scalable model-free implementation under limited feedback, which motivates blind or bandit approaches. The proposed framework aspires to offer a stepping stone toward systematic design and analysis of task-specific learning and management schemes for IoT, along with a host of new research directions to build on.
    Comment: Submitted on June 15 to the Proceedings of the IEEE Special Issue on Adaptive and Scalable Communication Networks
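    As a toy illustration of the kind of bandit-feedback, nonstationarity-aware policy the paper motivates, the sketch below exponentially forgets old observations so a simple selector can track drifting conditions. The scenario (choosing among fog nodes), the class name, and the reward model are assumptions for illustration only, not the paper's framework.

```python
# Minimal sketch: a discounted epsilon-greedy selector under bandit feedback.
# Old statistics are exponentially forgotten so the policy can adapt when the
# best option changes. Names and the reward model are illustrative assumptions.
import random

class DiscountedEpsilonGreedy:
    def __init__(self, n_arms, gamma=0.98, epsilon=0.1):
        self.n_arms = n_arms          # e.g. candidate fog nodes to offload to
        self.gamma = gamma            # forgetting factor for nonstationarity
        self.epsilon = epsilon        # exploration rate
        self.counts = [0.0] * n_arms  # discounted pull counts
        self.values = [0.0] * n_arms  # discounted reward sums

    def select(self):
        if random.random() < self.epsilon:
            return random.randrange(self.n_arms)
        estimates = [v / c if c > 0 else float("inf")
                     for v, c in zip(self.values, self.counts)]
        return max(range(self.n_arms), key=lambda a: estimates[a])

    def update(self, arm, reward):
        # Discount all past statistics, then add the new observation.
        self.counts = [c * self.gamma for c in self.counts]
        self.values = [v * self.gamma for v in self.values]
        self.counts[arm] += 1.0
        self.values[arm] += reward

# Usage: the best node changes halfway through, and the policy re-adapts.
policy = DiscountedEpsilonGreedy(n_arms=3)
for t in range(2000):
    arm = policy.select()
    best = 0 if t < 1000 else 2
    policy.update(arm, 1.0 if arm == best else 0.2)
```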

    Towards Continual Reinforcement Learning: A Review and Perspectives

    In this article, we aim to provide a literature review of different formulations and approaches to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We begin by discussing our perspective on why RL is a natural fit for studying continual learning. We then provide a taxonomy of different continual RL formulations and mathematically characterize the non-stationary dynamics of each setting. We go on to discuss the evaluation of continual RL agents, providing an overview of benchmarks used in the literature and important metrics for understanding agent performance. Finally, we highlight open problems and challenges in bridging the gap between the current state of continual RL and findings in neuroscience. While still in its early days, the study of continual RL promises to yield better incremental reinforcement learners that can function in increasingly realistic applications where non-stationarity plays a vital role, including healthcare, education, logistics, and robotics.
    Comment: Preprint, 52 pages, 8 figures
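    As a minimal concrete instance of the kind of non-stationary setting such a taxonomy covers, the sketch below runs tabular Q-learning with a constant step size on a small chain MDP whose reward function flips at an unknown time; the constant step size lets the agent forget and re-adapt. The environment and all parameters are illustrative assumptions, not taken from the article.

```python
# Minimal sketch of one continual-RL formulation: an MDP whose reward function
# changes at an unknown time, handled by tabular Q-learning with a constant
# step size so that older experience is gradually forgotten. Illustrative only.
import random

N = 5  # states 0..4 on a chain; actions: 0 = move left, 1 = move right

def step(s, a, goal):
    s2 = max(0, min(N - 1, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == goal else 0.0)

Q = [[0.0, 0.0] for _ in range(N)]
alpha, gamma, epsilon = 0.2, 0.9, 0.1
s = 2
for t in range(20000):
    goal = N - 1 if t < 10000 else 0  # reward function flips mid-stream
    a = random.randrange(2) if random.random() < epsilon else int(Q[s][1] > Q[s][0])
    s2, r = step(s, a, goal)
    Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
    s = s2
# After the change point, the greedy policy flips from "always right" to "always left".
print(Q)
```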

    Reinforcement Learning and Its Applications in Modern Power and Energy Systems: A Review


    Problem hierarchies in continual learning

    Research in Machine Learning (ML) can be viewed as a quest to develop increasingly general algorithmic solutions (methods) for increasingly challenging research problems (settings). From this perspective, progress can be realized in two ways: by introducing better methods for current settings, or by proposing interesting new settings for the research community to solve. Alongside recent progress in methods, a wide variety of research settings have also been introduced, often as variants of existing settings where underlying assumptions are removed to make the problem more realistic or general. The field of Continual Learning (CL), for example, consists of a family of settings where the stationarity assumption is removed, and where methods as a result have to learn from environments or data distributions that can change over time. In this work, we introduce the concept of problem hierarchies: hierarchical structures in which research settings are systematically organized based on their assumptions. Methods can then explicitly state their assumptions by selecting a target setting from this hierarchy. Most importantly, these structures make it possible to easily share and reuse research methods across different settings through inheritance, since a method developed for a given setting is also directly applicable to any of its children in the hierarchy. We argue that this simple mechanism can have great implications for ML research in practice. As a proof of concept of this approach, we introduce Sequoia, an open-source research framework in which we construct a hierarchy of the settings and methods in CL. We hope that this new paradigm and its first implementation can help unify and accelerate research in CL and serve as inspiration for future work in other fields. You can help us grow the tree by visiting github.com/lebrice/Sequoia
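    The sketch below illustrates the core mechanism in miniature: settings form a class hierarchy, and a method that declares a target setting is automatically applicable to every descendant of that setting. Class and attribute names are hypothetical and do not reproduce Sequoia's actual API.

```python
# Minimal sketch of a "problem hierarchy": settings are organized as a class
# hierarchy, and a method declaring a target setting applies to every more
# specific (descendant) setting. Names are hypothetical, not Sequoia's API.
class Setting:                       # root: generic supervised learning
    pass

class ContinualSLSetting(Setting):   # data distribution may drift over time
    pass

class TaskIncrementalSetting(ContinualSLSetting):  # adds known task boundaries
    pass

class Method:
    target_setting = Setting

    @classmethod
    def is_applicable_to(cls, setting_type):
        # A method applies to its target setting and to anything more specific.
        return issubclass(setting_type, cls.target_setting)

class ReplayMethod(Method):
    # Developed for the general continual setting...
    target_setting = ContinualSLSetting

# ...so it can also be evaluated, unchanged, on the more specific setting.
print(ReplayMethod.is_applicable_to(TaskIncrementalSetting))  # True
print(ReplayMethod.is_applicable_to(Setting))                 # False
```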

    Robot Learning for Manipulation of Deformable Linear Objects

    Deformable Object Manipulation (DOM) is a challenging problem in robotics. Until recently there has been limited research on the subject, with most robotic manipulation methods being developed for rigid objects. Part of the challenge in DOM is that non-rigid objects require solutions capable of generalizing to changes in shape and mechanical properties. Recently, Machine Learning (ML) has proven successful in other fields where generalization is important, such as computer vision, encouraging the application of ML to robotics as well. Notably, Reinforcement Learning (RL) has shown promise in finding control policies for manipulation of rigid objects. However, RL requires large amounts of data, a need best met in simulation, while deformable objects are inherently more difficult to model and simulate. This thesis presents ReForm, a simulation sandbox for robotic manipulation of Deformable Linear Objects (DLOs) such as cables, ropes, and wires. DLO manipulation is an interesting problem for a variety of applications throughout manufacturing, agriculture, and medicine. Currently, this sandbox includes six shape control tasks, which are classified as explicit when a precise shape is to be achieved, or implicit when the deformation is just a consequence of a more abstract goal, e.g. wrapping a DLO around another object. The proposed simulation environments aim to facilitate comparison and reproducibility of robot learning research. To that end, an RL algorithm is tested on each simulated task, providing initial benchmarking results. ReForm is among the first frameworks, developed concurrently with two others, to support DOM problems. This thesis also addresses the problem of DLO state representation for an explicit shape control problem. Moreover, the effects of elastoplastic properties on the RL reward definition are investigated. From a control perspective, DLOs with these properties are particularly challenging to manipulate due to their nonlinear behavior: they act elastically up to a yield point, after which they become permanently deformed. A low-dimensional representation from discrete differential geometry is proposed, offering more descriptive shape information than a simple point cloud while avoiding the need for curve fitting. Empirical results show that this representation leads to a better goal description in the presence of elastoplasticity, preventing the RL algorithm from converging to local minima that correspond to incorrect shapes of the DLO.
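    As an illustration of the low-dimensional, discrete-differential-geometry style of representation the thesis proposes, the sketch below computes the discrete turning angle at each interior vertex of an ordered point sequence along a DLO. It is a generic example of such a descriptor, not ReForm's exact representation; all names are assumptions.

```python
# Minimal sketch of a curvature-based shape descriptor for a deformable linear
# object: given ordered points along the DLO, compute the turning angle at each
# interior vertex. Illustrative of the general idea, not ReForm's descriptor.
import numpy as np

def discrete_curvatures(points):
    """points: (N, 3) ordered samples along the DLO; returns N-2 turning angles."""
    p = np.asarray(points, dtype=float)
    e = np.diff(p, axis=0)                        # edge vectors between samples
    e = e / np.linalg.norm(e, axis=1, keepdims=True)
    cosines = np.clip(np.einsum("ij,ij->i", e[:-1], e[1:]), -1.0, 1.0)
    return np.arccos(cosines)                     # angle between consecutive edges

# A straight wire has zero curvature everywhere; a kinked one does not.
straight = np.stack([np.linspace(0, 1, 10), np.zeros(10), np.zeros(10)], axis=1)
bent = straight.copy()
bent[5:, 1] = np.linspace(0.1, 0.5, 5)            # kink halfway along the wire
print(discrete_curvatures(straight).round(3))
print(discrete_curvatures(bent).round(3))
```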