Lifelong Federated Reinforcement Learning: A Learning Architecture for Navigation in Cloud Robotic Systems
This paper was motivated by the problem of how to make robots fuse and
transfer their experience so that they can effectively use prior knowledge and
quickly adapt to new environments. To address the problem, we present a
learning architecture for navigation in cloud robotic systems: Lifelong
Federated Reinforcement Learning (LFRL). In this work, we propose a knowledge
fusion algorithm for upgrading a shared model deployed on the cloud. Then,
effective transfer learning methods in LFRL are introduced. LFRL is consistent
with human cognitive science and fits well in cloud robotic systems.
Experiments show that LFRL greatly improves the efficiency of reinforcement
learning for robot navigation. The cloud robotic system deployment also shows
that LFRL is capable of fusing prior knowledge. In addition, we release a cloud
robotic navigation-learning website based on LFRL
A computational theory of motor learning
In this paper we present a computational theory of human motor performance and learning. The theory is implemented as a running AI system called MAGGIE. Given a description of a desired movement as input, the system generates simulated motor behavior as output. The theory states that skills are encoded as motor schemas, which specify the positions and velocities of a limb at selected points in time. Moreover, there exist two natural representations for such knowledge: viewer-centered schemas describe visually perceived behavior, and joint-centered schemas are used to generate behavior. When the model acts upon these two representational formats, they exhibit quite different behavioral characteristics. MAGGIE performs the desired movement within a feedback control paradigm, monitoring for errors and correcting them when it detects them. Learning involves improving the joint-centered schema over many practice trials; this reduces the need for monitoring. The model accounts for a number of well-documented motor phenomena, including the speed-accuracy trade-off and the gradual improvement in performance with practice. It also makes several testable predictions. We close with a discussion of the theory's strengths and weaknesses, along with directions for future research
Children, Humanoid Robots and Caregivers
This paper presents developmental learning on a humanoid robot from human-robot interactions. We consider in particular teaching humanoids as children during the child's Separation and Individuation developmental phase (Mahler, 1979). Cognitive development during this phase is characterized both by the child's dependence on her mother for learning while becoming aware of her own individuality, and by self-exploration of her physical surroundings. We propose a learning framework for a humanoid robot inspired by such cognitive development
Indirect Methods for Robot Skill Learning
Robot learning algorithms are appealing alternatives for acquiring rational robotic behaviors from data collected during the execution of tasks. Furthermore, most robot learning techniques are stated as isolated stages and focused on directly obtaining rational policies as a result of optimizing only performance measures of single tasks. However, formulating robotic skill acquisition processes in such a way has some disadvantages. For example, if the same skill has to be learned by different robots, independent learning processes must be carried out to acquire a separate policy for each robot. Similarly, if a robot has to learn diverse skills, it must acquire the policy for each task in separate learning processes, in sequential order and commonly starting from scratch. In the same way, formulating the learning process in terms of only the performance measure makes robots unintentionally avoid situations that should not be repeated, but without any mechanism that captures the necessity of not repeating those wrong behaviors. In contrast, humans and other animals exploit their experience not only for improving the performance of the task they are currently executing, but also for indirectly constructing multiple models that help them with that particular task and generalize to new problems. Accordingly, the models and algorithms proposed in this thesis seek to be more data efficient and to extract more information from the interaction data that is collected either from the expert’s demonstrations or from the robot’s own experience. The first approach encodes robotic skills with shared latent variable models, obtaining latent representations that can be transferred from one robot to others, thereby avoiding the need to learn the same task from scratch. 
The second approach learns complex rational policies by representing them as hierarchical models that can perform multiple concurrent tasks, and whose components are learned in the same learning process instead of in separate processes. Finally, the third approach uses the interaction data to learn two alternative and antagonistic policies that capture what to do and what not to do, and which influence the learning process in addition to the performance measure defined for the task