Less Is More: Robust Robot Learning via Partially Observable Multi-Agent Reinforcement Learning
In many multi-agent and high-dimensional robotic tasks, the controller can be
designed in either a centralized or decentralized way. Correspondingly, it is
possible to use either single-agent reinforcement learning (SARL) or
multi-agent reinforcement learning (MARL) methods to learn such controllers.
However, the relationship between these two paradigms remains under-studied in
the literature. This work studies the robustness and performance of SARL and
MARL approaches applied to the same task, in order to identify the most
suitable methods. We start by analytically showing the
equivalence between these two paradigms under the full-state observation
assumption. Then, we identify a broad subclass of Dec-POMDP tasks
where the agents are weakly or partially interacting. In these tasks, we show
that partial observations of each agent are sufficient for near-optimal
decision-making. Furthermore, we propose to exploit such partially observable
MARL to improve the robustness of robots when joint or agent failures occur.
Our experiments on both simulated multi-agent tasks and a real robot task with
a mobile manipulator validate the presented insights and the effectiveness of
the proposed robust robot learning method via partially observable MARL.
Comment: 8 pages, 8 figures
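The centralized-versus-decentralized distinction in this abstract can be illustrated with a minimal linear-policy sketch. Everything concrete here (the state dimension, the two-agent split, the use of linear policies) is a hypothetical illustration, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical joint state of two weakly interacting agents (4 dimensions each).
joint_state = rng.normal(size=8)

def centralized_policy(state, weights):
    # SARL view: one policy maps the full joint state to the joint action.
    return weights @ state

def decentralized_policies(state, weight_list, obs_slices):
    # MARL view: each agent maps only its own partial observation to its action.
    return np.concatenate([w @ state[s] for w, s in zip(weight_list, obs_slices)])

W_central = rng.normal(size=(4, 8))
W_agents = [rng.normal(size=(2, 4)), rng.normal(size=(2, 4))]
slices = [slice(0, 4), slice(4, 8)]

a_central = centralized_policy(joint_state, W_central)
a_decentral = decentralized_policies(joint_state, W_agents, slices)
```

Under full-state observation, each `obs_slice` could cover the whole state and the two views coincide; the abstract's point is that for weakly interacting agents, the restricted slices already suffice for near-optimal decisions.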
Cooperative Adaptive Control for Cloud-Based Robotics
This paper studies collaboration through the cloud in the context of
cooperative adaptive control for robot manipulators. We first consider the case
of multiple robots manipulating a common object through synchronous centralized
update laws to identify unknown inertial parameters. Through this development,
we introduce a notion of Collective Sufficient Richness, wherein parameter
convergence can be enabled through teamwork in the group. The introduction of
this property and the analysis of stable adaptive controllers that benefit from
it constitute the main new contributions of this work. Building on this
original example, we then consider decentralized update laws, time-varying
network topologies, and the influence of communication delays on this process.
Perhaps surprisingly, these nonidealized networked conditions inherit the same
benefits of convergence being determined through collective effects for the
group. Simple simulations of a planar manipulator identifying an unknown load
are provided to illustrate the central idea and benefits of Collective
Sufficient Richness.
Comment: ICRA 201
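For context, a standard Slotine–Li-style adaptive update law for robot $i$, of the kind such cooperative schemes typically build on, has the form (conventional symbols, not necessarily the paper's notation):

```latex
\dot{\hat{a}}_i = -\Gamma_i \, Y_i^{\top}\!\left(q_i, \dot{q}_i, \dot{q}_{r,i}, \ddot{q}_{r,i}\right) s_i
```

where $\hat{a}_i$ is the inertial-parameter estimate, $Y_i$ the dynamics regressor, $\Gamma_i$ a positive-definite gain, and $s_i$ a composite tracking error. Classical sufficient-richness conditions require each robot's own regressor to be persistently exciting; the Collective Sufficient Richness notion above instead allows the excitation needed for parameter convergence to be pooled across the team.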
Exploring Multi-Agent Reinforcement Learning for Mobile Manipulation
To make robots valuable in our everyday lives, they need to be able to make good decisions even in unexpected situations. Reinforcement learning is a paradigm that aims to learn decision-making models for robots without the need for direct examples of the correct decisions. For this type of robot learning, it is common practice to learn a single central model that controls the entire robot. This work is motivated by advances in modular and swarm robotics, where multiple robots or decision-makers collaborate to complete a task. Instead of learning a single central model, we explore the idea of learning multiple decision-making models, each controlling a different part of the robot. In particular, we investigate whether providing the different models with different sensing capabilities helps the robot to learn or to be robust to perturbations. We formulate these problems as multi-agent problems and use a multi-agent reinforcement learning algorithm to solve them. To evaluate our approach, we design a mobile manipulation task and implement a simulation-based training pipeline to produce decision-making models that can complete the task. The trained models are then directly transferred to a real autonomous mobile manipulator system. Several experiments are performed on the real system to compare the performance and robustness against the usual central-model baseline. Our experimental results show that our approach can learn faster and produce decision-making models that are more robust to perturbations.
Designing Decentralized controllers for distributed-air-jet MEMS-based micromanipulators by reinforcement learning.
Distributed-air-jet MEMS-based systems have been proposed to manipulate small parts at high velocities and without any friction problems. The control of such distributed systems is very challenging, and usual approaches for contact-arrayed systems do not produce satisfactory results. In this paper, we investigate reinforcement learning control approaches in order to position and convey an object. Reinforcement learning is a popular approach to find controllers that are tailored exactly to the system without any prior model. We show how to apply reinforcement learning from a decentralized perspective in order to address the global-local trade-off. The simulation results demonstrate that the reinforcement learning method is a promising way to design control laws for such distributed systems.
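The decentralized setting described here, with the global-local trade-off, can be sketched as a grid of independent tabular learners. This is a hypothetical illustration only: the cell class, action count, hyperparameters, and the simple blended-reward scheme are not from the paper:

```python
import random

class CellLearner:
    """Hypothetical independent learner for one actuator cell (e.g. one
    air-jet site), using tabular epsilon-greedy Q-learning."""

    def __init__(self, n_actions=4, eps=0.1, alpha=0.2, gamma=0.9):
        self.q = {}  # local state -> list of action values
        self.n_actions, self.eps = n_actions, eps
        self.alpha, self.gamma = alpha, gamma

    def act(self, state):
        self.q.setdefault(state, [0.0] * self.n_actions)
        if random.random() < self.eps:
            return random.randrange(self.n_actions)
        values = self.q[state]
        return values.index(max(values))

    def update(self, s, a, reward, s_next):
        for st in (s, s_next):
            self.q.setdefault(st, [0.0] * self.n_actions)
        target = reward + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])

def blended_reward(local_r, global_r, w=0.5):
    # One simple way to express the global-local trade-off: mix each cell's
    # local reward with a shared global term via a weight w.
    return w * local_r + (1 - w) * global_r
```

Each cell learns from its own local state, while the weight `w` in the blended reward controls how strongly the shared objective shapes individual behavior.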
Hysteretic Q-Learning: an algorithm for decentralized reinforcement learning in cooperative multi-agent teams.
Multi-agent systems (MAS) are a field of study of growing interest in a variety of domains such as robotics or distributed control. The article focuses on decentralized reinforcement learning (RL) in cooperative MAS, where a team of independent learning robots (ILs) tries to coordinate their individual behaviors to reach a coherent joint behavior. We assume that each robot has no information about its teammates' actions. To date, RL approaches for such ILs have not guaranteed convergence to the optimal joint policy in scenarios where coordination is difficult. We report an investigation of existing algorithms for learning coordination in cooperative MAS and suggest a Q-Learning extension for ILs, called Hysteretic Q-Learning. This algorithm does not require any additional communication between robots. Its advantages are demonstrated and compared to other methods on various applications: bimatrix games, a collaborative ball-balancing task, and a pursuit domain.
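The core of Hysteretic Q-Learning is a single asymmetric update rule: an increase rate larger than the decrease rate makes each independent learner optimistic about its teammates' unseen actions. A minimal tabular sketch (table shapes and hyperparameter values are illustrative):

```python
import numpy as np

def hysteretic_q_update(Q, s, a, r, s_next, alpha=0.1, beta=0.01, gamma=0.95):
    """One hysteretic Q-learning step. The learning rate alpha applies to
    positive temporal-difference errors and the smaller beta to negative
    ones, so a robot is only weakly punished when a bad joint outcome is
    likely caused by a teammate's exploration."""
    delta = r + gamma * np.max(Q[s_next]) - Q[s, a]
    lr = alpha if delta >= 0 else beta
    Q[s, a] += lr * delta
    return Q
```

Setting `beta = alpha` recovers standard decentralized Q-learning; setting `beta = 0` gives fully optimistic (distributed) Q-learning, with hysteresis covering the middle ground.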
Enhancing Exploration and Safety in Deep Reinforcement Learning
A Deep Reinforcement Learning (DRL) agent tries to learn a policy maximizing a long-term objective by trial and error in large state spaces. However, this learning paradigm requires a non-trivial amount of interactions in the environment to achieve good performance. Moreover, critical applications, such as robotics, typically involve safety criteria to consider while designing novel DRL solutions. Hence, devising safe learning approaches with efficient exploration is crucial to avoid getting stuck in local optima, failing to learn properly, or causing damage to the surrounding environment. This thesis focuses on developing Deep Reinforcement Learning algorithms to foster efficient exploration and safer behaviors in simulation and real domains of interest, ranging from robotics to multi-agent systems. To this end, we rely both on standard benchmarks, such as SafetyGym, and robotic tasks widely adopted in the literature (e.g., manipulation, navigation). This variety of problems is crucial to assess the statistical significance of our empirical studies and the generalization skills of our approaches. We initially benchmark the sample efficiency versus performance trade-off between value-based and policy-gradient algorithms. This part highlights the benefits of using non-standard simulation environments (i.e., Unity), which also facilitates the development of further optimizations for DRL. We also discuss the limitations of standard evaluation metrics (e.g., return) in characterizing the actual behaviors of a policy, proposing the use of Formal Verification (FV) as a practical methodology to evaluate behaviors over desired specifications. The second part introduces Evolutionary Algorithms (EAs) as a gradient-free complementary optimization strategy. In detail, we combine population-based and gradient-based DRL to diversify exploration and improve performance both in single- and multi-agent applications.
For the latter, we discuss how prior Multi-Agent (Deep) Reinforcement Learning (MARL) approaches hinder exploration, proposing an architecture that favors cooperation without affecting exploration.