Search CORE

16,546 research outputs found

Prescribed Performance Control Guided Policy Improvement for Satisfying Signal Temporal Logic Tasks

Author: Dimarogonas Dimos V.
Varnai Peter
Publication venue
Publication date: 01/01/2019
Field of study

Signal temporal logic (STL) provides a user-friendly interface for defining complex tasks for robotic systems. Recent efforts aim at designing control laws or using reinforcement learning methods to find policies which guarantee satisfaction of these tasks. While the former suffer from the trade-off between task specification and computational complexity, the latter encounter difficulties in exploration as the tasks become more complex and challenging to satisfy. This paper proposes to combine the benefits of the two approaches and use an efficient prescribed performance control (PPC) base law to guide exploration within the reinforcement learning algorithm. The potential of the method is demonstrated in a simulated environment through two sample navigational tasks.Comment: This is the extended version of the paper accepted to the 2019 American Control Conference (ACC), Philadelphia (to be published

arXiv.org e-Print Archive

Publikationer från KTH

Crossref

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Artificial Intelligence and Systems Theory: Applied to Cooperative Robots

Author: Custodio Luis M. M.
Lima Pedro U.
Publication venue
Publication date: 01/09/2004
Field of study

This paper describes an approach to the design of a population of cooperative robots based on concepts borrowed from Systems Theory and Artificial Intelligence. The research has been developed under the SocRob project, carried out by the Intelligent Systems Laboratory at the Institute for Systems and Robotics - Instituto Superior Tecnico (ISR/IST) in Lisbon. The acronym of the project stands both for "Society of Robots" and "Soccer Robots", the case study where we are testing our population of robots. Designing soccer robots is a very challenging problem, where the robots must act not only to shoot a ball towards the goal, but also to detect and avoid static (walls, stopped robots) and dynamic (moving robots) obstacles. Furthermore, they must cooperate to defeat an opposing team. Our past and current research in soccer robotics includes cooperative sensor fusion for world modeling, object recognition and tracking, robot navigation, multi-robot distributed task planning and coordination, including cooperative reinforcement learning in cooperative and adversarial environments, and behavior-based architectures for real time task execution of cooperating robot teams

arXiv.org e-Print Archive

Directory of Open Access Journals

Omega-Regular Reward Machines

Author: Hahn Ernst Moritz
Perez Mateo
Schewe Sven
Somenzi Fabio
Trivedi Ashutosh
Wojtczak Dominik
Publication venue
Publication date: 14/08/2023
Field of study

Reinforcement learning (RL) is a powerful approach for training agents to perform tasks, but designing an appropriate reward mechanism is critical to its success. However, in many cases, the complexity of the learning objectives goes beyond the capabilities of the Markovian assumption, necessitating a more sophisticated reward mechanism. Reward machines and omega-regular languages are two formalisms used to express non-Markovian rewards for quantitative and qualitative objectives, respectively. This paper introduces omega-regular reward machines, which integrate reward machines with omega-regular languages to enable an expressive and effective reward mechanism for RL. We present a model-free RL algorithm to compute epsilon-optimal strategies against omega-egular reward machines and evaluate the effectiveness of the proposed algorithm through experiments.Comment: To appear in ECAI-202

arXiv.org e-Print Archive

A Developmental Organization for Robot Behavior

Author: Grupen Roderic A.
Publication venue: Lund University Cognitive Studies
Publication date: 01/01/2003
Field of study

This paper focuses on exploring how learning and development can be structured in synthetic (robot) systems. We present a developmental assembler for constructing reusable and temporally extended actions in a sequence. The discussion adopts the traditions of dynamic pattern theory in which behavior is an artifact of coupled dynamical systems with a number of controllable degrees of freedom. In our model, the events that delineate control decisions are derived from the pattern of (dis)equilibria on a working subset of sensorimotor policies. We show how this architecture can be used to accomplish sequential knowledge gathering and representation tasks and provide examples of the kind of developmental milestones that this approach has already produced in our lab

CiteSeerX

Multi-Agent Reinforcement Learning Guided by Signal Temporal Logic Specifications

Author: An Ziyan
Han Songyang
Ma Meiyi
Mangharam Rahul
Miao Fei
Wang Jiangwei
Yang Shuo
Zhang Zhili
Publication venue
Publication date: 22/10/2023
Field of study

Reward design is a key component of deep reinforcement learning, yet some tasks and designer's objectives may be unnatural to define as a scalar cost function. Among the various techniques, formal methods integrated with DRL have garnered considerable attention due to their expressiveness and flexibility to define the reward and requirements for different states and actions of the agent. However, how to leverage Signal Temporal Logic (STL) to guide multi-agent reinforcement learning reward design remains unexplored. Complex interactions, heterogeneous goals and critical safety requirements in multi-agent systems make this problem even more challenging. In this paper, we propose a novel STL-guided multi-agent reinforcement learning framework. The STL requirements are designed to include both task specifications according to the objective of each agent and safety specifications, and the robustness values of the STL specifications are leveraged to generate rewards. We validate the advantages of our method through empirical studies. The experimental results demonstrate significant reward performance improvements compared to MARL without STL guidance, along with a remarkable increase in the overall safety rate of the multi-agent systems

arXiv.org e-Print Archive