
    Graph-based Trajectory Prediction with Cooperative Information

    For automated driving, predicting the future trajectories of other road users in complex traffic situations is a hard problem. Modern neural networks use the past trajectories of traffic participants as well as map data to infer possible driver intentions and likely maneuvers. With increasing connectivity between cars and other traffic actors, cooperative information is another source of data that can be used as input for trajectory prediction algorithms. Connected actors might transmit their intended path or even complete planned trajectories to other actors, which simplifies the prediction problem through the constraints it imposes. In this work, we outline the benefits of using this source of data for trajectory prediction and propose a graph-based neural network architecture that can leverage it. We show that the network's performance increases substantially when cooperative data is present. Moreover, our proposed training scheme improves the network's performance even in cases where no cooperative information is available. We also show that the network can deal with inaccurate cooperative data, which allows it to be used in real automated driving environments. (Accepted for publication at the 26th IEEE International Conference on Intelligent Transportation Systems, 2023.)
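    The abstract does not detail how cooperative inputs enter the graph network, so the following is only a minimal sketch, assuming that a connected actor's transmitted plan is appended to that actor's node features together with an availability flag; the function name and feature layout are hypothetical. Keeping the feature length fixed whether or not a plan is present is one simple way to train and evaluate the same network with and without cooperative data.

```python
import numpy as np

def build_node_features(past_xy, planned_xy=None, horizon=12):
    """Flattened past track plus an optional cooperative planned trajectory;
    a binary flag marks whether the plan is actually present."""
    past = np.asarray(past_xy, dtype=np.float32).reshape(-1)
    if planned_xy is None:
        # No cooperative data: zero-fill the channel and clear the flag.
        plan = np.zeros(horizon * 2, dtype=np.float32)
        flag = np.zeros(1, dtype=np.float32)
    else:
        plan = np.asarray(planned_xy, dtype=np.float32).reshape(-1)
        flag = np.ones(1, dtype=np.float32)
    return np.concatenate([past, plan, flag])

# One feature vector per traffic actor (graph node); edges would connect nearby actors.
past_track = np.random.randn(8, 2)   # 8 observed (x, y) positions
coop_plan = np.random.randn(12, 2)   # 12 transmitted planned (x, y) positions
with_coop = build_node_features(past_track, coop_plan)
without_coop = build_node_features(past_track)
assert with_coop.shape == without_coop.shape  # same input size either way
```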

    FLATLAND: A study of Deep Reinforcement Learning methods applied to the vehicle rescheduling problem in a railway environment

    In Reinforcement Learning, the task is to learn how agents should take sequences of actions in an environment in order to maximize a numerical reward signal. Combining this learning process with neural networks has given rise to Deep Reinforcement Learning (DRL), which is nowadays applied in many domains, from video games to robotics and self-driving cars. This work investigates DRL approaches applied to Flatland, a multi-agent railway simulation in which the main task is to plan and reschedule train routes in order to optimize the traffic flow within the network. The tasks introduced in Flatland are based on the Vehicle Rescheduling Problem, for which determining an optimal solution is an NP-complete problem in combinatorial optimization, and finding acceptably good solutions with heuristics and deterministic methods is not feasible for realistic railway systems. In particular, we analyze the navigation task of a single agent that must reach a target station from a starting position in the minimum number of time steps, and the generalization of this task to a multi-agent setting, which introduces the additional issue of conflict avoidance and resolution between agents. To solve the problem, we designed specific observations of the environment that capture the information needed by a network trained with Deep Q-Learning and its variants to learn, for each agent, the action leading to the solution that maximizes the total reward. The positive results obtained on small environments suggest several interpretations and possible future developments, showing that Reinforcement Learning has the potential to solve the problem from a new perspective.
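    The thesis's network architecture and hyperparameters are not reproduced here, so the sketch below only illustrates the temporal-difference target at the core of Deep Q-Learning, applied to a linear Q-function over a hand-crafted observation vector; the observation size, action count, and learning rate are made-up placeholders, and the replay buffer, target network, and neural network itself are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES, N_ACTIONS = 16, 5   # e.g. a hand-crafted Flatland observation and the rail actions
GAMMA, LR = 0.99, 1e-2

W = rng.normal(scale=0.1, size=(N_ACTIONS, N_FEATURES))  # linear stand-in for the Q-network

def q_values(obs):
    return W @ obs  # Q(obs, a) for every action a

def q_learning_update(obs, action, reward, next_obs, done):
    """One temporal-difference step on the linear Q-approximator."""
    target = reward if done else reward + GAMMA * np.max(q_values(next_obs))
    td_error = target - q_values(obs)[action]
    W[action] += LR * td_error * obs  # gradient step on the squared TD error
    return td_error

# Toy transition; in Flatland, obs/next_obs would come from the observation builder.
obs, next_obs = rng.normal(size=N_FEATURES), rng.normal(size=N_FEATURES)
print(q_learning_update(obs, action=2, reward=-1.0, next_obs=next_obs, done=False))
```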

    Societies in the wild: cooperation, norms, and hierarchies

    Doctoral thesis with International Mention. This research has been funded in part by MCIN/AEI/10.13039/501100011033 and by "ERDF A way of making Europe" through grant PGC2018-098186-B-I00 (BASIC). Doctoral Program in Mathematical Engineering, Universidad Carlos III de Madrid. Examination committee: Chair: Sandro Meloni; Secretary: Francesca Lipari; Member: Giulia Andrighetto.

    The Wasserstein Believer: Learning Belief Updates for Partially Observable Environments through Reliable Latent Space Models

    Partially Observable Markov Decision Processes (POMDPs) are used to model environments in which an agent cannot perceive the full state. The agent therefore needs to reason over its past observations and actions. However, simply remembering the full history is generally intractable due to the exponential growth of the history space. Maintaining a probability distribution over the true state, the belief, can serve as a sufficient statistic of the history, but computing it requires access to the model of the environment and is often intractable. While state-of-the-art algorithms use Recurrent Neural Networks to compress the observation-action history into what is hoped to be a sufficient statistic, they lack guarantees of success and can lead to sub-optimal policies. To overcome this, we propose the Wasserstein Belief Updater, an RL algorithm that learns a latent model of the POMDP and an approximation of the belief update. Our approach comes with theoretical guarantees on the quality of this approximation, ensuring that the beliefs it outputs allow for learning the optimal value function.
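    For reference, the exact belief update that such methods approximate is the standard Bayesian filter for POMDPs, which requires full access to the transition and observation models; the sketch below computes it for a finite POMDP, with illustrative tensor names and shapes.

```python
import numpy as np

def belief_update(belief, action, observation, T, O):
    """Exact Bayesian belief update for a finite POMDP.

    belief : (S,)      current distribution over states
    T      : (A, S, S) T[a, s, s'] = P(s' | s, a)
    O      : (A, S, Z) O[a, s', z] = P(z | s', a)
    """
    predicted = belief @ T[action]                         # P(s' | belief, a)
    unnormalized = O[action][:, observation] * predicted   # weight by P(z | s', a)
    norm = unnormalized.sum()
    if norm == 0.0:
        raise ValueError("observation has zero probability under the model")
    return unnormalized / norm
```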

    Learning Control Policies for Stochastic Systems with Reach-avoid Guarantees

    We study the problem of learning controllers for discrete-time non-linear stochastic dynamical systems with formal reach-avoid guarantees. This work presents the first method for providing formal reach-avoid guarantees, which combine and generalize stability and safety guarantees, with a tolerable probability threshold $p \in [0,1]$ over the infinite time horizon. Our method leverages advances in the machine learning literature and represents formal certificates as neural networks. In particular, we learn a certificate in the form of a reach-avoid supermartingale (RASM), a novel notion that we introduce in this work. Our RASMs provide reachability and avoidance guarantees by imposing constraints on what can be viewed as a stochastic extension of level sets of Lyapunov functions for deterministic systems. Our approach solves several important problems: it can be used to learn a control policy from scratch, to verify a reach-avoid specification for a fixed control policy, or to fine-tune a pre-trained policy that does not satisfy the reach-avoid specification. We validate our approach on 3 stochastic non-linear reinforcement learning tasks. (Accepted at AAAI 2023.)
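    The precise RASM conditions are defined in the paper and not reproduced in this abstract. Purely as an illustration of the supermartingale-style constraint mentioned above, the sketch below Monte-Carlo checks an expected-decrease condition, E[V(x_next)] <= V(x) - eps outside a target set, for a hand-written candidate certificate on a toy stochastic system; the dynamics, policy, certificate, sets, and thresholds are all made up, whereas in the paper both the policy and the certificate are neural networks and the verification is formal rather than sampling-based.

```python
import numpy as np

rng = np.random.default_rng(1)

def step(x, u, noise_scale=0.05):
    """Toy discrete-time stochastic dynamics (illustrative only)."""
    return 0.9 * x + 0.1 * u + noise_scale * rng.normal(size=x.shape)

def policy(x):
    return -x  # simple stabilizing feedback; learned in the actual method

def certificate(x):
    return float(np.dot(x, x))  # candidate V(x); a neural network in the actual method

def expected_decrease_holds(x, eps=1e-3, n_samples=256):
    """Monte-Carlo check of E[V(x_next) | x] <= V(x) - eps at a single state."""
    next_vals = [certificate(step(x, policy(x))) for _ in range(n_samples)]
    return np.mean(next_vals) <= certificate(x) - eps

# Sample states outside a small target set around the origin and test the condition.
states = rng.uniform(-1.0, 1.0, size=(100, 2))
outside_target = [x for x in states if np.linalg.norm(x) > 0.2]
print(all(expected_decrease_holds(x) for x in outside_target))
```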

    What working memory is for
