28,534 research outputs found
iPLAN: Intent-Aware Planning in Heterogeneous Traffic via Distributed Multi-Agent Reinforcement Learning
Navigating safely and efficiently in dense and heterogeneous traffic
scenarios is challenging for autonomous vehicles (AVs) due to their inability
to infer the behaviors or intentions of nearby drivers. In this work, we
introduce a distributed multi-agent reinforcement learning (MARL) algorithm
that can predict trajectories and intents in dense and heterogeneous traffic
scenarios. Our approach for intent-aware planning, iPLAN, allows agents to
infer nearby drivers' intents solely from their local observations. We model
two distinct incentives for agents' strategies: Behavioral Incentive for
high-level decision-making based on their driving behavior or personality and
Instant Incentive for motion planning for collision avoidance based on the
current traffic state. Our approach enables agents to infer their opponents'
behavior incentives and integrate this inferred information into their
decision-making and motion-planning processes. We perform experiments on two
simulation environments, Non-Cooperative Navigation and Heterogeneous Highway.
In Heterogeneous Highway, results show that, compared with centralized training
decentralized execution (CTDE) MARL baselines such as QMIX and MAPPO, our
method yields a 4.3% and 38.4% higher episodic reward in mild and chaotic
traffic, with 48.1% higher success rate and 80.6% longer survival time in
chaotic traffic. We also compare with a decentralized training decentralized
execution (DTDE) baseline IPPO and demonstrate a higher episodic reward of
12.7% and 6.3% in mild traffic and chaotic traffic, 25.3% higher success rate,
and 13.7% longer survival time
Learning 3D Navigation Protocols on Touch Interfaces with Cooperative Multi-Agent Reinforcement Learning
Using touch devices to navigate in virtual 3D environments such as computer
assisted design (CAD) models or geographical information systems (GIS) is
inherently difficult for humans, as the 3D operations have to be performed by
the user on a 2D touch surface. This ill-posed problem is classically solved
with a fixed and handcrafted interaction protocol, which must be learned by the
user. We propose to automatically learn a new interaction protocol allowing to
map a 2D user input to 3D actions in virtual environments using reinforcement
learning (RL). A fundamental problem of RL methods is the vast amount of
interactions often required, which are difficult to come by when humans are
involved. To overcome this limitation, we make use of two collaborative agents.
The first agent models the human by learning to perform the 2D finger
trajectories. The second agent acts as the interaction protocol, interpreting
and translating to 3D operations the 2D finger trajectories from the first
agent. We restrict the learned 2D trajectories to be similar to a training set
of collected human gestures by first performing state representation learning,
prior to reinforcement learning. This state representation learning is
addressed by projecting the gestures into a latent space learned by a
variational auto encoder (VAE).Comment: 17 pages, 8 figures. Accepted at The European Conference on Machine
Learning and Principles and Practice of Knowledge Discovery in Databases 2019
(ECMLPKDD 2019
Motion Planning Among Dynamic, Decision-Making Agents with Deep Reinforcement Learning
Robots that navigate among pedestrians use collision avoidance algorithms to
enable safe and efficient operation. Recent works present deep reinforcement
learning as a framework to model the complex interactions and cooperation.
However, they are implemented using key assumptions about other agents'
behavior that deviate from reality as the number of agents in the environment
increases. This work extends our previous approach to develop an algorithm that
learns collision avoidance among a variety of types of dynamic agents without
assuming they follow any particular behavior rules. This work also introduces a
strategy using LSTM that enables the algorithm to use observations of an
arbitrary number of other agents, instead of previous methods that have a fixed
observation size. The proposed algorithm outperforms our previous approach in
simulation as the number of agents increases, and the algorithm is demonstrated
on a fully autonomous robotic vehicle traveling at human walking speed, without
the use of a 3D Lidar
- …