Mixed Traffic Control and Coordination from Pixels
Traffic congestion is a persistent problem in our society. Existing methods
for traffic control have proven insufficient to alleviate current congestion
levels, leading researchers to explore ideas with robot vehicles, given the
increasing presence of vehicles with different levels of autonomy on our roads. This
gives rise to mixed traffic control, where robot vehicles regulate human-driven
vehicles through reinforcement learning (RL). However, most existing studies
use precise observations that involve global information, such as environment
outflow, and local information, i.e., vehicle positions and velocities.
Obtaining this information requires updating existing road infrastructure with
extensive sensing equipment and communicating with potentially unwilling human
drivers. We consider image observations as an alternative for mixed traffic
control via RL: 1) images are ubiquitous through satellite imagery, in-car
camera systems, and traffic monitoring systems; 2) images do not require the
observation space to be redesigned from environment to environment; and 3)
images require communication only with equipment, not drivers. In this
work, we show robot vehicles using image observations can achieve similar
performance to using precise information on environments, including ring,
figure eight, intersection, merge, and bottleneck. In certain scenarios, our
approach even outperforms using precise observations, e.g., up to a 26%
increase in average vehicle velocity in the merge environment and a 6% increase
in outflow in the bottleneck environment, despite only using local traffic
information as opposed to global traffic information.
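The abstract gives no implementation details; a minimal sketch of how raw camera frames might be preprocessed into an RL observation (the 84x84 resolution, frame-stacking depth, and nearest-neighbor resize are assumptions, not from the paper) could look like:

```python
import numpy as np

def preprocess_frame(frame, out_h=84, out_w=84):
    """Convert an RGB traffic-camera frame to a downsampled grayscale image."""
    gray = frame.mean(axis=2)                        # naive luminance: average RGB
    h, w = gray.shape
    rows = np.linspace(0, h - 1, out_h).astype(int)  # nearest-neighbor resize
    cols = np.linspace(0, w - 1, out_w).astype(int)
    return gray[np.ix_(rows, cols)] / 255.0          # normalize to [0, 1]

def stack_frames(frames):
    """Stack the k most recent frames so a policy can infer velocities."""
    return np.stack([preprocess_frame(f) for f in frames], axis=0)

# Example: 4 synthetic 480x640 RGB frames -> a (4, 84, 84) observation tensor.
obs = stack_frames([np.random.randint(0, 256, (480, 640, 3)) for _ in range(4)])
```

Stacking consecutive frames is a standard trick for image-based RL: a single frame shows positions but not velocities, while a short stack lets the policy recover motion.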
PeRP: Personalized Residual Policies For Congestion Mitigation Through Co-operative Advisory Systems
Intelligent driving systems can be used to mitigate congestion through simple
actions, thus improving many socioeconomic factors such as commute time and gas
costs. However, these systems assume precise control over autonomous vehicle
fleets, and are hence limited in practice as they fail to account for
uncertainty in human behavior. Piecewise Constant (PC) Policies address these
issues by structurally modeling how humans drive to reduce traffic congestion
in dense scenarios, providing action advice for human drivers to follow.
However, PC policies assume that all drivers behave similarly. To this
end, we develop a co-operative advisory system based on PC policies with a
novel driver trait conditioned Personalized Residual Policy, PeRP. PeRP advises
drivers to behave in ways that mitigate traffic congestion. We first infer the
driver's intrinsic traits on how they follow instructions in an unsupervised
manner with a variational autoencoder. Then, a policy conditioned on the
inferred trait adapts the action of the PC policy to provide the driver with a
personalized recommendation. Our system is trained in simulation with novel
driver modeling of instruction adherence. We show that our approach
successfully mitigates congestion while adapting to different driver behaviors,
with a 4 to 22% improvement in average speed over baselines.
Comment: Accepted to ITSC 2023. Additional material and code are available at
the project webpage: https://sites.google.com/illinois.edu/per
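The abstract describes a residual-policy structure: a trait vector is inferred from how a driver follows instructions, and a residual head adjusts the PC policy's advice. A hedged sketch of that composition (the linear trait summary and residual head below are placeholders for the paper's VAE encoder and learned policy, and all parameters are hypothetical) might be:

```python
import numpy as np

rng = np.random.default_rng(0)

def infer_trait(adherence_errors):
    """Stand-in for the VAE encoder: summarize how closely a driver has
    followed past advice as a low-dimensional trait vector."""
    errors = np.asarray(adherence_errors)
    return np.array([errors.mean(), errors.std()])

def personalized_advice(trait, pc_action, weights):
    """Hypothetical linear residual head: shift the PC policy's advised
    speed based on the inferred driver trait."""
    residual = weights @ trait
    return pc_action + float(residual)

weights = rng.normal(scale=0.1, size=2)   # placeholder learned parameters
trait = infer_trait([0.2, 0.1, 0.3])      # driver tends to under-follow advice
advice = personalized_advice(trait, pc_action=12.0, weights=weights)
```

The design point is that the PC policy stays fixed; only a small residual, conditioned on the driver's inferred trait, personalizes the recommendation.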
Bilateral Deep Reinforcement Learning Approach for Better-than-human Car Following Model
In the coming years and decades, autonomous vehicles (AVs) will become
increasingly prevalent, offering new opportunities for safer and more
convenient travel and potentially smarter traffic control methods exploiting
automation and connectivity. Car following is a prime function in autonomous
driving. Car following based on reinforcement learning (RL) has received
attention in recent years, with the goal of learning and achieving performance
levels comparable to humans. However, most existing RL methods model car
following as a unilateral problem, sensing only the vehicle ahead. Recent
literature (Wang and Horn [16]), however, has shown that bilateral car
following, which considers both the vehicle ahead and the vehicle behind,
exhibits better system stability. In this paper, we hypothesize that bilateral
car following can be learned using RL alongside other goals, such as efficiency
maximization, jerk minimization, and safety rewards, leading to a learned model
that outperforms human driving.
We propose and introduce a Deep Reinforcement Learning (DRL) framework for
car following control by integrating bilateral information into both state and
reward function based on the bilateral control model (BCM) for car following
control. Furthermore, we use a decentralized multi-agent reinforcement learning
framework to generate the corresponding control action for each agent. Our
simulation results demonstrate that our learned policy outperforms the human
driving policy in terms of (a) inter-vehicle headways, (b) average speed, (c)
jerk, (d) time to collision (TTC), and (e) string stability.
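The paper's exact reward function is not given in the abstract; a hedged sketch of a bilateral car-following reward, penalizing asymmetric front/rear headways (the core idea of the bilateral control model), jerk, and short time-to-collision, could look like the following, where every weight and threshold is an illustrative assumption:

```python
def bilateral_reward(gap_ahead, gap_behind, speed, jerk, ttc,
                     target_speed=30.0, ttc_min=4.0):
    """Illustrative bilateral reward: all weights are hypothetical."""
    symmetry_pen = (gap_ahead - gap_behind) ** 2   # BCM: keep both headways equal
    speed_term = -abs(speed - target_speed)        # efficiency objective
    jerk_pen = jerk ** 2                           # comfort objective
    safety_pen = max(0.0, ttc_min - ttc) ** 2      # penalize short time-to-collision
    return speed_term - 0.1 * symmetry_pen - 0.5 * jerk_pen - 1.0 * safety_pen

# A vehicle centered between its neighbors scores higher than a skewed one.
r_balanced = bilateral_reward(20.0, 20.0, 30.0, 0.0, 10.0)
r_skewed = bilateral_reward(30.0, 10.0, 30.0, 0.0, 10.0)
```

The symmetry penalty is what distinguishes this from a unilateral formulation: the agent is rewarded for centering itself between the vehicle ahead and the vehicle behind, which is the mechanism Wang and Horn link to string stability.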
Deep Reinforcement Learning Approach for Lagrangian Control: Improving Freeway Bottleneck Throughput Via Variable Speed Limit
Connected vehicles (CVs) will enable new applications to improve traffic flow. The focus of this dissertation is to investigate how reinforcement learning (RL) control of the variable speed limit (VSL) through CVs can be generalized to improve traffic flow at different freeway bottlenecks. Three bottlenecks are investigated: a sag curve, where the gradient changes from negative to positive, reducing roadway capacity and causing congestion; a lane reduction, where three lanes merge into two and cause congestion; and an on-ramp, where an increase in demand on a multilane freeway causes a capacity drop. An RL algorithm is developed and implemented in a simulation environment to control a VSL upstream of the bottleneck, manipulating the inflow of vehicles to minimize delays and increase throughput. CVs are assumed to receive VSL messages through Infrastructure-to-Vehicle (I2V) communication technologies. Asynchronous Advantage Actor-Critic (A3C) algorithms are developed for each bottleneck to determine optimal VSL policies. Through these RL control algorithms, the speeds of CVs are manipulated upstream of the bottleneck to avoid or minimize congestion. Various market penetration rates for CVs are considered in the simulations. It is demonstrated that the RL algorithm adapts to stochastic arrivals of CVs, achieves significant improvements even at low market penetration rates, and finds solutions for all three bottlenecks. The results also show that the RL-based solutions outperform feedback-control-based solutions.
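The abstract does not specify the action space; a common choice for VSL control, and the one sketched here purely as an assumption, is a discrete set of speed limits that the RL agent selects from and that compliant CVs then obey in the upstream control segment:

```python
# Hypothetical discrete VSL action space: 40-80 km/h in 10 km/h steps.
VSL_ACTIONS = [40, 50, 60, 70, 80]

def apply_vsl(action_index, cv_speeds):
    """Cap the speeds (km/h) of connected vehicles in the upstream control
    segment at the advised limit, assuming full CV compliance."""
    limit = float(VSL_ACTIONS[action_index])
    return [min(v, limit) for v in cv_speeds]

# Agent picks action 1 (50 km/h); only CVs above the limit slow down.
slowed = apply_vsl(1, [55.0, 90.0, 48.0])
```

In the dissertation's setting, the actor-critic agent would pick the action index from traffic-state observations each control step; at partial market penetration only the CV fraction of the fleet would be capped this way, which is why performance at low penetration rates is a key result.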
Communication-Efficient Cooperative Multi-Agent PPO via Regulated Segment Mixture in Internet of Vehicles
Multi-Agent Reinforcement Learning (MARL) has become a classic paradigm to
solve diverse, intelligent control tasks like autonomous driving in Internet of
Vehicles (IoV). However, the widely assumed existence of a central node to
implement centralized federated learning-assisted MARL might be impractical in
highly dynamic scenarios, and the excessive communication overheads possibly
overwhelm the IoV system. Therefore, in this paper, we design a
communication-efficient cooperative MARL algorithm, named RSM-MAPPO, to reduce
the communication overheads in a fully distributed architecture. In particular,
RSM-MAPPO enhances the multi-agent Proximal Policy Optimization (PPO) by
incorporating the idea of segment mixture and augmenting multiple model
replicas from received neighboring policy segments. Afterwards, RSM-MAPPO
adopts a theory-guided metric to regulate the selection of contributive
replicas to guarantee policy improvement. Finally, extensive simulations in a
mixed-autonomy traffic control scenario verify the effectiveness of the
RSM-MAPPO algorithm.
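The abstract leaves the mixture rule and regulation metric unspecified; a minimal numpy sketch of the segment-mixture idea, with the regulation decision supplied externally as a boolean mask (a stand-in for the paper's theory-guided metric) and simple averaging as the mixing rule, might be:

```python
import numpy as np

def segment_mixture(own_params, neighbor_params, n_segments, keep_mask):
    """Sketch of segment mixture: average selected parameter segments with a
    neighbor's. keep_mask[i] is True when segment i of the neighbor's policy
    passed the (here, externally supplied) regulation metric."""
    segments = np.array_split(own_params, n_segments)
    nb_segments = np.array_split(neighbor_params, n_segments)
    mixed = [0.5 * (s + n) if keep else s
             for s, n, keep in zip(segments, nb_segments, keep_mask)]
    return np.concatenate(mixed)

# Toy 6-parameter policy split into 3 segments; segments 0 and 2 are accepted.
own = np.zeros(6)
neighbor = np.ones(6)
mixed = segment_mixture(own, neighbor, n_segments=3,
                        keep_mask=[True, False, True])
```

Exchanging and mixing only segments, rather than full model replicas, is what makes the scheme communication-efficient in a fully distributed IoV setting: each vehicle transmits a fraction of its parameters per round, and the regulation metric filters out segments that would not improve the policy.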