Near-optimal energy management for plug-in hybrid fuel cell and battery propulsion using deep reinforcement learning
Plug-in hybrid fuel cell and battery propulsion systems appear promising for decarbonising transportation applications such as road vehicles and coastal ships. However, it is challenging to develop optimal or near-optimal energy management for these systems without exact knowledge of future load profiles. Although efforts have been made to develop strategies in a stochastic environment with discrete state space using Q-learning and Double Q-learning, the effectiveness of such tabular reinforcement learning agents is limited by the state space resolution. This article aims to develop an improved energy management system using deep reinforcement learning to achieve enhanced cost-saving by extending discrete state parameters to be continuous. The improved energy management system is based upon the Double Deep Q-Network. Stochastic load profiles collected in the real world are applied to train the Double Deep Q-Network for a coastal ferry. The results suggest that the energy management strategy acquired by the Double Deep Q-Network achieves a further 5.5% cost reduction with a 93.8% decrease in training time, compared to that produced by the Double Q-learning agent in discrete state space without function approximations. In addition, this article proposes an adaptive deep reinforcement learning energy management scheme for practical hybrid-electric propulsion systems operating in changing environments.
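For readers unfamiliar with the Double Deep Q-Network mentioned in this abstract, the following is a minimal sketch of the standard Double DQN bootstrap target for a single transition. It is not taken from the article: the function name `double_dqn_target`, its parameters, and the discount value are illustrative assumptions, and the Q-value lists stand in for the outputs of an online and a target network.

```python
from typing import Sequence


def double_dqn_target(reward: float,
                      next_q_online: Sequence[float],
                      next_q_target: Sequence[float],
                      gamma: float = 0.99,
                      done: bool = False) -> float:
    """Generic Double DQN target for one transition (illustrative only).

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # The online network chooses the action ...
    best_action = max(range(len(next_q_online)),
                      key=next_q_online.__getitem__)
    # ... and the target network evaluates it, which reduces the
    # overestimation bias of a plain max over one set of estimates.
    return reward + gamma * next_q_target[best_action]
```

Separating action selection from action evaluation is the same idea as in tabular Double Q-learning; the difference is that the two value estimates come from function approximators rather than lookup tables, so the state can stay continuous.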
An Agent-based Modelling Framework for Driving Policy Learning in Connected and Autonomous Vehicles
Due to the complexity of the natural world, a programmer cannot foresee all
possible situations that a connected and autonomous vehicle (CAV) will face during
its operation; hence, CAVs will need to learn to make decisions
autonomously. Due to the sensing of its surroundings and information exchanged
with other vehicles and road infrastructure, a CAV will have access to large
amounts of useful data. While different control algorithms have been proposed
for CAVs, the benefits brought about by the connectedness of autonomous vehicles to
other vehicles and to the infrastructure, and its implications for policy
learning, have not been investigated in the literature. This paper investigates a
data-driven driving policy learning framework through an agent-based modelling
approach. The contributions of the paper are twofold. A dynamic programming
framework is proposed for in-vehicle policy learning with and without
connectivity to neighboring vehicles. The simulation results indicate that
while a CAV can learn to make autonomous decisions, vehicle-to-vehicle (V2V)
communication of information improves this capability. Furthermore, to overcome
the limitations of sensing in a CAV, the paper proposes a novel concept for
infrastructure-led policy learning and communication with autonomous vehicles.
In infrastructure-led policy learning, road-side infrastructure senses and
captures successful vehicle maneuvers and learns an optimal policy from those
temporal sequences, and when a vehicle approaches the road-side unit, the
policy is communicated to the CAV. A deep imitation learning methodology is
proposed to develop such an infrastructure-led policy learning framework.
Traffic Light Control Using Deep Policy-Gradient and Value-Function Based Reinforcement Learning
Recent advances in combining deep neural network architectures with
reinforcement learning techniques have shown promising results in
solving complex control problems with high dimensional state and action spaces.
Inspired by these successes, in this paper, we build two kinds of reinforcement
learning algorithms: deep policy-gradient and value-function based agents which
can predict the best possible traffic signal for a traffic intersection. At
each time step, these adaptive traffic light control agents receive a snapshot
of the current state of a graphical traffic simulator and produce control
signals. The policy-gradient based agent maps its observation directly to the
control signal, whereas the value-function based agent first estimates values
for all legal control signals and then selects the control
action with the highest value. Our methods show promising results in a traffic
network simulated in the SUMO traffic simulator, without suffering from
instability issues during the training process.
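The difference between the two agent types described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the signal labels, and the choice to sample the policy-based action from a softmax over logits are all assumptions made for illustration.

```python
import math
import random
from typing import Dict, List, Sequence


def greedy_action(q_values: Dict[str, float]) -> str:
    """Value-based choice: pick the legal signal with the highest value."""
    return max(q_values, key=q_values.get)


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over raw policy outputs."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits: Sequence[float],
                  actions: Sequence[str],
                  rng: random.Random) -> str:
    """Policy-based choice: sample a signal from the policy distribution
    (assumed here to be a softmax over logits)."""
    probs = softmax(logits)
    return rng.choices(list(actions), weights=probs, k=1)[0]


# Hypothetical signal phases for a single intersection.
signals = ["NS_green", "EW_green"]
print(greedy_action({"NS_green": 1.2, "EW_green": 3.4}))
print(sample_action([0.1, 0.3], signals, random.Random(0)))
```

The value-based agent needs an estimate for every legal signal before it can act, while the policy-based agent produces a distribution over signals in one step and never forms explicit value estimates.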
Reinforcement Learning for Hybrid and Plug-In Hybrid Electric Vehicle Energy Management: Recent Advances and Prospects
- …