Deep Reinforcement Learning with Double Q-learning
The popular Q-learning algorithm is known to overestimate action values under
certain conditions. It was not previously known whether, in practice, such
overestimations are common, whether they harm performance, and whether they can
generally be prevented. In this paper, we answer all these questions
affirmatively. In particular, we first show that the recent DQN algorithm,
which combines Q-learning with a deep neural network, suffers from substantial
overestimations in some games in the Atari 2600 domain. We then show that the
idea behind the Double Q-learning algorithm, which was introduced in a tabular
setting, can be generalized to work with large-scale function approximation. We
propose a specific adaptation to the DQN algorithm and show that the resulting
algorithm not only reduces the observed overestimations, as hypothesized, but
that this also leads to much better performance on several games. Comment: AAAI 201
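As a rough illustration of the adaptation described in this abstract, the sketch below contrasts the standard DQN bootstrap target with the Double DQN target, in which the online network selects the next action and the target network evaluates it. The function names and the numeric action values are hypothetical and chosen only for illustration; they do not come from the paper.

```python
def dqn_target(reward, gamma, q_target_next, done):
    """Standard DQN target: the target network both selects and evaluates."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN target: online network selects, target network evaluates."""
    if done:
        return reward
    # Greedy action according to the online network's estimates.
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # Value of that action according to the target network.
    return reward + gamma * q_target_next[best_action]


# Hypothetical next-state action values produced by the two networks.
q_online_next = [1.0, 2.5, 2.0]
q_target_next = [1.2, 1.8, 3.0]

y_dqn = dqn_target(1.0, 0.9, q_target_next, done=False)
y_ddqn = double_dqn_target(1.0, 0.9, q_online_next, q_target_next, done=False)
print(y_dqn, y_ddqn)  # roughly 3.7 and 2.62
```

With these made-up numbers the Double DQN target is lower because the action with the largest target-network value is not the one the online network prefers. Decoupling selection from evaluation in this way is what is meant to curb the overestimation.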
Performing Deep Recurrent Double Q-Learning for Atari Games
Many current applications in machine learning rely on defining new models that extract more information from data. Deep reinforcement learning, most commonly applied to video games such as Atari and Mario, has changed how computers can learn by themselves using only the information, called rewards, obtained from their actions. Many algorithms have been modeled and implemented on the basis of Deep Recurrent Q-Learning, proposed by DeepMind and used in AlphaZero and Go. In this document, we propose Deep Recurrent Double Q-Learning, an implementation of deep reinforcement learning that uses Double Q-Learning algorithms and recurrent networks such as LSTM and DRQN.
Near-optimal energy management for plug-in hybrid fuel cell and battery propulsion using deep reinforcement learning
Plug-in hybrid fuel cell and battery propulsion systems appear promising for decarbonising transportation applications such as road vehicles and coastal ships. However, it is challenging to develop optimal or near-optimal energy management for these systems without exact knowledge of future load profiles. Although efforts have been made to develop strategies in a stochastic environment with discrete state space using Q-learning and Double Q-learning, the effectiveness of such tabular reinforcement learning agents is limited by the state space resolution. This article aims to develop an improved energy management system using deep reinforcement learning to achieve enhanced cost-saving by extending discrete state parameters to be continuous. The improved energy management system is based upon the Double Deep Q-Network. Real-world collected stochastic load profiles are applied to train the Double Deep Q-Network for a coastal ferry. The results suggest that the energy management strategy acquired by the Double Deep Q-Network has achieved a further 5.5% cost reduction with a 93.8% decrease in training time, compared to that produced by the Double Q-learning agent in discrete state space without function approximations. In addition, this article also proposes an adaptive deep reinforcement learning energy management scheme for practical hybrid-electric propulsion systems operating in changing environments.
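To make the resolution limit of the tabular baseline concrete, here is a minimal sketch of a discretised state combined with one tabular Double Q-learning update. The state variable (a battery state of charge in [0, 1]), the bin count, the two-action table, and all numeric values are assumptions for illustration only; none of them are taken from the article.

```python
def discretise(value, low, high, n_bins):
    """Map a continuous value in [low, high] to a bin index in [0, n_bins - 1]."""
    clipped = min(max(value, low), high)
    index = int((clipped - low) / (high - low) * n_bins)
    return min(index, n_bins - 1)


def double_q_update(q_a, q_b, s, a, r, s_next, alpha, gamma):
    """Update q_a in place, using q_b to evaluate q_a's greedy next action.

    In the full algorithm the roles of the two tables are swapped at random
    on each step; here the caller decides which table is passed as q_a.
    """
    best = max(range(len(q_a[s_next])), key=q_a[s_next].__getitem__)
    td_target = r + gamma * q_b[s_next][best]
    q_a[s][a] += alpha * (td_target - q_a[s][a])


N_BINS, N_ACTIONS = 5, 2  # hypothetical table size

# Two noticeably different states of charge fall into the same bin,
# so a tabular agent cannot tell them apart.
bin_low = discretise(0.41, 0.0, 1.0, N_BINS)
bin_high = discretise(0.59, 0.0, 1.0, N_BINS)

q_a = [[0.0] * N_ACTIONS for _ in range(N_BINS)]
q_b = [[0.0] * N_ACTIONS for _ in range(N_BINS)]
q_a[2] = [0.5, 0.0]  # q_a prefers action 0 in bin 2
q_b[2] = [0.0, 1.0]  # q_b values that action at 0.0

s = discretise(0.30, 0.0, 1.0, N_BINS)
double_q_update(q_a, q_b, s, a=0, r=1.0, s_next=2, alpha=0.5, gamma=0.9)
```

A function approximator such as the Double Deep Q-Network takes the continuous value directly, which avoids the binning step shown here.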
Joint Transaction Transmission and Channel Selection in Cognitive Radio Based Blockchain Networks: A Deep Reinforcement Learning Approach
To ensure that the data aggregation, data storage, and data processing are
all performed in a decentralized but trusted manner, we propose to use the
blockchain with the mining pool to support IoT services based on cognitive
radio networks. As such, the secondary user can send its sensing data, i.e.,
transactions, to the mining pools. After being verified by miners, the
transactions are added to the blocks. However, under the dynamics of the
primary channel and the uncertainty of the mempool state of the mining pool, it
is challenging for the secondary user to determine an optimal transaction
transmission policy. In this paper, we propose to use the deep reinforcement
learning algorithm to derive an optimal transaction transmission policy for the
secondary user. Specifically, we adopt a Double Deep-Q Network (DDQN) that
allows the secondary user to learn the optimal policy. The simulation results
clearly show that the proposed deep reinforcement learning algorithm
outperforms the conventional Q-learning scheme in terms of reward and learning
speed.