
    Deep Reinforcement Learning with Double Q-learning

    The popular Q-learning algorithm is known to overestimate action values under certain conditions. It was not previously known whether, in practice, such overestimations are common, whether they harm performance, and whether they can generally be prevented. In this paper, we answer all these questions affirmatively. In particular, we first show that the recent DQN algorithm, which combines Q-learning with a deep neural network, suffers from substantial overestimations in some games in the Atari 2600 domain. We then show that the idea behind the Double Q-learning algorithm, which was introduced in a tabular setting, can be generalized to work with large-scale function approximation. We propose a specific adaptation to the DQN algorithm and show that the resulting algorithm not only reduces the observed overestimations, as hypothesized, but that this also leads to much better performance on several games. Comment: AAAI 201
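    The difference between the two bootstrap targets can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the function names, the toy batch, and the discount factor of 0.9 are assumptions made for the example. In the DQN target, the target network both selects and evaluates the next action; in the Double DQN target, the online network selects the action and the target network evaluates it.

```python
import numpy as np


def dqn_target(reward, gamma, q_target_next, done):
    """DQN target: the target network selects AND evaluates the next action."""
    return reward + gamma * (1.0 - done) * q_target_next.max(axis=1)


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN target: online network selects, target network evaluates."""
    best_action = q_online_next.argmax(axis=1)
    evaluated = q_target_next[np.arange(len(best_action)), best_action]
    return reward + gamma * (1.0 - done) * evaluated


# Hypothetical toy batch: 2 transitions, 3 actions each.
reward = np.array([1.0, 0.0])
done = np.array([0.0, 1.0])  # the second transition is terminal
q_online_next = np.array([[0.2, 0.9, 0.1],
                          [0.5, 0.4, 0.3]])
q_target_next = np.array([[1.5, 0.6, 0.3],
                          [0.7, 0.8, 0.2]])

y_dqn = dqn_target(reward, 0.9, q_target_next, done)
y_ddqn = double_dqn_target(reward, 0.9, q_online_next, q_target_next, done)
print(y_dqn, y_ddqn)
```

    Because the evaluated value is one entry of the target network's row rather than its maximum, the Double DQN target can never exceed the DQN target for the same batch, which is the mechanism by which the overestimation is reduced.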

    Performing Deep Recurrent Double Q-Learning for Atari Games

    Many current machine learning applications rely on defining new models to extract more information from data. In this context, Deep Reinforcement Learning, whose most common application is video games such as Atari and Mario, has influenced how computers can learn by themselves using only the information, called rewards, obtained from their actions. Many algorithms have been modeled and implemented based on Deep Recurrent Q-Learning, proposed by DeepMind and used in AlphaZero and Go. In this document, we propose Deep Recurrent Double Q-Learning, an implementation of Deep Reinforcement Learning that uses Double Q-Learning algorithms and recurrent networks such as LSTM and DRQN.

    Near-optimal energy management for plug-in hybrid fuel cell and battery propulsion using deep reinforcement learning

    Plug-in hybrid fuel cell and battery propulsion systems appear promising for decarbonising transportation applications such as road vehicles and coastal ships. However, it is challenging to develop optimal or near-optimal energy management for these systems without exact knowledge of future load profiles. Although efforts have been made to develop strategies in a stochastic environment with discrete state space using Q-learning and Double Q-learning, the effectiveness of such tabular reinforcement learning agents is limited by the state space resolution. This article aims to develop an improved energy management system using deep reinforcement learning to achieve enhanced cost-saving by extending discrete state parameters to be continuous. The improved energy management system is based upon the Double Deep Q-Network. Stochastic load profiles collected in the real world are applied to train the Double Deep Q-Network for a coastal ferry. The results suggest that the energy management strategy acquired by the Double Deep Q-Network achieves a further 5.5% cost reduction with a 93.8% decrease in training time, compared to that produced by the Double Q-learning agent in discrete state space without function approximation. In addition, this article proposes an adaptive deep reinforcement learning energy management scheme for practical hybrid-electric propulsion systems operating in changing environments.

    Joint Transaction Transmission and Channel Selection in Cognitive Radio Based Blockchain Networks: A Deep Reinforcement Learning Approach

    To ensure that data aggregation, data storage, and data processing are all performed in a decentralized but trusted manner, we propose to use the blockchain with the mining pool to support IoT services based on cognitive radio networks. As such, the secondary user can send its sensing data, i.e., transactions, to the mining pools. After being verified by miners, the transactions are added to the blocks. However, under the dynamics of the primary channel and the uncertainty of the mempool state of the mining pool, it is challenging for the secondary user to determine an optimal transaction transmission policy. In this paper, we propose to use a deep reinforcement learning algorithm to derive an optimal transaction transmission policy for the secondary user. Specifically, we adopt a Double Deep Q-Network (DDQN) that allows the secondary user to learn the optimal policy. The simulation results show that the proposed deep reinforcement learning algorithm outperforms the conventional Q-learning scheme in terms of reward and learning speed.