The Amazing Race: Repeated Update Q-Learning vs. Q-Learning

Abstract

This paper presents an experiment that compares the performance of two reinforcement learning algorithms: the Repeated Update Q-learning algorithm (RUQL) [1] and the Q-learning algorithm (QL) [5]. The experiment uses a simulated version of the robot crawler developed by [6], shown in figure (1). Several trials and tests were conducted to estimate the difference in the crawler's movement under each algorithm. We also give a detailed description of the elements of the Markov decision process (MDP) [2] that models the task at hand: its states, actions, and rewards. The parameters used and tuned in the experiment are listed, together with the reasons for choosing their values. Finally, the source code of the crawler robot was modified to implement both RUQL and QL, using Eclipse [3] and the Java SE Development Kit 8 (JDK) [4]. In the crawler robot simulation, the results showed that RUQL significantly outperforms traditional QL.

Similar works