Vision-based reinforcement learning using approximate policy iteration

By Marwan Shaker, Shigang Yue and Tom Duckett


A major issue for reinforcement learning (RL) applied to robotics is the time required to learn a new skill. While RL has been used to learn mobile robot control in many simulated domains, applications involving learning on real robots are still relatively rare. In this paper, the Least-Squares Policy Iteration (LSPI) reinforcement learning algorithm and a new model-based algorithm, Least-Squares Policy Iteration with Prioritized Sweeping (LSPI+), are implemented on a mobile robot to acquire new skills quickly and efficiently. LSPI+ combines the benefits of LSPI and prioritized sweeping, which uses all previous experience to focus the computational effort on the most “interesting” or dynamic parts of the state space. The proposed algorithms are tested on a household vacuum cleaner robot for learning a docking task using vision as the only sensor modality. In experiments, these algorithms are compared to other model-based and model-free RL algorithms. The results show that the number of trials required to learn the docking task is significantly reduced using LSPI compared to the other RL algorithms investigated, and that LSPI+ further improves on the performance of LSPI.
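
To make the idea of focusing computation on the most "interesting" states concrete, the following is a minimal sketch of classic tabular prioritized sweeping. It is not the LSPI+ algorithm from the paper, whose details are not given in this abstract. The function name `prioritized_sweeping`, the four-state chain, the action names and the parameter values are all assumptions chosen for illustration, and the model is assumed to be deterministic.

```python
import heapq
from collections import defaultdict


def prioritized_sweeping(transitions, actions, gamma=0.9, theta=1e-6,
                         max_updates=10_000):
    """Tabular prioritized sweeping over stored deterministic experience.

    transitions: iterable of (state, action, reward, next_state) tuples.
    Returns a dict mapping (state, action) to its estimated Q-value.
    Illustrative sketch only; not the paper's LSPI+ method.
    """
    # Learned model: each experienced (s, a) maps to (reward, next state).
    model = {(s, a): (r, s2) for s, a, r, s2 in transitions}

    # Reverse index: which (s, a) pairs lead into each state.
    predecessors = defaultdict(set)
    for (s, a), (_, s2) in model.items():
        predecessors[s2].add((s, a))

    Q = defaultdict(float)

    def backup_error(s, a):
        # Bellman error of (s, a) under the current Q estimates.
        r, s2 = model[(s, a)]
        best_next = max(Q[(s2, b)] for b in actions)
        return r + gamma * best_next - Q[(s, a)]

    # Seed the queue with every pair whose error exceeds the threshold.
    # heapq is a min-heap, so priorities are stored negated.
    queue = []
    for (s, a) in model:
        priority = abs(backup_error(s, a))
        if priority > theta:
            heapq.heappush(queue, (-priority, s, a))

    updates = 0
    while queue and updates < max_updates:
        _, s, a = heapq.heappop(queue)
        error = backup_error(s, a)
        if abs(error) <= theta:
            continue  # stale queue entry, already up to date
        Q[(s, a)] += error  # full backup (deterministic model)
        updates += 1
        # A change at state s may invalidate the pairs that lead into it.
        for (ps, pa) in predecessors[s]:
            priority = abs(backup_error(ps, pa))
            if priority > theta:
                heapq.heappush(queue, (-priority, ps, pa))
    return dict(Q)


# Hypothetical 4-state chain: state 3 stands in for the docking goal.
transitions = [
    (0, "fwd", 0.0, 1), (1, "fwd", 0.0, 2), (2, "fwd", 1.0, 3),
    (0, "back", 0.0, 0), (1, "back", 0.0, 0), (2, "back", 0.0, 1),
]
Q = prioritized_sweeping(transitions, actions=("fwd", "back"))
```

In this toy chain only the transition into the goal has a nonzero error at the start, so the queue first updates that pair and then propagates backwards through its predecessors. The forward-action values come out as 1, 0.9 and 0.81 from the nearest to the farthest state.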

Topics: H670 Robotics and Cybernetics, H671 Robotics
Year: 2009
OAI identifier: oai:eprints.lincoln.ac.uk:2049
