    Fast reinforcement learning for vision-guided mobile robots

    This paper presents a new reinforcement learning algorithm for accelerating the acquisition of new skills by real mobile robots, without requiring simulation. It speeds up Q-learning by applying memory-based sweeping and enforcing the “adjoining property”, a technique that exploits the natural ordering of sensory state spaces in many robotic applications by allowing transitions only between neighbouring states. The algorithm is tested within an image-based visual servoing framework on a docking task, in which the robot has to position its gripper at a desired configuration relative to an object on a table. In experiments, we compare the performance of the new algorithm with that of a hand-designed linear controller and with a scheme using the linear controller as a bias to further accelerate learning. Through analysis of controllability and docking time, we show that the biased learner can improve on the performance of the linear controller while requiring substantially less training time than unbiased learning (less than 1 hour on the real robot).
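
    As a rough, hedged illustration of the mechanism described above, the sketch below combines tabular Q-learning with memory-based (prioritized) sweeping and a simple neighbourhood test standing in for the “adjoining property”. The grid-cell state representation, the adjoining test, and all constants are assumptions for illustration, not the paper's actual docking setup.

```python
import heapq
import itertools
from collections import defaultdict

GAMMA = 0.95          # discount factor (assumed value)
THETA = 1e-3          # minimum priority worth queueing
SWEEPS_PER_STEP = 50  # model backups performed per real-world step

Q = defaultdict(float)        # Q[(state, action)] -> estimated return
model = {}                    # (state, action) -> (next_state, reward)
preds = defaultdict(set)      # next_state -> {(state, action)} leading to it
queue, tie = [], itertools.count()

def adjoining(s, s2):
    # Assumed neighbourhood test: states are integer grid cells, and a
    # transition may move at most one cell per axis (the "adjoining property").
    return all(abs(a - b) <= 1 for a, b in zip(s, s2))

def td_error(s, a, r, s2, actions):
    return r + GAMMA * max(Q[(s2, a2)] for a2 in actions) - Q[(s, a)]

def observe(s, a, r, s2, actions):
    """Record one real transition, then sweep the learned model backwards."""
    if not adjoining(s, s2):
        return                # filter out transitions that skip over states
    model[(s, a)] = (s2, r)
    preds[s2].add((s, a))
    p = abs(td_error(s, a, r, s2, actions))
    if p > THETA:
        heapq.heappush(queue, (-p, next(tie), s, a))
    for _ in range(SWEEPS_PER_STEP):
        if not queue:
            break
        _, _, st, ac = heapq.heappop(queue)
        s2_, r_ = model[(st, ac)]
        Q[(st, ac)] += td_error(st, ac, r_, s2_, actions)   # full backup
        for sp, ap in preds[st]:  # re-queue predecessors whose error grew
            s2p, rp = model[(sp, ap)]
            pp = abs(td_error(sp, ap, rp, s2p, actions))
            if pp > THETA:
                heapq.heappush(queue, (-pp, next(tie), sp, ap))
```

    Each real transition triggers a bounded number of model backups, concentrating computation on the states whose value estimates are still changing rather than replaying the whole table.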

    Vision-based reinforcement learning using approximate policy iteration

    A major issue for reinforcement learning (RL) applied to robotics is the time required to learn a new skill. While RL has been used to learn mobile robot control in many simulated domains, applications involving learning on real robots are still relatively rare. In this paper, the Least-Squares Policy Iteration (LSPI) reinforcement learning algorithm and a new model-based algorithm, Least-Squares Policy Iteration with Prioritized Sweeping (LSPI+), are implemented on a mobile robot to acquire new skills quickly and efficiently. LSPI+ combines the benefits of LSPI and prioritized sweeping, using all previous experience to focus computational effort on the most “interesting” or dynamic parts of the state space. The proposed algorithms are tested on a household vacuum-cleaner robot learning a docking task with vision as the only sensor modality. In experiments, these algorithms are compared to other model-based and model-free RL algorithms. The results show that the number of trials required to learn the docking task is significantly reduced using LSPI compared to the other RL algorithms investigated, and that LSPI+ further improves on the performance of LSPI.
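
    For readers unfamiliar with LSPI, here is a minimal sketch of its core loop in the style of Lagoudakis and Parr's formulation: LSTD-Q evaluates a fixed policy from a batch of stored transitions, and an outer loop alternates evaluation with greedy policy improvement. The feature map phi and the sample set are placeholders, and the prioritized-sweeping extension that distinguishes LSPI+ is deliberately omitted.

```python
import numpy as np

GAMMA = 0.95  # discount factor (assumed value)

def lstdq(samples, phi, policy, k):
    """Estimate Q-function weights for a fixed policy from stored samples.

    samples: list of (s, a, r, s2) transitions gathered on the robot
    phi:     feature map phi(s, a) -> length-k numpy vector
    policy:  greedy policy induced by the previous weight vector
    """
    A = np.eye(k) * 1e-6               # small ridge term keeps A invertible
    b = np.zeros(k)
    for s, a, r, s2 in samples:
        f = phi(s, a)
        f2 = phi(s2, policy(s2))
        A += np.outer(f, f - GAMMA * f2)
        b += f * r
    return np.linalg.solve(A, b)

def lspi(samples, phi, actions, k, iters=20, tol=1e-4):
    """Alternate policy evaluation (LSTD-Q) and greedy improvement."""
    w = np.zeros(k)
    for _ in range(iters):
        policy = lambda s, w=w: max(actions, key=lambda a: float(phi(s, a) @ w))
        w_new = lstdq(samples, phi, policy, k)
        if np.linalg.norm(w_new - w) < tol:
            break
        w = w_new
    return w
```

    Because LSPI reuses the same batch of samples at every iteration, each real-robot trial contributes to every subsequent policy update, which is what makes it sample-efficient enough for learning directly on hardware.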

    Unsupervised navigation using an economy principle

    We describe robot navigation learning based on self-selection of privileged vectors through the environment in accordance with a built-in economy metric. This provides the opportunity both for progressive behavioural adaptation and for adaptive derivations, leading, through situated activity, to “representations” of the environment that are both economically attained and inherently meaningful to the agent.
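
    The abstract does not specify the economy metric concretely; purely as a toy illustration, one might imagine candidate movement vectors being ranked by a cost that trades path length against turning effort, with the cheapest vector self-selected as privileged. The cost terms and weight below are assumptions, not taken from the paper.

```python
def economy_cost(vector, turn_weight=0.5):
    """Assumed economy metric: shorter, straighter routes are cheaper."""
    distance, turn_angle = vector          # (metres, radians) - illustrative
    return distance + turn_weight * abs(turn_angle)

def select_privileged_vector(candidates):
    """Self-select the candidate route vector minimising the economy metric."""
    return min(candidates, key=economy_cost)

# e.g. select_privileged_vector([(2.0, 0.3), (1.5, 1.2), (1.8, 0.1)])
```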

    Learning visual docking for non-holonomic autonomous vehicles

    This paper presents a new method of learning visual docking skills for non-holonomic vehicles by direct interaction with the environment. The method is based on a reinforcement learning algorithm that speeds up Q-learning by applying memory-based sweeping and enforcing the “adjoining property”, a filtering mechanism that allows transitions only between states within a fixed distance of each other. The method overcomes some limitations of reinforcement learning techniques when they are employed in applications with continuous non-linear systems, such as car-like vehicles. In particular, a good approximation to the optimal behaviour is obtained with a small look-up table. The algorithm is tested within an image-based visual servoing framework on a docking task; training time was less than 1 hour on the real vehicle. In experiments, we show the satisfactory performance of the algorithm.
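
    A hedged sketch of the small look-up table the abstract alludes to: continuous visual features are binned into a coarse grid, and the “adjoining property” is enforced as a fixed maximum bin distance per transition. Bin counts, feature ranges, and the distance threshold are illustrative assumptions, not values from the paper.

```python
import numpy as np

BINS = (10, 10, 8)                    # assumed bins: image x, image y, bearing
LOW = np.array([0.0, 0.0, -np.pi])    # assumed lower bounds of the features
HIGH = np.array([640.0, 480.0, np.pi])
MAX_STEP = 1                          # max bin distance allowed per transition

def discretise(features):
    """Map continuous visual features to one cell of the look-up table."""
    frac = (np.asarray(features, dtype=float) - LOW) / (HIGH - LOW)
    idx = np.floor(frac * np.array(BINS)).astype(int)
    return tuple(np.clip(idx, 0, np.array(BINS) - 1))

def adjoining(cell, cell2):
    """Accept a transition only if it moves at most MAX_STEP bins per axis."""
    return all(abs(i - j) <= MAX_STEP for i, j in zip(cell, cell2))
```

    With 10 x 10 x 8 = 800 cells and a handful of actions, the resulting table stays small enough to learn directly on the vehicle, which is consistent with the sub-hour training time reported above.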