BCU Open Access

    Deep Reward Shaping from Demonstrations

    Deep reinforcement learning is rapidly gaining attention due to recent successes in a variety of problems. The combination of deep learning and reinforcement learning allows for a generic learning process that does not rely on task-specific knowledge. However, learning from scratch becomes more difficult when tasks involve long trajectories with delayed rewards. The chances of finding the rewards through trial and error are much smaller than in tasks where the agent continuously interacts with the environment. This is the case in many real-life applications, which limits current methods. In this paper, we propose a novel method for combining learning from demonstrations and experience to expedite and improve deep reinforcement learning. Demonstrations from a teacher are used to shape a potential reward function by training a deep supervised convolutional neural network. The shaped function is added to the reward function used in deep Q-learning (DQN) to perform off-policy training through trial and error. The proposed method is demonstrated on navigation tasks that are learned from raw pixels without utilizing any knowledge of the problem. Navigation tasks represent a typical AI problem that is relevant to many real applications and where only delayed rewards (usually terminal) are available to the agent. The results show that using the proposed shaped rewards significantly improves the performance of the agent over standard DQN. This improvement is more pronounced the sparser the rewards are.
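
The abstract does not state how the shaped term is combined with the environment reward. As a minimal sketch, assuming the standard potential-based shaping form r + gamma * phi(s') - phi(s), the idea can be illustrated as follows. The names `shaped_reward` and `manhattan_potential` are hypothetical, and the hand-written grid potential is only a stand-in for the convolutional network that the paper trains on teacher demonstrations.

```python
def manhattan_potential(state, goal):
    """Hypothetical stand-in potential: negative grid distance to the goal.

    In the paper the potential comes from a network trained on
    demonstrations; this closed form is an assumption for illustration.
    """
    return -float(abs(state[0] - goal[0]) + abs(state[1] - goal[1]))


def shaped_reward(reward, phi_s, phi_next, gamma=0.99, done=False):
    """Return the environment reward plus a potential-based shaping term.

    Assumed form: reward + gamma * phi(s') - phi(s), with the potential
    of a terminal next state treated as zero.
    """
    next_potential = 0.0 if done else phi_next
    return reward + gamma * next_potential - phi_s


# One step toward the goal on a grid with a sparse (zero) environment reward.
goal = (2, 1)
state, next_state = (0, 0), (1, 0)
step_reward = shaped_reward(
    0.0,
    manhattan_potential(state, goal),
    manhattan_potential(next_state, goal),
    gamma=0.5,
)
```

In a DQN-style training loop, a value like `step_reward` would replace the raw environment reward when a transition is stored or when the temporal-difference target is computed, so the agent receives a learning signal before it ever reaches the terminal reward.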

    The economic benefits of a more physically active population: An international analysis

    No full text
    Exercise affects workplace performance, longevity, and the economy. If people walked just an extra 15 minutes each day, the world economy could grow by about $100 billion a year. Gains are attributed to improved productivity and reduced mortality rates, sick leave, and health care costs.
    BCU Open Access is based in GB