    Learning to Run challenge solutions: Adapting reinforcement learning methods for neuromusculoskeletal environments

    In the NIPS 2017 Learning to Run challenge, participants were tasked with building a controller for a musculoskeletal model to make it run as fast as possible through an obstacle course. Top participants were invited to describe their algorithms. In this work, we present eight solutions that used deep reinforcement learning approaches, based on algorithms such as Deep Deterministic Policy Gradient, Proximal Policy Optimization, and Trust Region Policy Optimization. Many solutions use similar relaxations and heuristics, such as reward shaping, frame skipping, discretization of the action space, symmetry, and policy blending. However, each of the eight teams implemented different modifications of the known algorithms. Comment: 27 pages, 17 figures.
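
Frame skipping, one of the heuristics named in the abstract above, can be illustrated with a minimal sketch: the same action is repeated for several simulator steps and the rewards are summed, so the policy is queried less often. The `ToyEnv` class, its reward, and the `step_with_frame_skip` helper are hypothetical and are not taken from any of the eight solutions or from the challenge environment.

```python
from typing import Tuple


class ToyEnv:
    """Hypothetical 1-D environment: state is a position, action is a displacement."""

    def __init__(self, goal: float = 10.0) -> None:
        self.goal = goal
        self.pos = 0.0

    def step(self, action: float) -> Tuple[float, float, bool]:
        self.pos += action
        reward = action                 # assumed reward: forward progress this step
        done = self.pos >= self.goal    # episode ends once the goal is reached
        return self.pos, reward, done


def step_with_frame_skip(env: ToyEnv, action: float, skip: int) -> Tuple[float, float, bool]:
    """Repeat `action` for up to `skip` steps, summing rewards; stop early when done."""
    total_reward = 0.0
    obs, done = env.pos, False
    for _ in range(skip):
        obs, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return obs, total_reward, done
```

With `skip=4`, the agent chooses one action per four simulator steps, which shortens the effective horizon at the cost of coarser control.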

    Low-cost, multi-agent systems for planetary surface exploration

    The use of off-the-shelf consumer electronics combined with top-down design methodologies has made small and inexpensive satellites, such as CubeSats, emerge as viable, low-cost and attractive space-based platforms that enable a range of new and exciting mission scenarios. In addition, to overcome some of the resource limitations encountered with these platforms, distributed architectures have emerged to enable complex tasks through the use of multiple low-complexity units. The low-cost characteristics of such systems, coupled with the distributed architecture, allow the size of the system to increase beyond what would have been feasible with a monolithic system, hence widening the operational capabilities without significantly increasing the control complexity of the system. These ideas are not new for Earth-orbiting devices, but, excluding some distributed remote sensing architectures, they are yet to be applied to planetary exploration. Experience gained through large rovers demonstrates the value of in-situ exploration, which is, however, limited by the associated high cost and risk. The loss of a rover can happen, and has happened, because of a number of possible failures: besides the hazards directly linked to the launch and journey to the target body, hard landing and malfunctioning of parts are all threats to the success of the mission. To overcome these issues, this paper introduces the concept of using off-the-shelf consumer electronics to deploy a low-cost multi-rover system for future planetary surface exploration. It is shown that such a system would significantly reduce the programmatic risk of the mission (for example, catastrophic failure of a single rover), while exploiting the inherent advantages of cooperative behaviour. These advantages are analysed with particular emphasis on the guidance, navigation and control of such architectures using the method of artificial potential fields. Laboratory tests on multi-agent robotic systems support the analysis. Principal features of the system are identified and the underlying advantages over a monolithic single-agent system are highlighted.
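
The artificial potential field method mentioned in the abstract above can be sketched generically as follows: each agent is attracted to a goal and repelled by neighbours that come within an influence radius. This is a textbook-style illustration, not the paper's controller; the function name `apf_velocity`, the gains `k_att` and `k_rep`, and the `influence` radius are all assumed for the example.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float]


def apf_velocity(
    pos: Point,
    goal: Point,
    others: Iterable[Point],
    k_att: float = 1.0,     # assumed attractive gain
    k_rep: float = 1.0,     # assumed repulsive gain
    influence: float = 2.0  # assumed radius beyond which neighbours are ignored
) -> Point:
    """Return a commanded velocity as the negative gradient of the total potential."""
    # Attractive term: gradient of 0.5 * k_att * |goal - pos|^2 pulls toward the goal.
    vx = k_att * (goal[0] - pos[0])
    vy = k_att * (goal[1] - pos[1])
    for ox, oy in others:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < influence:
            # Repulsive term from 0.5 * k_rep * (1/d - 1/influence)^2,
            # directed away from the neighbour.
            mag = k_rep * (1.0 / d - 1.0 / influence) / d**2
            vx += mag * dx / d
            vy += mag * dy / d
    return vx, vy
```

For instance, `apf_velocity((0.0, 0.0), (1.0, 0.0), [(-1.0, 0.0)])` adds a push away from the neighbour at `(-1, 0)` to the pull toward the goal, so the agent moves along the positive x-axis faster than it would alone.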

    May We Have Your Attention: Analysis of a Selective Attention Task

    In this paper we present a deeper analysis than has previously been carried out of a selective attention problem, and of the evolution of continuous-time recurrent neural networks to solve it. We show that the task has a rich structure, and that agents must solve a variety of subproblems to perform well. We consider the relationship between the complexity of an agent and the ease with which it can evolve behavior that generalizes well across subproblems, and demonstrate a shaping protocol that improves generalization.