
    A survey of motion planning techniques for humanoid robots


    Fast interpolation and time-optimization with contact


    Learning with opponent-learning awareness

    Multi-agent settings are quickly gathering importance in machine learning. This includes a plethora of recent work on deep multi-agent reinforcement learning, but also extends to hierarchical reinforcement learning, generative adversarial networks and decentralised optimization. In all these settings the presence of multiple learning agents renders the training problem non-stationary and often leads to unstable training or undesired final results. We present Learning with Opponent-Learning Awareness (LOLA), a method in which each agent shapes the anticipated learning of the other agents in the environment. The LOLA learning rule includes an additional term that accounts for the impact of one agent's policy on the anticipated parameter update of the other agents. Preliminary results show that the encounter of two LOLA agents leads to the emergence of tit-for-tat and therefore cooperation in the iterated prisoners' dilemma (IPD), while independent learning does not. In this domain, LOLA also receives higher payouts compared to a naive learner, and is robust against exploitation by higher-order gradient-based methods. Applied to infinitely repeated matching pennies, LOLA agents converge to the Nash equilibrium. In a round-robin tournament we show that LOLA agents can successfully shape the learning of a range of multi-agent learning algorithms from the literature, resulting in the highest average returns on the IPD. We also show that the LOLA update rule can be efficiently calculated using an extension of the likelihood ratio policy gradient estimator, making the method suitable for model-free reinforcement learning. This method thus scales to large parameter and input spaces and nonlinear function approximators. We also apply LOLA to a grid-world task with an embedded social dilemma using deep recurrent policies and opponent modelling. Again, by explicitly considering the learning of the other agent, LOLA agents learn to cooperate out of self-interest.
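    The core of the LOLA rule can be illustrated on a toy two-player differentiable game rather than the paper's IPD experiments. In this minimal sketch (the quadratic payoffs V1, V2 and the step sizes are assumed for illustration, not taken from the paper), each agent's update adds a correction term that differentiates its own value through the opponent's anticipated naive gradient step:

    ```python
    # Toy LOLA sketch: V1 = -t1^2 + t1*t2, V2 = -t2^2 + t1*t2 (assumed payoffs).
    # Each agent i ascends Vi(t1, t2 + eta * dVj/dtj), i.e. its value after the
    # opponent's anticipated one-step naive gradient update.

    def lola_step(t1, t2, alpha=0.05, eta=0.1):
        g1_V1 = -2.0 * t1 + t2   # dV1/dt1 (naive gradient for agent 1)
        g2_V2 = -2.0 * t2 + t1   # dV2/dt2 (naive gradient for agent 2)
        g2_V1 = t1               # dV1/dt2
        g1_V2 = t2               # dV2/dt1
        # Cross second derivatives d2V2/(dt1 dt2) = d2V1/(dt1 dt2) = 1 here,
        # so the LOLA correction reduces to eta * (cross term) * (opponent grad).
        lola1 = g1_V1 + eta * 1.0 * g2_V1
        lola2 = g2_V2 + eta * 1.0 * g1_V2
        return t1 + alpha * lola1, t2 + alpha * lola2

    t1, t2 = 1.0, -1.0
    for _ in range(2000):
        t1, t2 = lola_step(t1, t2)
    # In this toy game both parameters converge to the stable fixed point (0, 0).
    ```

    In richer settings the exact cross-derivative is unavailable, which is where the paper's likelihood-ratio policy gradient extension comes in.
    
    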

    Parametric Trajectory Libraries for Online Motion Planning with Application to Soft Robots

    In this paper we propose a method for online motion planning of constrained nonlinear systems. The method consists of three steps: the offline generation of a library of parametric trajectories via direct trajectory optimization, the online search in the library for the best candidate solution to the optimal control problem we aim to solve, and the online refinement of this trajectory. The last phase takes advantage of a sensitivity-like analysis and guarantees compliance with the first-order approximation of the constraints even in the case of active-set changes. The efficiency of the trajectory generation process is discussed and a strategy to minimize online computations is proposed, together with an effective procedure for searching for the candidate trajectory. As a case study, we examine optimal control of a planar soft manipulator performing a pick-and-place task: through simulations and experiments, we show how crucial online computation times are to achieve considerable energy savings in the presence of variability of the task to perform.
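    The offline-library plus online-search structure described above can be sketched as follows. This is a hedged illustration with an assumed interface (the class name, the string placeholders for trajectories, and nearest-neighbour search in task-parameter space are all hypothetical simplifications of the paper's method):

    ```python
    import numpy as np

    class TrajectoryLibrary:
        """Offline: store trajectories optimized for sampled task parameters.
        Online: return the stored trajectory whose parameters best match the task."""

        def __init__(self):
            self.params = []        # task-parameter vectors p_i
            self.trajectories = []  # corresponding optimized trajectories

        def add(self, p, traj):
            self.params.append(np.asarray(p, dtype=float))
            self.trajectories.append(traj)

        def best_candidate(self, p_query):
            # Nearest neighbour in task-parameter space as the candidate
            # to hand to the online refinement step.
            dists = [np.linalg.norm(p - p_query) for p in self.params]
            i = int(np.argmin(dists))
            return i, self.trajectories[i]

    lib = TrajectoryLibrary()
    lib.add([0.0, 0.0], "traj_A")
    lib.add([1.0, 0.5], "traj_B")
    idx, traj = lib.best_candidate(np.array([0.9, 0.4]))
    ```

    The returned candidate would then be refined online via the sensitivity-like correction, which is the step that keeps the first-order constraint approximation satisfied.
    
    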

    Contact Planning for the ANYmal Quadruped Robot using an Acyclic Reachability-Based Planner

    Despite the great progress in quadrupedal robotics during the last decade, selecting good contacts (footholds) in highly uneven and cluttered environments still remains an open challenge. This paper builds upon a state-of-the-art approach, already successfully used for humanoid robots, and applies it to our robotic platform, the quadruped robot ANYmal. The proposed algorithm decouples the problem into two subproblems: first a guide trajectory for the robot is generated, then contacts are created along this trajectory. Both subproblems rely on approximations and heuristics that need to be tuned. The main contribution of this work is to explain how this algorithm has been retuned to work with ANYmal and to show the relevance of the approach with a variety of tests in realistic dynamic simulations.
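    The two-stage decoupling (guide trajectory first, contacts second) can be caricatured in a few lines. This is a deliberately simplified sketch with assumed names and a hypothetical terrain predicate; the actual planner uses reachability-based geometric tests, not this toy check:

    ```python
    def plan_contacts(guide_path, leg_offsets, terrain_ok):
        """Stage 2 of the decoupled planner: at each root pose along the
        stage-1 guide path, propose one nominal foothold per leg and keep
        only candidates the terrain predicate accepts."""
        contacts = []
        for root in guide_path:            # root poses from the guide trajectory
            for off in leg_offsets:        # nominal foothold offset per leg
                foot = (root[0] + off[0], root[1] + off[1])
                if terrain_ok(foot):       # stand-in for the reachability test
                    contacts.append(foot)
        return contacts

    # Hypothetical flat corridor: any foothold with |y| < 1.0 is acceptable.
    flat = lambda p: abs(p[1]) < 1.0
    path = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]
    offsets = [(0.2, 0.15), (0.2, -0.15)]
    contacts = plan_contacts(path, offsets, flat)
    ```

    The point of the decoupling is that each stage is cheap on its own; the cost is that the heuristics in both stages must be retuned per platform, which is exactly the paper's contribution for ANYmal.
    
    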

    Injury Assessment for Physics-Based Characters

    Determining injury levels for virtual characters is an important aspect of many games. For characters that are animated using simulated physics, it is possible to assess injury levels based on physical properties, such as accelerations and forces. We have constructed a model for injury assessment that relates results from research on human injury response to parameters in physics-based animation systems. We describe a set of different normalized injury measures for individual body parts, which can be combined into a single measure for total injury. Our research includes a user study in which human observers rate the injury levels of physics-based characters falling from varying heights at different orientations. Results show that the correlation between our model output and perceived injury is stronger than the correlation between perceived injury and fall height (0.603 versus 0.466, respectively, with N = 1020 and p
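    The combination of normalized per-body-part measures into a single total can be sketched as below. The weighted-average-with-saturation rule here is an assumption for illustration; the paper defines its own combination of measures:

    ```python
    def total_injury(measures, weights=None):
        """Combine normalized per-body-part injury measures (each in [0, 1])
        into a single total-injury score, also clamped to [0, 1].

        measures: dict mapping body part -> normalized injury value
        weights:  optional dict of per-part weights (defaults to uniform)
        """
        if weights is None:
            weights = {part: 1.0 for part in measures}
        weighted = sum(weights[part] * value for part, value in measures.items())
        return min(1.0, weighted / sum(weights.values()))

    score = total_injury({"head": 0.8, "torso": 0.2})
    ```

    A per-part weighting like this would let a game, for example, penalize head impacts more heavily than limb impacts while still reporting one normalized number.
    
    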