422 research outputs found

    Learning to Segment and Represent Motion Primitives from Driving Data for Motion Planning Applications

    Full text link
    Developing an intelligent vehicle which can perform human-like actions requires the ability to learn basic driving skills from a large amount of naturalistic driving data. The algorithms will become efficient if we could decompose the complex driving tasks into motion primitives which represent the elementary compositions of driving skills. Therefore, the purpose of this paper is to segment unlabeled trajectory data into a library of motion primitives. By applying a probabilistic inference based on an iterative Expectation-Maximization algorithm, our method segments the collected trajectories while learning a set of motion primitives represented by the dynamic movement primitives. The proposed method utilizes the mutual dependencies between the segmentation and representation of motion primitives and the driving-specific based initial segmentation. By utilizing this mutual dependency and the initial condition, this paper presents how we can enhance the performance of both the segmentation and the motion primitive library establishment. We also evaluate the applicability of the primitive representation method to imitation learning and motion planning algorithms. The model is trained and validated by using the driving data collected from the Beijing Institute of Technology intelligent vehicle platform. The results show that the proposed approach can find the proper segmentation and establish the motion primitive library simultaneously

    Movement primitives with multiple phase parameters

    Get PDF
    Movement primitives are concise movement representations that can be learned from human demonstrations, support generalization to novel situations and modulate the speed of execution of movements. The speed modulation mechanisms proposed so far are limited though, allowing only for uniform speed modulation or coupling changes in speed to local measurements of forces, torques or other quantities. Those approaches are not enough when dealing with general velocity constraints. We present a movement primitive formulation that can be used to non-uniformly adapt the speed of execution of a movement in order to satisfy a given constraint, while maintaining similarity in shape to the original trajectory. We present results using a 4-DoF robot arm in a minigolf setup

    Layered direct policy search for learning hierarchical skills

    Get PDF
    Solutions to real world robotic tasks often require complex behaviors in high dimensional continuous state and action spaces. Reinforcement Learning (RL) is aimed at learning such behaviors but often fails for lack of scalability. To address this issue, Hierarchical RL (HRL) algorithms leverage hierarchical policies to exploit the structure of a task. However, many HRL algorithms rely on task specific knowledge such as a set of predefined sub-policies or sub-goals. In this paper we propose a new HRL algorithm based on information theoretic principles to autonomously uncover a diverse set of sub-policies and their activation policies. Moreover, the learning process mirrors the policys structure and is thus also hierarchical, consisting of a set of independent optimization problems. The hierarchical structure of the learning process allows us to control the learning rate of the sub-policies and the gating individually and add specific information theoretic constraints to each layer to ensure the diversification of the subpolicies. We evaluate our algorithm on two high dimensional continuous tasks and experimentally demonstrate its ability to autonomously discover a rich set of sub-policies

    Hierarchical relative entropy policy search

    Get PDF
    Many reinforcement learning (RL) tasks, especially in robotics, consist of multiple sub-tasks that are strongly structured. Such task structures can be exploited by incorporating hierarchical policies that consist of gating networks and sub-policies. However, this concept has only been partially explored for real world settings and complete methods, derived from first principles, are needed. Real world settings are challenging due to large and continuous state-action spaces that are prohibitive for exhaustive sampling methods. We define the problem of learning sub-policies in continuous state action spaces as finding a hierarchical policy that is composed of a high-level gating policy to select the low-level sub-policies for execution by the agent. In order to efficiently share experience with all sub-policies, also called inter-policy learning, we treat these sub-policies as latent variables which allows for distribution of the update information between the sub-policies. We present three different variants of our algorithm, designed to be suitable for a wide variety of real world robot learning tasks and evaluate our algorithms in two real robot learning scenarios as well as several simulations and comparisons

    Bio­-inspired approaches to the control and modelling of an anthropomimetic robot

    Get PDF
    Introducing robots into human environments requires them to handle settings designed specifically for human size and morphology, however, large, conventional humanoid robots with stiff, high powered joint actuators pose a significant danger to humans. By contrast, “anthropomimetic” robots mimic both human morphology and internal structure; skeleton, muscles, compliance and high redundancy. Although far safer, their resultant compliant structure presents a formidable challenge to conventional control. Here we review, and seek to address, characteristic control issues of this class of robot, whilst exploiting their biomimetic nature by drawing upon biological motor control research. We derive a novel learning controller for discovering effective reaching actions created through sustained activation of one or more muscle synergies, an approach which draws upon strong, recent evidence from animal and humans studies, but is almost unexplored to date in musculoskeletal robot literature. Since the best synergies for a given robot will be unknown, we derive a deliberately simple reinforcement learning approach intended to allow their emergence, in particular those patterns which aid linearization of control. We also draw upon optimal control theories to encourage the emergence of smoother movement by incorporating signal dependent noise and trial repetition. In addition, we argue the utility of developing a detailed dynamic model of a complete robot and present a stable, physics-­‐‑based model, of the anthropomimetic ECCERobot, running in real time with 55 muscles and 88 degrees of freedom. Using the model, we find that effective reaching actions can be learned which employ only two sequential motor co-­‐‑activation patterns, each controlled by just a single common driving signal. Factor analysis shows the emergent muscle co-­‐‑activations can be reconstructed to significant accuracy using weighted combinations of only 13 common fragments, labelled “candidate synergies”. Using these synergies as drivable units the same controller learns the same task both faster and better, however, other reaching tasks perform less well, proportional to dissimilarity; we therefore propose that modifications enabling emergence of a more generic set of synergies are required. Finally, we propose a continuous controller for the robot, based on model predictive control, incorporating our model as a predictive component for state estimation, delay-­‐‑ compensation and planning, including merging of the robot and sensed environment into a single model. We test the delay compensation mechanism by controlling a second copy of the model acting as a proxy for the real robot, finding that performance is significantly improved if a precise degree of compensation is applied and show how rapidly an un-­‐‑compensated controller fails as the model accuracy degrades

    A Survey on Policy Search for Robotics

    Get PDF
    Policy search is a subfield in reinforcement learning which focuses on finding good parameters for a given policy parametrization. It is well suited for robotics as it can cope with high-dimensional state and action spaces, one of the main challenges in robot learning. We review recent successes of both model-free and model-based policy search in robot learning. Model-free policy search is a general approach to learn policies based on sampled trajectories. We classify model-free methods based on their policy evaluation strategy, policy update strategy, and exploration strategy and present a unified view on existing algorithms. Learning a policy is often easier than learning an accurate forward model, and, hence, model-free methods are more frequently used in practice. However, for each sampled trajectory, it is necessary to interact with the * Both authors contributed equally. robot, which can be time consuming and challenging in practice. Modelbased policy search addresses this problem by first learning a simulator of the robot’s dynamics from data. Subsequently, the simulator generates trajectories that are used for policy learning. For both modelfree and model-based policy search methods, we review their respective properties and their applicability to robotic systems

    Definition and composition of motor Primitives using latent force models and hidden markov models

    Get PDF
    The movement representation problem is at the core of areas such as robot imitation learning and motion synthesis. In these fields, approaches oriented to the definition of motor primitives as basic building blocks of more complex movements have been extensively used because they cope with the high dimensionality and complexity by using a limited set of adjustable primitives. There is also biological evidence supporting the existence of such primitives in vertebrate and invertebrate motor systems. Traditional methods for representing motor primitives have been purely data-driven or strongly mechanistic. In the former approach new movements are generated using existing movements and these methods are usually very flexible but their extrapolation capacity is limited by the available training data. On the other hand, strongly mechanistic models have a better generalization ability by relying on a physical description of the modeled system, however, it may be hard to fully describe a real system and the resulting differential equations are usually expensive to solve numerically. Therefore, the motor primitive parameterization used in this work is based on a hybrid model which jointly incorporates the flexibility of the data-driven paradigm and the extrapolation capacity of strongly mechanistic models, namely the latent force model framework. Moreover, the sequential composition of different motor primitives is also addressed using Hidden Markov Models (HMMs) which allows to process movement realizations efficiently. The resulting joint model is an HMM with latent force models (LFMs) as emission process which is an unexplored combined probabilistic model to the best of our knowledge
    • 

    corecore