11 research outputs found

    Learning to Control Planar Hitting Motions of a Robotic Arm in a Mini-Golf-like Task

    In this thesis we tackle the problem of goal-oriented adaptation of a robot hitting motion. We propose the parameters that must be learned in order to use and adapt a basic hitting motion to play minigolf. Then, two different statistical methods are used to learn these parameters. The two methods are evaluated and compared. To validate the proposed approach, a minigolf control module is developed for a robotic arm. Using the different learning techniques, we show that a robot can learn the non-trivial task of deciding how the ball should be hit for a given position on a minigolf field. The result is a robust minigolf-playing system that outperforms most human players using only a small set of training examples.
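The learning step described above, mapping a ball position on the field to hitting parameters, can be sketched with a simple kernel regressor. This is a hedged illustration, not the thesis code: the training data, the RBF kernel, and the `hit_params` helper are all invented for the example.

```python
import numpy as np

def rbf(A, B, ell=0.3):
    """Radial basis function kernel between two point sets."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

# synthetic training examples: ball positions -> (hitting angle [rad], speed [m/s])
X = np.array([[0.1, 0.2], [0.4, 0.1], [0.3, 0.5], [0.6, 0.4]])
Y = np.array([[0.30, 1.1], [0.10, 1.4], [0.45, 1.2], [0.25, 1.6]])

# kernel ridge regression with a tiny regularizer (near-interpolation)
alpha = np.linalg.solve(rbf(X, X) + 1e-6 * np.eye(len(X)), Y)

def hit_params(ball_pos):
    """Predict (angle, speed) for a new ball position."""
    return rbf(np.atleast_2d(ball_pos), X) @ alpha
```

Near a training position the prediction stays close to the demonstrated hitting parameters, which is the behavior a small-training-set minigolf learner needs.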

    A Dynamical System-based Approach to Modeling Stable Robot Control Policies via Imitation Learning

    Despite tremendous advances in robotics, we are still amazed by the proficiency with which humans perform movements. Even new waves of robotic systems still rely heavily on hardcoded motions with a limited ability to react autonomously and robustly to a dynamically changing environment. This thesis focuses on providing possible mechanisms to push the level of adaptivity, reactivity, and robustness of robotic systems closer to human movements. Specifically, it aims at developing these mechanisms for a subclass of robot motions called “reaching movements”, i.e. movements in space stopping at a given target (also referred to as episodic motions, discrete motions, or point-to-point motions). These reaching movements can then be used as building blocks to form more advanced robot tasks. To achieve a high level of proficiency as described above, this thesis particularly seeks to derive control policies that: 1) resemble human motions, 2) guarantee the accomplishment of the task (if the target is reachable), and 3) can instantly adapt to changes in dynamic environments. To avoid manually hardcoding robot motions, this thesis exploits the power of machine learning techniques and takes an Imitation Learning (IL) approach to build a generic model of robot movements from a few examples provided by an expert. To achieve the required level of robustness and reactivity, the perspective adopted in this thesis is that a reaching movement can be described with a nonlinear Dynamical System (DS). When building an estimate of a DS from demonstrations, there are two key problems that need to be addressed: the problem of generating motions that best resemble the demonstrations (the “how-to-imitate” problem), and most importantly, the problem of ensuring the accomplishment of the task, i.e. reaching the target (the “stability” problem).
Although there are numerous well-established approaches in robotics that could answer each of these problems separately, tackling both problems simultaneously is challenging and has not been extensively studied yet. This thesis first tackles the problems mentioned above by introducing an iterative method to build an estimate of autonomous nonlinear DS that are formulated as a mixture of Gaussian functions. This method minimizes the number of Gaussian functions required for achieving both local asymptotic stability at the target and accuracy in following demonstrations. We then extend this formulation and provide sufficient conditions to ensure global asymptotic stability of autonomous DS at the target. In this approach, an estimation of the underlying DS is built by solving a constrained optimization problem, where the metric of accuracy and the stability conditions are formulated as the optimization objective and constraints, respectively. In addition to ensuring convergence of all motions to the target within the local or global stability regions, these approaches offer an inherent adaptability and robustness to changes in dynamic environments. This thesis further extends the previous approaches and ensures global asymptotic stability of DS-based motions at the target independently of the choice of the regression technique. Therefore, it offers the possibility to choose the most appropriate regression technique based on the requirements of the task at hand without compromising DS stability. This approach also provides the possibility of online learning and using a combination of two or more regression methods to model more advanced robot tasks, and can be applied to estimate motions that are represented with both autonomous and non-autonomous DS.
Additionally, this thesis suggests a reformulation to modeling robot motions that allows encoding of a considerably wider set of tasks ranging from reaching movements to agile robot movements that require hitting a given target with a specific speed and direction. This approach is validated on the challenging task of playing minigolf. Finally, the last part of this thesis proposes a DS-based approach to real-time obstacle avoidance. The presented approach provides a modulation that instantly modifies the robot’s motion to avoid collision with multiple static and moving convex obstacles. This approach can be applied on all the techniques described above without affecting their adaptability, swiftness, or robustness. The techniques that are developed in this thesis have been validated in simulation and on different robotic platforms including the humanoid robots HOAP-3 and iCub, and the robot arms KATANA, WAM, and LWR. Throughout this thesis we show that the DS-based approach to modeling robot discrete movements can offer a high level of adaptability, reactivity, and robustness almost effortlessly when interacting with dynamic environments.
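The central stability idea above can be made concrete with a toy example. This is a hedged sketch, not the thesis' learned model: the nonlinear field `f` below is hand-written rather than estimated from demonstrations, but it shows the property the thesis enforces, namely that the Lyapunov function V = ||x - x*||^2 decreases everywhere, so every motion reaches the target.

```python
import numpy as np

target = np.array([1.0, 0.5])   # attractor x* of the reaching motion

def f(x):
    """Hand-written globally asymptotically stable DS: x_dot = f(x)."""
    e = x - target
    # nonlinear convergent field: speed shaped by distance, direction to target
    return -(1.0 + np.tanh(np.linalg.norm(e))) * e

# Lyapunov decrease condition: V_dot = 2 e^T f(x) < 0 for any x != x*
x = np.array([-2.0, 3.0])
assert (x - target) @ f(x) < 0

# Euler rollout: the motion converges to the target from this start point
for _ in range(2000):
    x = x + 0.01 * f(x)
```

In the thesis the field is a Gaussian mixture fitted to demonstrations under exactly this kind of stability constraint; here the constraint holds by construction.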

    Movement primitives with multiple phase parameters

    Movement primitives are concise movement representations that can be learned from human demonstrations, support generalization to novel situations and modulate the speed of execution of movements. The speed modulation mechanisms proposed so far are limited though, allowing only for uniform speed modulation or coupling changes in speed to local measurements of forces, torques or other quantities. Those approaches are insufficient when dealing with general velocity constraints. We present a movement primitive formulation that can be used to non-uniformly adapt the speed of execution of a movement in order to satisfy a given constraint, while maintaining similarity in shape to the original trajectory. We present results using a 4-DoF robot arm in a minigolf setup.
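The effect of non-uniform speed adaptation can be sketched with a simple phase re-parameterization. This is a hedged illustration, not the paper's formulation: a demonstrated path is kept unchanged in shape, and only the timing is stretched locally wherever the original speed would violate a velocity limit.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 200)
path = np.stack([t, np.sin(2 * np.pi * t)], axis=1)   # demonstrated shape

seg = np.linalg.norm(np.diff(path, axis=0), axis=1)   # segment lengths
v_orig = seg / np.diff(t)                             # original speeds
v_max = 3.0                                           # velocity constraint

# slow the phase locally wherever the original speed exceeds the limit;
# elsewhere the original timing is kept (non-uniform modulation)
dt_new = np.maximum(np.diff(t), seg / v_max)
t_new = np.concatenate([[0.0], np.cumsum(dt_new)])
v_new = seg / np.diff(t_new)
```

The path points are untouched, so shape similarity is exact; only execution takes longer where the constraint binds.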

    Robot Learning from Failed Demonstrations

    Robot Learning from Demonstration (RLfD) seeks to enable lay users to encode desired robot behaviors as autonomous controllers. Current work uses a human's demonstration of the target task to initialize the robot's policy, and then improves its performance either through practice (with a known reward function), or additional human interaction. In this article, we focus on the initialization step and consider what can be learned when the humans do not provide successful examples. We develop probabilistic approaches that avoid reproducing observed failures while leveraging the variance across multiple attempts to drive exploration. Our experiments indicate that failure data do contain information that can be used to discover successful means to accomplish tasks. However, in higher dimensions, additional information from the user will most likely be necessary to enable efficient failure-based learning.
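One way to read the idea above is: fit a simple model to the failed attempts, then explore in the region their variance suggests while staying away from the failures themselves. This is a hedged sketch of that intuition, not the article's algorithm; the data and the Mahalanobis-distance scoring are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

failures = np.array([[0.9, 1.1], [1.0, 0.9], [1.1, 1.0]])   # failed parameters
mu, cov = failures.mean(0), np.cov(failures.T) + 1e-6 * np.eye(2)
prec = np.linalg.inv(cov)

def failure_score(x):
    """Mahalanobis distance to the failure cluster (higher = less like a failure)."""
    d = x - mu
    return float(d @ prec @ d)

# sample candidates around the failures: their variance sets the exploration
# range, and the score steers the choice away from reproducing a failure
cands = rng.multivariate_normal(mu, 4.0 * cov, size=50)
best = max(cands, key=failure_score)
```

The chosen candidate lies in the demonstrated region but far (in Mahalanobis terms) from the observed failed attempts.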

    Reinforcement learning to adjust parametrized motor primitives to new situations

    Humans manage to adapt learned movements very quickly to new situations by generalizing learned behaviors from similar situations. In contrast, robots currently often need to re-learn the complete movement. In this paper, we propose a method that learns to generalize parametrized motor plans by adapting a small set of global parameters, called meta-parameters. We employ reinforcement learning to learn the required meta-parameters to deal with the current situation, described by states. We introduce an appropriate reinforcement learning algorithm based on a kernelized version of the reward-weighted regression. To show its feasibility, we evaluate this algorithm on a toy example and compare it to several previous approaches. Subsequently, we apply the approach to three robot tasks, i.e., the generalization of throwing movements in darts, of hitting movements in table tennis, and of throwing balls, where the tasks are learned on several different real physical robots, i.e., a Barrett WAM, a BioRob, the JST-ICORP/SARCOS CBi, and a Kuka KR 6.
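The flavor of a kernelized reward-weighted regression can be sketched in a few lines. This is a hedged illustration in the spirit of the meta-parameter learning described above, not the paper's implementation: past rollouts store (situation, meta-parameter, reward), and low-reward rollouts receive extra regularization so they influence the prediction less.

```python
import numpy as np

def k(A, B, ell=0.5):
    """RBF kernel over situations."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

S = np.array([[0.2], [0.5], [0.8]])       # situations (states) seen so far
Theta = np.array([[1.0], [1.6], [2.3]])   # meta-parameters tried there
r = np.array([0.9, 0.5, 0.8])             # rewards obtained

# reward-weighted regularization: low reward -> more smoothing, less influence
C = np.diag(0.1 / r)
W = np.linalg.solve(k(S, S) + C, Theta)

def meta_param(s):
    """Predict a meta-parameter for a new situation s."""
    return (k(np.atleast_2d(s), S) @ W)[0, 0]
```

The predictor interpolates the successful rollouts and discounts the poorly rewarded one, so new situations get meta-parameters generalized from similar, well-rewarded ones.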

    Control and Learning of Compliant Manipulation Skills

    Humans demonstrate an impressive capability to manipulate fragile objects without damaging them, gracefully controlling the force and position of hands or tools. Traditionally, robotics has favored position control over force control to produce fast, accurate and repeatable motion. For extending the applicability of robotic manipulators outside the strictly controlled environments of industrial work cells, position control is inadequate. Tasks that involve contact with objects whose positions are not known with perfect certainty require a controller that regulates the relationship between positional deviations and forces on the robot. This problem is formalized in the impedance control framework, which focuses the robot control problem on the interaction between the robot and its environment. By adjusting the impedance parameters, the behavior of the robot can be adapted to the needs of the task. However, it is often difficult to specify formally how the impedance should vary for best performance. Furthermore, it can be shown that careless variation of the impedance can lead to unstable regulation or tracking even in free motion. In the first part of the thesis, the problem of how to define a varying impedance for a task is addressed. A haptic human-robot interface that allows a human supervisor to teach impedance variations by physically interacting with the robot during task execution is introduced. It is shown that the interface can be used to enhance the performance in several manipulation tasks. Then, the problem of stable control with varying impedance is addressed. Along with a theoretical discussion on this topic, a sufficient condition for stable varying stiffness and damping is provided. In the second part of the thesis, we explore more complex manipulation scenarios via online generation of the robot trajectory.
This is done along two axes: 1) learning how to react to contact forces in insertion tasks which are crucial for assembly operations, and 2) autonomous Dynamical Systems (DS) for motion representation with the capability to encode a family of trajectories rather than a fixed, time-dependent reference. A novel framework for task representation using DS is introduced, termed Locally Modulated Dynamical Systems (LMDS). LMDS differs from existing DS estimation algorithms in that it supports non-parametric and incremental learning, all the while guaranteeing that the resulting DS is globally stable at an attractor point. To combine the advantages of DS motion generation with impedance control, a novel controller for tasks described by first order DS is proposed. The controller is passive, and has the properties of an impedance controller with the added flexibility of a DS motion representation instead of a time-indexed trajectory.
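The local-modulation idea behind LMDS can be sketched as follows. This is a hedged toy version, not the thesis code: an originally stable field `f_o` is reshaped by a state-dependent rotation `M(x)` whose influence fades away from a chosen region, so the flow is changed locally while the attractor and convergence are preserved (a rotation by less than 90 degrees never reverses the flow toward the target).

```python
import numpy as np

target = np.zeros(2)

def f_o(x):
    """Original dynamics: a stable linear field toward the target."""
    return -(x - target)

def M(x, sigma=0.3, max_angle=0.8):
    """Local rotation whose influence decays away from a chosen center."""
    center = np.array([1.0, 1.0])
    w = np.exp(-((x - center) @ (x - center)) / (2 * sigma**2))
    a = w * max_angle        # rotation angle, largest at the center
    return np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])

def f(x):
    """Locally modulated dynamics: f(x) = M(x) f_o(x)."""
    return M(x) @ f_o(x)

# rollout: the modulation bends trajectories near (1, 1) but the motion
# still converges to the untouched attractor at the origin
x = np.array([2.0, 2.0])
for _ in range(3000):
    x = x + 0.005 * f(x)
```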

    Robot Learning with Task-Parameterized Generative Models

    Task-parameterized models provide a representation of movement/behavior that can adapt to a set of task parameters describing the current situation encountered by the robot, such as location of objects or landmarks in its workspace. This paper gives an overview of the task-parameterized Gaussian mixture model (TP-GMM) introduced in previous publications, and introduces a number of extensions and ongoing challenges required to move the approach toward unconstrained environments. In particular, it discusses its generalization capability and the handling of movements with a high number of degrees of freedom. It then shows that the method is not restricted to movements in task space, but that it can also be exploited to handle constraints in joint space, including priority constraints.
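A core building block of TP-GMM is the fusion of per-frame Gaussians. As a hedged single-component sketch (the full model has several mixture components and learned parameters): each task frame carries a Gaussian over the motion, the frame's current pose (A, b) maps it into the world, and a product of Gaussians fuses the frames' "opinions" into one adapted distribution.

```python
import numpy as np

def product_of_gaussians(params):
    """params: list of (A, b, mu, sigma) per frame; returns fused (mu, sigma)."""
    P_sum = np.zeros((2, 2))
    Pm_sum = np.zeros(2)
    for A, b, mu, sigma in params:
        mu_w = A @ mu + b               # Gaussian mapped into the world frame
        sigma_w = A @ sigma @ A.T
        P = np.linalg.inv(sigma_w)      # precision-weighted fusion
        P_sum += P
        Pm_sum += P @ mu_w
    sigma_hat = np.linalg.inv(P_sum)
    return sigma_hat @ Pm_sum, sigma_hat

# two frames (identity orientation, different origins) that happen to agree
I = np.eye(2)
frames = [(I, np.array([0.0, 0.0]), np.array([1.0, 0.0]), 0.1 * I),
          (I, np.array([0.0, 1.0]), np.array([1.0, -1.0]), 0.1 * I)]
mu_hat, sigma_hat = product_of_gaussians(frames)
```

When the frames agree, the fused mean matches both and the fused covariance shrinks, reflecting higher confidence; moving a frame (changing its b) moves the reproduced motion accordingly, which is exactly the adaptation mechanism the abstract describes.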

    Physical Reasoning for Intelligent Agent in Simulated Environments

    Developing Artificial Intelligence (AI) that is capable of understanding and interacting with the real world in a sophisticated way has long been a grand vision of AI. There is an increasing number of AI agents coming into our daily lives and assisting us with various daily tasks ranging from house cleaning to serving food in restaurants. While different tasks have different goals, the domains of the tasks all obey the physical rules (classic Newtonian physics) of the real world. To successfully interact with the physical world, an agent needs to be able to understand its surrounding environment, to predict the consequences of its actions and to draw plans that can achieve a goal without causing any unintended outcomes. Much of AI research over the past decades has been dedicated to specific sub-problems such as machine learning and computer vision, etc. Simply plugging in techniques from these subfields is far from creating a comprehensive AI agent that can work well in a physical environment. Instead, it requires an integration of methods from different AI areas that considers specific conditions and requirements of the physical environment. In this thesis, we identified several capabilities that are essential for AI to interact with the physical world, namely, visual perception, object detection, object tracking, action selection, and structure planning. As the real world is a highly complex environment, we started with developing these capabilities in virtual environments with realistic physics simulations. The central part of our methods is the combination of qualitative reasoning and standard techniques from different AI areas. For the visual perception capability, we developed a method that can infer spatial properties of rectangular objects from their minimum bounding rectangles. For the object detection capability, we developed a method that can detect unknown objects in a structure by reasoning about the stability of the structure. 
For the object tracking capability, we developed a method that can match perceptually indistinguishable objects in visual observations made before and after a physical impact. This method can identify spatial changes of objects in the physical event, and the result of matching can be used for learning the consequence of the impact. For the action selection capability, we developed a method that solves a hole-in-one problem that requires selecting an action out of an infinite number of actions with unknown consequences. For the structure planning capability, we developed a method that can arrange objects to form a stable and robust structure by reasoning about structural stability and robustness.
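The kind of qualitative stability reasoning used for the detection and planning capabilities can be sketched in 2-D. This is a hedged toy version, not the thesis implementation: a block resting on another is stable only if its centre of mass lies over the contact region (the overlap of the two blocks' horizontal extents).

```python
def overlap(a, b):
    """Contact interval of two horizontal extents (x_min, x_max)."""
    return max(a[0], b[0]), min(a[1], b[1])

base = (0.0, 2.0)          # base block's horizontal extent
top = (1.5, 3.5)           # top block, shifted far to the right

contact = overlap(top, base)              # contact region under the top block
top_com = (top[0] + top[1]) / 2           # top block's centre of mass (uniform)
top_stable = contact[0] <= top_com <= contact[1]
```

Here the top block's centre of mass (2.5) falls outside the contact region (1.5 to 2.0), so the reasoner concludes the block topples; the same test, applied to aggregated masses, drives the stable-structure planner.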

    Compliant control of Uni/ Multi- robotic arms with dynamical systems

    Accomplishment of many interactive tasks hinges on the compliance of humans. Humans demonstrate an impressive capability of complying their behavior, and more particularly their motions, with the environment in everyday life. In humans, compliance emerges from different facets. For example, many daily activities involve reaching for grabbing tasks, where compliance appears in a form of coordination. Humans comply their hands’ motions with each other and with that of the object not only to establish a stable contact and to control the impact force but also to overcome sensorimotor imprecisions. Even though compliance has been studied from different aspects in humans, it is primarily related to impedance control in robotics. In this thesis, we leverage the properties of autonomous dynamical systems (DS) for immediate re-planning and introduce active compliant motion generators for controlling robots in three different scenarios, where compliance does not necessarily mean impedance and hence it is not directly related to control in the force/velocity domain. In the first part of the thesis, we propose an active compliant strategy for catching objects in flight, which is less sensitive to the timely control of the interception. The soft catching strategy consists in having the robot follow the object for a short period of time. This leaves more time for the fingers to close on the object at the interception and offers more robustness than a “hard” catching method in which the hand waits for the object at the chosen interception point. We show theoretically that the resulting DS will intercept the object at the intercept point, at the right time, with the desired velocity direction. Stability and convergence of the approach are assessed through Lyapunov stability theory. In the second part, we propose a unified compliant control architecture for coordinately reaching for grabbing a moving object by a multi-arm robotic system.
Due to the complexity of the task and of the system, each arm complies not only with the object’s motion but also with the motion of other arms, in both task and joint spaces. At the task-space level, we propose a unified dynamical system that endows the multi-arm system with both synchronous and asynchronous behaviors and with the capability of smoothly transitioning between the two modes. At the joint-space level, the compliance between the arms is achieved by introducing a centralized inverse kinematics (IK) solver under self-collision avoidance constraints, formulated as a quadratic programming (QP) problem and solved in real-time. In the last part, we propose a compliant dynamical system for stably transitioning from free motions to contacts. In this part, by modulating the robot's velocity in three regions, we show theoretically and empirically that the robot can (I) stably touch the contact surface, (II) at a desired location, and (III) leave the surface or stop on the surface at a desired point.
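The three-region velocity modulation for free-motion-to-contact transitions can be sketched for a flat surface. This is a hedged toy version, not the thesis controller: far from the surface the nominal velocity passes through unchanged, in an approach band the normal component is ramped down with the remaining distance, and in contact motion into the surface is blocked.

```python
import numpy as np

def modulated_velocity(pos, v_nominal, d_slow=0.2):
    """Modulate a commanded velocity near a flat surface at z = 0."""
    z = pos[2]                       # distance to the surface
    v = v_nominal.copy()
    if z <= 0.0:                     # contact region: no motion into the surface
        v[2] = max(v[2], 0.0)
    elif z < d_slow and v[2] < 0.0:  # approach region: ramp normal speed down
        v[2] *= z / d_slow
    return v                         # free-motion region: unchanged

v = np.array([0.1, 0.0, -0.5])
v_free = modulated_velocity(np.array([0.0, 0.0, 0.5]), v)     # far away
v_near = modulated_velocity(np.array([0.0, 0.0, 0.1]), v)     # approaching
v_contact = modulated_velocity(np.array([0.0, 0.0, 0.0]), v)  # touching
```

The normal velocity vanishes exactly at the surface, so the contact is established with near-zero impact speed, while the tangential component (and hence the ability to slide to a desired point or leave the surface) is untouched.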

    Generative Models for Learning Robot Manipulation Skills from Humans

    A long-standing goal in artificial intelligence is to make robots seamlessly interact with humans in performing everyday manipulation skills. Learning from demonstrations or imitation learning provides a promising route to bridge this gap. In contrast to direct trajectory learning from demonstrations, many problems arise in interactive robotic applications that require higher contextual level understanding of the environment. This requires learning invariant mappings in the demonstrations that can generalize across different environmental situations such as size, position, orientation of objects, viewpoint of the observer, etc. In this thesis, we address this challenge by encapsulating invariant patterns in the demonstrations using probabilistic learning models for acquiring dexterous manipulation skills. We learn the joint probability density function of the demonstrations with a hidden semi-Markov model, and smoothly follow the generated sequence of states with a linear quadratic tracking controller. The model exploits the invariant segments (also termed sub-goals, options or actions) in the demonstrations and adapts the movement in accordance with the external environmental situations such as size, position and orientation of the objects in the environment using a task-parameterized formulation. We incorporate high-dimensional sensory data for skill acquisition by parsimoniously representing the demonstrations using statistical subspace clustering methods and exploit the coordination patterns in latent space. To adapt the models on the fly and/or teach new manipulation skills online with the streaming data, we formulate a non-parametric scalable online sequence clustering algorithm with Bayesian non-parametric mixture models to avoid the model selection problem while ensuring tractability under small variance asymptotics.
We exploit the developed generative models to perform manipulation skills with remotely operated vehicles over satellite communication in the presence of communication delays and limited bandwidth. A set of task-parameterized generative models are learned from the demonstrations of different manipulation skills provided by the teleoperator. The model captures the intention of the teleoperator on one hand and provides assistance in performing remote manipulation tasks on the other hand under varying environmental situations. The assistance is formulated under time-independent shared control, where the model continuously corrects the remote arm movement based on the current state of the teleoperator; and/or time-dependent autonomous control, where the model synthesizes the movement of the remote arm for autonomous skill execution. Using the proposed methodology with the two-armed Baxter robot as a mock-up for semi-autonomous teleoperation, we are able to learn manipulation skills such as opening a valve, pick-and-place of an object with obstacle avoidance, hot-stabbing (a specialized underwater task akin to a peg-in-a-hole task), screw-driver target snapping, and tracking a carabiner in as few as 4-8 demonstrations. Our study shows that the proposed manipulation assistance formulations improve the performance of the teleoperator by reducing the task errors and the execution time, while catering for the environmental differences in performing remote manipulation tasks with limited bandwidth and communication delays.
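The tracking half of the pipeline above can be sketched in isolation. This is a hedged 1-D toy, not the thesis model: assume a sequence of stepwise sub-goal positions has already been produced (in the thesis, by the learned hidden semi-Markov model), and follow it with a linear-quadratic tracker on a double-integrator system, with the steady-state gain obtained by iterating the discrete Riccati recursion.

```python
import numpy as np

# discrete double integrator, state x = [position, velocity], dt = 0.1
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.diag([10.0, 1.0]), np.array([[0.1]])

# steady-state LQR gain via Riccati value iteration
P = Q.copy()
for _ in range(500):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ (A - B @ K)

# stepwise sub-goal positions (stand-in for an HSMM state sequence)
refs = [0.5, 0.5, 1.2, 1.2]
x = np.zeros(2)
for ref in refs:
    xd = np.array([ref, 0.0])          # track each sub-goal at rest
    for _ in range(100):               # hold each sub-goal for 100 steps
        u = -(K @ (x - xd))
        x = A @ x + B[:, 0] * u[0]
```

The controller smoothly blends through the piecewise-constant references, which is the role the linear quadratic tracker plays behind the learned state sequence.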