    Interactive Imitation Learning of Bimanual Movement Primitives

    Performing bimanual tasks with dual robotic setups can drastically increase their impact on industrial and daily life applications. However, performing a bimanual task brings many challenges, like synchronization and coordination of the single-arm policies. This article proposes the Safe, Interactive Movement Primitives Learning (SIMPLe) algorithm to teach and correct single- or dual-arm impedance policies directly from human kinesthetic demonstrations. Moreover, it proposes a novel graph encoding of the policy based on Gaussian Process Regression (GPR) where the single-arm motion is guaranteed to converge close to the trajectory and then towards the demonstrated goal. Regulation of the robot stiffness according to the epistemic uncertainty of the policy allows for easily reshaping the motion with human feedback and/or adapting to external perturbations. We tested the SIMPLe algorithm on a real dual-arm setup in two scenarios: one where the teacher gave separate single-arm demonstrations and then successfully synchronized them using only kinesthetic feedback, and one where the original bimanual demonstration was locally reshaped to pick a box at a different height.
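
The idea of regulating stiffness according to epistemic uncertainty can be illustrated with a minimal sketch. It is not the SIMPLe implementation: the RBF kernel, the hyperparameters, the helper names (`rbf_kernel`, `gp_predictive_variance`, `stiffness_from_uncertainty`), and the linear mapping from variance to stiffness are all assumptions made for illustration. The sketch only shows that a GP's predictive variance is small near demonstrated positions and large far from them, so a stiffness that decreases with variance leaves the robot compliant where the policy is uncertain.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.1, signal_var=1.0):
    """Squared-exponential kernel between two sets of points (rows)."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return signal_var * np.exp(-0.5 * d2 / length_scale**2)

def gp_predictive_variance(x_train, x_query, length_scale=0.1,
                           signal_var=1.0, noise_var=1e-4):
    """Latent GP predictive variance at x_query given inputs x_train."""
    K = rbf_kernel(x_train, x_train, length_scale, signal_var)
    K += noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_query, x_train, length_scale, signal_var)
    v = np.linalg.solve(K, K_s.T)
    var = signal_var - np.sum(K_s * v.T, axis=1)
    # Clip small negative values caused by round-off.
    return np.clip(var, 0.0, signal_var)

def stiffness_from_uncertainty(var, signal_var=1.0, k_min=0.0, k_max=600.0):
    """Hypothetical linear law: full stiffness at zero variance,
    minimum stiffness when the variance equals the prior variance."""
    confidence = 1.0 - var / signal_var
    return k_min + (k_max - k_min) * confidence

# Toy demonstration: a straight line along x at height 0.3 m (assumed data).
demo = np.column_stack([np.linspace(0.0, 0.5, 20),
                        np.zeros(20),
                        np.full(20, 0.3)])
queries = np.array([[0.25, 0.0, 0.3],   # on the demonstrated path
                    [0.25, 0.5, 0.3]])  # far from any demonstration
var = gp_predictive_variance(demo, queries)
stiffness = stiffness_from_uncertainty(var)
```

With these assumed values, the first query receives a stiffness close to `k_max` and the second a stiffness close to `k_min`, which is the qualitative behavior the abstract describes as making the motion easy to reshape.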

    Imitation Learning of Motion Coordination in Robots: a Dynamical System Approach

    The ease with which humans coordinate all their limbs is fascinating. Such simplicity is the result of a complex process of motor coordination, i.e. the ability to resolve the biomechanical redundancy in an efficient and repeatable manner. Coordination enables a wide variety of everyday human activities, from filling a glass with water to pair figure skating. Therefore, it is highly desirable to endow robots with similar skills. Despite the apparent diversity of coordinated motions, all of them share a crucial similarity: these motions are dictated by underlying constraints. The constraints shape the formation of the coordination patterns between the different degrees of freedom. Coordination constraints may take a spatio-temporal form; for instance, during bimanual object reaching or while catching a ball on the fly. They may also relate to the dynamics of the task; for instance, when one applies a specific force profile to carry a load. In this thesis, we develop a framework for teaching coordination skills to robots. Coordination may take different forms; here, we focus on teaching a robot intra-limb and bimanual coordination, as well as coordination with a human during physical collaborative tasks. We use tools from the well-established domains of Bayesian semiparametric learning (Gaussian Mixture Models and Regression, Hidden Markov Models), nonlinear dynamics, and adaptive control. We take a biologically inspired approach to robot control. Specifically, we adopt an imitation learning perspective on skill transfer, which offers a seamless and intuitive way of capturing the constraints contained in natural human movements. As the robot is taught from motion data provided by a human teacher, we exploit evidence from human motor control that the temporal evolution of human motions may be described by dynamical systems. Throughout this thesis, we demonstrate that the dynamical system view on movement formation facilitates coordination control in robots.
We explain how our framework for teaching coordination to a robot is built up, starting from intra-limb coordination and control, moving to bimanual coordination, and finally to physical interaction with a human. The dissertation opens with the discussion of learning discrete task-level coordination patterns, such as spatio-temporal constraints emerging between the two arms in bimanual manipulation tasks. The encoding of bimanual constraints occurs at the task level and proceeds through a discretization of the task as sequences of bimanual constraints. Once the constraints are learned, the robot utilizes them to couple the two dynamical systems that generate kinematic trajectories for the hands. Explicit coupling of the dynamical systems ensures accurate reproduction of the learned constraints, and proves to be crucial for successful accomplishment of the task. In the second part of this thesis, we consider learning one-arm control policies. We present an approach to extracting non-linear autonomous dynamical systems from kinematic data of arbitrary point-to-point motions. The proposed method aims to tackle the fundamental questions of learning robot coordination: (i) how to infer a motion representation that captures a multivariate coordination pattern between degrees of freedom and that generalizes this pattern to unseen contexts; (ii) whether the policy learned directly from demonstrations can provide robustness against spatial and temporal perturbations. Finally, we demonstrate that the developed dynamical system approach to coordination may go beyond kinematic motion learning. We consider physical interactions between a robot and a human in situations where they jointly perform manipulation tasks; in particular, the problem of collaborative carrying and positioning of a load. We extend the approach proposed in the second part of this thesis to incorporate haptic information into the learning process. 
As a result, the robot adapts its kinematic motion plan according to human intentions expressed through the haptic signals. Even after the robot has learned the task model, the human still remains a complex contact environment. To ensure robustness of the robot behavior in the face of the variability inherent to human movements, we wrap the learned task model in an adaptive impedance controller with automatic gain tuning. The techniques developed in this thesis have been applied to enable learning of unimanual and bimanual manipulation tasks on the robotic platforms HOAP-3, KATANA, and i-Cub, as well as to endow a pair of simulated robots with the ability to perform a manipulation task in physical collaboration.

    Intuitive Instruction of Industrial Robots : A Knowledge-Based Approach

    With more advanced manufacturing technologies, small and medium-sized enterprises can compete with low-wage labor by providing customized and high-quality products. For small production series, robotic systems can provide a cost-effective solution. However, for robots to be able to perform on par with human workers in manufacturing industries, they must become flexible and autonomous in their task execution and swift and easy to instruct. This will enable small businesses with short production series or highly customized products to use robot coworkers without consulting expert robot programmers. The objective of this thesis is to explore programming solutions that can reduce the programming effort of sensor-controlled robot tasks. The robot motions are expressed using constraints, and multiple simple constrained motions can be combined into a robot skill. The skill can be stored in a knowledge base together with a semantic description, which enables reuse and reasoning. The main contributions of the thesis are 1) development of ontologies for knowledge about robot devices and skills, 2) a user interface that provides simple programming of dual-arm skills for non-experts and experts, 3) a programming interface for task descriptions in unstructured natural language in a user-specified vocabulary and 4) an implementation where low-level code is generated from the high-level descriptions. The resulting system greatly reduces the number of parameters exposed to the user, is simple to use for non-experts and reduces the programming time for experts by 80%. The representation is described on a semantic level, which means that the same skill can be used on different robot platforms. The research is presented in seven papers, the first describing the knowledge representation and the second the knowledge-based architecture that enables skill sharing between robots.
The third paper presents the translation from high-level instructions to low-level code for force-controlled motions. The two following papers evaluate the simplified programming prototype for non-expert and expert users. The last two present how program statements are extracted from unstructured natural language descriptions.

    Programming by Demonstration on Riemannian Manifolds

    This thesis presents a Riemannian approach to Programming by Demonstration (PbD). It generalizes an existing PbD method from Euclidean manifolds to Riemannian manifolds. In this abstract, we review the objectives, methods and contributions of the presented approach. OBJECTIVES PbD aims at providing a user-friendly method for skill transfer between human and robot. It enables a user to teach a robot new tasks using few demonstrations. In order to surpass simple record-and-replay, methods for PbD need to 'understand' what to imitate; they need to extract the functional goals of a task from the demonstration data. This is typically achieved through the application of statistical methods. The variety of data encountered in robotics is large. Typical manipulation tasks involve position, orientation, stiffness, force and torque data. These data are not solely Euclidean. Instead, they originate from a variety of manifolds, curved spaces that are only locally Euclidean. Elementary operations, such as summation, are not defined on manifolds. Consequently, standard statistical methods are not well suited to analyze demonstration data that originate from non-Euclidean manifolds. In order to effectively extract what-to-imitate, methods for PbD should take into account the underlying geometry of the demonstration manifold; they should be geometry-aware. Successful task execution does not solely depend on the control of individual task variables. By controlling variables individually, a task might fail when one is perturbed and the others do not respond. Task execution also relies on couplings among task variables. These couplings describe functional relations which are often called synergies. In order to understand what-to-imitate, PbD methods should be able to extract and encode synergies; they should be synergetic. In unstructured environments, it is unlikely that tasks are found in the same scenario twice.
The circumstances under which a task is executed (the task context) are more likely to differ each time it is executed. Task context does not only vary during task execution, it also varies while learning and recognizing tasks. To be effective, a robot should be able to learn, recognize and synthesize skills in a variety of familiar and unfamiliar contexts; this can be achieved when its skill representation is context-adaptive. THE RIEMANNIAN APPROACH In this thesis, we present a skill representation that is geometry-aware, synergetic and context-adaptive. The presented method is probabilistic; it assumes that demonstrations are samples from an unknown probability distribution. This distribution is approximated using a Riemannian Gaussian Mixture Model (GMM). Instead of using the 'standard' Euclidean Gaussian, we rely on the Riemannian Gaussian, a distribution akin to the Gaussian but defined on a Riemannian manifold. A Riemannian manifold is a manifold (a curved space which is locally Euclidean) that provides a notion of distance. This notion is essential for statistical methods as such methods rely on a distance measure. Examples of Riemannian manifolds in robotics are: the Euclidean space, which is used for spatial data, forces or torques; the spherical manifolds, which can be used for orientation data defined as unit quaternions; and Symmetric Positive Definite (SPD) manifolds, which can be used to represent stiffness and manipulability. The Riemannian Gaussian is intrinsically geometry-aware. Its definition is based on the geometry of the manifold, and therefore takes into account the manifold curvature. In robotics, the manifold structure is often known beforehand. In the case of PbD, it follows from the structure of the demonstration data. Like the Gaussian distribution, the Riemannian Gaussian is defined by a mean and covariance. The covariance describes the variance and correlation among the state variables.
These can be interpreted as local functional couplings among state variables: synergies. This makes the Riemannian Gaussian synergetic. Furthermore, information encoded in multiple Riemannian Gaussians can be fused using the Riemannian product of Gaussians. This feature allows us to construct a probabilistic context-adaptive task representation. CONTRIBUTIONS In particular, this thesis presents a generalization of existing methods of PbD, namely GMM-GMR and TP-GMM. This generalization involves the definition of Maximum Likelihood Estimate (MLE), Gaussian conditioning and Gaussian product for the Riemannian Gaussian, and the definition of Expectation Maximization (EM) and Gaussian Mixture Regression (GMR) for the Riemannian GMM. In this generalization, we contributed by proposing to use parallel transport for Gaussian conditioning. Furthermore, we presented a unified approach to solve the aforementioned operations using a Gauss-Newton algorithm. We demonstrated how synergies, encoded in a Riemannian Gaussian, can be transformed into synergetic control policies using standard methods for Linear Quadratic Regulator (LQR). This is achieved by formulating the LQR problem in a (Euclidean) tangent space of the Riemannian manifold. Finally, we demonstrated how the context-adaptive Task-Parameterized Gaussian Mixture Model (TP-GMM) can be used for context inference, i.e., the ability to extract context from demonstration data of known tasks. Our approach is the first attempt at context inference in the light of TP-GMM. Although effective, we showed that it requires further improvements in terms of speed and reliability. The efficacy of the Riemannian approach is demonstrated in a variety of scenarios. In shared control, the Riemannian Gaussian is used to represent control intentions of a human operator and an assistive system. In doing so, the properties of the Gaussian can be employed to mix their control intentions.
This yields shared-control systems that continuously re-evaluate and assign control authority based on input confidence. The context-adaptive TP-GMM is demonstrated in a Pick & Place task with changing pick and place locations, a box-taping task with changing box sizes, and a trajectory tracking task typically found in industr
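
To make the notion of a Riemannian Gaussian more concrete, the following is a minimal sketch for the unit sphere S^2, assuming NumPy. It is not the thesis implementation: the function names, the toy data, and the stopping rule are illustrative assumptions. The mean is found by repeatedly mapping the data into the tangent space at the current estimate (logarithmic map), averaging there, and mapping the result back to the manifold (exponential map), an iteration of the Gauss-Newton type. The covariance is then computed from the tangent-space coordinates at the mean.

```python
import numpy as np

def log_map(base, point):
    """Logarithmic map on the unit sphere: tangent vector at `base`
    pointing toward `point`, with length equal to the geodesic distance.
    Antipodal points (point == -base) are not handled in this sketch."""
    cos_theta = np.clip(np.dot(base, point), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-12:
        return np.zeros_like(base)
    tangent = point - cos_theta * base
    return theta * tangent / np.linalg.norm(tangent)

def exp_map(base, tangent):
    """Exponential map on the unit sphere: follow the geodesic from
    `base` in direction `tangent` for a distance of ||tangent||."""
    norm = np.linalg.norm(tangent)
    if norm < 1e-12:
        return base.copy()
    return np.cos(norm) * base + np.sin(norm) * tangent / norm

def riemannian_mean(points, iterations=50, tol=1e-12):
    """Iterative estimate of the mean of points on the sphere."""
    mean = points[0].copy()
    for _ in range(iterations):
        step = np.mean([log_map(mean, p) for p in points], axis=0)
        mean = exp_map(mean, step)
        mean /= np.linalg.norm(mean)  # guard against numerical drift
        if np.linalg.norm(step) < tol:
            break
    return mean

def tangent_covariance(points, mean):
    """Covariance of the data expressed in the tangent space at `mean`
    (3x3 in ambient coordinates, normalized by the number of points)."""
    logs = np.array([log_map(mean, p) for p in points])
    return logs.T @ logs / len(points)

# Toy data: four unit vectors placed symmetrically around the north pole.
raw = np.array([[0.1, 0.0, 1.0], [-0.1, 0.0, 1.0],
                [0.0, 0.1, 1.0], [0.0, -0.1, 1.0]])
points = raw / np.linalg.norm(raw, axis=1, keepdims=True)
mean = riemannian_mean(points)
cov = tangent_covariance(points, mean)
```

Because the covariance lives in the tangent space at the mean, it has no component along the mean direction. The same pattern (log map, Euclidean operation, exp map) is what allows standard Euclidean tools to be reused on curved data such as unit quaternions.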

    Indirect Methods for Robot Skill Learning

    Robot learning algorithms are appealing alternatives for acquiring rational robotic behaviors from data collected during the execution of tasks. Furthermore, most robot learning techniques are stated as isolated stages and focused on directly obtaining rational policies as a result of optimizing only performance measures of single tasks. However, formulating robotic skill acquisition processes in such a way has some disadvantages. For example, if the same skill has to be learned by different robots, independent learning processes should be carried out for acquiring exclusive policies for each robot. Similarly, if a robot has to learn diverse skills, the robot should acquire the policy for each task in separate learning processes, in a sequential order and commonly starting from scratch. In the same way, formulating the learning process in terms of only the performance measure makes robots unintentionally avoid situations that should not be repeated, but without any mechanism that captures the necessity of not repeating those wrong behaviors. In contrast, humans and other animals exploit their experience not only for improving the performance of the task they are currently executing, but also for indirectly constructing multiple models to help them with that particular task and to generalize to new problems. Accordingly, the models and algorithms proposed in this thesis seek to be more data efficient and extract more information from the interaction data that is collected either from expert's demonstrations or the robot's own experience. The first approach encodes robotic skills with shared latent variable models, obtaining latent representations that can be transferred from one robot to others, therefore avoiding the need to learn the same task from scratch.
The second approach learns complex rational policies by representing them as hierarchical models that can perform multiple concurrent tasks, and whose components are learned in the same learning process, instead of separate processes. Finally, the third approach uses the interaction data for learning two alternative and antagonistic policies that capture what to do and what not to do, and which influence the learning process in addition to the performance measure defined for the task.