106 research outputs found

    Sample-Efficient Reinforcement Learning of Robot Control Policies in the Real World

    Get PDF
    abstract: The goal of reinforcement learning is to enable systems to autonomously solve tasks in the real world, even in the absence of prior data. To succeed in such situations, reinforcement learning algorithms collect new experience through interactions with the environment to further the learning process. The behaviour is optimized by maximizing a reward function, which assigns high numerical values to desired behaviours. Especially in robotics, such interactions with the environment are expensive in terms of the required execution time, human involvement, and mechanical degradation of the system itself. Therefore, this thesis aims to introduce sample-efficient reinforcement learning methods that are applicable to real-world settings and control tasks such as bimanual manipulation and locomotion. Sample efficiency is achieved through directed exploration, either by using dimensionality reduction or trajectory optimization methods. Finally, it is demonstrated how data-efficient reinforcement learning methods can be used to optimize the behaviour and morphology of robots at the same time.
    Doctoral Dissertation, Computer Science, 201

    Probabilistic Slide-support Manipulation Planning in Clutter

    Full text link
    To safely and efficiently extract an object from clutter, this paper presents a bimanual manipulation planner in which one hand of the robot is used to slide the target object out of the clutter while the other hand is used to support the surrounding objects to prevent the clutter from collapsing. Our method uses a neural network to predict the physical behaviour of the clutter when the target object is moved. We generate the most efficient action based on Monte Carlo tree search. The grasping and sliding actions are planned to minimize the number of motion sequences needed to pick the target object. In addition, the object to be supported is chosen to minimize the position change of the surrounding objects. Experiments with a real bimanual robot confirmed that the robot could retrieve the target object while reducing the total number of motion sequences and improving safety.
    Comment: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023) (Accepted)
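
The paper couples a learned clutter-dynamics predictor with Monte Carlo tree search. The tree-search backbone can be illustrated with the standard UCB1 selection rule; this is a generic sketch with hypothetical names, not the authors' implementation:

```python
import math

def ucb1(parent_visits, child_visits, child_value, c=1.4):
    """UCB1 score used by MCTS to pick which action to expand next:
    average return (exploitation) plus a bonus that shrinks as the
    action is visited more often (exploration)."""
    if child_visits == 0:
        return float("inf")  # always try unvisited actions first
    exploit = child_value / child_visits
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit + explore

def select_action(stats, parent_visits):
    # stats maps action name -> (visits, accumulated value)
    return max(stats, key=lambda a: ucb1(parent_visits, stats[a][0], stats[a][1]))

# Toy example: "slide" has a better average return, but "support"
# is under-explored and therefore gets a larger exploration bonus.
stats = {"slide": (50, 30.0), "support": (5, 2.0)}
best = select_action(stats, parent_visits=55)
```

In a planner like the one described, the value backed up through the tree would come from the neural network's prediction of how the clutter reacts, and the action set would be the grasp/slide/support choices.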

    Self-Powered Robots to Reduce Motor Slacking During Upper-Extremity Rehabilitation: A Proof of Concept Study

    Get PDF
    Background: Robotic rehabilitation is a highly promising approach to recover lost functions after stroke or other neurological disorders. Unfortunately, robotic rehabilitation currently suffers from motor slacking, a phenomenon in which the human motor system reduces muscle activation levels and movement excursions, ostensibly to minimize metabolic- and movement-related costs. Consequently, the patient remains passive and is not fully engaged during therapy. To overcome this limitation, we envision a new class of body-powered robots and hypothesize that motor slacking could be reduced if individuals must provide the power to move their impaired limbs via their own body (i.e., through the motion of a healthy limb). Objective: To test whether a body-powered exoskeleton (i.e., robot) could reduce motor slacking during robotic training. Methods: We developed a body-powered robot that mechanically coupled the motions of the user's elbow joints. We tested this passive robot in two groups of subjects (stroke and able-bodied) during four exercise conditions in which we controlled whether the robotic device was powered by the subject or by the experimenter, and whether the subject's driven arm was engaged or at rest. Motor slacking was quantified by computing the muscle activation changes of the elbow flexor and extensor muscles using surface electromyography. Results: Subjects had higher levels of muscle activation in their driven arm during self-powered conditions compared to externally powered conditions. Most notably, subjects unintentionally activated their driven arm even when explicitly told to relax while the device was self-powered. This behavior persisted throughout the trial and did not wane after its initiation. Conclusions: Our findings provide novel evidence that motor slacking can be reduced by self-powered robots, demonstrating promise for the rehabilitation of impaired subjects using this new class of wearable system.
The results also serve as a foundation for developing more sophisticated body-powered robots (e.g., with controllable transmissions) for rehabilitation purposes.
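
Muscle activation from surface EMG is commonly compared across conditions via a moving-window RMS envelope. A minimal sketch of that computation (generic illustration with hypothetical window settings, not the study's exact pipeline):

```python
import math

def rms_envelope(signal, fs, window_s=0.25):
    """Moving-window RMS of a surface EMG signal.
    fs is the sampling rate in Hz; window_s is the window width in
    seconds. Higher RMS indicates stronger muscle activation."""
    n = max(1, int(window_s * fs))
    env = []
    for i in range(len(signal)):
        lo, hi = max(0, i - n // 2), min(len(signal), i + n // 2 + 1)
        win = signal[lo:hi]
        env.append(math.sqrt(sum(s * s for s in win) / len(win)))
    return env

# Toy comparison of mean activation between two conditions
fs = 1000  # Hz
self_powered = [2.0] * 1000        # stand-in for a stronger EMG burst
externally_powered = [0.5] * 1000  # stand-in for a weaker one
mean_self = sum(rms_envelope(self_powered, fs)) / 1000
mean_ext = sum(rms_envelope(externally_powered, fs)) / 1000
```

A real pipeline would band-pass filter and rectify the raw EMG first; the envelope comparison above is only the final quantification step.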

    On Neuromechanical Approaches for the Study of Biological Grasp and Manipulation

    Full text link
    Biological and robotic grasp and manipulation are undeniably similar at the level of mechanical task performance. However, their underlying fundamental biological vs. engineering mechanisms are, by definition, dramatically different and can even be antithetical. Even our approach to each is diametrically opposite: inductive science for the study of biological systems vs. engineering synthesis for the design and construction of robotic systems. The past 20 years have seen several conceptual advances in both fields and the quest to unify them. Chief among them is the reluctant recognition that their underlying fundamental mechanisms may actually share limited common ground, while exhibiting many fundamental differences. This recognition is particularly liberating because it allows us to resolve and move beyond multiple paradoxes and contradictions that arose from the initial reasonable assumption of a large common ground. Here, we begin by introducing the perspective of neuromechanics, which emphasizes that real-world behavior emerges from the intimate interactions among the physical structure of the system, the mechanical requirements of a task, the feasible neural control actions to produce it, and the ability of the neuromuscular system to adapt through interactions with the environment. This allows us to articulate a succinct overview of a few salient conceptual paradoxes and contradictions regarding under-determined vs. over-determined mechanics, under- vs. over-actuated control, prescribed vs. emergent function, learning vs. implementation vs. adaptation, prescriptive vs. descriptive synergies, and optimal vs. habitual performance. We conclude by presenting open questions and suggesting directions for future research. We hope this frank assessment of the state-of-the-art will encourage and guide these communities to continue to interact and make progress in these important areas.

    Passive Motion Paradigm: An Alternative to Optimal Control

    Get PDF
    In recent years, optimal control theory (OCT) has emerged as the leading approach for investigating neural control of movement and motor cognition in two complementary research lines: behavioral neuroscience and humanoid robotics. In both cases, there are general problems that need to be addressed, such as the “degrees of freedom (DoFs) problem,” the common core of production, observation, reasoning, and learning of “actions.” OCT, directly derived from engineering design techniques for control systems, quantifies task goals as “cost functions” and uses the sophisticated formal tools of optimal control to obtain desired behavior (and predictions). We propose an alternative “softer” approach, the passive motion paradigm (PMP), which we believe is closer to the biomechanics and cybernetics of action. The basic idea is that actions (overt as well as covert) are the consequences of an internal simulation process that “animates” the body schema with the attractor dynamics of force fields induced by the goal and task-specific constraints. This internal simulation offers the brain a way to dynamically link motor redundancy with task-oriented constraints “at runtime,” hence solving the “DoFs problem” without explicit kinematic inversion and cost function computation. We argue that the function of such computational machinery is not restricted to shaping motor output during action execution, but also provides the self with information on the feasibility, consequences, understanding and meaning of “potential actions.” In this sense, taking into account recent developments in neuroscience (motor imagery, the simulation theory of covert actions, the mirror neuron system) and in embodied robotics, PMP offers a novel framework for understanding motor cognition that goes beyond the engineering control paradigm provided by OCT.
The paper is therefore at the same time a review of the PMP rationale, as a computational theory, and a perspective on how to develop it for designing better cognitive architectures.
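
The attractor-dynamics idea can be made concrete: a goal-induced force field in task space is mapped to joint space through the Jacobian transpose and integrated, so the arm "relaxes" toward the goal with no kinematic inversion or cost minimization. A minimal sketch for a planar 2-link arm (illustrative gains and geometry, not taken from the paper):

```python
import math

def fk(q, l1=1.0, l2=1.0):
    # Forward kinematics of a planar 2-link arm
    return (l1 * math.cos(q[0]) + l2 * math.cos(q[0] + q[1]),
            l1 * math.sin(q[0]) + l2 * math.sin(q[0] + q[1]))

def jacobian(q, l1=1.0, l2=1.0):
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [ l1 * c1 + l2 * c12,  l2 * c12]]

def pmp_step(q, goal, k=1.0, admittance=0.5, dt=0.05):
    # Attractor force field pulling the end-effector toward the goal
    x, y = fk(q)
    fx, fy = k * (goal[0] - x), k * (goal[1] - y)
    J = jacobian(q)
    # Map the force to joint torques via J^T (no matrix inversion)
    tau = [J[0][0] * fx + J[1][0] * fy,
           J[0][1] * fx + J[1][1] * fy]
    # Passive admittance: joints drift along the induced torque field
    return [q[i] + admittance * tau[i] * dt for i in range(2)]

q, goal = [0.3, 0.6], (1.2, 0.8)
for _ in range(800):
    q = pmp_step(q, goal)
x, y = fk(q)  # end-effector has relaxed close to the goal
```

Because the field acts through the Jacobian transpose, redundancy is resolved implicitly by the dynamics rather than by an explicit inverse-kinematics solution, which is the core of the "softer" computational stance the abstract describes.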

    Noninvasive Neuroprosthetic Control of Grasping by Amputees

    Get PDF
    Smooth coordination and fine temporal control of muscles by the brain allows us to effortlessly pre-shape our hand to grasp different objects. Correlates of motor control for grasping have been found across widespread cortical areas, with diverse signal features. These signals have been harnessed by implanting intracortical electrodes and used to control the motion of robotic hands by tetraplegics, using algorithms called brain-machine interfaces (BMIs). Signatures of the brain's motor control signal encoding mechanisms in macro-scale signals such as electroencephalography (EEG) are unknown, and could potentially be used to develop noninvasive brain-machine interfaces. Here we show that a) low frequency (0.1–1 Hz) time-domain EEG contains information about grasp pre-shaping in able-bodied individuals, and b) this information can be used to control the pre-shaping motion of a robotic hand by amputees. In the first study, we recorded simultaneous EEG and hand kinematics as 5 able-bodied individuals grasped various objects. Linear decoders using low delta band EEG amplitudes accurately predicted hand pre-shaping kinematics during grasping. The correlation coefficient between predicted and actual kinematics was r = 0.59 ± 0.04, 0.47 ± 0.06 and 0.32 ± 0.05 for the first 3 synergies. In the second study, two transradial amputees (A1 and A2) controlled a prosthetic hand to grasp two objects using a closed-loop BMI with low delta band EEG. This study was conducted longitudinally in 12 sessions spread over 38 days. A1 achieved a 63% success rate, with 11 sessions significantly above chance. A2 achieved a 32% success rate, with 2 sessions significantly above chance. Previous methods of EEG-based BMIs used frequency domain features, and were thought to have a signal-to-noise ratio too low for control of dexterous tasks like grasping.
Our results demonstrate that time-domain EEG contains information about grasp pre-shaping, which can be harnessed for neuroprosthetic control.
    Department of Electrical and Computer Engineering
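
The decoding approach described is a linear map from band-limited EEG amplitudes to kinematic synergies, evaluated by the correlation between predicted and actual kinematics. A toy sketch with synthetic data (single feature, hypothetical noise level; the study used multi-channel decoders):

```python
import math
import random

def ridge_1d(x, y, lam=1e-3):
    # One-feature ridge regression: w = <x, y> / (<x, x> + lam)
    return sum(a * b for a, b in zip(x, y)) / (sum(a * a for a in x) + lam)

def pearson_r(a, b):
    # Correlation coefficient, the evaluation metric used in the abstract
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    sa = math.sqrt(sum((u - ma) ** 2 for u in a))
    sb = math.sqrt(sum((v - mb) ** 2 for v in b))
    return cov / (sa * sb)

random.seed(0)
eeg = [random.gauss(0, 1) for _ in range(500)]       # stand-in low-delta EEG feature
kin = [0.8 * e + random.gauss(0, 0.3) for e in eeg]  # synthetic kinematic synergy
w = ridge_1d(eeg, kin)          # fit the linear decoder
pred = [w * e for e in eeg]     # decode kinematics from EEG
r = pearson_r(pred, kin)
```

With multi-channel EEG the scalar `w` becomes a weight vector and the fit a regularized least-squares problem, but the decode-then-correlate evaluation is the same.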

    A topological extension of movement primitives for curvature modulation and sampling of robot motion

    Get PDF
    The version of record is available online at: https://doi.org/10.1007/s10514-021-09976-7
    This paper proposes to enrich robot motion data with trajectory curvature information. To do so, we use an approximate implementation of a topological feature named writhe, which measures the curling of a closed curve around itself, and its analog for two closed curves, namely the linking number. Although these features are established for closed curves, their definition allows for a discrete calculation that is well defined for non-closed curves and can thus provide information about how much a robot trajectory is curling around a line in space. Such lines can be predefined by a user, observed by vision or, in our case, inferred as virtual lines in space around which the robot motion is curling. We use these topological features to augment the data of a trajectory encapsulated as a Movement Primitive (MP). We propose a method to determine how many virtual segments best characterize a trajectory and then find such segments. This results in a generative model that permits modulating curvature to generate new samples, while still staying within the dataset distribution and being able to adapt to contextual variables.
    This work has been carried out within the project CLOTHILDE (“CLOTH manIpulation Learning from DEmonstrations”) funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Advanced Grant agreement No 741930). Research at IRI is also supported by the Spanish State Research Agency through the María de Maeztu Seal of Excellence to IRI (MDM-2016-0656). Peer Reviewed. Postprint (author's final draft)
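
The linking number between two curves comes from the Gauss linking integral, which discretizes into a double sum over segment pairs. A minimal midpoint-rule sketch (a generic discretization, not the paper's exact approximation), verified on a Hopf link of two interlocked circles:

```python
import math

def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def add(a, b): return (a[0] + b[0], a[1] + b[1], a[2] + b[2])
def scale(a, s): return (a[0] * s, a[1] * s, a[2] * s)
def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def gauss_linking(ca, cb):
    """Midpoint-rule discretization of the Gauss linking integral
    Lk = (1/4pi) * sum over segment pairs of (d . (t1 x t2)) / |d|^3,
    where d joins segment midpoints and t1, t2 are segment vectors.
    For open trajectories, iterate over len(curve)-1 segments
    instead of wrapping; the writhe of a single curve uses the same
    double sum over one curve, skipping adjacent segment pairs."""
    total = 0.0
    for i in range(len(ca)):
        p0, p1 = ca[i], ca[(i + 1) % len(ca)]
        mp, tp = scale(add(p0, p1), 0.5), sub(p1, p0)
        for j in range(len(cb)):
            q0, q1 = cb[j], cb[(j + 1) % len(cb)]
            mq, tq = scale(add(q0, q1), 0.5), sub(q1, q0)
            d = sub(mp, mq)
            total += dot(d, cross(tp, tq)) / dot(d, d) ** 1.5
    return total / (4 * math.pi)

# Hopf link: unit circle in the xy-plane, and a unit circle in the
# xz-plane centered at (1, 0, 0) threading through it.
N = 200
circle_a = [(math.cos(2 * math.pi * k / N), math.sin(2 * math.pi * k / N), 0.0)
            for k in range(N)]
circle_b = [(1.0 + math.cos(2 * math.pi * k / N), 0.0, math.sin(2 * math.pi * k / N))
            for k in range(N)]
lk = gauss_linking(circle_a, circle_b)  # magnitude approaches 1
```

Applied to a robot trajectory and a virtual line segment, the same sum yields the curvature-style feature the paper uses to augment Movement Primitives.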

    An integrative framework for tailoring virtual reality based motor rehabilitation after stroke

    Get PDF
    Stroke is a leading cause of life-lasting motor impairments, undermining the quality of life of stroke survivors and their families, and representing a major challenge for a world population that is ageing at a dramatic rate. Important technological developments and neuroscientific discoveries have contributed to a better understanding of stroke recovery. Virtual Reality (VR) arises as a powerful tool because it allows merging contributions from engineering, human-computer interaction, rehabilitation medicine and neuroscience to propose novel and more effective paradigms for motor rehabilitation. However, despite evidence of the benefits of these novel training paradigms, most of them still rely on the choice of particular technological solutions tailored to specific subsets of patients. Here we present an integrative framework that utilizes concepts of human-computer confluence to 1) enable VR neurorehabilitation through interface technologies, making VR rehabilitation paradigms accessible to wide populations of patients, and 2) create VR training environments that allow the personalization of training to address the individual needs of stroke patients. The use of these features is demonstrated in pilot studies using VR training environments in different configurations: as an online low-cost version, with a myoelectric robotic orthosis, and in a neurofeedback paradigm. Finally, we argue for the need to couple VR approaches with neurocomputational modelling to further study stroke and its recovery process, aiding in the design of optimal rehabilitation programs tailored to the requirements of each user.

    Programming by Demonstration on Riemannian Manifolds

    Get PDF
    This thesis presents a Riemannian approach to Programming by Demonstration (PbD). It generalizes an existing PbD method from Euclidean manifolds to Riemannian manifolds. In this abstract, we review the objectives, methods and contributions of the presented approach.

    OBJECTIVES PbD aims at providing a user-friendly method for skill transfer between human and robot. It enables a user to teach a robot new tasks using few demonstrations. In order to surpass simple record-and-replay, methods for PbD need to 'understand' what to imitate; they need to extract the functional goals of a task from the demonstration data. This is typically achieved through the application of statistical methods. The variety of data encountered in robotics is large. Typical manipulation tasks involve position, orientation, stiffness, force and torque data. These data are not solely Euclidean. Instead, they originate from a variety of manifolds, curved spaces that are only locally Euclidean. Elementary operations, such as summation, are not defined on manifolds. Consequently, standard statistical methods are not well suited to analyze demonstration data that originate from non-Euclidean manifolds. In order to effectively extract what-to-imitate, methods for PbD should take into account the underlying geometry of the demonstration manifold; they should be geometry-aware. Successful task execution does not solely depend on the control of individual task variables. By controlling variables individually, a task might fail when one is perturbed and the others do not respond. Task execution also relies on couplings among task variables. These couplings describe functional relations which are often called synergies. In order to understand what-to-imitate, PbD methods should be able to extract and encode synergies; they should be synergetic. In unstructured environments, it is unlikely that tasks are found in the same scenario twice. The circumstances under which a task is executed, the task context, are more likely to differ each time it is executed. Task context does not only vary during task execution; it also varies while learning and recognizing tasks. To be effective, a robot should be able to learn, recognize and synthesize skills in a variety of familiar and unfamiliar contexts; this can be achieved when its skill representation is context-adaptive.

    THE RIEMANNIAN APPROACH In this thesis, we present a skill representation that is geometry-aware, synergetic and context-adaptive. The presented method is probabilistic; it assumes that demonstrations are samples from an unknown probability distribution. This distribution is approximated using a Riemannian Gaussian Mixture Model (GMM). Instead of using the 'standard' Euclidean Gaussian, we rely on the Riemannian Gaussian, a distribution akin to the Gaussian but defined on a Riemannian manifold. A Riemannian manifold is a manifold (a curved space which is locally Euclidean) that provides a notion of distance. This notion is essential for statistical methods, as such methods rely on a distance measure. Examples of Riemannian manifolds in robotics are: the Euclidean space, which is used for spatial data, forces or torques; the spherical manifolds, which can be used for orientation data defined as unit quaternions; and Symmetric Positive Definite (SPD) manifolds, which can be used to represent stiffness and manipulability. The Riemannian Gaussian is intrinsically geometry-aware. Its definition is based on the geometry of the manifold, and therefore takes into account the manifold curvature. In robotics, the manifold structure is often known beforehand. In the case of PbD, it follows from the structure of the demonstration data. Like the Gaussian distribution, the Riemannian Gaussian is defined by a mean and covariance. The covariance describes the variance and correlation among the state variables. These can be interpreted as local functional couplings among state variables: synergies. This makes the Riemannian Gaussian synergetic. Furthermore, information encoded in multiple Riemannian Gaussians can be fused using the Riemannian product of Gaussians. This feature allows us to construct a probabilistic context-adaptive task representation.

    CONTRIBUTIONS In particular, this thesis presents a generalization of existing methods of PbD, namely GMM-GMR and TP-GMM. This generalization involves the definition of the Maximum Likelihood Estimate (MLE), Gaussian conditioning and the Gaussian product for the Riemannian Gaussian, and the definition of Expectation Maximization (EM) and Gaussian Mixture Regression (GMR) for the Riemannian GMM. In this generalization, we contributed by proposing to use parallel transport for Gaussian conditioning. Furthermore, we presented a unified approach to solve the aforementioned operations using a Gauss-Newton algorithm. We demonstrated how synergies, encoded in a Riemannian Gaussian, can be transformed into synergetic control policies using standard methods for the Linear Quadratic Regulator (LQR). This is achieved by formulating the LQR problem in a (Euclidean) tangent space of the Riemannian manifold. Finally, we demonstrated how the context-adaptive Task-Parameterized Gaussian Mixture Model (TP-GMM) can be used for context inference, the ability to extract context from demonstration data of known tasks. Our approach is the first attempt at context inference in the light of TP-GMM. Although effective, we showed that it requires further improvements in terms of speed and reliability. The efficacy of the Riemannian approach is demonstrated in a variety of scenarios. In shared control, the Riemannian Gaussian is used to represent the control intentions of a human operator and an assistive system. Doing so, the properties of the Gaussian can be employed to mix their control intentions. This yields shared-control systems that continuously re-evaluate and assign control authority based on input confidence. The context-adaptive TP-GMM is demonstrated in a Pick & Place task with changing pick and place locations, a box-taping task with changing box sizes, and a trajectory tracking task typically found in industry.
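
Statistics on a Riemannian manifold replace ordinary averaging with iterated logarithm and exponential maps: the mean of a Riemannian Gaussian is a Karcher/Fréchet mean. A minimal sketch on the unit sphere S² (a simple stand-in for the quaternion manifolds used for orientation; generic illustration, not the thesis code):

```python
import math

def dot(a, b): return sum(x * y for x, y in zip(a, b))
def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def add(a, b): return tuple(x + y for x, y in zip(a, b))
def scale(a, s): return tuple(x * s for x in a)
def norm(a): return math.sqrt(dot(a, a))

def log_map(mu, x):
    # Tangent vector at mu pointing toward x, of length = geodesic distance
    c = max(-1.0, min(1.0, dot(mu, x)))
    theta = math.acos(c)
    v = sub(x, scale(mu, c))  # component of x orthogonal to mu
    n = norm(v)
    return (0.0, 0.0, 0.0) if n < 1e-12 else scale(v, theta / n)

def exp_map(mu, v):
    # Follow the geodesic from mu in direction v for distance |v|
    n = norm(v)
    if n < 1e-12:
        return mu
    return add(scale(mu, math.cos(n)), scale(v, math.sin(n) / n))

def karcher_mean(points, iters=30):
    # Iteratively average in the tangent space, then map back to the sphere
    mu = points[0]
    for _ in range(iters):
        avg = (0.0, 0.0, 0.0)
        for x in points:
            avg = add(avg, scale(log_map(mu, x), 1.0 / len(points)))
        mu = exp_map(mu, avg)
    return mu

# Four unit vectors tilted symmetrically around the north pole
t = 0.5
pts = [(math.sin(t), 0.0, math.cos(t)), (-math.sin(t), 0.0, math.cos(t)),
       (0.0, math.sin(t), math.cos(t)), (0.0, -math.sin(t), math.cos(t))]
mean = karcher_mean(pts)  # converges to the north pole by symmetry
```

The covariance of a Riemannian Gaussian is then computed from the log-mapped residuals in the tangent space at this mean, which is also where operations such as Gaussian conditioning and the LQR formulation mentioned above take place.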