Deep Predictive Policy Training using Reinforcement Learning

Authors: Björkman Mårten, Ghadirzadeh Ali, Kragic Danica, Maki Atsuto
Publication date: 02/03/2017

Skilled robot task learning is best implemented by predictive action policies due to the inherent latency of sensorimotor processes. However, training such predictive policies is challenging as it involves finding a trajectory of motor activations for the full duration of the action. We propose a data-efficient deep predictive policy training (DPPT) framework with a deep neural network policy architecture which maps an image observation to a sequence of motor activations. The architecture consists of three sub-networks referred to as the perception, policy and behavior super-layers. The perception and behavior super-layers force an abstraction of visual and motor data trained with synthetic and simulated training samples, respectively. The policy super-layer is a small sub-network with fewer parameters that maps data between the abstracted manifolds. It is trained for each task using methods for policy search reinforcement learning. We demonstrate the suitability of the proposed architecture and learning framework by training predictive policies for skilled object grasping and ball throwing on a PR2 robot. The effectiveness of the method is illustrated by the fact that these tasks are trained using only about 180 real robot attempts with qualitative terminal rewards.
Comment: This work is submitted to the IEEE/RSJ International Conference on Intelligent Robots and Systems 2017 (IROS 2017).

arXiv.org e-Print Archive

Crossref

Gaussian Processes for Learning in Motion Control: Applied to Semiconductor Back-End Machines

Author: Poot Maurice Mathias
Publication venue: Eindhoven University of Technology
Publication date: 03/07/2024

Pure OAI Repository

A Sequential Two-Step Algorithm for Fast Generation of Vehicle Racing Trajectories

Authors: Gerdes J Christian, Kapania Nitin R., Subosits John
Publication venue: ASME International
Publication date: 01/02/2019

The problem of maneuvering a vehicle through a race course in minimum time requires computation of both longitudinal (brake and throttle) and lateral (steering wheel) control inputs. Unfortunately, solving the resulting nonlinear optimal control problem is typically computationally expensive and infeasible for real-time trajectory planning. This paper presents an iterative algorithm that divides the path generation task into two sequential subproblems that are significantly easier to solve. Given an initial path through the race track, the algorithm runs a forward-backward integration scheme to determine the minimum-time longitudinal speed profile, subject to tire friction constraints. With this fixed speed profile, the algorithm updates the vehicle's path by solving a convex optimization problem that minimizes the resulting path curvature while staying within track boundaries and obeying affine, time-varying vehicle dynamics constraints. This two-step process is repeated iteratively until the predicted lap time no longer improves. While providing no guarantees of convergence or a globally optimal solution, the approach performs very well when validated on the Thunderhill Raceway course in Willows, CA. The predicted lap time converges after four to five iterations, with each iteration over the full 4.5 km race course requiring only thirty seconds of computation time on a laptop computer. The resulting trajectory is experimentally driven at the race circuit with an autonomous Audi TTS test vehicle, and the resulting lap time and racing line are comparable to both a nonlinear gradient descent solution and a trajectory recorded from a professional racecar driver. The experimental results indicate that the proposed method is a viable option for online trajectory planning in the near future.
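
The forward-backward integration step described in this abstract can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes a point-mass vehicle, a friction-circle limit of mu * g on total acceleration, and a path given as curvature samples at uniform spacing. The names `speed_profile` and `lap_time` and the parameters `mu`, `v_top`, and `ds` are hypothetical.

```python
import math

def speed_profile(curvature, ds, mu=0.9, g=9.81, v_top=60.0):
    """Friction-limited speed at each path sample (point-mass assumption)."""
    a_max = mu * g  # total acceleration available from the tires
    # Step 1: steady-state cornering limit, v^2 * |kappa| <= a_max.
    v = [min(v_top, math.sqrt(a_max / abs(k))) if abs(k) > 1e-9 else v_top
         for k in curvature]
    n = len(v)
    # Step 2: forward pass, limit acceleration out of each sample.
    for i in range(n - 1):
        a_lat = v[i] ** 2 * abs(curvature[i])
        a_long = math.sqrt(max(a_max ** 2 - a_lat ** 2, 0.0))
        v[i + 1] = min(v[i + 1], math.sqrt(v[i] ** 2 + 2.0 * a_long * ds))
    # Step 3: backward pass, limit braking into each sample.
    for i in range(n - 1, 0, -1):
        a_lat = v[i] ** 2 * abs(curvature[i])
        a_long = math.sqrt(max(a_max ** 2 - a_lat ** 2, 0.0))
        v[i - 1] = min(v[i - 1], math.sqrt(v[i] ** 2 + 2.0 * a_long * ds))
    return v

def lap_time(v, ds):
    """Travel time using the mean speed over each segment."""
    return sum(2.0 * ds / (a + b) for a, b in zip(v, v[1:]))

# Hypothetical path: 50 m straight, 10 m corner of radius 20 m, 50 m straight.
kappa = [0.0] * 50 + [0.05] * 10 + [0.0] * 50
profile = speed_profile(kappa, ds=1.0)
total_time = lap_time(profile, ds=1.0)
```

In the two-step scheme from the abstract, a profile like this would be held fixed while the path is updated by the convex curvature-minimization step, and the two steps would alternate until the predicted lap time stops improving.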

arXiv.org e-Print Archive

CiteSeerX

Flexibility and robustness in iterative learning control: with applications to industrial printers

Author: Bolder J.J.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2015

Repository TU/e

Pure OAI Repository

Autonomous Visual Servo Robotic Capture of Non-cooperative Target

Author: Dong Gangqi
Publication date: 27/07/2017

This doctoral research develops and experimentally validates a vision-based control scheme for the autonomous capture of a non-cooperative target by robotic manipulators for active space debris removal and on-orbit servicing. It focuses on the final capture stage by robotic manipulators, after the orbital rendezvous and proximity maneuvers have been completed. Two challenges have been identified and investigated in this stage: the dynamic estimation of the non-cooperative target and the autonomous visual servo robotic control. First, an integrated algorithm of photogrammetry and extended Kalman filter is proposed for the dynamic estimation of the non-cooperative target because it is unknown in advance. To improve the stability and precision of the algorithm, the extended Kalman filter is enhanced by dynamically correcting the distribution of the process noise of the filter. Second, the concept of incremental kinematic control is proposed to avoid the multiple solutions that arise in solving the inverse kinematics of robotic manipulators. The proposed target motion estimation and visual servo control algorithms are validated experimentally by a custom-built visual servo manipulator-target system. Electronic hardware for the robotic manipulator and computer software for the visual servo are custom designed and developed. The experimental results demonstrate the effectiveness and advantages of the proposed vision-based robotic control for the autonomous capture of a non-cooperative target. Furthermore, a preliminary study is conducted for future extension of the robotic control with consideration of flexible joints.

YorkSpace

Predictive Dynamic Simulation of Healthy Sit-to-Stand Movement

Author: Norman-Gerum Valerie
Publication venue: University of Waterloo
Publication date: 15/05/2019

This thesis situates itself at the intersection of biomedical modelling and predictive simulation to synthesize healthy human sit-to-stand movement. While the importance of sit-to-stand to physical and social well-being is known, why and how people come to perform sit-to-stand the way they do is largely unknown. This thesis establishes the determinants of sit-to-stand in healthy people so that future researchers may investigate the effects of compromised health on sit-to-stand and then explore means of intervening to preserve and restore this motion. Previous researchers have predicted how a person rises from seated. However, aspects of their models, most commonly contact and muscle models, are biomechanically inconsistent and restrict their application. These researchers also have not validated their prediction results. To address these limitations and further the study of sit-to-stand prediction, the underlying themes of this thesis are biomechanical modelling, predictive simulation, and validation. The goal of predicting sit-to-stand inspired the creation of three new models: a model of biomechanics, a model of motion, and performance criteria as a model of preference. First, the human is represented as three rigid links in the sagittal plane. As buttocks are kinetically important to sit-to-stand, a new constitutive model of buttocks is made from experimental force-deformation data. Ten muscles responsible for flexion and extension of the hips, knees, and ankles are defined in the model. Second, candidate sit-to-stand trajectories are described geometrically by a set of Bézier curves, for the first time. Third, with the assumption that healthy people naturally prioritize mechanical efficiency, disinclination to a motion is described as a cost function of joint torques, muscle stresses, and physical infeasibility including slipping and falling.
This new dynamic optimization routine allows for motions of gradually increasing complexity, by adding control points to the Bézier curves, while the model's performance is improving. By comparing the predictive simulation results to normative sit-to-stand as described in the literature, it is possible to say, for the first time, that the use of these models and optimal control strategy together has produced motions characteristic of healthy sit-to-stand. This work bridges the gap between predictive simulation results and experimental human results and in doing so establishes a benchmark in sit-to-stand prediction. In predicting healthy sit-to-stand, it makes a necessary step toward predicting pathological sit-to-stand, and then to predicting the results of intervention to inform medical design and planning.
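
The idea of describing a trajectory with a Bézier curve and then adding control points can be sketched as follows. The abstract does not say how control points are added; this sketch assumes standard degree elevation, which inserts one control point without changing the curve, so an optimizer could continue from the same motion with one more degree of freedom. The function names and the sample joint-angle values are hypothetical.

```python
def bezier(control, t):
    """Evaluate a Bezier curve at t in [0, 1] with de Casteljau's algorithm."""
    pts = list(control)
    while len(pts) > 1:
        # Repeated linear interpolation between neighbouring points.
        pts = [(1.0 - t) * a + t * b for a, b in zip(pts, pts[1:])]
    return pts[0]

def elevate(control):
    """Return control points of the same curve expressed one degree higher."""
    n = len(control) - 1  # current degree
    out = [control[0]]
    for i in range(1, n + 1):
        w = i / (n + 1)
        out.append(w * control[i - 1] + (1.0 - w) * control[i])
    out.append(control[-1])
    return out

# Hypothetical knee-angle trajectory (radians) over normalized time.
knee = [1.6, 1.9, 0.4, 0.0]
knee_finer = elevate(knee)  # five control points, identical curve
mid_angle = bezier(knee, 0.5)
```

Because the elevated curve matches the original exactly, any improvement found after adding the control point comes from the extra freedom rather than from a change in the starting motion.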

University of Waterloo's Institutional Repository