7,725 research outputs found
Deep Predictive Policy Training using Reinforcement Learning
Skilled robot task learning is best implemented by predictive action policies
due to the inherent latency of sensorimotor processes. However, training such
predictive policies is challenging as it involves finding a trajectory of motor
activations for the full duration of the action. We propose a data-efficient
deep predictive policy training (DPPT) framework with a deep neural network
policy architecture which maps an image observation to a sequence of motor
activations. The architecture consists of three sub-networks referred to as the
perception, policy and behavior super-layers. The perception and behavior
super-layers force an abstraction of visual and motor data trained with
synthetic and simulated training samples, respectively. The policy super-layer
is a small sub-network with fewer parameters that maps data between the
abstracted manifolds. It is trained for each task using methods for policy
search reinforcement learning. We demonstrate the suitability of the proposed
architecture and learning framework by training predictive policies for skilled
object grasping and ball throwing on a PR2 robot. The effectiveness of the
method is illustrated by the fact that these tasks are trained using only about
180 real robot attempts with qualitative terminal rewards.
Comment: This work is submitted to IEEE/RSJ International Conference on
Intelligent Robots and Systems 2017 (IROS2017)
A Sequential Two-Step Algorithm for Fast Generation of Vehicle Racing Trajectories
The problem of maneuvering a vehicle through a race course in minimum time
requires computation of both longitudinal (brake and throttle) and lateral
(steering wheel) control inputs. Unfortunately, solving the resulting nonlinear
optimal control problem is typically computationally expensive and infeasible
for real-time trajectory planning. This paper presents an iterative algorithm
that divides the path generation task into two sequential subproblems that are
significantly easier to solve. Given an initial path through the race track,
the algorithm runs a forward-backward integration scheme to determine the
minimum-time longitudinal speed profile, subject to tire friction constraints.
With this fixed speed profile, the algorithm updates the vehicle's path by
solving a convex optimization problem that minimizes the resulting path
curvature while staying within track boundaries and obeying affine,
time-varying vehicle dynamics constraints. This two-step process is repeated
iteratively until the predicted lap time no longer improves. While providing no
guarantees of convergence or a globally optimal solution, the approach performs
very well when validated on the Thunderhill Raceway course in Willows, CA. The
predicted lap time converges after four to five iterations, with each iteration
over the full 4.5 km race course requiring only thirty seconds of computation
time on a laptop computer. The resulting trajectory is experimentally driven at
the race circuit with an autonomous Audi TTS test vehicle, and the resulting
lap time and racing line are comparable to both a nonlinear gradient descent
solution and a trajectory recorded from a professional racecar driver. The
experimental results indicate that the proposed method is a viable option for
online trajectory planning in the near future.
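
The forward-backward integration step described in this abstract can be illustrated with a minimal sketch. It assumes a point-mass vehicle with a simple friction-circle limit, which is a simplification and not necessarily the tire model used in the paper. The function name `speed_profile` and the parameters `mu`, `g`, `ds`, and `v_cap` are hypothetical and chosen only for illustration. The convex path-update step is not shown.

```python
import math

def speed_profile(curvature, ds, mu=0.9, g=9.81, v_cap=60.0):
    """Forward-backward speed pass for a point mass (illustrative assumption).

    curvature: path curvature [1/m] at evenly spaced samples along the path
    ds:        spacing between samples [m]
    mu, g:     assumed friction coefficient and gravity
    v_cap:     assumed top speed [m/s] used on straight segments
    Returns a list of speeds [m/s], one per sample.
    """
    a_max = mu * g  # total acceleration available from friction

    # Pointwise cornering limit: v^2 * |kappa| <= mu * g
    v = [v_cap if k == 0 else min(v_cap, math.sqrt(a_max / abs(k)))
         for k in curvature]

    def reachable(v_from, k_from):
        # Longitudinal acceleration left over after cornering demand
        a_lat = v_from ** 2 * abs(k_from)
        a_long = math.sqrt(max(a_max ** 2 - a_lat ** 2, 0.0))
        return math.sqrt(v_from ** 2 + 2.0 * a_long * ds)

    # Forward pass: limit how fast the vehicle can accelerate out of corners
    for i in range(len(v) - 1):
        v[i + 1] = min(v[i + 1], reachable(v[i], curvature[i]))

    # Backward pass: limit how late the vehicle can brake into corners
    for i in range(len(v) - 1, 0, -1):
        v[i - 1] = min(v[i - 1], reachable(v[i], curvature[i]))

    return v
```

For a straight segment with a single tight corner in the middle, for example `speed_profile([0, 0, 0, 0.1, 0, 0, 0], ds=5.0)`, the profile dips to the cornering limit at the corner and rises on either side, with the rise bounded by the assumed braking and acceleration limits.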
Autonomous Visual Servo Robotic Capture of Non-cooperative Target
This doctoral research develops and validates experimentally a vision-based control scheme for the autonomous capture of a non-cooperative target by robotic manipulators for active space debris removal and on-orbit servicing. It is focused on the final capture stage by robotic manipulators after the orbital rendezvous and proximity maneuver have been completed. Two challenges have been identified and investigated in this stage: the dynamic estimation of the non-cooperative target and the autonomous visual servo robotic control. First, an integrated algorithm of photogrammetry and extended Kalman filter is proposed for the dynamic estimation of the non-cooperative target because it is unknown in advance. To improve the stability and precision of the algorithm, the extended Kalman filter is enhanced by dynamically correcting the distribution of the process noise of the filter. Second, the concept of incremental kinematic control is proposed to avoid the multiple solutions in solving the inverse kinematics of robotic manipulators. The proposed target motion estimation and visual servo control algorithms are validated experimentally by a custom-built visual servo manipulator-target system. Electronic hardware for the robotic manipulator and computer software for the visual servo are custom designed and developed. The experimental results demonstrate the effectiveness and advantages of the proposed vision-based robotic control for the autonomous capture of a non-cooperative target. Furthermore, a preliminary study is conducted for future extension of the robotic control with consideration of flexible joints.
Predictive Dynamic Simulation of Healthy Sit-to-Stand Movement
This thesis situates itself at the intersection of biomedical modelling and predictive simulation to synthesize healthy human sit-to-stand movement. While the importance of sit-to-stand to physical and social well-being is known, the reasons why and how people come to perform sit-to-stand the way they do are largely unknown. This thesis establishes the determinants of sit-to-stand in healthy people so that future researchers may investigate the effects of compromised health on sit-to-stand and then explore means of intervening to preserve and restore this motion.
Previous researchers have predicted how a person rises from seated. However, aspects of their models, most commonly contact and muscle models, are biomechanically inconsistent and restrict their application. These researchers also have not validated their prediction results.
To address these limitations and further the study of sit-to-stand prediction, the underlying themes of this thesis are in biomechanical modelling, predictive simulation, and validation. The goal of predicting sit-to-stand inspired the creation of three new models: a model of biomechanics, a model of motion, and performance criteria as a model of preference. First, the human is represented as three rigid links in the sagittal plane. As buttocks are kinetically important to sit-to-stand, a new constitutive model of buttocks is made from experimental force-deformation data. Ten muscles responsible for flexion and extension of the hips, knees, and ankles are defined in the model. Second, candidate sit-to-stand trajectories are described geometrically by a set of Bézier curves, for the first time. Third, with the assumption that healthy people naturally prioritize mechanical efficiency, disinclination to a motion is described as a cost function of joint torques, muscle stresses, and physical infeasibility including slipping and falling.
This new dynamic optimization routine allows for motions of gradually increasing complexity, by adding control points to the Bézier curves, while the model's performance is improving. By comparing the predictive simulation results to normative sit-to-stand as described in the literature, for the first time, it is possible to say that the use of these models and optimal control strategy together has produced motions characteristic of healthy sit-to-stand. This work bridges the gap between predictive simulation results and experimental human results and in doing so establishes a benchmark in sit-to-stand prediction. In predicting healthy sit-to-stand, it makes a necessary step toward predicting pathological sit-to-stand, and then to predicting the results of intervention to inform medical design and planning.
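
The abstract does not say how control points are added to the Bézier curves. One standard technique that fits the description is degree elevation, which adds a control point without changing the curve, so an optimizer could continue from the current motion with one more degree of freedom. The sketch below assumes scalar control values (for example, a joint angle over normalized time). The function names `bezier_point` and `elevate_degree` are hypothetical and are not taken from the thesis.

```python
def bezier_point(ctrl, t):
    """Evaluate a Bezier curve at t in [0, 1] with De Casteljau's algorithm."""
    pts = list(ctrl)
    while len(pts) > 1:
        # Repeated linear interpolation between neighbouring points
        pts = [(1.0 - t) * a + t * b for a, b in zip(pts, pts[1:])]
    return pts[0]

def elevate_degree(ctrl):
    """Return control points of the same curve expressed one degree higher.

    Assumed illustration only: the thesis may add control points differently.
    """
    n = len(ctrl) - 1  # current degree
    new = [ctrl[0]]    # first control point is unchanged
    for i in range(1, n + 1):
        w = i / (n + 1)
        new.append(w * ctrl[i - 1] + (1.0 - w) * ctrl[i])
    new.append(ctrl[-1])  # last control point is unchanged
    return new
```

With control values `[0.0, 1.0, 0.5, 2.0]`, `elevate_degree` returns five control values that trace the same trajectory, and any of them can then be adjusted to refine the motion.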