Probabilistic movement modeling for intention inference in human-robot interaction
Intention inference can be an essential step toward efficient human-robot interaction. For this purpose, we propose the Intention-Driven Dynamics Model (IDDM) to probabilistically model the generative process of movements that are directed by the intention. The IDDM allows inferring the intention from observed movements using Bayes' theorem. The IDDM simultaneously finds a latent state representation of noisy and high-dimensional observations and models the intention-driven dynamics in the latent states. As most robotics applications are subject to real-time constraints, we develop an efficient online algorithm that allows for real-time intention inference. Two human-robot interaction scenarios, i.e., target prediction for robot table tennis and action recognition for interactive humanoid robots, are used to evaluate the performance of our inference algorithm. In both intention inference tasks, the proposed algorithm achieves substantial improvements over support vector machines and Gaussian processes.
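The Bayes'-theorem step that the IDDM relies on can be sketched in a few lines. This is a hedged toy illustration, not the paper's model: the two candidate intentions, the priors, and the likelihood values below are all illustrative assumptions, and the real IDDM computes the likelihoods from its latent dynamics model rather than taking them as given.

```python
# Toy sketch of the Bayesian inference step behind intention inference:
# given per-intention likelihoods of an observed movement, apply Bayes'
# theorem to obtain a posterior over intentions. All numbers here are
# illustrative assumptions, not values from the paper.

def intention_posterior(priors, likelihoods):
    """Return p(intention | movement) via Bayes' theorem."""
    unnormalized = {i: priors[i] * likelihoods[i] for i in priors}
    z = sum(unnormalized.values())  # evidence p(movement)
    return {i: v / z for i, v in unnormalized.items()}

# Example: two candidate targets for an incoming table-tennis ball.
priors = {"left": 0.5, "right": 0.5}
likelihoods = {"left": 0.2, "right": 0.6}  # p(movement | intention)
posterior = intention_posterior(priors, likelihoods)
# posterior["right"] == 0.75
```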
Optimal Stroke Learning with Policy Gradient Approach for Robotic Table Tennis
Learning to play table tennis is a challenging task for robots, as a wide variety of strokes is required. Recent advances have shown that deep Reinforcement Learning (RL) is able to successfully learn the optimal actions in a simulated environment. However, the applicability of RL in real scenarios remains limited due to the high exploration effort. In this work, we propose a realistic simulation environment in which multiple models are built for the dynamics of the ball and the kinematics of the robot. Instead of training an end-to-end RL model, a novel policy gradient approach with a TD3 backbone is proposed to learn the racket strokes based on the predicted state of the ball at the hitting time. In the experiments, we show that the proposed approach significantly outperforms the existing RL methods in simulation. Furthermore, to cross the domain from simulation to reality, we adopt an efficient retraining method and test it in three real scenarios. The resulting success rate is 98% and the distance error is around 24.9 cm. The total training time is about 1.5 hours.
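A core ingredient of such a simulation environment is a model of the ball's flight. The following is a minimal sketch of the kind of dynamics model it might contain, under strong assumptions: a point mass under gravity and quadratic air drag, integrated with explicit Euler until the ball crosses a hitting plane. The drag coefficient, step size, and hitting-plane location are illustrative, not values from the paper.

```python
# Minimal sketch of a ball-dynamics model: a point mass under gravity
# and quadratic air drag, integrated with explicit Euler. The drag
# coefficient KD, the step size, and the hitting plane are assumptions.
import math

G = 9.81   # gravity, m/s^2
KD = 0.1   # assumed lumped drag coefficient, 1/m

def step(pos, vel, dt=0.002):
    """Advance the ball state (pos, vel) by one Euler step."""
    speed = math.sqrt(sum(v * v for v in vel))
    acc = [-KD * speed * v for v in vel]  # quadratic drag opposes motion
    acc[2] -= G                           # gravity acts on the z axis
    new_pos = [p + v * dt for p, v in zip(pos, vel)]
    new_vel = [v + a * dt for v, a in zip(vel, acc)]
    return new_pos, new_vel

def state_at_plane(pos, vel, x_hit):
    """Integrate until the ball crosses the assumed hitting plane x = x_hit."""
    while pos[0] < x_hit:
        pos, vel = step(pos, vel)
    return pos, vel

# Predict the ball state at the hitting plane for one launch condition.
pos, vel = state_at_plane([0.0, 0.0, 0.3], [4.0, 0.0, 1.0], x_hit=1.0)
```

A policy can then be trained on the predicted state at the hitting plane instead of on raw trajectories, which is the separation the abstract describes.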
Jointly learning trajectory generation and hitting point prediction in robot table tennis
This paper proposes a combined learning framework for a table tennis robot. In a typical robot table tennis setup, a single striking point is predicted for the robot on the basis of the ball's initial state. Subsequently, the desired Cartesian racket state and the desired joint states at the striking time are determined. Finally, robot joint trajectories are generated. Instead of predicting a single striking point, we propose to construct a ball trajectory prediction map, which predicts the ball's entire rebound trajectory using the ball's initial state. We also construct a robot trajectory generation map, which predicts the robot joint movement pattern and the movement duration using the Cartesian racket trajectories without the need for inverse kinematics, where a correlation function is used to adapt these joint movement parameters according to the ball flight trajectory. With joint movement parameters, we can directly generate joint trajectories. Additionally, we introduce a reinforcement learning approach to modify robot joint trajectories such that the robot can return balls well. We validate this new framework in both the simulated and the real robotic systems and illustrate that a seven degree-of-freedom Barrett WAM robot performs well.
Sample-efficient Reinforcement Learning in Robotic Table Tennis
Reinforcement learning (RL) has achieved some impressive recent successes in various computer games and simulations. Most of these successes are based on having large numbers of episodes from which the agent can learn. In typical robotic applications, however, the number of feasible attempts is very limited. In this paper we present a sample-efficient RL algorithm applied to the example of a table tennis robot. In table tennis every stroke is different, with varying placement, speed and spin. An accurate return therefore has to be found depending on a high-dimensional continuous state space. To make learning in few trials possible, the method is embedded into our robot system. In this way we can use a one-step environment. The state space depends on the ball at hitting time (position, velocity, spin) and the action is the racket state (orientation, velocity) at hitting. An actor-critic based deterministic policy gradient algorithm was developed for accelerated learning. Our approach performs competitively both in a simulation and on the real robot in a number of challenging scenarios. Accurate results are obtained without pre-training in under episodes of training. The video presenting our experiments is available at https://youtu.be/uRAtdoL6Wpw.
Comment: accepted at ICRA 2021 (Xi'an, China)
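The deterministic policy-gradient update in a one-step environment can be illustrated with a deliberately tiny example. This is a hedged sketch, not the paper's algorithm: the environment here gives reward r(s, a) = -(a - 2s)^2, so the optimal one-step action is a* = 2s, the actor is a single linear gain a = w*s, and the critic gradient is computed analytically instead of being learned.

```python
# Toy sketch of a one-step deterministic policy-gradient update.
# Reward is r(s, a) = -(a - 2s)^2, so the optimal gain is w = 2.
# The linear actor and the analytic critic gradient are assumptions
# made for clarity; the real method learns a critic from data.

def dq_da(s, a):
    # gradient of Q(s, a) = -(a - 2s)^2 with respect to the action
    return -2.0 * (a - 2.0 * s)

def train(w=0.0, lr=0.05, steps=200):
    states = [0.5, 1.0, 1.5, 2.0]   # a few illustrative ball states
    for _ in range(steps):
        for s in states:
            a = w * s                        # deterministic actor
            # chain rule: dQ/dw = dQ/da * da/dw, with da/dw = s
            w += lr * dq_da(s, a) * s
    return w

w = train()   # converges toward the optimal gain w = 2
```

Because each episode is a single state-action-reward triple, no bootstrapping over future time steps is needed, which is what makes the one-step formulation so sample-efficient.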
Robotic Table Tennis: A Case Study into a High Speed Learning System
We present a deep-dive into a real-world robotic learning system that, in previous work, was shown to be capable of hundreds of table tennis rallies with a human and has the ability to precisely return the ball to desired targets. This system puts together a highly optimized perception subsystem, a high-speed low-latency robot controller, a simulation paradigm that can prevent damage in the real world and also train policies for zero-shot transfer, and automated real-world environment resets that enable autonomous training and evaluation on physical robots. We complement a complete system description, including numerous design decisions that are typically not widely disseminated, with a collection of studies that clarify the importance of mitigating various sources of latency, accounting for training and deployment distribution shifts, robustness of the perception system, sensitivity to policy hyper-parameters, and choice of action space. A video demonstrating the components of the system and details of experimental results can be found at https://youtu.be/uFcnWjB42I0.
Comment: Published and presented at Robotics: Science and Systems (RSS 2023)
Hierarchical policy design for sample-efficient learning of robot table tennis through self-play
Training robots with physical bodies requires developing new methods and action representations that allow the learning agents to explore the space of policies efficiently. This work studies sample-efficient learning of complex policies in the context of robot table tennis. It incorporates learning into a hierarchical control framework using a model-free strategy layer (which requires complex reasoning about opponents that is difficult to do in a model-based way), model-based prediction of external objects (which are difficult to control directly with analytic control methods, but governed by learnable and relatively simple laws of physics), and analytic controllers for the robot itself. Human demonstrations are used to train dynamics models, which together with the analytic controller allow any physically capable robot to play table tennis without training episodes. Using only about 7000 demonstrated trajectories, a striking policy can hit ball targets with about 20 cm error. Self-play is used to train cooperative and adversarial strategies on top of model-based striking skills trained from human demonstrations. After only about 24000 strikes in self-play the agent learns to best exploit the human dynamics models for longer cooperative games. Further experiments demonstrate that more flexible variants of the policy can discover new strikes not demonstrated by humans and achieve higher performance at the expense of lower sample-efficiency. Experiments are carried out in a virtual reality environment using sensory observations that are obtainable in the real world. The high sample-efficiency demonstrated in the evaluations shows that the proposed method is suitable for learning directly on physical robots without transfer of models or policies from simulation.
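The three-layer hierarchy described above can be pictured as composed functions: a model-free strategy layer picks a target, a learned ball model predicts where the ball can be struck, and an analytic controller turns both into a robot command. The sketch below is entirely hypothetical; the function names, the toy aiming rule, and the 1-D linear ball model are assumptions made only to show the composition.

```python
# Hypothetical sketch of the hierarchical control framework: each layer
# is a function, and the layers compose. All interfaces, the toy aiming
# rule, and the 1-D ball model are illustrative assumptions.

def strategy(opponent_position):
    """Model-free layer: aim away from the opponent (toy rule)."""
    return -1.0 if opponent_position > 0 else 1.0

def predict_hit_point(ball_state):
    """Model-based layer: toy linear extrapolation of the ball."""
    x, vx = ball_state
    t_hit = (1.0 - x) / vx          # time to reach the hitting plane x = 1
    return 1.0, t_hit

def analytic_controller(hit_point, t_hit, target):
    """Analytic layer: toy racket command reaching hit_point at t_hit."""
    return {"x": hit_point, "t": t_hit, "aim": target}

cmd = analytic_controller(*predict_hit_point((0.0, 2.0)),
                          strategy(opponent_position=0.5))
# cmd == {"x": 1.0, "t": 0.5, "aim": -1.0}
```

Learning is then confined to the layers where it pays off (strategy and prediction), while the robot itself stays under analytic control, which is what keeps the sample requirements low.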
Black-Box vs. Gray-Box: A Case Study on Learning Table Tennis Ball Trajectory Prediction with Spin and Impacts
In this paper, we present a method for table tennis ball trajectory filtering and prediction. Our gray-box approach builds on a physical model. At the same time, we use data to learn parameters of the dynamics model, of an extended Kalman filter, and of a neural model that infers the ball's initial condition. We demonstrate superior prediction performance of our approach over two black-box approaches, which are not supplied with physical prior knowledge. We demonstrate that initializing the spin from parameters of the ball launcher using a neural network drastically improves long-time prediction performance over estimating the spin purely from measured ball positions. An accurate prediction of the ball trajectory is crucial for successful returns. We therefore evaluate the return performance with a pneumatic artificial muscle robot and achieve a return rate of 29/30 (96.7%).
Comment: accepted for publication at the 5th Annual Conference on Learning for Dynamics and Control (L4DC) 2023. With supplementary material
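The gray-box idea (keep a physical model, learn its parameters from data) can be illustrated on a deliberately tiny 1-D problem. This sketch is a hedged simplification of the approach: the drag-only model v' = -k*v, the synthetic measurements, and the least-squares fit below are assumptions for illustration, far simpler than the paper's full dynamics model and extended Kalman filter.

```python
# Hedged sketch of the gray-box idea: keep a physical model (here,
# 1-D exponential velocity decay v' = -k*v) and learn its parameter k
# from data, instead of learning the whole dynamics black-box. The
# synthetic data and the model are illustrative assumptions.
import math

def fit_drag(velocities, dt):
    """Least-squares fit of k in v' = -k*v from consecutive samples."""
    num = sum(-(v2 - v1) / dt * v1
              for v1, v2 in zip(velocities, velocities[1:]))
    den = sum(v1 * v1 for v1 in velocities[:-1])
    return num / den

def predict(v0, k, t):
    """Closed-form solution of v' = -k*v for long-horizon prediction."""
    return v0 * math.exp(-k * t)

# Synthetic 'measurements' generated with a true k of 0.3:
dt = 0.01
vs = [5.0 * math.exp(-0.3 * i * dt) for i in range(50)]
k_hat = fit_drag(vs, dt)   # recovers roughly k = 0.3
```

Because the learned parameter plugs back into a closed-form physical model, predictions stay sensible far beyond the data, which is the advantage the paper reports over purely black-box predictors.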