208 research outputs found
Optimal Stroke Learning with Policy Gradient Approach for Robotic Table Tennis
Learning to play table tennis is a challenging task for robots, as a wide
variety of strokes is required. Recent advances have shown that deep Reinforcement
Learning (RL) is able to successfully learn the optimal actions in a simulated
environment. However, the applicability of RL in real scenarios remains limited
due to the high exploration effort. In this work, we propose a realistic
simulation environment in which multiple models are built for the dynamics of
the ball and the kinematics of the robot. Instead of training an end-to-end RL
model, a novel policy gradient approach with TD3 backbone is proposed to learn
the racket strokes based on the predicted state of the ball at the hitting
time. In the experiments, we show that the proposed approach significantly
outperforms the existing RL methods in simulation. Furthermore, to cross the
domain from simulation to reality, we adopt an efficient retraining method and
test it in three real scenarios. The resulting success rate is 98% and the
distance error is around 24.9 cm. The total training time is about 1.5 hours
Adaptive Robot Systems in Highly Dynamic Environments: A Table Tennis Robot
Background: Robotic table tennis systems offer an ideal platform for pushing camera-based robotic manipulation systems to the limit. The unique challenge arises from the fast-paced play and the wide variation in spin and speed between strokes. The range of scenarios under which existing table tennis robots are able to operate is, however, limited, requiring slow play with low rotational velocity of the ball (spin).
Research Goal: We aim to develop a table tennis robot system with learning capabilities able to handle spin against a human opponent.
Methods: The robot system presented in this thesis consists of six components: ball position detection, ball spin detection, ball trajectory prediction, stroke parameter suggestion, robot trajectory generation, and robot control.
For ball detection, the camera images pass through a conventional image processing pipeline. The ball's 3D positions are determined using iterative triangulation and these are then used to estimate the current ball state (position and velocity).
We propose three methods for estimating the spin. The first two methods estimate spin by analyzing the movement of the logo printed on the ball on high-resolution images using either conventional computer vision or convolutional neural networks. The final approach involves analyzing the trajectory of the ball using Magnus force fitting. Once the ball's position, velocity, and spin are known, the future trajectory is predicted by forward-solving a physical ball model involving gravitational, drag, and Magnus forces.
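The forward-solving step described above can be illustrated with a minimal sketch. It assumes a point-mass ball whose acceleration is gravity, plus a drag term proportional to speed times velocity, plus a Magnus term proportional to the cross product of spin and velocity. The coefficient values, the function names, and the semi-implicit Euler integrator are illustrative assumptions, not taken from the thesis.

```python
import math

# Illustrative constants (assumptions, not values from the thesis).
GRAVITY = (0.0, 0.0, -9.81)  # m/s^2, world frame with z pointing up
K_DRAG = 0.14                # 1/m, lumped drag coefficient (assumed)
K_MAGNUS = 0.008             # lumped Magnus coefficient (assumed)


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def acceleration(vel, spin):
    """Gravity + drag (opposes velocity) + Magnus (spin x velocity)."""
    speed = math.sqrt(sum(c * c for c in vel))
    magnus = cross(spin, vel)
    return tuple(
        GRAVITY[i] - K_DRAG * speed * vel[i] + K_MAGNUS * magnus[i]
        for i in range(3)
    )


def predict(pos, vel, spin, duration, dt=0.001):
    """Integrate the ball state forward by `duration` seconds.

    pos in m, vel in m/s, spin in rad/s (treated as constant in flight).
    Uses semi-implicit Euler: velocity is updated first, then position.
    """
    steps = int(round(duration / dt))
    for _ in range(steps):
        acc = acceleration(vel, spin)
        vel = tuple(v + a * dt for v, a in zip(vel, acc))
        pos = tuple(p + v * dt for p, v in zip(pos, vel))
    return pos, vel
```

For a ball travelling along +x, a spin vector along +y corresponds to topspin in this convention, so `predict((0, 0, 0.3), (5, 0, 1), (0, 100, 0), 0.2)` ends lower than the same call with zero spin. A real system would also need a bounce model for table contact, which this sketch omits.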
With the predicted ball state at hitting time as state input, we train a reinforcement learning algorithm to suggest the racket state at hitting time (stroke parameter). We use the Reflexxes library to generate a robot trajectory to achieve the suggested racket state.
Results: Quantitative evaluation showed that all system components achieve results as good as or better than comparable robots. Regarding the research goal of this thesis, the robot was able to
- maintain stable counter-hitting rallies of up to 60 balls with a human player,
- return balls with different spin types (topspin and backspin) in the same rally,
- learn multiple table tennis drills in just 200 strokes or fewer.
Conclusion: Our spin detection system and reinforcement learning-based stroke parameter suggestion introduce significant algorithmic novelties. In contrast to previous work, our robot succeeds in more difficult spin scenarios and drills.
Finite-time extended state observer and fractional-order sliding mode controller for impulsive hybrid port-Hamiltonian systems with input delay and actuators saturation: Application to ball-juggler robots
This paper addresses the robust control problem of mechanical systems with hybrid dynamics in port-Hamiltonian form. It is assumed that only the position states are measurable, and that time delay and a saturation constraint affect the control signal. An extended state observer is designed after a coordinate transformation. The effect of the time delay in the control signal is neutralized by applying a Padé approximant and augmenting the system states. An assistant system with faster convergence is developed to handle actuator saturation. A fractional-order sliding mode controller acts as a centralized controller and compensates for the undesired effects of unknown external disturbance and parameter uncertainties using the observer estimation results. Stability analysis shows that the closed-loop system states, such as the observer tracking error and the position/velocity tracking errors, are finite-time stable. Simulation studies on a two ball-playing juggler robot with three degrees of freedom validate the effectiveness of the theoretical results.
Robotic Ball Catching with an Eye-in-Hand Single-Camera System
In this paper, a unified control framework is proposed to realize a robotic ball catching task with only a moving single-camera (eye-in-hand) system able to catch flying, rolling, and bouncing balls in the same formalism. The thrown ball is visually tracked through a circle detection algorithm. Once the ball is recognized, the camera is forced to follow a baseline in the space so as to acquire an initial dataset of visual measurements. A first estimate of the catching point is initially provided through a linear algorithm. Then, additional visual measurements are acquired to constantly refine the current estimate by exploiting a nonlinear optimization algorithm and a more accurate ballistic model. A classic partitioned visual servoing approach is employed to control the translational and rotational components of the camera differently. Experimental results performed on an industrial robotic system prove the effectiveness of the presented solution. A motion-capture system is employed to validate the proposed estimation process via ground truth
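To illustrate why a first catching-point estimate can come from a linear algorithm: under a drag-free ballistic model, position is linear in the unknown initial position and velocity, so both can be fitted by ordinary least squares. The sketch below is a simplification under stated assumptions. It presumes timestamped 3D ball positions are already available, whereas the paper works from single-camera visual measurements, and all function names are hypothetical.

```python
import math

G = -9.81  # m/s^2 along z (assumed world frame with z pointing up)


def fit_line(ts, ys):
    """Least-squares fit of y = intercept + slope * t; returns (intercept, slope)."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    var_t = sum((t - mean_t) ** 2 for t in ts)
    cov = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = cov / var_t
    return mean_y - slope * mean_t, slope


def fit_ballistic(ts, points):
    """Estimate initial position p0 and velocity v0 (at t = 0).

    Model per axis: p(t) = p0 + v0 * t + 0.5 * a * t^2 with known a,
    so subtracting the gravity term leaves a straight line in t.
    """
    accel = (0.0, 0.0, G)
    p0, v0 = [], []
    for axis in range(3):
        ys = [p[axis] - 0.5 * accel[axis] * t * t for t, p in zip(ts, points)]
        intercept, slope = fit_line(ts, ys)
        p0.append(intercept)
        v0.append(slope)
    return tuple(p0), tuple(v0)


def catch_point(p0, v0, height):
    """Time and position where the descending ball crosses z = height.

    Returns None if the fitted trajectory never reaches that height.
    """
    a, b, c = 0.5 * G, v0[2], p0[2] - height
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    t = (-b - math.sqrt(disc)) / (2.0 * a)  # later root, since a < 0
    point = (p0[0] + v0[0] * t, p0[1] + v0[1] * t, height)
    return t, point
```

An estimate of this kind would only serve as a starting point; the abstract describes refining it with nonlinear optimization and a more accurate ballistic model as further measurements arrive.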
A Novel Multi-View Table Tennis Umpiring Framework
This research investigates the development of a low-cost multi-view umpiring framework, as an alternative to the current expensive systems that are almost exclusively restricted to elite professional sports. Table tennis has been selected as the testbed because, while automating the process is challenging, it has many different complex match elements including the service, return and rallies, which are governed by a strict set of regulations. The focus is mainly on the rally element rather than the whole match. Ball detection and tracking in video frames are undertaken to determine reliably the ball position relative to key reference objects like the table surface and net, and the ball's flight path is used to determine the rally's status.
While a low-cost option has benefits, it is technically challenging due to the limited number of cameras and generally low video resolution used. This thesis presents a portable multi-view umpiring framework that identifies each state change in a rally. It makes three significant contributions to knowledge: i) a reliable ball detection strategy that accurately detects the location of the ball in low-resolution sequences; ii) a novel framework for ball tracking using a multi-view system, and iii) a new state-machine based evaluation system for analysing table tennis rallies.
In a series of ten different test scenarios, the system achieved an average of 94% system detection rate and 100% accurate decisions. A test sequence of duration 1 s can be processed in 8 s, leading to a delay of only 7 s, which is considered acceptable for practical purposes. This solution has the potential to reform the way matches are umpired, providing objectivity in resolving disputed decisions. It affords an economic technology for amateur players, while the multi-view facility is extendible to other relevant ball-based sports. Finally, the ball flight path analysis mechanism can be a valuable training tool for skills development
Modeling and Learning of Complex Motor Tasks: A Case Study with Robot Table Tennis
Most tasks that humans need to accomplish in their everyday life require certain motor skills. Although most motor skills seem to rely on the same elementary movements, humans are able to accomplish
many different tasks. Robots, on the other hand, are still limited to a small number of skills and depend on well-defined environments. Modeling new motor behaviors is therefore an important research area
in robotics. Computational models of human motor control are an essential step to construct robotic systems that are able to solve complex tasks in a human inhabited environment. These models can be
the key for robust, efficient, and human-like movement plans. In turn, the reproduction of human-like behavior on a robotic system can be also beneficial for computational neuroscientists to verify their
hypotheses. Although biomimetic models can be of great help in order to close the gap between human and robot motor abilities, these models are usually limited to the scenarios considered. However, one
important property of human motor behavior is the ability to adapt skills to new situations and to learn new motor skills with relatively few trials. Domain-appropriate machine learning techniques, such as supervised and reinforcement learning, have a great potential to enable robotic systems to autonomously
learn motor skills. In this thesis, we attempt to model and subsequently learn a complex motor task. As a test case
for a complex motor task, we chose robot table tennis throughout this thesis. Table tennis requires a series of time critical movements which have to be selected and adapted according to environmental
stimuli as well as the desired targets. We first analyze how humans play table tennis and create a computational model that results in human-like hitting motions on a robot arm. Our focus lies on
generating motor behavior capable of adapting to variations and uncertainties in the environmental conditions. We evaluate the resulting biomimetic model both in a physically realistic simulation and on a real anthropomorphic seven degrees of freedom Barrett WAM robot arm. This biomimetic model based purely on analytical methods produces successful hitting motions, but does not feature the flexibility found in human motor behavior. We therefore suggest a new framework that allows a robot to learn cooperative table tennis from and with a human. Here, the robot first learns a set of elementary hitting movements from a human teacher by kinesthetic teach-in, which is compiled into a set of motor primitives. To generalize these movements to a wider range of situations we introduce the mixture of motor primitives algorithm. The resulting motor policy enables the robot to select appropriate motor primitives as well as to generalize between them. Furthermore, it also allows to adapt the selection process of the hitting movements based on the outcome of previous trials. The framework is evaluated both in simulation and on a real Barrett WAM robot. In consecutive experiments, we show that our approach allows the robot to return balls from a ball launcher and furthermore to play table tennis with a human partner.
Executing robot movements using a biomimetic or learned approach enables the robot to return balls successfully. However, in motor tasks with a competitive goal such as table tennis, the robot not
only needs to return the balls successfully in order to accomplish the task, it also needs an adaptive strategy. Such a higher-level strategy cannot be programed manually as it depends on the opponent and the abilities of the robot. We therefore make a first step towards the goal of acquiring such a strategy and investigate the possibility of inferring strategic information from observing humans playing table tennis. We model table tennis as a Markov decision problem, where the reward function captures the goal of the task as well as knowledge on effective elements of a basic strategy. We show how this reward function, and therefore the strategic information can be discovered with model-free inverse reinforcement learning from human table tennis matches. The approach is evaluated on data collected from players with different playing styles and skill levels. We show that the resulting reward functions are able to capture expert-specific strategic information that allow to distinguish the expert among players with different playing skills as well as different playing styles. To summarize, in this thesis, we have derived a computational model for table tennis that was
successfully implemented on a Barrett WAM robot arm and that has proven to produce human-like hitting motions. We also introduced a framework for learning a complex motor task based on a library
of demonstrated hitting primitives. To select and generalize these hitting movements we developed the mixture of motor primitives algorithm where the selection process can be adapted online based
on the success of the synthesized hitting movements. The setup was tested on a real robot, which showed that the resulting robot table tennis player is able to play a cooperative game against a human
opponent. Finally, we could show that it is possible to infer basic strategic information in table tennis from observing matches of human players using model-free inverse reinforcement learning
Deep Video Analytics of Humans: From Action Recognition to Forgery Detection
In this work, we explore a variety of techniques and applications for visual problems involving videos of humans in the contexts of activity detection, pose detection, and forgery detection.
The first works discussed here address the issue of human activity detection in untrimmed video where the actions performed are spatially and temporally sparse. The video may therefore contain long sequences of frames where no actions occur, and the actions that do occur will often only comprise a very small percentage of the pixels on the screen. We address this with a two-stage architecture that first creates many coarse proposals with high recall, and then classifies and refines them to create temporally accurate activity proposals. We present two methods that follow this high-level paradigm: TRI-3D and CHUNK-3D.
This work on activity detection is then extended to include results on few-shot learning. In this domain, a system must learn to perform detection given only an extremely limited set of training examples. We propose a method we call a Self-Denoising Neural Network (SDNN), which takes inspiration from Denoising Autoencoders, in order to solve this problem, both in the context of activity detection and image classification.
We also propose a method that performs optical character recognition on real world images when no labels are available in the language we wish to transcribe. Specifically, we build an accurate transcription system for Hebrew street name signs when no labeled training data is available. In order to do this, we divide the problem into two components and address each separately: content, which refers to the characters and language structure, and style, which refers to the domain of the images (for example, real or synthetic). We train with simple synthetic Hebrew street signs to address the content components, and with labeled French street signs to address the style.
We continue our analysis by proposing a method for automatic detection of facial forgeries in videos and images. This work approaches the problem of facial forgery detection by breaking the face into multiple regions and training separate classifiers for each part. The end result is a collection of high-quality facial forgery detectors that are both accurate and explainable. We exploit this explainability by providing extensive empirical analysis of our method's results.
Next, we present work that focuses on multi-camera, multi-person 3D human pose estimation from video. To address this problem, we aggregate the outputs of a 2D human pose detector across cameras and actors using a novel factor graph formulation, which we optimize using the loopy belief propagation algorithm. In particular, our factor graph introduces a temporal smoothing term to create smooth transitions between poses across frames.
Finally, our last proposed method covers activity detection, pose detection, and tracking in the game of Ping Pong, where we present a new dataset, dubbed SPIN, with extensive annotations. We introduce several tasks with this dataset, including the task of predicting the future actions of players and tracking ball movements. To evaluate our performance on these tasks, we present a novel recurrent gated CNN architecture
Hierarchical policy design for sample-efficient learning of robot table tennis through self-play
Training robots with physical bodies requires developing new methods and action representations that allow the learning agents to explore the space of policies efficiently. This work studies sample-efficient learning of complex policies in the context of robot table tennis. It incorporates learning into a hierarchical control framework using a model-free strategy layer (which requires complex reasoning about opponents that is difficult to do in a model-based way), model-based prediction of external objects (which are difficult to control directly with analytic control methods, but governed by learnable and relatively simple laws of physics), and analytic controllers for the robot itself. Human demonstrations are used to train dynamics models, which together with the analytic controller allow any robot that is physically capable to play table tennis without training episodes. Using only about 7000 demonstrated trajectories, a striking policy can hit ball targets with about 20 cm error. Self-play is used to train cooperative and adversarial strategies on top of model-based striking skills trained from human demonstrations. After only about 24000 strikes in self-play the agent learns to best exploit the human dynamics models for longer cooperative games. Further experiments demonstrate that more flexible variants of the policy can discover new strikes not demonstrated by humans and achieve higher performance at the expense of lower sample-efficiency. Experiments are carried out in a virtual reality environment using sensory observations that are obtainable in the real world. The high sample-efficiency demonstrated in the evaluations shows that the proposed method is suitable for learning directly on physical robots without transfer of models or policies from simulation.
Computer Science