2 research outputs found

DexCatch: Learning to Catch Arbitrary Objects with Dexterous Hands

Authors: Gao Yang, Lan Fengbo, Oseni Oluwatosin, Wang Shengjie, Xu Haotian, Zhang Tao, Zhang Yunzhe
Publication date: 12/10/2023

Achieving human-like dexterous manipulation remains a crucial area of research in robotics. Current research focuses on improving the success rate of pick-and-place tasks. Compared with pick-and-place, throw-catching behavior has the potential to increase picking speed without transporting objects to their destination. However, dynamic dexterous manipulation poses a major challenge for stable control due to a large number of dynamic contacts. In this paper, we propose a Stability-Constrained Reinforcement Learning (SCRL) algorithm to learn to catch diverse objects with dexterous hands. The SCRL algorithm outperforms baselines by a large margin, and the learned policies show strong zero-shot transfer performance on unseen objects. Remarkably, even though the object in a hand facing sideward is extremely unstable due to the lack of support from the palm, our method can still achieve a high level of success in the most challenging task. Video demonstrations of learned behaviors and the code can be found on the supplementary website.

arXiv.org e-Print Archive

A Policy Optimization Method Towards Optimal-time Stability

Authors: Cao Yuxue, Gao Yang, Lan Fengbo, Oseni Oluwatosin, Wang Shengjie, Xu Haotian, Zhang Tao, Zheng Xiang
Publication date: 12/10/2023

In current model-free reinforcement learning (RL) algorithms, stability criteria based on sampling methods are commonly utilized to guide policy optimization. However, these criteria only guarantee the infinite-time convergence of the system's state to an equilibrium point, which leads to sub-optimality of the policy. In this paper, we propose a policy optimization technique incorporating sampling-based Lyapunov stability. Our approach enables the system's state to reach an equilibrium point within an optimal time and maintain stability thereafter, referred to as "optimal-time stability". To achieve this, we integrate the optimization method into the Actor-Critic framework, resulting in the development of the Adaptive Lyapunov-based Actor-Critic (ALAC) algorithm. Through evaluations conducted on ten robotic tasks, our approach outperforms previous studies significantly, effectively guiding the system to generate stable patterns.

Comment: 27 pages, 11 figures. 7th Annual Conference on Robot Learning. 202
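To make the idea of a sampling-based Lyapunov criterion concrete, the following is a minimal sketch and not the ALAC algorithm from the abstract. It assumes one common form of the sampled decrease condition, L(s') - L(s) <= -alpha * L(s), checked over observed transitions. The candidate function, the `alpha` value, and all names here are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[float, ...]
Transition = Tuple[State, State]  # (state, next_state)


def lyapunov_candidate(state: State) -> float:
    """Hypothetical candidate: squared distance to an equilibrium at the origin."""
    return sum(x * x for x in state)


def decrease_violation(
    transitions: Sequence[Transition],
    candidate: Callable[[State], float],
    alpha: float = 0.1,
) -> float:
    """Average amount by which sampled transitions violate the assumed
    decrease condition L(s') - L(s) <= -alpha * L(s).

    A value of 0.0 means every sampled transition satisfies the condition.
    """
    total = 0.0
    for s, s_next in transitions:
        gap = candidate(s_next) - candidate(s) + alpha * candidate(s)
        total += max(0.0, gap)  # only count violations
    return total / len(transitions)


# Transitions that move toward the origin satisfy the condition.
contracting = [((1.0, 0.0), (0.5, 0.0)), ((0.0, 2.0), (0.0, 1.0))]
# A transition that moves away from the origin violates it.
diverging = [((1.0, 0.0), (2.0, 0.0))]

print(decrease_violation(contracting, lyapunov_candidate))
print(decrease_violation(diverging, lyapunov_candidate))
```

In a policy-optimization setting, a quantity like this could be added as a penalty or constraint term on the actor's objective. How the paper actually does so, and how it obtains finite-time rather than asymptotic convergence, is not stated in the abstract.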

arXiv.org e-Print Archive