Search CORE

113 research outputs found

Learning Complicated Manipulation Skills via Deterministic Policy with Limited Demonstrations

Author: Ang Marcelo H
Haofeng Liu
Jiayi Tan
Yiwen Chen
Publication date: 29/03/2023

Combined with demonstrations, deep reinforcement learning can efficiently develop policies for manipulators. In practice, however, collecting sufficient high-quality demonstrations takes time, and human demonstrations may be unsuitable for robots. The non-Markovian process and over-reliance on demonstrations are further challenges. For example, we found that RL agents are sensitive to demonstration quality in manipulation tasks and struggle to adapt to demonstrations taken directly from humans. It is therefore challenging to leverage low-quality and insufficient demonstrations to help reinforcement learning train better policies, and limited demonstrations sometimes even lead to worse performance. We propose a new algorithm named TD3fG (TD3 learning from a generator) to solve these problems. It forms a smooth transition from learning from experts to learning from experience. This innovation can help agents extract prior knowledge while reducing the detrimental effects of the demonstrations. Our algorithm performs well in Adroit manipulator and MuJoCo tasks with limited demonstrations.
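The "smooth transition from learning from experts to learning from experience" can be pictured as a weight on a demonstration-based loss that decays during training. The sketch below is only an illustration under stated assumptions: the linear schedule, the function names `demo_weight` and `blended_actor_loss`, and the convex combination of the two losses are hypothetical and are not taken from the TD3fG paper.

```python
def demo_weight(step: int, decay_steps: int, w0: float = 1.0) -> float:
    """Weight on the demonstration loss, decayed linearly from w0 to 0.

    Assumption: a linear schedule; the abstract does not state the schedule.
    """
    if decay_steps <= 0:
        raise ValueError("decay_steps must be positive")
    # Clamp training progress to [0, 1] so the weight never goes negative.
    frac = min(max(step / decay_steps, 0.0), 1.0)
    return w0 * (1.0 - frac)


def blended_actor_loss(rl_loss: float, imitation_loss: float,
                       step: int, decay_steps: int) -> float:
    """Mix an RL loss with an imitation loss using the decaying weight.

    Early in training the imitation term dominates (learning from experts);
    after decay_steps only the RL term remains (learning from experience).
    """
    w = demo_weight(step, decay_steps)
    return (1.0 - w) * rl_loss + w * imitation_loss


if __name__ == "__main__":
    for step in (0, 50, 100, 200):
        print(step, blended_actor_loss(2.0, 4.0, step, decay_steps=100))
```

A decaying weight of this kind is one simple way to limit the influence of low-quality demonstrations late in training; other schedules would serve the same purpose.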

arXiv.org e-Print Archive

A General Pipeline for 3D Detection of Vehicles

Author: Ang Jr. Marcelo H.
Du Xinxin
Karaman Sertac
Rus Daniela
Publication date: 11/02/2018

Autonomous driving requires 3D perception of vehicles and other objects in the environment. Most current methods support only 2D vehicle detection. This paper proposes a flexible pipeline that adopts any 2D detection network and fuses it with a 3D point cloud to generate 3D information with minimal changes to the 2D detection network. To identify the 3D box, an effective model fitting algorithm is developed based on generalised car models and score maps. A two-stage convolutional neural network (CNN) is proposed to refine the detected 3D box. This pipeline is tested on the KITTI dataset using two different 2D detection networks. The 3D detection results based on these two networks are similar, demonstrating the flexibility of the proposed pipeline. The results rank second among the 3D detection algorithms, indicating its competence in 3D detection. Comment: Accepted at ICRA 201

arXiv.org e-Print Archive

Crossref

DSpace@MIT

Scipedia

Robust 6D Object Pose Estimation by Learning RGB-D Features

Author: Ang Jr Marcelo H
Lee Gim Hee
Pan Liang
Tian Meng
Publication date: 09/03/2020

Accurate 6D object pose estimation is fundamental to robotic manipulation and grasping. Previous methods follow a local optimization approach, which minimizes the distance between closest point pairs to handle the rotation ambiguity of symmetric objects. In this work, we propose a novel discrete-continuous formulation for rotation regression to resolve this local-optimum problem. We uniformly sample rotation anchors in SO(3) and predict a constrained deviation from each anchor to the target, as well as uncertainty scores for selecting the best prediction. Additionally, the object location is detected by aggregating point-wise vectors pointing to the 3D center. Experiments on two benchmarks, LINEMOD and YCB-Video, show that the proposed method outperforms state-of-the-art approaches. Our code is available at https://github.com/mentian/object-posenet. Comment: Accepted at ICRA 202
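The discrete-continuous idea in this abstract (a discrete choice among rotation anchors plus a continuous deviation per anchor) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the unit-quaternion representation, the composition order (deviation applied after the anchor), the convention that a lower score means higher confidence, and the names `quat_mul`, `normalize`, and `select_rotation` are all assumptions made here. The anchors in the usage example are hand-picked, not uniformly sampled.

```python
import math


def quat_mul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def normalize(q):
    """Scale a quaternion to unit length so it represents a rotation."""
    norm = math.sqrt(sum(c * c for c in q))
    if norm == 0.0:
        raise ValueError("zero quaternion cannot be normalized")
    return tuple(c / norm for c in q)


def select_rotation(anchors, deviations, uncertainties):
    """Pick the anchor with the lowest uncertainty and apply its deviation.

    Assumptions: lower score means more confident, and the final rotation
    is deviation * anchor. Returns (best_index, unit_quaternion).
    """
    if not anchors or not (len(anchors) == len(deviations) == len(uncertainties)):
        raise ValueError("inputs must be non-empty and of equal length")
    best = min(range(len(anchors)), key=lambda i: uncertainties[i])
    return best, normalize(quat_mul(deviations[best], anchors[best]))


if __name__ == "__main__":
    s = math.sqrt(0.5)
    anchors = [(1.0, 0.0, 0.0, 0.0), (s, 0.0, 0.0, s)]      # identity, 90 deg about z
    deviations = [(1.0, 0.0, 0.0, 0.0), (s, 0.0, 0.0, s)]   # hypothetical network outputs
    uncertainties = [0.9, 0.1]                               # hypothetical network outputs
    print(select_rotation(anchors, deviations, uncertainties))
```

Because each anchor only has to cover rotations in its own neighbourhood, the regression target stays small; the score then resolves which neighbourhood applies.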

arXiv.org e-Print Archive

Crossref

DR-Pose: A Two-stage Deformation-and-Registration Pipeline for Category-level 6D Object Pose Estimation

Author: Ang Jr Marcelo H.
Gan Runze
Liu Zhiyang
Wang Haozhe
Zhou Lei
Publication date: 04/09/2023

Category-level object pose estimation involves estimating the 6D pose and the 3D metric size of objects from predetermined categories. While recent approaches take categorical shape prior information as a reference to improve pose estimation accuracy, the single-stage network design and training manner lead to sub-optimal performance, since there are two distinct tasks in the pipeline. In this paper, the advantage of a two-stage pipeline over a single-stage design is discussed. To this end, we propose a two-stage deformation-and-registration pipeline called DR-Pose, which consists of a completion-aided deformation stage and a scaled registration stage. The first stage uses a point cloud completion method to generate the unseen parts of the target object, guiding the subsequent deformation of the shape prior. In the second stage, a novel registration network is designed to extract pose-sensitive features and predict the representation of the object's partial point cloud in canonical space, based on the deformation results from the first stage. DR-Pose produces superior results to the state-of-the-art shape prior-based methods on both the CAMERA25 and REAL275 benchmarks. Code is available at https://github.com/Zray26/DR-Pose.git. Comment: Camera-ready version accepted to IROS 202

arXiv.org e-Print Archive

Adaptive optimal control for linear discrete time-varying systems

Author: Ang Marcelo H
Ge Shuzhi Sam
Lee Tong Heng
Li Yanan
Wang Chen
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013

In this paper, adaptive optimal control is proposed for time-varying discrete linear systems subject to unknown system dynamics. The idea of the method is a direct application of Q-learning adaptive dynamic programming to time-varying systems. In order to derive the optimal control policy, an actor-critic structure is constructed and a time-varying least-squares method is adopted for parameter adaptation. It is shown that the derived control policy can robustly stabilize the time-varying system and guarantee optimal control performance at the same time. As no particular system information is required throughout the process, the proposed techniques provide a potentially feasible solution to a large variety of control applications. The validity of the proposed method is verified through simulation studies.
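For a linear system with quadratic cost, the Q-function is quadratic in the state and input, and the greedy policy follows directly from its weight matrix. The sketch below illustrates only that policy-extraction step for a scalar time-varying system x[k+1] = a[k] x[k] + b[k] u[k]. It is an illustration under stated assumptions and not the paper's method: here the Q-function weights are built from known a[k], b[k] and hypothetical cost weights q, r, whereas the abstract describes estimating them from data by least squares without a model. All function names are hypothetical.

```python
def q_matrix(a, b, q, r, p_next):
    """Weights H of Q(x, u) = [x, u] H [x, u]^T for one time step.

    p_next is the value-function weight at the next step (V(x) = p * x^2).
    Assumption: a and b are known here, purely for illustration.
    """
    h_xx = q + a * a * p_next
    h_xu = a * b * p_next
    h_uu = r + b * b * p_next
    return ((h_xx, h_xu), (h_xu, h_uu))


def greedy_gain(h):
    """Gain K of the greedy policy u = -K x, from minimizing Q over u."""
    (_, h_xu), (_, h_uu) = h
    return h_xu / h_uu


def value_weight(h):
    """Value-function weight p obtained by substituting the greedy input."""
    (h_xx, h_xu), (_, h_uu) = h
    return h_xx - h_xu * h_xu / h_uu


def backward_pass(a_seq, b_seq, q, r, p_final):
    """Compute time-varying gains over a finite horizon, last step first."""
    p = p_final
    gains = []
    for a, b in zip(reversed(a_seq), reversed(b_seq)):
        h = q_matrix(a, b, q, r, p)
        gains.append(greedy_gain(h))
        p = value_weight(h)
    gains.reverse()  # gains[k] now applies at time step k
    return gains, p


if __name__ == "__main__":
    gains, p0 = backward_pass([1.0, 1.2, 0.9], [1.0, 0.8, 1.1], q=1.0, r=1.0, p_final=1.0)
    print(gains, p0)
```

In a model-free setting, the entries of H would be the quantities to estimate from observed transitions; once they are available, the gain computation is unchanged.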

Crossref

Sussex Research Online

ScholarBank@NUS