AO-Grasp: Articulated Object Grasp Generation
We introduce AO-Grasp, a grasp proposal method that generates stable and
actionable 6 degree-of-freedom grasps for articulated objects. Our generated
grasps enable robots to interact with articulated objects, such as opening and
closing cabinets and appliances. Given a segmented partial point cloud of a
single articulated object, AO-Grasp predicts the best grasp points on the
object with a novel Actionable Grasp Point Predictor model and then finds
corresponding grasp orientations for each point by leveraging a
state-of-the-art rigid object grasping method. We train AO-Grasp on our new
AO-Grasp Dataset, which contains 48K actionable parallel-jaw grasps on
synthetic articulated objects. In simulation, AO-Grasp achieves higher grasp
success rates than existing rigid object grasping and articulated object
interaction baselines on both train and test categories. Additionally, we
evaluate AO-Grasp on 120 real-world scenes of objects with varied geometries,
articulation axes, and joint states, where AO-Grasp produces successful grasps
on 67.5% of scenes, while the baseline succeeds on only 33.3% of scenes.
Comment: Project website: https://stanford-iprl-lab.github.io/ao-gras
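The two-stage proposal flow described above (score candidate points, then look up an orientation for the best one) can be sketched as follows; all names and the distance-based score are hypothetical placeholders, not the paper's learned models:

```python
# Toy sketch of AO-Grasp's two-stage flow:
# 1) score every point of a segmented partial point cloud for actionability,
# 2) attach a grasp orientation to the top-scoring point.
import math

points = [(0.0, 0.0, 0.0), (0.4, 0.1, 0.0), (0.9, 0.0, 0.2)]
centroid = tuple(sum(c) / len(points) for c in zip(*points))

def score_actionability(point):
    # Stand-in for the learned Actionable Grasp Point Predictor: here we
    # simply favor points far from the centroid (e.g. handles and edges).
    return math.dist(point, centroid)

def orientation_for(point):
    # Stand-in for querying a rigid-object grasping method at this point;
    # a real system returns a full 6-DoF gripper pose, not a fixed approach.
    return {"position": point, "approach": (0.0, 0.0, -1.0)}

best = max(points, key=score_actionability)
grasp = orientation_for(best)
```

The split mirrors the abstract's design choice: grasp-point selection is learned specifically for articulated objects, while orientation is delegated to an existing rigid-object grasping method.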
UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy
In this work, we tackle the problem of learning universal robotic dexterous
grasping from a point cloud observation under a table-top setting. The goal is
to grasp and lift objects in diverse, high-quality ways and to generalize
across hundreds of categories, including unseen ones. Inspired by successful
pipelines used in parallel gripper grasping, we split the task into two stages:
1) grasp proposal (pose) generation and 2) goal-conditioned grasp execution.
For the first stage, we propose a novel probabilistic model of grasp pose
conditioned on the point cloud observation that factorizes rotation from
translation and articulation. Trained on our synthesized large-scale dexterous
grasp dataset, this model enables us to sample diverse and high-quality
dexterous grasp poses for the object point cloud. For the second stage, we
propose to replace the motion planning used in parallel gripper grasping with a
goal-conditioned grasp policy, due to the complexity involved in dexterous
grasping execution. Note that it is very challenging to learn this highly
generalizable grasp policy that only takes realistic inputs without oracle
states. We thus propose several important innovations, including state
canonicalization, object curriculum, and teacher-student distillation.
Integrating the two stages, our final pipeline becomes the first to achieve
universal generalization for dexterous grasping, demonstrating an average
success rate of more than 60% on thousands of object instances, significantly
outperforming all baselines while showing only a minimal generalization gap.
Comment: Accepted to CVPR 202
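The first stage's factorization of rotation from translation and articulation can be illustrated with a toy sampler; the distributions below are purely synthetic placeholders, not the paper's actual probabilistic model:

```python
# Illustrative sketch of a grasp-pose sampler that factorizes rotation from
# translation and hand articulation, mirroring the two-factor decomposition
# p(pose | cloud) = p(R) * p(t, q | R).
import random

def sample_rotation(rng):
    # Stand-in: a yaw-only rotation about the vertical table axis.
    return rng.uniform(0.0, 6.283185307179586)

def sample_translation_and_joints(rng, yaw, n_joints=4):
    # Conditioned on the sampled rotation, draw an approach offset and
    # per-finger joint angles (placeholder distributions).
    t = (0.1 * rng.random(), 0.1 * rng.random(), 0.05 + 0.05 * rng.random())
    joints = [rng.uniform(0.0, 1.5) for _ in range(n_joints)]
    return t, joints

rng = random.Random(0)
yaw = sample_rotation(rng)
translation, joints = sample_translation_and_joints(rng, yaw)
```

Sampling rotation first and conditioning the remaining degrees of freedom on it is what lets a model of this shape produce diverse grasp proposals for a single object observation.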
The Potential of Satellite Sounding Observations for Deriving Atmospheric Wind in All-Weather Conditions
Atmospheric wind is an essential parameter in the global observing system. In this study, the water vapor field in Typhoon Lekima and its surrounding areas, simulated by the Weather Research and Forecasting (WRF) model, is used to track atmospheric motion winds with the Farneback Optical Flow (OF) algorithm. A series of experiments is conducted to investigate how temporal and spatial resolutions affect the errors of the tracked winds. It is shown that the wind accuracy from tracking specific humidity is higher than that from tracking relative humidity. For fast-evolving weather systems such as typhoons, a shorter time step allows more accurate wind retrievals, whereas for slowly to moderately evolving weather conditions, a longer time step is needed for smaller retrieval errors. Compared to the traditional atmospheric motion vectors (AMVs) algorithm, the Farneback OF algorithm achieves pixel-wise feature tracking and yields a wind field with higher spatial resolution. It also works well under special circumstances such as very low water vapor content, or in regions where the wind direction is parallel to the moisture gradient direction. This study has significant implications for the configuration of satellite microwave sounding missions through their derived water vapor fields: the temporal and spatial resolutions required by the OF algorithm critically determine the satellite revisit time and the field-of-view size. The brightness temperature (BT) simulated with the Community Radiative Transfer Model (CRTM) is also used to track winds. It is shown that the error of tracking BT is generally larger than that of tracking water vapor; this increased error may result from uncertainty in the simulation of brightness temperatures at 183 GHz.
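The core idea behind tracer-based wind retrieval can be sketched with a toy 1-D humidity field: the displacement that best aligns two consecutive frames, divided by the time step, gives the wind speed. The grid spacing and time step below are hypothetical, and the brute-force shift search stands in for the dense 2-D Farneback algorithm used in the study:

```python
# Estimate wind speed from the displacement of a 1-D tracer field
# between two time steps (toy analogue of optical-flow wind tracking).
def best_shift(frame0, frame1, max_shift):
    # Brute-force search for the integer pixel shift that minimizes the
    # mean squared mismatch between the two frames.
    def mismatch(s):
        pairs = [(frame0[i], frame1[i + s]) for i in range(len(frame0))
                 if 0 <= i + s < len(frame1)]
        return sum((a - b) ** 2 for a, b in pairs) / len(pairs)
    return min(range(-max_shift, max_shift + 1), key=mismatch)

# Synthetic specific-humidity "bump" advected 3 grid cells between frames.
q0 = [0, 0, 1, 4, 9, 4, 1, 0, 0, 0, 0, 0]
q1 = [0, 0, 0, 0, 0, 1, 4, 9, 4, 1, 0, 0]
dx_km, dt_s = 10.0, 600.0   # hypothetical grid spacing and time step
shift = best_shift(q0, q1, max_shift=5)
wind_ms = shift * dx_km * 1000.0 / dt_s
```

This also makes the abstract's resolution trade-off concrete: with a fixed grid spacing, a longer time step turns a given wind speed into a larger, easier-to-detect displacement, but fast-evolving features may deform too much to match between frames.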
Tracking and Reconstructing Hand Object Interactions from Point Cloud Sequences in the Wild
In this work, we tackle the challenging task of jointly tracking hand-object poses and reconstructing their shapes from depth point cloud sequences in the wild, given the initial poses at frame 0. We propose, for the first time, a point cloud-based hand joint tracking network, HandTrackNet, to estimate inter-frame hand joint motion. HandTrackNet introduces a novel hand pose canonicalization module to ease the tracking task, yielding accurate and robust hand joint tracking. Our pipeline then reconstructs the full hand by converting the predicted hand joints into a MANO hand. For object tracking, we devise a simple yet effective module that estimates the object SDF from the first frame and performs optimization-based tracking. Finally, a joint optimization step performs joint hand-object reasoning, which alleviates occlusion-induced ambiguity and further refines the hand pose. During training, the whole pipeline sees only purely synthetic data, synthesized with sufficient variations and with depth simulation for ease of generalization. The pipeline is robust to the synthetic-to-real generalization gap and is thus directly transferable to real in-the-wild data. We evaluate our method on two real hand-object interaction datasets, HO3D and DexYCB, without any fine-tuning. Our experiments demonstrate that the proposed method significantly outperforms previous state-of-the-art depth-based hand and object pose estimation and tracking methods, running at a frame rate of 9 FPS. We have released our code at https://github.com/PKU-EPIC/HOTrack.
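A minimal sketch of why pose canonicalization eases tracking: re-expressing joints in a hand-centered frame removes global motion from the tracker's input. HandTrackNet's actual module is learned and also handles global orientation; here we only re-center on the wrist, which is a hypothetical simplification:

```python
# Re-center hand joints on the wrist (joint 0) so the tracker sees a
# translation-invariant input regardless of where the hand is in the scene.
def canonicalize(joints):
    wx, wy, wz = joints[0]
    return [(x - wx, y - wy, z - wz) for x, y, z in joints]

# One frame of (synthetic) 3-D hand joint positions: wrist plus two joints.
frame = [(0.5, 0.2, 0.9), (0.55, 0.25, 0.95), (0.6, 0.3, 1.0)]
canon = canonicalize(frame)
```

After canonicalization, two frames of the same hand pose at different scene locations map to identical inputs, so the network only has to model local articulation changes between frames.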