6-DoF Stability Field via Diffusion Models
A core capability for robot manipulation is reasoning over where and how to
stably place objects in cluttered environments. Traditionally, robots have
relied on object-specific, hand-crafted heuristics in order to perform such
reasoning, with limited generalizability beyond a small number of object
instances and object interaction patterns. Recent approaches instead learn
notions of physical interaction, namely motion prediction, but require
supervision in the form of labeled object information or come at the cost of
high sample complexity, and do not directly reason over stability or object
placement. We present 6-DoFusion, a generative model capable of generating 3D
poses of an object that produce a stable configuration of a given scene.
Underlying 6-DoFusion is a diffusion model that incrementally refines a
randomly initialized SE(3) pose to generate a sample from a learned,
context-dependent distribution over stable poses. We evaluate our model on
different object placement and stacking tasks, demonstrating its ability to
construct stable scenes that involve novel object classes as well as to improve
the accuracy of state-of-the-art 3D pose estimation methods.
Comment: In submissio
Is That a Chair? Imagining Affordances Using Simulations of an Articulated Human Body
For robots to exhibit a high level of intelligence in the real world, they
must be able to assess objects for which they have no prior knowledge.
Therefore, it is crucial for robots to perceive object affordances by reasoning
about physical interactions with the object. In this paper, we propose a novel
method to provide robots with the ability to imagine object affordances using
physical simulations. The chair class is chosen here as an initial category
of objects to illustrate a more general paradigm. In our method, the robot
"imagines" the affordance of an arbitrarily oriented object as a chair by
simulating a physical sitting interaction between an articulated human body and
the object. This object affordance reasoning is used as a cue for object
classification (chair vs non-chair). Moreover, if an object is classified as a
chair, the affordance reasoning can also predict the upright pose of the object
which allows the sitting interaction to take place. We call this type of pose
the functional pose. We demonstrate our method in chair classification on
synthetic 3D CAD models. Although our method uses only 30 models for training,
it outperforms appearance-based deep learning methods, which require a large
amount of training data, when the upright orientation is not assumed to be
known a priori. In addition, we showcase that the functional pose predictions
of our method align well with human judgments on both synthetic models and real
objects scanned by a depth camera.
Comment: 7 pages, 6 figures. Accepted to ICRA202
An Interactive Approach for Functional Prototype Recovery from a Single RGBD Image
Inferring the functionality of an object from a single RGBD image is difficult for two reasons: lack of semantic information about the object, and missing data due to occlusion. In this paper, we present an interactive framework to recover a 3D functional prototype from a single RGBD image. Instead of precisely reconstructing the object geometry for the prototype, we focus on recovering the object’s functionality along with its geometry. Our system allows users to scribble on the image to create initial rough proxies for the parts. After the user annotates high-level relations between parts, our system automatically and jointly optimizes detailed joint parameters (axis and position) and part geometry parameters (size, orientation, and position). Such prototype recovery enables a better understanding of the underlying image geometry and allows for further physically plausible manipulation. We demonstrate our framework on various indoor objects with simple or hybrid functions.