ORLA*: Mobile Manipulator-Based Object Rearrangement with Lazy A*
Effectively performing object rearrangement is an essential skill for mobile
manipulators, e.g., setting up a dinner table or organizing a desk. A key
challenge in such problems is deciding an appropriate manipulation order that untangles dependencies between objects while accounting for the motions needed to realize the manipulations (e.g., pick and place).
To our knowledge, computing time-optimal multi-object rearrangement solutions
for mobile manipulators remains a largely untapped research direction. In this
research, we propose ORLA*, which leverages delayed (lazy) evaluation in
searching for a high-quality object pick-and-place sequence that considers both
end-effector and mobile robot base travel. ORLA* also supports multi-layered
rearrangement tasks, using machine learning to account for pile stability.
Employing an optimal solver for finding temporary locations for displacing
objects, ORLA* can achieve global optimality. Through extensive simulation and
an ablation study, we confirm the effectiveness of ORLA* in delivering quality solutions for challenging rearrangement instances. Supplementary materials are available at: https://gaokai15.github.io/ORLA-Star/
Comment: Submitted to ICRA 202
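To make the lazy-evaluation idea concrete, the sketch below shows a generic lazy A* loop: successors are queued with a cheap admissible lower bound on the edge cost, and the expensive true cost (for ORLA*, the end-effector and base motions realizing a pick and place) is computed only when an entry is popped for expansion. This is an illustrative sketch of the general technique, not the authors' implementation; all callbacks (successors, cheap_cost, true_cost, heuristic) are assumed placeholders.

```python
import heapq
import itertools

def lazy_astar(start, goal_test, successors, cheap_cost, true_cost, heuristic):
    """Lazy A*: queue successors with a cheap admissible lower bound and pay
    the expensive true edge cost (e.g., planning the arm and base motion for
    a pick and place) only when an entry is popped for expansion."""
    tie = itertools.count()   # tie-breaker so the heap never compares states
    frontier = [(heuristic(start), 0.0, next(tie), start, None, True)]
    best_g = {}
    while frontier:
        _f, g, _, state, edge, exact = heapq.heappop(frontier)
        if not exact:
            # Replace the cheap estimate with the true edge cost and re-queue.
            g = g - cheap_cost(edge) + true_cost(edge)
            heapq.heappush(frontier,
                           (g + heuristic(state), g, next(tie), state, edge, True))
            continue
        if state in best_g and best_g[state] <= g:
            continue              # already expanded via a cheaper route
        best_g[state] = g
        if goal_test(state):
            return g              # cost of the best rearrangement found
        for next_edge, nxt in successors(state):
            # A cheap lower bound keeps the queue ordering admissible.
            g2 = g + cheap_cost(next_edge)
            heapq.heappush(frontier,
                           (g2 + heuristic(nxt), g2, next(tie), nxt, next_edge, False))
    return None                   # search space exhausted without a solution
```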
Pre- and post-contact policy decomposition for non-prehensile manipulation with zero-shot sim-to-real transfer
We present a system for non-prehensile manipulation tasks that require a
significant number of contact mode transitions and the use of environmental
contacts to successfully manipulate an object to a target location. Our method
is based on deep reinforcement learning which, unlike state-of-the-art planning
algorithms, does not require a priori knowledge of the physical parameters of
the object or environment such as friction coefficients or centers of mass. The
planning time is reduced to the feed-forward prediction time of a neural
network. We propose a computational structure, action space design, and
curriculum learning scheme that facilitates efficient exploration and
sim-to-real transfer. In challenging real-world non-prehensile manipulation
tasks, we show that our method can generalize over different objects and
succeed even for novel objects not seen during training. Project website:
https://sites.google.com/view/nonprenehsile-decomposition
Comment: Accepted to the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023)
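The decomposition in the title can be sketched as a simple mode switch: a pre-contact policy drives the approach, and a post-contact policy takes over once contact is sensed, so planning reduces to feed-forward inference. The force-threshold contact test and the policy interfaces below are assumptions for illustration, not the paper's actual architecture.

```python
import numpy as np

class DecomposedPolicy:
    """Switches from a pre-contact (approach) policy to a post-contact
    (interaction) policy once a contact signal fires; both sub-policies are
    plain feed-forward predictors, so control is a single network inference."""

    def __init__(self, pre_policy, post_policy, force_threshold=1.0):
        self.pre = pre_policy                    # obs -> action, approach phase
        self.post = post_policy                  # obs -> action, contact-rich phase
        self.force_threshold = force_threshold   # assumed contact criterion [N]
        self.in_contact = False

    def act(self, obs, wrench):
        # Latch into post-contact mode on the first sensed contact force;
        # subsequent contact-mode transitions are left to the learned policy.
        if not self.in_contact and np.linalg.norm(wrench[:3]) > self.force_threshold:
            self.in_contact = True
        policy = self.post if self.in_contact else self.pre
        return policy(obs)
```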
Rearrangement-Based Manipulation via Kinodynamic Planning and Dynamic Planning Horizons
Robot manipulation in cluttered environments often requires complex and
sequential rearrangement of multiple objects in order to achieve the desired
reconfiguration of the target objects. Due to the sophisticated physical
interactions involved in such scenarios, rearrangement-based manipulation is
still limited to a small range of tasks and is especially vulnerable to
physical uncertainties and perception noise. This paper presents a planning
framework that leverages the efficiency of sampling-based planning approaches,
and closes the manipulation loop by dynamically controlling the planning
horizon. Our approach interleaves planning and execution to progressively
approach the manipulation goal while correcting any errors or path deviations
along the process. Meanwhile, our framework allows the definition of
manipulation goals without requiring explicit goal configurations, enabling the
robot to flexibly interact with all objects to facilitate the manipulation of
the target ones. With extensive experiments both in simulation and on a real
robot, we evaluate our framework on three manipulation tasks in cluttered
environments: grasping, relocating, and sorting. In comparison with two
baseline approaches, we show that our framework can significantly improve
planning efficiency, robustness against physical uncertainties, and task
success rate under limited time budgets.
Comment: Accepted for publication in the Proceedings of the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022)
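The closed-loop structure (plan over a short horizon, execute a prefix, re-observe, adapt the horizon) can be sketched as follows. The interfaces and the horizon-update rule are assumptions chosen for illustration; the paper's sampling-based kinodynamic planner is hidden behind the plan callback.

```python
def interleaved_manipulation(observe, plan, execute, deviation, goal_satisfied,
                             min_h=1, max_h=8, tol=0.05):
    """Interleaved planning and execution with a dynamic planning horizon.
    `plan(state, h)` returns a list of (action, predicted_state) pairs or
    None; `deviation` compares observed and predicted states. All interfaces
    and the horizon-update rule are illustrative placeholders."""
    horizon = max_h
    state = observe()
    while not goal_satisfied(state):
        result = plan(state, horizon)        # sampling-based kinodynamic planner
        if result is None:                   # infeasible: try a shorter lookahead
            horizon = max(min_h, horizon // 2)
            continue
        action, predicted = result[0]        # execute only the plan's first step
        execute(action)
        state = observe()                    # close the loop on perception
        # Large prediction error suggests uncertain physics: shrink the horizon;
        # accurate predictions let the planner look further ahead again.
        if deviation(state, predicted) > tol:
            horizon = max(min_h, horizon - 1)
        else:
            horizon = min(max_h, horizon + 1)
    return state
```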
Haisor: Human-aware indoor scene optimization via deep reinforcement learning
3D scene synthesis facilitates and benefits many real-world applications. Most scene generators focus on making indoor scenes plausible by learning from training data and leveraging extra constraints such as adjacency and symmetry. Although the generated 3D scenes are mostly plausible, with visually realistic layouts, they can be functionally unsuitable for human users to navigate and interact with furniture. Our key observation is that human activity plays a critical role and sufficient free space is essential for human-scene interactions. This is exactly where many existing synthesized scenes fail: the seemingly correct layouts are often not fit for living. To tackle this, we present Haisor, a human-aware optimization framework for 3D indoor scene arrangement via reinforcement learning, which aims to find an action sequence that automatically optimizes the indoor scene layout. Based on a hierarchical scene graph representation, an optimal action sequence is predicted and performed via Deep Q-Learning with Monte Carlo Tree Search (MCTS), where MCTS is the key component for searching long action sequences over a large action space. Multiple human-aware rewards are designed as our core criteria of human-scene interaction, aiming to identify the next smart action by leveraging reinforcement learning. Our framework is optimized end-to-end, given indoor scenes with part-level furniture layouts that include part mobility information. Furthermore, our methodology is extensible and allows different reward designs to achieve personalized indoor scene synthesis. Extensive experiments demonstrate that our approach optimizes the layout of 3D indoor scenes in a human-aware manner, producing results that are more realistic and plausible than those of state-of-the-art generators, and that it selects superior smart actions, outperforming alternative baselines.
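A compact way to see how Deep Q-Learning and MCTS fit together is a tree search whose leaf evaluation comes from the learned Q-network rather than random rollouts. The sketch below is one generic instantiation under that assumption; step (applying a furniture edit), q_net, and the action set are hypothetical interfaces, and the human-aware rewards are assumed to be folded into q_net's training targets.

```python
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = {}               # action -> child Node
        self.visits, self.value = 0, 0.0

def select_action(root_state, actions, step, q_net, n_sims=200, c=1.4):
    """MCTS over scene-editing actions with Q-network leaf evaluation."""
    root = Node(root_state)
    for _ in range(n_sims):
        node = root
        # Selection: descend by UCB1 while nodes are fully expanded.
        while node.children and len(node.children) == len(actions):
            node = max(node.children.values(), key=lambda ch: (
                ch.value / (ch.visits + 1e-9)
                + c * math.sqrt(math.log(node.visits + 1) / (ch.visits + 1e-9))))
        # Expansion: apply one untried edit (e.g., move or rotate a piece).
        untried = [a for a in actions if a not in node.children]
        if untried:
            a = random.choice(untried)
            node.children[a] = Node(step(node.state, a), parent=node)
            node = node.children[a]
        # Evaluation: the learned Q-network replaces a random rollout.
        value = max(q_net(node.state, a) for a in actions)
        # Backpropagation up to the root.
        while node is not None:
            node.visits += 1
            node.value += value
            node = node.parent
    # Commit to the most-visited action at the root.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```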
Object and Relation Centric Representations for Push Effect Prediction
Pushing is an essential non-prehensile manipulation skill used for tasks ranging from pre-grasp manipulation to scene rearrangement and reasoning about object relations in the scene; as such, pushing actions have been widely studied in robotics. The effective use of pushing actions often requires an
understanding of the dynamics of the manipulated objects and adaptation to the
discrepancies between prediction and reality. For this reason, effect
prediction and parameter estimation with pushing actions have been heavily
investigated in the literature. However, current approaches are limited because
they either model systems with a fixed number of objects or use image-based
representations whose outputs are not very interpretable and quickly accumulate
errors. In this paper, we propose a graph neural network based framework for
effect prediction and parameter estimation of pushing actions by modeling
object relations based on contacts or articulations. Our framework is validated in both real and simulated environments containing differently shaped multi-part
objects connected via different types of joints and objects with different
masses. Our approach enables the robot to predict the effect of a pushing action and adapt to it as it observes the scene. Further, we demonstrate 6D effect prediction for the lever-up action in the context of robot-based hard-disk disassembly.
Comment: Project Page: https://fzaero.github.io/push_learning
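The object- and relation-centric representation maps naturally onto one step of graph message passing: each object part is a node, each contact or joint is an edge, and the per-node outputs are 6D motions under a given push. The sketch below uses tiny random-weight MLPs to stand in for learned parameters; every dimension and interface is an illustrative assumption rather than the paper's model.

```python
import numpy as np

def mlp(sizes, rng):
    # Tiny random-weight MLP standing in for learned parameters.
    Ws = [rng.standard_normal((m, n)) * 0.1 for m, n in zip(sizes, sizes[1:])]
    def f(x):
        for W in Ws[:-1]:
            x = np.tanh(x @ W)
        return x @ Ws[-1]
    return f

def push_effect_step(node_feats, edges, edge_feats, action,
                     rng=np.random.default_rng(0)):
    """One message-passing step of a graph network for push-effect prediction.
    Nodes are object parts; edges encode contacts or articulations; the push
    action is appended to every node. All dimensions are illustrative."""
    n, d_node = node_feats.shape
    d_edge, d_act = edge_feats.shape[1], len(action)
    edge_fn = mlp([2 * d_node + d_edge, 64, 32], rng)   # message function
    node_fn = mlp([d_node + 32 + d_act, 64, 6], rng)    # 6D motion per part
    # Compute a message per edge from sender, receiver, and relation features.
    messages = np.zeros((n, 32))
    for (s, r), e in zip(edges, edge_feats):
        messages[r] += edge_fn(np.concatenate([node_feats[s], node_feats[r], e]))
    # Node update: predict each part's 6D pose change under the push.
    inp = np.concatenate([node_feats, messages, np.tile(action, (n, 1))], axis=1)
    return np.stack([node_fn(x) for x in inp])
```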
Learning Physics-Based Manipulation in Clutter: Combining Image-Based Generalization and Look-Ahead Planning
Physics-based manipulation in clutter involves complex interactions between multiple objects. In this paper, we consider the problem of learning, from interaction in a physics simulator, manipulation skills to solve this multi-step sequential decision-making problem in the real world. Our approach has two key properties: (i) the ability to generalize and transfer manipulation skills (over the type, shape, and number of objects in the scene) using an abstract image-based representation that enables a neural network to learn useful features; and (ii) the ability to perform look-ahead planning in the image space using a physics simulator, which is essential for such multi-step problems. We show, in sets of simulated and real-world experiments (video available at https://youtu.be/EmkUQfyvwkY), that by learning to evaluate actions in an abstract image-based representation of the real world, the robot can generalize and adapt to object shapes in challenging real-world environments.
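The look-ahead component can be read as a short beam search in image space: candidate actions are rolled forward with the simulator and ranked by a learned evaluator, then the first action of the best branch is executed and the process repeats. The sketch below is one plausible instantiation; simulate, score, and the beam-search control flow are assumptions, not the paper's exact procedure.

```python
def lookahead_plan(abstract_image, candidate_actions, simulate, score,
                   depth=2, beam=3):
    """Beam-search look-ahead in an abstract image space.
    `simulate(image, action)` returns the predicted next image (a physics
    rollout); `score(image)` is a learned evaluator. Both are placeholders."""
    # Beam entries: (cumulative score, first action of the branch, image).
    branches = [(0.0, None, abstract_image)]
    for _ in range(depth):
        expanded = []
        for total, first, img in branches:
            for a in candidate_actions:
                nxt = simulate(img, a)          # roll the action forward
                expanded.append(
                    (total + score(nxt), first if first is not None else a, nxt))
        # Keep only the most promising branches to bound the rollout count.
        expanded.sort(key=lambda t: t[0], reverse=True)
        branches = expanded[:beam]
    # Execute the first action of the best branch, then replan (MPC-style).
    return max(branches, key=lambda t: t[0])[1]
```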