Learning to Singulate Objects using a Push Proposal Network
Learning to act in unstructured environments, such as cluttered piles of
objects, poses a substantial challenge for manipulation robots. We present a
novel neural network-based approach that separates unknown objects in clutter
by selecting favourable push actions. Our network is trained from data
collected through autonomous interaction of a PR2 robot with randomly organized
tabletop scenes. The model is designed to propose meaningful push actions based
on over-segmented RGB-D images. We evaluate our approach by singulating up to 8
unknown objects in clutter. We demonstrate that our method enables the robot to
perform the task with a high success rate and a low number of required push
actions. Our results based on real-world experiments show that our network is
able to generalize to novel objects of various sizes and shapes, as well as to
arbitrary object configurations. Videos of our experiments can be viewed at
http://robotpush.cs.uni-freiburg.de
Comment: International Symposium on Robotics Research (ISRR) 2017, videos: http://robotpush.cs.uni-freiburg.d
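The abstract above describes scoring candidate push actions with a learned network over over-segmented RGB-D input and executing the most favourable one. A minimal sketch of that selection step is below; `score_push` is a hypothetical stand-in for the trained network, and the candidate representation is an assumption for illustration only.

```python
import random

def score_push(push, rng):
    """Stand-in for the trained push proposal network: in the paper, a
    network scores candidate pushes derived from over-segmented RGB-D
    images; here the score is random, purely for illustration."""
    return rng.random()

def select_push(candidates, rng):
    """Score every candidate push and return the most favourable one."""
    return max(candidates, key=lambda p: score_push(p, rng))

# Candidate pushes as (start_xy, end_xy) pairs, e.g. sampled along
# segment boundaries of the over-segmented image.
rng = random.Random(0)
candidates = [((0.1, 0.2), (0.3, 0.2)), ((0.5, 0.5), (0.5, 0.7))]
best = select_push(candidates, rng)
```

The robot would execute `best`, re-perceive the scene, and repeat until the objects are singulated.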
Mechanical Search: Multi-Step Retrieval of a Target Object Occluded by Clutter
When operating in unstructured environments such as warehouses, homes, and
retail centers, robots are frequently required to interactively search for and
retrieve specific objects from cluttered bins, shelves, or tables. Mechanical
Search describes the class of tasks where the goal is to locate and extract a
known target object. In this paper, we formalize Mechanical Search and study a
version where distractor objects are heaped over the target object in a bin.
The robot uses an RGBD perception system and control policies to iteratively
select, parameterize, and perform one of 3 actions -- push, suction, grasp --
until the target object is extracted, a time limit is exceeded, or no
high-confidence push or grasp is available. We present a study of 5 algorithmic
policies for mechanical search, with 15,000 simulated trials and 300 physical
trials for heaps ranging from 10 to 20 objects. Results suggest that success
can be achieved in this long-horizon task with algorithmic policies in over 95%
of instances and that the number of actions required scales approximately
linearly with the size of the heap. Code and supplementary material can be
found at http://ai.stanford.edu/mech-search.
Comment: To appear in IEEE International Conference on Robotics and Automation (ICRA), 2019. 9 pages with 4 figures.
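The abstract above specifies an iterative loop: select, parameterize, and perform one of {push, suction, grasp} until the target is extracted, a time limit passes, or no high-confidence action remains. The sketch below illustrates only that control flow; `ToyBin` and `toy_policy` are hypothetical stand-ins, not the paper's perception system or policies.

```python
import time
from dataclasses import dataclass

@dataclass
class ToyBin:
    """Toy stand-in for the perceived bin state: the target becomes
    graspable once every occluding object has been removed."""
    occluders: int

def toy_policy(state):
    """Hypothetical policy: grasp an occluder until none remain,
    then grasp the target, always with confidence 0.9."""
    if state.occluders > 0:
        return "grasp", 0.9
    return "grasp_target", 0.9

def mechanical_search(state, policy, time_limit_s=60.0, min_conf=0.5):
    """Iterate action selection until the target is extracted, the time
    limit is exceeded, or no high-confidence action is available."""
    deadline = time.monotonic() + time_limit_s
    steps = 0
    while time.monotonic() < deadline:
        action, conf = policy(state)
        if action is None or conf < min_conf:
            return False, steps          # no confident action left
        steps += 1
        if action == "grasp_target":
            return True, steps           # target extracted
        state.occluders -= 1             # executing removes one occluder
    return False, steps                  # time limit exceeded

ok, n = mechanical_search(ToyBin(occluders=5), toy_policy)
# ok is True; n is 6 (5 occluder grasps + 1 target grasp)
```

With this toy dynamics the action count grows linearly with the heap size, mirroring the scaling the paper reports.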
Rearrangement on Lattices with Pick-n-Swaps: Optimality Structures and Efficient Algorithms
We propose and study a class of rearrangement problems under a novel
pick-n-swap prehensile manipulation model, in which a robotic manipulator,
capable of carrying an item and making item swaps, is tasked to sort items
stored in lattices of variable dimensions in a time-optimal manner. We
systematically analyze the intrinsic optimality structure, which is fairly rich
and intriguing, under different levels of item distinguishability (fully
labeled, where each item has a unique label, or partially labeled, where
multiple items may be of the same type) and different lattice dimensions.
Focusing on the most practical setting of one and two dimensions, we develop
low polynomial time cycle-following based algorithms that optimally perform
rearrangements on 1D lattices under both fully- and partially-labeled settings.
On the other hand, we show that rearrangement on 2D and higher dimensional
lattices becomes computationally intractable to optimally solve. Despite their
NP-hardness, we prove that efficient cycle-following based algorithms remain
asymptotically optimal for 2D fully- and partially-labeled settings, in
expectation, using the interesting fact that random permutations induce only a
small number of cycles. We further improve these algorithms to provide
1.x-optimality when the number of items is small. Simulation studies
corroborate the effectiveness of our algorithms.
Comment: To appear in R:SS 202
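The cycle-following idea above can be made concrete for the fully-labeled 1D case: the arm picks up one item to open a permutation cycle, then repeatedly swaps the carried item into its goal slot (picking up the occupant) until the cycle closes. The sketch below is an illustrative reconstruction under that pick-n-swap model, not the authors' exact algorithm; the operation count is a simple proxy for their time-optimal cost.

```python
def cycle_following_sort(items):
    """Sort `items` so that slot i holds item i, using pick-n-swap:
    the arm carries one item at a time, and swapping at a slot drops
    the carried item there and picks up the previous occupant. Each
    permutation cycle is resolved by following it until it closes.
    Returns the sorted lattice and the number of pick/swap operations."""
    items = list(items)
    ops = 0
    for start in range(len(items)):
        if items[start] == start:
            continue                    # already in place
        carried = items[start]          # pick: open the cycle here
        items[start] = None
        ops += 1
        while carried is not None:
            dest = carried              # goal slot of the carried item
            carried, items[dest] = items[dest], carried  # swap at dest
            ops += 1
    return items, ops

sorted_items, ops = cycle_following_sort([2, 0, 1, 4, 3])
# cycles (0 2 1) and (3 4): a cycle of length k costs k + 1 operations
```

Note how a cycle of length k costs k + 1 operations (one pick plus k swaps/places), which is why the number of cycles in the permutation governs the total cost, as the abstract's asymptotic-optimality argument exploits.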
Master of Science thesis
In this work we consider task-based planning under uncertainty. To make progress on this problem, we propose an end-to-end method that moves toward the unification of perception and manipulation. Critical to this unification is the geometric primitive: a 3D geometry that can be fit to a single view from a 3D image. Geometric primitives are a consistent structure in many scenes, and by leveraging this, perceptual tasks such as segmentation, localization, and recognition can be solved. Sharing this information between these subroutines also makes the method computationally efficient.
Geometric primitives can also be used to define a set of actions the robot can use to influence the world. Leveraging the rich 3D information in geometric primitives allows the designer to develop actions with a high chance of success. In this work, we consider a pick-and-place action, parameterized by the object and scene constraints. The design of the perceptual capabilities and actions is independent of the task given to the robot, giving the robot more versatility to complete a range of tasks.
With a large number of available actions, the robot needs to select which action to perform. We propose a task-specific reward function to determine the next-best action for completing the task. A key insight that makes action selection tractable is reasoning about the occluded regions of the scene: rather than reasoning about what could be in the occluded regions, we treat them as parts of the scene to explore. Defining reward functions that encourage this exploration while balancing progress on the given task gives the robot more versatility to perform many different tasks. Reasoning about occlusion in this way also makes actions in the scene more robust to scene uncertainty and increases the computational efficiency of the method overall.
In this work, we show results for segmentation of geometric primitives on real data and discuss problems with fitting their parameters. While positive segmentation results are shown, fitting consistent parameters to the geometric primitives remains problematic. We also present simulation results showing the action selection process solving a singulation task. We show that our method is able to perform this task in several scenes with varying levels of complexity. We compare against selecting actions at random and show that our method consistently takes fewer actions to solve the scene.
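The thesis abstract above describes greedy next-best-action selection with a reward that balances task progress against exploring occluded regions. A minimal sketch of that scoring rule follows; `ToyScene`, its queries, the action names, and the weights are all hypothetical placeholders for the thesis's scene model.

```python
from dataclasses import dataclass

@dataclass
class ToyScene:
    """Hypothetical scene model exposing, per action, the expected task
    reward and the amount of occluded region the action would reveal."""
    task: dict
    reveal: dict

    def task_reward(self, action):
        return self.task.get(action, 0.0)

    def revealed_occlusion(self, action):
        return self.reveal.get(action, 0.0)

def next_best_action(actions, scene, w_task=1.0, w_explore=0.5):
    """Score each candidate pick-and-place action by task progress plus
    weighted occlusion uncovered, and return the highest-scoring one."""
    return max(actions, key=lambda a: w_task * scene.task_reward(a)
                                      + w_explore * scene.revealed_occlusion(a))

scene = ToyScene(task={"move_A": 0.2, "move_B": 0.1},
                 reveal={"move_A": 0.0, "move_B": 0.6})
best = next_best_action(["move_A", "move_B"], scene)
# move_B scores 0.1 + 0.5 * 0.6 = 0.4, beating move_A's 0.2
```

Treating occluded regions as territory to reveal, rather than enumerating what they might contain, is what keeps this maximization cheap: each action needs only two scalar queries.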
Hierarchical Policy Learning for Mechanical Search
Retrieving objects from clutter is a complex task that requires multiple
interactions with the environment before the target object can be extracted.
These interactions involve executing action primitives like grasping or pushing
as well as setting priorities for the objects to manipulate and the actions to
execute. Mechanical Search (MS) is a framework for object retrieval, which uses
a heuristic algorithm for pushing and rule-based algorithms for high-level
planning. While rule-based policies benefit from human intuition about how they
work, they often perform sub-optimally. Deep reinforcement
learning (RL) has shown strong performance in complex tasks such as making
decisions directly from pixel observations, which makes it suitable for training
policies in the context of object retrieval. In this work, we first give a
principled formulation of the MS problem as a hierarchical POMDP. Based on
this formulation, we propose a hierarchical policy learning approach for the MS
problem. For demonstration, we present two main parameterized sub-policies: a
push policy and an action selection policy. When integrated into the
hierarchical POMDP's policy, our proposed sub-policies increase the success
rate of retrieving the target object from less than 32% to nearly 80%, while
reducing the computation time for push actions from multiple seconds to less
than 10 milliseconds.
Comment: ICRA 202
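The two-level structure described above (an action selection policy choosing among parameterized sub-policies such as push) can be sketched as a single decision step. Everything here is a hypothetical stand-in for the learned policies, shown only to illustrate the hierarchy.

```python
def hierarchical_step(obs, selector, sub_policies):
    """One step of a hierarchical policy: a high-level action selection
    policy picks a sub-policy (e.g. push or grasp), which then outputs
    the parameters of that action primitive."""
    name = selector(obs)                 # high-level choice of primitive
    params = sub_policies[name](obs)     # low-level parameterization
    return name, params

# Toy stand-ins for the learned selector and sub-policies:
selector = lambda obs: "push" if obs["occlusion"] > 0.5 else "grasp"
subs = {"push":  lambda obs: {"start": (0, 0), "end": (1, 0)},
        "grasp": lambda obs: {"pose": (0.2, 0.3, 0.0)}}
name, params = hierarchical_step({"occlusion": 0.8}, selector, subs)
# heavily occluded scene, so the selector chooses "push"
```

Factoring the policy this way lets each sub-policy be trained and queried independently, which is consistent with the fast per-push inference the abstract reports.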