
    Learning to Singulate Objects using a Push Proposal Network

    Learning to act in unstructured environments, such as cluttered piles of objects, poses a substantial challenge for manipulation robots. We present a novel neural network-based approach that separates unknown objects in clutter by selecting favourable push actions. Our network is trained from data collected through autonomous interaction of a PR2 robot with randomly organized tabletop scenes. The model is designed to propose meaningful push actions based on over-segmented RGB-D images. We evaluate our approach by singulating up to 8 unknown objects in clutter. We demonstrate that our method enables the robot to perform the task with a high success rate and a low number of required push actions. Our results based on real-world experiments show that our network is able to generalize to novel objects of various sizes and shapes, as well as to arbitrary object configurations. Videos of our experiments can be viewed at http://robotpush.cs.uni-freiburg.de
    Comment: International Symposium on Robotics Research (ISRR) 2017, videos: http://robotpush.cs.uni-freiburg.d

    Mechanical Search: Multi-Step Retrieval of a Target Object Occluded by Clutter

    When operating in unstructured environments such as warehouses, homes, and retail centers, robots are frequently required to interactively search for and retrieve specific objects from cluttered bins, shelves, or tables. Mechanical Search describes the class of tasks where the goal is to locate and extract a known target object. In this paper, we formalize Mechanical Search and study a version where distractor objects are heaped over the target object in a bin. The robot uses an RGBD perception system and control policies to iteratively select, parameterize, and perform one of 3 actions -- push, suction, grasp -- until the target object is extracted, or either a time limit is exceeded, or no high confidence push or grasp is available. We present a study of 5 algorithmic policies for mechanical search, with 15,000 simulated trials and 300 physical trials for heaps ranging from 10 to 20 objects. Results suggest that success can be achieved in this long-horizon task with algorithmic policies in over 95% of instances and that the number of actions required scales approximately linearly with the size of the heap. Code and supplementary material can be found at http://ai.stanford.edu/mech-search.
    Comment: To appear in IEEE International Conference on Robotics and Automation (ICRA), 2019. 9 pages with 4 figure
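The select-and-execute loop with its three stopping conditions (target extracted, time limit exceeded, no high-confidence action left) might be sketched as follows. This is only an illustration and not one of the five policies studied in the paper: `Candidate`, `propose`, `execute`, `max_steps`, and `min_confidence` are hypothetical names, the time limit is modelled as a step budget, and picking the highest-confidence candidate is an assumed selection rule.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    """A hypothetical parameterized action proposal."""
    action: str        # one of "push", "suction", "grasp"
    confidence: float  # assumed to lie in [0, 1]


def mechanical_search(
    propose: Callable[[], List[Candidate]],
    execute: Callable[[Candidate], bool],
    max_steps: int,
    min_confidence: float,
) -> Tuple[str, int]:
    """Run actions until one of the three stopping conditions holds.

    propose() returns candidates for the current scene (assumption).
    execute(c) performs c and returns True once the target is extracted.
    Returns (reason, number_of_actions_executed).
    """
    for step in range(max_steps):
        # Keep only candidates above the confidence threshold.
        confident = [c for c in propose() if c.confidence >= min_confidence]
        if not confident:
            return ("no_confident_action", step)
        # Assumed rule: take the most confident candidate.
        best = max(confident, key=lambda c: c.confidence)
        if execute(best):
            return ("extracted", step + 1)
    # Step budget used up without extracting the target.
    return ("time_limit", max_steps)
```

In a real system `propose` would wrap the perception pipeline and `execute` the robot controller; here they are plain callables so the loop can be exercised with stubs.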

    Rearrangement on Lattices with Pick-n-Swaps: Optimality Structures and Efficient Algorithms

    We propose and study a class of rearrangement problems under a novel pick-n-swap prehensile manipulation model, in which a robotic manipulator, capable of carrying an item and making item swaps, is tasked to sort items stored in lattices of variable dimensions in a time-optimal manner. We systematically analyze the intrinsic optimality structure, which is fairly rich and intriguing, under different levels of item distinguishability (fully labeled, where each item has a unique label, or partially labeled, where multiple items may be of the same type) and different lattice dimensions. Focusing on the most practical setting of one and two dimensions, we develop low polynomial time cycle-following based algorithms that optimally perform rearrangements on 1D lattices under both fully- and partially-labeled settings. On the other hand, we show that rearrangement on 2D and higher dimensional lattices becomes computationally intractable to optimally solve. Despite their NP-hardness, we prove that efficient cycle-following based algorithms remain asymptotically optimal for 2D fully- and partially-labeled settings, in expectation, using the interesting fact that random permutations induce only a small number of cycles. We further improve these algorithms to provide 1.x-optimality when the number of items is small. Simulation studies corroborate the effectiveness of our algorithms.
    Comment: To appear in R:SS 202
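The fact that random permutations induce only a few cycles can be checked numerically. The sketch below is not the paper's rearrangement algorithm; it only decomposes a permutation into cycles and compares the average cycle count of uniformly random permutations with the harmonic number H_n = 1 + 1/2 + ... + 1/n, which is the known expected value. The function names and the encoding (`perm[i]` is the goal slot of the item currently in slot `i`) are assumptions made for illustration.

```python
import random
from typing import List


def cycles_of(perm: List[int]) -> List[List[int]]:
    """Decompose a permutation of 0..n-1 into its cycles."""
    seen = [False] * len(perm)
    cycles = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        cycle = []
        i = start
        # Follow the cycle until it returns to an already visited slot.
        while not seen[i]:
            seen[i] = True
            cycle.append(i)
            i = perm[i]
        cycles.append(cycle)
    return cycles


def harmonic(n: int) -> float:
    """H_n, the expected cycle count of a uniform random permutation."""
    return sum(1.0 / k for k in range(1, n + 1))


def mean_cycle_count(n: int, trials: int, seed: int = 0) -> float:
    """Average number of cycles over `trials` random permutations."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        total += len(cycles_of(perm))
    return total / trials
```

Because H_n grows only logarithmically in n, a procedure that handles one cycle at a time pays a per-cycle overhead that is small relative to the number of items, which is the intuition behind the asymptotic-optimality claim.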

    Master of Science

    Thesis. In this work we consider task-based planning in uncertainty. To make progress in this problem, we propose an end-to-end method that makes progress toward the unification of perception and manipulation. Critical for this unification is the geometric primitive. A geometric primitive is a 3D geometry that can be fit to a single view from a 3D image. Geometric primitives are a consistent structure in many scenes, and by leveraging this, perceptual tasks such as segmentation, localization, and recognition can be solved. Sharing this information between these subroutines also makes the method computationally efficient. Geometric primitives can be used to define a set of actions the robot can use to influence the world. Leveraging the rich 3D information in geometric primitives allows the designer to develop actions with a high chance of success. In this work, we consider a pick-and-place action, parameterized by the object and scene constraints. The design of the perceptual capabilities and actions is independent of the task given to the robot, giving the robot more versatility to complete a range of tasks. With a large number of available actions, the robot needs to select which action the robot performs. We propose a task-specific reward function to determine the next-best action for the robot to complete the task. A key insight into making the action selection tractable is reasoning about the occluded regions of the scene. We propose to not reason about what could be in the occluded regions, but instead to treat the occluded regions as parts of the scene to explore. Defining reward functions that encourage this exploration while balancing trying to solve the given task gives the robot more versatility to perform many different tasks. Reasoning about occlusion in this way also makes actions in the scene more robust to scene uncertainty and increases the computational efficiency of the method overall.
In this work, we show results for segmentation of geometric primitives on real data, and discuss problems with fitting their parameters. While positive segmentation results are shown, there are problems with fitting consistent parameters to the geometric primitives. We also present simulation results showing the action selection process solving a singulation task. We show that our method is able to perform this task in several scenes with varying levels of complexity. We compare against selecting actions at random, and show our method consistently takes fewer actions to solve the scene.

    Hierarchical Policy Learning for Mechanical Search

    Retrieving objects from clutters is a complex task, which requires multiple interactions with the environment until the target object can be extracted. These interactions involve executing action primitives like grasping or pushing as well as setting priorities for the objects to manipulate and the actions to execute. Mechanical Search (MS) is a framework for object retrieval, which uses a heuristic algorithm for pushing and rule-based algorithms for high-level planning. While rule-based policies profit from human intuition in how they work, they usually perform sub-optimally in many cases. Deep reinforcement learning (RL) has shown great performance in complex tasks such as taking decisions through evaluating pixels, which makes it suitable for training policies in the context of object-retrieval. In this work, we first formulate the MS problem in a principled formulation as a hierarchical POMDP. Based on this formulation, we propose a hierarchical policy learning approach for the MS problem. For demonstration, we present two main parameterized sub-policies: a push policy and an action selection policy. When integrated into the hierarchical POMDP's policy, our proposed sub-policies increase the success rate of retrieving the target object from less than 32% to nearly 80%, while reducing the computation time for push actions from multiple seconds to less than 10 milliseconds.
    Comment: ICRA 202