3,104 research outputs found

    Implicit Shape Model Trees: Recognition of 3-D Indoor Scenes and Prediction of Object Poses for Mobile Robots

    For a mobile robot, we present an approach to recognize scenes in arrangements of objects distributed over cluttered environments. Recognition is made possible by letting the robot alternately search for objects and assign found objects to scenes. Our scene model "Implicit Shape Model (ISM) trees" allows us to solve these two tasks together. For the ISM trees, this article presents novel algorithms for recognizing scenes and predicting the poses of searched objects. We define scenes as sets of objects, where some objects are connected by 3-D spatial relations. In previous work, we recognized scenes using single ISMs. However, these ISMs were prone to false positives. To address this problem, we introduced ISM trees, a hierarchical model that includes multiple ISMs. Through the recognition algorithm it contributes, this article ultimately enables the use of ISM trees in scene recognition. We intend to enable users to generate ISM trees from object arrangements demonstrated by humans. The lack of a suitable algorithm is overcome by the introduction of an ISM tree generation algorithm. In scene recognition, it is usually assumed that image data is already available. However, this is not always the case for robots. For this reason, we combined scene recognition and object search in previous work. However, we did not provide an efficient algorithm to link the two tasks. This article introduces such an algorithm that predicts the poses of searched objects with relations. Experiments show that our overall approach enables robots to find and recognize object arrangements that cannot be perceived from a single viewpoint. Comment: 22 pages, 24 figures; For associated video clips, see https://www.youtube.com/playlist?list=PL3RZ_UQY_uOIfuIJNqdS8wDMjTjOAeOm

    Efficient Belief Propagation for Perception and Manipulation in Clutter

    Autonomous service robots are required to perform tasks in common human indoor environments. To achieve goals associated with these tasks, the robot should continually perceive, reason about its environment, and plan to manipulate objects, which we term goal-directed manipulation. Perception remains the most challenging of these stages, as common indoor environments typically make it hard to recognize objects that occlude and physically interact with one another. Despite recent progress in the field of robot perception, accommodating perceptual uncertainty due to partial observations remains challenging and needs to be addressed to achieve the desired autonomy. In this dissertation, we address the problem of perception under uncertainty for robot manipulation in cluttered environments using generative inference methods. Specifically, we aim to enable robots to perceive partially observable environments by maintaining an approximate probability distribution as a belief over possible scene hypotheses. This belief representation captures uncertainty resulting from inter-object occlusions and physical interactions, which are inherently present in cluttered indoor environments. The research efforts presented in this thesis are directed towards developing appropriate state representations and inference techniques to generate and maintain such a belief over contextually plausible scene states. We focus on providing the following features to generative inference while addressing the challenges due to occlusions: 1) generating and maintaining plausible scene hypotheses, 2) reducing the inference search space, which typically grows exponentially with the number of objects in a scene, 3) preserving scene hypotheses over continual observations. To generate and maintain plausible scene hypotheses, we propose physics-informed scene estimation methods that combine a Newtonian physics engine with a particle-based generative inference framework. 
The proposed variants of our method, with and without a Monte Carlo step, showed promising results in generating and maintaining plausible hypotheses under complete occlusions. We show that estimating such scenarios would not be possible with the commonly adopted 3D registration methods, which lack the notion of physical context that our method provides. To scale up the context-informed inference to accommodate a larger number of objects, we describe a factorization of the scene state into objects and object parts to perform collaborative particle-based inference. This resulted in the Pull Message Passing for Nonparametric Belief Propagation (PMPNBP) algorithm, which caters to the demands of the high-dimensional multimodal nature of cluttered scenes while being computationally tractable. We demonstrate that PMPNBP is orders of magnitude faster than the state-of-the-art Nonparametric Belief Propagation method. Additionally, we show that PMPNBP successfully estimates poses of articulated objects under various simulated occlusion scenarios. To extend our PMPNBP algorithm for tracking object states over continuous observations, we explore ways to propose and preserve hypotheses effectively over time. This resulted in an augmentation-selection method, where hypotheses are drawn from various proposals, followed by the selection of a subset using PMPNBP that explains the current state of the objects. We discuss and analyze our augmentation-selection method in relation to its counterparts in the belief propagation literature. Furthermore, we develop an inference pipeline for pose estimation and tracking of articulated objects in clutter. In this pipeline, the message passing module with the augmentation-selection method is informed by segmentation heatmaps from a trained neural network. 
In our experiments, we show that our proposed pipeline can effectively maintain belief and track articulated objects over a sequence of observations under occlusion. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/163159/1/kdesingh_1.pd

    Human-robot Interaction For Multi-robot Systems

    Designing an effective human-robot interaction paradigm is particularly important for complex tasks such as multi-robot manipulation that require the human and robot to work together in a tightly coupled fashion. Although increasing the number of robots can expand the area that the robots can cover within a bounded period of time, a poor human-robot interface will ultimately compromise the performance of the team of robots. However, introducing a human operator to the team of robots does not automatically improve performance, due to the difficulty of teleoperating mobile robots with manipulators. The human operator’s concentration is divided not only among multiple robots but also between controlling each robot’s base and arm. This complexity substantially increases the potential neglect time, since the operator’s inability to effectively attend to each robot during a critical phase of the task leads to a significant degradation in task performance. There are several proven paradigms for increasing the efficacy of human-robot interaction: 1) multimodal interfaces in which the user controls the robots using voice and gesture; 2) configurable interfaces which allow the user to create new commands by demonstrating them; 3) adaptive interfaces which reduce the operator’s workload as necessary by increasing robot autonomy. This dissertation presents an evaluation of the relative benefits of different types of user interfaces for multi-robot systems composed of robots with wheeled bases and three-degree-of-freedom arms. It describes a design for constructing low-cost multi-robot manipulation systems from off-the-shelf parts. User expertise was measured along three axes (navigation, manipulation, and coordination), and participants who performed above threshold on two out of three dimensions on a calibration task were rated as expert. 
Our experiments reveal that the relative expertise of the user was the key determinant of the best performing interface paradigm for that user, indicating that good user modeling is essential for designing a human-robot interaction system that will be used for an extended period of time. The contributions of the dissertation include: 1) a model for detecting operator distraction from robot motion trajectories; 2) adjustable autonomy paradigms for reducing operator workload; 3) a method for creating coordinated multi-robot behaviors from demonstrations with a single robot; 4) a user modeling approach for identifying expert-novice differences from short teleoperation traces.

    Spatial language driven robot

    This dissertation investigates methods that enable a robot to interact with humans using spatial language. It proposes a prototype human-robot interaction system that uses spatial language and runs on an autonomous robot. The system comprises two complementary tasks. The first is to control the robot through natural spatial language so that it finds and fetches a target object. The second is to generate a natural spatial language description of a target object in the robot's working environment. The first task is called spatial language grounding and the second spatial language generation. Both are end-to-end processes, meaning that the system determines its output only from the human's natural language command during the interaction and the raw perception data collected from the environment. Furniture recognizers are designed for the robot to perceive the environment during the tasks. A hierarchical system is designed to translate human spatial language into a symbolic grounding model and then into robot actions. To reduce ambiguity in the interaction, a human demonstration system is designed to collect the human user's spatial concepts for building the robot behavior policies under different grounding models. A language generation system trained on a corpus of real human spatial language is proposed to automatically compose spatial descriptions of the location of a target object. All the modules in the system are evaluated in the physical environment and in a 3D robot simulator developed on ROS and GAZEBO. Includes bibliographical references.