
    Integrating Vision and Physical Interaction for Discovery, Segmentation and Grasping of Unknown Objects

    In this work, image-processing methods and the ability of humanoid robots to physically interact with their environment are used in close interplay to identify unknown objects, separate them from the background and from other objects, and ultimately grasp them. In the course of this interactive exploration, properties of the object such as its appearance and shape are also determined.
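A rough sketch of the interactive-exploration idea (not the thesis's actual code; the function name and the 0.01 m threshold are illustrative assumptions): given point clouds captured before and after the robot pushes a candidate object, the points that moved hint at the object's extent and separate it from the static background.

```python
from scipy.spatial import cKDTree

def moved_points(cloud_before, cloud_after, threshold=0.01):
    """Return points of the post-push cloud with no close counterpart
    in the pre-push cloud, i.e. candidate points of the pushed object.

    cloud_before, cloud_after: (N, 3) arrays in the same world frame.
    threshold: distance in metres beyond which a point counts as moved.
    """
    dists, _ = cKDTree(cloud_before).query(cloud_after, k=1)
    return cloud_after[dists > threshold]
```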

    Autonomous vision-guided bi-manual grasping and manipulation

    This paper describes the implementation, demonstration and evaluation of a variety of autonomous, vision-guided manipulation capabilities using a dual-arm Baxter robot. Initially, symmetric coordinated bi-manual manipulation based on a kinematic tracking algorithm was implemented on the robot to enable a master-slave manipulation system. We demonstrate the efficacy of this approach with a human-robot collaboration experiment, in which a human operator moves the master arm along arbitrary trajectories and the slave arm automatically follows while maintaining a constant relative pose between the two end-effectors. This concept was then extended to perform dual-arm manipulation without human intervention. To this end, an image-based visual servoing scheme was developed to control the motion of the arms and position them at desired grasp locations. We then combined this with a dynamic position controller to move the grasped object along a prescribed trajectory using both arms. The presented approach has been validated by performing numerous symmetric and asymmetric bi-manual manipulations under different conditions. Our experiments demonstrated an 80% success rate on symmetric dual-arm manipulation tasks and a 73% success rate on asymmetric dual-arm manipulation tasks.
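As a minimal sketch of the master-slave coordination described above (assuming 4x4 homogeneous pose matrices from the robot's kinematics; this is not the paper's implementation), the slave target is simply the master pose composed with the relative transform recorded when coordination starts:

```python
import numpy as np

def slave_target_pose(T_world_master, T_master_slave):
    """Slave end-effector target that keeps a constant relative pose
    to the master; all poses are 4x4 homogeneous transforms."""
    return T_world_master @ T_master_slave

# Record the relative pose once when coordinated motion begins ...
# T_master_slave = np.linalg.inv(T_world_master_0) @ T_world_slave_0
# ... then command slave_target_pose(T_world_master_t, T_master_slave)
# at every control cycle while the master moves.
```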

    Seeing Behind The Scene: Using Symmetry To Reason About Objects in Cluttered Environments

    Rapid advances in robotic technology are bringing robots out of the controlled environments of assembly lines and factories into the unstructured and unpredictable real-life workspaces of human beings. One of the prerequisites for operating in such environments is the ability to grasp previously unobserved physical objects. To achieve this, individual objects have to be delineated from the rest of the environment and their shape properties estimated from incomplete observations of the scene. This remains a challenging task due to the lack of prior information about the shape and pose of the object as well as occlusions in cluttered scenes. We attempt to solve this problem by utilizing the powerful concept of symmetry. Symmetry is ubiquitous in both natural and man-made environments. It reveals redundancies in the structure of the world around us and thus can be used in a variety of visual processing tasks. In this thesis we propose a complete pipeline for detecting symmetric objects and recovering their rotational and reflectional symmetries from 3D reconstructions of natural scenes. We begin by obtaining a multiple-view 3D point cloud of the scene using the Kinect Fusion algorithm. Additionally, a voxelized occupancy map of the scene is extracted in order to reason about occlusions. We propose two classes of algorithms for symmetry detection: curve-based and surface-based. The curve-based algorithm relies on extracting and matching surface normal edge curves in the point cloud. The more efficient surface-based algorithm works by fitting symmetry axes/planes to the geometry of the smooth surfaces of the scene. In order to segment the objects, we introduce a segmentation approach that uses symmetry as a global grouping principle: it extracts the points of the scene that are consistent with a given symmetry candidate. To evaluate the performance of our symmetry detection and segmentation algorithms, we construct a dataset of cluttered tabletop scenes with ground-truth object masks and corresponding symmetries. Finally, we demonstrate how our pipeline can be used by a mobile robot to detect and grasp objects in a household scenario.
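To make the symmetry-as-grouping idea concrete, here is a hedged sketch (not the thesis pipeline) of scoring a candidate reflectional symmetry plane: reflect the cloud across the plane and measure how well the reflection lands back on the original surface.

```python
import numpy as np
from scipy.spatial import cKDTree

def symmetry_score(points, plane_point, plane_normal):
    """Mean nearest-neighbour residual between the point cloud and its
    reflection across the candidate plane (lower = more symmetric)."""
    n = plane_normal / np.linalg.norm(plane_normal)
    d = (points - plane_point) @ n           # signed distance to plane
    reflected = points - 2.0 * np.outer(d, n)
    dists, _ = cKDTree(points).query(reflected, k=1)
    return dists.mean()
```

Points consistent with the best-scoring plane could then serve as the global grouping cue for segmentation, in the spirit of the approach described above.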

    Efficient 3D Segmentation, Registration and Mapping for Mobile Robots

    Sometimes simple is better! For certain situations and tasks, simple but robust methods can achieve the same or better results in the same or less time than related sophisticated approaches. In the context of robots operating in real-world environments, key challenges are perceiving objects of interest and obstacles as well as building maps of the environment and localizing therein. The goal of this thesis is to carefully analyze such problem formulations, to deduce valid assumptions and simplifications, and to develop simple solutions that are both robust and fast. All approaches make use of sensors capturing 3D information, such as consumer RGB-D cameras. Comparative evaluations show the performance of the developed approaches. For identifying objects and regions of interest in manipulation tasks, a real-time object segmentation pipeline is proposed. It exploits several common assumptions of manipulation tasks, such as objects resting on horizontal support surfaces (and being well separated). It achieves real-time performance by using particularly efficient approximations in the individual processing steps, subsampling the input data where possible, and processing only relevant subsets of the data. The resulting pipeline segments 3D input data at up to 30 Hz. In order to obtain complete segmentations of the 3D input data, a second pipeline is proposed that approximates the sampled surface, smooths the underlying data, and segments the smoothed surface into coherent regions belonging to the same geometric primitive. It uses different primitive models and can reliably segment input data into planes, cylinders and spheres. A thorough comparative evaluation shows state-of-the-art performance while computing such segmentations in near real-time. The second part of the thesis addresses the registration of 3D input data, i.e., consistently aligning input captured from different view poses. Several methods are presented for different types of input data. For the particular application of mapping with micro aerial vehicles, where the 3D input data is particularly sparse, a pipeline is proposed that uses the same approximate surface reconstruction to exploit the measurement topology, together with a surface-to-surface registration algorithm that robustly aligns the data. Optimization of the resulting graph of determined view poses then yields globally consistent 3D maps. For sequences of RGB-D data, this pipeline is extended to include additional subsampling steps and an initial alignment of the data in local windows of the pose graph. In both cases, comparative evaluations show a robust and fast alignment of the input data.
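The tabletop assumption this segmentation pipeline exploits can be illustrated with a small RANSAC sketch (illustrative only; the thresholds and the up-axis are assumptions, not the thesis's parameters): fit a dominant horizontal plane, after which object candidates are simply the points above it.

```python
import numpy as np

def horizontal_plane_ransac(points, iters=200, inlier_thresh=0.01,
                            up=(0.0, 0.0, 1.0), max_tilt_deg=10.0):
    """Fit a dominant horizontal plane with RANSAC.

    points: (N, 3) array; returns (plane_point, unit_normal) or None.
    Candidate planes tilted more than max_tilt_deg from 'up' are
    rejected, encoding the horizontal-support-surface assumption.
    """
    rng = np.random.default_rng(0)
    up = np.asarray(up)
    cos_max = np.cos(np.radians(max_tilt_deg))
    best_count, best = 0, None
    for _ in range(iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-9:               # degenerate (collinear) sample
            continue
        n /= norm
        if abs(n @ up) < cos_max:     # not horizontal enough
            continue
        count = np.sum(np.abs((points - p0) @ n) < inlier_thresh)
        if count > best_count:
            best_count, best = count, (p0, n)
    return best
```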

    Egocentric Perception of Hands and Its Applications


    Combining Perception and Knowledge for Service Robotics

    As the deployment of robots is shifting away from industrial settings towards public and private sectors, robots will have to be equipped with enough knowledge to perceive, comprehend and act skillfully in their new working environments. Unlike assembly lines, which are characterized by a large degree of controlled environment variables, robots active in shopping stores, museums or households will have to perform open-ended tasks and thus react to unforeseen events, self-monitor their activities, detect failures, recover from them, and also learn and continuously update their knowledge. In this thesis we present a set of tools and algorithms for the acquisition and interpretation of, and reasoning about, environment models, which enable robots to act flexibly and skillfully in the aforementioned environments. In particular, our contributions beyond the state of the art cover the following four topics: a) semantic object maps, symbolic representations of indoor environments that the robot can query for information; b) two algorithms for the interactive segmentation of objects of daily use, which enable robots to recognize and grasp objects more robustly; c) an image point feature-based system for large-scale object recognition; and finally, d) a system that combines statistical and logical knowledge for household domains and is able to answer queries such as "Which objects are currently missing on a breakfast table?". Common to all contributions is that they are knowledge-enabled: they either use robot knowledge bases or ground knowledge structures into the robot's internal structures, such as perception streams. Further, in all four cases we exploit the tight interplay between the robot's perceptual, reasoning and action skills, which we believe is the key enabler for robots to act in unstructured environments. Most of the theoretical contributions of this thesis have also been implemented on the TUM-James and TUM-Rosie robots and demonstrated to spectators by having the robots perform various household chores. With those demonstrations we thoroughly validated the properties of the developed systems and showed that such tasks cannot be implemented without a knowledge-enabled backbone.
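As a toy reduction of the breakfast-table query (purely illustrative; the thesis combines statistical and logical knowledge, whereas this sketch shows only the logical set-difference shell, with made-up object names):

```python
# Hypothetical knowledge base: expected objects per meal context.
EXPECTED = {"breakfast": {"plate", "mug", "knife", "spoon", "cereal_box", "milk"}}

def missing_objects(context, perceived):
    """Answer 'which objects are currently missing?' as the difference
    between the knowledge base's expected set and the perceived set."""
    return EXPECTED[context] - set(perceived)

print(missing_objects("breakfast", ["plate", "mug", "knife"]))
# -> {'spoon', 'cereal_box', 'milk'} (set order may vary)
```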

    Algorithms and evaluation for object detection and tracking in computer vision

    Vision-based object detection and tracking, especially for video surveillance applications, is studied from algorithms to performance evaluation. This dissertation is composed of four topics: (1) Background Modeling and Detection, (2) Performance Evaluation of Sensitive Target Detection, (3) Multi-view Multi-target Multi-hypothesis Segmentation and Tracking of People, and (4) A Fine-Structure Image/Video Quality Measure. First, we present a real-time algorithm for foreground-background segmentation. It allows us to capture structural background variation due to periodic-like motion over a long period of time under limited memory. Our codebook-based representation is efficient in memory and speed compared with other background modeling techniques. Our method can handle scenes containing moving backgrounds or illumination variations, and it achieves robust detection for different types of videos. In addition to the basic algorithm, three features improving the algorithm are presented: automatic parameter estimation, layered modeling/detection, and adaptive codebook updating. Second, we introduce a performance evaluation methodology called Perturbation Detection Rate (PDR) analysis for measuring the performance of foreground-background segmentation. It does not require foreground targets or knowledge of foreground distributions; it measures the sensitivity of a background subtraction algorithm in detecting possible low-contrast targets against the background as a function of contrast. We compare four background subtraction algorithms using this methodology. Third, a multi-view multi-hypothesis approach to segmenting and tracking multiple persons on a ground plane is proposed. The tracking state space is the set of ground points of the people being tracked. During tracking, several iterations of segmentation are performed using information from human appearance models and ground-plane homography. Two innovations are made in this chapter: (1) to locate the ground position of a person more precisely, all center vertical axes of the person across views are mapped to the top-view plane to find their intersection point; (2) to tackle the explosive state space due to multiple targets and views, iterative segmentation-searching is incorporated into a particle filtering framework. By searching for people's ground-point locations from segmentations, a small set of good particles can be identified, resulting in low computational cost. In addition, even if all particles are away from the true ground point, some of them move towards it through the iterated process as long as they start nearby. Finally, an objective no-reference measure is presented to assess fine-structure image/video quality. The proposed measure, which uses local statistics, reflects image degradation well in terms of noise and blur.
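A deliberately simplified stand-in for the codebook matching test (the actual codebook model also tracks brightness bounds, color distortion and temporal statistics per codeword; the function name and tolerance here are assumptions):

```python
import numpy as np

def is_background(codewords, pixel, color_tol=10.0):
    """Simplified codebook test for one pixel: background if the pixel
    lies within color_tol of any stored codeword mean.

    codewords: (K, 3) array of RGB codeword means learned for this pixel.
    pixel:     (3,)  RGB value of the current frame at this location.
    """
    if len(codewords) == 0:
        return False
    dists = np.linalg.norm(codewords - np.asarray(pixel), axis=1)
    return bool(dists.min() <= color_tol)
```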

    Cylinders extraction in non-oriented point clouds as a clustering problem

    Finding geometric primitives in 3D point clouds is a fundamental task in many engineering applications such as robotics, autonomous vehicles and automated industrial inspection. Among all solid shapes, cylinders are frequently found in a variety of scenes comprising natural or man-made objects. Despite their ubiquitous presence, automated extraction and fitting can become challenging if performed "in the wild", when the number of primitives is unknown or the point cloud is noisy and not oriented. In this paper we pose the problem of extracting multiple cylinders in a scene by means of a Game-Theoretic inlier selection process exploiting the geometric relations between pairs of axis candidates. First, we formulate the similarity between two possible cylinders by considering the rigid motion aligning the two axes to the same line. This motion is represented with a unitary dual quaternion, so that the distance between two cylinders is induced by the length of the shortest geodesic path in SE(3). Then, a Game-Theoretic process exploits this similarity function to extract sets of primitives maximizing their inner mutual consensus. The outcome of the evolutionary process is a probability distribution over the sets of candidates (i.e., axes), which in turn is used to directly estimate the final cylinder parameters. An extensive experimental section shows that the proposed algorithm offers high resilience to noise, since the process inherently discards inconsistent data. Compared to other methods, it does not need point normals and does not require fine tuning of multiple parameters.
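Evolutionary inlier selection of this kind is commonly realized with discrete replicator dynamics. The sketch below (an assumption about the general technique, not the paper's exact code) evolves a distribution over candidates given a precomputed non-negative symmetric payoff matrix A of pairwise axis similarities, which is where the dual-quaternion distance would enter; candidates retaining non-negligible mass form the mutually consistent set.

```python
import numpy as np

def replicator_dynamics(A, iters=500, tol=1e-8):
    """Evolve a distribution over candidates under discrete replicator
    dynamics x_i <- x_i * (A x)_i / (x^T A x).

    A: (N, N) non-negative symmetric payoff (similarity) matrix.
    Returns the limit distribution; its support marks the selected inliers.
    """
    x = np.full(len(A), 1.0 / len(A))   # start from the uniform distribution
    for _ in range(iters):
        Ax = A @ x
        x_new = x * Ax / (x @ Ax)
        if np.linalg.norm(x_new - x, 1) < tol:
            return x_new
        x = x_new
    return x
```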