
    Visual 3-D SLAM from UAVs

    The aim of this paper is to present, test and discuss the implementation of Visual SLAM techniques on images taken from Unmanned Aerial Vehicles (UAVs) outdoors, in partially structured environments. Each stage of the process is discussed with the goal of obtaining more accurate localization and mapping from UAV flights. First, the issues related to the visual features of objects in the scene, their distance to the UAV, and the associated image acquisition system and its calibration are evaluated. Further issues concern the image processing techniques, such as interest point detection, the matching procedure and the scaling factor. The whole system has been tested using the COLIBRI mini UAV in partially structured environments. The localization results, evaluated against the GPS information of the flights, show that Visual SLAM delivers reliable localization and mapping, making it suitable for some outdoor applications of flying UAVs.
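    As an illustration of the interest point detection and matching stage mentioned above, the sketch below uses OpenCV's ORB detector with a brute-force matcher. The abstract does not name the paper's actual detector or matcher, so those choices, along with the frame file names, are assumptions.

```python
import cv2

# Two consecutive frames from a UAV flight (hypothetical file names).
img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# Detect interest points and compute descriptors (ORB as a stand-in).
orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Match descriptors between frames; the Hamming norm suits ORB's
# binary descriptors, and cross-checking prunes asymmetric matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

# Keep the strongest correspondences as input to the SLAM update.
good = matches[:200]
print(f"{len(good)} matches retained for localization and mapping")
```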

    Vector extension of monogenic wavelets for geometric representation of color images

    Monogenic wavelets offer a geometric representation of grayscale images through an AM/FM model that makes the coefficients invariant to translations and rotations. The underlying concept of local phase incorporates a fine contour analysis into a coherent, unified framework. Starting from a link with structure tensors, we propose a non-trivial extension of the monogenic framework to vector-valued signals, yielding a non-marginal color monogenic wavelet transform. We also present a practical study of this new wavelet transform in the contexts of sparse representations and invariant analysis, which helps to understand the physical interpretation of the coefficients and validates the interest of our theoretical construction.
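    For background, the grayscale monogenic framework that this work extends is built on the Riesz transform. A standard formulation (following Felsberg and Sommer's monogenic signal; the notation below is not taken from this paper) is:

```latex
% Riesz transform of a grayscale image s, defined in the frequency domain:
\widehat{\mathcal{R}_j s}(\boldsymbol{\omega})
    = -\,\mathrm{i}\,\frac{\omega_j}{\lVert\boldsymbol{\omega}\rVert}\,
      \hat{s}(\boldsymbol{\omega}), \qquad j \in \{1, 2\}

% Monogenic signal and its AM/FM features: local amplitude A,
% local phase \varphi, and local orientation \theta.
s_M = \bigl( s,\ \mathcal{R}_1 s,\ \mathcal{R}_2 s \bigr), \qquad
A = \sqrt{s^2 + (\mathcal{R}_1 s)^2 + (\mathcal{R}_2 s)^2}

\varphi = \operatorname{atan2}\!\Bigl(\sqrt{(\mathcal{R}_1 s)^2 + (\mathcal{R}_2 s)^2},\ s\Bigr),
\qquad
\theta = \operatorname{atan2}\bigl(\mathcal{R}_2 s,\ \mathcal{R}_1 s\bigr)
```

    The amplitude is invariant to translations and rotations, while phase and orientation carry the local geometric structure; preserving these properties for vector-valued (color) signals without processing channels marginally is the crux of the extension described above.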

    Physical Interaction of Autonomous Robots in Complex Environments

    Recent breakthroughs in the fields of computer vision and robotics are firmly changing people's perception of robots. The idea of robots that substitute for humans is now turning into robots that collaborate with them. Service robotics considers robots as personal assistants: it safely places robots in domestic environments in order to facilitate humans' daily life. Industrial robotics is now reconsidering its basic idea of the robot as a worker. Currently, the primary method to guarantee personnel safety in industrial environments is the installation of physical barriers around the working area of robots. The development of new technologies and new algorithms in both the sensor field and the robotic field has led to a new generation of lightweight and collaborative robots. Industrial robotics has therefore leveraged the intrinsic properties of this kind of robot to create a robot co-worker that is able to safely coexist, collaborate and interact inside its workspace with both personnel and objects. This Ph.D. dissertation focuses on the generation of a pipeline for fast object pose estimation and distance computation of moving objects, in both structured and unstructured environments, using RGB-D images. This pipeline outputs the command actions that let the robot complete its main task while simultaneously fulfilling the safe human-robot coexistence behaviour. The proposed pipeline is divided into an object segmentation part, a 6 D.o.F. object pose estimation part, and a real-time collision avoidance part for safe human-robot coexistence. First, the segmentation module finds candidate object clusters in RGB-D images of cluttered scenes using a graph-based image segmentation technique, which generates a cluster of pixels for each object found in the image. The candidate object clusters are then fed as input to the 6 D.o.F. object pose estimation module, which is in charge of estimating both the translation and the orientation in 3D space of each candidate object cluster. The object pose is then employed by the robotic arm to compute a suitable grasping policy. The last module generates a force vector field of the environment surrounding the robot, the objects and the humans. This force vector field drives the robot toward its goal while any potential collision with objects and/or humans is safely avoided. This work has been carried out at Politecnico di Torino, in collaboration with Telecom Italia S.p.A.
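    The force vector field of the last module is in the spirit of classic artificial potential fields. The sketch below illustrates that idea with an attractive pull toward the task goal and a repulsive push away from tracked obstacles; the gains, radii and positions are illustrative assumptions, not the dissertation's actual field construction.

```python
import numpy as np

def attractive_force(robot_pos, goal_pos, gain=0.8):
    """Linear pull toward the task goal (e.g., a grasp pose)."""
    return gain * (goal_pos - robot_pos)

def repulsive_force(robot_pos, obstacle_pos, influence_radius=0.5, gain=1.0):
    """Repulsion that activates once an obstacle (object or human)
    enters the influence radius around the robot."""
    diff = robot_pos - obstacle_pos
    d = np.linalg.norm(diff)
    if d == 0.0 or d >= influence_radius:
        return np.zeros(3)
    magnitude = gain * (1.0 / d - 1.0 / influence_radius) / d**2
    return magnitude * (diff / d)

# Command direction = goal attraction + repulsion from every tracked
# object/human point, re-evaluated each control cycle from RGB-D data.
robot = np.array([0.3, 0.1, 0.4])
goal = np.array([0.6, 0.0, 0.2])
obstacles = [np.array([0.45, 0.05, 0.3])]
force = attractive_force(robot, goal) + sum(repulsive_force(robot, o) for o in obstacles)
print("commanded direction:", force / np.linalg.norm(force))
```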

    Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene

    The goal of this paper is to take a single 2D image of a scene and recover the 3D structure in terms of a small set of factors: a layout representing the enclosing surfaces, as well as a set of objects represented in terms of shape and pose. We propose a convolutional neural network-based approach to predict this representation and benchmark it on a large dataset of indoor scenes. Our experiments evaluate a number of practical design questions, demonstrate that we can infer this representation, and quantitatively and qualitatively demonstrate its merits compared to alternate representations.
    Comment: Project URL with code: https://shubhtuls.github.io/factored3
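    To make the factored representation concrete, here is a toy predictor with one shared image encoder and separate heads for layout, object shape and object pose. The authors' real architecture is available at the project URL above; every size below (layer widths, a coarse 64x64 layout map, a fixed number of objects, 7-D pose as quaternion plus translation) is an illustrative assumption.

```python
import torch
import torch.nn as nn

class FactoredSceneNet(nn.Module):
    """Toy factored predictor: shared encoder, separate heads for the
    scene layout, per-object shape codes, and per-object pose."""
    def __init__(self, feat_dim=256, n_objects=5, shape_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        self.layout_head = nn.Linear(feat_dim, 64 * 64)      # coarse layout map
        self.shape_head = nn.Linear(feat_dim, n_objects * shape_dim)
        self.pose_head = nn.Linear(feat_dim, n_objects * 7)  # quaternion + translation

    def forward(self, image):
        features = self.encoder(image)
        return (self.layout_head(features),
                self.shape_head(features),
                self.pose_head(features))

layout, shapes, poses = FactoredSceneNet()(torch.randn(1, 3, 128, 128))
```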

    Multimodal Computational Attention for Scene Understanding

    Robotic systems have limited computational capacities, so computational attention models are important to focus processing on specific stimuli and thereby allow for complex cognitive processing. For this purpose, we developed auditory and visual attention models that enable robotic platforms to efficiently explore and analyze natural scenes. To allow for attention guidance in human-robot interaction, we use machine learning to integrate the influence of verbal and non-verbal social signals into our models.
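    A common way to combine modality-specific attention models is to fuse their saliency maps on a shared spatial grid. The sketch below uses fixed fusion weights; in the models described above, the modulation by social signals would be learned, and the map shapes and weights here are assumptions.

```python
import numpy as np

def fuse_attention(visual_saliency, auditory_saliency, w_visual, w_audio):
    """Weighted fusion of per-modality saliency maps into one master map;
    the region with the highest value is attended next."""
    fused = w_visual * visual_saliency + w_audio * auditory_saliency
    return fused / (fused.max() + 1e-9)  # normalize for comparability

# Toy maps on a common grid (e.g., azimuth x elevation around the robot).
visual = np.random.rand(60, 80)
auditory = np.random.rand(60, 80)
master = fuse_attention(visual, auditory, w_visual=0.7, w_audio=0.3)
focus = np.unravel_index(np.argmax(master), master.shape)
print("next attended region:", focus)
```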

    Improvised Salient Object Detection and Manipulation

    For salient subject recognition, computer algorithms have heavily relied on scanning images systematically from top-left to bottom-right and applying brute force when attempting to locate objects of interest, which makes the process quite time-consuming. Here, a novel approach offering a simple solution to this problem is discussed. In this paper, we implement an approach to object detection and manipulation through a segmentation map, which helps to desaturate or, in other words, wash out the background of the image. Performance is evaluated using the Jaccard index against the well-known ground-truth target box technique.
    Comment: 7 pages
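    Two of the operations named above, the Jaccard-index evaluation and the background wash-out through a segmentation mask, can be sketched directly; the toy image and masks below are placeholders, not the paper's data.

```python
import numpy as np

def jaccard_index(pred_mask, gt_mask):
    """Intersection over union between the predicted object mask and
    the ground-truth region (both boolean arrays)."""
    intersection = np.logical_and(pred_mask, gt_mask).sum()
    union = np.logical_or(pred_mask, gt_mask).sum()
    return intersection / union if union else 1.0

def desaturate_background(image_rgb, mask):
    """Keep the salient object in color; wash the background to gray."""
    gray = image_rgb.mean(axis=2, keepdims=True).repeat(3, axis=2)
    return np.where(mask[..., None], image_rgb, gray).astype(image_rgb.dtype)

# Toy example: a random image with the "object" near the center.
image = (np.random.rand(100, 100, 3) * 255).astype(np.uint8)
pred = np.zeros((100, 100), dtype=bool); pred[30:70, 30:70] = True
truth = np.zeros((100, 100), dtype=bool); truth[35:75, 35:75] = True
print("Jaccard index:", round(jaccard_index(pred, truth), 3))
washed = desaturate_background(image, pred)
```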

    Towards Visual Localization, Mapping and Moving Objects Tracking by a Mobile Robot: a Geometric and Probabilistic Approach

    In this thesis we give new means for a machine to understand complex and dynamic visual scenes in real time. In particular, we solve the problem of simultaneously reconstructing a certain representation of the world's geometry, the observer's trajectory, and the moving objects' structures and trajectories, with the aid of vision exteroceptive sensors. We proceed by dividing the problem into three main steps. First, we give a solution to the Simultaneous Localization And Mapping (SLAM) problem for monocular vision that is able to perform adequately in the most ill-conditioned situations: those where the observer approaches the scene in a straight line. Second, we incorporate full 3D instantaneous observability by duplicating vision hardware with monocular algorithms. This permits us to avoid some of the inherent drawbacks of classic stereo systems, notably their limited range of 3D observability and the necessity of frequent mechanical calibration. Third, we add detection and tracking of moving objects by making use of this full 3D observability, whose necessity we judge to be almost inevitable. We choose a sparse, point-based representation of both the world and the moving objects in order to alleviate the computational load of the image processing algorithms, which are required to extract the necessary geometrical information from the images. This alleviation is additionally supported by active feature detection and search mechanisms, which focus attention on those image regions with the highest interest. This focusing is achieved by extensively exploiting the current knowledge available about the system (all the mapped information), which we finally highlight as the ultimate key to success.
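    The sparse, filtering-style SLAM described above revolves around a predict-correct loop over a state holding the camera pose and the point landmarks. The generic extended Kalman filter correction below is a textbook formulation, not code from the thesis, and the models h, H and R are assumed given.

```python
import numpy as np

def ekf_update(x, P, z, h, H, R):
    """Generic EKF correction step for one feature observation.
    x, P : state estimate and covariance (camera pose + landmarks)
    z    : measurement (e.g., a feature's pixel coordinates)
    h, H : predicted measurement and its Jacobian w.r.t. x
    R    : measurement noise covariance"""
    innovation = z - h
    S = H @ P @ H.T + R                      # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
    x_new = x + K @ innovation
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new
```

    The innovation covariance S is also what drives the active search mentioned in the abstract: projecting a landmark and its uncertainty into the image yields an ellipse that bounds where the feature needs to be looked for, which is how attention is focused on the image regions with the highest interest.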

    Autonomous Power Line Inspection with Drones via Perception-Aware MPC

    Drones have the potential to revolutionize power line inspection by increasing productivity, reducing inspection time, improving data quality, and eliminating the risks for human operators. Current state-of-the-art systems for power line inspection have two shortcomings: (i) control is decoupled from perception and needs accurate information about the location of the power lines and masts; (ii) obstacle avoidance is decoupled from power line tracking, which results in poor tracking in the vicinity of the power masts and, consequently, in decreased data quality for visual inspection. In this work, we propose a model predictive controller (MPC) that overcomes these limitations by tightly coupling perception and action. Our controller generates commands that maximize the visibility of the power lines while, at the same time, safely avoiding the power masts. For power line detection, we propose a lightweight learning-based detector that is trained only on synthetic data and is able to transfer zero-shot to real-world power line images. We validate our system in simulation and real-world experiments on a mock-up power line infrastructure. We release our code and datasets to the public.
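    A perception-aware MPC of this kind augments the usual tracking objective with a visibility term and an obstacle penalty. The stage cost below is only an illustration of that coupling; the weights, the camera-frame convention and the hinge-style mast penalty are assumptions, not the authors' formulation.

```python
import numpy as np

def stage_cost(state, reference, line_dir_cam, mast_dist,
               w_track=1.0, w_percep=2.0, w_obst=5.0, d_safe=3.0):
    """Illustrative per-step MPC cost: track the reference, keep the
    power line visible, and stay clear of the masts."""
    # Tracking term: squared position error against the reference.
    track = w_track * np.sum((state[:3] - reference[:3]) ** 2)
    # Perception term: reward alignment of the camera's optical axis
    # with the unit bearing to the power line (camera z-forward assumed).
    optical_axis = np.array([0.0, 0.0, 1.0])
    percep = w_percep * (1.0 - float(optical_axis @ line_dir_cam))
    # Obstacle term: hinge penalty once inside the mast safety distance.
    obstacle = w_obst * max(0.0, d_safe - mast_dist) ** 2
    return track + percep + obstacle

cost = stage_cost(np.zeros(6), np.array([1.0, 0.0, 2.0, 0, 0, 0]),
                  line_dir_cam=np.array([0.0, 0.2, 0.98]), mast_dist=2.0)
```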
