
    Visual 3-D SLAM from UAVs

    The aim of this paper is to present, test and discuss the implementation of Visual SLAM techniques on images taken from Unmanned Aerial Vehicles (UAVs) outdoors, in partially structured environments. Each stage of the process is discussed with the goal of obtaining more accurate localization and mapping from UAV flights. First, the issues related to the visual features of objects in the scene, their distance to the UAV, and the associated image acquisition system and its calibration are evaluated. Further issues concern the image processing techniques, such as interest point detection, the matching procedure and the scaling factor. The whole system has been tested using the COLIBRI mini UAV in partially structured environments. The localization results, evaluated against the GPS information of the flights, show that Visual SLAM delivers reliable localization and mapping, making it suitable for some outdoor applications of flying UAVs.
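    As an illustration of the interest point detection and matching stage mentioned above, the sketch below uses OpenCV's ORB detector with a brute-force matcher. The abstract does not name the paper's actual detector or matcher, so those choices, along with the frame file names, are assumptions.

```python
import cv2

# Two consecutive frames from a UAV flight (hypothetical file names).
img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# Detect interest points and compute descriptors (ORB as a stand-in).
orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Match descriptors between frames; the Hamming norm suits ORB's
# binary descriptors, and cross-checking prunes asymmetric matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

# Keep the strongest correspondences as input to the SLAM update.
good = matches[:200]
print(f"{len(good)} matches retained for localization and mapping")
```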

    Vector extension of monogenic wavelets for geometric representation of color images

    Monogenic wavelets offer a geometric representation of grayscale images through an AM/FM model that makes the coefficients invariant to translations and rotations. The underlying concept of local phase incorporates a fine contour analysis into a coherent, unified framework. Starting from a link with structure tensors, we propose a non-trivial extension of the monogenic framework to vector-valued signals, yielding a non-marginal color monogenic wavelet transform. We also present a practical study of this new wavelet transform in the contexts of sparse representations and invariant analysis, which helps to understand the physical interpretation of the coefficients and validates the interest of our theoretical construction.
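    For background, the grayscale monogenic framework that this work extends is built on the Riesz transform. A standard formulation (following Felsberg and Sommer's monogenic signal; the notation below is not taken from this paper) is:

```latex
% Riesz transform of a grayscale image s, defined in the frequency domain:
\widehat{\mathcal{R}_j s}(\boldsymbol{\omega})
    = -\,\mathrm{i}\,\frac{\omega_j}{\lVert\boldsymbol{\omega}\rVert}\,
      \hat{s}(\boldsymbol{\omega}), \qquad j \in \{1, 2\}

% Monogenic signal and its AM/FM features: local amplitude A,
% local phase \varphi, and local orientation \theta.
s_M = \bigl( s,\ \mathcal{R}_1 s,\ \mathcal{R}_2 s \bigr), \qquad
A = \sqrt{s^2 + (\mathcal{R}_1 s)^2 + (\mathcal{R}_2 s)^2}

\varphi = \operatorname{atan2}\!\Bigl(\sqrt{(\mathcal{R}_1 s)^2 + (\mathcal{R}_2 s)^2},\ s\Bigr),
\qquad
\theta = \operatorname{atan2}\bigl(\mathcal{R}_2 s,\ \mathcal{R}_1 s\bigr)
```

    The amplitude is invariant to translations and rotations, while phase and orientation carry the local geometric structure; preserving these properties for vector-valued (color) signals without processing channels marginally is the crux of the extension described above.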

    Physical Interaction of Autonomous Robots in Complex Environments

    Recent breakthroughs in the fields of computer vision and robotics are firmly changing people's perception of robots. The idea of robots that substitute for humans is now turning into robots that collaborate with them. Service robotics considers robots as personal assistants: it safely places robots in domestic environments in order to facilitate humans' daily life. Industrial robotics is now reconsidering its basic idea of the robot as a worker. Currently, the primary method to guarantee personnel safety in industrial environments is the installation of physical barriers around the working area of robots. The development of new technologies and new algorithms in both the sensor field and the robotic field has led to a new generation of lightweight and collaborative robots. Industrial robotics has therefore leveraged the intrinsic properties of this kind of robot to create a robot co-worker that is able to safely coexist, collaborate and interact inside its workspace with both personnel and objects. This Ph.D. dissertation focuses on the generation of a pipeline for fast object pose estimation and distance computation of moving objects, in both structured and unstructured environments, using RGB-D images. This pipeline outputs the command actions that let the robot complete its main task while simultaneously fulfilling the safe human-robot coexistence behaviour. The proposed pipeline is divided into an object segmentation part, a 6 D.o.F. object pose estimation part, and a real-time collision avoidance part for safe human-robot coexistence. First, the segmentation module finds candidate object clusters in RGB-D images of cluttered scenes using a graph-based image segmentation technique, which generates a cluster of pixels for each object found in the image. The candidate object clusters are then fed as input to the 6 D.o.F. object pose estimation module, which is in charge of estimating both the translation and the orientation in 3D space of each candidate object cluster. The object pose is then employed by the robotic arm to compute a suitable grasping policy. The last module generates a force vector field of the environment surrounding the robot, the objects and the humans. This force vector field drives the robot toward its goal while any potential collision with objects and/or humans is safely avoided. This work has been carried out at Politecnico di Torino, in collaboration with Telecom Italia S.p.A.
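    The force vector field of the last module is in the spirit of classic artificial potential fields. The sketch below illustrates that idea with an attractive pull toward the task goal and a repulsive push away from tracked obstacles; the gains, radii and positions are illustrative assumptions, not the dissertation's actual field construction.

```python
import numpy as np

def attractive_force(robot_pos, goal_pos, gain=0.8):
    """Linear pull toward the task goal (e.g., a grasp pose)."""
    return gain * (goal_pos - robot_pos)

def repulsive_force(robot_pos, obstacle_pos, influence_radius=0.5, gain=1.0):
    """Repulsion that activates once an obstacle (object or human)
    enters the influence radius around the robot."""
    diff = robot_pos - obstacle_pos
    d = np.linalg.norm(diff)
    if d == 0.0 or d >= influence_radius:
        return np.zeros(3)
    magnitude = gain * (1.0 / d - 1.0 / influence_radius) / d**2
    return magnitude * (diff / d)

# Command direction = goal attraction + repulsion from every tracked
# object/human point, re-evaluated each control cycle from RGB-D data.
robot = np.array([0.3, 0.1, 0.4])
goal = np.array([0.6, 0.0, 0.2])
obstacles = [np.array([0.45, 0.05, 0.3])]
force = attractive_force(robot, goal) + sum(repulsive_force(robot, o) for o in obstacles)
print("commanded direction:", force / np.linalg.norm(force))
```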

    Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene

    The goal of this paper is to take a single 2D image of a scene and recover the 3D structure in terms of a small set of factors: a layout representing the enclosing surfaces, as well as a set of objects represented in terms of shape and pose. We propose a convolutional neural network-based approach to predict this representation and benchmark it on a large dataset of indoor scenes. Our experiments evaluate a number of practical design questions, demonstrate that we can infer this representation, and quantitatively and qualitatively demonstrate its merits compared to alternate representations.
    Comment: Project URL with code: https://shubhtuls.github.io/factored3
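    To make the factored representation concrete, here is a toy predictor with one shared image encoder and separate heads for layout, object shape and object pose. The authors' real architecture is available at the project URL above; every size below (layer widths, a coarse 64x64 layout map, a fixed number of objects, 7-D pose as quaternion plus translation) is an illustrative assumption.

```python
import torch
import torch.nn as nn

class FactoredSceneNet(nn.Module):
    """Toy factored predictor: shared encoder, separate heads for the
    scene layout, per-object shape codes, and per-object pose."""
    def __init__(self, feat_dim=256, n_objects=5, shape_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        self.layout_head = nn.Linear(feat_dim, 64 * 64)      # coarse layout map
        self.shape_head = nn.Linear(feat_dim, n_objects * shape_dim)
        self.pose_head = nn.Linear(feat_dim, n_objects * 7)  # quaternion + translation

    def forward(self, image):
        features = self.encoder(image)
        return (self.layout_head(features),
                self.shape_head(features),
                self.pose_head(features))

layout, shapes, poses = FactoredSceneNet()(torch.randn(1, 3, 128, 128))
```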

    Multimodal Computational Attention for Scene Understanding

    Robotic systems have limited computational capacities, so computational attention models are important to focus processing on specific stimuli and thereby allow for complex cognitive processing. For this purpose, we developed auditory and visual attention models that enable robotic platforms to efficiently explore and analyze natural scenes. To allow for attention guidance in human-robot interaction, we use machine learning to integrate the influence of verbal and non-verbal social signals into our models.
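    A common way to combine modality-specific attention models is to fuse their saliency maps on a shared spatial grid. The sketch below uses fixed fusion weights; in the models described above, the modulation by social signals would be learned, and the map shapes and weights here are assumptions.

```python
import numpy as np

def fuse_attention(visual_saliency, auditory_saliency, w_visual, w_audio):
    """Weighted fusion of per-modality saliency maps into one master map;
    the region with the highest value is attended next."""
    fused = w_visual * visual_saliency + w_audio * auditory_saliency
    return fused / (fused.max() + 1e-9)  # normalize for comparability

# Toy maps on a common grid (e.g., azimuth x elevation around the robot).
visual = np.random.rand(60, 80)
auditory = np.random.rand(60, 80)
master = fuse_attention(visual, auditory, w_visual=0.7, w_audio=0.3)
focus = np.unravel_index(np.argmax(master), master.shape)
print("next attended region:", focus)
```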

    Improvised Salient Object Detection and Manipulation

    For salient subject recognition, computer algorithms have heavily relied on scanning images systematically from top-left to bottom-right and applying brute force when attempting to locate objects of interest, which makes the process quite time-consuming. Here, a novel approach offering a simple solution to this problem is discussed. In this paper, we implement an approach to object detection and manipulation through a segmentation map, which helps to desaturate or, in other words, wash out the background of the image. Performance is evaluated using the Jaccard index against the well-known ground-truth target box technique.
    Comment: 7 pages
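    Two of the operations named above, the Jaccard-index evaluation and the background wash-out through a segmentation mask, can be sketched directly; the toy image and masks below are placeholders, not the paper's data.

```python
import numpy as np

def jaccard_index(pred_mask, gt_mask):
    """Intersection over union between the predicted object mask and
    the ground-truth region (both boolean arrays)."""
    intersection = np.logical_and(pred_mask, gt_mask).sum()
    union = np.logical_or(pred_mask, gt_mask).sum()
    return intersection / union if union else 1.0

def desaturate_background(image_rgb, mask):
    """Keep the salient object in color; wash the background to gray."""
    gray = image_rgb.mean(axis=2, keepdims=True).repeat(3, axis=2)
    return np.where(mask[..., None], image_rgb, gray).astype(image_rgb.dtype)

# Toy example: a random image with the "object" near the center.
image = (np.random.rand(100, 100, 3) * 255).astype(np.uint8)
pred = np.zeros((100, 100), dtype=bool); pred[30:70, 30:70] = True
truth = np.zeros((100, 100), dtype=bool); truth[35:75, 35:75] = True
print("Jaccard index:", round(jaccard_index(pred, truth), 3))
washed = desaturate_background(image, pred)
```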

    Towards Visual Localization, Mapping and Moving Objects Tracking by a Mobile Robot: a Geometric and Probabilistic Approach

    In this thesis we give new means for a machine to understand complex and dynamic visual scenes in real time. In particular, we solve the problem of simultaneously reconstructing a certain representation of the world's geometry, the observer's trajectory, and the moving objects' structures and trajectories, with the aid of vision exteroceptive sensors. We proceed by dividing the problem into three main steps. First, we give a solution to the Simultaneous Localization And Mapping (SLAM) problem for monocular vision that is able to perform adequately in the most ill-conditioned situations: those where the observer approaches the scene in a straight line. Second, we incorporate full 3D instantaneous observability by duplicating vision hardware with monocular algorithms. This permits us to avoid some of the inherent drawbacks of classic stereo systems, notably their limited range of 3D observability and the necessity of frequent mechanical calibration. Third, we add detection and tracking of moving objects by making use of this full 3D observability, whose necessity we judge to be almost inevitable. We choose a sparse, point-based representation of both the world and the moving objects in order to alleviate the computational load of the image processing algorithms, which are required to extract the necessary geometrical information from the images. This alleviation is additionally supported by active feature detection and search mechanisms, which focus attention on those image regions with the highest interest. This focusing is achieved by extensively exploiting the current knowledge available about the system (all the mapped information), which we finally highlight as the ultimate key to success.
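    The sparse, filtering-style SLAM described above revolves around a predict-correct loop over a state holding the camera pose and the point landmarks. The generic extended Kalman filter correction below is a textbook formulation, not code from the thesis, and the models h, H and R are assumed given.

```python
import numpy as np

def ekf_update(x, P, z, h, H, R):
    """Generic EKF correction step for one feature observation.
    x, P : state estimate and covariance (camera pose + landmarks)
    z    : measurement (e.g., a feature's pixel coordinates)
    h, H : predicted measurement and its Jacobian w.r.t. x
    R    : measurement noise covariance"""
    innovation = z - h
    S = H @ P @ H.T + R                      # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
    x_new = x + K @ innovation
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new
```

    The innovation covariance S is also what drives the active search mentioned in the abstract: projecting a landmark and its uncertainty into the image yields an ellipse that bounds where the feature needs to be looked for, which is how attention is focused on the image regions with the highest interest.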

    Autonomous Power Line Inspection with Drones via Perception-Aware MPC

    Drones have the potential to revolutionize power line inspection by increasing productivity, reducing inspection time, improving data quality, and eliminating the risks for human operators. Current state-of-the-art systems for power line inspection have two shortcomings: (i) control is decoupled from perception and needs accurate information about the location of the power lines and masts; (ii) obstacle avoidance is decoupled from power line tracking, which results in poor tracking in the vicinity of the power masts and, consequently, in decreased data quality for visual inspection. In this work, we propose a model predictive controller (MPC) that overcomes these limitations by tightly coupling perception and action. Our controller generates commands that maximize the visibility of the power lines while, at the same time, safely avoiding the power masts. For power line detection, we propose a lightweight learning-based detector that is trained only on synthetic data and is able to transfer zero-shot to real-world power line images. We validate our system in simulation and real-world experiments on a mock-up power line infrastructure. We release our code and datasets to the public.
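    A perception-aware MPC of this kind augments the usual tracking objective with a visibility term and an obstacle penalty. The stage cost below is only an illustration of that coupling; the weights, the camera-frame convention and the hinge-style mast penalty are assumptions, not the authors' formulation.

```python
import numpy as np

def stage_cost(state, reference, line_dir_cam, mast_dist,
               w_track=1.0, w_percep=2.0, w_obst=5.0, d_safe=3.0):
    """Illustrative per-step MPC cost: track the reference, keep the
    power line visible, and stay clear of the masts."""
    # Tracking term: squared position error against the reference.
    track = w_track * np.sum((state[:3] - reference[:3]) ** 2)
    # Perception term: reward alignment of the camera's optical axis
    # with the unit bearing to the power line (camera z-forward assumed).
    optical_axis = np.array([0.0, 0.0, 1.0])
    percep = w_percep * (1.0 - float(optical_axis @ line_dir_cam))
    # Obstacle term: hinge penalty once inside the mast safety distance.
    obstacle = w_obst * max(0.0, d_safe - mast_dist) ** 2
    return track + percep + obstacle

cost = stage_cost(np.zeros(6), np.array([1.0, 0.0, 2.0, 0, 0, 0]),
                  line_dir_cam=np.array([0.0, 0.2, 0.98]), mast_dist=2.0)
```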
