Visual SLAM and scale estimation from omnidirectional wearable vision
Solving the Simultaneous Localisation and Mapping (SLAM) problem with vision systems makes it possible to reconstruct a map of the environment from measurements extracted from images and, at the same time, to estimate the trajectory, or visual odometry, of the camera. In recent years visual SLAM has been one of the most widely studied problems in computer vision, and it has been addressed with both stereo and monocular systems. In stereo systems, since the distance between the cameras is known, the observed points can be triangulated and a complete three-dimensional estimate of their positions is therefore available. Monocular systems, by contrast, cannot measure depth from a single image and only allow a three-dimensional reconstruction up to a scale ambiguity. Moreover, as is common when solving the SLAM problem, the use of probabilistic filters that process the images sequentially gives rise to a further problem beyond the scale ambiguity: a scale drift that keeps the scale from remaining constant throughout the reconstruction and gradually deforms the final reconstruction as the map grows. Given the interest in these sensors for their low cost, ubiquity and ease of calibration, several works propose solutions to this problem, either by using other low-cost sensors such as IMUs or the odometry sensors available in wheeled vehicles, or without additional sensors, relying on some measurement known a priori such as the distance from the camera to the ground or to the rotation axis of the vehicle. Most of these works focus on cameras mounted on wheeled vehicles.
The techniques they describe are hard to apply to a camera carried by a person, first because odometry measurements cannot be obtained, and second because the motion model is more complex. This Master's thesis collects and extends the work presented in the paper "Full Scaled 3D Visual Odometry From a Single Wearable Omnidirectional Camera", submitted to and accepted for publication at the upcoming IEEE International Conference on Intelligent Robots and Systems (IROS). It presents an algorithm to estimate the real scale of a person's visual odometry from the SLAM estimate obtained with a wearable catadioptric omnidirectional camera, without the need for additional sensors. The prior information for the scale estimate is given by an empirical law that directly relates walking speed to step frequency or, equivalently, defines stride length as a function of step frequency. This law is grounded in the tendency of a person to choose the step frequency that minimises the metabolic cost for a given speed. The trajectory obtained by SLAM is divided into sections, and a scale factor is computed for each section. To estimate this factor, the step frequency is first estimated by spectral analysis of the signal given by one component of the camera states of the current section. Second, the walking speed is computed through the empirical relation described above. This real speed measurement, together with the average absolute speed of the states contained in the section, is fed into a particle filter for the final computation of the scale factor. The scale factor is then applied to the corresponding section through a recursive formula that guarantees continuity in position and velocity.
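The section-wise scale estimation described above can be sketched in a few lines: estimate the step frequency by spectral analysis of one component of the camera states, convert it to a walking speed through a stride-length law, and divide by the unscaled SLAM speed. This is a minimal illustration, not the thesis implementation; the stride-law coefficients and the simple peak picker are placeholder assumptions, and the particle filter used for the final estimate is omitted here.

```python
import numpy as np

def estimate_scale_factor(vertical_pos, visual_speed, fps,
                          stride_law=lambda f: 0.3 + 0.3 * f):
    """Estimate the metric scale of one SLAM trajectory section.

    vertical_pos : unscaled vertical camera coordinate over the section
    visual_speed : mean unscaled speed reported by SLAM for the section
    fps          : camera frame rate (Hz)
    stride_law   : stride length (m) as a function of step frequency (Hz);
                   the coefficients here are illustrative placeholders,
                   not the empirical law used in the thesis.
    """
    signal = vertical_pos - np.mean(vertical_pos)
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    step_freq = freqs[np.argmax(spectrum[1:]) + 1]  # skip the DC bin
    real_speed = step_freq * stride_law(step_freq)  # metres per second
    return real_speed / visual_speed                # section scale factor
```

With a synthetic 2 Hz vertical oscillation and an unscaled speed of half the true one, the function recovers a scale factor of 2.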
On top of this basic algorithm, improvements were introduced to reduce the delay between trajectory section updates, to discard erroneous step-frequency measurements, and to detect areas or situations, such as staircases, where the empirical model used to estimate walking speed would not apply. In addition, since the algorithm was initially implemented in MATLAB and applied offline to the full trajectory estimate produced by the SLAM application, it has also been implemented in C++ as a module within that application, running in real time together with the main SLAM algorithm. The experiments were carried out on sequences recorded both outdoors and indoors within the Río Ebro Campus of the Universidad de Zaragoza. In them, the real-scale trajectory estimate obtained with our method is compared against ground truth obtained from Google Maps satellite imagery. The results show mean errors as low as under 2 metres along routes of 232 metres. They also show that the method is able to correct a considerable scale drift in the initial unscaled trajectory estimate. The work in this Master's thesis builds on my final degree project, "Localización por Visión Omnidireccional para Asistencia Personal", carried out under an I3A research initiation grant and defended in September 2011. In that project, a complete real-time C++ SLAM application for conventional cameras was adapted for use with catadioptric omnidirectional cameras. This required modifications to two basic aspects: the projection model and the transformations applied to the descriptors of the feature points.
That work led to a publication, "Adapting a Real-Time Monocular Visual SLAM from Conventional to Omnidirectional Cameras", at the 11th OMNIVIS workshop held within ICCV 2011.
Contributions to Real-time Metric Localisation with Wearable Vision Systems
With the rapid development of electronics and computer science in recent years, cameras have become omnipresent, to such an extent that almost everybody carries one at all times embedded in their cellular phone. What makes cameras especially appealing to us is their ability to quickly capture a large amount of information about the environment encoded in one image or video, allowing us to immortalise special moments in our lives or to share reliable visual information of the environment with other people. However, while the task of extracting the information from an image may be trivial for us, computers require complex algorithms with a high computational burden to transform a raw image into useful information. In this sense, the same rapid development in computer science that allowed the widespread adoption of cameras has also made it possible to run in real time algorithms that were previously practically infeasible. Among the current fields of research in the computer vision community, this thesis is especially concerned with metric localisation and mapping algorithms. These algorithms are a key component of many practical applications such as robot navigation, augmented reality or reconstructing 3D models of the environment. The goal of this thesis is to delve into visual localisation and mapping from vision, paying special attention to conventional and unconventional cameras which can be easily worn or handled by a human. In this thesis I contribute to the following aspects of the visual odometry and SLAM (Simultaneous Localisation and Mapping) pipeline:
- Generalised monocular SLAM for catadioptric central cameras
- Resolution of the scale problem in monocular vision
- Dense RGB-D odometry
- Robust place recognition
- Pose-graph optimisation
Hybrid mapping for large-scale environments
In this thesis, a novel vision-based hybrid mapping framework which exploits metric, topological and semantic information is presented. We aim to obtain better computational efficiency than pure metric mapping techniques, and better accuracy and usability for robot guidance than topological mapping. A crucial step of any mapping system is loop closure detection: the ability to recognise that the robot is revisiting a previously mapped area. We therefore first propose a hierarchical loop closure detection framework which also constructs the global topological structure of our hybrid map. Using this loop closure detection module, a hybrid mapping framework is proposed in two steps. The first step can be understood as a topo-metric map with nodes corresponding to certain regions of the environment. Each node in turn is made up of a set of images acquired in that region. These maps are further augmented with metric information at those nodes which correspond to image sub-sequences acquired while the robot revisits a previously mapped area. The second step augments this model using road semantics. A Conditional Random Field (CRF) based classification on the metric reconstruction is used to semantically label the local robot path (the road in our case) as straight, curved or junction. Metric information of regions with curved roads and junctions is retained, while that of other regions is discarded from the final map. Loop closure is performed only at junctions, thereby increasing both the efficiency and the accuracy of the map. By incorporating all of these new algorithms, the hybrid framework presented can perform as a robust, scalable SLAM approach, or act as the main part of a navigation tool for a mobile robot or an autonomous car in outdoor urban environments.
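The semantic labelling step assigns "straight", "curved" or "junction" to local stretches of the robot path. The sketch below is a deliberately simplified, per-point stand-in that thresholds the accumulated heading change over a sliding window; the thesis uses a CRF classifier over the metric reconstruction, and the window size and thresholds here are illustrative assumptions.

```python
import numpy as np

def label_segments(path, window=5, straight_thresh=0.05, junction_thresh=0.5):
    """Label 2D path points as 'straight', 'curved' or 'junction'
    from the heading change accumulated over a sliding window.
    Thresholds are in radians per window and purely illustrative.
    """
    path = np.asarray(path, dtype=float)
    # Heading of each consecutive path segment.
    headings = np.arctan2(np.diff(path[:, 1]), np.diff(path[:, 0]))
    labels = []
    for i in range(len(headings) - window):
        # Total turn across the window, with angle wrap-around removed.
        turn = abs(np.unwrap(headings[i:i + window + 1])[-1] - headings[i])
        if turn < straight_thresh:
            labels.append('straight')
        elif turn < junction_thresh:
            labels.append('junction' if turn >= junction_thresh else 'curved')
        else:
            labels.append('junction')
    return labels
```

On a straight polyline every window is labelled "straight", while a sharp 90-degree corner produces a "junction" label, matching the intuition that only high-curvature regions need to retain metric detail.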
Experimental results obtained on public datasets acquired in challenging urban environments are provided to demonstrate our approach.
Distributed scene reconstruction from multiple mobile platforms
Recent research on mobile robotics has produced new designs that provide household robots with omnidirectional motion. The image sensors embedded in these devices motivate the application of 3D vision techniques to them for navigation and mapping purposes. In addition, distributed cheap-sensing systems acting as a unitary entity have recently emerged as an efficient alternative to expensive mobile equipment.
In this work we present an implementation of a visual reconstruction method,
structure from motion (SfM), on a low-budget, omnidirectional mobile platform,
and extend this method to distributed 3D scene reconstruction with
several instances of such a platform.
Our approach overcomes the challenges posed by the platform. The unprecedented levels of noise produced by the image compression typical of the platform are handled by our feature filtering methods, which ensure feature matching populations suitable for epipolar geometry estimation by means of a strict quality-based feature selection. The robust pose estimation algorithms implemented, along with a novel feature tracking system, enable our incremental SfM approach to deal with the ill-conditioned inter-image configurations caused by the omnidirectional motion. The feature tracking system efficiently manages the feature scarcity produced by noise and outputs quality feature tracks, which allow robust 3D mapping of a given scene even if, due to noise, their length is shorter than what is usually assumed necessary for stable 3D reconstructions.
The distributed reconstruction from multiple instances of SfM is attained
by applying loop-closing techniques. Our multiple reconstruction system
merges individual 3D structures and resolves the global scale problem with
minimal overlaps, whereas in the literature 3D mapping is obtained by overlapping
stretches of sequences. The performance of this system is demonstrated
in the 2-session case.
The management of noise, the stability against ill-conditioned configurations and the robustness of our SfM system are validated in a number of experiments and compared with state-of-the-art approaches. Possible future research areas are also discussed.
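A strict quality-based feature selection of the kind described can be illustrated with a severe variant of the classic descriptor distance ratio test: a candidate match survives only when its best distance is clearly better than the runner-up. The function below is a hedged sketch in plain NumPy with brute-force matching; the 0.6 ratio is an illustrative choice, not the thesis parameter.

```python
import numpy as np

def strict_ratio_filter(desc_a, desc_b, ratio=0.6):
    """Keep only matches whose best descriptor distance is clearly
    better than the second best (a strict distance ratio test).
    desc_a, desc_b: arrays of feature descriptors, one row per feature.
    Returns a list of (index_a, index_b) pairs.
    """
    desc_a = np.asarray(desc_a, dtype=float)
    desc_b = np.asarray(desc_b, dtype=float)
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)   # distance to every candidate
        order = np.argsort(dists)
        best, second = order[0], order[1]
        # Ambiguous matches (best not much better than runner-up) are dropped.
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches
```

Dropping ambiguous matches up front keeps the epipolar geometry estimation from being swamped by compression-noise outliers, at the cost of smaller matching populations, which is exactly the trade-off the feature tracking system is designed to compensate for.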
Generating Absolute-Scale Point Cloud Data of Built Infrastructure Scenes Using a Monocular Camera Setting
The global scale of Point Cloud Data (PCD) generated through monocular photo/videogrammetry is unknown, and can be calculated using at least one known dimension of the scene. Measuring one or more dimensions for this purpose introduces a manual step in the 3D reconstruction process; this increases the effort and reduces the speed of reconstructing scenes, and introduces substantial human error due to the high level of measurement accuracy needed. Other ways of measuring such dimensions are based on acquiring additional information, either by using extra sensors or by relying on specific classes of objects existing in the scene; we found that these solutions are not simple, cost-effective or general enough to be considered practical for reconstructing both indoor and outdoor built infrastructure scenes. To address the issue, in this paper we propose a novel method for automatically calculating the absolute scale of built infrastructure PCD. We use a pre-measured cube for outdoor scenes and a sheet of paper for indoor environments as the calibration patterns. Assuming that the dimensions of these objects are known, the proposed method extracts the objects' corner points in 2D video frames using a novel algorithm. The extracted corner points are then matched between consecutive frames. Finally, the corresponding corner points are reconstructed along with other features of the scenes to determine the real-world scale. To evaluate the performance of the method, ten indoor and ten outdoor cases were selected and the absolute-scale PCD for each case was computed. Results illustrate that the proposed algorithm is able to reconstruct the predefined objects with a high success rate, while the generated absolute-scale PCD is sufficiently accurate. This is the accepted manuscript; the final version is available from ASCE at http://dx.doi.org/10.1061/(ASCE)CP.1943-5487.000041
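Once an edge of the calibration object (the cube or the sheet of paper) has been reconstructed, recovering the absolute scale reduces to a ratio of lengths: the known edge length divided by its length in arbitrary reconstruction units. The helper below sketches that final step under the assumption that two reconstructed corner points bounding an edge of known length are available; the names and values are illustrative, not the paper's code.

```python
import numpy as np

def absolute_scale(corners_recon, known_edge_m):
    """Global scale factor for a monocular reconstruction.

    corners_recon : two reconstructed 3D corner points bounding an edge
                    of the calibration object (arbitrary SfM units)
    known_edge_m  : true length of that edge in metres
    """
    p, q = np.asarray(corners_recon, dtype=float)
    recon_edge = np.linalg.norm(p - q)   # edge length in SfM units
    return known_edge_m / recon_edge     # multiply the whole PCD by this

def rescale_point_cloud(points, scale):
    """Apply the scale factor to every reconstructed point."""
    return np.asarray(points, dtype=float) * scale
```

For example, if the long edge of an A4 sheet (0.297 m) comes out as 2 units in the reconstruction, every point coordinate is multiplied by 0.1485 to obtain metric PCD.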
Automatic Food Intake Assessment Using Camera Phones
Obesity is becoming an epidemic phenomenon in most developed countries. The fundamental cause of obesity and overweight is an energy imbalance between calories consumed and calories expended. It is essential to monitor everyday food intake for obesity prevention and management. Existing dietary assessment methods usually require manual recording and recall of food types and portions. Accuracy of the results largely relies on many uncertain factors such as the user's memory, food knowledge and portion estimations; as a result, accuracy is often compromised. Accurate and convenient dietary assessment methods are still lacking and needed in both the general population and the research community.
In this thesis, an automatic food intake assessment method using the cameras and inertial measurement units (IMUs) on smart phones was developed to help people foster a healthy lifestyle. With this method, users use their smart phones before and after a meal to capture images or videos of the meal. The smart phone recognizes food items, calculates the volume of the food consumed and provides the results to users. The technical objective is to explore the feasibility of image-based food recognition and image-based volume estimation.
This thesis comprises five publications that address four specific goals of this work: (1) to develop a prototype system with existing methods in order to review the literature, find its drawbacks and explore the feasibility of developing novel methods; (2) based on the prototype system, to investigate new food classification methods that improve recognition accuracy to a field-application level; (3) to design indexing methods for large-scale image databases to facilitate the development of new food image recognition and retrieval algorithms; (4) to develop novel, convenient and accurate food volume estimation methods using only smart phones with cameras and IMUs.
A prototype system was implemented to review existing methods. An image feature detector and descriptor were developed, and a nearest neighbor classifier was implemented to classify food items. A credit card marker method was introduced for metric-scale 3D reconstruction and volume calculation.
To increase recognition accuracy, novel multi-view food recognition algorithms were developed to recognize regular-shaped food items. To further increase the accuracy and make the algorithm applicable to arbitrary food items, new food features and new classifiers were designed. The efficiency of the algorithm was increased by developing a novel image indexing method for large-scale image databases. Finally, the volume calculation was enhanced by reducing reliance on the marker and introducing IMUs. Sensor fusion techniques combining measurements from cameras and IMUs were explored to infer the metric scale of the 3D model as well as to reduce noise from these sensors.
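The camera/IMU fusion idea for metric scale can be sketched as comparing the translation obtained by double-integrating gravity-compensated accelerometer samples with the unscaled translation reported by the vision pipeline. This is a minimal illustration under idealised assumptions (no accelerometer bias or noise, gravity already removed); a practical system would use proper filtering as the thesis suggests.

```python
import numpy as np

def metric_scale_from_imu(accel, dt, visual_translation):
    """Infer the metric scale of a visual 3D model from IMU data.

    accel              : N x 3 gravity-compensated accelerometer
                         samples (m/s^2), camera starting at rest
    dt                 : sampling interval in seconds
    visual_translation : camera translation over the same interval,
                         in arbitrary (unscaled) vision units
    """
    accel = np.asarray(accel, dtype=float)
    vel = np.cumsum(accel, axis=0) * dt   # integrate acceleration -> velocity
    pos = np.cumsum(vel, axis=0) * dt     # integrate velocity -> position
    metric_dist = np.linalg.norm(pos[-1])               # metres travelled
    return metric_dist / np.linalg.norm(visual_translation)
```

The returned factor converts the unscaled 3D food model into metric units, after which volume can be computed directly in cubic centimetres.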