
    Semantic Mapping of Road Scenes

    The problem of understanding road scenes has been at the forefront of the computer vision community for the last couple of years. Solving it enables autonomous systems to navigate and understand the surroundings in which they operate. It involves reconstructing the scene and estimating the objects present in it, such as ‘vehicles’, ‘road’, ‘pavements’ and ‘buildings’. This thesis focuses on these aspects and proposes solutions to address them. First, we propose a solution to generate a dense semantic map from multiple street-level images. This map can be imagined as the bird’s-eye view of the region, with associated semantic labels, for tens of kilometres of street-level data. We generate the overhead semantic view from street-level images. This is in contrast to existing approaches that classify urban regions from satellite/overhead imagery, and it allows us to produce a detailed semantic map for a large-scale urban area. Then we describe a method to perform large-scale dense 3D reconstruction of road scenes with associated semantic labels. Our method fuses the depth maps generated from the stereo pairs across time into a global 3D volume in an online fashion, in order to accommodate arbitrarily long image sequences. The object-class labels estimated from the street-level stereo image sequence are used to annotate the reconstructed volume. Then we exploit the scene structure in object-class labelling by performing inference over a meshed representation of the scene. By labelling over the mesh we solve two issues. First, images often contain redundant information, with multiple images describing the same scene; labelling these images separately is slow, and our method is approximately an order of magnitude faster in the inference stage than standard inference in the image domain. Second, multiple images of the same scene often result in inconsistent labelling. By solving a single mesh, we remove the inconsistency of labelling across the images. Our mesh-based labelling also takes into account the object layout in the scene, which is often ambiguous in the image domain, thereby increasing the accuracy of object labelling. Finally, we perform labelling and structure computation through a hierarchical robust P^N Markov Random Field defined on the voxels and super-voxels of an octree. This allows us to infer the 3D structure and the object-class labels in a principled manner, through bounded approximate minimisation of a well-defined and well-studied energy functional. In this thesis, we also introduce two object-labelled datasets created from real-world data. The 15-kilometre Yotta Labelled dataset consists of 8,000 images per camera view of the roadways of the United Kingdom, with a subset of them annotated with object-class labels, and the second dataset comprises ground-truth object labels for the publicly available KITTI dataset. Both datasets are publicly available and, we hope, will be helpful to the vision research community.
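The online fusion of per-frame depth maps into a global volume that this abstract describes is commonly realized as a running weighted average of truncated signed distances per voxel. The sketch below illustrates that idea only; the function name, array layout, and truncation value are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

def fuse_depth_map(tsdf, weights, depths, voxel_depth, trunc=0.1):
    """Running weighted-average TSDF update for one new depth map.

    tsdf, weights : per-voxel signed-distance values and fusion weights
    depths        : depth observed along each voxel's camera ray (0 = no data)
    voxel_depth   : depth of each voxel centre in the camera frame
    """
    # Signed distance from voxel to observed surface, truncated to [-trunc, trunc]
    sdf = np.clip(depths - voxel_depth, -trunc, trunc)
    valid = depths > 0  # ignore rays with no depth measurement
    # Incremental weighted average keeps memory bounded for long sequences:
    # each voxel stores only its running value and weight
    new_w = weights + valid
    tsdf = np.where(valid, (tsdf * weights + sdf) / np.maximum(new_w, 1), tsdf)
    return tsdf, new_w
```

Because each stereo frame only touches the stored per-voxel averages, arbitrarily long image sequences can be accommodated without keeping past depth maps in memory, which matches the "online fashion" described above.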

    Seamless Navigation, 3D Reconstruction, Thermographic and Semantic Mapping for Building Inspection

    We present a workflow for seamless real-time navigation and 3D thermal mapping in combined indoor and outdoor environments in a global reference frame. The automated workflow and partly real-time capabilities are of special interest for inspection tasks and other time-critical applications. We use a hand-held integrated positioning system (IPS), a real-time-capable visual-aided inertial navigation technology, and augment it with an additional passive thermal infrared camera and global referencing capabilities. The global reference is realized through surveyed optical markers (AprilTags). By fusing the data from the stereo camera and the thermal camera, the resulting georeferenced 3D point cloud is enriched with thermal intensity values. A challenging calibration approach is used to geometrically calibrate and pixel-co-register the trifocal camera system. By fusing the terrestrial dataset with additional geographic information from an unmanned aerial vehicle, we obtain a complete building-hull point cloud and automatically reconstruct a semantic 3D model. A single-family house with surroundings in the village of Morschenich near the city of Jülich (in the German federal state of North Rhine-Westphalia) was used as a test site to demonstrate our workflow. The presented work is a step towards automated building information modeling.
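Enriching a point cloud with thermal intensities, as described above, amounts to projecting each georeferenced 3D point into the co-registered thermal image and sampling the intensity there. The snippet below is a minimal sketch of that step under a plain pinhole model; the function name and nearest-neighbour sampling are illustrative assumptions, not the authors' calibration pipeline.

```python
import numpy as np

def thermal_intensity(points_cam, K, thermal_img):
    """Project 3D points (already in the thermal camera frame) through the
    intrinsic matrix K and sample per-point intensities (nearest neighbour)."""
    # Pinhole projection: homogeneous image coordinates, then divide by depth
    uvw = (K @ points_cam.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    h, w = thermal_img.shape
    # Keep only points in front of the camera that land inside the image
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h) & (points_cam[:, 2] > 0)
    vals = np.full(len(points_cam), np.nan)  # NaN marks points with no reading
    vals[inside] = thermal_img[v[inside], u[inside]]
    return vals
```

The pixel-co-registration mentioned in the abstract is what justifies sampling the thermal image directly with the projected coordinates; without it, a per-camera extrinsic transform would have to be applied first.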

    Novel Pole Photogrammetric System for Low-Cost Documentation of Archaeological Sites: The Case Study of “Cueva Pintada”

    Close-range photogrammetry is a powerful and widely used technique for 3D reconstruction of archaeological environments, specifically when a high level of detail is required. This paper presents an innovative low-cost system that allows high-quality, detailed reconstructions of complex indoor scenarios with unfavorable lighting conditions by means of close-range nadir and oblique images, as an alternative to drone acquisitions in places where the use of drones is limited or discouraged: (i) indoor scenarios in which both loss of GNSS signal and the need for long exposure times occur, (ii) scenarios with a risk of raising dust in suspension due to proximity to the ground and (iii) complex scenarios with nooks and vertical elements of different heights. The low-altitude aerial view reached with this system allows high-quality 3D documentation of complex scenarios, helped by its ergonomic design, self-stability, lightness, and flexibility of handling. In addition, its interchangeable, remote-controlled support allows it to carry different sensors and to perform both acquisitions that follow the ideal photogrammetric epipolar geometry and acquisitions with geometry variations that favor a more complete and reliable reconstruction by avoiding occlusions. This versatile pole photogrammetry system has been successfully used to reconstruct and document in 3D the “Cueva Pintada” archaeological site located in Gran Canaria (Spain), of approximately 5400 m², with a Canon EOS 5D Mark II SLR digital camera. As final products, (i) a high-quality photorealistic 3D model of 1.47 mm resolution and ±8.4 mm accuracy, (ii) detailed orthophotos of the main assets of the archaeological remains and (iii) a 3D viewer with associated information on the structures, materials and plans of the site were obtained.
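The millimetre-level resolution quoted above is governed by the standard ground-sampling-distance (GSD) relation between pixel pitch, focal length, and camera-to-object distance. The sketch below states that relation; the numeric values in the usage note are illustrative assumptions, not the paper's calibration.

```python
def ground_sampling_distance(pixel_size_mm, focal_length_mm, distance_m):
    """Footprint of one pixel on the object (in mm) at a given distance,
    for a pinhole camera: GSD = pixel_size * distance / focal_length."""
    return pixel_size_mm * distance_m * 1000.0 / focal_length_mm
```

For example, a hypothetical 6.4 µm pixel pitch, a 24 mm lens, and a 5 m pole height give a GSD of roughly 1.3 mm, the same order as the 1.47 mm resolution reported for the final model.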

    Recent Advances in Image Restoration with Applications to Real World Problems

    In the past few decades, imaging hardware has improved tremendously in terms of resolution, enabling the widespread use of images in many diverse applications on Earth and in planetary missions. However, practical issues associated with image acquisition still affect image quality. Some of these issues, such as blurring, measurement noise, mosaicing artifacts, and low spatial or spectral resolution, can seriously affect the accuracy of the aforementioned applications. This book intends to provide the reader with a glimpse of the latest developments and recent advances in image restoration, which include image super-resolution, image fusion to enhance spatial, spectral, and temporal resolution, and the generation of synthetic images using deep learning techniques. Some practical applications are also included.

    Human robot interaction in a crowded environment

    Human Robot Interaction (HRI) is the primary means of establishing natural and affective communication between humans and robots. HRI enables robots to act in a way similar to humans in order to assist in activities that are considered to be laborious, unsafe, or repetitive. Vision-based human robot interaction is a major component of HRI, in which visual information is used to interpret how human interaction takes place. Common tasks of HRI include finding pre-trained static or dynamic gestures in an image, which involves localising different key parts of the human body, such as the face and hands. This information is subsequently used to extract different gestures. After the initial detection process, the robot is required to comprehend the underlying meaning of these gestures [3]. Thus far, most gesture recognition systems can only detect gestures and identify a person in relatively static environments. This is not realistic for practical applications, as difficulties may arise from people's movements and changing illumination conditions. Another issue to consider is that of identifying the commanding person in a crowded scene, which is important for interpreting navigation commands. To this end, it is necessary to associate the gesture with the correct person, and automatic reasoning is required to extract the most probable location of the person who has initiated the gesture. In this thesis, we have proposed a practical framework for addressing the above issues. It attempts to achieve a coarse-level understanding of a given environment before engaging in active communication. This includes recognizing human robot interaction, where a person has the intention to communicate with the robot. In this regard, it is necessary to differentiate whether the people present are engaged with each other or with their surrounding environment. The basic task is to detect and reason about the environmental context and different interactions so as to respond accordingly. For example, if individuals are engaged in conversation, the robot should realize it is best not to disturb them, or, if an individual is receptive to the robot's interaction, it may approach the person. Finally, if the user is moving in the environment, the robot can analyse further to understand whether any help can be offered in assisting this user. The method proposed in this thesis combines multiple visual cues in a Bayesian framework to identify people in a scene and determine their potential intentions. To improve system performance, contextual feedback is used, which allows the Bayesian network to evolve and adjust itself according to the surrounding environment. The results achieved demonstrate the effectiveness of the technique in dealing with human-robot interaction in a relatively crowded environment [7].
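Combining multiple visual cues in a Bayesian framework to pick out the commanding person can be sketched, under a naive independence assumption between cues, as a product of per-person cue likelihoods with a prior. The function below is an illustrative toy, not the thesis's actual network, which additionally uses contextual feedback.

```python
import numpy as np

def commanding_person_posterior(cue_likelihoods, prior=None):
    """Fuse independent visual cues (e.g. face pose, hand gesture, motion)
    per candidate person and return a normalised posterior.

    cue_likelihoods : array of shape (n_cues, n_people), each entry the
                      likelihood of that cue given that person is commanding
    """
    cue_likelihoods = np.asarray(cue_likelihoods, dtype=float)
    n_people = cue_likelihoods.shape[1]
    if prior is None:
        prior = np.full(n_people, 1.0 / n_people)  # uniform prior over people
    # Naive-Bayes fusion: multiply cue likelihoods per person, then normalise
    post = prior * np.prod(cue_likelihoods, axis=0)
    return post / post.sum()
```

The person with the highest posterior is taken as the most probable initiator of the gesture; in a full system the prior would itself be updated from context rather than kept uniform.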

    Ricerche di Geomatica 2011

    This volume collects the articles submitted for the AUTeC 2011 Prize. The prize was established in 2005 and is awarded each year to a doctoral thesis judged particularly significant in the subject areas of the SSD ICAR/06 (Surveying and Cartography) across the doctoral programmes active in Italy.

    Robust Visual Odometry and Dynamic Scene Modelling

    Image-based estimation of camera trajectory, known as visual odometry (VO), has been a popular solution for robot navigation in the past decade due to its low cost and wide applicability. The problem of tracking self-motion as well as the motion of objects in the scene using information from a camera is known as multi-body visual odometry and is a challenging task. The performance of VO is heavily sensitive to poor imaging conditions (i.e., direct sunlight, shadow and image blur), which limits its feasibility in many challenging scenarios. Current VO solutions can provide accurate camera motion estimation in largely static scenes. However, the deployment of robotic systems in our daily lives requires systems to work in significantly more complex, dynamic environments. This thesis aims to develop robust VO solutions for two challenging cases, underwater and highly dynamic environments, by extensively analyzing and overcoming the difficulties in both cases to achieve accurate ego-motion estimation. Furthermore, to better understand and exploit dynamic scene information, this thesis also investigates the motion of moving objects in dynamic scenes, and presents a novel way to integrate ego and object motion estimation into a single framework. In particular, underwater VO is challenging due to poor imaging conditions and inconsistent motion caused by water flow. This thesis intensively tests and evaluates possible solutions to these issues, and proposes a stereo underwater VO system that is able to robustly and accurately localize an autonomous underwater vehicle (AUV). Visual odometry in dynamic environments is challenging because dynamic parts of the scene violate the static-world assumption fundamental to most classical visual odometry algorithms. If the moving parts of a scene dominate the static scene, off-the-shelf VO systems either fail completely or return poor-quality trajectory estimates. Most existing techniques try to simplify the problem by removing dynamic information. Arguably, in most scenarios the dynamics correspond to a finite number of individual objects that are rigid or piecewise rigid, and their motions can be tracked and estimated in the same way as the ego-motion. With this consideration, the thesis proposes a brand-new way to model and estimate object motion, and introduces a novel multi-body VO system that addresses the problem of tracking both ego and object motion in dynamic outdoor scenes. Based on the proposed multi-body VO framework, this thesis also exploits the spatial and temporal relationships between the camera and object motions, as well as between static and dynamic structures, to obtain more consistent and accurate estimates. To this end, the thesis introduces a novel visual dynamic object-aware SLAM system that is able to achieve robust tracking of multiple moving objects, accurate estimation of full SE(3) object motions, and extraction of the inherent linear velocity of moving objects, along with accurate robot localisation and mapping of the environment structure. The performance of the proposed system is demonstrated on real datasets, showing its capability to resolve rigid object motion estimation and yielding results that outperform state-of-the-art algorithms by an order of magnitude in urban driving scenarios.
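Treating a rigid object's motion between two frames as its own SE(3) transform, from which a linear velocity can be read off, can be sketched as follows. This is a minimal illustration of the idea only; the function name and frame conventions are assumptions, not the thesis's formulation.

```python
import numpy as np

def object_linear_velocity(H_prev, H_curr, dt):
    """Given an object's 4x4 SE(3) poses in the world frame at two
    consecutive frames, return the relative SE(3) motion and an
    estimate of the object's linear velocity."""
    # Relative rigid-body motion taking the object pose at k-1 to k,
    # expressed in the world frame
    motion = H_curr @ np.linalg.inv(H_prev)
    # Displacement of the pose origin over dt gives the linear velocity
    velocity = (H_curr[:3, 3] - H_prev[:3, 3]) / dt
    return motion, velocity
```

Because the relative motion is itself an element of SE(3), it can be estimated with the same machinery used for the camera's ego-motion, which is the key observation behind the multi-body formulation described above.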

    Visual and Camera Sensors

    This book includes 13 papers published in the Special Issue ("Visual and Camera Sensors") of the journal Sensors. The goal of this Special Issue was to invite high-quality, state-of-the-art research papers dealing with challenging issues in visual and camera sensors.
