79 research outputs found
Visual-based SLAM configurations for cooperative multi-UAV systems with a lead agent: an observability-based approach
In this work, the problem of cooperative visual-based SLAM for the class of multi-UAV systems that integrate a lead agent is addressed. In these kinds of systems, a team of aerial robots flying in formation must follow a dynamic lead agent, which can be another aerial robot, a vehicle or even a human. A fundamental problem that must be addressed for these kinds of systems is the estimation of the states of the aerial robots as well as the state of the lead agent.
In this work, the use of a cooperative visual-based SLAM approach is studied in order to solve the above problem. Three different system configurations are proposed and investigated by means of an intensive nonlinear observability analysis. In addition, a high-level control scheme is proposed that allows the formation of the UAVs to be controlled with respect to the lead agent. Several theoretical results are obtained, together with an extensive set of computer simulations, which are presented in order to numerically validate the proposal and to show that it can perform well under different circumstances (e.g., GPS-challenging environments). That is, the proposed method is able to operate robustly under many conditions, providing a good position estimation of the aerial vehicles and the lead agent as well.
Peer Reviewed. Postprint (published version)
The correlation between vehicle vertical dynamics and deep learning-based visual target state estimation: A sensitivity study
Automated vehicles will provide greater transport convenience and interconnectivity, increase mobility options to young and elderly people, and reduce traffic congestion and emissions. However, the largest obstacle towards the deployment of automated vehicles on public roads is their safety evaluation and validation. Undeniably, the role of cameras and Artificial Intelligence-based (AI) vision is vital in the perception of the driving environment and road safety. Although a significant number of studies on the detection and tracking of vehicles have been conducted, none of them focused on the role of vertical vehicle dynamics. For the first time, this paper analyzes and discusses the influence of road anomalies and vehicle suspension on the performance of detecting and tracking driving objects. To this end, we conducted an extensive road field study and validated a computational tool for performing the assessment using simulations. A parametric study revealed the cases where AI-based vision underperforms and may significantly degrade the safety performance of AV
Learning, Moving, And Predicting With Global Motion Representations
In order to effectively respond to and influence the world they inhabit, animals and other intelligent agents must understand and predict the state of the world and its dynamics. An agent that can characterize how the world moves is better equipped to engage it. Current methods of motion computation rely on local representations of motion (such as optical flow) or simple, rigid global representations (such as camera motion). These methods are useful, but they are difficult to estimate reliably and limited in their applicability to real-world settings, where agents frequently must reason about complex, highly nonrigid motion over long time horizons. In this dissertation, I present methods developed with the goal of building more flexible and powerful notions of motion needed by agents facing the challenges of a dynamic, nonrigid world. This work is organized around a view of motion as a global phenomenon that is not adequately addressed by local or low-level descriptions, but that is best understood when analyzed at the level of whole images and scenes. I develop methods to: (i) robustly estimate camera motion from noisy optical flow estimates by exploiting the global, statistical relationship between the optical flow field and camera motion under projective geometry; (ii) learn representations of visual motion directly from unlabeled image sequences using learning rules derived from a formulation of image transformation in terms of its group properties; (iii) predict future frames of a video by learning a joint representation of the instantaneous state of the visual world and its motion, using a view of motion as transformations of world state. I situate this work in the broader context of ongoing computational and biological investigations into the problem of estimating motion for intelligent perception and action
Plenoptic Signal Processing for Robust Vision in Field Robotics
This thesis proposes the use of plenoptic cameras for improving the robustness and simplicity of machine vision in field robotics applications. Dust, rain, fog, snow, murky water and insufficient light can cause even the most sophisticated vision systems to fail. Plenoptic cameras offer an appealing alternative to conventional imagery by gathering significantly more light over a wider depth of field, and capturing a rich 4D light field structure that encodes textural and geometric information. The key contributions of this work lie in exploring the properties of plenoptic signals and developing algorithms for exploiting them. It lays the groundwork for the deployment of plenoptic cameras in field robotics by establishing a decoding, calibration and rectification scheme appropriate to compact, lenslet-based devices. Next, the frequency-domain shape of plenoptic signals is elaborated and exploited by constructing a filter which focuses over a wide depth of field rather than at a single depth. This filter is shown to reject noise, improving contrast in low light and through attenuating media, while mitigating occluders such as snow, rain and underwater particulate matter. Next, a closed-form generalization of optical flow is presented which directly estimates camera motion from first-order derivatives. An elegant adaptation of this "plenoptic flow" to lenslet-based imagery is demonstrated, as well as a simple, additive method for rendering novel views. Finally, the isolation of dynamic elements from a static background is considered, a task complicated by the non-uniform apparent motion caused by a mobile camera. Two elegant closed-form solutions are presented dealing with monocular time-series and light field image pairs. This work emphasizes non-iterative, noise-tolerant, closed-form, linear methods with predictable and constant runtimes, making them suitable for real-time embedded implementation in field robotics applications
Study and application of motion measurement methods by means of opto-electronics systems - Studio e applicazione di metodi di misura del moto mediante sistemi opto-elettronici
This thesis addresses the problem of localizing a vehicle in unstructured environments through on-board instrumentation that does not require infrastructure modifications.
Two widely used opto-electronic systems which allow for non-contact measurements have been chosen: camera and laser range finder.
Particular attention is paid to the definition of a set of procedures for processing the environment information acquired with the instruments in order to provide both accuracy and robustness to measurement noise.
An important contribution of this work is the development of a robust and reliable data association algorithm, which has been integrated into a graph-based SLAM framework that also takes uncertainty into account, thus leading to an optimal vehicle motion estimation.
Moreover, the localization of the vehicle can be achieved in a generic environment since the developed global localization solution does not necessarily require the identification of landmarks in the environment, neither natural nor artificial.
Part of the work is dedicated to a thorough comparative analysis of the state-of-the-art scan matching methods in order to choose the best one to be employed in the solution pipeline.
In particular, this investigation has highlighted that a dense scan matching approach can ensure good performance in many typical environments.
Several experiments in different environments, including large-scale ones, demonstrate the effectiveness of the global localization system developed.
While the laser range data have been exploited for the global localization, a robust visual odometry has been investigated.
The results suggest that the use of a camera can overcome the situations in which the solution achieved by the laser scanner has low accuracy.
In particular, the global localization framework can also be applied to the camera sensor, in order to perform sensor fusion between two complementary instruments and so obtain a more reliable localization system.
The algorithms have been tested in 2D indoor environments; nevertheless, they are expected to be well suited to 3D and outdoor environments as well
Dense Visual Simultaneous Localisation and Mapping in Collaborative and Outdoor Scenarios
Dense visual simultaneous localisation and mapping (SLAM) systems can produce 3D reconstructions that are digital facsimiles of the physical space they describe. Systems that can produce dense maps with this level of fidelity in real time provide foundational spatial reasoning capabilities for many downstream tasks in autonomous robotics. Over the past 15 years, mapping small scale, indoor environments, such as desks and buildings, with a single slow moving, hand-held sensor has been one of the central focuses of dense visual SLAM research.
However, most dense visual SLAM systems exhibit a number of limitations which mean they cannot be directly applied in collaborative or outdoors settings. The contribution of this thesis is to address these limitations with the development of new systems and algorithms for collaborative dense mapping, efficient dense alternation and outdoors operation with fast camera motion and wide field of view (FOV) cameras. We use ElasticFusion, a state-of-the-art dense SLAM system, as our starting point where each of these contributions is implemented as a novel extension to the system.
We first present a collaborative dense SLAM system that allows a number of cameras starting with unknown initial relative positions to maintain local maps with the original ElasticFusion algorithm. Visual place recognition across local maps results in constraints that allow maps to be aligned into a common global reference frame, facilitating collaborative mapping and tracking of multiple cameras within a shared map.
Within dense alternation based SLAM systems, the standard approach is to fuse every frame into the dense model without considering whether the information contained within the frame is already captured by the dense map and therefore redundant. As the number of cameras or the scale of the map increases, this approach becomes inefficient. In our second contribution, we address this inefficiency by introducing a novel information theoretic approach to keyframe selection that allows the system to avoid processing redundant information. We implement the procedure within ElasticFusion, demonstrating a marked reduction in the number of frames required by the system to estimate an accurate, denoised surface reconstruction.
Before dense SLAM techniques can be applied in outdoor scenarios we must first address their reliance on active depth cameras, and their lack of suitability to fast camera motion. In our third contribution we present an outdoor dense SLAM system. The system overcomes the need for an active sensor by employing neural network-based depth inference to predict the geometry of the scene as it appears in each image. To address the issue of camera tracking during fast motion we employ a hybrid architecture, combining elements of both dense and sparse SLAM systems to perform camera tracking and to achieve globally consistent dense mapping.
Automotive applications present a particularly important setting for dense visual SLAM systems. Such applications are characterised by their use of wide FOV cameras and are therefore not accurately modelled by the standard pinhole camera model. The fourth contribution of this thesis is to extend the above hybrid sparse-dense monocular SLAM system to cater for large FOV fisheye imagery. This is achieved by reformulating the mapping pipeline in terms of the Kannala-Brandt fisheye camera model. To estimate depth, we introduce a new version of the PackNet depth estimation neural network (Guizilini et al., 2020) adapted for fisheye inputs.
To demonstrate the effectiveness of our contributions, we present experimental results, computed by processing the synthetic ICL-NUIM dataset of Handa et al. (2014) as well as the real-world TUM-RGBD dataset of Sturm et al. (2012). For outdoor SLAM we show the results of our system processing the autonomous driving KITTI and KITTI-360 datasets of Geiger et al. (2012a) and Liao et al. (2021), respectively
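The abstract above notes that wide FOV cameras are not accurately modelled by the pinhole model and that the mapping pipeline is reformulated in terms of the Kannala-Brandt fisheye model. As a rough illustration only, and not the thesis implementation, the following sketch projects a 3D point using the four-coefficient polynomial form of that model (the form also used by OpenCV's fisheye module). The function name `kb_project`, the intrinsics and the coefficient values are assumptions chosen for the example.

```python
import math

def kb_project(point, fx, fy, cx, cy, k=(0.0, 0.0, 0.0, 0.0)):
    """Project a camera-frame 3D point to pixels with a Kannala-Brandt-style model.

    Hypothetical helper for illustration; k holds four assumed distortion
    coefficients (k1..k4). With k all zero this is the equidistant model.
    """
    x, y, z = point
    r = math.hypot(x, y)          # distance from the optical axis
    if r < 1e-12:
        return (cx, cy)           # a point on the axis maps to the principal point
    theta = math.atan2(r, z)      # angle between the ray and the optical axis
    t2 = theta * theta
    k1, k2, k3, k4 = k
    # Distorted radius: theta * (1 + k1*theta^2 + k2*theta^4 + k3*theta^6 + k4*theta^8)
    theta_d = theta * (1.0 + k1 * t2 + k2 * t2**2 + k3 * t2**3 + k4 * t2**4)
    return (fx * theta_d * x / r + cx, fy * theta_d * y / r + cy)

# A ray at 90 degrees to the optical axis (z = 0) still has a finite image
# position here, whereas the pinhole projection fx * x / z is undefined.
u, v = kb_project((1.0, 0.0, 0.0), fx=300.0, fy=300.0, cx=320.0, cy=240.0)
```

Because the image radius grows with the ray angle rather than with its tangent, rays near or beyond 90 degrees from the optical axis remain representable, which is the property that makes this family of models suitable for fisheye imagery.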
Advances in Stereo Vision
Stereopsis is a vision process whose geometrical foundation has been known for a long time, ever since the experiments by Wheatstone, in the 19th century. Nevertheless, its inner workings in biological organisms, as well as its emulation by computer systems, have proven elusive, and stereo vision remains a very active and challenging area of research nowadays. In this volume we have attempted to present a limited but relevant sample of the work being carried out in stereo vision, covering significant aspects both from the applied and from the theoretical standpoints
Towards secure & robust PNT for automated systems
This dissertation makes four contributions in support of secure and robust position, navigation, and timing (PNT) for automated systems. The first two relate to PNT security while the latter two address robust positioning for automated ground vehicles.
The first contribution is a fundamental theory for provably-secure clock synchronization between two agents in a distributed automated system. All one-way synchronization protocols, such as those based on the Global Positioning System (GPS) and other Global Navigation Satellite Systems (GNSS), are shown to be vulnerable to man-in-the-middle delay attacks. This contribution is the first to identify the necessary and sufficient conditions for provably secure clock synchronization.
The second contribution, also related to PNT security, is a three-year study of the world-wide GPS interference landscape based on data from a dual-frequency GNSS receiver operating continuously on the International Space Station (ISS). This work is the first publicly-reported space-based survey of GNSS interference, and unveils previously-unreported GNSS interference activity.
The third contribution is a novel ground vehicle positioning technique that is robust to GNSS signal blockage, poor lighting conditions, and adverse weather events such as heavy rain and dense fog. The technique relies on sensors that are commonly available on automated vehicles and are insensitive to lighting and inclement weather: automotive radar, low-cost inertial measurement units (IMUs), and GNSS. Remarkably, it is shown that, given a prior radar map, the proposed technique operating on data from off-the-shelf all-weather automotive sensors can maintain sub-50-cm horizontal position accuracy during 60 min of GNSS-denied driving in downtown Austin, TX.
This dissertation’s final contribution is an analysis and demonstration of the feasibility of crowd-sourced digital mapping for automated vehicles. Localization techniques, such as the one described in the previous contribution, rely on such digital maps for accuracy and robustness. A key enabler for large-scale up-to-date maps is enlisting the help of the very consumer vehicles that need the map to build and update it. A method for fusing multi-session vision data into a unified digital map is developed. The asymptotic limit of such a map’s globally-referenced position accuracy is explored for the case in which the mapping agents rely on low-cost GNSS receivers performing standard code-phase-based navigation. Experimental validation along a semi-urban route shows that low-cost consumer vehicles incrementally tighten the accuracy of the jointly-optimized digital map over time enough to support sub-lane-level positioning in a global frame of reference.
Electrical and Computer Engineering