79 research outputs found

    Visual-based SLAM configurations for cooperative multi-UAV systems with a lead agent: an observability-based approach

    In this work, the problem of cooperative visual-based SLAM for the class of multi-UAV systems that integrate a lead agent is addressed. In these kinds of systems, a team of aerial robots flying in formation must follow a dynamic lead agent, which can be another aerial robot, a vehicle or even a human. A fundamental problem for these systems is the estimation of the states of the aerial robots as well as the state of the lead agent. In this work, the use of a cooperative visual-based SLAM approach is studied in order to solve this problem. Three different system configurations are proposed and investigated by means of an intensive nonlinear observability analysis. In addition, a high-level control scheme is proposed that allows the formation of the UAVs to be controlled with respect to the lead agent. Several theoretical results are obtained, together with an extensive set of computer simulations that numerically validate the proposal and show that it can perform well under different circumstances (e.g., GPS-challenging environments). That is, the proposed method is able to operate robustly under many conditions, providing a good position estimate of both the aerial vehicles and the lead agent. Peer reviewed. Postprint (published version).

    The correlation between vehicle vertical dynamics and deep learning-based visual target state estimation: A sensitivity study

    Automated vehicles will provide greater transport convenience and interconnectivity, increase mobility options for young and elderly people, and reduce traffic congestion and emissions. However, the largest obstacle to the deployment of automated vehicles on public roads is their safety evaluation and validation. Undeniably, the role of cameras and Artificial Intelligence-based (AI) vision is vital in the perception of the driving environment and road safety. Although a significant number of studies on the detection and tracking of vehicles have been conducted, none of them focused on the role of vertical vehicle dynamics. For the first time, this paper analyzes and discusses the influence of road anomalies and vehicle suspension on the performance of detecting and tracking driving objects. To this end, we conducted an extensive road field study and validated a computational tool for performing the assessment using simulations. A parametric study revealed the cases where AI-based vision underperforms and may significantly degrade the safety performance of AV

    Learning, Moving, And Predicting With Global Motion Representations

    In order to effectively respond to and influence the world they inhabit, animals and other intelligent agents must understand and predict the state of the world and its dynamics. An agent that can characterize how the world moves is better equipped to engage it. Current methods of motion computation rely on local representations of motion (such as optical flow) or simple, rigid global representations (such as camera motion). These methods are useful, but they are difficult to estimate reliably and limited in their applicability to real-world settings, where agents frequently must reason about complex, highly nonrigid motion over long time horizons. In this dissertation, I present methods developed with the goal of building more flexible and powerful notions of motion needed by agents facing the challenges of a dynamic, nonrigid world. This work is organized around a view of motion as a global phenomenon that is not adequately addressed by local or low-level descriptions, but that is best understood when analyzed at the level of whole images and scenes. I develop methods to: (i) robustly estimate camera motion from noisy optical flow estimates by exploiting the global, statistical relationship between the optical flow field and camera motion under projective geometry; (ii) learn representations of visual motion directly from unlabeled image sequences using learning rules derived from a formulation of image transformation in terms of its group properties; (iii) predict future frames of a video by learning a joint representation of the instantaneous state of the visual world and its motion, using a view of motion as transformations of world state. I situate this work in the broader context of ongoing computational and biological investigations into the problem of estimating motion for intelligent perception and action

    Plenoptic Signal Processing for Robust Vision in Field Robotics

    This thesis proposes the use of plenoptic cameras for improving the robustness and simplicity of machine vision in field robotics applications. Dust, rain, fog, snow, murky water and insufficient light can cause even the most sophisticated vision systems to fail. Plenoptic cameras offer an appealing alternative to conventional imagery by gathering significantly more light over a wider depth of field, and capturing a rich 4D light field structure that encodes textural and geometric information. The key contributions of this work lie in exploring the properties of plenoptic signals and developing algorithms for exploiting them. It lays the groundwork for the deployment of plenoptic cameras in field robotics by establishing a decoding, calibration and rectification scheme appropriate to compact, lenslet-based devices. Next, the frequency-domain shape of plenoptic signals is elaborated and exploited by constructing a filter which focuses over a wide depth of field rather than at a single depth. This filter is shown to reject noise, improving contrast in low light and through attenuating media, while mitigating occluders such as snow, rain and underwater particulate matter. Next, a closed-form generalization of optical flow is presented which directly estimates camera motion from first-order derivatives. An elegant adaptation of this "plenoptic flow" to lenslet-based imagery is demonstrated, as well as a simple, additive method for rendering novel views. Finally, the isolation of dynamic elements from a static background is considered, a task complicated by the non-uniform apparent motion caused by a mobile camera. Two elegant closed-form solutions are presented dealing with monocular time-series and light field image pairs. This work emphasizes non-iterative, noise-tolerant, closed-form, linear methods with predictable and constant runtimes, making them suitable for real-time embedded implementation in field robotics applications

    Study and application of motion measurement methods by means of opto-electronics systems - Studio e applicazione di metodi di misura del moto mediante sistemi opto-elettronici

    This thesis addresses the problem of localizing a vehicle in unstructured environments through on-board instrumentation that does not require infrastructure modifications. Two widely used opto-electronic systems that allow non-contact measurements have been chosen: a camera and a laser range finder. Particular attention is paid to defining a set of procedures for processing the environment information acquired with the instruments, in order to provide both accuracy and robustness to measurement noise. An important contribution of this work is the development of a robust and reliable data association algorithm, which has been integrated into a graph-based SLAM framework that also takes uncertainty into account, thus leading to an optimal vehicle motion estimate. Moreover, the vehicle can be localized in a generic environment, since the developed global localization solution does not necessarily require the identification of landmarks in the environment, either natural or artificial. Part of the work is dedicated to a thorough comparative analysis of state-of-the-art scan matching methods in order to choose the best one to employ in the solution pipeline. In particular, this investigation has highlighted that a dense scan matching approach can ensure good performance in many typical environments. Several experiments in different environments, including large-scale ones, demonstrate the effectiveness of the global localization system developed. While the laser range data have been exploited for global localization, a robust visual odometry method has also been investigated. The results suggest that the use of a camera can overcome situations in which the solution obtained from the laser scanner has low accuracy. In particular, the global localization framework can also be applied to the camera sensor, in order to perform sensor fusion between two complementary instruments and so obtain a more reliable localization system. The algorithms have been tested in 2D indoor environments; nevertheless, they are expected to be well suited to 3D and outdoor environments as well.

    Dense Visual Simultaneous Localisation and Mapping in Collaborative and Outdoor Scenarios

    Dense visual simultaneous localisation and mapping (SLAM) systems can produce 3D reconstructions that are digital facsimiles of the physical space they describe. Systems that can produce dense maps with this level of fidelity in real time provide foundational spatial reasoning capabilities for many downstream tasks in autonomous robotics. Over the past 15 years, mapping small-scale, indoor environments, such as desks and buildings, with a single slow-moving, hand-held sensor has been one of the central focuses of dense visual SLAM research. However, most dense visual SLAM systems exhibit a number of limitations which mean they cannot be directly applied in collaborative or outdoor settings. The contribution of this thesis is to address these limitations with the development of new systems and algorithms for collaborative dense mapping, efficient dense alternation and outdoor operation with fast camera motion and wide field of view (FOV) cameras. We use ElasticFusion, a state-of-the-art dense SLAM system, as our starting point, where each of these contributions is implemented as a novel extension to the system. We first present a collaborative dense SLAM system that allows a number of cameras starting with unknown initial relative positions to maintain local maps with the original ElasticFusion algorithm. Visual place recognition across local maps results in constraints that allow maps to be aligned into a common global reference frame, facilitating collaborative mapping and tracking of multiple cameras within a shared map. Within dense alternation based SLAM systems, the standard approach is to fuse every frame into the dense model without considering whether the information contained within the frame is already captured by the dense map and therefore redundant. As the number of cameras or the scale of the map increases, this approach becomes inefficient. 
In our second contribution, we address this inefficiency by introducing a novel information theoretic approach to keyframe selection that allows the system to avoid processing redundant information. We implement the procedure within ElasticFusion, demonstrating a marked reduction in the number of frames required by the system to estimate an accurate, denoised surface reconstruction. Before dense SLAM techniques can be applied in outdoor scenarios we must first address their reliance on active depth cameras, and their lack of suitability to fast camera motion. In our third contribution we present an outdoor dense SLAM system. The system overcomes the need for an active sensor by employing neural network-based depth inference to predict the geometry of the scene as it appears in each image. To address the issue of camera tracking during fast motion we employ a hybrid architecture, combining elements of both dense and sparse SLAM systems to perform camera tracking and to achieve globally consistent dense mapping. Automotive applications present a particularly important setting for dense visual SLAM systems. Such applications are characterised by their use of wide FOV cameras and are therefore not accurately modelled by the standard pinhole camera model. The fourth contribution of this thesis is to extend the above hybrid sparse-dense monocular SLAM system to cater for large FOV fisheye imagery. This is achieved by reformulating the mapping pipeline in terms of the Kannala-Brandt fisheye camera model. To estimate depth, we introduce a new version of the PackNet depth estimation neural network (Guizilini et al., 2020) adapted for fisheye inputs. To demonstrate the effectiveness of our contributions, we present experimental results, computed by processing the synthetic ICL-NUIM dataset of Handa et al. (2014) as well as the real-world TUM-RGBD dataset of Sturm et al. (2012). 
For outdoor SLAM we show the results of our system processing the autonomous driving KITTI and KITTI-360 datasets of Geiger et al. (2012a) and Liao et al. (2021) respectively

    Advances in Stereo Vision

    Stereopsis is a vision process whose geometrical foundation has been known for a long time, ever since the experiments by Wheatstone, in the 19th century. Nevertheless, its inner workings in biological organisms, as well as its emulation by computer systems, have proven elusive, and stereo vision remains a very active and challenging area of research nowadays. In this volume we have attempted to present a limited but relevant sample of the work being carried out in stereo vision, covering significant aspects both from the applied and from the theoretical standpoints

    Fault-Tolerant Vision for Vehicle Guidance in Agriculture
