9 research outputs found

    Sparse Bayesian information filters for localization and mapping

    Get PDF
    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution February 2008This thesis formulates an estimation framework for Simultaneous Localization and Mapping (SLAM) that addresses the problem of scalability in large environments. We describe an estimation-theoretic algorithm that achieves significant gains in computational efficiency while maintaining consistent estimates for the vehicle pose and the map of the environment. We specifically address the feature-based SLAM problem in which the robot represents the environment as a collection of landmarks. The thesis takes a Bayesian approach whereby we maintain a joint posterior over the vehicle pose and feature states, conditioned upon measurement data. We model the distribution as Gaussian and parametrize the posterior in the canonical form, in terms of the information (inverse covariance) matrix. When sparse, this representation is amenable to computationally efficient Bayesian SLAM filtering. However, while a large majority of the elements within the normalized information matrix are very small in magnitude, it is fully populated nonetheless. Recent feature-based SLAM filters achieve the scalability benefits of a sparse parametrization by explicitly pruning these weak links in an effort to enforce sparsity. We analyze one such algorithm, the Sparse Extended Information Filter (SEIF), which has laid much of the groundwork concerning the computational benefits of the sparse canonical form. The thesis performs a detailed analysis of the process by which the SEIF approximates the sparsity of the information matrix and reveals key insights into the consequences of different sparsification strategies. We demonstrate that the SEIF yields a sparse approximation to the posterior that is inconsistent, suffering from exaggerated confidence estimates. This overconfidence has detrimental effects on important aspects of the SLAM process and affects the higher level goal of producing accurate maps for subsequent localization and path planning. This thesis proposes an alternative scalable filter that maintains sparsity while preserving the consistency of the distribution. We leverage insights into the natural structure of the feature-based canonical parametrization and derive a method that actively maintains an exactly sparse posterior. Our algorithm exploits the structure of the parametrization to achieve gains in efficiency, with a computational cost that scales linearly with the size of the map. Unlike similar techniques that sacrifice consistency for improved scalability, our algorithm performs inference over a posterior that is conservative relative to the nominal Gaussian distribution. Consequently, we preserve the consistency of the pose and map estimates and avoid the effects of an overconfident posterior. We demonstrate our filter alongside the SEIF and the standard EKF both in simulation as well as on two real-world datasets. While we maintain the computational advantages of an exactly sparse representation, the results show convincingly that our method yields conservative estimates for the robot pose and map that are nearly identical to those of the original Gaussian distribution as produced by the EKF, but at much less computational expense. The thesis concludes with an extension of our SLAM filter to a complex underwater environment. We describe a systems-level framework for localization and mapping relative to a ship hull with an Autonomous Underwater Vehicle (AUV) equipped with a forward-looking sonar. The approach utilizes our filter to fuse measurements of vehicle attitude and motion from onboard sensors with data from sonar images of the hull. We employ the system to perform three-dimensional, 6-DOF SLAM on a ship hull

    Robust multimodal dense SLAM

    Get PDF
    To enable increasingly intelligent behaviours, autonomous robots will need to be equipped with a deep understanding of their surrounding environment. It would be particularly desirable if this level of perception could be achieved automatically through the use of vision-based sensing, as passive cameras make a compelling sensor choice for robotic platforms due to their low cost, low weight, and low power consumption. Fundamental to extracting a high-level understanding from a set of 2D images is an understanding of the underlying 3D geometry of the environment. In mobile robotics, the most popular and successful technique for building a representation of 3D geometry from 2D images is Visual Simultaneous Localisation and Mapping (SLAM). While sparse, landmark-based SLAM systems have demonstrated high levels of accuracy and robustness, they are only capable of producing sparse maps. In general, to move beyond simple navigation to scene understanding and interaction, dense 3D reconstructions are required. Dense SLAM systems naturally allow for online dense scene reconstruction, but suffer from a lack of robustness due to the fact that the dense image alignment used in the tracking step has a narrow convergence basin and that the photometric-based depth estimation used in the mapping step is typically poorly constrained due to the presence of occlusions and homogeneous textures. This thesis develops methods that can be used to increase the robustness of dense SLAM by fusing additional sensing modalities into standard dense SLAM pipelines. In particular, this thesis will look at two sensing modalities: acceleration and rotation rate measurements from an inertial measurement unit (IMU) to address the tracking issue, and learned priors on dense reconstructions from deep neural networks (DNNs) to address the mapping issue.Open Acces

    Optimal Image-Aided Inertial Navigation

    Get PDF
    The utilization of cameras in integrated navigation systems is among the most recent scientific research and high-tech industry development. The research is motivated by the requirement of calibrating off-the-shelf cameras and the fusion of imaging and inertial sensors in poor GNSS environments. The three major contributions of this dissertation are The development of a structureless camera auto-calibration and system calibration algorithm for a GNSS, IMU and stereo camera system. The auto-calibration bundle adjustment utilizes the scale restraint equation, which is free of object coordinates. The number of parameters to be estimated is significantly reduced in comparison with the ones in a self-calibrating bundle adjustment based on the collinearity equations. Therefore, the proposed method is computationally more efficient. The development of a loosely-coupled visual odometry aided inertial navigation algorithm. The fusion of the two sensors is usually performed using a Kalman filter. The pose changes are pairwise time-correlated, i.e. the measurement noise vector at the current epoch is only correlated with the one from the previous epoch. Time-correlated errors are usually modelled by a shaping filter. The shaping filter developed in this dissertation uses Cholesky factors as coefficients derived from the variance and covariance matrices of the measurement noise vectors. Test results with showed that the proposed algorithm performs better than the existing ones and provides more realistic covariance estimates. The development of a tightly-coupled stereo multi-frame aided inertial navigation algorithm for reducing position and orientation drifts. Usually, the image aiding based on the visual odometry uses the tracked features only from a pair of the consecutive image frames. The proposed method integrates the features tracked from multiple overlapped image frames for reducing the position and orientation drifts. The measurement equation is derived from SLAM measurement equation system where the landmark positions in SLAM are algebraically by time-differencing. However, the derived measurements are time-correlated. Through a sequential de-correlation, the Kalman filter measurement update can be performed sequentially and optimally. The main advantages of the proposed algorithm are the reduction of computational requirements when compared to SLAM and a seamless integration into an existing GNSS aided-IMU system

    Dynamic state estimation for mobile robots

    Get PDF
    The scientific goal of this thesis is to tackle different approaches for effective state estimation and modelling of relevant problems in the context of mobile robots. The starting point of this dissertation is the concept of probabilistic robotics, an emerging paradigm that combines state-of-the-art methods with the classic probabilistic theory, developing stochastic frameworks for understanding the uncertain nature of the interaction between a robot and its environment. This allows introducing relevant concepts which are the foundation of the localisation system implemented on the main experimental platform used on this dissertation. An accurate estimation of the position of a robot with respect to a fixed frame is fundamental for building navigation systems that can work in dynamic unstructured environments. This development also allows introducing additional contributions related with global localisation, dynamic obstacle avoidance, path planning and position tracking problems. Kinematics on generalised manipulators are characterised for dealing with complex nonlinear systems. Nonlinear formulations are needed to properly model these systems, which are not always suitable for real-time realisation, lacking analytic formulations in most cases. In this context, this thesis tackles the serial-parallel dual kinematic problem with a novel approach, demonstrating state-of-the-art accuracy and real-time performance. With a spatial decomposition method, the forward kinematics problem on parallel robots and the inverse kinematics problem on serial manipulators is solved modelling the nonlinear behaviour of the pose space using Support Vector Machines. The results are validated on different topologies with the analytic solution for such manipulators, which demonstrates the applicability of the proposed method. Modelling and control of complex dynamical systems is another relevant field with applications on mobile robots. Nonlinear techniques are usually applied to tackle problems like feature or object tracking. However, some nonlinear integer techniques applied for tasks like position tracking in mobile robots with complex dynamics have limited success when modelling such systems. Fractional calculus has demonstrated to be suitable to model complex processes like viscoelasticity or super diffusion. These tools, that take advantage of the generalization of the derivative and integral operators to a fractional order, have been applied to model and control different topics related with robotics in recent years with remarkable success. With the proposal of a fractional-order PI controller, a suitable controller design method is presented to solve the position tracking problem. This is applied to control the distance of a self-driving car with respect to an objective, which can also be applied to other tracking applications like following a navigation path. Furthermore, this thesis introduces a novel fractional-order hyperchaotic system, stabilised with a full-pseudo-state-feedback controller and a located feedback method. This theoretical contribution of a chaotic system is introduced hoping to be useful in this context. Chaos theory has recently started to be applied to study manipulators, biped robots and autonomous navigation, achieving new and promising results, highlighting the uncertain and chaotic nature which also has been found on robots. All together, this thesis is devoted to different problems related with dynamic state estimation for mobile robots, proposing specific contributions related with modelling and control of complex nonlinear systems. These findings are presented in the context of a self-driving electric car, Verdino, jointly developed in collaboration with the Robotics Group of Universidad de La Laguna (GRULL)

    Multi-task near-field perception for autonomous driving using surround-view fisheye cameras

    Get PDF
    Die Bildung der Augen führte zum Urknall der Evolution. Die Dynamik änderte sich von einem primitiven Organismus, der auf den Kontakt mit der Nahrung wartete, zu einem Organismus, der durch visuelle Sensoren gesucht wurde. Das menschliche Auge ist eine der raffiniertesten Entwicklungen der Evolution, aber es hat immer noch Mängel. Der Mensch hat über Millionen von Jahren einen biologischen Wahrnehmungsalgorithmus entwickelt, der in der Lage ist, Autos zu fahren, Maschinen zu bedienen, Flugzeuge zu steuern und Schiffe zu navigieren. Die Automatisierung dieser Fähigkeiten für Computer ist entscheidend für verschiedene Anwendungen, darunter selbstfahrende Autos, Augmented Realität und architektonische Vermessung. Die visuelle Nahfeldwahrnehmung im Kontext von selbstfahrenden Autos kann die Umgebung in einem Bereich von 0 - 10 Metern und 360° Abdeckung um das Fahrzeug herum wahrnehmen. Sie ist eine entscheidende Entscheidungskomponente bei der Entwicklung eines sichereren automatisierten Fahrens. Jüngste Fortschritte im Bereich Computer Vision und Deep Learning in Verbindung mit hochwertigen Sensoren wie Kameras und LiDARs haben ausgereifte Lösungen für die visuelle Wahrnehmung hervorgebracht. Bisher stand die Fernfeldwahrnehmung im Vordergrund. Ein weiteres wichtiges Problem ist die begrenzte Rechenleistung, die für die Entwicklung von Echtzeit-Anwendungen zur Verfügung steht. Aufgrund dieses Engpasses kommt es häufig zu einem Kompromiss zwischen Leistung und Laufzeiteffizienz. Wir konzentrieren uns auf die folgenden Themen, um diese anzugehen: 1) Entwicklung von Nahfeld-Wahrnehmungsalgorithmen mit hoher Leistung und geringer Rechenkomplexität für verschiedene visuelle Wahrnehmungsaufgaben wie geometrische und semantische Aufgaben unter Verwendung von faltbaren neuronalen Netzen. 2) Verwendung von Multi-Task-Learning zur Überwindung von Rechenengpässen durch die gemeinsame Nutzung von initialen Faltungsschichten zwischen den Aufgaben und die Entwicklung von Optimierungsstrategien, die die Aufgaben ausbalancieren.The formation of eyes led to the big bang of evolution. The dynamics changed from a primitive organism waiting for the food to come into contact for eating food being sought after by visual sensors. The human eye is one of the most sophisticated developments of evolution, but it still has defects. Humans have evolved a biological perception algorithm capable of driving cars, operating machinery, piloting aircraft, and navigating ships over millions of years. Automating these capabilities for computers is critical for various applications, including self-driving cars, augmented reality, and architectural surveying. Near-field visual perception in the context of self-driving cars can perceive the environment in a range of 0 - 10 meters and 360° coverage around the vehicle. It is a critical decision-making component in the development of safer automated driving. Recent advances in computer vision and deep learning, in conjunction with high-quality sensors such as cameras and LiDARs, have fueled mature visual perception solutions. Until now, far-field perception has been the primary focus. Another significant issue is the limited processing power available for developing real-time applications. Because of this bottleneck, there is frequently a trade-off between performance and run-time efficiency. We concentrate on the following issues in order to address them: 1) Developing near-field perception algorithms with high performance and low computational complexity for various visual perception tasks such as geometric and semantic tasks using convolutional neural networks. 2) Using Multi-Task Learning to overcome computational bottlenecks by sharing initial convolutional layers between tasks and developing optimization strategies that balance tasks

    Videos in Context for Telecommunication and Spatial Browsing

    Get PDF
    The research presented in this thesis explores the use of videos embedded in panoramic imagery to transmit spatial and temporal information describing remote environments and their dynamics. Virtual environments (VEs) through which users can explore remote locations are rapidly emerging as a popular medium of presence and remote collaboration. However, capturing visual representation of locations to be used in VEs is usually a tedious process that requires either manual modelling of environments or the employment of specific hardware. Capturing environment dynamics is not straightforward either, and it is usually performed through specific tracking hardware. Similarly, browsing large unstructured video-collections with available tools is difficult, as the abundance of spatial and temporal information makes them hard to comprehend. At the same time, on a spectrum between 3D VEs and 2D images, panoramas lie in between, as they offer the same 2D images accessibility while preserving 3D virtual environments surrounding representation. For this reason, panoramas are an attractive basis for videoconferencing and browsing tools as they can relate several videos temporally and spatially. This research explores methods to acquire, fuse, render and stream data coming from heterogeneous cameras, with the help of panoramic imagery. Three distinct but interrelated questions are addressed. First, the thesis considers how spatially localised video can be used to increase the spatial information transmitted during video mediated communication, and if this improves quality of communication. Second, the research asks whether videos in panoramic context can be used to convey spatial and temporal information of a remote place and the dynamics within, and if this improves users' performance in tasks that require spatio-temporal thinking. Finally, the thesis considers whether there is an impact of display type on reasoning about events within videos in panoramic context. These research questions were investigated over three experiments, covering scenarios common to computer-supported cooperative work and video browsing. To support the investigation, two distinct video+context systems were developed. The first telecommunication experiment compared our videos in context interface with fully-panoramic video and conventional webcam video conferencing in an object placement scenario. The second experiment investigated the impact of videos in panoramic context on quality of spatio-temporal thinking during localization tasks. To support the experiment, a novel interface to video-collection in panoramic context was developed and compared with common video-browsing tools. The final experimental study investigated the impact of display type on reasoning about events. The study explored three adaptations of our video-collection interface to three display types. The overall conclusion is that videos in panoramic context offer a valid solution to spatio-temporal exploration of remote locations. Our approach presents a richer visual representation in terms of space and time than standard tools, showing that providing panoramic contexts to video collections makes spatio-temporal tasks easier. To this end, videos in context are suitable alternative to more difficult, and often expensive solutions. These findings are beneficial to many applications, including teleconferencing, virtual tourism and remote assistance

    2019 EC3 July 10-12, 2019 Chania, Crete, Greece

    Get PDF
    corecore