
    Increasing the Efficiency of 6-DoF Visual Localization Using Multi-Modal Sensory Data

    Localization is a key requirement for mobile robot autonomy and human-robot interaction. Vision-based localization is accurate and flexible; however, it incurs a high computational burden, which limits its application on many resource-constrained platforms. In this paper, we address the problem of performing real-time localization in large-scale 3D point cloud maps of ever-growing size. While most systems using multi-modal information reduce localization time by employing side-channel information in a coarse manner (e.g. WiFi for a rough prior position estimate), we propose to interweave the map with rich sensory data. This multi-modal approach achieves two key goals simultaneously. First, it enables us to harness additional sensory data to localise against a map covering a vast area in real time; second, it allows us to roughly localise devices that are not equipped with a camera. The key to our approach is a localization policy based on a sequential Monte Carlo estimator. The localiser uses this policy to attempt point-matching only in nodes where it is likely to succeed, significantly increasing the efficiency of the localization process. The proposed multi-modal localization system is evaluated extensively in a large museum building. The results show that our multi-modal approach not only increases the localization accuracy but also significantly reduces computational time. Comment: Presented at IEEE-RAS International Conference on Humanoid Robots (Humanoids) 201
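
The abstract does not spell out the estimator, so the following is only a minimal sketch of how a sequential Monte Carlo (particle) estimator over discrete map nodes could drive such a matching policy. The function names `smc_step` and `nodes_to_match`, the random-walk motion model, the per-node `likelihood` table and the `min_mass` threshold are all assumptions made for illustration, not the paper's implementation.

```python
import random
from collections import Counter


def smc_step(particles, neighbours, likelihood, rng):
    """One predict/update/resample step over discrete map nodes.

    particles  : list of node ids, one per particle (hypothetical representation)
    neighbours : dict mapping a node id to the list of adjacent node ids
    likelihood : dict mapping a node id to p(side-channel reading | node)
    """
    # Predict: each particle either stays put or moves to a random neighbour.
    moved = [rng.choice([p] + neighbours[p]) for p in particles]
    # Update: weight every particle by the likelihood of the current reading.
    weights = [likelihood[p] for p in moved]
    if sum(weights) == 0:
        return moved  # uninformative reading, keep the prediction as is
    # Resample in proportion to the weights.
    return rng.choices(moved, weights=weights, k=len(moved))


def nodes_to_match(particles, min_mass=0.2):
    """Policy: attempt point-matching only in nodes holding enough particles."""
    counts = Counter(particles)
    total = len(particles)
    return sorted(node for node, c in counts.items() if c / total >= min_mass)


if __name__ == "__main__":
    rng = random.Random(0)
    neighbours = {0: [1], 1: [0, 2], 2: [1]}
    likelihood = {0: 0.1, 1: 0.2, 2: 0.7}  # e.g. derived from a WiFi reading
    particles = [0, 1, 2] * 100
    particles = smc_step(particles, neighbours, likelihood, rng)
    print(nodes_to_match(particles))
```

Under these assumptions, the expensive visual point-matching would run only in the nodes returned by `nodes_to_match`, which is where the efficiency gain described in the abstract would come from.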

    DeMoN: Depth and Motion Network for Learning Monocular Stereo

    In this paper we formulate structure from motion as a learning problem. We train a convolutional network end-to-end to compute depth and camera motion from successive, unconstrained image pairs. The architecture is composed of multiple stacked encoder-decoder networks, the core part being an iterative network that is able to improve its own predictions. The network estimates not only depth and motion, but also surface normals, optical flow between the images and confidence of the matching. A crucial component of the approach is a training loss based on spatial relative differences. Compared to traditional two-frame structure from motion methods, results are more accurate and more robust. In contrast to the popular depth-from-single-image networks, DeMoN learns the concept of matching and thus generalizes better to structures not seen during training. Comment: Camera ready version for CVPR 2017. Supplementary material included. Project page: http://lmb.informatik.uni-freiburg.de/people/ummenhof/depthmotionnet

    Robust convex optimisation techniques for autonomous vehicle vision-based navigation

    This thesis investigates new convex optimisation techniques for motion and pose estimation. Numerous computer vision problems can be formulated as optimisation problems. These optimisation problems are generally solved via linear techniques using the singular value decomposition or iterative methods under an L2 norm minimisation. Linear techniques have the advantage of offering a closed-form solution that is simple to implement. The quantity being minimised is, however, not geometrically or statistically meaningful. Conversely, L2 algorithms rely on iterative estimation, where a cost function is minimised using algorithms such as Levenberg-Marquardt, Gauss-Newton, gradient descent or conjugate gradient. The cost functions involved are geometrically interpretable and can be statistically optimal under an assumption of Gaussian noise. However, in addition to their sensitivity to initial conditions, these algorithms are often slow and bear a high probability of getting trapped in a local minimum or producing infeasible solutions, even for small noise levels. In light of the above, in this thesis we focus on developing new techniques for finding globally optimal solutions via a convex optimisation framework. Convex optimisation techniques have shown considerable advantages in motion estimation: convex optimisation guarantees a global minimum, and the cost function is geometrically meaningful. Moreover, robust optimisation is a recent approach for optimisation under uncertain data. In recent years the need to cope with uncertain data has become especially acute, particularly where real-world applications are concerned. In such circumstances, robust optimisation aims to recover an optimal solution whose feasibility must be guaranteed for any realisation of the uncertain data. 
Many researchers avoid modelling uncertainty because of the added complexity of constructing a robust optimisation model and the lack of knowledge about the nature of these uncertainties, especially their propagation. In this thesis, robust convex optimisation that estimates the uncertainties at every step is investigated for the motion estimation problem. First, a solution using convex optimisation coupled to the recursive least squares (RLS) algorithm and the robust H∞ filter is developed for motion estimation. In another solution, uncertainties and their propagation are incorporated in a robust L∞ convex optimisation framework for monocular visual motion estimation. In this solution, robust least squares is combined with a second order cone program (SOCP). A technique to improve the accuracy and the robustness of the fundamental matrix is also investigated in this thesis. This technique uses the covariance intersection approach to fuse feature location uncertainties, which leads to more consistent motion estimates. Loop-closure detection is crucial in improving the robustness of navigation algorithms. In practice, after long navigation in an unknown environment, detecting that a vehicle is in a location it has previously visited gives the opportunity to increase the accuracy and consistency of the estimate. In this context, we have developed an efficient appearance-based method for visual loop-closure detection based on the combination of a Gaussian mixture model with the KD-tree data structure. Deploying this technique for loop-closure detection, a robust L∞ convex pose-graph optimisation solution for monocular motion estimation on unmanned aerial vehicles (UAVs) is introduced as well. In the literature, most proposed solutions formulate the pose-graph optimisation as a least-squares problem by minimising a cost function using iterative methods. 
In this work, robust convex optimisation under the L∞ norm is adopted, which efficiently corrects the UAV’s pose after loop-closure detection. To round out the work in this thesis, a system for cooperative monocular visual motion estimation with multiple aerial vehicles is proposed. The cooperative motion estimation employs state-of-the-art approaches for optimisation, individual motion estimation and registration. Three-view geometry algorithms in a convex optimisation framework are deployed on board the monocular vision system for each vehicle. In addition, vehicle-to-vehicle relative pose estimation is performed with a novel robust registration solution in a global optimisation framework. In parallel, and as a complementary solution for the relative pose, a robust non-linear H∞ solution is designed to fuse measurements from the UAVs’ on-board inertial sensors with the visual estimates. The proposed contributions have been evaluated extensively in a number of laboratory experiments on real image data, using monocular vision systems and range imaging devices. In this thesis, we propose several solutions towards the goal of robust visual motion estimation using convex optimisation. We show that the convex optimisation framework may be extended to include uncertainty information, to achieve robust and optimal solutions. We observed that convex optimisation is a practical and very appealing alternative to linear techniques and iterative methods.
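
The abstract mentions covariance intersection for fusing feature location uncertainties but gives no formulas. As a rough illustration only, the standard covariance intersection rule fuses two estimates with unknown cross-correlation through P⁻¹ = w·P1⁻¹ + (1 − w)·P2⁻¹ and x = P·(w·P1⁻¹·x1 + (1 − w)·P2⁻¹·x2). The sketch below applies it to 2-D feature locations; the helper names, the restriction to 2×2 matrices, and the choice of w by a grid search that minimises trace(P) are assumptions, not details taken from the thesis.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def covariance_intersection(x1, P1, x2, P2, steps=100):
    """Fuse two 2-D estimates whose cross-correlation is unknown.

    Returns (x, P, w), where w is chosen on a uniform grid to minimise trace(P).
    """
    I1, I2 = inv2(P1), inv2(P2)
    best = None
    for k in range(steps + 1):
        w = k / steps
        # Convex combination of the two information matrices.
        info = [[w * I1[i][j] + (1 - w) * I2[i][j] for j in range(2)]
                for i in range(2)]
        P = inv2(info)
        trace = P[0][0] + P[1][1]
        if best is None or trace < best[0]:
            # Information-weighted mean: x = P (w P1^-1 x1 + (1 - w) P2^-1 x2).
            v = [sum(w * I1[i][j] * x1[j] + (1 - w) * I2[i][j] * x2[j]
                     for j in range(2)) for i in range(2)]
            x = [P[i][0] * v[0] + P[i][1] * v[1] for i in range(2)]
            best = (trace, x, P, w)
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Two hypothetical observations of the same image feature.
    x, P, w = covariance_intersection(
        [0.0, 0.0], [[1.0, 0.0], [0.0, 4.0]],
        [2.0, 2.0], [[4.0, 0.0], [0.0, 1.0]],
    )
    print(x, P, w)
```

Unlike a naive Kalman-style fusion, this rule does not assume the two estimates are independent, which is why the fused covariance stays consistent when the correlation between them is unknown.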

    Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery

    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon’s navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.

    Nonrigid reconstruction of 3D breast surfaces with a low-cost RGBD camera for surgical planning and aesthetic evaluation

    Accounting for 26% of all new cancer cases worldwide, breast cancer remains the most common form of cancer in women. Although early breast cancer has a favourable long-term prognosis, roughly a third of patients suffer from a suboptimal aesthetic outcome despite breast-conserving cancer treatment. Clinical-quality 3D modelling of the breast surface therefore assumes an increasingly important role in advancing treatment planning, prediction and evaluation of breast cosmesis. Yet existing 3D torso scanners are expensive and either infrastructure-heavy or subject to motion artefacts. In this paper we employ a single consumer-grade RGBD camera with an ICP-based registration approach to jointly align all points from a sequence of depth images non-rigidly. Subtle body deformation due to postural sway and respiration is successfully mitigated through regularised locally affine transformations, leading to higher geometric accuracy. We present results from 6 clinical cases where our method compares well with the gold standard and outperforms a previous approach. We show that our method produces better reconstructions, qualitatively by visual assessment and quantitatively by consistently obtaining lower landmark error scores and yielding more accurate breast volume estimates.

    Information-driven navigation

    In recent years, we have witnessed impressive progress in the accuracy and robustness of Visual Odometry (VO) and Simultaneous Localization and Mapping (SLAM). This boost in performance has enabled the first commercial implementations related to augmented reality (AR), virtual reality (VR) and robotics. In this thesis, we developed new probabilistic methods to further improve the accuracy, robustness and efficiency of VO and SLAM. The contributions of our work are published in three main papers and complemented with the release of SID-SLAM, the software containing all our contributions, and the challenging Minimal Texture dataset. Our first contribution is an information-theoretic approach to point selection for direct and/or feature-based RGB-D VO/SLAM. The aim is to select only the most informative measurements, in order to reduce the size of the optimization problem with minimal impact on accuracy. Our experimental results show that our novel criterion allows us to reduce the number of tracked points down to only 24, achieving state-of-the-art accuracy while reducing the computational demand by up to 10 times. Better uncertainty models for visual measurements would improve the accuracy of multi-view structure and motion and lead to more realistic uncertainty estimates of the VO/SLAM states. We derived a novel model for multi-view residual covariances based on perspective deformation, which has become a crucial element in our information-driven approach. Visual odometry and SLAM systems are typically divided in the literature into two categories, feature-based and direct methods, depending on the type of residuals that are minimized. We combined our two previous contributions in the formulation and implementation of SID-SLAM, the first full semi-direct RGB-D SLAM system that uses features and direct methods in a tightly integrated and interchangeable way within a complete information-driven pipeline. Moreover, we recorded Minimal Texture, an RGB-D dataset with conceptually simple but challenging content and accurate ground truth, to facilitate state-of-the-art research on semi-direct SLAM.
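
The abstract does not state the selection criterion itself, so the following is only a generic sketch of information-driven point selection: a greedy loop that keeps adding the measurement that most increases the log-determinant of the accumulated information matrix. The 2-parameter state, the function names, the small diagonal `prior` and the log-det criterion are assumptions for illustration and should not be read as the SID-SLAM method.

```python
import math


def logdet2(m):
    """Log-determinant of a 2x2 positive-definite matrix."""
    return math.log(m[0][0] * m[1][1] - m[0][1] * m[1][0])


def add_measurement(info, jac, sigma):
    """Return info + J^T J / sigma^2 for a single scalar residual."""
    a, b = jac
    s2 = sigma ** 2
    return [[info[0][0] + a * a / s2, info[0][1] + a * b / s2],
            [info[1][0] + a * b / s2, info[1][1] + b * b / s2]]


def select_points(jacobians, sigmas, budget, prior=1e-3):
    """Greedily pick up to `budget` points that maximise log det of the information.

    jacobians : list of (d r / d x0, d r / d x1) pairs, one per candidate point
    sigmas    : residual standard deviation of each candidate point
    """
    info = [[prior, 0.0], [0.0, prior]]  # weak prior keeps the matrix invertible
    chosen = []
    remaining = list(range(len(jacobians)))
    while remaining and len(chosen) < budget:
        # max() returns the first maximiser, so ties go to the lowest index.
        best = max(remaining,
                   key=lambda i: logdet2(add_measurement(info, jacobians[i], sigmas[i])))
        info = add_measurement(info, jacobians[best], sigmas[best])
        chosen.append(best)
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    jacobians = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.1, 0.0)]
    sigmas = [1.0, 1.0, 1.0, 1.0]
    print(select_points(jacobians, sigmas, budget=2))
```

In this toy setup the second pick favours a point that constrains a direction the first one left unobserved, which is the intuition behind keeping only a small number of highly informative points.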