397 research outputs found

    Playing with Duality: An Overview of Recent Primal-Dual Approaches for Solving Large-Scale Optimization Problems

    Optimization methods are at the core of many problems in signal/image processing, computer vision, and machine learning. It has long been recognized that looking at the dual of an optimization problem may drastically simplify its solution. Deriving efficient strategies that jointly bring into play the primal and the dual problems is, however, a more recent idea, which has generated many important new contributions in recent years. These novel developments are grounded in recent advances in convex analysis, discrete optimization, parallel processing, and non-smooth optimization with an emphasis on sparsity issues. In this paper, we aim to present the principles of primal-dual approaches while giving an overview of numerical methods that have been proposed in different contexts. We show the benefits that can be drawn from primal-dual algorithms for solving both large-scale convex optimization problems and discrete ones, and we provide various application examples to illustrate their usefulness.
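To make the idea of jointly updating primal and dual variables concrete, here is a minimal sketch of one well-known primal-dual scheme (a Chambolle-Pock-style iteration). It is applied to a toy problem chosen purely for illustration: minimize 0.5*(x - b)^2 + lam*|x| elementwise. The function names, step sizes, and the toy problem are assumptions and are not taken from the paper.

```python
def soft_threshold(value, lam):
    """Closed-form minimizer of 0.5*(x - value)**2 + lam*|x| (reference solution)."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def primal_dual_l1(b, lam, tau=0.5, sigma=0.5, iterations=500):
    """Illustrative primal-dual iteration for min_x 0.5*||x - b||^2 + lam*||x||_1.

    The linear operator is the identity, so tau * sigma < 1 satisfies the
    usual step-size condition. Returns the primal and dual iterates.
    """
    n = len(b)
    x = [0.0] * n       # primal variable
    x_bar = [0.0] * n   # extrapolated primal variable
    y = [0.0] * n       # dual variable
    for _ in range(iterations):
        # Dual step: gradient ascent, then projection onto [-lam, lam]
        # (the proximal operator of the conjugate of lam*|.|).
        y = [max(-lam, min(lam, yi + sigma * xbi)) for yi, xbi in zip(y, x_bar)]
        # Primal step: gradient descent, then the proximal operator
        # of the quadratic data term 0.5*(x - b)^2.
        x_new = [(xi - tau * yi + tau * bi) / (1.0 + tau)
                 for xi, yi, bi in zip(x, y, b)]
        # Extrapolation, which couples the two updates.
        x_bar = [2.0 * xn - xi for xn, xi in zip(x_new, x)]
        x = x_new
    return x, y
```

For this toy problem the primal iterate can be checked against the closed-form `soft_threshold(b_i, lam)`; the point of the sketch is only that each step uses simple proximal operations on the primal and dual sides rather than solving either problem directly.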

    Map-Based Localization for Unmanned Aerial Vehicle Navigation

    Unmanned Aerial Vehicles (UAVs) require precise pose estimation when navigating in indoor and GNSS-denied / GNSS-degraded outdoor environments. The possibility of crashing in these environments is high, as spaces are confined and contain many moving obstacles. There are many solutions for localization in GNSS-denied environments, and many different technologies are used. Common solutions involve setting up or using existing infrastructure, such as beacons, Wi-Fi, or surveyed targets. These solutions were avoided because the cost should be proportional to the number of users, not the coverage area. Heavy and expensive sensors, for example a high-end IMU, were also avoided. Given these requirements, a camera-based localization solution was selected for the sensor pose estimation. Several camera-based localization approaches were investigated. Map-based localization methods were shown to be the most efficient because they close loops using a pre-existing map; the amount of data and the time spent collecting it are therefore reduced, as there is no need to re-observe the same areas multiple times. This dissertation proposes a solution for fully localizing a monocular camera onboard a UAV with respect to a known environment (i.e., it is assumed that a 3D model of the environment is available) for the purpose of UAV navigation in structured environments. Incremental map-based localization involves tracking a map through an image sequence. When the map is a 3D model, this task is referred to as model-based tracking. A by-product of the tracker is the relative 3D pose (position and orientation) between the camera and the object being tracked. State-of-the-art solutions hold that tracking geometry is more robust than tracking image texture because edges are more invariant to changes in object appearance and lighting. However, model-based trackers have been limited to tracking small, simple objects in small environments.
An assessment was performed on tracking larger, more complex building models in larger environments. A state-of-the-art model-based tracker called ViSP (Visual Servoing Platform) was applied to tracking buildings outdoors and indoors using a UAV's low-cost camera. The assessment revealed weaknesses at large scales. Specifically, ViSP failed when tracking was lost and needed to be manually re-initialized. Failure occurred when there was a lack of model features in the camera's field of view and when the camera moved rapidly. Experiments revealed that ViSP achieved positional accuracies similar to single point positioning solutions obtained from single-frequency (L1) GPS observations, with standard deviations around 10 metres. These errors were considered large, given that the geometric accuracy of the 3D model used in the experiments was 10 to 40 cm. The first contribution of this dissertation is to increase the performance of the localization system by combining ViSP with map-building incremental localization, also referred to as simultaneous localization and mapping (SLAM). Experimental results in both indoor and outdoor environments show that sub-metre positional accuracies were achieved while reducing the number of tracking losses throughout the image sequence. It is shown that, by integrating model-based tracking with SLAM, not only does SLAM improve model tracking performance, but the model-based tracker also alleviates the computational expense of SLAM's loop closing procedure, improving runtime performance. Experiments also revealed that ViSP was unable to handle occlusions when a complete 3D building model was used, resulting in large errors in its pose estimates. The second contribution of this dissertation is a novel map-based incremental localization algorithm that improves tracking performance and increases pose estimation accuracy relative to ViSP.
The novelty of this algorithm is an efficient matching process that identifies corresponding linear features between the UAV's RGB image data and a large, complex, and untextured 3D model. The proposed model-based tracker improved positional accuracies from 10 m (obtained with ViSP) to 46 cm in outdoor environments, and achieved 2 cm positional accuracies in large indoor environments, where ViSP could not produce a result. The main disadvantage of any incremental algorithm is that it requires the camera pose of the first frame, and initialization is often a manual process. The third contribution of this dissertation is a map-based absolute localization algorithm that automatically estimates the camera pose when no prior pose information is available. The method uses vertical line matching to register the reference model views with a set of initial input images via geometric hashing. Results demonstrate that sub-metre positional accuracies were achieved and that a proposed enhancement of conventional geometric hashing produced more correct matches: 75% of the correct matches were identified, compared to 11%. Further, the number of incorrect matches was reduced by 80%.
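As a rough illustration of the conventional geometric hashing mentioned above, the following sketch works on 2D points rather than the vertical lines used in the dissertation, and it does not include the proposed enhancement. All function names, the bin size, and the similarity-invariant coordinate scheme are assumptions made for this example.

```python
from collections import defaultdict


def to_basis_frame(point, origin, axis_end):
    """Coordinates of `point` in the frame defined by the basis pair
    (origin, axis_end); invariant to translation, rotation, and uniform scale."""
    ax, ay = axis_end[0] - origin[0], axis_end[1] - origin[1]
    dx, dy = point[0] - origin[0], point[1] - origin[1]
    norm_sq = ax * ax + ay * ay
    u = (dx * ax + dy * ay) / norm_sq    # component along the basis axis
    v = (-dx * ay + dy * ax) / norm_sq   # component perpendicular to it
    return u, v


def quantize(uv, bin_size):
    """Map invariant coordinates to a discrete hash key."""
    return (round(uv[0] / bin_size), round(uv[1] / bin_size))


def build_table(model, bin_size=0.25):
    """Offline stage: hash every model point under every ordered basis pair."""
    table = defaultdict(list)
    n = len(model)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            for k in range(n):
                if k in (i, j):
                    continue
                uv = to_basis_frame(model[k], model[i], model[j])
                table[quantize(uv, bin_size)].append((i, j))
    return table


def vote(table, scene, basis, bin_size=0.25):
    """Online stage: express the scene points in one scene basis and let
    each point vote for the model bases stored under the same hash key."""
    a, b = basis
    votes = defaultdict(int)
    for k, point in enumerate(scene):
        if k in (a, b):
            continue
        uv = to_basis_frame(point, scene[a], scene[b])
        for model_basis in table.get(quantize(uv, bin_size), ()):
            votes[model_basis] += 1
    return votes
```

In use, one would call `build_table` once on the model, then call `vote` for a candidate pair of scene points; a model basis that collects a vote from every remaining scene point is a candidate match, which would then be verified geometrically. Keys that fall close to a bin boundary can be missed by this simple quantization, which is one reason practical systems add tolerance or verification steps.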

    Algorithms for the reconstruction, analysis, repairing and enhancement of 3D urban models from multiple data sources

    Over the last few years, there has been notable growth in the field of digitization of 3D buildings and urban environments. The substantial improvement of both scanning hardware and reconstruction algorithms has led to the development of representations of buildings and cities that can be remotely transmitted and inspected in real time. Among the applications that implement these technologies are several GPS navigators and virtual globes such as Google Earth or the tools provided by the Institut Cartogràfic i Geològic de Catalunya. In this thesis, we conceptualize cities as a collection of individual buildings. Hence, we focus on the individual processing of one structure at a time, rather than on the larger-scale processing of urban environments. Nowadays, there is a wide diversity of digitization technologies, and the choice of the appropriate one is key for each particular application. Roughly, these techniques can be grouped into three main families: time-of-flight (terrestrial and aerial LiDAR), photogrammetry (street-level, satellite, and aerial imagery), and human-edited vector data (cadastre and other map sources). Each of these has its advantages in terms of covered area, data quality, economic cost, and processing effort. Plane- and car-mounted LiDAR devices are optimal for sweeping huge areas, but acquiring and calibrating such devices is not a trivial task. Moreover, the capturing process is done by scan lines, which need to be registered using GPS and inertial data. As an alternative, terrestrial LiDAR devices are more accessible but cover smaller areas, and their sampling strategy usually produces massive point clouds with over-represented flat regions. A less expensive option is street-level imagery. A dense set of images captured with a commodity camera can be fed to state-of-the-art multi-view stereo algorithms to produce realistic-enough reconstructions.
Another advantage of this approach is that it captures high-quality color data, although the resulting geometric information is usually of low quality. In this thesis, we analyze in depth some of the shortcomings of these data-acquisition methods and propose new ways to overcome them. We focus mainly on the technologies that allow high-quality digitization of individual buildings: terrestrial LiDAR for geometric information and street-level imagery for color information. Our main goal is the processing and completion of detailed 3D urban representations. To this end, we work with multiple data sources and combine them when possible to produce models that can be inspected in real time. Our research has focused on the following contributions: effective and feature-preserving simplification of massive point clouds; normal estimation algorithms explicitly designed for LiDAR data; a low-stretch panoramic representation for point clouds; semantic analysis of street-level imagery for improved multi-view stereo reconstruction; color improvement through heuristic techniques and the registration of LiDAR and imagery data; and efficient and faithful visualization of massive point clouds using image-based techniques.