469 research outputs found

    Learning Inverse Depth Regression for Multi-View Stereo with Correlation Cost Volume

    Full text link
    Deep learning has shown to be effective for depth inference in multi-view stereo (MVS). However, the scalability and accuracy still remain an open problem in this domain. This can be attributed to the memory-consuming cost volume representation and inappropriate depth inference. Inspired by the group-wise correlation in stereo matching, we propose an average group-wise correlation similarity measure to construct a lightweight cost volume. This can not only reduce the memory consumption but also reduce the computational burden in the cost volume filtering. Based on our effective cost volume representation, we propose a cascade 3D U-Net module to regularize the cost volume to further boost the performance. Unlike the previous methods that treat multi-view depth inference as a depth regression problem or an inverse depth classification problem, we recast multi-view depth inference as an inverse depth regression task. This allows our network to achieve sub-pixel estimation and be applicable to large-scale scenes. Through extensive experiments on DTU dataset and Tanks and Temples dataset, we show that our proposed network with Correlation cost volume and Inverse DEpth Regression (CIDER), achieves state-of-the-art results, demonstrating its superior performance on scalability and accuracy.Comment: Accepted by AAAI-202

    RANSAC for Robotic Applications: A Survey

    Get PDF
    Random Sample Consensus, most commonly abbreviated as RANSAC, is a robust estimation method for the parameters of a model contaminated by a sizable percentage of outliers. In its simplest form, the process starts with a sampling of the minimum data needed to perform an estimation, followed by an evaluation of its adequacy, and further repetitions of this process until some stopping criterion is met. Multiple variants have been proposed in which this workflow is modified, typically tweaking one or several of these steps for improvements in computing time or the quality of the estimation of the parameters. RANSAC is widely applied in the field of robotics, for example, for finding geometric shapes (planes, cylinders, spheres, etc.) in cloud points or for estimating the best transformation between different camera views. In this paper, we present a review of the current state of the art of RANSAC family methods with a special interest in applications in robotics.This work has been partially funded by the Basque Government, Spain, under Research Teams Grant number IT1427-22 and under ELKARTEK LANVERSO Grant number KK-2022/00065; the Spanish Ministry of Science (MCIU), the State Research Agency (AEI), the European Regional Development Fund (FEDER), under Grant number PID2021-122402OB-C21 (MCIU/AEI/FEDER, UE); and the Spanish Ministry of Science, Innovation and Universities, under Grant FPU18/04737

    One at A Time: Multi-step Volumetric Probability Distribution Diffusion for Depth Estimation

    Full text link
    Recent works have explored the fundamental role of depth estimation in multi-view stereo (MVS) and semantic scene completion (SSC). They generally construct 3D cost volumes to explore geometric correspondence in depth, and estimate such volumes in a single step relying directly on the ground truth approximation. However, such problem cannot be thoroughly handled in one step due to complex empirical distributions, especially in challenging regions like occlusions, reflections, etc. In this paper, we formulate the depth estimation task as a multi-step distribution approximation process, and introduce a new paradigm of modeling the Volumetric Probability Distribution progressively (step-by-step) following a Markov chain with Diffusion models (VPDD). Specifically, to constrain the multi-step generation of volume in VPDD, we construct a meta volume guidance and a confidence-aware contextual guidance as conditional geometry priors to facilitate the distribution approximation. For the sampling process, we further investigate an online filtering strategy to maintain consistency in volume representations for stable training. Experiments demonstrate that our plug-and-play VPDD outperforms the state-of-the-arts for tasks of MVS and SSC, and can also be easily extended to different baselines to get improvement. It is worth mentioning that we are the first camera-based work that surpasses LiDAR-based methods on the SemanticKITTI dataset

    Modélisation tridimensionnelle précise de l'environnement à l’aide des systèmes de photogrammétrie embarqués sur drones

    Get PDF
    Abstract : Images acquired from unmanned aerial vehicles (UAVs) can provide data with unprecedented spatial and temporal resolution for three-dimensional (3D) modeling. Solutions developed for this purpose are mainly operating based on photogrammetry concepts, namely UAV-Photogrammetry Systems (UAV-PS). Such systems are used in applications where both geospatial and visual information of the environment is required. These applications include, but are not limited to, natural resource management such as precision agriculture, military and police-related services such as traffic-law enforcement, precision engineering such as infrastructure inspection, and health services such as epidemic emergency management. UAV-photogrammetry systems can be differentiated based on their spatial characteristics in terms of accuracy and resolution. That is some applications, such as precision engineering, require high-resolution and high-accuracy information of the environment (e.g. 3D modeling with less than one centimeter accuracy and resolution). In other applications, lower levels of accuracy might be sufficient, (e.g. wildlife management needing few decimeters of resolution). However, even in those applications, the specific characteristics of UAV-PSs should be well considered in the steps of both system development and application in order to yield satisfying results. In this regard, this thesis presents a comprehensive review of the applications of unmanned aerial imagery, where the objective was to determine the challenges that remote-sensing applications of UAV systems currently face. This review also allowed recognizing the specific characteristics and requirements of UAV-PSs, which are mostly ignored or not thoroughly assessed in recent studies. Accordingly, the focus of the first part of this thesis is on exploring the methodological and experimental aspects of implementing a UAV-PS. The developed system was extensively evaluated for precise modeling of an open-pit gravel mine and performing volumetric-change measurements. This application was selected for two main reasons. Firstly, this case study provided a challenging environment for 3D modeling, in terms of scale changes, terrain relief variations as well as structure and texture diversities. Secondly, open-pit-mine monitoring demands high levels of accuracy, which justifies our efforts to improve the developed UAV-PS to its maximum capacities. The hardware of the system consisted of an electric-powered helicopter, a high-resolution digital camera, and an inertial navigation system. The software of the system included the in-house programs specifically designed for camera calibration, platform calibration, system integration, onboard data acquisition, flight planning and ground control point (GCP) detection. The detailed features of the system are discussed in the thesis, and solutions are proposed in order to enhance the system and its photogrammetric outputs. The accuracy of the results was evaluated under various mapping conditions, including direct georeferencing and indirect georeferencing with different numbers, distributions and types of ground control points. Additionally, the effects of imaging configuration and network stability on modeling accuracy were assessed. The second part of this thesis concentrates on improving the techniques of sparse and dense reconstruction. The proposed solutions are alternatives to traditional aerial photogrammetry techniques, properly adapted to specific characteristics of unmanned, low-altitude imagery. Firstly, a method was developed for robust sparse matching and epipolar-geometry estimation. The main achievement of this method was its capacity to handle a very high percentage of outliers (errors among corresponding points) with remarkable computational efficiency (compared to the state-of-the-art techniques). Secondly, a block bundle adjustment (BBA) strategy was proposed based on the integration of intrinsic camera calibration parameters as pseudo-observations to Gauss-Helmert model. The principal advantage of this strategy was controlling the adverse effect of unstable imaging networks and noisy image observations on the accuracy of self-calibration. The sparse implementation of this strategy was also performed, which allowed its application to data sets containing a lot of tie points. Finally, the concepts of intrinsic curves were revisited for dense stereo matching. The proposed technique could achieve a high level of accuracy and efficiency by searching only through a small fraction of the whole disparity search space as well as internally handling occlusions and matching ambiguities. These photogrammetric solutions were extensively tested using synthetic data, close-range images and the images acquired from the gravel-pit mine. Achieving absolute 3D mapping accuracy of 11±7 mm illustrated the success of this system for high-precision modeling of the environment.Résumé : Les images acquises à l’aide d’aéronefs sans pilote (ASP) permettent de produire des données de résolutions spatiales et temporelles uniques pour la modélisation tridimensionnelle (3D). Les solutions développées pour ce secteur d’activité sont principalement basées sur des concepts de photogrammétrie et peuvent être identifiées comme des systèmes photogrammétriques embarqués sur aéronefs sans pilote (SP-ASP). Ils sont utilisés dans plusieurs applications environnementales où l’information géospatiale et visuelle est essentielle. Ces applications incluent notamment la gestion des ressources naturelles (ex. : agriculture de précision), la sécurité publique et militaire (ex. : gestion du trafic), les services d’ingénierie (ex. : inspection de bâtiments) et les services de santé publique (ex. : épidémiologie et gestion des risques). Les SP-ASP peuvent être subdivisés en catégories selon les besoins en termes de précision et de résolution. En effet, dans certains cas, tel qu’en ingénierie, l’information sur l’environnement doit être de haute précision et de haute résolution (ex. : modélisation 3D avec une précision et une résolution inférieure à un centimètre). Pour d’autres applications, tel qu’en gestion de la faune sauvage, des niveaux de précision et de résolution moindres peut être suffisants (ex. : résolution de l’ordre de quelques décimètres). Cependant, même dans ce type d’applications les caractéristiques des SP-ASP devraient être prises en considération dans le développement des systèmes et dans leur utilisation, et ce, pour atteindre les résultats visés. À cet égard, cette thèse présente une revue exhaustive des applications de l’imagerie aérienne acquise par ASP et de déterminer les challenges les plus courants. Cette étude a également permis d’établir les caractéristiques et exigences spécifiques des SP-ASP qui sont généralement ignorées ou partiellement discutées dans les études récentes. En conséquence, la première partie de cette thèse traite des aspects méthodologiques et d’expérimentation de la mise en place d’un SP-ASP. Le système développé a été évalué pour la modélisation précise d’une gravière et utilisé pour réaliser des mesures de changement volumétrique. Cette application a été retenue pour deux raisons principales. Premièrement, ce type de milieu fournit un environnement difficile pour la modélisation, et ce, en termes de changement d’échelle, de changement de relief du terrain ainsi que la grande diversité de structures et de textures. Deuxièment, le suivi de mines à ciel ouvert exige un niveau de précision élevé, ce qui justifie les efforts déployés pour mettre au point un SP-ASP de haute précision. Les composantes matérielles du système consistent en un ASP à propulsion électrique de type hélicoptère, d’une caméra numérique à haute résolution ainsi qu’une station inertielle. La composante logicielle est composée de plusieurs programmes développés particulièrement pour calibrer la caméra et la plateforme, intégrer les systèmes, enregistrer les données, planifier les paramètres de vol et détecter automatiquement les points de contrôle au sol. Les détails complets du système sont abordés dans la thèse et des solutions sont proposées afin d’améliorer le système et la qualité des données photogrammétriques produites. La précision des résultats a été évaluée sous diverses conditions de cartographie, incluant le géoréférencement direct et indirect avec un nombre, une répartition et des types de points de contrôle variés. De plus, les effets de la configuration des images et la stabilité du réseau sur la précision de la modélisation ont été évalués. La deuxième partie de la thèse porte sur l’amélioration des techniques de reconstruction éparse et dense. Les solutions proposées sont des alternatives aux techniques de photogrammétrie aérienne traditionnelle et adaptée aux caractéristiques particulières de l’imagerie acquise à basse altitude par ASP. Tout d’abord, une méthode robuste de correspondance éparse et d’estimation de la géométrie épipolaire a été développée. L’élément clé de cette méthode est sa capacité à gérer le pourcentage très élevé des valeurs aberrantes (erreurs entre les points correspondants) avec une efficacité de calcul remarquable en comparaison avec les techniques usuelles. Ensuite, une stratégie d’ajustement de bloc basée sur l’intégration de pseudoobservations du modèle Gauss-Helmert a été proposée. Le principal avantage de cette stratégie consistait à contrôler les effets négatifs du réseau d’images instable et des images bruitées sur la précision de l’autocalibration. Une implémentation éparse de cette stratégie a aussi été réalisée, ce qui a permis de traiter des jeux de données contenant des millions de points de liaison. Finalement, les concepts de courbes intrinsèques ont été revisités pour l’appariement stéréo dense. La technique proposée pourrait atteindre un haut niveau de précision et d’efficacité en recherchant uniquement dans une petite portion de l’espace de recherche des disparités ainsi qu’en traitant les occlusions et les ambigüités d’appariement. Ces solutions photogrammétriques ont été largement testées à l’aide de données synthétiques, d’images à courte portée ainsi que celles acquises sur le site de la gravière. Le système a démontré sa capacité a modélisation dense de l’environnement avec une très haute exactitude en atteignant une précision 3D absolue de l’ordre de 11±7 mm

    Robust surface modelling of visual hull from multiple silhouettes

    Get PDF
    Reconstructing depth information from images is one of the actively researched themes in computer vision and its application involves most vision research areas from object recognition to realistic visualisation. Amongst other useful vision-based reconstruction techniques, this thesis extensively investigates the visual hull (VH) concept for volume approximation and its robust surface modelling when various views of an object are available. Assuming that multiple images are captured from a circular motion, projection matrices are generally parameterised in terms of a rotation angle from a reference position in order to facilitate the multi-camera calibration. However, this assumption is often violated in practice, i.e., a pure rotation in a planar motion with accurate rotation angle is hardly realisable. To address this problem, at first, this thesis proposes a calibration method associated with the approximate circular motion. With these modified projection matrices, a resulting VH is represented by a hierarchical tree structure of voxels from which surfaces are extracted by the Marching cubes (MC) algorithm. However, the surfaces may have unexpected artefacts caused by a coarser volume reconstruction, the topological ambiguity of the MC algorithm, and imperfect image processing or calibration result. To avoid this sensitivity, this thesis proposes a robust surface construction algorithm which initially classifies local convex regions from imperfect MC vertices and then aggregates local surfaces constructed by the 3D convex hull algorithm. Furthermore, this thesis also explores the use of wide baseline images to refine a coarse VH using an affine invariant region descriptor. This improves the quality of VH when a small number of initial views is given. In conclusion, the proposed methods achieve a 3D model with enhanced accuracy. Also, robust surface modelling is retained when silhouette images are degraded by practical noise

    Efficient and Accurate Disparity Estimation from MLA-Based Plenoptic Cameras

    Get PDF
    This manuscript focuses on the processing images from microlens-array based plenoptic cameras. These cameras enable the capturing of the light field in a single shot, recording a greater amount of information with respect to conventional cameras, allowing to develop a whole new set of applications. However, the enhanced information introduces additional challenges and results in higher computational effort. For one, the image is composed of thousand of micro-lens images, making it an unusual case for standard image processing algorithms. Secondly, the disparity information has to be estimated from those micro-images to create a conventional image and a three-dimensional representation. Therefore, the work in thesis is devoted to analyse and propose methodologies to deal with plenoptic images. A full framework for plenoptic cameras has been built, including the contributions described in this thesis. A blur-aware calibration method to model a plenoptic camera, an optimization method to accurately select the best microlenses combination, an overview of the different types of plenoptic cameras and their representation. Datasets consisting of both real and synthetic images have been used to create a benchmark for different disparity estimation algorithm and to inspect the behaviour of disparity under different compression rates. A robust depth estimation approach has been developed for light field microscopy and image of biological samples
    • …
    corecore