84 research outputs found

    An A Contrario Model for Matching Interest Points under Geometric and Photometric Constraints

    Finding point correspondences between two views is generally based on the matching of local photometric descriptors. A subsequent geometric constraint ensures that the set of matching points is consistent with a realistic camera motion. Starting from a paper by Moisan and Stival, we propose an a contrario model for matching interest points based on descriptor similarity and geometric constraints. The resulting algorithm has adaptive matching thresholds and is able to detect point correspondences whose associated descriptors are not the first nearest neighbor. We also discuss the specific difficulties raised by images containing repeated patterns, which are likely to introduce correspondences beyond the nearest neighbor.
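    As a rough illustration of the a contrario principle (a sketch, not the authors' exact formulation), the snippet below replaces a fixed descriptor-distance threshold with a number-of-false-alarms (NFA) bound estimated from a background model of unrelated descriptor pairs. The threshold adapts to the data and can keep several candidates per point, i.e. correspondences beyond the first nearest neighbor. The function names and the empirical background model are illustrative assumptions.

```python
import numpy as np

def background_cdf(desc1, desc2, n_samples=5000, seed=0):
    """Empirical CDF of distances between randomly paired (hence presumably
    unrelated) descriptors -- an assumed a contrario background model."""
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(desc1), n_samples)
    j = rng.integers(0, len(desc2), n_samples)
    d = np.sort(np.linalg.norm(desc1[i] - desc2[j], axis=1))
    return lambda x: np.searchsorted(d, x, side="right") / len(d)

def a_contrario_matches(desc1, desc2, epsilon=1.0):
    """Keep pairs whose NFA = (#tested pairs) * P_bg(distance <= d) < epsilon.
    This yields an adaptive, data-driven threshold instead of a fixed one."""
    cdf = background_cdf(desc1, desc2)
    n_tests = len(desc1) * len(desc2)
    matches = []
    for i, d1 in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d1, axis=1)
        for j in np.flatnonzero(n_tests * cdf(dists) < epsilon):
            matches.append((i, j))
    return matches

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    a = rng.normal(size=(40, 128))
    b = rng.normal(size=(50, 128))
    b[:10] = a[:10] + 0.05 * rng.normal(size=(10, 128))  # ten planted matches
    print(a_contrario_matches(a, b))
```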

    Image point correspondences and repeated patterns

    Matching or tracking interest points between several views is one of the keystones of many computer vision applications. The procedure generally consists of several independent steps: interest point extraction, then interest point matching by keeping only the ''best correspondences'' with respect to the similarity of some local descriptors, and finally correspondence pruning to keep those that are consistent with a realistic camera motion (here, consistent with epipolar constraints or a homography transformation). Each step is in itself a delicate task which may endanger the whole process. In particular, repeated patterns produce many false correspondences in descriptor-based matching which are hardly, if ever, recovered by the final pruning step. We discuss here the specific difficulties raised by repeated patterns in the point correspondence problem, and then show to what extent it is possible to address these difficulties. Starting from a statistical model by Moisan and Stival, we propose a one-stage approach for matching interest points based simultaneously on descriptor similarity and a geometric constraint. The resulting algorithm has adaptive matching thresholds and is able to pick up point correspondences beyond the nearest neighbour. We also discuss Generalized RANSAC and show how to improve Morel and Yu's ASIFT, an effective point matching algorithm, to make it more robust to the presence of repeated patterns.

    Lifting GIS Maps into Strong Geometric Context for Scene Understanding

    Contextual information can have a substantial impact on the performance of visual tasks such as semantic segmentation, object detection, and geometric estimation. Data stored in Geographic Information Systems (GIS) offers a rich source of contextual information that has been largely untapped by computer vision. We propose to leverage such information for scene understanding by combining GIS resources with large sets of unorganized photographs using Structure from Motion (SfM) techniques. We present a pipeline to quickly generate strong 3D geometric priors from 2D GIS data using SfM models aligned with minimal user input. Given an image resectioned against this model, we generate robust predictions of depth, surface normals, and semantic labels. We show that the predicted geometry is substantially more accurate than that of other single-image depth estimation methods. We then demonstrate the utility of these contextual constraints for re-scoring pedestrian detections, and use these GIS contextual features alongside object detection score maps to improve a CRF-based semantic segmentation framework, boosting accuracy over baseline models.
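    One way such geometric priors can help detection (a hedged sketch, not the paper's exact re-scoring model): given a per-pixel metric depth prediction and the camera focal length, the expected pixel height of a pedestrian at a detection's location can be compared with the bounding-box height, and detections that are badly out of scale are down-weighted. All names and the Gaussian consistency score below are illustrative assumptions.

```python
import numpy as np

def rescore_detections(detections, depth_map, focal_px,
                       person_height_m=1.7, sigma=0.3):
    """Down-weight pedestrian detections whose box height disagrees with
    the height implied by the geometric prior (depth from the 3D model).

    detections: list of (x1, y1, x2, y2, score)
    depth_map : HxW array of predicted metric depth
    """
    rescored = []
    for (x1, y1, x2, y2, score) in detections:
        cx = int(round((x1 + x2) / 2))
        cy = int(round((y1 + y2) / 2))
        z = depth_map[np.clip(cy, 0, depth_map.shape[0] - 1),
                      np.clip(cx, 0, depth_map.shape[1] - 1)]
        expected_h = focal_px * person_height_m / max(z, 1e-6)  # pinhole model
        observed_h = y2 - y1
        # Gaussian consistency in log-scale: ~1 when heights agree, -> 0 otherwise
        consistency = np.exp(-0.5 * (np.log(observed_h / expected_h) / sigma) ** 2)
        rescored.append((x1, y1, x2, y2, score * consistency))
    return rescored
```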

    Determining point correspondences between two views under geometric constraint and photometric consistency

    Matching or tracking points of interest between several views is one of the keystones of many computer vision applications, especially when considering structure and motion estimation. The procedure generally consists of several independent steps: 1) point of interest extraction, 2) point of interest matching by keeping only the ``best correspondences'' with respect to the similarity of some local descriptors, and 3) correspondence pruning to keep those consistent with an estimated camera motion (here, consistent with epipolar constraints or a homography transformation). Each step is in itself a delicate task which may endanger the whole process. In particular, repeated patterns give many false matches in step 2) which are hardly, if ever, recovered by step 3). Starting from a statistical model by Moisan and Stival, we propose a new one-stage approach to steps 2) and 3) which does not rely on hard-to-tune parameters. The advantage of the proposed method is its robustness to repeated patterns.
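    For reference, the conventional multi-stage pipeline that the proposed one-stage approach replaces looks roughly like the OpenCV sketch below (SIFT extraction, Lowe's ratio test, then RANSAC pruning against epipolar geometry). Parameter values are common defaults rather than the authors'; with repeated patterns, step 2 already discards correct candidates, which step 3 cannot recover.

```python
import cv2
import numpy as np

def match_two_views(img1, img2, ratio=0.8, ransac_thresh=1.0):
    # 1) point of interest extraction
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    # 2) descriptor matching with Lowe's ratio test ("best correspondences");
    #    repeated patterns break this step because the second nearest
    #    neighbour is then almost as close as the first one
    knn = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
    good = [m for m, n in knn if m.distance < ratio * n.distance]

    # 3) geometric pruning: keep matches consistent with epipolar geometry
    pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in good])
    _, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC,
                                     ransac_thresh, 0.99)
    return [m for m, keep in zip(good, mask.ravel()) if keep]
```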

    Virtual Line Descriptor and Semi-Local Matching Method for Reliable Feature Correspondence

    International audienceFinding reliable correspondences between sets of feature points in two images remains challenging in case of ambiguities or strong transformations. In this paper, we define a photometric descriptor for virtual lines that join neighbouring feature points. We show that it can be used in the second-order term of existing graph matchers to significantly improve their accuracy. We also define a semi-local matching method based on this descriptor. We show that it is robust to strong transformations and more accurate than existing graph matchers for scenes with significant occlusions, including for very low inlier rates. Used as a preprocessor to filter outliers from match candidates, it significantly improves the robustness of RANSAC and reduces camera calibration errors
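    A minimal sketch of the virtual-line idea (an illustrative reading, not the paper's exact descriptor): sample image intensities at regular positions along the segment joining two neighbouring keypoints and normalize them, so the resulting vector describes the photometry between the points and can feed a pairwise (second-order) similarity term in a graph matcher. All names below are assumptions.

```python
import numpy as np

def virtual_line_descriptor(image, p, q, n_samples=16):
    """Sample intensities along the virtual line joining keypoints p and q.

    image: 2D grayscale array; p, q: (x, y) keypoint coordinates.
    """
    p, q = np.asarray(p, float), np.asarray(q, float)
    t = np.linspace(0.0, 1.0, n_samples)
    xs = p[0] + t * (q[0] - p[0])
    ys = p[1] + t * (q[1] - p[1])
    # nearest-neighbour sampling keeps the sketch dependency-free
    vals = image[np.clip(np.round(ys).astype(int), 0, image.shape[0] - 1),
                 np.clip(np.round(xs).astype(int), 0, image.shape[1] - 1)]
    vals = vals.astype(float)
    # affine photometric normalization for some illumination invariance
    return (vals - vals.mean()) / (vals.std() + 1e-8)

def pairwise_similarity(img1, img2, p1, q1, p2, q2):
    """Second-order score comparing the virtual line (p1,q1) in image 1
    with the virtual line (p2,q2) in image 2 (higher = more compatible)."""
    d1 = virtual_line_descriptor(img1, p1, q1)
    d2 = virtual_line_descriptor(img2, p2, q2)
    return float(d1 @ d2) / len(d1)
```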

    An Efficient Point-Matching Method Based on Multiple Geometrical Hypotheses

    Point matching in multiple images is an open problem in computer vision because of the numerous geometric transformations and photometric conditions that a pixel or point might exhibit across the set of images. Over the last two decades, different techniques have been proposed to address this problem. The most relevant are those that explore the analysis of invariant features. Nonetheless, their main limitation is that invariant analysis alone cannot reduce false alarms. This paper introduces an efficient point-matching method for two and three views, based on the combined use of two techniques: (1) correspondence analysis extracted from the similarity of invariant features and (2) the integration of multiple partial solutions obtained from 2D and 3D geometry. The main strength and novelty of this method is the determination of the point-to-point geometric correspondence through the intersection of multiple geometrical hypotheses weighted by the maximum likelihood estimation sample consensus (MLESAC) algorithm. The proposal not only extends the methods based on invariant descriptors but also generalizes the correspondence problem to a perspective projection model in multiple views. The developed method has been evaluated on three types of image sequences: outdoor, indoor, and industrial. Our strategy discards most of the wrong matches and achieves remarkable F-scores of 97%, 87%, and 97% for the outdoor, indoor, and industrial sequences, respectively.
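    A hedged sketch of the kind of scoring involved (not the paper's implementation): MLESAC evaluates each geometric hypothesis by the likelihood of its residuals under a Gaussian-inlier / uniform-outlier mixture, and the per-correspondence inlier posteriors from several hypotheses can then be intersected to keep only points supported by multiple geometries. The fusion rule and all names below are illustrative assumptions.

```python
import numpy as np

def mlesac_score(residuals, sigma=1.0, nu=20.0):
    """MLESAC negative log-likelihood of residuals under a mixture of a
    Gaussian inlier model and a uniform outlier model of width nu.
    The mixing weight gamma is refined with a few EM iterations."""
    gamma = 0.5
    for _ in range(5):
        p_in = gamma * np.exp(-residuals**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
        p_out = (1 - gamma) / nu
        w = p_in / (p_in + p_out)          # per-correspondence inlier posterior
        gamma = w.mean()
    return -np.sum(np.log(p_in + p_out)), w   # lower cost = better hypothesis

def fuse_hypotheses(residuals_per_hypothesis, w_threshold=0.5):
    """Weight each geometric hypothesis by its MLESAC likelihood and keep the
    correspondences supported by the intersection of the surviving hypotheses
    -- an illustrative reading of the combination step, not the paper's."""
    costs, weights = zip(*(mlesac_score(r) for r in residuals_per_hypothesis))
    best = np.argsort(costs)[: max(1, len(costs) // 2)]   # keep the best half
    keep = np.ones_like(weights[0], dtype=bool)
    for k in best:
        keep &= weights[k] > w_threshold
    return keep
```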

    3D SEM Surface Reconstruction: An Optimized, Adaptive, and Intelligent Approach

    Structural analysis of microscopic objects is a longstanding topic in several scientific disciplines, including the biological, mechanical, and material sciences. The scanning electron microscope (SEM) is a promising imaging instrument for determining the surface properties (e.g., composition or geometry) of specimens, achieving high magnification and contrast and a resolution finer than one nanometer. However, SEM micrographs remain two-dimensional (2D), while many research and educational questions truly require knowledge of the specimens' three-dimensional (3D) surface structure. Recovering 3D surfaces from SEM images would provide the true anatomic shapes of micro samples, allowing quantitative measurements and informative visualization of the systems being investigated. In this research project, we design and develop a novel, optimized, adaptive, and intelligent multi-view approach named 3DSEM++ for 3D surface reconstruction from SEM images, and we make a 3D SEM dataset publicly and freely available to the research community. The work is expected to stimulate more interest and draw attention from the computer vision and multimedia communities to the fast-growing SEM application area.

    How to Overcome Perceptual Aliasing in ASIFT?

    SIFT is one of the most popular algorithms for extracting points of interest from images. It is a scale+rotation invariant method. As a consequence, if one compares points of interest between two images subject to a large viewpoint change, then only a few, if any, common points will be retrieved. This may cause subsequent algorithms to fail, especially when considering structure-and-motion or object recognition problems. Reaching at least affine invariance is crucial for reliable point correspondences. Successful approaches have recently been proposed by several authors to strengthen scale+rotation invariance into affine invariance, using viewpoint simulation (e.g. the ASIFT algorithm). However, almost all of the resulting algorithms fail in the presence of repeated patterns, which are common in man-made environments, because of the so-called perceptual aliasing. Focusing on ASIFT, we show how to overcome the perceptual aliasing problem. To the best of our knowledge, the resulting algorithm performs better than any existing generic point matching procedure.
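    For background, viewpoint simulation in the spirit of ASIFT can be sketched as follows: warp the image with a small grid of affine tilts and rotations, detect SIFT keypoints in every warped copy, and map them back to the original frame before matching. This is a coarse illustration under assumed parameters, not ASIFT's published sampling of tilts and angles.

```python
import cv2
import numpy as np

def simulate_views(image, tilts=(1.0, 1.4142, 2.0), phis_deg=(0, 45, 90, 135)):
    """Yield affine-warped copies of `image` with their 2x3 warp matrices,
    approximating viewpoint changes (a coarse illustrative parameter grid)."""
    h, w = image.shape[:2]
    for t in tilts:
        for phi in (phis_deg if t > 1.0 else (0,)):
            A = cv2.getRotationMatrix2D((w / 2, h / 2), phi, 1.0)
            A[1, :] /= t                      # tilt: compress along one axis
            warped = cv2.warpAffine(image, A, (w, int(h / t) + 1))
            yield warped, A

def asift_like_keypoints(image):
    """Detect SIFT keypoints on every simulated view and map their
    coordinates back to the original image frame."""
    sift = cv2.SIFT_create()
    kps, descs = [], []
    for warped, A in simulate_views(image):
        k, d = sift.detectAndCompute(warped, None)
        if d is None:
            continue
        Ainv = cv2.invertAffineTransform(A)
        for kp, desc in zip(k, d):
            x, y = kp.pt
            x0 = Ainv[0, 0] * x + Ainv[0, 1] * y + Ainv[0, 2]
            y0 = Ainv[1, 0] * x + Ainv[1, 1] * y + Ainv[1, 2]
            kps.append((x0, y0))
            descs.append(desc)
    return np.array(kps), np.array(descs)
```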

    Robust Visual SLAM in Challenging Environments with Low-texture and Dynamic Illumination

    In recent years, visual Simultaneous Localization and Mapping (SLAM) has played a role of capital importance in rapid technological advances, e.g. mobile robotics and applications such as virtual, augmented, or mixed reality (VR/AR/MR), as a vital part of their processing pipelines. As its name indicates, it comprises the estimation of the state of a robot (typically the pose) while simultaneously, and incrementally, building and refining a consistent representation of the environment, i.e. the so-called map, based on the equipped sensors. Despite the maturity reached by state-of-the-art visual SLAM techniques in controlled environments, there are still many open challenges to address before reaching a SLAM system robust to long-term operation in uncontrolled scenarios, where classical assumptions, such as static environments, no longer hold. This thesis contributes to improving the robustness of visual SLAM in harsh or difficult environments, in particular:

    - Low-textured Environments, where traditional approaches suffer from an accuracy impoverishment and, occasionally, the absolute failure of the system. Fortunately, many such low-textured environments contain planar elements that are rich in linear shapes, so an alternative feature choice such as line segments can exploit information from structured parts of the scene. This set of contributions exploits both types of features, i.e. points and line segments, to produce visual odometry and SLAM algorithms that are robust in a broader variety of environments, leveraging them in all instances of the related processes: monocular depth estimation, visual odometry, keyframe selection, bundle adjustment, loop closing, etc. Additionally, an open-source C++ implementation of the proposed algorithms has been released along with the published articles and some extra multimedia material for the benefit of the community. (A sketch of the point-and-line residuals involved is given after this list.)

    - Robustness to Dynamic Illumination conditions is also one of the main open challenges in visual odometry and SLAM, e.g. in high dynamic range (HDR) environments. The main difficulties in these situations come both from the limitations of the sensors, for instance the automatic settings of a camera might not react fast enough to properly record dynamic illumination changes, and from limitations in the algorithms, e.g. the tracking of interest points is typically based on brightness constancy. The work of this thesis contributes to mitigating these phenomena from two different perspectives. The first addresses the problem from a deep learning perspective by enhancing images into invariant and richer representations for VO and SLAM, benefiting from the generalization properties of deep neural networks; this work also demonstrates how the insertion of long short-term memory (LSTM) units allows temporally consistent sequences to be obtained, since the estimation depends on previous states. Secondly, a more traditional perspective is exploited to contribute a purely geometric tracking of line segments in challenging stereo streams with complex or varying illumination, since line segments are intrinsically more informative.
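    As referenced above, here is a minimal sketch of the two residual types a point-and-line visual odometry or bundle adjustment typically minimizes: the usual point reprojection error, and the distance of the projected 3D segment endpoints to the line through the detected 2D segment. Conventions and names are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

def project(K, T_cw, X_w):
    """Project 3D world points (Nx3) into pixels with pose T_cw (4x4 world-to-
    camera transform) and intrinsics K (3x3), using a simple pinhole model."""
    X_h = np.c_[X_w, np.ones(len(X_w))]
    X_c = (T_cw @ X_h.T).T[:, :3]
    uv = (K @ X_c.T).T
    return uv[:, :2] / uv[:, 2:3]

def point_residual(K, T_cw, X_w, uv_obs):
    """Standard point reprojection error (Nx2)."""
    return project(K, T_cw, X_w) - uv_obs

def line_residual(K, T_cw, P_w, Q_w, p_obs, q_obs):
    """Line-segment reprojection error: signed distances of the projected
    3D endpoints P, Q to the infinite line through the observed 2D
    segment (p_obs, q_obs), as commonly used in point+line SLAM."""
    l = np.cross(np.r_[p_obs, 1.0], np.r_[q_obs, 1.0])
    l = l / np.linalg.norm(l[:2])                       # normalized line coeffs
    pq_proj = project(K, T_cw, np.vstack([P_w, Q_w]))
    return np.array([l @ np.r_[pq_proj[0], 1.0],
                     l @ np.r_[pq_proj[1], 1.0]])
```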

    Hierarchical structure-and-motion recovery from uncalibrated images

    This paper addresses the structure-and-motion problem, which requires recovering camera motion and 3D structure from point matches. A new pipeline, dubbed Samantha, is presented that departs from the prevailing sequential paradigm and instead embraces a hierarchical approach. This method has several advantages, such as a provably lower computational complexity, which is necessary to achieve true scalability, and better error containment, leading to more stability and less drift. Moreover, a practical autocalibration procedure makes it possible to process images without ancillary information. Experiments with real data assess the accuracy and the computational efficiency of the method.
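    A schematic of the hierarchical paradigm (an illustrative sketch; `reconstruct_pair` and `merge_models` stand in for the real two-view reconstruction and submodel alignment/bundle adjustment steps, which this snippet does not implement): views are organized into a dendrogram from pairwise image similarity, and partial reconstructions are merged bottom-up, which is what yields the lower complexity and better error containment compared with a sequential pipeline.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, to_tree

def hierarchical_reconstruction(similarity, reconstruct_pair, merge_models):
    """Organize views into a dendrogram from an NxN pairwise image-similarity
    matrix and build the reconstruction bottom-up: leaves are single views,
    small clusters become partial models, and partial models are merged on
    the way up."""
    dist = 1.0 - similarity                     # turn similarity into a distance
    condensed = dist[np.triu_indices_from(dist, k=1)]
    tree = to_tree(linkage(condensed, method="average"))

    def build(node):
        if node.is_leaf():
            return {"views": [node.id], "model": None}
        left, right = build(node.left), build(node.right)
        # reconstruct_pair handles two single views; merge_models aligns and
        # bundle-adjusts any other combination (model+model or model+view)
        if left["model"] is None and right["model"] is None:
            model = reconstruct_pair(left["views"], right["views"])
        else:
            model = merge_models(left, right)
        return {"views": left["views"] + right["views"], "model": model}

    return build(tree)
```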