
    Robust Self-calibration of Focal Lengths from the Fundamental Matrix

    The problem of self-calibration of two cameras from a given fundamental matrix is one of the basic problems in geometric computer vision. Under the assumption of known principal points and square pixels, the well-known Bougnoux formula offers a means to compute the two unknown focal lengths. However, in many practical situations, the formula yields inaccurate results due to commonly occurring singularities. Moreover, the estimates are sensitive to noise in the computed fundamental matrix and to the assumed positions of the principal points. In this paper, we therefore propose an efficient and robust iterative method to estimate the focal lengths along with the principal points of the cameras given a fundamental matrix and priors for the estimated camera parameters. In addition, we study a computationally efficient check of models generated within RANSAC that improves the accuracy of the estimated models while reducing the total computational time. Extensive experiments on real and synthetic data show that our iterative method brings significant improvements in terms of the accuracy of the estimated focal lengths over the Bougnoux formula and other state-of-the-art methods, even when relying on inaccurate priors.
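    For readers unfamiliar with the formula the abstract refers to, the following is a minimal sketch of the classical Bougnoux computation of one focal length from a fundamental matrix, assuming known principal points and the convention x2^T F x1 = 0. Sign and normalization conventions differ between write-ups, so treat it as an illustration rather than a reference implementation; the paper's iterative, prior-based method is not reproduced here.

    import numpy as np

    def bougnoux_f1(F, pp1, pp2):
        """Bougnoux estimate of the focal length of camera 1.

        F   : 3x3 fundamental matrix with x2^T F x1 = 0
        pp1 : principal point (cx, cy) of camera 1
        pp2 : principal point (cx, cy) of camera 2
        Returns f1, or None when the squared estimate is negative
        (an "imaginary" focal length, one of the failure modes discussed above).
        """
        p1 = np.array([pp1[0], pp1[1], 1.0])
        p2 = np.array([pp2[0], pp2[1], 1.0])
        I_ = np.diag([1.0, 1.0, 0.0])

        # Epipole in the second image: null vector of F^T (e2^T F = 0).
        _, _, Vt = np.linalg.svd(F.T)
        e2 = Vt[-1]
        e2x = np.array([[0.0, -e2[2], e2[1]],
                        [e2[2], 0.0, -e2[0]],
                        [-e2[1], e2[0], 0.0]])

        num = -(p2 @ e2x @ I_ @ F @ p1) * (p1 @ F.T @ p2)
        den = p2 @ e2x @ I_ @ F @ I_ @ F.T @ p2
        f1_sq = num / den
        return float(np.sqrt(f1_sq)) if f1_sq > 0 else None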

    2D-to-3D photo rendering for 3D displays


    Automatic camera tracking


    Monocular Vision based Crowdsourced 3D Traffic Sign Positioning with Unknown Camera Intrinsics and Distortion Coefficients

    Autonomous vehicles and driver assistance systems utilize maps of 3D semantic landmarks for improved decision making. However, scaling the mapping process, as well as regularly updating such maps, comes at a huge cost. Crowdsourced mapping of these landmarks, such as traffic sign positions, provides an appealing alternative. The state-of-the-art approaches to crowdsourced mapping use ground truth camera parameters, which may not always be known or may change over time. In this work, we demonstrate an approach to computing 3D traffic sign positions without knowing the camera focal lengths, principal point, and distortion coefficients a priori. We validate our proposed approach on a public dataset of traffic signs in KITTI. Using only a monocular color camera and GPS, we achieve an average single-journey relative and absolute positioning accuracy of 0.26 m and 1.38 m, respectively. Comment: Accepted at the 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC).
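    As a hedged illustration of the geometry such a crowdsourced pipeline ultimately relies on, the snippet below triangulates one traffic sign from several frames by linear (DLT) triangulation, assuming per-frame projection matrices have already been recovered. It is a generic building block, not the authors' method, and the inputs are hypothetical.

    import numpy as np

    def triangulate_sign(proj_mats, pixels):
        """proj_mats: list of 3x4 projection matrices, one per frame.
        pixels: list of (u, v) detections of the same sign in those frames.
        Returns the sign's 3D position in the common (e.g. GPS-aligned) frame."""
        rows = []
        for P, (u, v) in zip(proj_mats, pixels):
            rows.append(u * P[2] - P[0])   # standard DLT constraints
            rows.append(v * P[2] - P[1])
        A = np.stack(rows)
        _, _, Vt = np.linalg.svd(A)
        X = Vt[-1]
        return X[:3] / X[3]                # dehomogenize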

    Semi-dense matching of distant, uncalibrated viewpoints

    This article proposes a general method for semi-dense matching of a pair of colour images taken from distant, uncalibrated viewpoints. After initializing the epipolar geometry and the matches with the SIFT image descriptor, the epipolar constraint is recursively tightened in order to refine the matching. At the end of the iterative process, a densification step based on affine correlation yields between 1733 and 10717 matches between two images with resolutions between 800x600 and 1024x768. The complete process runs in about 2 minutes on a 3 GHz Pentium IV computer, without any particular optimization.
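    As an illustration of the initialization step described above (SIFT matching followed by a robust fit of the epipolar geometry), the OpenCV snippet below computes matches and a fundamental matrix with RANSAC. The recursive tightening of the epipolar constraint and the affine-correlation densification from the article are not reproduced, and the image file names are placeholders.

    import cv2
    import numpy as np

    img1 = cv2.imread("view1.png", cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread("view2.png", cv2.IMREAD_GRAYSCALE)

    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    # Lowe's ratio test keeps only distinctive matches.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
            if m.distance < 0.75 * n.distance]

    pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

    # Robust epipolar geometry; F and the inlier mask initialize the matching.
    F, inliers = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.999)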

    EXIF as Language: Learning Cross-Modal Associations Between Images and Camera Metadata

    We learn a visual representation that captures information about the camera that recorded a given photo. To do this, we train a multimodal embedding between image patches and the EXIF metadata that cameras automatically insert into image files. Our model represents this metadata by simply converting it to text and then processing it with a transformer. The features that we learn significantly outperform other self-supervised and supervised features on downstream image forensics and calibration tasks. In particular, we successfully localize spliced image regions "zero shot" by clustering the visual embeddings for all of the patches within an image. Comment: Project link: http://hellomuffin.github.io/exif-as-languag
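    To make the "metadata as text" idea concrete, the sketch below reads a photo's EXIF tags with Pillow and serializes them into a single string that a text transformer could consume. The tag selection and formatting used by the paper are not specified here, so this particular serialization is an assumption.

    from PIL import Image, ExifTags

    def exif_to_text(path):
        """Serialize a photo's EXIF metadata into one text string."""
        exif = Image.open(path).getexif()
        parts = []
        for tag_id, value in exif.items():
            name = ExifTags.TAGS.get(tag_id, str(tag_id))  # numeric id -> readable name
            parts.append(f"{name}: {value}")
        return ", ".join(parts)

    # e.g. "Make: Canon, Model: Canon EOS 5D, FocalLength: 50.0, ..."
    print(exif_to_text("photo.jpg"))  # hypothetical file name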

    Biometric fusion methods for adaptive face recognition in computer vision

    PhD Thesis
    Face recognition is a biometric method that uses a range of techniques to identify individuals from facial information extracted from digital image data. Face recognition systems are widely used for security purposes but still face challenging problems, and this study proposes solutions to some of the most important of them. The aim of this thesis is to investigate the problem of face recognition across pose based on camera calibration parameters estimated from images. Three novel methods are derived to address the challenges of face recognition and to infer the camera parameters from images using a geometric approach based on perspective projection. Two techniques, a camera measurement technique (CMT) for calibration and Face Quadtree Decomposition (FQD), are combined to develop the face camera measurement technique (FCMT) for human facial recognition, together with a feature extraction and identity-matching algorithm. The success and efficacy of the proposed algorithm are analysed in terms of robustness to noise, accuracy of distance measurement, and face recognition performance. To recover the intrinsic and extrinsic camera calibration parameters, a novel technique based on perspective projection has been developed, which uses different geometric shapes to calibrate the camera. The resulting parameters feed the CMT, which enables the system to infer real distances for regular and irregular objects from 2-D images. CMT in turn feeds into FQD to measure the distances between facial points. Quadtree decomposition enhances the representation of edges and other singularities along the curves of the face, and thus improves the directional features used for face detection across pose. The proposed FCMT system combines CMT and FQD to recognise faces across varying poses. The theoretical foundation of the proposed solutions is developed and discussed in detail. The results show that the proposed algorithms outperform existing face recognition algorithms, with a 2.5% improvement in the mean recognition error rate compared with recent studies.
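    As a rough illustration of the quadtree idea mentioned above, the snippet below recursively splits an image block while its intensity variance stays high, so that small leaves concentrate around edges and facial contours. The variance criterion and threshold are illustrative assumptions, not the thesis's FQD definition.

    import numpy as np

    def quadtree(img, x, y, w, h, var_thresh=100.0, min_size=8):
        """Return leaf blocks (x, y, w, h) of a variance-driven quadtree split."""
        block = img[y:y + h, x:x + w]
        if w <= min_size or h <= min_size or block.var() <= var_thresh:
            return [(x, y, w, h)]
        hw, hh = w // 2, h // 2
        leaves = []
        for dx, dy, bw, bh in [(0, 0, hw, hh), (hw, 0, w - hw, hh),
                               (0, hh, hw, h - hh), (hw, hh, w - hw, h - hh)]:
            leaves += quadtree(img, x + dx, y + dy, bw, bh, var_thresh, min_size)
        return leaves

    # Placeholder input; in the thesis this would be a face crop.
    face = (np.random.rand(128, 128) * 255).astype(np.uint8)
    print(len(quadtree(face, 0, 0, 128, 128)))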

    Robust Focal Length Computation

    The problem of automatically computing the focal lengths of two cameras from corresponding images remains a difficult task for the 3D reconstruction community. A number of methods have been proposed to solve it, but none of them is considered to work well enough to be used in practical situations. In this thesis we therefore focus on the task of computing focal lengths from image correspondences, which we regard as the missing link in solving the problem. In particular, we consider the existing algebraic solvers for computing the fundamental matrix and the Bougnoux formula that computes the focal lengths from it. We survey these methods and analyse their performance. We show that the number of imaginary estimates, as well as the error of the focal length estimate, decreases as the number of correspondences used grows. We also analyse the degeneracies of the methods and their behaviour in degenerate situations, as well as the performance of the existing iterative solvers [10, 4], and propose an improvement to the solver of [4]. We further analyse the focal length computation problem from the standpoint of algebraic geometry. We show that, besides the Bougnoux formula, two further formulae exist for computing a camera's focal length from the fundamental matrix, and that using the right one of them can in some cases avoid a known degeneracy, namely the one in which the plane defined by the baseline and the optical axis of one camera is perpendicular to the plane defined by the baseline and the optical axis of the other camera and in which the Bougnoux formula [3] fails. The degeneracy then reduces to the case in which all three formulae fail.

    The problem of automatically computing the focal lengths of a pair of cameras from a corresponding pair of images has long been a daunting task for the 3D reconstruction community. A number of methods have been developed, but the commonly held view is that none of them works well enough to be used in practical situations. We focus on the particular task of computing focal lengths from point correspondences, which we deem to be the missing link for the problem's solution. We especially focus on the existing algebraic solvers for computing the fundamental matrix and the Bougnoux formula for computing the focal lengths therefrom. We survey these methods, as well as the iterative methods [10, 4] proposed as their extensions, and analyse their performance. Our results show that the number of imaginary estimates, as well as the error of the estimation, declines with a growing number of correspondences. Moreover, based on our analysis we suggest that computing the ratio of focal lengths r = f2/f1 is more robust than computing f1 or f2 alone, and we propose an improvement to the solver of [10] based on this suggestion. We furthermore assess the performance of the methods in degenerate situations and show that for higher levels of noise the effect of the degeneracies significantly decreases; specifically, the degenerate case of intersecting optical axes is shown to almost vanish for realistic levels of noise. We finally analyse the problem of computing focal lengths from the theoretical standpoint of algebraic geometry and give two new formulae for computing the camera focal length from a fundamental matrix. We show that using the right one of them may help to avoid a known degeneracy, namely the one where the plane defined by the baseline and the optical axis of one camera is perpendicular to the plane defined by the baseline and the optical axis of the other camera, in which the Bougnoux formula [3] fails. The degeneracy then reduces to the case where all three formulae fail.