
    Depth Estimation Analysis of Orthogonally Divergent Fisheye Cameras with Distortion Removal

    Stereo vision systems have become popular in computer vision applications such as 3D reconstruction, object tracking, and autonomous navigation. However, traditional stereo vision systems that use rectilinear lenses may not be suitable for certain scenarios because of their limited field of view. This has led to the popularity of vision systems based on one or more fisheye cameras in different orientations, which can provide a field of view of 180x180 degrees or more. However, fisheye cameras introduce significant distortion at the edges that affects the accuracy of stereo matching and depth estimation. To overcome these limitations, this paper proposes a method for distortion removal and depth estimation analysis for a stereo vision system using orthogonally divergent fisheye cameras (ODFC). The proposed method uses two virtual pinhole cameras (VPCs), each of which captures a small portion of the original view and presents it without any lens distortion, emulating the behavior of a pinhole camera. By carefully selecting the captured regions, a stereo pair can be created from the two VPCs. The performance of the proposed method is evaluated both in simulation using a virtual environment and in experiments using real cameras, and the results are compared to those of stereo cameras with parallel optical axes. The results demonstrate the effectiveness of the proposed method in terms of distortion removal and depth estimation accuracy.
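    As a rough illustration of the virtual pinhole camera idea, the sketch below renders one undistorted VPC view from a fisheye image by casting each pinhole pixel as a ray and sampling the fisheye image where that ray lands. It is not the paper's implementation: it assumes an equidistant fisheye model (r = f * theta), and all names and parameters (vpc_remap, f_fish, f_pin, R) are illustrative.

```python
import numpy as np
import cv2

def vpc_remap(fisheye_img, f_fish, cx, cy, R, f_pin, w, h):
    """Render a virtual pinhole camera (VPC) view, rotated by R, from a fisheye image."""
    # Pixel grid of the virtual pinhole image.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Rays in the VPC frame (pinhole model, principal point at image center).
    rays = np.stack([(u - w / 2) / f_pin,
                     (v - h / 2) / f_pin,
                     np.ones_like(u, dtype=np.float64)], axis=-1)
    # Rotate rays into the fisheye camera frame and normalize.
    rays = rays @ R.T
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)
    # Equidistant fisheye projection: image radius proportional to the
    # angle between the ray and the optical axis.
    theta = np.arccos(np.clip(rays[..., 2], -1.0, 1.0))
    phi = np.arctan2(rays[..., 1], rays[..., 0])
    r = f_fish * theta
    map_x = (cx + r * np.cos(phi)).astype(np.float32)
    map_y = (cy + r * np.sin(phi)).astype(np.float32)
    return cv2.remap(fisheye_img, map_x, map_y, cv2.INTER_LINEAR)
```

    Calling vpc_remap on each fisheye camera with rotations aimed at their shared field of view would yield the distortion-free stereo pair the paper builds on.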

    RomniStereo: Recurrent Omnidirectional Stereo Matching

    Omnidirectional stereo matching (OSM) is an essential and reliable means for 360° depth sensing. However, following earlier works on conventional stereo matching, prior state-of-the-art (SOTA) methods rely on a 3D encoder-decoder block to regularize the cost volume, which makes the whole system complicated and yields sub-optimal results. Recently, the Recurrent All-pairs Field Transforms (RAFT) based approach, which employs a recurrent update in 2D, has efficiently improved image-matching tasks, i.e., optical flow and stereo matching. To bridge the gap between OSM and RAFT, we mainly propose an opposite adaptive weighting scheme to seamlessly transform the outputs of the spherical sweeping of OSM into the required inputs for the recurrent update, thus creating a recurrent omnidirectional stereo matching (RomniStereo) algorithm. Furthermore, we introduce two techniques, i.e., grid embedding and adaptive context feature generation, which also contribute to RomniStereo's performance. Our best model improves the average MAE metric by 40.7% over the previous SOTA baseline across five datasets. When visualizing the results, our models demonstrate clear advantages on both synthetic and realistic examples. The code is available at https://github.com/HalleyJiang/RomniStereo. (Accepted by IEEE RA-L.)
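    The recurrent-update idea can be conveyed with a deliberately simplified sketch: starting from a coarse disparity estimate, repeatedly look up the matching cost around the current estimate and nudge it downhill. In RAFT-style methods, including RomniStereo, the update operator is a learned ConvGRU fed by cost-volume lookups; below, a finite-difference descent over a plain (h, w, d) cost volume stands in for both the learned operator and the spherical-sweep outputs, so everything here is an assumption for illustration only.

```python
import numpy as np

def lookup_cost(cost_volume, disp):
    """Sample the cost volume at the current (integer-rounded) disparity."""
    h, w, d = cost_volume.shape
    idx = np.clip(np.round(disp).astype(int), 0, d - 1)
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return cost_volume[rows, cols, idx]

def recurrent_refine(cost_volume, iters=8, step=0.5):
    """Iteratively nudge the disparity map toward lower matching cost."""
    h, w, d = cost_volume.shape
    disp = np.full((h, w), d / 2.0)          # coarse initialization
    for _ in range(iters):
        right = lookup_cost(cost_volume, disp + 1)
        left = lookup_cost(cost_volume, disp - 1)
        grad = (right - left) / 2.0          # finite-difference cost slope
        disp = np.clip(disp - step * np.sign(grad), 0, d - 1)
    return disp
```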

    Omnidirectional Stereo Vision Study from Vertical and Horizontal Stereo Configuration

    In stereo vision, an omnidirectional camera exhibits high distortion compared to a standard camera, so camera calibration is decisive for stereo matching. In this study, we perform stereo matching for omnidirectional cameras in vertical and horizontal configurations so that the resulting depth image has a 360-degree field of view, transforming the images using a calibration-based method. The result is that with the vertical camera configuration the images can be stereo matched directly, whereas the horizontal configuration requires a separate stereo-matching process in each direction. Stereo matching with the semi-global matching method produces better depth images than block matching, with more image objects detectable by the semi-global block matching method at a maximum disparity of 32 pixels and a window size of 21 pixels.
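    The cited settings map directly onto OpenCV's block matching and semi-global block matching implementations, which plausibly correspond to the methods compared in the study. A minimal sketch (file names hypothetical):

```python
import cv2

# Hypothetical rectified stereo pair from the omnidirectional rig.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Block matching with the abstract's parameters: 32 px disparity range,
# 21 px window. Both matchers return fixed-point disparities (scaled by 16).
bm = cv2.StereoBM_create(numDisparities=32, blockSize=21)
disp_bm = bm.compute(left, right).astype("float32") / 16.0

# Semi-global block matching with the same range and window size.
sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=32, blockSize=21)
disp_sgbm = sgbm.compute(left, right).astype("float32") / 16.0
```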

    D²SLAM: Decentralized and Distributed Collaborative Visual-inertial SLAM System for Aerial Swarm

    In recent years, aerial swarm technology has developed rapidly. A key technology for a fully autonomous aerial swarm is decentralized and distributed collaborative SLAM (CSLAM), which estimates the relative poses and consistent global trajectories of the swarm. In this paper, we propose D²SLAM: a decentralized and distributed (D²) collaborative SLAM algorithm. This algorithm has high local accuracy and global consistency, and the distributed architecture allows it to scale up. D²SLAM covers swarm state estimation in two scenarios: near-field state estimation for high real-time accuracy at close range, and far-field state estimation for globally consistent trajectory estimation at long range between UAVs. Distributed optimization algorithms are adopted as the back end to achieve the D² goal. D²SLAM is robust to transient loss of communication, network delays, and other factors. Thanks to its flexible architecture, D²SLAM has the potential to be applied in various scenarios.
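    The decentralized flavor of such a back end can be conveyed with a toy consensus iteration: each UAV repeatedly averages its estimate with those of the neighbors it can communicate with, and all agents agree on a common value without any central server. This is a generic illustration of distributed estimation, not D²SLAM's actual optimizer.

```python
import numpy as np

def consensus_step(states, neighbors):
    """One synchronous averaging round over the communication graph."""
    return {i: np.mean([states[i]] + [states[j] for j in neighbors[i]], axis=0)
            for i in states}

# Three UAVs in a line topology, each holding a 1D state estimate.
states = {0: np.array([0.0]), 1: np.array([4.0]), 2: np.array([8.0])}
neighbors = {0: [1], 1: [0, 2], 2: [1]}
for _ in range(20):
    states = consensus_step(states, neighbors)
# All agents converge to the common value 4.0 despite only local exchanges.
```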

    Visual Odometry and Sparse Scene Reconstruction for UAVs with a Multi-Fisheye Camera System

    Autonomously operating UAVs demand fast localization for navigation, for actively exploring unknown areas, and for creating maps. For pose estimation, many UAV systems use a combination of GPS receivers and inertial measurement units (IMUs). However, GPS signal coverage may drop out occasionally, especially in the close vicinity of objects, and precise IMUs are too heavy to be carried by lightweight UAVs. This, together with the high cost of high-quality IMUs, motivates the use of inexpensive vision-based sensors for localization using visual odometry or visual SLAM (simultaneous localization and mapping) techniques.

    The first contribution of this thesis is a more general approach to bundle adjustment with an extended version of the projective coplanarity equation, which enables us to use omnidirectional multi-camera systems that may consist of fisheye cameras capturing a large field of view in one shot. We use ray directions as observations instead of image points, so our approach does not rely on a specific projection model, assuming only a central projection. In addition, our approach allows the integration and estimation of points at infinity, which classical bundle adjustments are not capable of. We show that the integration of far or infinitely far points stabilizes the estimation of the rotation angles of the camera poses.

    In its second contribution, the thesis employs this approach to bundle adjustment in a highly integrated system for incremental pose estimation and mapping on lightweight UAVs. Based on the image sequences of a multi-camera system, our system uses tracked feature points to incrementally build a sparse map and incrementally refines this map using the iSAM2 algorithm. Our system can optionally integrate GPS information at the level of carrier-phase observations, even in underconstrained situations, e.g. if only two satellites are visible, for georeferenced pose estimation. This way, we are able to use all available information in underconstrained GPS situations to keep the mapped 3D model accurate and georeferenced.

    In its third contribution, the thesis presents an approach for reusing existing methods for dense stereo matching with fisheye cameras, which has the advantage that highly optimized existing methods can be applied as a black box, without modifications, even with cameras that have a field of view of more than 180 degrees. We provide a detailed accuracy analysis of the obtained dense stereo results. The analysis shows the growing uncertainty of observed image points of fisheye cameras due to increasing blur towards the image border. The core of the contribution is a rigorous variance component estimation, which allows the variance of the observed disparities at an image point to be estimated as a function of the distance of that point to the principal point. We show that this improved stochastic model provides a more realistic prediction of the uncertainty of the triangulated 3D points.
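    A minimal sketch of the first contribution's observation model, as described above: residuals compare an observed unit ray with the predicted direction to a homogeneous scene point, so any central camera fits and points at infinity (w = 0) remain representable. The function and its parametrization are my own illustration, not the thesis code.

```python
import numpy as np

def ray_residual(ray_obs, R, C, X_h):
    """Residual between an observed ray and a homogeneous scene point.

    ray_obs : observed unit direction in the camera frame
    R, C    : camera-to-world rotation and projection center
    X_h     : homogeneous point (x, y, z, w); w == 0 encodes a point at infinity
    """
    # Direction from camera to point; for w == 0 the camera position drops
    # out entirely, which is why far points stabilize the rotation estimates.
    d_world = X_h[:3] - X_h[3] * C
    d_cam = R.T @ d_world                  # rotate into the camera frame
    return ray_obs - d_cam / np.linalg.norm(d_cam)
```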
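    The third contribution's stochastic model can be paraphrased with standard stereo error propagation: with depth Z = f*b/d, a disparity noise sigma_d inflates depth uncertainty as sigma_Z = (Z^2/(f*b)) * sigma_d, and the thesis estimates sigma_d as a function of the distance r to the principal point. The quadratic sigma_d(r) model and all numbers below are illustrative assumptions, not the thesis' estimated values.

```python
def sigma_disparity(r, s0=0.2, k=1e-6):
    """Illustrative model: disparity noise (px) grows with radial distance r."""
    return s0 + k * r**2

def sigma_depth(d, r, f=600.0, b=0.2):
    """First-order propagation of disparity noise into depth uncertainty (m)."""
    Z = f * b / d                        # depth from disparity (f in px, b in m)
    return (Z**2 / (f * b)) * sigma_disparity(r)

print(sigma_depth(d=10.0, r=0.0))    # image center: ~0.24 m
print(sigma_depth(d=10.0, r=800.0))  # near the border: ~1.01 m
```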