7 research outputs found
Depth Estimation Analysis of Orthogonally Divergent Fisheye Cameras with Distortion Removal
Stereo vision systems have become popular in computer vision applications,
such as 3D reconstruction, object tracking, and autonomous navigation. However,
traditional stereo vision systems that use rectilinear lenses may not be
suitable for certain scenarios due to their limited field of view. This has led
to the popularity of vision systems based on one or multiple fisheye cameras in
different orientations, which can provide a field of view of 180x180 degrees or
more. However, fisheye cameras introduce significant distortion at the edges
that affects the accuracy of stereo matching and depth estimation. To overcome
these limitations, this paper proposes a method for distortion removal and
depth estimation analysis for a stereo vision system using orthogonally divergent
fisheye cameras (ODFC). The proposed method uses two virtual pinhole cameras
(VPCs), each of which captures a small portion of the original view and presents it
without any lens distortions, emulating the behavior of a pinhole camera. By
carefully selecting the captured regions, it is possible to create a stereo
pair using two VPCs. The performance of the proposed method is evaluated both
in simulation, using a virtual environment, and in experiments with real
cameras, and the results are compared to those of stereo cameras with parallel
optical axes. The results demonstrate the effectiveness of the proposed method
in terms of distortion removal and depth estimation accuracy.
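The virtual-pinhole rectification described above can be sketched as a remapping: every pixel of a distortion-free virtual view is traced back to a position in the fisheye image. The equidistant model r = f*theta and all parameter names below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def vpc_remap(out_size, f_pin, R, f_fish, fish_center):
    """Map each pixel of a virtual pinhole camera (VPC) to fisheye image
    coordinates, assuming an equidistant fisheye model r = f * theta.
    R rotates VPC rays into the fisheye camera frame, so two VPCs with
    different R can be aimed at the overlap of two fisheye views."""
    w, h = out_size
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Rays through the virtual pinhole, z pointing forward
    rays = np.stack([u - w / 2, v - h / 2,
                     np.full_like(u, f_pin, dtype=float)], axis=-1)
    rays = rays / np.linalg.norm(rays, axis=-1, keepdims=True)
    rays = rays @ R.T                                   # into fisheye frame
    theta = np.arccos(np.clip(rays[..., 2], -1.0, 1.0))  # angle off the axis
    phi = np.arctan2(rays[..., 1], rays[..., 0])
    r = f_fish * theta                                  # equidistant projection
    map_x = fish_center[0] + r * np.cos(phi)
    map_y = fish_center[1] + r * np.sin(phi)
    return map_x, map_y
```

Sampling the fisheye image at `(map_x, map_y)` yields a distortion-free virtual view; two such views with suitably chosen rotations form the stereo pair.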
RomniStereo: Recurrent Omnidirectional Stereo Matching
Omnidirectional stereo matching (OSM) is an essential and reliable means for
depth sensing. However, following earlier works on conventional
stereo matching, prior state-of-the-art (SOTA) methods rely on a 3D
encoder-decoder block to regularize the cost volume, which makes the whole
system complicated and its results sub-optimal. Recently, the Recurrent All-pairs Field
Transforms (RAFT) based approach employs recurrent updates in 2D and has
efficiently improved image-matching tasks, i.e., optical flow and stereo
matching. To bridge the gap between OSM and RAFT, we mainly propose an opposite
adaptive weighting scheme to seamlessly transform the outputs of spherical
sweeping of OSM into the required inputs for the recurrent update, thus
creating a recurrent omnidirectional stereo matching (RomniStereo) algorithm.
Furthermore, we introduce two techniques, i.e., grid embedding and adaptive
context feature generation, which also contribute to RomniStereo's performance.
Our best model improves the average MAE metric by 40.7% over the previous SOTA
baseline across five datasets. When visualizing the results, our models
demonstrate clear advantages on both synthetic and realistic examples. The code
is available at \url{https://github.com/HalleyJiang/RomniStereo}. Comment: accepted by IEEE RA-L.
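One plausible reading of fusing two opposite spherical-sweep outputs with adaptive per-pixel weights is a softmax blend; the paper's actual weighting is learned end-to-end, so the function below is only an illustrative sketch with invented names.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along one axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def opposite_adaptive_blend(feat_a, feat_b, score_a, score_b):
    """Blend feature volumes from two opposite-facing sweeps with
    adaptive per-pixel weights. feat_* are (H, W, C) features,
    score_* are (H, W) confidence maps (learned in the real model)."""
    w = softmax(np.stack([score_a, score_b]), axis=0)  # (2, H, W), sums to 1
    return w[0][..., None] * feat_a + w[1][..., None] * feat_b
```

Equal scores reduce the blend to a plain average; a strongly dominant score lets one hemisphere's features pass through almost unchanged.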
Omnidirectional Stereo Vision Study from Vertical and Horizontal Stereo Configuration
In stereo vision, an omnidirectional camera exhibits high distortion compared to a standard camera, so camera calibration is decisive for its stereo matching. In this study, we perform stereo matching for omnidirectional cameras in vertical and horizontal configurations, so that the resulting depth image has a 360-degree field of view, by transforming the images using a calibration-based method. The result is that with a vertical camera configuration the images can be stereo matched directly, whereas with a horizontal configuration a separate stereo-matching process must be carried out in each direction. Stereo matching with the semi-global matching method yields better image results than block matching, with more image objects detected by the semi-global block matching method at a maximum disparity of 32 pixels and a window size of 21 pixels.
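A minimal sketch of the per-pixel data cost both matchers share, using the abstract's settings (maximum disparity 32 px, window 21 px) as defaults; semi-global matching would add smoothness penalties aggregated along several scanline directions on top of this winner-takes-all step.

```python
import numpy as np

def _box_sum(img, win):
    """Sum over every win x win window (valid positions), via an integral image."""
    ii = np.pad(img, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    return ii[win:, win:] - ii[:-win, win:] - ii[win:, :-win] + ii[:-win, :-win]

def block_match(left, right, max_disp=32, win=21):
    """Plain SAD block matching along horizontal epipolar lines on
    rectified grayscale images; returns a winner-takes-all disparity map."""
    h, w = left.shape
    out_h, out_w = h - win + 1, w - win + 1
    cost = np.full((out_h, out_w, max_disp), np.inf)
    for d in range(max_disp):
        # Compare left column x+d with right column x
        diff = np.abs(left[:, d:] - right[:, :w - d]).astype(float)
        sad = _box_sum(diff, win)
        cost[:, d:, d] = sad          # disparity d only valid from column d on
    return cost.argmin(axis=-1)
```

On the horizontally shifted synthetic pair below, the matcher recovers the known 3-pixel disparity exactly wherever that disparity is in range.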
D2SLAM: Decentralized and Distributed Collaborative Visual-inertial SLAM System for Aerial Swarm
In recent years, aerial swarm technology has developed rapidly. In order to
accomplish a fully autonomous aerial swarm, a key technology is decentralized
and distributed collaborative SLAM (CSLAM) for aerial swarms, which estimates
the relative poses and consistent global trajectories. In this paper, we
propose D2SLAM: a decentralized and distributed (D2) collaborative SLAM
algorithm. This algorithm has high local accuracy and global consistency, and
its distributed architecture allows it to scale up. D2SLAM covers swarm
state estimation in two scenarios: near-field state estimation for high
real-time accuracy at close range and far-field state estimation for globally
consistent trajectory estimation at long range between UAVs. Distributed
optimization algorithms are adopted as the backend to achieve the goal.
D2SLAM is robust to transient loss of communication, network delays, and
other factors. Thanks to its flexible architecture, D2SLAM has the potential
to be applied in various scenarios.
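The decentralized back-end can be illustrated with a toy averaging consensus, in which each UAV repeatedly blends its state estimate with its neighbors' over the communication graph. The update rule and names are illustrative only; the real system runs proper distributed pose-graph optimization, not simple averaging.

```python
import numpy as np

def consensus_step(states, neighbors, alpha=0.5):
    """One synchronous consensus iteration over the swarm's comms graph.
    states: {drone_id: state vector}; neighbors: {drone_id: [ids in range]}.
    Each drone moves a fraction alpha toward its neighbors' mean estimate,
    so connected drones converge toward a common value."""
    new = {}
    for i, x in states.items():
        nbr = [states[j] for j in neighbors[i]]
        avg = np.mean(nbr, axis=0) if nbr else x   # isolated drone keeps its state
        new[i] = (1 - alpha) * x + alpha * avg
    return new
```

Because each drone only reads its neighbors, the scheme degrades gracefully when a link drops: the affected drone simply averages over fewer peers until the link returns.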
Visual Odometry and Sparse Scene Reconstruction for UAVs with a Multi-Fisheye Camera System
Autonomously operating UAVs demand fast localization for navigation, to actively explore unknown areas, and to create maps. For pose estimation, many UAV systems use a combination of GPS receivers and inertial measurement units (IMUs). However, GPS signal coverage may drop out occasionally, especially in the close vicinity of objects, and precise IMUs are too heavy to be carried by lightweight UAVs. This, and the high cost of high-quality IMUs, motivates the use of inexpensive vision-based sensors for localization using visual odometry or visual SLAM (simultaneous localization and mapping) techniques.
The first contribution of this thesis is a more general approach to bundle adjustment, with an extended version of the projective coplanarity equation, which enables us to use omnidirectional multi-camera systems that may consist of fisheye cameras capturing a large field of view in one shot. We use ray directions as observations instead of image points, which is why our approach does not rely on a specific projection model, assuming only a central projection. In addition, our approach allows the integration and estimation of points at infinity, which classical bundle adjustments are not capable of. We show that the integration of far or infinitely far points stabilizes the estimation of the rotation angles of the camera poses.
In its second contribution, we employ this approach to bundle adjustment in a highly integrated system for incremental pose estimation and mapping on lightweight UAVs. Based on the image sequences of a multi-camera system, our system uses tracked feature points to incrementally build a sparse map and incrementally refines this map using the iSAM2 algorithm. Our system is able to optionally integrate GPS information at the level of carrier phase observations, even in underconstrained situations, e.g. if only two satellites are visible, for georeferenced pose estimation.
This way, we are able to use all available information in underconstrained GPS situations to keep the mapped 3D model accurate and georeferenced.
In its third contribution, we present an approach for re-using existing methods for dense stereo matching with fisheye cameras, which has the advantage that highly optimized existing methods can be applied as a black box, without modifications, even with cameras that have a field of view of more than 180 degrees. We provide a detailed accuracy analysis of the obtained dense stereo results. The accuracy analysis shows the growing uncertainty of observed image points of fisheye cameras due to increasing blur towards the image border. The core of the contribution is a rigorous variance component estimation, which allows estimating the variance of the observed disparities at an image point as a function of the distance of that point to the principal point. We show that this improved stochastic model provides a more realistic prediction of the uncertainty of the triangulated 3D points.
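The idea of modeling disparity variance as a function of the distance to the principal point can be mimicked with a simple least-squares fit of a radial model to squared residuals. The quadratic form sigma^2(r) = a + b*r^2 and the function below are illustrative assumptions, not the thesis's rigorous variance component estimator.

```python
import numpy as np

def fit_disparity_variance(radii, residuals):
    """Fit sigma^2(r) = a + b * r^2 to squared disparity residuals,
    where r is each observation's distance to the principal point.
    Uses that E[residual^2] equals the local variance, so ordinary
    least squares on squared residuals recovers (a, b)."""
    A = np.stack([np.ones_like(radii), radii ** 2], axis=1)
    coef, *_ = np.linalg.lstsq(A, residuals ** 2, rcond=None)
    return coef  # (a, b): variance at the center and its radial growth
```

Such a fitted model can then replace a constant disparity variance when propagating uncertainty to triangulated 3D points, giving more realistic error bars near the image border.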