Dive Deeper into Rectifying Homography for Stereo Camera Online Self-Calibration
Accurate estimation of stereo camera extrinsic parameters is key to
guaranteeing the performance of stereo matching algorithms. In prior work, the
online self-calibration of stereo cameras has commonly been formulated as a
specialized visual odometry problem, without taking into account the principles
of stereo rectification. In this paper, we first delve deeply into the concept
of rectifying homography, which serves as the cornerstone for the development
of our novel stereo camera online self-calibration algorithm, for cases where
only a single pair of images is available. Furthermore, we introduce a simple
yet effective solution for globally optimal extrinsic parameter estimation in
the presence of stereo video sequences. Additionally, we emphasize the
impracticality of using the three Euler angles and the three components of the
translation vector for performance quantification. Instead, we introduce four
new evaluation metrics to quantify the robustness and accuracy of extrinsic
parameter estimation, applicable to both single-pair and multi-pair cases.
Extensive experiments conducted across indoor and outdoor environments using
various experimental setups validate the effectiveness of our proposed
algorithm. The comprehensive evaluation results demonstrate its superior
performance in comparison to the baseline algorithm. Our source code, demo
video, and supplement are publicly available at mias.group/StereoCalibrator.
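As background on the rectifying-homography concept this abstract builds on, here is a minimal Fusiello-style sketch of generic textbook stereo-rectification geometry. It is not the paper's self-calibration algorithm, and the intrinsics `K` and extrinsics `R`, `t` used with it are purely illustrative:

```python
import numpy as np

def rectifying_homographies(K, R, t):
    """Fusiello-style rectifying homographies for a calibrated stereo pair.

    Left camera: P_L = K [I | 0]; right camera: P_R = K [R | t].
    Returns (H_L, H_R) mapping original pixels to rectified pixels in
    which corresponding points share the same row (y) coordinate.
    """
    C2 = -R.T @ t                        # right camera centre in world frame
    r1 = C2 / np.linalg.norm(C2)         # new x-axis: along the baseline
    r2 = np.cross([0.0, 0.0, 1.0], r1)   # new y-axis: normal to baseline and old z
    r2 /= np.linalg.norm(r2)
    r3 = np.cross(r1, r2)                # new z-axis completes the frame
    R_rect = np.stack([r1, r2, r3])      # rows of the rectifying rotation
    K_inv = np.linalg.inv(K)
    H_L = K @ R_rect @ K_inv
    H_R = K @ R_rect @ R.T @ K_inv
    return H_L, H_R

def apply_h(H, x, y):
    """Apply a 3x3 homography to a pixel coordinate."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]
```

After applying `H_L` and `H_R`, corresponding pixels share the same row coordinate, which is the rectification property the abstract refers to.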
E3CM: Epipolar-Constrained Cascade Correspondence Matching
Accurate and robust correspondence matching is of utmost importance for
various 3D computer vision tasks. However, traditional explicit
programming-based methods often struggle to handle challenging scenarios, and
deep learning-based methods require large well-labeled datasets for network
training. In this article, we introduce Epipolar-Constrained Cascade
Correspondence (E3CM), a novel approach that addresses these limitations.
Unlike traditional methods, E3CM leverages pre-trained convolutional neural
networks to match correspondences, without requiring annotated data for any
network training or fine-tuning. Our method utilizes epipolar constraints to
guide the matching process and incorporates a cascade structure for progressive
refinement of matches. We extensively evaluate the performance of E3CM through
comprehensive experiments and demonstrate its superiority over existing
methods. To promote further research and facilitate reproducibility, we make
our source code publicly available at https://mias.group/E3CM.
Comment: accepted to Neurocomputing
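The epipolar-constraint guidance at the core of E3CM can be illustrated with a standard match-filtering step. This is a hedged sketch only: it shows the generic geometry (Sampson distance against a fundamental matrix derived from known calibration), not E3CM's cascade of pre-trained CNN features:

```python
import numpy as np

def skew(v):
    """3x3 cross-product matrix [v]_x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def sampson_distance(F, pL, pR):
    """First-order (Sampson) squared distance of a putative match
    (pL, pR) to the epipolar constraint x_R^T F x_L = 0."""
    xL = np.append(pL, 1.0)
    xR = np.append(pR, 1.0)
    num = float(xR @ F @ xL) ** 2
    FxL = F @ xL
    FtxR = F.T @ xR
    return num / (FxL[0]**2 + FxL[1]**2 + FtxR[0]**2 + FtxR[1]**2)

def filter_matches(K, R, t, matches, thresh=1.0):
    """Keep only candidate matches consistent with the epipolar
    geometry (F derived from the calibration K, R, t)."""
    K_inv = np.linalg.inv(K)
    F = K_inv.T @ skew(t) @ R @ K_inv
    return [m for m in matches if sampson_distance(F, m[0], m[1]) < thresh]
```

Candidates far from their epipolar line are rejected; surviving matches can then be refined progressively, as in a cascade.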
A New Single-blade Based Hybrid CFD Method for Hovering and Forward-flight Rotor Computation
A hybrid Euler/full-potential/Lagrangian wake method, based on single-blade simulation, for predicting the unsteady aerodynamic flow around helicopter rotors in hover and forward flight has been developed. In this method, an Euler solver is used to model the near-wake evolution and transonic flow phenomena in the vicinity of the blade, a full potential equation (FPE) is used to model the isentropic potential-flow region far away from the rotor, and the wake effects of the other blades and the far wake are incorporated into the flow solution as an induced inflow distribution using a Lagrangian-based wake analysis. To further reduce execution time, the computational fluid dynamics (CFD) solution and the rotor wake analysis (including the induced velocity update) are conducted in parallel, and a load-balancing strategy is employed to handle the information exchange between the two solvers. Using the developed method, several hover and forward-flight cases on the Caradonna-Tung and Helishape 7A rotors are computed. Good agreement of the blade-surface loadings with available measured data validates the method. The CPU time required for different computation runs is also compared, and the results show that the present hybrid method is superior to a conventional CFD method in time cost and becomes more efficient as the number of blades increases.
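The parallel CFD/wake coupling described above can be caricatured as two concurrently advancing solvers that exchange blade loads and induced inflow once per iteration. This is a purely schematic Python sketch with placeholder physics; the actual Euler/FPE and Lagrangian wake solvers, and the load-balancing strategy, are far more involved:

```python
from concurrent.futures import ThreadPoolExecutor

def cfd_step(inflow):
    # placeholder for advancing the Euler/FPE solution one step,
    # given the induced inflow from the wake analysis
    return {"loads": sum(inflow) / len(inflow)}

def wake_step(loads):
    # placeholder for advancing the Lagrangian wake and returning
    # the updated induced-velocity distribution
    return [loads * 0.1] * 4

def coupled_iteration(n_iters=3):
    inflow, loads = [1.0] * 4, 0.0   # arbitrary starting state
    with ThreadPoolExecutor(max_workers=2) as pool:
        for _ in range(n_iters):
            # both solvers advance concurrently, each consuming the
            # other's result from the previous iteration
            f_cfd = pool.submit(cfd_step, inflow)
            f_wake = pool.submit(wake_step, loads)
            loads = f_cfd.result()["loads"]
            inflow = f_wake.result()
    return loads, inflow
```

The lagged one-iteration exchange is what allows the two solvers to run concurrently instead of serially, at the cost of the data each consumes being one step old.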
LIX: Implicitly Infusing Spatial Geometric Prior Knowledge into Visual Semantic Segmentation for Autonomous Driving
Despite the impressive performance achieved by data-fusion networks with
duplex encoders for visual semantic segmentation, they become ineffective when
spatial geometric data are not available. Implicitly infusing the spatial
geometric prior knowledge acquired by a duplex-encoder teacher model into a
single-encoder student model is a practical, albeit less explored, research
avenue. This paper delves into this topic and resorts to knowledge distillation
approaches to address this problem. We introduce the Learning to Infuse "X"
(LIX) framework, with novel contributions in both logit distillation and
feature distillation aspects. We present a mathematical proof that underscores
the limitation of using a single fixed weight in decoupled knowledge
distillation and introduce a logit-wise dynamic weight controller as a solution
to this issue. Furthermore, we develop an adaptively-recalibrated feature
distillation algorithm, including two technical novelties: feature
recalibration via kernel regression and in-depth feature consistency
quantification via centered kernel alignment. Extensive experiments conducted
with intermediate-fusion and late-fusion networks across various public
datasets provide both quantitative and qualitative evaluations, demonstrating
the superior performance of our LIX framework when compared to other
state-of-the-art approaches.
Comment: 13 pages, 4 figures, 5 tables
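The centered kernel alignment (CKA) used above for feature-consistency quantification has a standard linear form, sketched below. This is the generic similarity measure (in the Kornblith et al. formulation), not LIX's full adaptively-recalibrated feature distillation:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear centered kernel alignment between feature matrices
    X (n x d1) and Y (n x d2). Returns 1.0 for representations that are
    identical up to rotation and isotropic scaling, and values near 0
    for unrelated representations."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(Y.T @ X, 'fro') ** 2
    den = np.linalg.norm(X.T @ X, 'fro') * np.linalg.norm(Y.T @ Y, 'fro')
    return num / den
```

Its invariance to rotation and isotropic scaling is what makes it attractive for comparing teacher and student features whose dimensions and bases differ.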
S3M-Net: Joint Learning of Semantic Segmentation and Stereo Matching for Autonomous Driving
Semantic segmentation and stereo matching are two essential components of 3D
environmental perception systems for autonomous driving. Nevertheless,
conventional approaches often address these two problems independently,
employing separate models for each task. This approach poses practical
limitations in real-world scenarios, particularly when computational resources
are scarce or real-time performance is imperative. Hence, in this article, we
introduce S3M-Net, a novel joint learning framework developed to perform
semantic segmentation and stereo matching simultaneously. Specifically,
S3M-Net shares the features extracted from RGB images between both tasks,
resulting in an improved overall scene understanding capability. This feature
sharing process is realized using a feature fusion adaption (FFA) module, which
effectively transforms the shared features into semantic space and subsequently
fuses them with the encoded disparity features. The entire joint learning
framework is trained by minimizing a novel semantic consistency-guided (SCG)
loss, which places emphasis on the structural consistency in both tasks.
Extensive experimental results conducted on the vKITTI2 and KITTI datasets
demonstrate the effectiveness of our proposed joint learning framework and its
superior performance compared to other state-of-the-art single-task networks.
Our project webpage is accessible at mias.group/S3M-Net.
Comment: accepted to IEEE Trans. on Intelligent Vehicles (T-IV)
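A joint segmentation/stereo loss of the general shape described above can be sketched as a weighted sum of per-task losses plus a structural-consistency term. This is illustrative only: the paper's SCG loss and FFA module are defined differently, and the toy losses and `edges` structure map below are assumptions for the sketch:

```python
import numpy as np

def edges(x):
    """Crude structure map: summed horizontal and vertical gradient magnitudes."""
    gx = np.abs(np.diff(x, axis=1))[:-1, :]
    gy = np.abs(np.diff(x, axis=0))[:, :-1]
    return gx + gy

def joint_loss(sem_prob, sem_gt, disp, disp_gt, lam=0.1):
    """Weighted sum of a toy segmentation loss, a disparity L1 loss, and
    a structural-consistency term encouraging segmentation and disparity
    edges to align."""
    ce = -np.mean(np.log(sem_prob[sem_gt == 1] + 1e-9))  # toy binary cross-entropy
    l1 = np.mean(np.abs(disp - disp_gt))                 # disparity regression loss
    consistency = np.mean(np.abs(edges(sem_prob) - edges(disp)))
    return ce + l1 + lam * consistency
```

The consistency term is the part specific to joint learning: object boundaries in the segmentation map and depth discontinuities in the disparity map tend to coincide, so penalizing their disagreement couples the two tasks.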