127 research outputs found
Panoramic Annular Localizer: Tackling the Variation Challenges of Outdoor Localization Using Panoramic Annular Images and Active Deep Descriptors
Visual localization is an attractive problem: estimating the camera
location from database images given a query image. It is a crucial
task for various applications, such as autonomous vehicles, assistive
navigation and augmented reality. The challenging issues of the task lie in
various appearance variations between query and database images, including
illumination variations, dynamic object variations and viewpoint variations. In
order to tackle those challenges, this paper proposes the Panoramic Annular
Localizer, which incorporates a panoramic annular lens and robust deep image
descriptors. The panoramic annular images captured by the single
camera are processed and fed into the NetVLAD network to form the active deep
descriptor, and sequential matching is utilized to generate the localization
result. Experiments carried out on public datasets and in the field
validate the proposed system.
Comment: Accepted by ITSC 201
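The sequential matching step described above can be illustrated with a minimal sketch. It assumes that each image has already been reduced to a global descriptor (for example by a NetVLAD-style network, which is not reproduced here) and that the query and database sequences advance at the same rate. The function names `cosine` and `sequence_match` are hypothetical and are not taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two descriptor vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def sequence_match(query_seq, database):
    """Slide the query sequence over the database and return
    (best_start_index, mean_similarity) for the best-aligned window.

    Assumption: one database frame is traversed per query frame.
    """
    n, m = len(query_seq), len(database)
    if n == 0 or n > m:
        raise ValueError("query sequence must be non-empty and no longer than the database")
    best_start, best_score = -1, float("-inf")
    for start in range(m - n + 1):
        # Average similarity along the aligned window.
        score = sum(cosine(q, database[start + t]) for t, q in enumerate(query_seq)) / n
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score


# Toy descriptors: the query sequence resembles database frames 1 and 2.
database = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]
query_seq = [[0, 0.9, 0.1], [0.1, 0, 1.0]]
start, score = sequence_match(query_seq, database)
```

Under these assumptions, the location of the most recent query frame would be read off as database index `start + len(query_seq) - 1`. Scoring a whole window rather than a single frame is what makes the match less sensitive to one frame being corrupted by illumination or dynamic objects.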
FlowLens: Seeing Beyond the FoV via Flow-guided Clip-Recurrent Transformer
Limited by hardware cost and system size, a camera's Field-of-View (FoV) is not
always satisfactory. However, from a spatio-temporal perspective, information
beyond the camera's physical FoV is off-the-shelf and can actually be obtained
"for free" from the past. In this paper, we propose a novel task termed
Beyond-FoV Estimation, aiming to exploit past visual cues and bidirectionally
break through the physical FoV of a camera. We put forward a FlowLens
architecture to expand the FoV by achieving feature propagation explicitly by
optical flow and implicitly by a novel clip-recurrent transformer, which has
two appealing features: 1) FlowLens comprises a newly proposed Clip-Recurrent
Hub with 3D-Decoupled Cross Attention (DDCA) to progressively process global
information accumulated in the temporal dimension. 2) A multi-branch Mix Fusion
Feed Forward Network (MixF3N) is integrated to enhance the spatially-precise
flow of local features. To foster training and evaluation, we establish
KITTI360-EX, a dataset for outer- and inner-FoV expansion. Extensive
experiments on both video inpainting and beyond-FoV estimation tasks show that
FlowLens achieves state-of-the-art performance. Code will be made publicly
available at https://github.com/MasterHow/FlowLens.
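The explicit, flow-based half of the feature propagation mentioned above can be sketched as a backward warp of a past feature map into the current frame. This is only an illustration under stated assumptions: it uses nearest-neighbour sampling on a 2D grid of scalars, the flow is assumed to point from each current pixel to its source position in the past frame, and the function name `warp_with_flow` is hypothetical. It does not reproduce the FlowLens architecture or its clip-recurrent transformer.

```python
def warp_with_flow(past, flow, fill=None):
    """Backward-warp a past 2D feature map into the current frame.

    past: H x W grid (list of rows) of feature values.
    flow: H x W grid of (dx, dy) offsets; the current pixel (x, y) is
          assumed to originate from (x + dx, y + dy) in the past frame.
    fill: value used where the source position lies outside the past frame.
    """
    height, width = len(past), len(past[0])
    warped = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            src_x = int(round(x + dx))
            src_y = int(round(y + dy))
            # Copy only when the source position is inside the past frame.
            if 0 <= src_x < width and 0 <= src_y < height:
                warped[y][x] = past[src_y][src_x]
    return warped


# Toy example: every current pixel comes from one column to the right.
past = [[1, 2, 3],
        [4, 5, 6]]
flow = [[(1, 0)] * 3 for _ in range(2)]
warped = warp_with_flow(past, flow)
```

Positions left at `fill` are those for which the past frame holds no information; in this simplified picture, those are the regions a learned model would still have to complete.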
FocusFlow: Boosting Key-Points Optical Flow Estimation for Autonomous Driving
Key-point-based scene understanding is fundamental for autonomous driving
applications. At the same time, optical flow plays an important role in many
vision tasks. However, due to the implicit bias of equal attention on all
points, classic data-driven optical flow estimation methods yield less
satisfactory performance on key points, limiting their use in
key-point-critical safety-relevant scenarios. To address these issues, we
introduce a points-based modeling method that requires the model to learn
key-point-related priors explicitly. Based on the modeling method, we present
FocusFlow, a framework consisting of 1) a mixed loss function that combines a
classic photometric loss function with our proposed Conditional Point Control
Loss (CPCL) function for diverse point-wise supervision; 2) a conditioned
controlling model which replaces the conventional feature encoder with our
proposed Condition Control Encoder (CCE). CCE incorporates a Frame Feature
Encoder (FFE) that extracts features from frames, a Condition Feature Encoder
(CFE) that learns to control the feature extraction behavior of FFE from input
masks containing information of key points, and fusion modules that transfer
the controlling information between FFE and CFE. Our FocusFlow framework shows
outstanding performance with up to +44.5% precision improvement on various key
points such as ORB, SIFT, and even learning-based SiLK, along with exceptional
scalability for most existing data-driven optical flow methods like PWC-Net,
RAFT, and FlowFormer. Notably, FocusFlow yields performance on the whole frame
that is competitive with or superior to that of the original models. The source code
will be available at https://github.com/ZhonghuaYi/FocusFlow_official.
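The idea of supervising key points more strongly than the rest of the frame can be sketched as a mixed loss with a mask-weighted term. The abstract does not give the formula of the Conditional Point Control Loss, so the code below substitutes a plain end-point error averaged over masked pixels; it is a stand-in, not CPCL. It also replaces the photometric term with a whole-frame end-point error for simplicity. The names `end_point_error`, `mixed_loss`, and `keypoint_weight` are hypothetical.

```python
import math


def end_point_error(pred, target):
    """Euclidean distance between a predicted and a reference flow vector."""
    return math.hypot(pred[0] - target[0], pred[1] - target[1])


def mixed_loss(pred_flow, gt_flow, keypoint_mask, keypoint_weight=1.0):
    """Whole-frame error plus a weighted error restricted to key points.

    pred_flow, gt_flow: flat lists of (u, v) flow vectors, one per pixel.
    keypoint_mask: flat list of 0/1 flags marking key-point pixels.
    Assumption: the key-point term is a simple masked mean, standing in
    for the paper's CPCL, whose definition is not given in the abstract.
    """
    if not pred_flow:
        raise ValueError("flow fields must be non-empty")
    errors = [end_point_error(p, g) for p, g in zip(pred_flow, gt_flow)]
    whole_frame = sum(errors) / len(errors)
    keypoint_errors = [e for e, m in zip(errors, keypoint_mask) if m]
    # With no key points in the mask, only the whole-frame term remains.
    keypoint_term = sum(keypoint_errors) / len(keypoint_errors) if keypoint_errors else 0.0
    return whole_frame + keypoint_weight * keypoint_term


# Toy example: the second pixel is a key point and is mispredicted by 5 px.
pred = [(0.0, 0.0), (3.0, 4.0)]
gt = [(0.0, 0.0), (0.0, 0.0)]
loss_on_keypoint = mixed_loss(pred, gt, keypoint_mask=[0, 1])
loss_off_keypoint = mixed_loss(pred, gt, keypoint_mask=[1, 0])
```

In this sketch the same prediction error costs more when it falls on a key point, which is the kind of bias toward key points that the abstract describes the model as learning explicitly.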