4 research outputs found
Paying Attention to Multiscale Feature Maps in Multimodal Image Matching
We propose an attention-based approach for multimodal image patch matching
using a Transformer encoder attending to the feature maps of a multiscale
Siamese CNN. Our encoder is shown to efficiently aggregate multiscale image
embeddings while emphasizing task-specific appearance-invariant image cues. We
also introduce an attention-residual architecture, using a residual connection
bypassing the encoder. This additional learning signal facilitates end-to-end
training from scratch. Our approach is experimentally shown to achieve new
state-of-the-art accuracy on both multimodal and single modality benchmarks,
illustrating its general applicability. To the best of our knowledge, this is
the first successful application of the Transformer encoder architecture to
the multimodal image patch matching task.
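
The attention-residual idea in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single-head self-attention layer in place of a full Transformer encoder, mean pooling, and random stand-in feature maps. The names `self_attention` and `attention_residual_embedding` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over (n_tokens, dim)."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores, axis=-1) @ v

def attention_residual_embedding(feature_maps, w_q, w_k, w_v):
    """feature_maps: list of (H_i, W_i, dim) arrays, one per scale (assumed layout)."""
    # Flatten each scale's spatial grid into tokens and concatenate across scales.
    tokens = np.concatenate(
        [f.reshape(-1, f.shape[-1]) for f in feature_maps], axis=0
    )
    attended = self_attention(tokens, w_q, w_k, w_v)
    # Residual branch bypasses the attention layer: pooled CNN features
    # are added to the pooled attention output.
    embedding = attended.mean(axis=0) + tokens.mean(axis=0)
    return embedding / np.linalg.norm(embedding)

# Random stand-ins for the feature maps of one branch of a Siamese CNN.
rng = np.random.default_rng(0)
dim = 8
maps = [rng.normal(size=(4, 4, dim)), rng.normal(size=(2, 2, dim))]
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) * 0.1 for _ in range(3))
emb = attention_residual_embedding(maps, w_q, w_k, w_v)
```

In this sketch, two patches would be compared by the distance between their embeddings. Because of the bypass, the embedding still carries the pooled CNN features even when the attention weights contribute nothing.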
Improved symmetric-SIFT for Multi-modal image registration
Multi-modal image registration has received significant research attention over the past decade. Symmetric-SIFT is a recently proposed local description technique that can be used for registering multi-modal images. It is based on a well-known general image registration technique named Scale Invariant Feature Transform (SIFT). Symmetric-SIFT, however, achieves its invariance to multi-modality at the cost of losing important information. In this paper, we show how this loss may adversely affect the accuracy of registration results. We then propose an improvement to Symmetric-SIFT to overcome the problem. Our experimental results show that the proposed technique can increase the number of true matches by up to 10 times and improve overall matching accuracy by up to 30%.
Achieving high multi-modal registration performance using simplified Hough-transform with improved symmetric-SIFT
Hough Transform is traditionally used with SIFT for reliable object recognition. However, it cannot be applied effectively to image registration in the same way, because the recall rate can be significantly lower. In this paper, we propose an alternative implementation of Hough Transform that can be used with Improved Symmetric-SIFT for multi-modal image registration. Our experimental results show that the proposed technique of applying Hough Transform can significantly improve key-point matching as well as registration accuracy by utilizing aggregated information from key-points throughout the input images.
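
As a rough illustration of Hough-style voting over key-point matches, the sketch below lets each match vote for a coarse (rotation difference, scale ratio) bin and keeps the matches in the most popular bin. It is a generic simplification, not the variant proposed in the abstract above. The function name `hough_filter`, the key-point tuple layout, and the bin sizes are assumptions.

```python
import math
from collections import defaultdict

def hough_filter(matches, angle_bin_deg=30.0, scale_bin_log2=1.0):
    """Keep the matches that agree on a coarse rotation and scale change.

    matches: list of (kp_a, kp_b) pairs, where each key-point is a tuple
    (x, y, scale, angle_deg). This layout is an assumption of the sketch.
    """
    if not matches:
        return []
    bins = defaultdict(list)
    for idx, (kp_a, kp_b) in enumerate(matches):
        # Each match votes for the transform parameters it implies.
        d_angle = (kp_b[3] - kp_a[3]) % 360.0
        d_scale = math.log2(kp_b[2] / kp_a[2])
        key = (int(d_angle // angle_bin_deg),
               math.floor(d_scale / scale_bin_log2))
        bins[key].append(idx)
    # The bin with the most votes aggregates evidence from the whole image.
    best = max(bins.values(), key=len)
    return [matches[i] for i in best]

# Three matches agree on roughly 10-15 degrees of rotation; one does not.
matches = [
    ((10.0, 10.0, 2.0, 0.0), (12.0, 11.0, 2.0, 10.0)),
    ((40.0, 25.0, 3.0, 90.0), (41.0, 27.0, 3.3, 102.0)),
    ((70.0, 60.0, 1.5, 45.0), (72.0, 61.0, 1.6, 60.0)),
    ((15.0, 80.0, 2.0, 0.0), (90.0, 5.0, 2.0, 200.0)),
]
consistent = hough_filter(matches)
```

Hard bin edges can split otherwise consistent matches into neighbouring bins, so a practical version would likely vote into adjacent bins as well.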