1,860 research outputs found
MS-DETR: Multispectral Pedestrian Detection Transformer with Loosely Coupled Fusion and Modality-Balanced Optimization
Multispectral pedestrian detection is an important task for many
around-the-clock applications, since the visible and thermal modalities can
provide complementary information especially under low light conditions. Most
of the available multispectral pedestrian detectors are based on non-end-to-end
detectors, while in this paper, we propose MultiSpectral pedestrian DEtection
TRansformer (MS-DETR), an end-to-end multispectral pedestrian detector, which
extends DETR into the field of multi-modal detection. MS-DETR consists of two
modality-specific backbones and Transformer encoders, followed by a multi-modal
Transformer decoder, and the visible and thermal features are fused in the
multi-modal Transformer decoder. To well resist the misalignment between
multi-modal images, we design a loosely coupled fusion strategy by sparsely
sampling some keypoints from multi-modal features independently and fusing them
with adaptively learned attention weights. Moreover, based on the insight that
not only different modalities, but also different pedestrian instances tend to
have different confidence scores to final detection, we further propose an
instance-aware modality-balanced optimization strategy, which preserves visible
and thermal decoder branches and aligns their predicted slots through an
instance-wise dynamic loss. Our end-to-end MS-DETR shows superior performance
on the challenging KAIST, CVC-14 and LLVIP benchmark datasets. The source code
is available at https://github.com/YinghuiXing/MS-DETR
Object Detection in 20 Years: A Survey
Object detection, as of one the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed up techniques, and the recent state of the art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc, and makes an in-deep
analysis of their challenges as well as technical improvements in recent years.Comment: This work has been submitted to the IEEE TPAMI for possible
publicatio
- …