Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and the recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publication
LIGHT: Joint Individual Building Extraction and Height Estimation from Satellite Images through a Unified Multitask Learning Network
Building extraction and height estimation are two important basic tasks in
remote sensing image interpretation, which are widely used in urban planning,
real-world 3D construction, and other fields. Most of the existing research
regards the two tasks as independent studies. Therefore, the height information
cannot be fully used to improve the accuracy of building extraction and vice
versa. In this work, we combine the individuaL buIlding extraction and heiGHt
estimation through a unified multiTask learning network (LIGHT) for the first
time, which simultaneously outputs a height map, bounding boxes, and a
segmentation mask map of buildings. Specifically, LIGHT consists of an instance
segmentation branch and a height estimation branch. In particular, so as to
effectively unify multi-scale feature branches and alleviate feature spans
between branches, we propose a Gated Cross Task Interaction (GCTI) module that
can efficiently perform feature interaction between branches. Experiments on
the DFC2023 dataset show that our LIGHT can achieve superior performance, and
our GCTI module with ResNet101 as the backbone can significantly improve the
performance of multitask learning by 2.8% AP50 and 6.5% delta1, respectively
- …
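The abstract above reports its gains in AP50 and delta1 without defining either metric. As a rough illustration only, the sketch below assumes the conventional definitions: delta1 as the fraction of pixels whose ratio max(pred/gt, gt/pred) falls below 1.25, and AP50 as average precision where a predicted box counts as a match when its IoU with a ground-truth box is at least 0.5. The function names `delta1` and `box_iou` are hypothetical, and skipping non-positive heights is an assumption made here, not something the abstract states.

```python
def delta1(pred, gt, threshold=1.25):
    """Fraction of valid pixels with max(pred/gt, gt/pred) below threshold.

    pred and gt are flat sequences of height values. Pixels where either
    value is not positive are skipped (an assumption of this sketch), since
    the ratio is undefined there.
    """
    hits = 0
    valid = 0
    for p, g in zip(pred, gt):
        if p <= 0 or g <= 0:
            continue
        valid += 1
        if max(p / g, g / p) < threshold:
            hits += 1
    return hits / valid if valid else 0.0


def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Three of the four height predictions are within a factor of 1.25.
score = delta1([10, 20, 30, 40], [10, 18, 50, 41])

# Under the AP50 convention, this pair would not count as a match.
is_match = box_iou((0, 0, 2, 2), (1, 1, 3, 3)) >= 0.5
```

Under these assumed definitions, a higher delta1 means more pixels have a predicted height close to the reference, and AP50 then averages precision over recall levels using the 0.5 IoU matching rule.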