6,362 research outputs found
Towards Multi-class Object Detection in Unconstrained Remote Sensing Imagery
Automatic multi-class object detection in remote sensing images in
unconstrained scenarios is of high interest for several applications including
traffic monitoring and disaster management. The huge variation in object scale,
orientation, category, and complex backgrounds, as well as the different camera
sensors pose great challenges for current algorithms. In this work, we propose
a new method consisting of a novel joint image cascade and feature pyramid
network with multi-size convolution kernels to extract multi-scale strong and
weak semantic features. These features are fed into rotation-based region
proposal and region of interest networks to produce object detections. Finally,
rotational non-maximum suppression is applied to remove redundant detections.
During training, we minimize joint horizontal and oriented bounding box loss
functions, as well as a novel loss that enforces oriented boxes to be
rectangular. Our method achieves 68.16% mAP on horizontal and 72.45% mAP on
oriented bounding box detection tasks on the challenging DOTA dataset,
outperforming all published methods by a large margin (+6% and +12% absolute
improvement, respectively). Furthermore, it generalizes to two other datasets,
NWPU VHR-10 and UCAS-AOD, and achieves competitive results with the baselines
even when trained on DOTA. Our method can be deployed in multi-class object
detection applications, regardless of the image and object scales and
orientations, making it a great choice for unconstrained aerial and satellite
imagery.Comment: ACCV 201
EAST: An Efficient and Accurate Scene Text Detector
Previous approaches for scene text detection have already achieved promising
performances across various benchmarks. However, they usually fall short when
dealing with challenging scenarios, even when equipped with deep neural network
models, because the overall performance is determined by the interplay of
multiple stages and components in the pipelines. In this work, we propose a
simple yet powerful pipeline that yields fast and accurate text detection in
natural scenes. The pipeline directly predicts words or text lines of arbitrary
orientations and quadrilateral shapes in full images, eliminating unnecessary
intermediate steps (e.g., candidate aggregation and word partitioning), with a
single neural network. The simplicity of our pipeline allows concentrating
efforts on designing loss functions and neural network architecture.
Experiments on standard datasets including ICDAR 2015, COCO-Text and MSRA-TD500
demonstrate that the proposed algorithm significantly outperforms
state-of-the-art methods in terms of both accuracy and efficiency. On the ICDAR
2015 dataset, the proposed algorithm achieves an F-score of 0.7820 at 13.2fps
at 720p resolution.Comment: Accepted to CVPR 2017, fix equation (3
Object Detection in 20 Years: A Survey
Object detection, as of one the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed up techniques, and the recent state of the art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc, and makes an in-deep
analysis of their challenges as well as technical improvements in recent years.Comment: This work has been submitted to the IEEE TPAMI for possible
publicatio
- …