18,755 research outputs found
Object Detection in 20 Years: A Survey
Object detection, as of one the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed up techniques, and the recent state of the art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc, and makes an in-deep
analysis of their challenges as well as technical improvements in recent years.Comment: This work has been submitted to the IEEE TPAMI for possible
publicatio
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, of advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as it relates to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensin
Deep learning in remote sensing: a review
Standing at the paradigm shift towards data-intensive science, machine
learning techniques are becoming increasingly important. In particular, as a
major breakthrough in the field, deep learning has proven as an extremely
powerful tool in many fields. Shall we embrace deep learning as the key to all?
Or, should we resist a 'black-box' solution? There are controversial opinions
in the remote sensing community. In this article, we analyze the challenges of
using deep learning for remote sensing data analysis, review the recent
advances, and provide resources to make deep learning in remote sensing
ridiculously simple to start with. More importantly, we advocate remote sensing
scientists to bring their expertise into deep learning, and use it as an
implicit general model to tackle unprecedented large-scale influential
challenges, such as climate change and urbanization.Comment: Accepted for publication IEEE Geoscience and Remote Sensing Magazin
SpaceNet MVOI: a Multi-View Overhead Imagery Dataset
Detection and segmentation of objects in overheard imagery is a challenging
task. The variable density, random orientation, small size, and
instance-to-instance heterogeneity of objects in overhead imagery calls for
approaches distinct from existing models designed for natural scene datasets.
Though new overhead imagery datasets are being developed, they almost
universally comprise a single view taken from directly overhead ("at nadir"),
failing to address a critical variable: look angle. By contrast, views vary in
real-world overhead imagery, particularly in dynamic scenarios such as natural
disasters where first looks are often over 40 degrees off-nadir. This
represents an important challenge to computer vision methods, as changing view
angle adds distortions, alters resolution, and changes lighting. At present,
the impact of these perturbations for algorithmic detection and segmentation of
objects is untested. To address this problem, we present an open source
Multi-View Overhead Imagery dataset, termed SpaceNet MVOI, with 27 unique looks
from a broad range of viewing angles (-32.5 degrees to 54.0 degrees). Each of
these images cover the same 665 square km geographic extent and are annotated
with 126,747 building footprint labels, enabling direct assessment of the
impact of viewpoint perturbation on model performance. We benchmark multiple
leading segmentation and object detection models on: (1) building detection,
(2) generalization to unseen viewing angles and resolutions, and (3)
sensitivity of building footprint extraction to changes in resolution. We find
that state of the art segmentation and object detection models struggle to
identify buildings in off-nadir imagery and generalize poorly to unseen views,
presenting an important benchmark to explore the broadly relevant challenge of
detecting small, heterogeneous target objects in visually dynamic contexts.Comment: Accepted into IEEE International Conference on Computer Vision (ICCV)
201
Small-Object Detection in Remote Sensing Images with End-to-End Edge-Enhanced GAN and Object Detector Network
The detection performance of small objects in remote sensing images is not
satisfactory compared to large objects, especially in low-resolution and noisy
images. A generative adversarial network (GAN)-based model called enhanced
super-resolution GAN (ESRGAN) shows remarkable image enhancement performance,
but reconstructed images miss high-frequency edge information. Therefore,
object detection performance degrades for small objects on recovered noisy and
low-resolution remote sensing images. Inspired by the success of edge enhanced
GAN (EEGAN) and ESRGAN, we apply a new edge-enhanced super-resolution GAN
(EESRGAN) to improve the image quality of remote sensing images and use
different detector networks in an end-to-end manner where detector loss is
backpropagated into the EESRGAN to improve the detection performance. We
propose an architecture with three components: ESRGAN, Edge Enhancement Network
(EEN), and Detection network. We use residual-in-residual dense blocks (RRDB)
for both the ESRGAN and EEN, and for the detector network, we use the faster
region-based convolutional network (FRCNN) (two-stage detector) and single-shot
multi-box detector (SSD) (one stage detector). Extensive experiments on a
public (car overhead with context) and a self-assembled (oil and gas storage
tank) satellite dataset show superior performance of our method compared to the
standalone state-of-the-art object detectors.Comment: This paper contains 27 pages and accepted for publication in MDPI
remote sensing journal. GitHub Repository:
https://github.com/Jakaria08/EESRGAN (Implementation
DOTA: A Large-scale Dataset for Object Detection in Aerial Images
Object detection is an important and challenging problem in computer vision.
Although the past decade has witnessed major advances in object detection in
natural scenes, such successes have been slow to aerial imagery, not only
because of the huge variation in the scale, orientation and shape of the object
instances on the earth's surface, but also due to the scarcity of
well-annotated datasets of objects in aerial scenes. To advance object
detection research in Earth Vision, also known as Earth Observation and Remote
Sensing, we introduce a large-scale Dataset for Object deTection in Aerial
images (DOTA). To this end, we collect aerial images from different
sensors and platforms. Each image is of the size about 4000-by-4000 pixels and
contains objects exhibiting a wide variety of scales, orientations, and shapes.
These DOTA images are then annotated by experts in aerial image interpretation
using common object categories. The fully annotated DOTA images contains
instances, each of which is labeled by an arbitrary (8 d.o.f.)
quadrilateral To build a baseline for object detection in Earth Vision, we
evaluate state-of-the-art object detection algorithms on DOTA. Experiments
demonstrate that DOTA well represents real Earth Vision applications and are
quite challenging.Comment: Accepted to CVPR 201
Smart environment monitoring through micro unmanned aerial vehicles
In recent years, the improvements of small-scale Unmanned Aerial Vehicles (UAVs) in terms of flight time, automatic control, and remote transmission are promoting the development of a wide range of practical applications. In aerial video surveillance, the monitoring of broad areas still has many challenges due to the achievement of different tasks in real-time, including mosaicking, change detection, and object detection. In this thesis work, a small-scale UAV based vision system to maintain regular surveillance over target areas is proposed. The system works in two modes. The first mode allows to monitor an area of interest by performing several flights. During the first flight, it creates an incremental geo-referenced mosaic of an area of interest and classifies all the known elements (e.g., persons) found on the ground by an improved Faster R-CNN architecture previously trained. In subsequent reconnaissance flights, the system searches for any changes (e.g., disappearance of persons) that may occur in the mosaic by a histogram equalization and RGB-Local Binary Pattern (RGB-LBP) based algorithm. If present, the mosaic is updated. The second mode, allows to perform a real-time classification by using, again, our improved Faster R-CNN model, useful for time-critical operations. Thanks to different design features, the system works in real-time and performs mosaicking and change detection tasks at low-altitude, thus allowing the classification even of small objects. The proposed system was tested by using the whole set of challenging video sequences contained in the UAV Mosaicking and Change Detection (UMCD) dataset and other public datasets. The evaluation of the system by well-known performance metrics has shown remarkable results in terms of mosaic creation and updating, as well as in terms of change detection and object detection
- …