SuperYOLO: Super Resolution Assisted Object Detection in Multimodal Remote Sensing Imagery
Accurately and promptly detecting multiscale small objects, which may occupy
only tens of pixels in remote sensing images (RSI), remains challenging. Most of the
existing solutions primarily design complex deep neural networks to learn
strong feature representations for objects separated from the background, which
often results in a heavy computation burden. In this article, we propose an
accurate yet fast object detection method for RSI, named SuperYOLO, which fuses
multimodal data and performs high-resolution (HR) object detection on
multiscale objects by utilizing the assisted super resolution (SR) learning and
considering both the detection accuracy and computation cost. First, we utilize
a symmetric compact multimodal fusion (MF) to extract supplementary information
from various data for improving small object detection in RSI. Furthermore, we
design a simple and flexible SR branch to learn HR feature representations that
can discriminate small objects from vast backgrounds with low-resolution (LR)
input, thus further improving the detection accuracy. Moreover, to avoid
introducing additional computation, the SR branch is discarded in the inference
stage, and the computation of the network model is reduced due to the LR input.
Experimental results show that, on the widely used VEDAI RS dataset, SuperYOLO
achieves an accuracy of 75.09% (in terms of mAP50), which is more than 10%
higher than that of SOTA large models such as YOLOv5l, YOLOv5x, and the
RS-designed YOLOrs. Meanwhile, the parameter size and GFLOPs of SuperYOLO are
about 18 times and 3.8 times smaller than those of YOLOv5x, respectively. Our
proposed model shows a favorable accuracy-speed tradeoff compared to
state-of-the-art models. The code
will be open-sourced at https://github.com/icey-zhang/SuperYOLO.
Comment: The article is accepted by IEEE Transactions on Geoscience and Remote Sensing.
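The train-time-only SR branch described above can be sketched with a toy example: a shared backbone feeds both a detection head and an auxiliary super-resolution head during training, and the SR head is simply never called at inference. All shapes, weights, and the loss weighting below are illustrative assumptions, not the actual SuperYOLO architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the shared backbone and the two heads (shapes are
# arbitrary placeholders, not SuperYOLO's real layer sizes).
W_backbone = rng.normal(size=(16, 8))   # shared feature extractor
W_detect = rng.normal(size=(8, 4))      # detection head (kept at inference)
W_sr = rng.normal(size=(8, 32))         # SR branch (training only)

def backbone(x_lr):
    # Shared features computed from the low-resolution (LR) input.
    return np.tanh(x_lr @ W_backbone)

def training_loss(x_lr, y_det, y_hr):
    """Training uses both heads: detection loss plus an auxiliary SR loss
    that pushes the shared features toward HR-discriminative content."""
    feat = backbone(x_lr)
    det_loss = np.mean((feat @ W_detect - y_det) ** 2)
    sr_loss = np.mean((feat @ W_sr - y_hr) ** 2)
    return det_loss + 0.5 * sr_loss  # 0.5 is an assumed loss weight

def inference(x_lr):
    """Inference discards the SR branch entirely, so its compute cost
    vanishes and only the LR-input detection path remains."""
    return backbone(x_lr) @ W_detect

x = rng.normal(size=(2, 16))
loss = training_loss(x, rng.normal(size=(2, 4)), rng.normal(size=(2, 32)))
preds = inference(x)  # W_sr is never touched here
```

The design choice this illustrates is that the SR head only shapes the shared features via the training gradient; dropping it at test time changes cost, not the detection path.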
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing DL models.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing.
Human-Machine CRFs for Identifying Bottlenecks in Holistic Scene Understanding
Recent trends in image understanding have pushed for holistic scene
understanding models that jointly reason about various tasks such as object
detection, scene recognition, shape analysis, contextual reasoning, and local
appearance based classifiers. In this work, we are interested in understanding
the roles of these different tasks in improved scene understanding, in
particular semantic segmentation, object detection and scene recognition.
Towards this goal, we "plug-in" human subjects for each of the various
components in a state-of-the-art conditional random field model. Comparisons
among various hybrid human-machine CRFs give us indications of how much "head
room" there is to improve scene understanding by focusing research efforts on
various individual tasks.
Investigation of a new method for improving image resolution for camera tracking applications
Camera-based systems have been a preferred choice in many motion-tracking applications due to their ease of installation and their ability to work in unprepared environments. These systems work by extracting image information (colour and shape properties) to detect the object's location. However, the resolution of the image and the camera field-of-view (FOV) are two main factors that can restrict the tracking applications for which these systems can be used. Resolution can be addressed partially by using higher-resolution cameras, but this may not always be possible or cost-effective.
This research paper investigates a new method that averages offset images to improve the effective resolution obtainable with a standard camera. Initial results show that the minimum detectable position change of a tracked object can be improved by up to 4 times.
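The intuition behind averaging offset images can be shown with a 1-D toy: a single capture quantises the target position to the nearest pixel, but averaging detections taken at known sub-pixel offsets (after subtracting each offset) cancels much of the quantisation error. The target position, offsets, and rounding model below are assumptions for illustration, not the paper's actual experimental setup.

```python
import numpy as np

true_pos = 10.3                        # target position in pixels (ground truth)
offsets = np.linspace(0.0, 0.75, 4)    # assumed known sub-pixel camera offsets

# One plain capture: position is quantised to the pixel grid (error up to 0.5 px).
single_est = float(np.round(true_pos))

# Offset captures: each detection is quantised, then corrected by its offset.
detections = np.round(true_pos + offsets)
avg_est = float(np.mean(detections - offsets))

single_err = abs(single_est - true_pos)  # error of a single capture
avg_err = abs(avg_est - true_pos)        # smaller error after averaging
```

With these numbers the offset-corrected average lands within 0.1 px of the true position, versus 0.3 px for the single capture, which is the kind of sub-pixel gain the averaging approach targets.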