Assessing thermal imagery integration into object detection methods on ground-based and air-based collection platforms
Object detection models commonly deployed on uncrewed aerial systems (UAS)
focus on identifying objects in the visible spectrum using Red-Green-Blue (RGB)
imagery. However, there is growing interest in fusing RGB with thermal long
wave infrared (LWIR) images to increase the performance of object detection
machine learning (ML) models. Currently, LWIR ML models have received less
research attention, especially for both ground- and air-based platforms,
leading to a lack of baseline performance metrics evaluating LWIR, RGB and
LWIR-RGB fused object detection models. Therefore, this research contributes
such quantitative metrics to the literature. The results show that the
ground-based blended RGB-LWIR model exhibited superior performance compared to
the RGB or LWIR approaches, achieving a mAP of 98.4%. The blended
RGB-LWIR model was also the only object detection model to work in both day and
night conditions, providing superior operational capabilities. This research
additionally contributes a novel labelled training dataset of 12,600 images for
RGB, LWIR, and RGB-LWIR fused imagery, collected from ground-based and
air-based platforms, enabling further multispectral machine-driven object
detection research.
Comment: 18 pages, 12 figures, 2 tables
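To make the fusion idea above concrete, here is a minimal sketch of per-pixel RGB-LWIR blending. The abstract does not specify the blending method used; the weighted-average approach, the `alpha` parameter, and the function name below are illustrative assumptions, and the inputs are assumed to be co-registered.

```python
import numpy as np

def blend_rgb_lwir(rgb, lwir, alpha=0.6):
    """Blend an RGB frame with a co-registered single-channel LWIR frame.

    rgb:  uint8 array of shape (H, W, 3)
    lwir: uint8 array of shape (H, W)
    alpha weights the visible channel; (1 - alpha) weights the thermal one.
    """
    # Replicate the grey thermal frame across three channels to match RGB.
    lwir3 = np.repeat(lwir[..., None], 3, axis=2)
    # Per-pixel weighted average in float, then clamp back to uint8.
    fused = alpha * rgb.astype(np.float32) + (1 - alpha) * lwir3.astype(np.float32)
    return fused.clip(0, 255).astype(np.uint8)
```

A fused frame produced this way can be fed to a standard RGB object detector unchanged, which is one reason early-fusion blending is a common baseline.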
Advances in Object and Activity Detection in Remote Sensing Imagery
The recent revolution in deep learning has enabled considerable development in the fields of object and activity detection. Visual object detection aims to find objects of target classes with precise localisation in an image and to assign each object instance a corresponding class label. Activity recognition, in turn, aims to determine the actions or activities of an agent or group of agents from sensor or video observation data. Detecting, identifying, tracking, and understanding the behaviour of objects through images and videos taken by various cameras is an important and challenging problem. Together, object and activity recognition in imaging data captured by remote sensing platforms is a highly dynamic and challenging research topic. During the last decade, there has been significant growth in the number of publications in the field of object and activity recognition. In particular, many researchers have proposed methods, across various application domains, to identify objects and their specific behaviours from air- and spaceborne imagery. This Special Issue includes papers that explore novel and challenging topics for object and activity detection in remote sensing images and videos acquired by diverse platforms.
A Review on Deep Learning in UAV Remote Sensing
Deep Neural Networks (DNNs) learn representations from data with impressive
capability and have brought important breakthroughs in processing images,
time-series, natural language, audio, video, and many others. In the remote
sensing field, surveys and literature reviews specifically covering
applications of DNN algorithms have been conducted in an attempt to summarize
the amount of information produced in its subfields. Recently, Unmanned Aerial
Vehicles (UAV) based applications have dominated aerial sensing research.
However, a literature review that combines both the "deep learning" and "UAV
remote sensing" themes has not yet been conducted. The motivation for our
work was to present a comprehensive review of the fundamentals of Deep Learning
(DL) applied in UAV-based imagery. We focused mainly on describing
classification and regression techniques used in recent applications with
UAV-acquired data. For that, a total of 232 papers published in international
scientific journal databases were examined. We gathered the published material
and evaluated their characteristics regarding application, sensor, and
technique used. We relate how DL presents promising results and has the
potential for processing tasks associated with UAV-based image data. Lastly, we
project future perspectives, commenting on prominent DL paths to be explored
in the UAV remote sensing field. Our review offers a friendly approach to
introducing, commenting on, and summarizing the state of the art in UAV-based
image applications with DNN algorithms across diverse subfields of remote
sensing, grouped into environmental, urban, and agricultural contexts.
Comment: 38 pages, 10 figures
What a MESS: Multi-Domain Evaluation of Zero-Shot Semantic Segmentation
While semantic segmentation has seen tremendous improvements in the past,
significant labeling effort is still necessary, and generalization to classes
that were not present during training remains limited.
To address this problem, zero-shot semantic segmentation makes use of large
self-supervised vision-language models, allowing zero-shot transfer to unseen
classes. In this work, we build a benchmark for Multi-domain Evaluation of
Semantic Segmentation (MESS), which allows a holistic analysis of performance
across a wide range of domain-specific datasets such as medicine, engineering,
earth monitoring, biology, and agriculture. To do this, we reviewed 120
datasets, developed a taxonomy, and classified the datasets according to the
developed taxonomy. We select a representative subset consisting of 22 datasets
and propose it as the MESS benchmark. We evaluate eight recently published
models on the proposed MESS benchmark and analyze the characteristics that
influence the performance of zero-shot transfer models. The toolkit is available at
https://github.com/blumenstiel/MESS
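For readers unfamiliar with the metric underlying such benchmarks, here is a minimal sketch of mean intersection-over-union (mIoU) and a simple cross-domain aggregation. The function names and the plain-mean aggregation are illustrative assumptions, not the benchmark's actual protocol, which is defined in the MESS toolkit.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for integer label maps of the same shape.

    Classes absent from both prediction and ground truth are skipped
    rather than counted as perfect matches.
    """
    ious = []
    for c in range(num_classes):
        p, t = (pred == c), (target == c)
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both masks: skip it
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

def multi_domain_score(per_dataset_miou):
    """Aggregate per-dataset mIoU values into one multi-domain score
    (a plain mean over datasets; an assumption for illustration only)."""
    return float(np.mean(list(per_dataset_miou.values())))
```

Reporting a per-domain breakdown alongside the aggregate is what makes a multi-domain benchmark informative: a model can score well on average while failing badly in a single domain such as medicine.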
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as it relates to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing DL models.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing
Spatial Pyramid Context-Aware Moving Object Detection and Tracking for Full Motion Video and Wide Aerial Motion Imagery
A robust and fast automatic moving object detection and tracking system is
essential to characterize target objects and extract spatial and temporal
information for different applications, including video surveillance, urban
traffic monitoring and navigation, and robotics. In this dissertation, I
present a collaborative Spatial Pyramid Context-aware moving object detection
and Tracking (SPCT) system. The proposed visual tracker is composed of one
master tracker that usually relies on visual object features and two auxiliary
trackers, based on object temporal motion information, that are called
dynamically to assist the master tracker. SPCT utilizes image spatial context at
different levels to make the video tracking system resistant to occlusion and
background noise, and to improve target localization accuracy and robustness. We
chose a pre-selected set of seven complementary feature channels, including RGB
color, intensity, and a spatial pyramid of HoG, to encode object color, shape,
and spatial layout information. We exploit the integral histogram as a building
block to meet real-time performance demands. A novel fast algorithm is presented to
accurately evaluate spatially weighted local histograms in constant time
complexity using an extension of the integral histogram method. Different
techniques are explored to efficiently compute integral histogram on GPU
architecture and applied for fast spatio-temporal median computations and 3D
face reconstruction texturing. We propose a multi-component framework based on
semantic fusion of motion information with projected building footprint map to
significantly reduce the false alarm rate in urban scenes with many tall
structures. Experiments on the extensive VOTC2016 benchmark dataset and aerial
video confirm that combining complementary tracking cues in an intelligent
fusion framework enables persistent tracking for Full Motion Video and Wide
Aerial Motion Imagery.
Comment: PhD Dissertation (162 pages)
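The integral histogram idea referenced in the abstract above can be sketched compactly: precompute per-pixel cumulative histograms once, after which the histogram of any axis-aligned rectangle is recovered in O(bins) time by inclusion-exclusion on its four corners. This is a generic illustration of the classic technique, not the dissertation's spatially weighted extension; the function names and the uint8/uniform-binning choice are assumptions.

```python
import numpy as np

def integral_histogram(img, bins=16):
    """Per-pixel cumulative histograms for a uint8 grayscale image.

    ih[y, x, b] counts pixels falling in bin b within the rectangle
    spanning (0, 0) to (y, x) inclusive.
    """
    # Uniform binning of intensities 0..255 into `bins` buckets.
    idx = np.minimum((img.astype(np.int64) * bins) // 256, bins - 1)
    onehot = np.eye(bins, dtype=np.int64)[idx]   # shape (H, W, bins)
    # Cumulative sums down rows then across columns give the integral image
    # independently for every histogram bin.
    return onehot.cumsum(axis=0).cumsum(axis=1)

def region_histogram(ih, top, left, bottom, right):
    """Histogram of img[top:bottom+1, left:right+1] in O(bins) time
    via inclusion-exclusion on the four corner entries of ih."""
    h = ih[bottom, right].copy()
    if top > 0:
        h -= ih[top - 1, right]
    if left > 0:
        h -= ih[bottom, left - 1]
    if top > 0 and left > 0:
        h += ih[top - 1, left - 1]
    return h
```

Because every candidate window costs the same constant number of lookups regardless of its size, this structure is what makes dense histogram-based matching feasible at video rates, which is the real-time property the dissertation builds on.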