A Novel GAN-Based Anomaly Detection and Localization Method for Aerial Video Surveillance at Low Altitude
The last two decades have seen incessant growth in the use of Unmanned Aerial Vehicles (UAVs) equipped with HD cameras for developing aerial vision-based systems to support civilian and military tasks, including land monitoring, change detection, and object classification. To perform most of these tasks, artificial intelligence algorithms usually need to know, a priori, what to look for, identify, or recognize. However, in most operational scenarios, such as war zones or post-disaster situations, areas and objects of interest cannot be defined a priori, since their shape and visual features may have been altered by events or even intentionally disguised (e.g., improvised explosive devices (IEDs)). For these reasons, in recent years a growing number of research groups have investigated the design of original anomaly detection methods, which, in short, focus on detecting samples that differ from the others in terms of visual appearance and occurrence with respect to a given environment. In this paper, we present a novel two-branch Generative Adversarial Network (GAN)-based method for low-altitude RGB aerial video surveillance to detect and localize anomalies. We chose to focus on low-altitude sequences because we are interested in complex operational scenarios where even a small object or device can represent a reason for danger or attention. The proposed model was tested on the UAV Mosaicking and Change Detection (UMCD) dataset, a one-of-a-kind collection of challenging videos whose sequences were acquired between 6 and 15 m above sea level on three types of ground (i.e., urban, dirt, and countryside). Results demonstrate the effectiveness of the model in terms of Area Under the Receiver Operating Characteristic curve (AUROC) and Structural Similarity Index (SSIM), achieving averages of 97.2% and 95.7%, respectively, thus suggesting that the system can be deployed in real-world applications.
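The frame-level AUROC evaluation used above can be reproduced from anomaly scores and ground-truth labels without any library. The sketch below is illustrative only, not the authors' code: it computes AUROC via the rank-sum (Mann–Whitney U) identity, assuming higher scores mean "more anomalous" and that both classes are present.

```python
def auroc(scores, labels):
    """Area under the ROC curve for binary labels (1 = anomalous).

    Uses the Mann-Whitney U identity: AUROC equals the probability that
    a random positive outscores a random negative. Requires at least one
    sample of each class.
    """
    # Rank all scores ascending, assigning average ranks to tie groups.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    pos_ranks = [r for r, y in zip(ranks, labels) if y == 1]
    n_pos = len(pos_ranks)
    n_neg = len(labels) - n_pos
    # U statistic of the positive class, normalized to [0, 1].
    u = sum(pos_ranks) - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)
```

A score list that separates the classes perfectly yields 1.0; random scoring hovers around 0.5, which is why AUROC is a natural threshold-free metric for anomaly detectors like the one described.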
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as it relates to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing.
A Comprehensive Review on Computer Vision Analysis of Aerial Data
With the emergence of new technologies in the field of airborne platforms and
imaging sensors, aerial data analysis is becoming very popular, capitalizing on
its advantages over land data. This paper presents a comprehensive review of
the computer vision tasks within the domain of aerial data analysis. While
addressing fundamental aspects such as object detection and tracking, the
primary focus is on pivotal tasks like change detection, object segmentation,
and scene-level analysis. The paper provides a comparison of various
hyperparameters employed across diverse architectures and tasks. A substantial
section is dedicated to an in-depth discussion on libraries, their
categorization, and their relevance to different domain expertise. The paper
encompasses aerial datasets, the architectural nuances adopted, and the
evaluation metrics associated with all the tasks in aerial data analysis.
Applications of computer vision tasks in aerial data across different domains
are explored, with case studies providing further insights. The paper
thoroughly examines the challenges inherent in aerial data analysis, offering
practical solutions. Additionally, unresolved issues of significance are
identified, paving the way for future research directions in the field of
aerial data analysis.
Comment: 112 pages.
InsPLAD: A Dataset and Benchmark for Power Line Asset Inspection in UAV Images
Power line maintenance and inspection are essential to avoid power supply
interruptions, reducing its high social and financial impacts yearly.
Automating power line visual inspections remains a relevant open problem for
the industry due to the lack of public real-world datasets of power line
components and their various defects to foster new research. This paper
introduces InsPLAD, a Power Line Asset Inspection Dataset and Benchmark
containing 10,607 high-resolution Unmanned Aerial Vehicles colour images. The
dataset contains seventeen unique power line assets captured from real-world
operating power lines. Additionally, five of those assets exhibit six defect
types: four involve corrosion, one a broken component, and one the presence of
a bird's nest. Every asset was labelled at the image level according to its
condition: either normal or the name of the defect found. We thoroughly evaluate
state-of-the-art and popular methods for three image-level computer vision
tasks covered by InsPLAD: object detection, through the AP metric; defect
classification, through Balanced Accuracy; and anomaly detection, through the
AUROC metric. InsPLAD offers various vision challenges from uncontrolled
environments, such as multi-scale objects, multi-size class instances, multiple
objects per image, intra-class variation, cluttered background, distinct
point-of-views, perspective distortion, occlusion, and varied lighting
conditions. To the best of our knowledge, InsPLAD is the first large real-world
dataset and benchmark for power line asset inspection with multiple components
and defects for various computer vision tasks, with a potential impact to
improve state-of-the-art methods in the field. It will be publicly available in
its entirety, with a thorough description, at
https://github.com/andreluizbvs/InsPLAD.
Comment: This is an original manuscript of an article published by Taylor &
Francis in the International Journal of Remote Sensing on 29 Nov 2023,
available online: https://doi.org/10.1080/01431161.2023.228390
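Of the three metrics the InsPLAD benchmark reports, Balanced Accuracy for defect classification is the simplest to state precisely: it is the macro-average of per-class recall, which prevents the dominant "normal" class from masking poor recall on rare defects. The sketch below is a minimal illustration with hypothetical label names, not the benchmark's code.

```python
def balanced_accuracy(y_true, y_pred):
    """Macro-average of per-class recall over the classes present in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        # Recall for class c: correctly predicted c among all true c.
        true_c = [p for t, p in zip(y_true, y_pred) if t == c]
        recalls.append(sum(1 for p in true_c if p == c) / len(true_c))
    return sum(recalls) / len(recalls)
```

For example, on a split with three "normal" images and one "corrosion" image, predicting one normal image as corroded gives per-class recalls of 2/3 and 1, i.e. a Balanced Accuracy of 5/6, whereas plain accuracy would report 3/4 and reward always predicting the majority class.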