144 research outputs found
Aggregated Deep Local Features for Remote Sensing Image Retrieval
Remote Sensing Image Retrieval remains a challenging topic due to the special
nature of Remote Sensing Imagery. Such images contain various
semantic objects, which clearly complicates the retrieval task. In this paper,
we present an image retrieval pipeline that uses attentive, local convolutional
features and aggregates them using the Vector of Locally Aggregated Descriptors
(VLAD) to produce a global descriptor. We study various system parameters such
as the multiplicative and additive attention mechanisms and descriptor
dimensionality. We propose a query expansion method that requires no external
inputs. Experiments demonstrate that even without training, the local
convolutional features and global representation outperform other systems.
After system tuning, we can achieve state-of-the-art or competitive results.
Furthermore, we observe that our query expansion method increases overall
system performance by about 3%, using only the top-three retrieved images.
Finally, we show how dimensionality reduction produces compact descriptors with
increased retrieval performance and fast retrieval computation times, e.g. 50%
faster than the current systems. Comment: Published in Remote Sensing. The first two authors have equal
contribution.
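To make the aggregation step in the abstract above concrete, here is a minimal sketch of plain VLAD encoding in pure Python. It is an illustration under stated assumptions, not the authors' pipeline: the codebook `centroids` is assumed to have been learned beforehand (for example by k-means), the attention weighting and dimensionality reduction mentioned in the abstract are left out, and the function name `vlad` is hypothetical.

```python
import math


def vlad(descriptors, centroids):
    """Aggregate local descriptors into a single VLAD vector.

    descriptors: list of D-dimensional local feature vectors (lists of floats).
    centroids:   list of K D-dimensional codebook centres (assumed precomputed).
    Returns a flat, L2-normalised list of length K * D.
    """
    k, d = len(centroids), len(centroids[0])
    residual_sums = [[0.0] * d for _ in range(k)]

    for x in descriptors:
        # Hard-assign the descriptor to its nearest centroid (squared Euclidean distance).
        nearest = min(
            range(k),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centroids[i])),
        )
        # Accumulate the residual between the descriptor and that centroid.
        for j in range(d):
            residual_sums[nearest][j] += x[j] - centroids[nearest][j]

    # Concatenate the per-centroid residual sums and L2-normalise the result.
    flat = [v for row in residual_sums for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Toy usage: three 2-D descriptors, two codebook centres.
toy_descriptor = vlad([[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]],
                      [[0.0, 0.0], [10.0, 10.0]])
```

The resulting global descriptor has one D-dimensional slot per centroid, so two images can be compared with a single dot product or Euclidean distance.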
Exploiting ConvNet Diversity for Flooding Identification
Flooding is the world's most costly type of natural disaster in terms of both economic losses and human casualties. A first and essential step toward flood monitoring is identifying the areas most vulnerable to flooding, which gives authorities relevant regions to focus on. In this letter, we propose several methods to perform flooding identification in high-resolution remote sensing images using deep learning. Specifically, some of the proposed techniques are based on individual networks, such as dilated and deconvolutional ones, whereas others were conceived to exploit the diversity of distinct networks in order to extract the maximum performance of each classifier. The proposed methods were evaluated on a high-resolution remote sensing data set. Results show that the proposed algorithms outperformed the state-of-the-art baselines, providing improvements ranging from 1% to 4% in terms of the Jaccard Index.
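The Jaccard Index used as the evaluation measure above is the intersection over union of the predicted and reference regions. A minimal sketch for binary masks follows. The nested-list mask format, the function name `jaccard_index`, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details taken from the letter.

```python
def jaccard_index(pred, truth):
    """Intersection over union of two binary masks given as nested lists of 0/1."""
    intersection = 0
    union = 0
    for row_pred, row_truth in zip(pred, truth):
        for p, t in zip(row_pred, row_truth):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy usage: 2 pixels agree on "flood", 4 pixels are marked by at least one mask.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
score = jaccard_index(predicted, reference)
```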
Project RISE: Recognizing Industrial Smoke Emissions
Industrial smoke emissions pose a significant concern to human health. Prior
works have shown that using Computer Vision (CV) techniques to identify smoke
as visual evidence can influence the attitude of regulators and empower
citizens to pursue environmental justice. However, existing datasets are of
neither sufficient quality nor sufficient quantity to train the robust CV models needed to support
air quality advocacy. We introduce RISE, the first large-scale video dataset
for Recognizing Industrial Smoke Emissions. We adopted a citizen science
approach to collaborate with local community members to annotate whether a
video clip has smoke emissions. Our dataset contains 12,567 clips from 19
distinct views from cameras that monitored three industrial facilities. These
daytime clips span 30 days over two years, including all four seasons. We ran
experiments using deep neural networks to establish a strong performance
baseline and reveal smoke recognition challenges. Our survey study discussed
community feedback, and our data analysis displayed opportunities for
integrating citizen scientists and crowd workers into the application of
Artificial Intelligence for social good. Comment: Technical report
Facing Erosion Identification in Railway Lines Using Pixel-wise Deep-based Approaches
Soil erosion is considered one of the most expensive natural hazards, with a high impact on several infrastructure assets. Among them, railway lines are among the structures most prone to erosion and, consequently, among the most troublesome due to maintenance costs, risks of derailment, and so on. Therefore, it is fundamental to identify and monitor erosion in railway lines to prevent major consequences. Currently, erosion identification is performed manually by humans using huge image sets, a time-consuming task. Hence, automatic machine learning methods appear as an appealing alternative. A crucial step for automatic erosion identification is to create a good feature representation. Toward this objective, deep learning can learn data-driven features and classifiers. In this paper, we propose a novel deep learning-based framework capable of performing erosion identification in railway lines. Six techniques were evaluated and the best one, Dynamic Dilated ConvNet, was integrated into this framework, which was then encapsulated into a new ArcGIS plugin to facilitate its use by non-programmer users. To analyze these techniques, we also propose a new dataset composed of almost 2,000 high-resolution images.
An Approach Of Features Extraction And Heatmaps Generation Based Upon Cnns And 3D Object Models
The rapid advancements in artificial intelligence have enabled recent progress in self-driving vehicles. However, the dependence on 3D object models and their annotations, collected and owned by individual companies, has become a major problem for the development of new algorithms. This thesis proposes an approach that directly uses graphics models created from open-source datasets as the virtual representation of real-world objects. This approach uses Machine Learning techniques to extract 3D feature points and to create annotations from graphics models for the recognition of dynamic objects, such as cars, and for the verification of stationary and variable objects, such as buildings and trees. Moreover, it generates heat maps for the elimination of stationary/variable objects in real-time images before working on the recognition of dynamic objects. The proposed approach helps to bridge the gap between the virtual and physical worlds and to facilitate the development of new algorithms for self-driving vehicles.
Deep learning for small object detection
Small object detection has become increasingly relevant because
the performance of common object detectors falls significantly as objects become smaller. Many computer vision
applications require the analysis of the entire set of objects in the image, including extremely small objects.
Moreover, detecting small objects makes it possible to perceive objects at a greater distance, thus giving more time to
adapt to any situation or unforeseen event.
Bayesian Multi Scale Neural Network for Crowd Counting
Crowd Counting is a difficult but important problem in computer vision.
Convolutional Neural Networks that estimate a density map over the
image have been highly successful in this domain. However, dense crowd counting
remains an open problem because of severe occlusion and perspective view in
which people can be present at various sizes. In this work, we propose a new
network which uses a ResNet-based feature extractor, a downsampling block that
uses dilated convolutions, and an upsampling block that uses transposed convolutions.
We present a novel aggregation module which makes our network robust to the
perspective view problem. We present the optimization details, loss functions
and the algorithm used in our work. Evaluated on the ShanghaiTech, UCF-CC-50,
and UCF-QNRF datasets using MSE and MAE as evaluation metrics, our network
outperforms previous state-of-the-art approaches while giving uncertainty
estimates in a principled Bayesian manner. Comment: 10 pages
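As a rough illustration of how density-map counting is scored with the MAE and MSE metrics named above, the sketch below sums a density map to get a count and then compares predicted and true counts per image. The helper names are hypothetical and the abstract does not give the authors' evaluation code. One assumption to flag: `mse` here is the plain mean of squared count errors, whereas some crowd counting papers report its square root under the same name.

```python
def count_from_density(density_map):
    """Estimated crowd count: the sum of all values in a 2-D density map."""
    return sum(sum(row) for row in density_map)


def mae(predicted_counts, true_counts):
    """Mean absolute error between per-image predicted and true counts."""
    errors = [abs(p - t) for p, t in zip(predicted_counts, true_counts)]
    return sum(errors) / len(errors)


def mse(predicted_counts, true_counts):
    """Mean squared error between per-image counts (no square root taken)."""
    errors = [(p - t) ** 2 for p, t in zip(predicted_counts, true_counts)]
    return sum(errors) / len(errors)


# Toy usage: one small density map, then two images with predicted vs. true counts.
toy_count = count_from_density([[0.25, 0.25],
                                [0.5, 1.0]])
toy_mae = mae([10, 20], [12, 17])
toy_mse = mse([10, 20], [12, 17])
```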