Recent trends and long-standing problems in archaeological remote sensing
The variety and sophistication of data sources, sensors, and platforms employed in archaeological remote sensing have increased significantly over the past decade. Projects incorporating data from UAV surveys, regional and research-driven lidar surveys, the uptake of hyperspectral imaging, the launch of high-temporal revisit satellites, the advent of multi-sensor rigs for geophysical survey, and increased use of structure from motion mean that more archaeologists are engaging with remote sensing than ever. These technological advances continue to drive research in the specialist community and provide reasons for optimism about future applications, but many social and technical obstacles to the integration of remote sensing into archaeological research and heritage management remain. This article addresses the challenges of contemporary archaeological remote sensing by briefly reviewing trends and then focusing on providing a critical overview of the main structural problems. The discussion here concentrates on topics that have dominated the discourse in recent archaeological literature and featured prominently in ongoing fieldwork for the past decade across three broad segments of landscape archaeology: data collection in the field, the current state of data access and archives, and processing and interpretation.
Deep Probabilistic Models for Camera Geo-Calibration
The ultimate goal of image understanding is to transfer visual images into numerical or symbolic descriptions of the scene that are helpful for decision making. Geo-calibration, the task of determining when, where, and in which direction a picture was taken, makes it possible to use imagery to understand the world and how it changes in time. Current models for geo-calibration are mostly deterministic and in many cases fail to model the inherent uncertainties when the image content is ambiguous. Furthermore, without a proper modeling of the uncertainty, subsequent processing can yield overly confident predictions. To address these limitations, we propose a probabilistic model for camera geo-calibration using deep neural networks. While our primary contribution is geo-calibration, we also show that learning to geo-calibrate a camera allows us to implicitly learn to understand the content of the scene.
RGB2LIDAR: Towards Solving Large-Scale Cross-Modal Visual Localization
We study an important, yet largely unexplored problem of large-scale
cross-modal visual localization by matching ground RGB images to a
geo-referenced aerial LIDAR 3D point cloud (rendered as depth images). Prior
works were demonstrated on small datasets and did not lend themselves to
scaling up for large-scale applications. To enable large-scale evaluation, we
introduce a new dataset containing over 550K pairs (covering 143 km^2 area) of
RGB and aerial LIDAR depth images. We propose a novel joint embedding based
method that effectively combines the appearance and semantic cues from both
modalities to handle drastic cross-modal variations. Experiments on the
proposed dataset show that our model achieves a strong result of a median rank
of 5 in matching across a large test set of 50K location pairs collected from a
14 km^2 area. This represents a significant advancement over prior works in
performance and scale. We conclude with qualitative results to highlight the
challenging nature of this task and the benefits of the proposed model. Our
work provides a foundation for further research in cross-modal visual
localization. Comment: ACM Multimedia 202
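The median-rank figure quoted in this abstract is a standard retrieval metric. The sketch below shows one common way to compute it and is not the authors' evaluation code. It assumes a square score matrix in which higher means more similar and the true match of query `i` is gallery item `i`; the function name `median_rank` and that layout are illustrative assumptions.

```python
import statistics


def median_rank(similarity):
    """Median rank of the true match over all queries.

    similarity[i][j] is the score between query i and gallery item j.
    Assumption: higher is better and the true match of query i is item i.
    """
    ranks = []
    for i, row in enumerate(similarity):
        true_score = row[i]
        # Rank 1 means no other gallery item scores strictly higher.
        ranks.append(1 + sum(1 for score in row if score > true_score))
    return statistics.median(ranks)


if __name__ == "__main__":
    # Toy scores for three queries against three gallery items.
    scores = [
        [0.9, 0.1, 0.2],
        [0.8, 0.7, 0.1],
        [0.3, 0.2, 0.6],
    ]
    print(median_rank(scores))
```

A median rank of 5 over 50K candidates, as reported above, would mean that for half of the queries the correct depth image appears among the top five retrieved items.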
An Informative Path Planning Framework for Active Learning in UAV-based Semantic Mapping
Unmanned aerial vehicles (UAVs) are frequently used for aerial mapping and
general monitoring tasks. Recent progress in deep learning enabled automated
semantic segmentation of imagery to facilitate the interpretation of
large-scale complex environments. Commonly used supervised deep learning for
segmentation relies on large amounts of pixel-wise labelled data, which is
tedious and costly to annotate. The domain-specific visual appearance of aerial
environments often prevents the usage of models pre-trained on publicly
available datasets. To address this, we propose a novel general planning
framework for UAVs to autonomously acquire informative training images for
model re-training. We leverage multiple acquisition functions and fuse them
into probabilistic terrain maps. Our framework combines the mapped acquisition
function information into the UAV's planning objectives. In this way, the UAV
adaptively acquires informative aerial images to be manually labelled for model
re-training. Experimental results on real-world data and in a photorealistic
simulation show that our framework maximises model performance and drastically
reduces labelling efforts. Our map-based planners outperform state-of-the-art
local planning. Comment: 18 pages, 24 figures
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed up techniques, and the recent state of the art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible publication
Seeing the smart city on Twitter: Colour and the affective territories of becoming smart
This paper pays attention to the immense and febrile field of digital image files which picture the smart city as they circulate on the social media platform Twitter. The paper considers tweeted images as an affective field in which flow and colour are especially generative. This luminescent field is territorialised into different, emergent forms of becoming ‘smart’. The paper identifies these territorialisations in two ways: firstly, by using the data visualisation software ImagePlot to create a visualisation of 9030 tweeted images related to smart cities; and secondly, by responding to the affective pushes of the image files thus visualised. It identifies two colours and three ways of affectively becoming smart: participating in smart, learning about smart, and anticipating smart, which are enacted with different distributions of mostly orange and blue images. The paper thus argues that debates about the power relations embedded in the smart city should consider the particular affective enactment of being smart that happens via social media. More generally, the paper concludes that geographers must pay more attention to the diverse and productive vitalities of social media platforms in urban life and that this will require experiment with methods that are responsive to specific digital qualities.
Automatic Rural Road Centerline Extraction from Aerial Images for a Forest Fire Support System
In the last decades, Portugal has been severely affected by forest fires which have caused
massive damage both environmentally and socially. Having a well-structured and precise
mapping of rural roads is critical to help firefighters mitigate these events. The
traditional process of extracting rural road centerlines from aerial images is extremely
time-consuming and tedious, because the mapping operator has to manually label the road
area and extract the road centerline.
A frequent challenge in extracting rural road centerlines is the high environmental
complexity and the road occlusions caused by vehicles, shadows, wild vegetation, and
trees, which produce heterogeneous segments that can be further improved. This
dissertation proposes an approach to automatically detect rural road segments and to
extract the road centerlines from aerial images.
The proposed method has two main steps. In the first step, an architecture based
on a deep learning model (DeepLabV3+) is used to extract the road feature maps and
detect the rural roads. The second step begins with an optimization that improves
road connections and removes small white objects from the image predicted
by the neural network. Finally, a morphological approach is proposed to extract
the rural road centerlines from the previously detected roads by using thinning algorithms
such as the Zhang-Suen and Guo-Hall methods.
With the automation of these two stages, it is now possible to detect and extract road
centerlines from complex rural environments automatically and faster than with traditional
methods, and possibly to integrate that data into a Geographical Information System (GIS),
allowing the creation of real-time mapping applications.
In recent decades, Portugal has been severely affected by forest fires, which have
caused major environmental and social damage. Having a well-structured and precise
rural road mapping system is essential to help firefighters mitigate
this type of event. The traditional processes for extracting centerlines of rural
roads from aerial images are extremely time-consuming and tedious. A frequent
challenge in extracting rural road centerlines is the high complexity of rural
environments and the obstruction of roads by vehicles, shadows, wild vegetation, and trees,
producing heterogeneous segments that can be improved.
This dissertation proposes an approach to automatically detect rural roads
and to extract the road centerlines from aerial images.
The proposed method focuses on two main steps: in the first step, an architecture
based on deep learning models (DeepLabV3+) is used
to detect the rural roads. In the second step, an optimization of intersections is first
proposed, improving the connections between road centerlines and removing
small artifacts that introduce noise into the images predicted by the
neural network. Finally, a morphological approach is used to extract the centerlines
of the previously detected roads, using skeletonization algorithms such as
the Zhang-Suen and Guo-Hall algorithms.
By automating these steps, it is then possible to extract road centerlines from highly
complex rural environments automatically and faster than with
traditional methods, eventually allowing the data to be integrated into a Geographic
Information System (GIS) and enabling the creation of real-time mapping
applications.
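The thinning step named in this abstract can be illustrated with a minimal pure-Python sketch of Zhang-Suen thinning. It is not the dissertation's implementation: the function name `zhang_suen_thin`, the list-of-lists mask format, and the toy input are assumptions made for illustration, and a production pipeline would more likely apply a library routine to the segmentation output.

```python
def zhang_suen_thin(mask):
    """Thin a binary mask (list of rows of 0/1) with Zhang-Suen thinning."""
    rows, cols = len(mask), len(mask[0])
    # Pad with a zero border so every original pixel has eight neighbours.
    img = ([[0] * (cols + 2)]
           + [[0] + list(row) + [0] for row in mask]
           + [[0] * (cols + 2)])

    def neighbours(r, c):
        # Clockwise from north: P2, P3, P4, P5, P6, P7, P8, P9.
        return [img[r - 1][c], img[r - 1][c + 1], img[r][c + 1], img[r + 1][c + 1],
                img[r + 1][c], img[r + 1][c - 1], img[r][c - 1], img[r - 1][c - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(1, rows + 1):
                for c in range(1, cols + 1):
                    if img[r][c] == 0:
                        continue
                    n = neighbours(r, c)
                    b = sum(n)  # number of foreground neighbours
                    # Number of 0 -> 1 transitions around the pixel.
                    a = sum(1 for i in range(8)
                            if n[i] == 0 and n[(i + 1) % 8] == 1)
                    p2, p4, p6, p8 = n[0], n[2], n[4], n[6]
                    if step == 0:
                        ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and ok:
                        to_clear.append((r, c))
            # Delete simultaneously, after the whole image has been scanned.
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return [row[1:-1] for row in img[1:-1]]


if __name__ == "__main__":
    # Toy "road": a horizontal band three pixels thick.
    band = [[0] * 9] + [[0] + [1] * 7 + [0] for _ in range(3)] + [[0] * 9]
    for row in zhang_suen_thin(band):
        print("".join("#" if v else "." for v in row))
```

Each pass deletes boundary pixels whose removal does not disconnect the shape, so a thick road mask shrinks toward a one-pixel-wide line that can serve as a centerline candidate.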
A Survey on Continual Semantic Segmentation: Theory, Challenge, Method and Application
Continual learning, also known as incremental learning or life-long learning,
stands at the forefront of deep learning and AI systems. It breaks through the
obstacle of one-way training on closed sets and enables continuous adaptive
learning on open-set conditions. In the recent decade, continual learning has
been explored and applied in multiple fields especially in computer vision
covering classification, detection and segmentation tasks. Continual semantic
segmentation (CSS) is a challenging, intricate and burgeoning task because of
its dense prediction nature. In this paper, we present a review
of CSS, committing to building a comprehensive survey on problem formulations,
primary challenges, universal datasets, neoteric theories and multifarious
applications. Concretely, we begin by elucidating the problem definitions and
primary challenges. Based on an in-depth investigation of relevant approaches,
we sort out and categorize current CSS models into two main branches:
data-replay and data-free sets. In each branch, the
corresponding approaches are clustered by similarity and thoroughly
analyzed, followed by qualitative comparison and quantitative reproductions on
relevant datasets. Besides, we also introduce four CSS specialities with
diverse application scenarios and development tendencies. Furthermore, we
develop a benchmark for CSS encompassing representative references, evaluation
results and reproductions, which is available
at https://github.com/YBIO/SurveyCSS. We hope this survey can serve as a
reference-worthy and stimulating contribution to the advancement of the
life-long learning field, while also providing valuable perspectives for
related fields. Comment: 20 pages, 12 figures. Undergoing Review