
    Learning Aerial Image Segmentation from Online Maps

    This study deals with semantic segmentation of high-resolution (aerial) images, in which a semantic class label is assigned to each pixel via supervised classification as a basis for automatic map generation. Recently, deep convolutional neural networks (CNNs) have shown impressive performance and have quickly become the de-facto standard for semantic segmentation, with the added benefit that task-specific feature design is no longer necessary. However, a major downside of deep learning methods is that they are extremely data-hungry, which aggravates the perennial bottleneck of supervised classification: obtaining enough annotated training data. On the other hand, they have been observed to be rather robust against noise in the training labels. This opens up the intriguing possibility of avoiding the annotation of huge amounts of training data and instead training the classifier from existing legacy data or crowd-sourced maps, which can exhibit high levels of noise. The question addressed in this paper is: can training with large-scale, publicly available labels replace a substantial part of the manual labeling effort and still achieve sufficient performance? Such data will inevitably contain a significant portion of errors, but in return virtually unlimited quantities of it are available in large parts of the world. We adapt a state-of-the-art CNN architecture for semantic segmentation of buildings and roads in aerial images, and compare its performance when trained on different data sets, ranging from manually labeled, pixel-accurate ground truth of the same city to automatic training data derived from OpenStreetMap data of distant locations. Our results indicate that satisfying performance can be obtained with significantly less manual annotation effort by exploiting noisy large-scale training data. Comment: Published in IEEE Transactions on Geoscience and Remote Sensing
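
    As a rough illustration of the training setup described above, the sketch below trains a small encoder-decoder CNN with pixel-wise cross-entropy on masks that stand in for labels rasterised from OpenStreetMap. The network, sizes, and data are hypothetical placeholders, not the paper's actual architecture or pipeline.

```python
# A minimal sketch, assuming a hypothetical tile/mask pipeline: pixel-wise
# supervised segmentation trained on (possibly noisy) OSM-derived labels.
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    """A deliberately small encoder-decoder stand-in for the paper's CNN."""
    def __init__(self, num_classes=3):  # e.g. background / building / road
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(32, num_classes, 1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = TinySegNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()  # tolerates moderate label noise in practice

# Dummy batch standing in for aerial tiles and OSM-rasterised label maps.
images = torch.randn(4, 3, 128, 128)
noisy_labels = torch.randint(0, 3, (4, 128, 128))

logits = model(images)                  # (N, C, H, W)
loss = criterion(logits, noisy_labels)  # pixel-wise cross-entropy
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.3f}")
```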

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, RS inevitably draws from many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools, and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing DL models. Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing

    A computer vision system for detecting and analysing critical events in cities

    Whether for commuting or leisure, cycling is a growing transport mode in many cities worldwide. However, it is still perceived as a dangerous activity. Although serious cycling incidents leading to major injuries are rare, the fear of getting hit or falling hinders the expansion of cycling as a major transport mode. Indeed, it has been shown that focusing on serious injuries only touches the tip of the iceberg. Near miss data can provide much more information about potential problems and how to avoid risky situations that may lead to serious incidents. Unfortunately, there is a knowledge gap in identifying and analysing near misses. This hinders drawing statistically significant conclusions and providing built-environment measures that ensure a safer environment for people on bikes. In this research, we develop a method to detect and analyse near misses and their risk factors using artificial intelligence. This is accomplished by analysing video streams linked to near miss incidents within a novel framework relying on deep learning and computer vision. This framework automatically detects near misses and extracts their risk factors from video streams before analysing their statistical significance. It also provides practical solutions implemented in a camera with embedded AI (URBAN-i Box) and a cloud-based service (URBAN-i Cloud) to tackle the stated issue in real-world settings, for use by researchers, policy-makers, or citizens. The research aims to provide human-centred evidence that may enable policy-makers and planners to provide a safer built environment for cycling in London or elsewhere. More broadly, this research aims to contribute to the scientific literature with the theoretical and empirical foundations of a computer vision system that can be utilised for detecting and analysing other critical events in a complex environment. Such a system can be applied to a wide range of events, such as traffic incidents, crime, or overcrowding.

    Efficient Hybrid Transformer: Learning Global-local Context for Urban Scene Segmentation

    Semantic segmentation of fine-resolution urban scene images plays a vital role in a wide range of practical applications, such as land cover mapping, urban change detection, environmental protection, and economic assessment. Driven by rapid developments in deep learning technologies, the convolutional neural network (CNN) has dominated the semantic segmentation task for many years. Convolutional neural networks adopt hierarchical feature representations, demonstrating strong local information extraction. However, the local nature of the convolution layer limits the network's ability to capture the global context that is crucial for precise segmentation. Recently, Transformers have become a hot topic in the computer vision domain. The Transformer demonstrates a great capability for global information modelling, boosting many vision tasks, such as image classification, object detection, and especially semantic segmentation. In this paper, we propose an efficient hybrid Transformer (EHT) for real-time urban scene segmentation. The EHT adopts a hybrid structure with a CNN-based encoder and a Transformer-based decoder, learning global-local context with lower computation. Extensive experiments demonstrate that our EHT achieves faster inference speed with competitive accuracy compared with state-of-the-art lightweight models. Specifically, the proposed EHT achieves 66.9% mIoU on the UAVid test set and outperforms other benchmark networks significantly. The code will be available soon.
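
    The sketch below illustrates, under stated assumptions, the kind of hybrid structure the abstract describes: a CNN stage for local features followed by a Transformer stage for global context. It is a minimal stand-in, not the authors' EHT; all layer sizes are invented for illustration.

```python
# A hedged sketch of a CNN/Transformer hybrid for segmentation. The CNN
# extracts local features at reduced resolution; self-attention over the
# flattened feature map then mixes in global context.
import torch
import torch.nn as nn

class HybridSegSketch(nn.Module):
    def __init__(self, num_classes=8, dim=64):
        super().__init__()
        # CNN encoder: strong local feature extraction, 4x downsampling.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Transformer over flattened feature tokens for global context.
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=1)
        self.head = nn.Conv2d(dim, num_classes, 1)

    def forward(self, x):
        f = self.cnn(x)                        # (N, C, H/4, W/4)
        n, c, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)  # (N, H*W/16, C)
        tokens = self.transformer(tokens)      # global self-attention
        f = tokens.transpose(1, 2).reshape(n, c, h, w)
        logits = self.head(f)                  # coarse per-pixel scores
        return nn.functional.interpolate(logits, size=x.shape[2:],
                                         mode="bilinear", align_corners=False)

model = HybridSegSketch()
out = model(torch.randn(1, 3, 128, 128))
print(out.shape)  # torch.Size([1, 8, 128, 128])
```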

    Reducing the Burden of Aerial Image Labelling Through Human-in-the-Loop Machine Learning Methods

    This dissertation presents an introduction to human-in-the-loop deep learning methods for remote sensing applications. It is motivated by the need to decrease the time spent by volunteers on semantic segmentation of remote sensing imagery. We look at two human-in-the-loop approaches to speeding up the labelling of remote sensing data: interactive segmentation and active learning. We develop these methods specifically in response to the needs of disaster relief organisations, which require accurately labelled maps of disaster-stricken regions quickly in order to respond to the needs of the affected communities. To begin, we survey the current approaches used within the field. We analyse the shortcomings of these models, which include outputs ill-suited for uploading to mapping databases and an inability to label new regions well when those regions differ from the regions trained on. The methods developed then address these shortcomings. We first develop an interactive segmentation algorithm. Interactive segmentation aims to segment objects with a supervisory signal from a user to assist the model. Work within interactive segmentation has focused largely on segmenting one or a few objects within an image. We make a few adaptations to allow an existing method to scale to remote sensing applications, where there are tens of objects within a single image that need to be segmented. We show quantitative improvements of up to 18% in mean intersection over union, as well as qualitative improvements. The algorithm works well when labelling new regions, and the qualitative improvements show outputs more suitable for uploading to mapping databases. We then investigate active learning in the context of remote sensing. Active learning aims to reduce the number of labelled samples required by a model to achieve an acceptable performance level. Within the context of deep learning, the utility of the various active learning strategies developed is uncertain, with conflicting results in the literature. We evaluate and compare a variety of sample acquisition strategies on semantic segmentation tasks in scenarios relevant to disaster relief mapping. Our results show that all active learning strategies evaluated provide minimal performance increases over a simple random sample acquisition strategy. However, we present an analysis of the results illustrating how the various strategies work, and intuition about when certain active learning strategies might be preferred. This analysis could be used to inform future research. We conclude by providing examples of the synergies of these two approaches, and indicate how this work on reducing the burden of aerial image labelling for the disaster relief mapping community can be further extended.
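
    As a minimal sketch of the acquisition-strategy comparison discussed above, the snippet below contrasts entropy-based uncertainty sampling with random sampling over an unlabelled pool; the stand-in model, pool, and budget are hypothetical and do not reproduce the dissertation's experiments.

```python
# Two pool-based acquisition strategies for active learning in semantic
# segmentation: pick the most uncertain images (mean pixel entropy) or
# pick uniformly at random.
import torch

def entropy_acquire(model, pool, k):
    """Return indices of the k pool images with the highest mean
    per-pixel predictive entropy (most uncertain)."""
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(pool), dim=1)            # (N, C, H, W)
        ent = -(probs * probs.clamp_min(1e-8).log()).sum(1)  # (N, H, W)
        scores = ent.mean(dim=(1, 2))                        # one score/image
    return scores.topk(k).indices

def random_acquire(pool, k):
    """Baseline: k indices drawn uniformly without replacement."""
    return torch.randperm(len(pool))[:k]

model = torch.nn.Conv2d(3, 5, 1)          # stand-in segmentation model
pool = torch.randn(20, 3, 64, 64)         # unlabelled candidate tiles
print(entropy_acquire(model, pool, k=4))  # indices to send for labelling
print(random_acquire(pool, k=4))
```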

    Analysis of traffic signs and traffic lights for autonomously driven vehicles

    Master's dissertation in Industrial Electronics and Computer Engineering. Advanced Driver Assistance Systems (ADAS) comprise various in-vehicle systems intended to improve road traffic safety by supporting and improving drivers' awareness of the road and its dangers, as well as of other drivers in the vicinity. Traffic sign detection and recognition is part of ADAS. Traffic signs provide knowledge about traffic rules, road conditions, and route directions, and assist drivers in driving safely. This ensures that the current speed limit and other road signs are displayed to the driver on an ongoing basis. The design of traffic sign recognition has been a challenging problem for many years and has therefore become an important and active research topic in the area of intelligent transport systems. This technology is being developed by a variety of automotive suppliers. Typically it uses classical image processing techniques to detect traffic signs. An approach to the problem of traffic sign/light detection and recognition using supervised learning is presented in this dissertation for two different scenarios. Two different supervised learning approaches are presented for each objective, along with an extended hyperparameter study. For the first two objectives, the approaches were developed for a robot produced by the Autonomous Driving team of the Laboratório de Automação e Robótica at the University of Minho. The robot must correctly detect and classify the presented traffic sign/light. The third objective targets the public road, where the developed approach must correctly detect and classify the presented traffic sign/light in a restricted dataset.
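
    As a hedged sketch of the supervised learning setup the abstract outlines, the snippet below trains a small CNN to classify cropped traffic-sign images; the class count, input size, and data are illustrative and do not reproduce the dissertation's models or datasets.

```python
# A minimal supervised traffic-sign classifier sketch: a tiny CNN over
# 32x32 crops, trained with cross-entropy. All shapes are illustrative.
import torch
import torch.nn as nn

classifier = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 43),   # e.g. 43 sign classes, as in GTSRB
)
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

crops = torch.randn(8, 3, 32, 32)    # cropped sign images (dummy batch)
labels = torch.randint(0, 43, (8,))  # ground-truth class ids
loss = criterion(classifier(crops), labels)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.3f}")
```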