6 research outputs found

    Bounding Box-Free Instance Segmentation Using Semi-Supervised Learning for Generating a City-Scale Vehicle Dataset

    Vehicle classification is an active computer vision topic, with studies ranging from ground-view to top-view imagery. In remote sensing, top-view images support the understanding of city patterns, vehicle concentration, traffic management, and other applications. However, pixel-wise classification faces several difficulties: (a) most vehicle classification studies use object detection methods, and most publicly available datasets are designed for that task; (b) creating instance segmentation datasets is laborious; and (c) traditional instance segmentation methods underperform on this task since the objects are small. Thus, the present research objectives are to: (1) propose a novel semi-supervised iterative learning approach using GIS software, (2) propose a box-free instance segmentation approach, and (3) provide a city-scale vehicle dataset. The iterative learning procedure consisted of: (1) labelling a small number of vehicles, (2) training on those samples, (3) using the model to classify the entire image, (4) converting the image prediction into a polygon shapefile, (5) correcting some areas with errors and including them in the training data, and (6) repeating until results are satisfactory. To separate instances, we considered vehicle interiors and vehicle borders, and the DL model was a U-Net with an EfficientNet-B7 backbone. Removing the borders isolates each vehicle interior, allowing for unique object identification. To recover the deleted 1-pixel borders, we propose a simple method that expands each prediction. The results show better pixel-wise metrics than Mask R-CNN (82% against 67% IoU). In the per-object analysis, the overall accuracy, precision, and recall were all greater than 90%. This pipeline applies to any remote sensing target and is very efficient for segmentation and for generating datasets.
    Comment: 38 pages, 10 figures, submitted to journal
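The border-removal instance separation described in this abstract can be sketched in a few lines: once the semantic model predicts vehicle interiors (with the 1-pixel borders removed), connected-component labelling assigns a unique id to each isolated interior, and a one-pixel expansion recovers the deleted border. This is a minimal illustration under assumptions, not the authors' implementation; the use of `scipy.ndimage` and the 3×3 expansion kernel are illustrative choices.

```python
# Hypothetical sketch of border-removal instance separation + 1-px recovery.
import numpy as np
from scipy import ndimage

def separate_and_expand(interior_mask: np.ndarray) -> np.ndarray:
    """Label isolated vehicle interiors, then grow each label by one pixel
    to recover the border pixels the semantic model removed."""
    labels, _ = ndimage.label(interior_mask)       # unique id per blob
    # 3x3 max filter grows every instance by one pixel; where two grown
    # instances would touch, the higher id wins (acceptable for a sketch)
    expanded = ndimage.grey_dilation(labels, size=(3, 3))
    # keep original ids on interior pixels, fill only former background
    return np.where(labels > 0, labels, expanded)

mask = np.zeros((7, 7), dtype=np.uint8)
mask[1:3, 1:3] = 1   # vehicle A interior
mask[4:6, 4:6] = 1   # vehicle B interior
inst = separate_and_expand(mask)
print(inst.max())    # → 2 (two separated instances)
```

The one-pixel expansion mirrors the abstract's recovery step: the interiors stay disjoint for labelling, and the dilation restores roughly the footprint the border deletion removed.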

    A CNN-Based Method of Vehicle Detection from Aerial Images Using Hard Example Mining

    Recently, deep learning techniques have come to play a practical role in vehicle detection. While much effort has been spent on applying deep learning to vehicle detection, the effective use of training data has not been thoroughly studied, although it has great potential for improving training results, especially when training data are sparse. In this paper, we propose using hard example mining (HEM) in the training process of a convolutional neural network (CNN) for vehicle detection in aerial images. We apply HEM to stochastic gradient descent (SGD) to choose the most informative training data, calculating the loss values in each batch and employing the examples with the largest losses. We picked 100 out of 500 and out of 1000 examples for training in one iteration, and we tested different ratios of positive to negative examples in the training data to evaluate how this balance affects performance. In every case, our method outperformed plain SGD: on images from New York, the F1 score of our method was 0.02 higher than that of a CNN trained with plain SGD.
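The HEM selection step described above can be illustrated compactly: given per-example losses for a candidate pool, keep only the k examples with the largest losses for the gradient update. The loss values below are random placeholders; only the 100-of-500 selection ratio comes from the paper.

```python
# Minimal sketch of hard example mining: select top-k losses from a pool.
import numpy as np

def mine_hard_examples(losses: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k examples with the largest loss values."""
    return np.argsort(losses)[::-1][:k]

rng = np.random.default_rng(0)
pool_losses = rng.random(500)                 # placeholder per-example losses
hard_idx = mine_hard_examples(pool_losses, k=100)

# the selected examples are exactly the hardest ones in the pool
assert len(hard_idx) == 100
assert pool_losses[hard_idx].min() >= np.median(pool_losses)
```

In training, only the examples at `hard_idx` would contribute to the SGD update for that iteration, concentrating gradient signal on the most informative samples.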

    Applications of deep learning models for environmental and agricultural monitoring in Brazil

    Doctoral thesis — Universidade de Brasília, Instituto de Ciências Humanas, Departamento de Geografia, Programa de Pós-Graduação em Geografia, 2022. Algorithms from the new field of machine learning known as Deep Learning have recently gained popularity, showing results superior to traditional models in classification and regression tasks. Their history in remote sensing is still short, but they have shown similarly superior results in processes such as land use and land cover classification and change detection. This thesis aimed to develop methodologies using these algorithms, focused on monitoring critical targets in Brazil through satellite imagery, in order to obtain high-precision, high-accuracy models to replace currently used methods. Three articles were produced, evaluating the use of these algorithms for detecting three distinct targets: (a) burnt areas in the Brazilian Cerrado, (b) deforested areas in the Amazon region, and (c) rice fields in southern Brazil. Despite the articles' similar objectives, their methodologies were made sufficiently distinct to expand the known methodological space and to provide a theoretical basis that facilitates and encourages the adoption of these algorithms in the national context. The first article evaluated different sample dimensions for classifying burnt areas in Landsat-8 imagery. The second evaluated binary Landsat time series for detecting newly deforested areas between 2017, 2018, and 2019. The third used a continuous Sentinel-1 radar (SAR) time series to delimit rice plantations in Rio Grande do Sul. Similar models were used in all articles, though certain models were exclusive to each publication, producing different results.
Overall, the results show that Deep Learning algorithms are not only viable for detecting these targets but also outperform existing methods in the literature, representing a highly efficient alternative for classification and change detection of the evaluated targets.

    Deep Learning based Vehicle Detection in Aerial Imagery

    The use of airborne platforms equipped with imaging sensors is an essential component of many applications in the field of civil security. Well-known application areas include the detection of prohibited or criminal activities, traffic monitoring, search and rescue, disaster relief, and environmental monitoring. However, due to the large amount of data to be processed and the resulting cognitive overload, analysis of aerial imagery by human analysts alone is not feasible in practice. Automatic image and video processing algorithms are therefore typically employed to support the human analysts. A central task is the reliable detection of relevant objects in the camera's field of view before the given scene can be interpreted. The low ground resolution caused by the large distance between camera and ground makes object detection in aerial imagery a challenging task, further complicated by motion blur, occlusions, and cast shadows. Although a variety of conventional approaches to object detection in aerial imagery exists in the literature, their detection accuracy is limited by the representational power of the hand-crafted features they use. This thesis presents a new deep-learning-based approach to object detection in aerial imagery, focusing on the detection of vehicles in aerial images captured vertically from above. The approach builds on the Faster R-CNN detector, which achieves higher detection accuracy than other deep-learning-based detection methods.
Since Faster R-CNN, like the other deep-learning-based detection methods, was optimised on benchmark datasets, the first step systematically investigates the adaptations required by the characteristics of aerial imagery, such as the small size of the vehicles to be detected, and identifies the resulting problems. With regard to real applications, the high number of false detections caused by vehicle-like structures and the significantly increased runtime are particularly problematic. Two new approaches are proposed to reduce false detections; both aim to improve the feature representation used with additional context information. The first approach refines the spatial context information by combining features from early and deep layers of the underlying CNN architecture, so that fine and coarse structures are better represented. The second approach uses semantic segmentation to increase the semantic information content. Two variants of integrating semantic segmentation into the detection method are realised: using the semantic segmentation results to filter out unlikely detections, and explicitly merging the CNN architectures for detection and segmentation. Both the refinement of the spatial context information and the integration of semantic context information significantly reduce the number of false detections and thus increase detection accuracy. In particular, the sharp drop in false detections in implausible image regions, such as on buildings, demonstrates the increased robustness of the learned feature representations. Two alternative strategies are pursued to reduce the runtime.
The first strategy replaces the CNN architecture used by default for feature extraction with a runtime-optimised CNN architecture that accounts for the characteristics of aerial imagery, while the second strategy comprises a new module for reducing the search space. The proposed strategies significantly reduce the overall runtime as well as the runtime of each component of the detection method. By combining the proposed approaches, both detection accuracy and runtime are improved significantly compared to the Faster R-CNN baseline. Representative approaches to vehicle detection in aerial imagery from the literature are outperformed quantitatively and qualitatively on several datasets. Furthermore, the generalisability of the approach is demonstrated on unseen images from further aerial image datasets with differing characteristics.
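One of the two segmentation-integration variants described in this abstract — filtering implausible detections with a semantic segmentation result — can be sketched roughly as follows. The class id, the centre-pixel test, and all names are illustrative assumptions, not the thesis implementation.

```python
# Hypothetical sketch: drop detections whose centre lies on an implausible
# semantic class (e.g. "vehicle" detections on building roofs).
import numpy as np

ROAD = 1  # assumed class id for regions where vehicles are plausible

def filter_by_segmentation(boxes, seg_map):
    """Keep boxes whose centre pixel falls on a plausible (ROAD) class."""
    kept = []
    for x0, y0, x1, y1 in boxes:
        cx, cy = (x0 + x1) // 2, (y0 + y1) // 2
        if seg_map[cy, cx] == ROAD:
            kept.append((x0, y0, x1, y1))
    return kept

seg = np.zeros((10, 10), dtype=np.uint8)
seg[:, :5] = ROAD                        # left half of the scene is road
boxes = [(1, 1, 3, 3), (7, 7, 9, 9)]     # one on road, one on a building
print(filter_by_segmentation(boxes, seg))  # → [(1, 1, 3, 3)]
```

The alternative variant — merging the detection and segmentation CNNs — would instead share features inside the network rather than post-filtering its outputs.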

    Mapping agricultural land in support of opium monitoring in Afghanistan with Convolutional Neural Networks (CNNs).

    This work investigates the use of advanced image classification techniques for improving the accuracy and efficiency of determining agricultural areas from satellite images. The United Nations Office on Drugs and Crime (UNODC) needs to accurately delineate the potential area under opium cultivation as part of its opium monitoring programme in Afghanistan. It currently uses unsupervised image classification, but this is unable to separate some areas of agriculture from natural vegetation and requires time-consuming manual editing. This is a significant task, as each image must be classified and interpreted separately. The aim of this research is to derive information about annual changes in land use related to opium cultivation using convolutional neural networks (CNNs) with Earth observation data. Supervised machine learning techniques were investigated for agricultural land classification using training data from existing manual interpretations. Although pixel-based machine learning techniques achieved high overall classification accuracy (89%), they had difficulty separating agriculture from natural vegetation at some locations. CNNs have achieved ground-breaking performance in computer vision applications; they use localised image features and offer transfer learning to overcome the limitations of pixel-based methods. There are challenges in training CNNs for land cover classification because of underlying radiometric and temporal variations in satellite image datasets. The CNNs were optimised with a targeted sampling strategy focused on areas of known confusion (agricultural boundaries and natural vegetation), improving overall classification accuracy by 6%. Localised differences in agricultural mapping were identified using a new tool called 'localised intersection over union'.
This provides greater insight than commonly used assessment techniques (overall accuracy and the kappa statistic), which are not suitable for comparing smaller differences in mapping accuracy. A generalised fully convolutional network (FCN) model was developed and evaluated using six years of data and transfer learning. Image datasets were standardised across image dates and different sensors (DMC, Landsat, and Sentinel-2), achieving high classification accuracy (up to 95%) with no additional training. Further fine-tuning with minimal training data and a targeted training strategy increased model performance between years by up to a further 5%. The annual changes in agricultural area from 2010 to 2019 were mapped using the generalised FCN model in Helmand Province, Afghanistan. This provided new insight into the expansion of agriculture into marginal areas in response to counter-narcotics and alternative-livelihoods policy. New areas of cultivation were found to contribute to the expansion of opium cultivation in Helmand Province. The approach demonstrates the use of FCNs for fully automated land cover classification: they are fast and efficient, can classify satellite imagery from different sensors, and can be continually refined using transfer learning. The proposed method removes the manual effort associated with mapping agricultural areas within the opium survey while improving accuracy. These findings have wider implications for improving land cover classification using legacy data on scalable cloud-based platforms.
Simms, Daniel M. (Associate) — PhD in Environment and Agrifoo
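The 'localised intersection over union' idea mentioned above can be sketched as a per-tile IoU grid: instead of a single global score, IoU is computed within each tile so that regionally confined mapping errors are not averaged away. The tile size and the NaN convention for empty tiles are assumptions; the actual tool in the thesis may differ.

```python
# Sketch of a per-tile ("localised") intersection-over-union assessment.
import numpy as np

def localised_iou(pred, truth, tile=4):
    """Return a grid of IoU scores, one per tile (NaN where both empty)."""
    h, w = pred.shape
    scores = np.full((h // tile, w // tile), np.nan)
    for i in range(h // tile):
        for j in range(w // tile):
            p = pred[i*tile:(i+1)*tile, j*tile:(j+1)*tile].astype(bool)
            t = truth[i*tile:(i+1)*tile, j*tile:(j+1)*tile].astype(bool)
            union = (p | t).sum()
            if union:
                scores[i, j] = (p & t).sum() / union
    return scores

pred = np.zeros((8, 8), dtype=bool)
truth = np.zeros((8, 8), dtype=bool)
pred[:4, :4] = truth[:4, :4] = True   # top-left tile: perfect agreement
pred[4:, 4:] = True                    # bottom-right tile: pure commission
grid = localised_iou(pred, truth)
# global IoU is 0.5, hiding that one tile is perfect and one is a total miss
```

The grid makes the commission error visible (`grid[1, 1] == 0.0`) where the global score of 0.5 would obscure it, which is exactly the comparison the abstract says overall accuracy and kappa cannot make.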