32 research outputs found

    Object Counting with Deep Learning

    This thesis explores various empirical aspects of deep learning, or convolutional network based models, for efficient object counting. First, we train moderately large convolutional networks from scratch on comparatively small datasets containing a few hundred samples, using conventional image processing based data augmentation. Then, we extend this approach to unconstrained, outdoor images using more advanced architectural concepts. Additionally, we propose an efficient, randomized data augmentation strategy based on sub-regional pixel distribution for low-resolution images. Next, the effectiveness of depth-to-space shuffling of feature elements for efficient segmentation is investigated for simpler problems like binary segmentation, which is often required in the counting framework. This depth-to-space operation violates the basic assumption of encoder-decoder segmentation architectures. Consequently, it helps to train the encoder model as a sparsely connected graph. Nonetheless, we have found that our depth-to-space models achieve accuracy comparable to that of the standard encoder-decoder architectures. After that, the subtleties regarding the lack of localization information in the conventional scalar count loss for one-look models are illustrated. At this point, without using additional annotations, a possible solution is proposed based on the regulation of a network-generated heatmap in the form of a weak, subsidiary loss. The models trained with this auxiliary loss alongside the conventional loss perform much better than their baseline counterparts, both qualitatively and quantitatively. Lastly, the intricacies of tiled prediction for high-resolution images are studied in detail, and a simple and effective trick of eliminating the normalization factor in an existing computational block is demonstrated.
All of the approaches employed here are thoroughly benchmarked across multiple heterogeneous datasets for object counting against previous, state-of-the-art approaches.
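
The depth-to-space shuffling mentioned in the abstract above can be illustrated with a small sketch. This is not the thesis's implementation; it is a minimal pure-Python version of the generic operation, assuming a channels-last layout (height x width x depth) and a channel ordering in which the block offset is the slowest-varying part of the channel index. The function name `depth_to_space` and the parameter `r` (block size) are chosen here for illustration.

```python
def depth_to_space(x, r):
    """Rearrange an H x W x (C*r*r) nested list into (H*r) x (W*r) x C.

    Assumed layout: channels-last, with channel index (i*r + j)*C + k
    holding output channel k at block offset (row i, column j).
    """
    h, w, d = len(x), len(x[0]), len(x[0][0])
    if d % (r * r) != 0:
        raise ValueError("channel depth must be divisible by r*r")
    c = d // (r * r)
    # Allocate the upsampled output grid.
    out = [[[None] * c for _ in range(w * r)] for _ in range(h * r)]
    for row in range(h):
        for col in range(w):
            for i in range(r):          # vertical offset inside the block
                for j in range(r):      # horizontal offset inside the block
                    for k in range(c):  # output channel
                        src = (i * r + j) * c + k
                        out[row * r + i][col * r + j][k] = x[row][col][src]
    return out


# One pixel with four channels becomes a 2 x 2 single-channel patch.
patch = depth_to_space([[[1, 2, 3, 4]]], 2)
```

The operation trades channel depth for spatial resolution without any learned parameters, which is why it can stand in for a learned decoder in simple segmentation settings.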

    A Review of the Challenges of Using Deep Learning Algorithms to Support Decision-Making in Agricultural Activities

    Deep Learning has been successfully applied to image recognition, speech recognition, and natural language processing in recent years. Therefore, there has been an incentive to apply it in other fields as well. The field of agriculture is one of the most important fields in which the application of deep learning still needs to be explored, as it has a direct impact on human well-being. In particular, there is a need to explore how deep learning models can be used as a tool for optimal planting, land use, yield improvement, production/disease/pest control, and other activities. The vast amount of data received from sensors in smart farms makes it possible to use deep learning as a model for decision-making in this field. In agriculture, no two environments are exactly alike, which makes testing, validating, and successfully implementing such technologies much more complex than in most other industries. This paper reviews some recent scientific developments in the field of deep learning that have been applied to agriculture, and highlights some challenges and potential solutions using deep learning algorithms in agriculture. The results in this paper indicate that by employing new methods from deep learning, higher performance in terms of accuracy and lower inference time can be achieved, and the models can be made useful in real-world applications. Finally, some opportunities for future research in this area are suggested. This work is supported by the R&D Project BioDAgro—Sistema operacional inteligente de informação e suporte á decisão em AgroBiodiversidade, project PD20-00011, promoted by Fundação La Caixa and Fundação para a Ciência e a Tecnologia, taking place at the C-MAST-Centre for Mechanical and Aerospace Sciences and Technology, Department of Electromechanical Engineering of the University of Beira Interior, Covilhã, Portugal.

    Advancing Land Cover Mapping in Remote Sensing with Deep Learning

    Automatic mapping of land cover in remote sensing data plays an increasingly significant role in several earth observation (EO) applications, such as sustainable development, autonomous agriculture, and urban planning. Due to the complexity of the real ground surface and environment, accurate classification of land cover types faces many challenges. This thesis provides novel deep learning-based solutions to land cover mapping challenges such as how to deal with intricate objects and imbalanced classes in multi-spectral and high-spatial resolution remote sensing data. The first work presents a novel model to learn richer multi-scale and global contextual representations in very high-resolution remote sensing images, namely the dense dilated convolutions' merging (DDCM) network. The proposed method is lightweight, flexible and extendable, so that it can be used as a simple yet effective encoder and decoder module to address different classification and semantic mapping challenges. Intensive experiments on different benchmark remote sensing datasets demonstrate that the proposed method can achieve better performance while consuming far fewer computational resources than other published methods. Next, a novel graph model is developed for capturing long-range pixel dependencies in remote sensing images to improve land cover mapping. One key component in the method is the self-constructing graph (SCG) module that can effectively construct global context relations (latent graph structure) without requiring prior knowledge graphs. The proposed SCG-based models achieved competitive performance on different representative remote sensing datasets with faster training and lower computational cost compared to strong baseline models.
The third work introduces a new framework, namely the multi-view self-constructing graph (MSCG) network, which extends the vanilla SCG model to capture multi-view context representations with rotation invariance and thereby improve segmentation performance. Meanwhile, a novel adaptive class weighting loss function is developed to alleviate the issue of class imbalance commonly found in EO datasets for semantic segmentation. Experiments on benchmark data demonstrate that the proposed framework is computationally efficient and robust, and produces improved segmentation results for imbalanced classes. To address the key challenges in multi-modal land cover mapping of remote sensing data, namely, 'what', 'how' and 'where' to effectively fuse multi-source features and to efficiently learn optimal joint representations of different modalities, the last work presents a compact and scalable multi-modal deep learning framework (MultiModNet) based on two novel modules: the pyramid attention fusion module and the gated fusion unit. The proposed MultiModNet outperforms the strong baselines on two representative remote sensing datasets with fewer parameters and at a lower computational cost. Extensive ablation studies also validate the effectiveness and flexibility of the framework.

    Deep Learning based Vehicle Detection in Aerial Imagery

    The use of airborne platforms equipped with imaging sensors is an essential component of many applications in civil security. Well-known applications include the detection of prohibited or criminal activities, traffic monitoring, search and rescue, disaster relief, and environmental monitoring. Because of the large volume of data to be processed and the resulting cognitive overload, however, analysis of aerial imagery by human operators alone is not practicable. Automatic image and video processing algorithms are therefore generally used to support the human operators. A central task is the reliable detection of relevant objects in the camera's field of view before the scene can be interpreted. The low ground resolution caused by the large distance between camera and ground makes object detection in aerial imagery a challenging task, which is further complicated by motion blur, occlusions, and shadows. Although the literature contains many conventional approaches to object detection in aerial imagery, their detection accuracy is limited by the representational capacity of the hand-crafted features they use. This thesis presents a new deep learning based approach to object detection in aerial imagery. The focus is on detecting vehicles in aerial images recorded vertically from above. The approach builds on the Faster R-CNN detector, which achieves higher detection accuracy than other deep learning based detection methods.
Because Faster R-CNN, like the other deep learning based detection methods, was optimized on benchmark datasets, a first step systematically investigates the adaptations required by the characteristics of aerial imagery, such as the small size of the vehicles to be detected, and identifies the resulting problems. With regard to real-world applications, the main problems are the high number of false detections caused by vehicle-like structures and the considerably increased runtime. Two new approaches are proposed to reduce the false detections. Both aim to improve the feature representation with additional context information. The first approach refines the spatial context information by combining features from early and deep layers of the underlying CNN architecture, so that fine and coarse structures are better represented. The second approach uses semantic segmentation to increase the semantic information content. Two variants of integrating semantic segmentation into the detection method are realized: using the semantic segmentation results to filter out unlikely detections, and explicitly merging the CNN architectures for detection and segmentation. Both the refinement of the spatial context information and the integration of the semantic context information considerably reduce the number of false detections and thus increase detection accuracy. In particular, the strong decline in false detections in unlikely image regions, for example on buildings, shows the increased robustness of the learned feature representations. Two alternative strategies are pursued to reduce runtime.
The first strategy replaces the CNN architecture used by default for feature extraction with a runtime-optimized CNN architecture that takes the characteristics of aerial imagery into account, while the second strategy comprises a new module for reducing the search space. With the proposed strategies, the overall runtime and the runtime of every component of the detection method are considerably reduced. Combining the proposed approaches significantly improves both detection accuracy and runtime compared to the Faster R-CNN baseline. Representative approaches to vehicle detection in aerial imagery from the literature are outperformed quantitatively and qualitatively on several datasets. Furthermore, the generalizability of the proposed approach is demonstrated on unseen images from additional aerial image datasets with differing characteristics.
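
One of the variants described in the abstract above uses semantic segmentation results to filter out unlikely detections. The sketch below shows one simple way such a filter could work; it is an assumption for illustration, not the thesis's method. The names `filter_detections`, `plausible_mask`, and `min_fraction`, the box format `(x1, y1, x2, y2, score)` with exclusive upper bounds, and the 0.5 threshold are all hypothetical.

```python
def filter_detections(detections, plausible_mask, min_fraction=0.5):
    """Keep boxes that overlap sufficiently with plausible regions.

    detections: list of (x1, y1, x2, y2, score), integer pixel
        coordinates, upper bounds exclusive (hypothetical format).
    plausible_mask: 2D list indexed [y][x]; 1 where the segmentation
        says a vehicle is plausible (e.g. road), 0 elsewhere
        (e.g. building).
    min_fraction: assumed threshold on the share of plausible pixels.
    """
    kept = []
    for (x1, y1, x2, y2, score) in detections:
        pixels = [plausible_mask[y][x]
                  for y in range(y1, y2)
                  for x in range(x1, x2)]
        if not pixels:
            continue  # skip degenerate (empty) boxes
        if sum(pixels) / len(pixels) >= min_fraction:
            kept.append((x1, y1, x2, y2, score))
    return kept


# Left half of the image is plausible, right half is not.
mask = [[1, 1, 0, 0] for _ in range(4)]
boxes = [(0, 0, 2, 2, 0.9), (2, 0, 4, 2, 0.8), (1, 0, 3, 2, 0.7)]
result = filter_detections(boxes, mask)
```

A post-hoc filter of this kind leaves the detector itself unchanged, in contrast to the second variant in the abstract, which merges the detection and segmentation architectures.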

    Deep Learning Methods for Remote Sensing

    Remote sensing is a field where important physical characteristics of an area are extracted using emitted radiation, generally captured by satellite cameras, sensors onboard aerial vehicles, etc. Captured data help researchers develop solutions to sense and detect various characteristics such as forest fires, flooding, changes in urban areas, crop diseases, soil moisture, etc. The recent impressive progress in artificial intelligence (AI) and deep learning has sparked innovations in technologies, algorithms, and approaches and led to results that were unachievable until recently in multiple areas, among them remote sensing. This book consists of sixteen peer-reviewed papers covering new advances in the use of AI for remote sensing.

    Deep Learning based Vehicle Detection in Aerial Imagery

    This book proposes a novel deep learning based detection method, focusing on vehicle detection in aerial imagery recorded in top view. The base detection framework is extended by two novel components that improve detection accuracy by enhancing the contextual and semantic content of the employed feature representation. To reduce the inference time, a lightweight CNN architecture is proposed as the base architecture, and a novel module that restricts the search area is introduced.

    Advanced machine learning algorithms for Canadian wetland mapping using polarimetric synthetic aperture radar (PolSAR) and optical imagery

    Wetlands are complex land cover ecosystems that represent a wide range of biophysical conditions. They are one of the most productive ecosystems and provide several important environmental functionalities. As such, wetland mapping and monitoring using cost- and time-efficient approaches are of great interest for sustainable management and resource assessment. In this regard, satellite remote sensing data are greatly beneficial, as they capture a synoptic and multi-temporal view of landscapes. The ability to extract useful information from satellite imagery greatly affects the accuracy and reliability of the final products. This is of particular concern for mapping complex land cover ecosystems, such as wetlands, where complex, heterogeneous, and fragmented landscapes result in similar backscatter/spectral signatures of land cover classes in satellite images. Accordingly, the overarching purpose of this thesis is to contribute to existing methodologies of wetland classification by proposing and developing several new techniques based on advanced remote sensing tools and optical and Synthetic Aperture Radar (SAR) imagery. Specifically, the importance of employing an efficient speckle reduction method for polarimetric SAR (PolSAR) image processing is discussed and a new speckle reduction technique is proposed. Two novel techniques are also introduced for improving the accuracy of wetland classification. In particular, a new hierarchical classification algorithm using multi-frequency SAR data is proposed that discriminates wetland classes in three steps depending on their complexity and similarity. The experimental results reveal that the proposed method is advantageous for mapping complex land cover ecosystems compared to single stream classification approaches, which have been extensively used in the literature.
Furthermore, a new feature weighting approach is proposed based on the statistical and physical characteristics of PolSAR data to improve the discrimination capability of input features prior to incorporating them into the classification scheme. This study also demonstrates the transferability of existing classification algorithms, which have been developed based on RADARSAT-2 imagery, to compact polarimetry SAR data that will be collected by the upcoming RADARSAT Constellation Mission (RCM). The capability of several well-known deep Convolutional Neural Network (CNN) architectures currently employed in computer vision is first introduced in this thesis for classification of wetland complexes using multispectral remote sensing data. Finally, this research results in the first provincial-scale wetland inventory maps of Newfoundland and Labrador using the Google Earth Engine (GEE) cloud computing resources and open access Earth Observation (EO) data collected by the Copernicus Sentinel missions. Overall, the methodologies proposed in this thesis address fundamental limitations/challenges of wetland mapping using remote sensing data, which have been ignored in the literature. These challenges include the backscattering/spectrally similar signature of wetland classes, insufficient classification accuracy of wetland classes, and limitations of wetland mapping on large scales. In addition to the capabilities of the proposed methods for mapping wetland complexes, the use of these developed techniques for classifying other complex land cover types beyond wetlands, such as sea ice and crop ecosystems, offers a potential avenue for further research.

    Photogrammetric suite to manage the survey workflow in challenging environments and conditions

    The present work aims to provide new and innovative instruments to support the photogrammetric survey workflow during all its phases. A suite of tools has been conceived to manage the planning, acquisition, post-processing and restitution steps, with particular attention to the rigour of the approach and to the final precision. The main focus of the research has been the implementation of the tool MAGO, standing for Adaptive Mesh for Orthophoto Generation. Its novelty consists in the possibility of automatically reconstructing “unrolled” orthophotos of adjacent façades of a building using the point cloud, instead of the mesh, as the input source for the orthophoto reconstruction. The second tool has been conceived as a photogrammetric procedure based on Bundle Block Adjustment. The same issue is analysed from two mirrored perspectives: on the one hand, the use of moving cameras in a static scenario in order to manage real-time indoor navigation; on the other hand, the use of static cameras in a moving scenario in order to achieve the simultaneous reconstruction of the 3D model of the changing object. A third tool named U.Ph.O., standing for Unmanned Photogrammetric Office, has been integrated with a new module. The general aim is on the one hand to plan the photogrammetric survey considering the expected precision, computed on the basis of a network simulation, and on the other hand to check whether the completed survey has been collected in accordance with the planned conditions. The new module extends the treatment to surfaces with a generic orientation, beyond those with a planimetric development. After a brief introduction, a general description of the photogrammetric principles is given in the first chapter of the dissertation; a chapter follows on the parallelism between Photogrammetry and Computer Vision and the contribution of the latter to the development of the described tools.
The third chapter covers the implemented software and tools, while the fourth contains the training test and the validation. Finally, conclusions and future perspectives are reported.