140 research outputs found

    Self-Supervised Learning for Invariant Representations From Multi-Spectral and SAR Images

    Get PDF
    Self-Supervised learning (SSL) has become the new state of the art in several domain classification and segmentation tasks. One popular category of SSL are distillation networks such as Bootstrap Your Own Latent (BYOL). This work proposes RS-BYOL, which builds on BYOL in the remote sensing (RS) domain where data are non-trivially different from natural RGB images. Since multi-spectral (MS) and synthetic aperture radar (SAR) sensors provide varied spectral and spatial resolution information, we utilise them as an implicit augmentation to learn invariant feature embeddings. In order to learn RS based invariant features with SSL, we trained RS-BYOL in two ways, i.e. single channel feature learning and three channel feature learning. This work explores the usefulness of single channel feature learning from random 10 MS bands of 10m-20 m resolution and VV-VH of SAR bands compared to the common notion of using three or more bands. In our linear probing evaluation, these single channel features reached a 0.92 F1 score on the EuroSAT classification task and 59.6 mIoU on the IEEE Data Fusion Contest (DFC) segmentation task for certain single bands. We also compare our results with ImageNet weights and show that the RS based SSL model outperforms the supervised ImageNet based model. We further explore the usefulness of multi-modal data compared to single modality data, and it is shown that utilising MS and SAR data allows better invariant representations to be learnt than utilising only MS data

    Synthetic Aperture Radar (SAR) Meets Deep Learning

    Get PDF
    This reprint focuses on the application of the combination of synthetic aperture radars and depth learning technology. It aims to further promote the development of SAR image intelligent interpretation technology. A synthetic aperture radar (SAR) is an important active microwave imaging sensor, whose all-day and all-weather working capacity give it an important place in the remote sensing community. Since the United States launched the first SAR satellite, SAR has received much attention in the remote sensing community, e.g., in geological exploration, topographic mapping, disaster forecast, and traffic monitoring. It is valuable and meaningful, therefore, to study SAR-based remote sensing applications. In recent years, deep learning represented by convolution neural networks has promoted significant progress in the computer vision community, e.g., in face recognition, the driverless field and Internet of things (IoT). Deep learning can enable computational models with multiple processing layers to learn data representations with multiple-level abstractions. This can greatly improve the performance of various applications. This reprint provides a platform for researchers to handle the above significant challenges and present their innovative and cutting-edge research results when applying deep learning to SAR in various manuscript types, e.g., articles, letters, reviews and technical reports

    Advances in Image Processing, Analysis and Recognition Technology

    Get PDF
    For many decades, researchers have been trying to make computers’ analysis of images as effective as the system of human vision is. For this purpose, many algorithms and systems have previously been created. The whole process covers various stages, including image processing, representation and recognition. The results of this work can be applied to many computer-assisted areas of everyday life. They improve particular activities and provide handy tools, which are sometimes only for entertainment, but quite often, they significantly increase our safety. In fact, the practical implementation of image processing algorithms is particularly wide. Moreover, the rapid growth of computational complexity and computer efficiency has allowed for the development of more sophisticated and effective algorithms and tools. Although significant progress has been made so far, many issues still remain, resulting in the need for the development of novel approaches

    Deep Learning Methods for Remote Sensing

    Get PDF
    Remote sensing is a field where important physical characteristics of an area are exacted using emitted radiation generally captured by satellite cameras, sensors onboard aerial vehicles, etc. Captured data help researchers develop solutions to sense and detect various characteristics such as forest fires, flooding, changes in urban areas, crop diseases, soil moisture, etc. The recent impressive progress in artificial intelligence (AI) and deep learning has sparked innovations in technologies, algorithms, and approaches and led to results that were unachievable until recently in multiple areas, among them remote sensing. This book consists of sixteen peer-reviewed papers covering new advances in the use of AI for remote sensing

    Sustainable Agriculture and Advances of Remote Sensing (Volume 1)

    Get PDF
    Agriculture, as the main source of alimentation and the most important economic activity globally, is being affected by the impacts of climate change. To maintain and increase our global food system production, to reduce biodiversity loss and preserve our natural ecosystem, new practices and technologies are required. This book focuses on the latest advances in remote sensing technology and agricultural engineering leading to the sustainable agriculture practices. Earth observation data, in situ and proxy-remote sensing data are the main source of information for monitoring and analyzing agriculture activities. Particular attention is given to earth observation satellites and the Internet of Things for data collection, to multispectral and hyperspectral data analysis using machine learning and deep learning, to WebGIS and the Internet of Things for sharing and publishing the results, among others

    Pre-Trained Driving in Localized Surroundings with Semantic Radar Information and Machine Learning

    Get PDF
    Entlang der Signalverarbeitungskette von Radar Detektionen bis zur Fahrzeugansteuerung, diskutiert diese Arbeit eine semantischen Radar Segmentierung, einen darauf aufbauenden Radar SLAM, sowie eine im Verbund realisierte autonome Parkfunktion. Die Radarsegmentierung der (statischen) Umgebung wird durch ein Radar-spezifisches neuronales Netzwerk RadarNet erreicht. Diese Segmentierung ermöglicht die Entwicklung des semantischen Radar Graph-SLAM SERALOC. Auf der Grundlage der semantischen Radar SLAM Karte wird eine beispielhafte autonome Parkfunktionalität in einem realen Versuchsträger umgesetzt. Entlang eines aufgezeichneten Referenzfades parkt die Funktion ausschließlich auf Basis der Radar Wahrnehmung mit bisher unerreichter Positioniergenauigkeit. Im ersten Schritt wird ein Datensatz von 8.2 · 10^6 punktweise semantisch gelabelten Radarpunktwolken über eine Strecke von 2507.35m generiert. Es sind keine vergleichbaren Datensätze dieser Annotationsebene und Radarspezifikation öffentlich verfügbar. Das überwachte Training der semantischen Segmentierung RadarNet erreicht 28.97% mIoU auf sechs Klassen. Außerdem wird ein automatisiertes Radar-Labeling-Framework SeRaLF vorgestellt, welches das Radarlabeling multimodal mittels Referenzkameras und LiDAR unterstützt. Für die kohärente Kartierung wird ein Radarsignal-Vorfilter auf der Grundlage einer Aktivierungskarte entworfen, welcher Rauschen und andere dynamische Mehrwegreflektionen unterdrückt. Ein speziell für Radar angepasstes Graph-SLAM-Frontend mit Radar-Odometrie Kanten zwischen Teil-Karten und semantisch separater NDT Registrierung setzt die vorgefilterten semantischen Radarscans zu einer konsistenten metrischen Karte zusammen. Die Kartierungsgenauigkeit und die Datenassoziation werden somit erhöht und der erste semantische Radar Graph-SLAM für beliebige statische Umgebungen realisiert. Integriert in ein reales Testfahrzeug, wird das Zusammenspiel der live RadarNet Segmentierung und des semantischen Radar Graph-SLAM anhand einer rein Radar-basierten autonomen Parkfunktionalität evaluiert. Im Durchschnitt über 42 autonome Parkmanöver (∅3.73 km/h) bei durchschnittlicher Manöverlänge von ∅172.75m wird ein Median absoluter Posenfehler von 0.235m und End-Posenfehler von 0.2443m erreicht, der vergleichbare Radar-Lokalisierungsergebnisse um ≈ 50% übertrifft. Die Kartengenauigkeit von veränderlichen, neukartierten Orten über eine Kartierungsdistanz von ∅165m ergibt eine ≈ 56%-ige Kartenkonsistenz bei einer Abweichung von ∅0.163m. Für das autonome Parken wurde ein gegebener Trajektorienplaner und Regleransatz verwendet

    Learning representations in the hyperspectral domain in aerial imagery

    Get PDF
    We establish two new datasets with baselines and network architectures for the task of hyperspectral image analysis. The first dataset, AeroRIT, is a moving camera static scene captured from a flight and contains per pixel labeling across five categories for the task of semantic segmentation. The second dataset, RooftopHSI, helps design and interpret learnt features on hyperspectral object detection on scenes captured from an university rooftop. This dataset accounts for static camera, moving scene hyperspectral imagery. We further broaden the scope of our understanding of neural networks with the development of two novel algorithms - S4AL and S4AL+. We develop these frameworks on natural (color) imagery, by combining semi-supervised learning and active learning, and display promising results for learning with limited amount of labeled data, which can be extended to hyperspectral imagery. In this dissertation, we curated two new datasets for hyperspectral image analysis, significantly larger than existing datasets and broader in terms of categories for classification. We then adapt existing neural network architectures to function on the increased channel information, in a smart manner, to leverage all hyperspectral information. We also develop novel active learning algorithms on natural (color) imagery, and discuss the hope for expanding their functionality to hyperspectral imagery

    Contribuciones a la estimación de la pose de la cámara en aplicaciones industriales de realidad aumentada

    Get PDF
    Augmented Reality (AR) aims to complement the visual perception of the user environment superimposing virtual elements. The main challenge of this technology is to combine the virtual and real world in a precise and natural way. To carry out this goal, estimating the user position and orientation in both worlds at all times is a crucial task. Currently, there are numerous techniques and algorithms developed for camera pose estimation. However, the use of synthetic square markers has become the fastest, most robust and simplest solution in these cases. In this scope, a big number of marker detection systems have been developed. Nevertheless, most of them presents some limitations, (1) their unattractive and non-customizable visual appearance prevent their use in industrial products and (2) the detection rate is drastically reduced in presence of noise, blurring and occlusions. In this doctoral dissertation the above-mentioned limitations are addressed. In first place, a comparison has been made between the different marker detection systems currently available in the literature, emphasizing the limitations of each. Secondly, a novel approach to design, detect and track customized markers capable of easily adapting to the visual limitations of commercial products has been developed. In third place, a method that combines the detection of black and white square markers with keypoints and contours has been implemented to estimate the camera position in AR applications. The main motivation of this work is to offer a versatile alternative (based on contours and keypoints) in cases where, due to noise, blurring or occlusions, it is not possible to identify markers in the images. Finally, a method for reconstruction and semantic segmentation of 3D objects using square markers in photogrammetry processes has been presented.La Realidad Aumentada (AR) tiene como objetivo complementar la percepción visual del entorno circunstante al usuario mediante la superposición de elementos virtuales. El principal reto de dicha tecnología se basa en fusionar, de forma precisa y natural, el mundo virtual con el mundo real. Para llevar a cabo dicha tarea, es de vital importancia conocer en todo momento tanto la posición, así como la orientación del usuario en ambos mundos. Actualmente, existen un gran número de técnicas de estimación de pose. No obstante, el uso de marcadores sintéticos cuadrados se ha convertido en la solución más rápida, robusta y sencilla utilizada en estos casos. En este ámbito de estudio, existen un gran número de sistemas de detección de marcadores ampliamente extendidos. Sin embargo, su uso presenta ciertas limitaciones, (1) su aspecto visual, poco atractivo y nada customizable impiden su uso en ciertos productos industriales en donde la personalización comercial es un aspecto crucial y (2) la tasa de detección se ve duramente decrementada ante la presencia de ruido, desenfoques y oclusiones Esta tesis doctoral se ocupa de las limitaciones anteriormente mencionadas. En primer lugar, se ha realizado una comparativa entre los diferentes sistemas de detección de marcadores actualmente en uso, enfatizando las limitaciones de cada uno. En segundo lugar, se ha desarrollado un novedoso enfoque para diseñar, detectar y trackear marcadores personalizados capaces de adaptarse fácilmente a las limitaciones visuales de productos comerciales. En tercer lugar, se ha implementado un método que combina la detección de marcadores cuadrados blancos y negros con keypoints y contornos, para estimar de la posición de la cámara en aplicaciones AR. La principal motivación de este trabajo se basa en ofrecer una alternativa versátil (basada en contornos y keypoints) en aquellos casos donde, por motivos de ruido, desenfoques u oclusiones no sea posible identificar marcadores en las imágenes. Por último, se ha desarrollado un método de reconstrucción y segmentación semántica de objetos 3D utilizando marcadores cuadrados en procesos de fotogrametría
    • …
    corecore