
    Aggregated Deep Local Features for Remote Sensing Image Retrieval

    Remote sensing image retrieval remains a challenging topic due to the special nature of remote sensing imagery. Such images contain many different semantic objects, which complicates the retrieval task. In this paper, we present an image retrieval pipeline that uses attentive, local convolutional features and aggregates them using the Vector of Locally Aggregated Descriptors (VLAD) to produce a global descriptor. We study various system parameters, such as the multiplicative and additive attention mechanisms and the descriptor dimensionality. We propose a query expansion method that requires no external inputs. Experiments demonstrate that, even without training, the local convolutional features and global representation outperform other systems. After system tuning, we achieve state-of-the-art or competitive results. Furthermore, we observe that our query expansion method increases overall system performance by about 3%, using only the top three retrieved images. Finally, we show how dimensionality reduction produces compact descriptors with increased retrieval performance and fast retrieval computation times, e.g. 50% faster than current systems. Comment: Published in Remote Sensing. The first two authors have equal contribution.
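    The core aggregation step described above can be sketched in a few lines. The following is a minimal NumPy sketch of VLAD, not the authors' implementation: each local descriptor is assigned to its nearest visual word, residuals to that word are accumulated, and the result is flattened and normalized (power-law plus L2 normalization, a common convention) into one global descriptor.

    ```python
    import numpy as np

    def vlad(descriptors: np.ndarray, centers: np.ndarray) -> np.ndarray:
        """Aggregate (N, D) local descriptors over (K, D) visual words
        into a single L2-normalized (K*D,) VLAD descriptor."""
        # Assign each descriptor to its nearest cluster center.
        dists = np.linalg.norm(descriptors[:, None, :] - centers[None, :, :], axis=2)
        assignments = dists.argmin(axis=1)

        K, D = centers.shape
        v = np.zeros((K, D))
        for k in range(K):
            members = descriptors[assignments == k]
            if len(members):
                # Accumulate residuals to the assigned center.
                v[k] = (members - centers[k]).sum(axis=0)

        v = v.ravel()
        # Signed square-root (power) normalization, then L2 normalization.
        v = np.sign(v) * np.sqrt(np.abs(v))
        norm = np.linalg.norm(v)
        return v / norm if norm > 0 else v
    ```

    With K visual words and D-dimensional local features, the global descriptor has K*D dimensions, which is why the paper then studies dimensionality reduction.
    
    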

    Deep learning in remote sensing: a review

    Standing at the paradigm shift towards data-intensive science, machine learning techniques are becoming increasingly important. In particular, deep learning, a major breakthrough in the field, has proven to be an extremely powerful tool in many areas. Shall we embrace deep learning as the key to everything? Or should we resist a 'black-box' solution? There are controversial opinions in the remote sensing community. In this article, we analyze the challenges of using deep learning for remote sensing data analysis, review the recent advances, and provide resources to make deep learning in remote sensing ridiculously simple to start with. More importantly, we advocate that remote sensing scientists bring their expertise into deep learning and use it as an implicit general model to tackle unprecedented, large-scale, influential challenges such as climate change and urbanization. Comment: Accepted for publication in IEEE Geoscience and Remote Sensing Magazine.

    Ship Multimodel 3D Reconstruction and Corrosion Detection

    3D reconstruction has been an area of increased interest due to the current higher demand in applications such as virtual reality, 3D mapping, medical imaging, and many others. However, there are still many problems associated with reconstructing a real-life object, such as capturing occluded zones, noise, and processing time. Furthermore, as deep learning technologies advance, there has been a growing interest in using such methods to replace human-driven tasks, namely corrosion inspection, as it decreases the risk of injury to the inspector, is more efficient due to the shorter time taken, and saves costs. This dissertation proposes a method for reconstructing a 3D model of ships using aerial RGB images and terrestrial RGB-D images, along with a system capable of detecting the corroded parts of the ship and highlighting them in the model. Using two different sensors in two different ground planes mitigates some of the occlusion problems and increases the final model's accuracy. The dissertation also aims to pick the methods that offer the best trade-off between accuracy and computational speed. The final model can be advantageous for corrosion inspectors: they will have the model of the ship as well as the corroded zones and, with that information, can choose the next steps without manually inspecting the ship or even being at the same site as the ship. The final model is a fusion of three different 3D models. The model obtained from RGB images exploits the Structure from Motion (SfM) algorithm, which recovers the 3D shape of the ship from 2D images. For the remaining models, RGB-D images were used in conjunction with the Open3D library to create 3D structures from both sides of the ship. The corrosion classifier model was trained in Google Colab and achieved an accuracy of 97.44% on the test dataset. The images used to create the SfM 3D model were each divided into a total of 40 regions and fed into the classifier to simulate a coarse image detection algorithm instead of an image classification algorithm. The results were encoded into the 3D model, highlighting the corroded zones.
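    The per-region classification step above can be illustrated with a short sketch. The dissertation does not specify the exact 40-region grid layout, so a hypothetical 5 × 8 grid is assumed here; each tile would then be passed to the corrosion classifier independently.

    ```python
    import numpy as np

    def split_into_tiles(image: np.ndarray, rows: int = 5, cols: int = 8):
        """Split an H x W x C image into rows*cols tiles (40 by default).

        The 5 x 8 layout is an assumption for illustration; any grid that
        yields 40 regions would serve the same purpose.
        """
        h, w = image.shape[:2]
        th, tw = h // rows, w // cols
        tiles = []
        for r in range(rows):
            for c in range(cols):
                # Crop one tile; remainder pixels at the edges are dropped.
                tiles.append(image[r * th:(r + 1) * th, c * tw:(c + 1) * tw])
        return tiles
    ```

    Classifying tiles rather than whole images lets a plain image classifier localize corrosion coarsely, which is what "a less precise detection algorithm" refers to.
    
    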

    Intelligent Automation System for Vessels Recognition: Comparison of SIFT and SURF Methods

    Nowadays, with the rise of drone and satellite technology, there is a possibility for its application in sea and coastal surveillance. An advantage of this type of application is the automated recognition of marine objects, among which the most important are vessels. This paper presents a principle of vessel recognition based on the extraction of satellite image features of the vessel and the application of a multilayer perceptron (MLP). The dataset used in this research contains a total of 2750 images, of which 2112 are used as the training set and the remaining 638 for testing purposes. The SIFT and SURF algorithms were used to extract image features, which were later used as the input vector for the MLP. The best results are achieved with a model with four hidden layers. These layers contain 32, 128, 32, and 128 neurons, respectively, each with the ReLU activation function. Regarding feature extraction, better results are achieved with the SIFT algorithm. The ROC AUC value achieved with the combination of SIFT and MLP reaches 0.99.
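    The classifier architecture named above is easy to reproduce. The sketch below builds the same 32-128-32-128 ReLU MLP with scikit-learn; the feature vectors are synthetic stand-ins, since the paper does not detail how variable-length SIFT/SURF descriptor sets are reduced to a fixed-length input vector.

    ```python
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    # Hypothetical fixed-length feature vectors standing in for aggregated
    # SIFT/SURF descriptors ("vessel" vs "non-vessel" classes).
    rng = np.random.default_rng(42)
    X_pos = rng.normal(loc=1.0, size=(200, 64))
    X_neg = rng.normal(loc=-1.0, size=(200, 64))
    X = np.vstack([X_pos, X_neg])
    y = np.array([1] * 200 + [0] * 200)

    # Four hidden layers of 32, 128, 32, and 128 neurons with ReLU,
    # matching the configuration reported in the paper.
    clf = MLPClassifier(hidden_layer_sizes=(32, 128, 32, 128),
                        activation="relu", max_iter=500, random_state=0)
    clf.fit(X, y)
    acc = clf.score(X, y)
    ```

    In practice the real inputs would come from `cv2.SIFT_create()` (or SURF) followed by some fixed-length aggregation, and the score would be reported on the held-out 638-image test set rather than on the training data.
    
    
    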

    Remote Sensing Image Scene Classification: Benchmark and State of the Art

    Remote sensing image scene classification plays an important role in a wide range of applications and hence has been receiving remarkable attention. During the past years, significant efforts have been made to develop various datasets and to present a variety of approaches for scene classification from remote sensing images. However, a systematic review of the literature concerning datasets and methods for scene classification is still lacking. In addition, almost all existing datasets have a number of limitations, including the small number of scene classes and images, the lack of image variation and diversity, and the saturation of accuracy. These limitations severely restrict the development of new approaches, especially deep learning-based methods. This paper first provides a comprehensive review of the recent progress. Then, we propose a large-scale dataset, termed "NWPU-RESISC45", which is a publicly available benchmark for REmote Sensing Image Scene Classification (RESISC), created by Northwestern Polytechnical University (NWPU). This dataset contains 31,500 images, covering 45 scene classes with 700 images in each class. The proposed NWPU-RESISC45 (i) is large-scale in the number of scene classes and the total number of images, (ii) holds big variations in translation, spatial resolution, viewpoint, object pose, illumination, background, and occlusion, and (iii) has high within-class diversity and between-class similarity. The creation of this dataset will enable the community to develop and evaluate various data-driven algorithms. Finally, several representative methods are evaluated using the proposed dataset and the results are reported as a useful baseline for future research. Comment: This manuscript is the accepted version for Proceedings of the IEEE.

    Using Prior Knowledge for Verification and Elimination of Stationary and Variable Objects in Real-time Images

    With the evolving technologies in the autonomous vehicle industry, it has now become possible for automobile passengers to sit relaxed instead of driving the car. Technologies like object detection, object identification, and image segmentation have enabled an autonomous car to identify and detect objects on the road in order to drive safely. While an autonomous car drives by itself on the road, the types of objects surrounding the car can be dynamic (e.g., cars and pedestrians), stationary (e.g., buildings and benches), or variable (e.g., trees), depending on whether the location or shape of the object changes. Different from existing image-based approaches to detecting and recognizing objects in the scene, this research employs a 3D virtual world to verify and eliminate stationary and variable objects, allowing the autonomous car to focus on dynamic objects that may endanger its driving. This methodology takes advantage of prior knowledge of stationary and variable objects present in a virtual city and verifies their existence in a real-time scene by matching keypoints between the virtual and real objects. When a stationary or variable object does not exist in the virtual world due to incomplete pre-existing information, the method falls back on machine learning for object detection. Verified objects are then removed from the real-time image with a combined algorithm using contour detection and class activation maps (CAM), which helps to enhance the efficiency and accuracy of recognizing moving objects.
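    The keypoint-matching step between virtual and real objects can be sketched with a standard heuristic. The thesis does not specify its matcher, so the sketch below assumes Lowe's ratio test over descriptor distances (a common choice for SIFT-style descriptors), implemented in plain NumPy.

    ```python
    import numpy as np

    def ratio_test_matches(desc_virtual, desc_real, ratio=0.75):
        """Match descriptors of a virtual-world object against descriptors
        extracted from the real-time image using Lowe's ratio test.
        Returns (index_in_virtual, index_in_real) pairs that pass the test."""
        matches = []
        for i, d in enumerate(desc_virtual):
            # Distances from this virtual descriptor to all real descriptors.
            dists = np.linalg.norm(desc_real - d, axis=1)
            order = np.argsort(dists)
            best, second = order[0], order[1]
            # Accept only if the best match is clearly better than the runner-up.
            if dists[best] < ratio * dists[second]:
                matches.append((i, best))
        return matches
    ```

    Enough passing matches would count as "verifying" that the stationary object from the virtual city is present in the scene, after which it can be masked out of the image.
    
    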

    Fourier-based Rotation-invariant Feature Boosting: An Efficient Framework for Geospatial Object Detection

    Geospatial object detection in remote sensing imagery has been attracting increasing interest in recent years, due to the rapid development of spaceborne imaging. Most previously proposed object detectors are very sensitive to object deformations, such as scaling and rotation. To this end, we propose a novel and efficient framework for geospatial object detection in this letter, called Fourier-based rotation-invariant feature boosting (FRIFB). A Fourier-based rotation-invariant feature is first generated in polar coordinates. The extracted features are then structurally refined using aggregate channel features. This leads to faster feature computation and more robust feature representation, which is well suited to the subsequent boosting-based learning. Finally, in the test phase, we achieve fast pyramid feature extraction by estimating a scale factor instead of directly collecting all features from the image pyramid. Extensive experiments are conducted on two subsets of the NWPU VHR-10 dataset, demonstrating the superiority and effectiveness of FRIFB compared to previous state-of-the-art methods.
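    The idea behind a Fourier-based rotation-invariant feature in polar coordinates can be shown with a toy example. Rotating an object circularly shifts the signal sampled around a ring centered on it, and the magnitude of the DFT is invariant to circular shifts. The sketch below is a simplified stand-in for this principle, not the authors' FRIFB code.

    ```python
    import numpy as np

    def angular_fourier_signature(samples: np.ndarray) -> np.ndarray:
        """Rotation-invariant descriptor from a signal sampled at evenly
        spaced angles around a circle in polar coordinates.

        A rotation of the underlying object circularly shifts `samples`,
        changing only the phase of its DFT; the magnitude is unchanged.
        """
        return np.abs(np.fft.fft(samples))
    ```

    In the full framework, such per-ring signatures would be computed over many radii and feature channels and then refined with aggregate channel features before boosting.
    
    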