
    Spatiotemporal Data Augmentation of MODIS-LANDSAT Water Bodies Using Generative Adversarial Networks

    Monitoring the shape and area of a water body is an essential component of many Earth science and hydrological applications, which require remote sensing data that supports accurate analysis of water bodies. This thesis addresses that need in two parts. First, a model is built that translates imagery from a satellite with 500 m spatial resolution (MODIS) to the 30 m resolution of another satellite (Landsat). To achieve this, we collected data from both satellites and translated one to the other using our proposed Hydro-GAN model, recovering accurate shape, boundary, and area for each water body; the method is evaluated using several similarity metrics for area and shape. Second, we augment the original data with the outputs of Hydro-GAN and use the enriched dataset to forecast the future area of a water body, with the Great Salt Lake as a case study. The results indicate that the proposed model reproduces the area and shape of water bodies accurately, and that generating data at 30 m resolution improves areal and shape accuracy. With more data at this resolution, such translations could support better prediction of coastlines and boundaries, as well as erosion monitoring.
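
    The abstract does not specify Hydro-GAN's internals, so the following is a minimal, hypothetical sketch of the kind of paired image-to-image GAN (pix2pix-style) commonly used for this sort of MODIS-to-Landsat translation; the layer sizes, losses, and names are illustrative assumptions, not the published model.

        # Hypothetical pix2pix-style training step for MODIS -> Landsat
        # translation; assumes MODIS patches are already resampled onto
        # the 30 m Landsat grid, so the network only adds spatial detail.
        import torch
        import torch.nn as nn

        class Generator(nn.Module):
            def __init__(self, bands=1):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv2d(bands, 64, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(64, bands, 3, padding=1), nn.Tanh(),
                )
            def forward(self, x):
                return self.net(x)

        class Discriminator(nn.Module):
            # PatchGAN-style critic on concatenated (input, candidate) pairs.
            def __init__(self, bands=1):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv2d(2 * bands, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                    nn.Conv2d(64, 1, 4, stride=2, padding=1),
                )
            def forward(self, modis, landsat):
                return self.net(torch.cat([modis, landsat], dim=1))

        G, D = Generator(), Discriminator()
        opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
        opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
        bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

        def train_step(modis, landsat):
            # Discriminator: real Landsat pairs -> 1, generated pairs -> 0.
            fake = G(modis)
            d_real, d_fake = D(modis, landsat), D(modis, fake.detach())
            loss_d = (bce(d_real, torch.ones_like(d_real))
                      + bce(d_fake, torch.zeros_like(d_fake)))
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()
            # Generator: fool D while staying close to the reference (L1),
            # which is what keeps shapes and boundaries faithful.
            d_fake = D(modis, fake)
            loss_g = bce(d_fake, torch.ones_like(d_fake)) + 100.0 * l1(fake, landsat)
            opt_g.zero_grad(); loss_g.backward(); opt_g.step()
            return loss_d.item(), loss_g.item()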

    Exploiting Deep Features for Remote Sensing Image Retrieval: A Systematic Investigation

    Remote sensing (RS) image retrieval is of great significance for geological information mining. Over the past two decades, a large amount of research on this task has been carried out, mainly focused on three core issues: feature extraction, similarity metrics, and relevance feedback. Due to the complexity and diversity of ground objects in high-resolution remote sensing (HRRS) images, there is still room for improvement in current retrieval approaches. In this paper, we analyze the three core issues of RS image retrieval and provide a comprehensive review of existing methods. Furthermore, with the goal of advancing the state of the art in HRRS image retrieval, we focus on feature extraction and investigate how powerful deep representations can be used for this task. We systematically evaluate the factors that may affect the performance of deep features; by optimizing each factor, we obtain remarkable retrieval results on publicly available HRRS datasets. Finally, we explain the experimental findings in detail and draw conclusions from our analysis. Our work can serve as a guide for research on content-based RS image retrieval.
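
    As a concrete illustration of deep features for retrieval, here is a minimal baseline sketch: an ImageNet-pretrained CNN embeds each image, and retrieval ranks the gallery by cosine similarity. The paper studies which factors (layer choice, aggregation, fine-tuning) matter; the backbone and settings below are assumptions, not the paper's configuration.

        # Baseline: pretrained ResNet-50 features + cosine-similarity ranking.
        import torch
        import torchvision.models as models
        import torchvision.transforms as T
        from PIL import Image

        backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
        backbone.fc = torch.nn.Identity()       # keep the 2048-d pooled feature
        backbone.eval()

        preprocess = T.Compose([
            T.Resize((224, 224)),
            T.ToTensor(),
            T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
        ])

        @torch.no_grad()
        def embed(path):
            x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
            f = backbone(x).squeeze(0)
            return f / f.norm()                 # unit length: dot product = cosine

        def rank(query_path, gallery_paths):
            q = embed(query_path)
            gallery = torch.stack([embed(p) for p in gallery_paths])
            scores = gallery @ q
            order = scores.argsort(descending=True)
            return [gallery_paths[i] for i in order]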

    TempEE: Temporal-Spatial Parallel Transformer for Radar Echo Extrapolation Beyond Auto-Regression

    Meteorological radar reflectivity data (i.e., radar echoes) strongly influence precipitation prediction and can support accurate, rapid forecasting of short-term heavy rainfall without complex Numerical Weather Prediction (NWP) models. Compared with conventional models, Deep Learning (DL)-based radar echo extrapolation algorithms are more effective and efficient. Nevertheless, the development of reliable, generalizable echo extrapolation algorithms is impeded by three primary challenges: cumulative error spreading, imprecise representation of sparsely distributed echoes, and inaccurate description of non-stationary motion processes. To tackle these challenges, this paper proposes a novel radar echo extrapolation algorithm, the Temporal-Spatial Parallel Transformer (TempEE). TempEE avoids auto-regression and instead employs a one-step forward strategy to prevent cumulative errors from spreading during extrapolation. Additionally, a Multi-level Temporal-Spatial Attention mechanism improves the algorithm's ability to capture both global and local information while emphasizing task-related regions, including sparse echo representations, in an efficient manner. Furthermore, the algorithm extracts spatio-temporal representations from continuous echo images with a parallel encoder to model the non-stationary motion process for extrapolation. The superiority of TempEE is demonstrated on the classic radar echo extrapolation task using a real-world dataset, and extensive experiments further validate the efficacy and indispensability of its components. Accepted by IEEE Transactions on Geoscience and Remote Sensing; see https://ieeexplore.ieee.org/document/1023874
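
    The contrast between auto-regression and the one-step forward strategy is easy to show in code. The model below is a stand-in, not the TempEE architecture; only the input/output contract (T past frames in, K future frames out, in one pass) reflects the paper's idea.

        import torch
        import torch.nn as nn

        class OneStepExtrapolator(nn.Module):
            """Maps T past echo frames to all K future frames at once, so a
            predicted frame is never fed back in and errors cannot accumulate."""
            def __init__(self, t_in=10, t_out=10):
                super().__init__()
                # Treat time as channels: (B, T, H, W) -> (B, K, H, W).
                self.net = nn.Sequential(
                    nn.Conv2d(t_in, 64, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(64, t_out, 3, padding=1),
                )
            def forward(self, frames):
                return self.net(frames)

        def autoregressive_rollout(one_frame_model, frames, k):
            """The scheme TempEE avoids: each predicted frame re-enters the
            input window, so its error contaminates every later step."""
            out, window = [], frames
            for _ in range(k):
                nxt = one_frame_model(window)             # (B, 1, H, W)
                out.append(nxt)
                window = torch.cat([window[:, 1:], nxt], dim=1)
            return torch.cat(out, dim=1)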

    Air pollution prediction with multi-modal data and deep neural networks

    Air pollution is a growing and serious environmental problem, especially in urban areas with rising migration rates. The wide availability of sensor data enables analytical tools that provide decision support. Sensors facilitate air pollution monitoring, but the lack of predictive capability limits such systems' usefulness in practice. Forecasting methods, on the other hand, can predict future pollution in specific areas and thereby suggest useful preventive measures. Many works have tackled air pollution forecasting, most of them based on sequence models trained on raw pollution data and then used to make predictions. This paper proposes a novel approach, evaluating four different architectures that estimate air pollution from camera images; the images are further enriched with weather data to boost classification accuracy. The approach exploits generative adversarial networks combined with data augmentation techniques to mitigate the class imbalance problem. The experiments show that the proposed method achieves a robust accuracy of up to 0.88, comparable to sequence models and conventional models that use air pollution data directly. This is a remarkable result considering that historic air pollution data is directly related to the output (future air pollution data), whereas the proposed architecture must recognize air pollution from camera images, an inherently much harder problem.
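
    The paper evaluates four architectures; as a hedged illustration of the general idea only, the sketch below fuses CNN features from a camera image with a small weather vector before the classification head. The backbone, feature sizes, and the five weather variables are illustrative assumptions.

        import torch
        import torch.nn as nn
        import torchvision.models as models

        class ImageWeatherClassifier(nn.Module):
            def __init__(self, n_weather=5, n_classes=6):
                super().__init__()
                base = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
                base.fc = nn.Identity()                # 512-d image embedding
                self.cnn = base
                self.head = nn.Sequential(
                    nn.Linear(512 + n_weather, 128), nn.ReLU(),
                    nn.Dropout(0.3),
                    nn.Linear(128, n_classes),         # e.g. AQI categories
                )
            def forward(self, image, weather):
                f = self.cnn(image)                    # (B, 512)
                return self.head(torch.cat([f, weather], dim=1))

        model = ImageWeatherClassifier()
        logits = model(torch.randn(2, 3, 224, 224), torch.randn(2, 5))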

    Segmentation and Classification of Multimodal Imagery

    Segmentation and classification are two important computer vision tasks that transform input data into a compact representation allowing fast and efficient analysis. Several challenges arise in generating accurate segmentation or classification results. In a video, for example, objects often change their appearance and are partially occluded, making it difficult to delineate an object from its surroundings. This thesis proposes video segmentation and aerial image classification algorithms that address some of these problems and provide accurate results.

    We developed a gradient-driven three-dimensional segmentation technique that partitions a video into spatiotemporal objects. The algorithm uses the local gradient computed at each pixel together with a global boundary map obtained through deep learning to generate initial pixel groups by traversing from low- to high-gradient regions. A local clustering method then refines these initial groups. Refined sub-volumes in homogeneous regions of the video are selected as initial seeds and iteratively combined with adjacent groups based on intensity similarity; volume growth terminates at the color boundaries of the video. The resulting over-segments are merged hierarchically by a multivariate approach, yielding a final segmentation map for each frame. We also implemented a streaming version of the algorithm that requires less computational memory. The results show that our methodology compares favorably, both qualitatively and quantitatively, with the latest state-of-the-art techniques in segmentation quality and computational efficiency.

    We also developed a convolutional neural network (CNN)-based method to efficiently combine information from multisensor remotely sensed images for pixel-wise semantic classification. CNN features from multiple spectral bands are fused at the initial layers of the network rather than at the final layers. This early-fusion architecture has fewer parameters and thereby reduces computation time and GPU memory during training and inference. We also introduce a composite architecture that fuses features throughout the network. The methods were validated on four datasets: ISPRS Potsdam, ISPRS Vaihingen, IEEE Zeebruges, and a combined Sentinel-1/Sentinel-2 dataset, for which ground-truth labels for three classes were obtained from OpenStreetMap. Across all images, early fusion, specifically after the third layer of the network, achieves results similar to or better than decision-level fusion, and the performance of the proposed architecture is on par with the state of the art.
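
    The early-fusion idea from the second part of the thesis can be sketched in a few lines: bands from different sensors are stacked channel-wise at the input and fused inside the first convolutional layers, instead of training per-sensor networks and merging their decisions at the end. Channel counts and depths here are illustrative, not the thesis architecture.

        import torch
        import torch.nn as nn

        class EarlyFusionSegNet(nn.Module):
            def __init__(self, sar_bands=2, optical_bands=10, n_classes=3):
                super().__init__()
                self.fuse = nn.Sequential(      # fusion happens early, at the input
                    nn.Conv2d(sar_bands + optical_bands, 64, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
                )
                self.classify = nn.Conv2d(64, n_classes, 1)   # per-pixel labels
            def forward(self, sar, optical):
                x = torch.cat([sar, optical], dim=1)  # stack sensors channel-wise
                return self.classify(self.fuse(x))

        # A decision-level baseline would instead run one network per sensor
        # and average class scores; early fusion shares all parameters and is
        # therefore cheaper to train and run.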

    Comparison of Five Spatio-Temporal Satellite Image Fusion Models over Landscapes with Various Spatial Heterogeneity and Temporal Variation

    In recent years, many spatial and temporal satellite image fusion (STIF) methods have been developed to address the trade-off between the spatial and temporal resolution of satellite sensors. This study, for the first time, conducts both scene-level and local-level comparisons of five state-of-the-art STIF methods from four categories over landscapes with varying spatial heterogeneity and temporal variation. The five methods are the spatial and temporal adaptive reflectance fusion model (STARFM) and the Fit-FC model from the weight-function-based category, an unmixing-based data fusion (UBDF) method from the unmixing-based category, the one-pair learning method from the learning-based category, and the Flexible Spatiotemporal DAta Fusion (FSDAF) method from the hybrid category. The relationships between the methods' performance and scene-level and local-level landscape heterogeneity index (LHI) and temporal variation index (TVI) were analyzed. Our results show that (1) FSDAF was the most robust to variations in LHI and TVI at both scene and local level, though less computationally efficient than all models except one-pair learning; (2) Fit-FC had the highest computational efficiency and was accurate in predicting reflectance, but less accurate than FSDAF and one-pair learning in capturing image structure; (3) one-pair learning had advantages in predicting large-area land cover change and preserving image structure, but was the least computationally efficient; (4) STARFM was good at predicting phenological change but unsuitable for land cover type change; and (5) UBDF is not recommended for cases with strong or abrupt temporal change. These findings can guide users in selecting an appropriate STIF method for their applications.
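
    For intuition about the weight-function-based category (STARFM, Fit-FC), the sketch below shows only the core temporal-change transfer that those methods build on; real STARFM adds spectral, temporal, and distance weighting over similar neighboring pixels, which is omitted here as a simplifying assumption.

        import numpy as np
        from scipy.ndimage import uniform_filter

        def naive_stif(fine_t1, coarse_t1, coarse_t2):
            """Predict the fine image at t2 from a fine/coarse pair at t1 and
            a coarse image at t2 (all arrays co-registered on one grid)."""
            return fine_t1 + (coarse_t2 - coarse_t1)   # temporal-change transfer

        def windowed_stif(fine_t1, coarse_t1, coarse_t2, k=5):
            """Crude stand-in for STARFM's neighborhood weighting: smooth the
            coarse temporal change over a k x k window before transferring."""
            delta = uniform_filter(coarse_t2 - coarse_t1, size=k)
            return fine_t1 + delta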

    Spatiotemporal Fusion of Land Surface Temperature Based on a Convolutional Neural Network

    Due to the trade-off between spatial and temporal resolution commonly encountered in remote sensing, no single satellite sensor can provide fine spatial resolution land surface temperature (LST) products with frequent coverage. This greatly limits applications that require LST data with fine spatiotemporal resolution. Here, a deep learning-based spatiotemporal temperature fusion network (STTFN) for generating fine spatiotemporal resolution LST products is proposed. In STTFN, a multiscale fusion convolutional neural network builds the complex nonlinear relationship between input and output LSTs. Thus, unlike other LST spatiotemporal fusion approaches, STTFN can learn potentially complicated relationships from training data without manually designed mathematical rules, making it more flexible and intelligent than other methods. In addition, two target fine spatial resolution LST images are predicted and then integrated by a spatiotemporal-consistency (STC) weighting function to exploit the STC of LST data. Analyses using two real LST datasets obtained from Landsat and the Moderate Resolution Imaging Spectroradiometer (MODIS) were undertaken to evaluate the ability of STTFN to generate fine spatiotemporal resolution LST products. The results show that, compared with three classic fusion methods [the enhanced spatial and temporal adaptive reflectance fusion model (ESTARFM), the spatiotemporal integrated temperature fusion model (STITFM), and the two-stream convolutional neural network for spatiotemporal image fusion (StfNet)], the proposed network produced the most accurate outputs [average root mean square error (RMSE) 0.971].
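
    The abstract does not give the exact STC weighting function, so the following is one plausible form, stated purely as an assumption: each of the two predicted fine LST maps is weighted per pixel by the inverse of its consistency error, and the weighted maps are blended.

        import numpy as np

        def stc_combine(pred_a, pred_b, err_a, err_b, eps=1e-6):
            """pred_a, pred_b: fine LST maps predicted from two base dates.
            err_a, err_b: per-pixel consistency errors of each prediction
            (e.g. against the coarse observation at the target date)."""
            w_a = 1.0 / (np.abs(err_a) + eps)
            w_b = 1.0 / (np.abs(err_b) + eps)
            return (w_a * pred_a + w_b * pred_b) / (w_a + w_b)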