
    LR-CNN: Local-aware Region CNN for Vehicle Detection in Aerial Imagery

    State-of-the-art object detection approaches such as Fast/Faster R-CNN, SSD, or YOLO have difficulties detecting dense, small targets with arbitrary orientation in large aerial images. The main reason is that using interpolation to align RoI features can result in a lack of accuracy or even loss of location information. We present the Local-aware Region Convolutional Neural Network (LR-CNN), a novel two-stage approach for vehicle detection in aerial imagery. We enhance translation invariance to detect dense vehicles and address the boundary quantization issue amongst dense vehicles by aggregating the high-precision RoIs' features. Moreover, we resample high-level semantic pooled features, making them regain location information from the features of a shallower convolutional block. This strengthens the local feature invariance for the resampled features and enables detecting vehicles in an arbitrary orientation. The local feature invariance enhances the learning ability of the focal loss function, and the focal loss further helps to focus on the hard examples. Taken together, our method better addresses the challenges of aerial imagery. We evaluate our approach on several challenging datasets (VEDAI, DOTA), demonstrating a significant improvement over state-of-the-art methods. We demonstrate the good generalization ability of our approach on the DLR 3K dataset. Comment: 8 pages
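The abstract relies on the focal loss to concentrate training on hard examples. As a rough illustration only, the sketch below implements the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function name and the defaults alpha = 0.25 and gamma = 2 are assumptions taken from common practice, not from the LR-CNN abstract, which does not state its settings.

```python
import numpy as np

def binary_focal_loss(probs, targets, alpha=0.25, gamma=2.0, eps=1e-7):
    """Per-example binary focal loss (hypothetical helper, not LR-CNN code).

    probs:   predicted probability of the positive class, in [0, 1]
    targets: ground-truth labels, 1 for positive and 0 for negative
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    targets = np.asarray(targets, dtype=float)
    # p_t is the probability the model assigns to the true class.
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    # alpha_t balances positives against negatives.
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# A confidently correct detection versus a poorly scored one.
easy = binary_focal_loss([0.95], [1])[0]
hard = binary_focal_loss([0.30], [1])[0]
print(easy < hard)
```

With gamma set to 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is why larger gamma values shift the total loss toward hard examples.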

    Impact of Feature Representation on Remote Sensing Image Retrieval

    Remote sensing images are acquired using special platforms and sensors and are classified as aerial, multispectral, or hyperspectral images. Multispectral and hyperspectral images are represented by much larger spectral vectors than ordinary Red, Green, Blue (RGB) images. Hence, retrieving remote sensing images from large archives is a challenging task. Remote sensing image retrieval consists mainly of two steps: feature representation, followed by finding images similar to a query image. Feature representation plays an important part in the performance of the retrieval process. This work focuses on the impact of feature representation on the performance of remote sensing image retrieval. The study shows that more discriminative features of remote sensing images are needed to improve retrieval performance.
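The two retrieval steps named in the abstract (feature representation, then similarity search against a query) can be sketched as follows. Everything here is an assumption for illustration: the per-band histogram feature, the cosine similarity measure, and the function names are not taken from the study, which does not specify them in the abstract.

```python
import numpy as np

def band_histogram_features(image, bins=8):
    """Step 1: describe an (H, W, B) image, values in [0, 1], by per-band histograms."""
    feats = []
    for b in range(image.shape[2]):
        hist, _ = np.histogram(image[:, :, b], bins=bins, range=(0.0, 1.0))
        feats.append(hist / hist.sum())  # normalise so image size does not matter
    return np.concatenate(feats)

def rank_by_cosine(query_feat, archive_feats):
    """Step 2: return archive indices ordered from most to least similar."""
    q = query_feat / np.linalg.norm(query_feat)
    a = archive_feats / np.linalg.norm(archive_feats, axis=1, keepdims=True)
    return np.argsort(-(a @ q))

# Synthetic 4-band archive; each image gets a different brightness distribution.
rng = np.random.default_rng(0)
archive = [rng.random((16, 16, 4)) ** (i + 1) for i in range(5)]
archive_feats = np.stack([band_histogram_features(img) for img in archive])

# Querying with a copy of archive image 3 should rank that image first.
ranking = rank_by_cosine(band_histogram_features(archive[3]), archive_feats)
print(ranking[0])
```

A more discriminative feature extractor would replace only `band_histogram_features`; the ranking step stays the same, which is one way to isolate the effect of the representation on retrieval quality.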

    Knowledge-Based Classification of Grassland Ecosystem Based on Multi-Temporal WorldView-2 Data and FAO-LCCS Taxonomy

    Grassland ecosystems can provide a variety of services for humans, such as carbon storage, food production, crop pollination and pest regulation. However, grasslands are today one of the most endangered ecosystems due to land use change, agricultural intensification, land abandonment as well as climate change. The present study explores the performance of a knowledge-driven GEOgraphic-Object-Based Image Analysis (GEOBIA) learning scheme to classify Very High Resolution (VHR) images for natural grassland ecosystem mapping. The classification was applied to a Natura 2000 protected area in Southern Italy. The Food and Agricultural Organization Land Cover Classification System (FAO-LCCS) hierarchical scheme was instantiated in the learning phase of the algorithm. Four multi-temporal WorldView-2 (WV-2) images were classified by combining plant phenology and agricultural practices rules with prior-image spectral knowledge. Drawing on this knowledge, spectral bands and entropy features from one single date (Post Peak of Biomass) were first used for multiple-scale image segmentation into Small Objects (SO) and Large Objects (LO). Thereafter, SO were labelled by considering spectral and context-sensitive features from the whole multi-seasonal data set available together with ancillary data. Lastly, the labelled SO were overlaid on the LO segments and, in turn, the latter were labelled by adopting FAO-LCCS criteria about the SOs' presence dominance in each LO. Ground reference samples were used only for validating the SO and LO output maps. The knowledge-driven GEOBIA classifier for SO classification obtained an Overall Accuracy (OA) value of 97.35% with an error of 0.04. For LO classification the value was 75.09% with an error of 0.70. At SO scale, the grassland ecosystem was classified with 92.6%, 99.9% and 96.1% User's Accuracy, Producer's Accuracy and F1-score, respectively. The findings reported indicate that the knowledge-driven approach not only can be applied for (semi)natural grassland ecosystem mapping in vast and inaccessible areas but can also reduce the costs of ground truth data acquisition. The approach used may provide different levels of detail (small and large objects in the scene) and also indicates how to design and validate local conservation policies.
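The accuracy figures quoted in this abstract are standard confusion-matrix measures: User's Accuracy corresponds to precision, Producer's Accuracy to recall, and the F1-score is their harmonic mean. The sketch below shows how they are derived; the confusion matrix is made up for illustration and the function names are hypothetical. Only the final check uses numbers from the abstract.

```python
import numpy as np

def f1_score(users_acc, producers_acc):
    """Harmonic mean of User's Accuracy (precision) and Producer's Accuracy (recall)."""
    return 2 * users_acc * producers_acc / (users_acc + producers_acc)

def class_accuracies(confusion, cls):
    """confusion[i, j] counts samples of reference class i mapped to class j."""
    confusion = np.asarray(confusion, dtype=float)
    overall = np.trace(confusion) / confusion.sum()               # Overall Accuracy
    producers = confusion[cls, cls] / confusion[cls, :].sum()     # row-wise: omission errors
    users = confusion[cls, cls] / confusion[:, cls].sum()         # column-wise: commission errors
    return overall, users, producers, f1_score(users, producers)

# Made-up two-class matrix (rows: reference, columns: map), for illustration only.
overall, users, producers, f1 = class_accuracies([[50, 2], [5, 43]], cls=0)
print(overall, users, producers, f1)

# Consistency check on the reported grassland figures at SO scale.
reported_f1 = round(100 * f1_score(0.926, 0.999), 1)
print(reported_f1)
```

The reported 92.6% and 99.9% combine to an F1-score of about 96.1%, matching the value given in the abstract.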

    Multitemporal Relearning with Convolutional LSTM Models for Land Use Classification

    In this article, we present a novel hybrid framework, which integrates spatial–temporal semantic segmentation with postclassification relearning, for multitemporal land use and land cover (LULC) classification based on very high resolution (VHR) satellite imagery. To efficiently obtain optimal multitemporal LULC classification maps, the hybrid framework utilizes a spatial–temporal semantic segmentation model to harness temporal dependency for extracting high-level spatial–temporal features. In addition, the principle of postclassification relearning is adopted to efficiently optimize model output. Thereby, the initial outcome of a semantic segmentation model is provided to a subsequent model via an extended input space to guide the learning of discriminative feature representations in an end-to-end fashion. Last, object-based voting is coupled with postclassification relearning to cope with the high intraclass and low interclass variances. The framework was tested with two different postclassification relearning strategies (i.e., pixel-based relearning and object-based relearning) and three convolutional neural network models, i.e., UNet, a simple Convolutional LSTM, and a UNet Convolutional-LSTM. The experiments were conducted on two datasets with LULC labels that contain rich semantic information and varied building morphologic features (e.g., informal settlements). Each dataset contains four time steps from WorldView-2 and Quickbird imagery. The experimental results underline that the proposed framework is efficient in classifying complex LULC maps from multitemporal VHR images.
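The abstract mentions object-based voting without giving the rule. A minimal sketch, assuming plain majority voting of pixel predictions inside each image segment, is shown below. The function name, the tie-breaking behaviour (lowest class id wins), and the toy arrays are all assumptions, not details from the article.

```python
import numpy as np

def object_majority_vote(pixel_labels, segments):
    """Assign every pixel the most frequent predicted class within its segment.

    pixel_labels: 2-D array of non-negative integer class predictions
    segments:     2-D array of the same shape with integer segment ids
    """
    pixel_labels = np.asarray(pixel_labels)
    segments = np.asarray(segments)
    voted = np.empty_like(pixel_labels)
    for seg_id in np.unique(segments):
        mask = segments == seg_id
        counts = np.bincount(pixel_labels[mask])  # votes per class in this object
        voted[mask] = counts.argmax()             # ties resolve to the lowest class id
    return voted

# Toy 3x3 prediction map with two segments; stray pixels get overruled.
pixel_labels = np.array([[0, 0, 1],
                         [0, 1, 1],
                         [2, 1, 1]])
segments = np.array([[0, 0, 1],
                     [0, 0, 1],
                     [0, 1, 1]])
voted = object_majority_vote(pixel_labels, segments)
print(voted)
```

Voting at the object level smooths isolated misclassified pixels, which is one plausible way such a step helps with high intraclass variance.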