
    Superpixel nonlocal weighting joint sparse representation for hyperspectral image classification.

    Joint sparse representation classification (JSRC) is a representative spectral–spatial classifier for hyperspectral images (HSIs). However, JSRC is ill-suited to highly heterogeneous areas because its spatial information is extracted from a fixed-size neighborhood block, which often fails to conform to the naturally irregular structure of land cover. To address this problem, a superpixel-based JSRC with nonlocal weighting, i.e., superpixel-based nonlocal weighted JSRC (SNLW-JSRC), is proposed in this paper. In SNLW-JSRC, the superpixel representation of an HSI is first constructed using an entropy rate segmentation method. This strategy forms homogeneous neighborhoods with naturally irregular structures and alleviates the inclusion of pixels from different classes during spatial information extraction. Afterwards, the superpixel-based nonlocal weighting (SNLW) scheme is built to weight each superpixel based on its structural and spectral information. In this way, the weight of a specific neighboring pixel is determined by the local structural similarity between that neighboring pixel and the central test pixel. The obtained local weights are then used to generate the weighted mean data for each superpixel. Finally, JSRC is used to produce the superpixel-level classification. This speeds up the sparse representation and makes the spatial content more centralized and compact. To verify the proposed SNLW-JSRC method, we conducted experiments on four benchmark hyperspectral datasets, namely Indian Pines, Pavia University, Salinas, and DFC2013. The experimental results suggest that SNLW-JSRC achieves better classification results than four other SRC-based algorithms and the classical support vector machine algorithm. Moreover, SNLW-JSRC outperforms the other SRC-based algorithms even with a small number of training samples.
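    The sketch below illustrates the weighting step described in the abstract: pixels within the test pixel's superpixel are weighted by the spectral/structural similarity of their local patch to the central test pixel's patch, and the weighted mean spectrum is taken to represent the superpixel. This is a minimal illustration, not the authors' code; the function name `snlw_weighted_mean`, the Gaussian similarity kernel, and the patch size are assumptions.

```python
# Minimal sketch (not the authors' code) of superpixel-based nonlocal weighting.
# Assumes `hsi` is an (H, W, B) hyperspectral cube, `labels` an (H, W) superpixel
# map (e.g., from an entropy-rate segmentation), and (r, c) the central test pixel.
import numpy as np

def snlw_weighted_mean(hsi, labels, r, c, patch=3, sigma=1.0):
    """Weight each pixel of the test pixel's superpixel by the similarity of its
    local spectral patch to the central pixel's patch, then return the weighted
    mean spectrum of the superpixel (hypothetical helper name)."""
    H, W, B = hsi.shape
    half = patch // 2

    def local_patch(y, x):
        # Mean spectrum of a small spatial patch around (y, x), clipped at borders.
        y0, y1 = max(0, y - half), min(H, y + half + 1)
        x0, x1 = max(0, x - half), min(W, x + half + 1)
        return hsi[y0:y1, x0:x1, :].mean(axis=(0, 1))

    center = local_patch(r, c)
    ys, xs = np.nonzero(labels == labels[r, c])   # pixels in the same superpixel

    weights = np.empty(len(ys))
    for i, (y, x) in enumerate(zip(ys, xs)):
        d = np.linalg.norm(local_patch(y, x) - center)      # structural/spectral distance
        weights[i] = np.exp(-(d ** 2) / (2.0 * sigma ** 2))  # assumed Gaussian kernel
    weights /= weights.sum()

    # The weighted mean spectrum represents the whole superpixel in the joint sparse model.
    return (weights[:, None] * hsi[ys, xs, :]).sum(axis=0)
```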

    Robust Mobile Visual Recognition System: From Bag of Visual Words to Deep Learning

    With billions of images captured by mobile users every day, automatically recognizing the contents of such images has become a particularly important feature for various mobile apps, including augmented reality, product search, and visual-based authentication. Traditionally, a client-server architecture is adopted in which the mobile client sends captured images/video frames to a cloud server, which runs a set of task-specific computer vision algorithms and sends back the recognition results. However, such a scheme may cause problems related to user privacy, network stability/availability, and device energy. In this dissertation, we investigate the problem of building a robust mobile visual recognition system that achieves high accuracy, low latency, low energy cost, and privacy protection. Generally, we study two broad types of recognition methods: bag of visual words (BOVW) based retrieval methods, which search for the nearest neighbor image to a query image, and state-of-the-art deep learning based methods, which recognize a given image using a trained deep neural network. The challenges of deploying BOVW-based retrieval methods include the size of the indexed image database, query latency, feature extraction efficiency, and re-ranking performance. To address these challenges, we first proposed EMOD, which enables efficient on-device image retrieval on a downloaded, context-dependent partial image database. The efficiency is achieved by analyzing the BOVW processing pipeline and optimizing each module with algorithmic improvements. Recent deep learning based recognition approaches have been shown to greatly exceed the performance of traditional approaches. We identify several challenges of applying deep learning based recognition methods in mobile scenarios, namely energy efficiency and privacy protection for real-time visual processing, and mobile visual domain biases. Thus, we proposed two techniques to address them: (i) efficiently splitting the workload across heterogeneous computing resources, i.e., mobile devices and the cloud, using our Moca framework, and (ii) using mobile visual domain adaptation as proposed in our collaborative edge-mediated platform DeepCham. Our extensive experiments on large-scale benchmark datasets and off-the-shelf mobile devices show that our solutions provide better results than state-of-the-art solutions.
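    For readers unfamiliar with the BOVW retrieval pipeline mentioned above, the following is a minimal, generic sketch (not EMOD itself): local descriptors are quantized against a k-means codebook into a visual-word histogram, and the query image is matched to its nearest database image by cosine similarity. The function names and the assumption that database histograms are precomputed and L2-normalized are illustrative.

```python
# Minimal sketch (not EMOD) of generic bag-of-visual-words retrieval.
import numpy as np

def bovw_histogram(descriptors, codebook):
    """descriptors: (n, d) local features; codebook: (k, d) visual words."""
    # Assign each descriptor to its closest visual word.
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / (np.linalg.norm(hist) + 1e-12)   # L2-normalize the word histogram

def retrieve(query_descriptors, db_hists, codebook):
    """Return the index of the database image whose histogram is closest to the query.
    `db_hists` is assumed to be an (M, k) array of precomputed, normalized histograms."""
    q = bovw_histogram(query_descriptors, codebook)
    sims = db_hists @ q                             # cosine similarity (rows are unit-norm)
    return int(sims.argmax())
```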

    A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery

    Semantic segmentation (classification) of Earth Observation imagery is a crucial task in remote sensing. This paper presents a comprehensive review of technical factors to consider when designing neural networks for this purpose. The review focuses on Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), and transformer models, discussing prominent design patterns for these ANN families and their implications for semantic segmentation. Common pre-processing techniques for ensuring optimal data preparation are also covered. These include methods for image normalization and chipping, strategies for addressing data imbalance in training samples, and techniques for overcoming limited data, including augmentation, transfer learning, and domain adaptation. By encompassing both the technical aspects of neural network design and the data-related considerations, this review provides researchers and practitioners with a comprehensive and up-to-date understanding of the factors involved in designing effective neural networks for semantic segmentation of Earth Observation imagery. (Comment: 145 pages with 32 figures.)
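    As a small illustration of two of the pre-processing steps named in the review, the sketch below standardizes each band of a scene and cuts the image and its label mask into fixed-size chips. This is a hedged example only; the function names, chip size, and stride are assumptions, not taken from the paper.

```python
# Minimal sketch of per-band normalization and chipping for Earth Observation imagery.
import numpy as np

def normalize_bands(image):
    """Standardize each band of an (H, W, C) image to zero mean and unit variance."""
    mean = image.mean(axis=(0, 1), keepdims=True)
    std = image.std(axis=(0, 1), keepdims=True) + 1e-8
    return (image - mean) / std

def chip(image, mask, size=256, stride=256):
    """Cut an (H, W, C) image and its (H, W) label mask into size x size training chips."""
    chips = []
    H, W = mask.shape
    for y in range(0, H - size + 1, stride):
        for x in range(0, W - size + 1, stride):
            chips.append((image[y:y+size, x:x+size], mask[y:y+size, x:x+size]))
    return chips
```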