    Spectral-Spatial Graph Reasoning Network for Hyperspectral Image Classification

    In this paper, we propose a spectral-spatial graph reasoning network (SSGRN) for hyperspectral image (HSI) classification. Concretely, the network contains two parts, a spatial graph reasoning subnetwork (SAGRN) and a spectral graph reasoning subnetwork (SEGRN), which capture spatial and spectral graph contexts, respectively. Unlike previous approaches that perform superpixel segmentation on the original image or attempt to obtain category features under the guidance of the label image, we perform superpixel segmentation on intermediate features of the network to adaptively produce homogeneous regions and obtain effective descriptors. We then adopt a similar idea in the spectral part, aggregating the channels to generate spectral descriptors for capturing spectral graph contexts. All graph reasoning in SAGRN and SEGRN is performed through graph convolution, and to guarantee the global perception ability of the proposed methods, all adjacency matrices in the graph reasoning are obtained via a non-local self-attention mechanism. Finally, by combining the extracted spatial and spectral graph contexts, we obtain the SSGRN for high-accuracy classification. Extensive quantitative and qualitative experiments on three public HSI benchmarks demonstrate the competitiveness of the proposed methods compared with other state-of-the-art approaches.
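
    As a rough illustration (not the authors' code), the following PyTorch sketch shows one graph-reasoning step of the kind the abstract describes: node descriptors (superpixel or channel descriptors), an adjacency matrix derived from non-local self-attention, and a single graph convolution. The layer layout, shapes, and scaling are assumptions.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class GraphReasoning(nn.Module):
            """One graph-reasoning step: self-attention adjacency + graph convolution."""
            def __init__(self, dim):
                super().__init__()
                self.query = nn.Linear(dim, dim)   # projections for the non-local affinity
                self.key = nn.Linear(dim, dim)
                self.gconv = nn.Linear(dim, dim)   # graph-convolution weight

            def forward(self, nodes):              # nodes: (N, dim) region/channel descriptors
                # Adjacency from non-local self-attention (row-normalised affinities).
                att = self.query(nodes) @ self.key(nodes).t()
                adj = F.softmax(att / nodes.size(-1) ** 0.5, dim=-1)
                # Aggregate neighbours, then transform: one graph convolution.
                return F.relu(self.gconv(adj @ nodes))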

    Higher-level Representations of Natural Images

    The traditional view of vision is that neurons in early cortical areas process information about simple features (e.g., orientation and spatial frequency) in small, spatially localised regions of visual space (the neuron's receptive field). This piecemeal information is then fed forward into later stages of the visual system, where it is combined to form coherent and meaningful global (higher-level) representations. The overall aim of this thesis is to examine and quantify this higher-level processing: how we encode global features in natural images, and the extent to which our perception of these global representations is determined by the local features within images. Using the tilt after-effect as a tool, the first chapter examined the processing of a low-level, local feature and found that the orientation of a sinusoidal grating could be encoded in both a retinally and spatially non-specific manner. Chapter 2 then examined tilt after-effects to the global orientation of the image (i.e., uprightness). We found that image uprightness was also encoded in a retinally/spatially non-specific manner, but that this global property could be processed largely independently of its local orientation content. Chapter 3 investigated whether our increased sensitivity to cardinal (vertical and horizontal) structures, compared to inter-cardinal (45° and 135° clockwise of vertical) structures, influences the classification of unambiguous natural images. Participants required relatively less contrast to classify images when the images retained near-cardinal rather than near-inter-cardinal structures. Finally, in Chapter 4, we examined category classification when images were ambiguous. Observers were biased to classify ambiguous images, created by combining structures from two distinct image categories, as carpentered (e.g., a house). This could not be explained by differences in sensitivity to local structures and is most likely the result of our long-term exposure to city views. Overall, these results show that higher-level representations are not fully dependent on the lower-level features within an image. Furthermore, our knowledge about the environment influences the extent to which we use local features to rapidly identify an image. Funded by a Queen Mary University of London PhD studentship.

    SpaSSA: superpixelwise adaptive SSA for unsupervised spatial-spectral feature extraction in hyperspectral image.

    Singular spectrum analysis (SSA) has recently been successfully applied to feature extraction in hyperspectral images (HSI), including conventional (1-D) SSA in the spectral domain and 2-D SSA in the spatial domain. However, these approaches have drawbacks, such as sensitivity to the window size, high computational complexity under a large window, and a failure to extract joint spectral-spatial features. To tackle these issues, in this article we propose superpixelwise adaptive SSA (SpaSSA) to exploit the local spatial information of HSI. Extracting local (rather than global) features, particularly in HSI, can be more effective for characterising the objects within an image. In SpaSSA, conventional SSA and 2D-SSA are combined and adaptively applied to each superpixel derived from an oversegmented HSI: according to the size of the derived superpixel, either conventional SSA or 2D-SSA is applied for feature extraction, with the embedding window in 2D-SSA also adapted to the superpixel size. Experimental results on three datasets show that the proposed SpaSSA outperforms both SSA and 2D-SSA in terms of classification accuracy and computational complexity. By combining SpaSSA with principal component analysis (SpaSSA-PCA), the accuracy of land-cover analysis can be further improved, outperforming several state-of-the-art approaches.
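
    For illustration, here is a minimal NumPy sketch of rank-1 1-D SSA (trajectory-matrix embedding, SVD, diagonal averaging), the spectral-domain building block the abstract combines with 2D-SSA. The window length and the size-based dispatch rule are assumptions, not the paper's settings.

        import numpy as np

        def ssa_1d(x, L=5):
            """Rank-1 SSA reconstruction of a 1-D spectrum with window length L."""
            x = np.asarray(x, dtype=float)
            N = len(x); K = N - L + 1
            X = np.column_stack([x[i:i + L] for i in range(K)])  # L x K trajectory (Hankel) matrix
            U, s, Vt = np.linalg.svd(X, full_matrices=False)
            X1 = s[0] * np.outer(U[:, 0], Vt[0])                 # leading (smooth) component
            y, c = np.zeros(N), np.zeros(N)
            for k in range(K):                                   # diagonal averaging back to a series
                y[k:k + L] += X1[:, k]
                c[k:k + L] += 1
            return y / c

        # Assumed adaptive rule: small superpixels get per-pixel 1-D SSA along the spectrum,
        # while larger ones would get 2D-SSA with a window scaled to the superpixel size.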

    Hybrid image representation methods for automatic image annotation: a survey

    In most automatic image annotation systems, images are represented with low-level features using either global or local methods. Global methods use the entire image as a unit. Local methods divide images either into blocks, where fixed-size sub-image blocks are adopted as sub-units, or into regions, using segmented regions as sub-units. In contrast to typical automatic image annotation methods that use either global or local features exclusively, several recent methods incorporate both kinds of information, on the premise that combining the two levels of features is beneficial for annotating images. In this paper, we survey automatic image annotation techniques from the perspective of feature extraction and, to complement existing surveys in the literature, focus on the emerging hybrid methods that combine global and local features for image representation.
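
    As a concrete toy illustration of the global/local distinction the survey draws (not any particular system), the NumPy sketch below concatenates one whole-image histogram (global: the entire image as a unit) with per-block histograms (local: fixed-size sub-image blocks). The grid size and bin count are arbitrary assumptions.

        import numpy as np

        def hybrid_representation(img, grid=(2, 2), bins=16):
            """Toy hybrid descriptor for a grayscale uint8 image."""
            def hist(region):
                h, _ = np.histogram(region, bins=bins, range=(0, 256))
                return h / max(h.sum(), 1)
            H, W = img.shape[:2]
            feats = [hist(img)]                      # global: whole image as one unit
            bh, bw = H // grid[0], W // grid[1]
            for i in range(grid[0]):                 # local: fixed-size sub-image blocks
                for j in range(grid[1]):
                    feats.append(hist(img[i*bh:(i+1)*bh, j*bw:(j+1)*bw]))
            return np.concatenate(feats)             # combined global + local representation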

    Multi-scale Orderless Pooling of Deep Convolutional Activation Features

    Deep convolutional neural networks (CNNs) have shown their promise as a universal representation for recognition. However, global CNN activations lack geometric invariance, which limits their robustness for classification and matching of highly variable scenes. To improve the invariance of CNN activations without degrading their discriminative power, this paper presents a simple but effective scheme called multi-scale orderless pooling (MOP-CNN). This scheme extracts CNN activations for local patches at multiple scale levels, performs orderless VLAD pooling of these activations at each level separately, and concatenates the result. The resulting MOP-CNN representation can be used as a generic feature for either supervised or unsupervised recognition tasks, from image classification to instance-level retrieval; it consistently outperforms global CNN activations without requiring any joint training of prediction layers for a particular target dataset. In absolute terms, it achieves state-of-the-art results on the challenging SUN397 and MIT Indoor Scenes classification datasets, and competitive results on the ILSVRC2012/2013 classification and INRIA Holidays retrieval datasets.
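
    A minimal sketch of the scheme the abstract describes: dense patches at several scales, a CNN feature per patch, orderless VLAD pooling per scale level, then concatenation. The patch stride, scales, and codebook are assumptions, and `cnn_features` stands in for any pretrained network; this is not the authors' implementation.

        import numpy as np

        def vlad(descriptors, centers):
            """Orderless VLAD pooling: sum residuals to each descriptor's nearest center."""
            assign = np.argmin(((descriptors[:, None] - centers[None]) ** 2).sum(-1), axis=1)
            v = np.zeros_like(centers)
            for k in range(len(centers)):
                if (assign == k).any():
                    v[k] = (descriptors[assign == k] - centers[k]).sum(0)
            return (v / (np.linalg.norm(v) + 1e-12)).ravel()     # L2-normalised, flattened

        def mop_cnn(image, cnn_features, centers, scales=(256, 128, 64)):
            """Concatenate per-scale VLAD-pooled CNN activations (MOP-CNN-style)."""
            def patches(img, size, stride):
                H, W = img.shape[:2]
                return [img[y:y + size, x:x + size]
                        for y in range(0, H - size + 1, stride)
                        for x in range(0, W - size + 1, stride)]
            levels = []
            for s in scales:
                acts = np.stack([cnn_features(p) for p in patches(image, s, s // 2)])
                levels.append(vlad(acts, centers))               # orderless pooling per level
            return np.concatenate(levels)                        # multi-scale descriptor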

    ParseNet: Looking Wider to See Better

    We present a technique for adding global context to deep convolutional networks for semantic segmentation. The approach is simple: the average feature for a layer is used to augment the features at each location. In addition, we study several idiosyncrasies of training, significantly increasing the performance of baseline networks (e.g., FCN). When we add our proposed global feature and a technique for learning normalization parameters, accuracy increases consistently, even over our improved versions of the baselines. Our proposed approach, ParseNet, achieves state-of-the-art performance on SiftFlow and PASCAL-Context with small additional computational cost over the baselines, and near state-of-the-art performance on PASCAL VOC 2012 semantic segmentation with a simple approach. Code is available at https://github.com/weiliu89/caffe/tree/fcn. Comment: ICLR 2016 submission.
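
    A minimal PyTorch sketch of the global-context idea described above: pool the average feature of a layer, L2-normalise both the local map and the global vector with learned per-channel scales, then broadcast the global vector to every location and concatenate. The initial scale value and module layout are assumptions.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class GlobalContext(nn.Module):
            """Global average feature + learned-scale L2 normalisation + concat."""
            def __init__(self, channels, init_scale=10.0):
                super().__init__()
                self.scale_local = nn.Parameter(torch.full((channels,), init_scale))
                self.scale_global = nn.Parameter(torch.full((channels,), init_scale))

            def forward(self, x):                          # x: (B, C, H, W) feature map
                g = x.mean(dim=(2, 3), keepdim=True)       # global average feature
                xn = F.normalize(x, dim=1) * self.scale_local.view(1, -1, 1, 1)
                gn = F.normalize(g, dim=1) * self.scale_global.view(1, -1, 1, 1)
                gn = gn.expand_as(xn)                      # "unpool" to every location
                return torch.cat([xn, gn], dim=1)          # (B, 2C, H, W)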

    No Spare Parts: Sharing Part Detectors for Image Categorization

    This work addresses image categorization using a representation of distinctive parts. Different from existing part-based work, we argue that parts are naturally shared between image categories and should be modeled as such. We motivate our approach with a quantitative and qualitative analysis by backtracking where selected parts come from. Our analysis shows that, in addition to the category parts defining the class, parts coming from the background context and parts from other image categories improve categorization performance. Part selection should therefore not be done separately for each category, but instead be shared and optimized over all categories. To incorporate part sharing between categories, we present an algorithm based on AdaBoost that jointly optimizes part sharing and selection, as well as fusion with the global image representation. We achieve results competitive with the state-of-the-art on object, scene, and action categories, further improving over deep convolutional neural networks.
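
    The abstract gives no algorithmic detail beyond "AdaBoost to jointly optimize part sharing and selection"; the following is a heavily simplified, assumed sketch of discrete AdaBoost over a pooled bank of part-detector scores, where the pool is shared across categories rather than built per category. Thresholds and the stump form are illustrative only.

        import numpy as np

        def shared_adaboost(R, y, rounds=50):
            """Toy AdaBoost over a shared pool of part-detector responses.
            R: (n_samples, n_parts) detector scores; y: labels in {-1, +1}.
            Each round picks the part whose sign-stump has the lowest weighted
            error; running this per category against the same pooled R is a
            rough stand-in for the joint sharing/selection in the abstract."""
            n, p = R.shape
            w = np.full(n, 1.0 / n)                      # uniform sample weights
            picked = []
            for _ in range(rounds):
                errs = [(w * (np.sign(R[:, j]) != y)).sum() for j in range(p)]
                j = int(np.argmin(errs)); e = max(errs[j], 1e-12)
                alpha = 0.5 * np.log((1 - e) / e)        # standard AdaBoost weight
                w *= np.exp(-alpha * y * np.sign(R[:, j]))
                w /= w.sum()                             # renormalise sample weights
                picked.append((j, alpha))                # selected part + its vote
            return picked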