11 research outputs found

    Piecewise linear regression-based single image super-resolution via Hadamard transform

    Image super-resolution (SR) has extensive applications in surveillance systems, satellite imaging, medical imaging, and ultra-high-definition display devices. State-of-the-art SR methods still incur considerable running time. In this paper, we propose a novel approach based on Hadamard patterns and a tree search structure in order to reduce the running time significantly. In this approach, LR (low-resolution)/HR (high-resolution) training patch pairs are classified into different classes according to the Hadamard patterns generated from the LR training patches. The mapping between the LR space and the HR space for each class is then learned and used for SR. Experimental results show that the proposed method achieves accuracy comparable to that of state-of-the-art methods at a much faster running speed.
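    The classification step can be made concrete with a small sketch. The following is illustrative only (the patch size, number of coefficients, and function names are assumptions, not the authors' code): the 2-D Hadamard transform of an LR patch is binarized by sign to form the pattern that indexes a class, and each class would then hold its own learned linear LR-to-HR mapping.

        import numpy as np
        from scipy.linalg import hadamard

        H4 = hadamard(4)  # 4x4 Hadamard matrix with +/-1 entries

        def hadamard_class(lr_patch, n_coeffs=4):
            """Map a 4x4 LR patch to a class index using the signs of its
            lowest-order 2-D Hadamard coefficients (the 'Hadamard pattern')."""
            coeffs = H4 @ lr_patch @ H4.T           # 2-D Hadamard transform
            ac = coeffs.flatten()[1:1 + n_coeffs]   # skip the DC coefficient
            bits = (ac >= 0).astype(int)            # binarize by sign
            return int(np.dot(bits, 2 ** np.arange(n_coeffs)))

        # Each class would then hold a learned linear LR->HR mapping,
        # e.g. hr_patch = A[cls] @ lr_patch.ravel() + b[cls].

    A binary pattern like this also lends itself naturally to the tree search structure the abstract mentions, since each sign bit can act as a branching decision.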

    Panoptic segmentation-based attention for image captioning

    Image captioning is the task of generating textual descriptions of images. To obtain a better image representation, attention mechanisms have been widely adopted in image captioning. However, in existing models with detection-based attention, the rectangular attention regions are not fine-grained: they contain irrelevant regions (e.g., background or overlapped regions) around the object, leading the model to generate inaccurate captions. To address this issue, we propose panoptic segmentation-based attention that performs attention at the mask level (i.e., the shape of the main part of an instance). Our approach extracts feature vectors from the corresponding segmentation regions, which is more fine-grained than current attention mechanisms. Moreover, to process features of different classes independently, we propose a dual-attention module which is generic and can be applied to other frameworks. Experimental results show that our model recognizes overlapped objects and understands the scene better, achieving competitive performance against state-of-the-art methods. We have made our code available.
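    A minimal PyTorch sketch of the mask-level feature extraction described above, assuming panoptic masks are already available (names and shapes are illustrative, not the authors' code): features are average-pooled inside each segment mask, so pixels outside the instance contribute nothing, unlike rectangular ROI pooling.

        import torch

        def mask_pooled_features(feats, masks):
            """Average backbone features inside each panoptic segment.

            feats: (C, H, W) feature map; masks: (K, H, W) binary masks.
            Returns (K, C) per-segment vectors used as attention candidates.
            """
            masks = masks.float()
            area = masks.sum(dim=(1, 2)).clamp(min=1.0)         # (K,)
            pooled = torch.einsum('chw,khw->kc', feats, masks)  # masked sums
            return pooled / area.unsqueeze(1)                   # masked means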

    Multi-scale residual hierarchical dense networks for single image super-resolution

    Single image super-resolution is a well-known ill-posed problem that has been studied for decades. With the development of deep convolutional neural networks, CNN-based single image super-resolution methods have greatly improved the quality of the generated high-resolution images. However, it remains difficult for these methods to make full use of the relationships between pixels in low-resolution images. To address this issue, we propose a novel multi-scale residual hierarchical dense network, which seeks dependencies across multi-level and multi-scale features. Specifically, we apply atrous spatial pyramid pooling (ASPP), which concatenates multiple atrous convolutions with different dilation rates, and design a residual hierarchical dense structure for single image super-resolution. The ASPP module learns the relationships among features at multiple scales, while the residual hierarchical dense structure, which consists of several hierarchical dense blocks with skip connections, adaptively extracts key information from multi-level features. Meanwhile, features from different groups are densely connected through the hierarchical dense blocks, which adequately extracts local multi-level features. Extensive experiments on benchmark datasets illustrate the superiority of the proposed method over state-of-the-art methods. The super-resolution results of our method on benchmark datasets can be downloaded from https://github.com/Rainyfish/MS-RHDN, and the source code will be released upon acceptance of the paper.
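    A minimal PyTorch sketch of an ASPP-style block as described above; the dilation rates and the 1x1 fusion here are placeholder choices, not necessarily those of the paper.

        import torch
        import torch.nn as nn

        class ASPP(nn.Module):
            """Parallel 3x3 atrous convolutions with different dilation rates,
            concatenated and fused, so one block sees multiple scales."""
            def __init__(self, channels, rates=(1, 2, 4)):
                super().__init__()
                # padding=dilation keeps the spatial size for 3x3 kernels.
                self.branches = nn.ModuleList([
                    nn.Conv2d(channels, channels, 3, padding=r, dilation=r)
                    for r in rates
                ])
                self.fuse = nn.Conv2d(channels * len(rates), channels, 1)

            def forward(self, x):
                return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))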

    Deformable non-local network for video super-resolution

    The video super-resolution (VSR) task aims to restore a high-resolution (HR) video frame from its corresponding low-resolution (LR) frame and multiple neighboring frames. Many deep learning-based VSR methods rely on optical flow to perform frame alignment, so the final recovery results are strongly affected by the accuracy of the optical flow; in practice, flow estimation is never completely accurate and always contains some errors. In this paper, we propose a novel deformable non-local network (DNLN), a method that does not rely on optical flow. Specifically, we apply deformable convolution and improve its ability of adaptive alignment at the feature level. Furthermore, we utilize a non-local structure to capture the global correlation between the reference frame and the aligned neighboring frames, and simultaneously to enhance desired fine details in the aligned frames. To reconstruct the final high-quality HR video frames, we use residual-in-residual dense blocks to take full advantage of the hierarchical features. Experimental results on benchmark datasets demonstrate that the proposed DNLN achieves state-of-the-art performance on the VSR task.
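    The deformable-alignment idea can be sketched as follows; this is not the DNLN architecture itself, only a minimal PyTorch illustration (module and layer choices are assumptions): offsets are predicted jointly from the reference and neighboring frame features, and a deformable convolution then samples the neighbor at those learned positions instead of warping by optical flow.

        import torch
        import torch.nn as nn
        from torchvision.ops import DeformConv2d

        class DeformAlign(nn.Module):
            """Align a neighboring frame's features to the reference frame
            without optical flow."""
            def __init__(self, channels, k=3):
                super().__init__()
                # 2 offsets (dy, dx) per kernel sampling location.
                self.offset_pred = nn.Conv2d(2 * channels, 2 * k * k, 3,
                                             padding=1)
                self.deform = DeformConv2d(channels, channels, k,
                                           padding=k // 2)

            def forward(self, ref_feat, nbr_feat):
                offsets = self.offset_pred(
                    torch.cat([ref_feat, nbr_feat], dim=1))
                return self.deform(nbr_feat, offsets)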

    An evaluation of canonical forms for non-rigid 3D shape retrieval

    Canonical forms attempt to factor out a non-rigid shape’s pose, giving a pose-neutral shape. This opens up the possibility of using methods originally designed for rigid shape retrieval for the task of non-rigid shape retrieval. We extend our recent benchmark for testing canonical form algorithms. Our new benchmark is used to evaluate a greater number of state-of-the-art canonical forms, on five recent non-rigid retrieval datasets, within two different retrieval frameworks. A total of fifteen different canonical form methods are compared. We find that the difference in retrieval accuracy between canonical form methods is small but varies significantly across datasets. We also find that efficiency is the main distinguishing factor between the methods.

    A Robust High-dimensional Data Reduction Method

    In this paper, we propose a robust high-dimensional data reduction method. The model assumes that the pixel reflectance results from linear combinations of pure-component spectra contaminated by additive noise. The abundance parameters appearing in this model satisfy positivity and additivity constraints, which are naturally expressed in the Bayesian literature through appropriate abundance prior distributions. The posterior distributions of the unknown model parameters are then derived. The proposed algorithm consists of a Bayesian inductive cognition part and a hierarchical reduction part: the reduction algorithm, based on the Bayesian inductive cognitive model, decides which dimensions are advantageous and outputs the recommended dimensions of the hyperspectral image. The algorithm can be interpreted as a robust reduction inference method for a Bayesian inductive cognitive model. Experimental results on high-dimensional data demonstrate useful properties of the proposed reduction algorithm.
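    The observation model described above can be written compactly. The notation below is assumed rather than taken from the paper, and the Gaussian noise term is one standard choice:

        % y: observed pixel spectrum, M: matrix of R pure-component spectra,
        % a: abundance vector, n: additive noise (Gaussian assumed here).
        \begin{aligned}
          \mathbf{y} &= \mathbf{M}\mathbf{a} + \mathbf{n},
            \qquad \mathbf{n} \sim \mathcal{N}(\mathbf{0}, \sigma^{2}\mathbf{I}),\\
          \text{subject to}\;\; a_{r} &\geq 0 \;\;(r = 1, \dots, R),
            \qquad \sum_{r=1}^{R} a_{r} = 1.
        \end{aligned}

    A prior supported on this simplex (for instance a Dirichlet distribution) is one natural way to express the positivity and additivity constraints through a prior, in the spirit the abstract describes.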

    Fine-grained attention and feature-sharing generative adversarial networks for single image super-resolution

    Traditional super-resolution (SR) methods that minimize the mean squared error usually produce images with over-smoothed, blurry edges, due to the lack of high-frequency details. In this paper, we propose two novel techniques within the generative adversarial network framework to encourage the generation of photo-realistic images for image super-resolution. First, instead of producing a single score to discriminate between real and fake images, we propose a variant, called Fine-grained Attention Generative Adversarial Network (FASRGAN), that discriminates each pixel of real and fake images. FASRGAN adopts a U-Net-like network as the discriminator, with two outputs: an image score and an image score map. The score map has the same spatial size as the HR/SR images and serves as fine-grained attention representing the degree of reconstruction difficulty of each pixel. Second, instead of using different networks for the generator and the discriminator, we introduce a feature-sharing variant (denoted Fs-SRGAN) for both. The sharing mechanism maintains model expressive power while making the model more compact, and thus improves the ability to produce high-quality images. Quantitative and visual comparisons with state-of-the-art methods on benchmark datasets demonstrate the superiority of our methods. We further apply our super-resolution images to object recognition, which further demonstrates the effectiveness of the proposed methods. The code is available at https://github.com/Rainyfish/FASRGAN-and-Fs-SRGAN.
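    A toy PyTorch sketch of the dual-output discriminator idea, assuming a two-level U-Net for brevity (the actual FASRGAN discriminator is deeper and configured differently; all layer choices here are illustrative):

        import torch
        import torch.nn as nn

        class FineGrainedDiscriminator(nn.Module):
            """U-Net-like discriminator with two heads: a global real/fake
            score and a per-pixel score map matching the input size."""
            def __init__(self, ch=64):
                super().__init__()
                self.enc1 = nn.Sequential(nn.Conv2d(3, ch, 3, 1, 1),
                                          nn.LeakyReLU(0.2))
                self.enc2 = nn.Sequential(nn.Conv2d(ch, ch * 2, 4, 2, 1),
                                          nn.LeakyReLU(0.2))
                self.dec = nn.Sequential(nn.Upsample(scale_factor=2),
                                         nn.Conv2d(ch * 2, ch, 3, 1, 1),
                                         nn.LeakyReLU(0.2))
                self.global_head = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                                 nn.Flatten(),
                                                 nn.Linear(ch * 2, 1))
                self.pixel_head = nn.Conv2d(ch * 2, 1, 3, 1, 1)

            def forward(self, x):            # x: (B, 3, H, W), H and W even
                e1 = self.enc1(x)            # (B, ch,  H,   W)
                e2 = self.enc2(e1)           # (B, 2ch, H/2, W/2)
                d = self.dec(e2)             # (B, ch,  H,   W)
                score = self.global_head(e2)                        # (B, 1)
                score_map = self.pixel_head(torch.cat([d, e1], 1))  # (B,1,H,W)
                return score, score_map

    The skip connection from the encoder into the pixel head is what keeps the score map aligned with the input at full resolution.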