578 research outputs found

    Color Sparse Representations for Image Processing: Review, Models, and Prospects

    Sparse representations have been extended to deal with color images composed of three channels. This paper reviews dictionary-learning-based sparse representations for color images, detailing the differences between the models and comparing their results on real and simulated data. These models are considered in a unifying framework based on the degrees of freedom of the linear filtering/transformation of the color channels. This framework also shows that the scalar quaternionic linear model is equivalent to constrained matrix-based color filtering, which highlights the filtering implicitly applied through this model. Based on this reformulation, a new color filtering model is introduced, using unconstrained filters. In this model, spatial morphologies of color images are encoded by atoms, and colors are encoded by color filters. Color variability is no longer captured by increasing the dictionary size but by the color filters, which gives an efficient color representation.
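
    The equivalence between a scalar quaternionic model and constrained matrix filtering can be illustrated with a minimal sketch. It assumes the common encoding of a color pixel as a pure quaternion and uses left multiplication only; the coefficient values and function names are hypothetical, and the paper's exact formulation may differ.

```python
import numpy as np

def hamilton(q, p):
    """Hamilton product q * p of two quaternions given as (w, x, y, z)."""
    a, b, c, d = q
    w, x, y, z = p
    return np.array([
        a * w - b * x - c * y - d * z,
        a * x + b * w + c * z - d * y,
        a * y - b * z + c * w + d * x,
        a * z + b * y - c * x + d * w,
    ])

def left_mult_matrix(q):
    """4x4 real matrix M such that M @ p equals hamilton(q, p)."""
    a, b, c, d = q
    return np.array([
        [a, -b, -c, -d],
        [b,  a, -d,  c],
        [c,  d,  a, -b],
        [d, -c,  b,  a],
    ])

# A color pixel (r, g, b) encoded as a pure quaternion: real part is 0.
pixel = np.array([0.0, 0.8, 0.4, 0.1])
# Hypothetical quaternion coefficient, chosen only for illustration.
coeff = np.array([0.5, -0.2, 0.3, 0.1])

filtered_quat = hamilton(coeff, pixel)
filtered_mat = left_mult_matrix(coeff) @ pixel
```

    The matrix has 16 entries but only 4 free parameters, which is the sense in which the quaternionic model acts as a constrained color filter; an unconstrained filter would let every entry vary independently.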

    Octonion sparse representation for color and multispectral image processing

    A recent trend in color image processing combines the quaternion algebra with dictionary learning methods. This paper presents a generalization of the quaternion dictionary learning method that uses the octonion algebra. The octonion algebra combined with dictionary learning methods is well suited for representing multispectral images with up to 7 color channels. In contrast to classical dictionary learning techniques, which treat multispectral images by concatenating spectral bands into a large monochrome image, we treat all the spectral bands simultaneously. Our approach leads to better preservation of color fidelity in true and false color images of the reconstructed multispectral image. To show the potential of the octonion-based model, experiments are conducted for image reconstruction and denoising of color images as well as of the extensively used Landsat 7 images.
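
    The 7-channel limit follows from the octonion having one real and seven imaginary components. A minimal sketch of packing spectral bands into octonion-valued pixels is given below; it assumes a pure-octonion encoding with a zero real part, and the band-to-unit assignment and the function name are illustrative choices, not taken from the paper.

```python
import numpy as np

def bands_to_octonions(image):
    """Pack an (H, W, C) image with C <= 7 bands into an (H, W, 8) array.

    Component 0 is the real part (left at zero); components 1..7 hold the
    spectral bands on the seven imaginary units. Unused units stay zero.
    """
    h, w, c = image.shape
    if c > 7:
        raise ValueError("an octonion has only 7 imaginary units")
    out = np.zeros((h, w, 8), dtype=float)
    out[..., 1:1 + c] = image
    return out

# Random 3-band test image standing in for real data.
rgb = np.random.default_rng(0).random((4, 5, 3))
packed = bands_to_octonions(rgb)
```

    Every pixel then carries all of its bands in one algebraic element, in contrast to concatenating the bands into a larger monochrome image.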

    A Reverse Hierarchy Model for Predicting Eye Fixations

    A body of psychological and physiological evidence suggests that early visual attention works in a coarse-to-fine way, which lays a basis for the reverse hierarchy theory (RHT). This theory states that attention propagates from the top level of the visual hierarchy, which processes the gist and abstract information of the input, to the bottom level, which processes local details. Inspired by the theory, we develop a computational model for saliency detection in images. First, the original image is downsampled to different scales to constitute a pyramid. Then, saliency on each layer is obtained by image super-resolution reconstruction from the layer above and is defined as the unpredictability of this coarse-to-fine reconstruction. Finally, saliency on each layer of the pyramid is fused into stochastic fixations through a probabilistic model, where attention initiates from the top layer and propagates downward through the pyramid. Extensive experiments on two standard eye-tracking datasets show that the proposed method can achieve competitive results with state-of-the-art models. Comment: CVPR 2014, 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR). CVPR 201
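
    The pyramid and reconstruction-residual steps can be sketched as follows. This is a simplified illustration under stated assumptions: pixel replication stands in for the super-resolution reconstruction, the absolute residual stands in for unpredictability, and the probabilistic fusion into fixations is not modeled. All function names are hypothetical.

```python
import numpy as np

def downsample2(img):
    """Halve resolution by averaging non-overlapping 2x2 blocks."""
    h, w = img.shape
    h2, w2 = h // 2 * 2, w // 2 * 2
    img = img[:h2, :w2]
    return img.reshape(h2 // 2, 2, w2 // 2, 2).mean(axis=(1, 3))

def upsample2(img):
    """Double resolution by pixel replication (stand-in for super-resolution)."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def pyramid_saliency(image, levels=3):
    """Return per-layer saliency maps, finest layer first."""
    pyramid = [image.astype(float)]
    for _ in range(levels):
        pyramid.append(downsample2(pyramid[-1]))
    maps = []
    for fine, coarse in zip(pyramid[:-1], pyramid[1:]):
        predicted = upsample2(coarse)
        # Crop in case an odd dimension was trimmed during downsampling.
        fine = fine[:predicted.shape[0], :predicted.shape[1]]
        # Saliency: what the coarser layer fails to predict.
        maps.append(np.abs(fine - predicted))
    return maps

# A flat image with one bright pixel: the pixel is hard to predict from
# the coarser layer, so it should dominate the finest saliency map.
image = np.zeros((16, 16))
image[5, 9] = 1.0
maps = pyramid_saliency(image, levels=3)
```

    A uniform image yields zero saliency everywhere under this sketch, since every layer is perfectly predictable from the one above it.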

    A Theoretically Guaranteed Quaternion Weighted Schatten p-norm Minimization Method for Color Image Restoration

    Inspired by the fact that the matrix formed by nonlocal similar patches in a natural image is of low rank, the rank approximation problem has been extensively investigated over the past decades. Weighted nuclear norm minimization (WNNM) and weighted Schatten p-norm minimization (WSNM) are two prevailing methods that have shown great superiority in various image restoration (IR) problems. Due to the physical characteristics of color images, color image restoration (CIR) is often a much more difficult task than its grayscale counterpart. However, when applied to CIR, the traditional WNNM/WSNM method only processes the three color channels individually and fails to consider their cross-channel correlations. Very recently, a quaternion-based WNNM approach (QWNNM) has been developed to mitigate this issue; it is capable of representing the color image as a whole in the quaternion domain and preserving the inherent correlation among the three color channels. Despite its empirical success, the convergence behavior of QWNNM has not yet been rigorously studied. In this paper, on the one hand, we extend WSNM into the quaternion domain and correspondingly propose a novel quaternion-based WSNM model (QWSNM) for tackling CIR problems. Extensive experiments on two representative CIR tasks, color image denoising and deblurring, demonstrate that the proposed QWSNM method performs favorably against many state-of-the-art alternatives in both quantitative and qualitative evaluations. On the other hand, and more importantly, we provide a preliminary theoretical convergence analysis: by modifying the quaternion alternating direction method of multipliers (QADMM) through a simple continuation strategy, we theoretically prove that both of the solution sequences generated by QWNNM and QWSNM have fixed-point convergence guarantees. Comment: 46 pages, 10 figures; references adde
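
    For orientation, the weighted Schatten p-norm penalty used by WSNM-style methods can be computed for a real matrix as in the sketch below. It covers only the real-valued (single-channel) case; the quaternion variant requires a quaternion SVD, which is not shown. The function name and the example weights are illustrative assumptions.

```python
import numpy as np

def weighted_schatten_p(matrix, weights, p):
    """Return sum_i w_i * sigma_i**p, the p-th power of the weighted
    Schatten p-norm of a real matrix (sigma_i sorted in descending order)."""
    sigma = np.linalg.svd(matrix, compute_uv=False)
    weights = np.asarray(weights, dtype=float)
    if weights.shape != sigma.shape:
        raise ValueError("need one weight per singular value")
    return float(np.sum(weights * sigma ** p))

# Diagonal example with singular values 3 and 2.
patch_matrix = np.diag([3.0, 2.0])
# With p = 1 the penalty reduces to the weighted nuclear norm.
nuclear = weighted_schatten_p(patch_matrix, [1.0, 1.0], p=1.0)
# Illustrative weights that penalize the smaller singular value more.
schatten_half = weighted_schatten_p(patch_matrix, [1.0, 2.0], p=0.5)
```

    Setting p = 1 recovers the weighted nuclear norm, which is why WNNM can be viewed as a special case of WSNM.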

    Context-Patch Face Hallucination Based on Thresholding Locality-Constrained Representation and Reproducing Learning

    Face hallucination is a technique that reconstructs high-resolution (HR) faces from low-resolution (LR) faces by using the prior knowledge learned from HR/LR face pairs. Most state-of-the-art methods leverage position-patch prior knowledge of the human face to estimate the optimal representation coefficients for each image patch. However, they focus only on the position information and usually ignore the context information of the image patch. In addition, when they are confronted with misalignment or the Small Sample Size (SSS) problem, the hallucination performance is very poor. To this end, this study incorporates the contextual information of the image patch and proposes a powerful and efficient context-patch based face hallucination approach, namely Thresholding Locality-constrained Representation and Reproducing learning (TLcR-RL). Under the context-patch based framework, we advance a thresholding-based representation method to enhance the reconstruction accuracy and reduce the computational complexity. To further improve the performance of the proposed algorithm, we propose a promotion strategy called reproducing learning. Adding the estimated HR face to the training set simulates the case in which the HR version of the input LR face is present in the training set, thus iteratively enhancing the final hallucination result. Experiments demonstrate that the proposed TLcR-RL method achieves a substantial improvement in the hallucinated results, both subjectively and objectively. Additionally, the proposed framework is more robust to face misalignment and the SSS problem, and its hallucinated HR face is still very good when the LR test face comes from the real world. The MATLAB source code is available at https://github.com/junjun-jiang/TLcR-RL

    Super-Resolution Textured Digital Surface Map (DSM) Formation by Selecting the Texture From Multiple Perspective Texel Images Taken by a Low-Cost Small Unmanned Aerial Vehicle (UAV)

    A Textured Digital Surface Model (TDSM) is a three-dimensional terrain map with texture overlaid on it. Utah State University has developed a texel camera, which can capture a 3D image called a texel image. A TDSM can be constructed by combining multiple texel images, which is much cheaper than the traditional method. The overall goal is to create a TDSM for a larger area that is cheaper than, and as accurate as, a TDSM created using a high-cost system. The images obtained from such an inexpensive camera contain many errors. To create a scientifically accurate TDSM, the errors present in the images must be corrected. An automatic process to create a TDSM is presented that can handle a large number of input texel images. The advantage of using such a large set of input images is that they can cover a large area on the ground, making the algorithm suitable for large-scale applications. This is done by processing and correcting the images in a windowed manner. Furthermore, the appearance of the final 3D terrain map is improved by selecting the texture from many candidate images, which ensures that the best texture is selected. The selection criteria are discussed. Lastly, a method to increase the resolution of the final image is discussed. The methods described in this dissertation improve the current technique of creating a TDSM, and the results are shown and analyzed.