144 research outputs found

    Image restoration with group sparse representation and low‐rank group residual learning

    Get PDF
    Image restoration, as a fundamental research topic of image processing, is to reconstruct the original image from degraded signal using the prior knowledge of image. Group sparse representation (GSR) is powerful for image restoration; it however often leads to undesirable sparse solutions in practice. In order to improve the quality of image restoration based on GSR, the sparsity residual model expects the representation learned from degraded images to be as close as possible to the true representation. In this article, a group residual learning based on low-rank self-representation is proposed to automatically estimate the true group sparse representation. It makes full use of the relation among patches and explores the subgroup structures within the same group, which makes the sparse residual model have better interpretation furthermore, results in high-quality restored images. Extensive experimental results on two typical image restoration tasks (image denoising and deblocking) demonstrate that the proposed algorithm outperforms many other popular or state-of-the-art image restoration methods

    Image restoration with group sparse representation and low‐rank group residual learning

    Get PDF

    JPEG2000-Based Semantic Image Compression using CNN

    Get PDF
    Some of the computer vision applications such as understanding, recognition as well as image processing are some areas where AI techniques like convolutional neural network (CNN) have attained great success. AI techniques are not very frequently used in applications like image compression which are a part of low-level vision applications. Intensifying the visual quality of the lossy video/image compression has been a huge obstacle for a very long time. Image processing tasks and image recognition can be addressed with the application of deep learning CNNs as a result of the availability of large training datasets and the recent advances in computing power. This paper consists of a CNN-based novel compression framework comprising of Compact CNN (ComCNN) and Reconstruction CNN (RecCNN) where they are trained concurrently and ideally consolidated into a compression framework, along with MS-ROI (Multi Structure-Region of Interest) mapping which highlights the semiotically notable portions of the image. The framework attains a mean PSNR value of 32.9dB, achieving a gain of 3.52dB and attains mean SSIM value of 0.9262, achieving a gain of 0.0723dB over the other methods when compared using the 6 main test images. Experimental results in the proposed study validate that the architecture substantially surpasses image compression frameworks, that utilized deblocking or denoising post- processing techniques, classified utilizing Peak Signal to Noise Ratio (PSNR) and Structural Similarity Index Measures (SSIM) with a mean PSNR, SSIM and Compression Ratio of 38.45, 0.9602 and 1.75x respectively for the 50 test images, thus obtaining state-of-art performance for Quality Factor (QF)=5

    Video modeling via implicit motion representations

    Get PDF
    Video modeling refers to the development of analytical representations for explaining the intensity distribution in video signals. Based on the analytical representation, we can develop algorithms for accomplishing particular video-related tasks. Therefore video modeling provides us a foundation to bridge video data and related-tasks. Although there are many video models proposed in the past decades, the rise of new applications calls for more efficient and accurate video modeling approaches.;Most existing video modeling approaches are based on explicit motion representations, where motion information is explicitly expressed by correspondence-based representations (i.e., motion velocity or displacement). Although it is conceptually simple, the limitations of those representations and the suboptimum of motion estimation techniques can degrade such video modeling approaches, especially for handling complex motion or non-ideal observation video data. In this thesis, we propose to investigate video modeling without explicit motion representation. Motion information is implicitly embedded into the spatio-temporal dependency among pixels or patches instead of being explicitly described by motion vectors.;Firstly, we propose a parametric model based on a spatio-temporal adaptive localized learning (STALL). We formulate video modeling as a linear regression problem, in which motion information is embedded within the regression coefficients. The coefficients are adaptively learned within a local space-time window based on LMMSE criterion. Incorporating a spatio-temporal resampling and a Bayesian fusion scheme, we can enhance the modeling capability of STALL on more general videos. Under the framework of STALL, we can develop video processing algorithms for a variety of applications by adjusting model parameters (i.e., the size and topology of model support and training window). We apply STALL on three video processing problems. The simulation results show that motion information can be efficiently exploited by our implicit motion representation and the resampling and fusion do help to enhance the modeling capability of STALL.;Secondly, we propose a nonparametric video modeling approach, which is not dependent on explicit motion estimation. Assuming the video sequence is composed of many overlapping space-time patches, we propose to embed motion-related information into the relationships among video patches and develop a generic sparsity-based prior for typical video sequences. First, we extend block matching to more general kNN-based patch clustering, which provides an implicit and distributed representation for motion information. We propose to enforce the sparsity constraint on a higher-dimensional data array signal, which is generated by packing the patches in the similar patch set. Then we solve the inference problem by updating the kNN array and the wanted signal iteratively. Finally, we present a Bayesian fusion approach to fuse multiple-hypothesis inferences. Simulation results in video error concealment, denoising, and deartifacting are reported to demonstrate its modeling capability.;Finally, we summarize the proposed two video modeling approaches. We also point out the perspectives of implicit motion representations in applications ranging from low to high level problems

    DEEP LEARNING FOR IMAGE RESTORATION AND ROBOTIC VISION

    Get PDF
    Traditional model-based approach requires the formulation of mathematical model, and the model often has limited performance. The quality of an image may degrade due to a variety of reasons: It could be the context of scene is affected by weather conditions such as haze, rain, and snow; It\u27s also possible that there is some noise generated during image processing/transmission (e.g., artifacts generated during compression.). The goal of image restoration is to restore the image back to desirable quality both subjectively and objectively. Agricultural robotics is gaining interest these days since most agricultural works are lengthy and repetitive. Computer vision is crucial to robots especially the autonomous ones. However, it is challenging to have a precise mathematical model to describe the aforementioned problems. Compared with traditional approach, learning-based approach has an edge since it does not require any model to describe the problem. Moreover, learning-based approach now has the best-in-class performance on most of the vision problems such as image dehazing, super-resolution, and image recognition. In this dissertation, we address the problem of image restoration and robotic vision with deep learning. These two problems are highly related with each other from a unique network architecture perspective: It is essential to select appropriate networks when dealing with different problems. Specifically, we solve the problems of single image dehazing, High Efficiency Video Coding (HEVC) loop filtering and super-resolution, and computer vision for an autonomous robot. Our technical contributions are threefold: First, we propose to reformulate haze as a signal-dependent noise which allows us to uncover it by learning a structural residual. Based on our novel reformulation, we solve dehazing with recursive deep residual network and generative adversarial network which emphasizes on objective and perceptual quality, respectively. Second, we replace traditional filters in HEVC with a Convolutional Neural Network (CNN) filter. We show that our CNN filter could achieve 7% BD-rate saving when compared with traditional filters such as bilateral and deblocking filter. We also propose to incorporate a multi-scale CNN super-resolution module into HEVC. Such post-processing module could improve visual quality under extremely low bandwidth. Third, a transfer learning technique is implemented to support vision and autonomous decision making of a precision pollination robot. Good experimental results are reported with real-world data
    corecore