5 research outputs found

    Semantic Perceptual Image Compression using Deep Convolution Networks

    Full text link
    It has long been considered a significant problem to improve the visual quality of lossy image and video compression. Recent advances in computing power together with the availability of large training data sets has increased interest in the application of deep learning cnns to address image recognition and image processing tasks. Here, we present a powerful cnn tailored to the specific task of semantic image understanding to achieve higher visual quality in lossy compression. A modest increase in complexity is incorporated to the encoder which allows a standard, off-the-shelf jpeg decoder to be used. While jpeg encoding may be optimized for generic images, the process is ultimately unaware of the specific content of the image to be compressed. Our technique makes jpeg content-aware by designing and training a model to identify multiple semantic regions in a given image. Unlike object detection techniques, our model does not require labeling of object positions and is able to identify objects in a single pass. We present a new cnn architecture directed specifically to image compression, which generates a map that highlights semantically-salient regions so that they can be encoded at higher quality as compared to background regions. By adding a complete set of features for every class, and then taking a threshold over the sum of all feature activations, we generate a map that highlights semantically-salient regions so that they can be encoded at a better quality compared to background regions. Experiments are presented on the Kodak PhotoCD dataset and the MIT Saliency Benchmark dataset, in which our algorithm achieves higher visual quality for the same compressed size.Comment: Accepted to Data Compression Conference, 11 pages, 5 figure

    A Novel Class of Intensity-based Metrics for Image Functions which Accommodate a Generalized Weber's Model of Perception

    Get PDF
    Even though the usual L2 image fidelity measurements, such as MSE and PSNR, characterize the mean error for each pixel of the images, these traditional measurements are not designed to predict human visual perception of image quality. In other words, the standard L2-type optimization in the context of best approximation theory is not in accordance with human visual system. To provide alternative methods of measuring image distortion perceptually, the structural similarity image quality measure (SSIM) was established decades ago. In this thesis, we are concerned with constructing a novel class of metrics via intensity-based measures, which accommodate a well-known psychological model, Weber's model of perception, by allowing greater deviations at higher intensity values and lower deviations at lower intensity values. The standard Weber model, however, is known to fail at low and high intensities, which has given rise to a generalized class of Weber models. We have derived a set of "Weberized'' distance functions which accommodate these generalized models. Mathematically, we prove the existence and uniqueness of the density functions associated with the measures which conform to generalized Weber's model of perception. Meanwhile, we consider the generalized Weber-based metrics as the optimization criteria and implement them in best approximation problems, where we also prove the existence and uniqueness of best approximations. We compare the results, which are theoretically adapted to the human visual system, with best L2-based approximations. Using a functional analysis point of view, we examine the stationary equations associated with the generalized Weber-based metrics to arrive at the Fréchet derivatives of these metrics. Finally, we establish an existence-uniqueness theorem of best generalized Weber-based L2 approximations in finite-dimensional Hilbert spaces

    Strategies for improving efficiency and efficacy of image quality assessment algorithms

    Get PDF
    Image quality assessment (IQA) research aims to predict the qualities of images in a manner that agrees with subjective quality ratings. Over the last several decades, the major impetus in IQA research has focused on improving prediction efficacy globally (across images) of distortion-specific types or general types; very few studies have explored local image quality (within images), or IQA algorithm for improved JPEG2000 coding. Even fewer studies have focused on analyzing and improving the runtime performance of IQA algorithms. Moreover, reduced-reference (RR) IQA is also a new field to be explored, when the transmitting bandwidth is limited, side information about original image was received with distorted image at the receiver. This report explored these four topics. For local image quality, we provided a local sharpness database, and we analyzed the database along with current sharpness metrics. We revealed that human highly agreed when rating sharpness of small blocks. Overall, this sharpness database is a true representation of human subjective ratings and current sharpness algorithms could reach 0.87 in terms of SROCC score. For JPEG2000 coding using IQA, we provided a new JPEG2000 image database, which includes only same total distortion images. Analysis of existing IQA algorithms on this database revealed that even though current algorithms perform reasonably well on JPEG2000-compressed images in popular image-quality databases, they often fail to predict the correct rankings on our database's images. Based on the framework of Most Apparent Distortion (MAD), a new algorithm, MADDWT is then proposed using local DWT coefficient statistics to predict the perceived distortion due to subband quantization. MADDWT outperforms all others algorithms on this database, and shows a promising use in JPEG2000 coding. For efficiency of IQA algorithms, this paper is the first to examine IQA algorithms from the perspective of their interaction with the underlying hardware and microarchitectural resources, and to perform a systematic performance analysis using state-of-the-art tools and techniques from other computing disciplines. We implemented four popular full-reference IQA algorithms and two no-reference algorithms in C++ based on the code provided by their respective authors. Hotspot analysis and microarchitectural analysis of each algorithm were performed and compared. Despite the fact that all six algorithms share common algorithmic operations (e.g., filterbanks and statistical computations), our results revealed that different IQA algorithms overwhelm different microarchitectural resources and give rise to different types of bottlenecks. For RR IQA, we also provide a new framework based on multiscale sharpness map. This framework employs multiscale sharpness maps as reduced information. As we will demonstrate, our framework with 2% reduced information can outperform other frameworks, which employ from 2% to 3% reduced information. Our framework is also competitive to current state-of-the-art FR algorithms

    A Study of the Structural Similarity Image Quality Measure with Applications to Image Processing

    Get PDF
    Since its introduction in 2004, the Structural Similarity (SSIM) index has gained widespread popularity as an image quality assessment measure. SSIM is currently recognized to be one of the most powerful methods of assessing the visual closeness of images. That being said, the Mean Squared Error (MSE), which performs very poorly from a perceptual point of view, still remains the most common optimization criterion in image processing applications because of its relative simplicity along with a number of other properties that are deemed important. In this thesis, some necessary tools to assist in the design of SSIM-optimal algorithms are developed. This work combines theoretical developments with experimental research and practical algorithms. The description of the mathematical properties of the SSIM index represents the principal theoretical achievement in this thesis. Indeed, it is demonstrated how the SSIM index can be transformed into a distance metric. Local convexity, quasi-convexity, symmetries and invariance properties are also proved. The study of the SSIM index is also generalized to a family of metrics called normalized (or M-relative) metrics. Various analytical techniques for different kinds of SSIM-based optimization are then devised. For example, the best approximation according to the SSIM is described for orthogonal and redundant basis sets. SSIM-geodesic paths with arclength parameterization are also traced between images. Finally, formulas for SSIM-optimal point estimators are obtained. On the experimental side of the research, the structural self-similarity of images is studied. This leads to the confirmation of the hypothesis that the main source of self-similarity of images lies in their regions of low variance. On the practical side, an implementation of local statistical tests on the image residual is proposed for the assessment of denoised images. Also, heuristic estimations of the SSIM index and the MSE are developed. The research performed in this thesis should lead to the development of state-of-the-art image denoising algorithms. A better comprehension of the mathematical properties of the SSIM index represents another step toward the replacement of the MSE with SSIM in image processing applications
    corecore