
    Comparison between Structural Similarity Index Metric and Human Perception

    This thesis examines image quality assessment using the Structural Similarity Index Metric (SSIM). The performance of SSIM was evaluated by comparing Mean Structural Similarity Index (MSSIM) values with Probability of Identification (PID) values. Perception experiments were designed for letter images with blur, and for letter images with blur and noise, to obtain PID values from an ensemble of observers. The other set of images used in this study were tank images for which PID data already existed. All images in the experiment were blurred with Gaussian and exponential filter shapes at various blur levels. All images at a given blur level and filter shape were compared and MSSIM was obtained. MSSIM and PID were each compared against blur at various levels for both filter shapes to observe the correlation between SSIM and human perception. The results show no correlation between MSSIM and PID. The thesis characterizes the image quality differences between SSIM and human perception: SSIM cannot detect the difference in filter shape, whereas humans perceived this difference for letter images with blur in our experiments. The Probability of Identification for the Gaussian filter shape is lower than for the exponential filter shape, which is explained by an analysis of edge energies. For tank images and for letter images with blur and noise, the results were similar: neither humans nor MSSIM could distinguish between filter shapes.
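
    As a rough illustration of the metric evaluated above (not the thesis's actual stimuli or PID experiments), the sketch below computes MSSIM between a synthetic letter-like image and Gaussian-blurred versions of it at several blur levels. The synthetic image, blur levels, and library choices are assumptions for illustration only.

```python
# Minimal sketch: MSSIM between a reference "letter" image and Gaussian-blurred
# versions at several blur levels. Assumes numpy, scipy, and scikit-image.
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.metrics import structural_similarity as ssim

# Synthetic high-contrast "letter": a rough "T" shape on a dark background
# (a placeholder for the thesis's letter stimuli).
ref = np.zeros((128, 128), dtype=float)
ref[20:108, 56:72] = 1.0   # vertical stroke
ref[20:36, 30:98] = 1.0    # horizontal stroke

for sigma in (0.5, 1.0, 2.0, 4.0):              # illustrative blur levels
    blurred = gaussian_filter(ref, sigma=sigma)
    mssim = ssim(ref, blurred, data_range=1.0)  # mean SSIM over the image
    print(f"sigma={sigma:3.1f}  MSSIM={mssim:.3f}")
```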

    A Perceptually Based Comparison of Image Similarity Metrics

    The assessment of how well one image matches another forms a critical component both of models of human visual processing and of many image analysis systems. Two of the most commonly used norms for quantifying image similarity are L1 and L2, which are specific instances of the Minkowski metric. However, there is often no principled reason for selecting one norm over the other. One way to address this problem is to examine whether one metric captures the perceptual notion of image similarity better than the other. This can be used both to draw inferences about the similarity criteria the human visual system uses and to evaluate and design metrics for image-analysis applications. With this goal, we examined perceptual preferences for images retrieved on the basis of the L1 versus the L2 norm. These images were either small fragments without recognizable content, or larger patterns with recognizable content created by vector quantization. In both conditions the participants showed a small but consistent preference for images matched with the L1 metric. These results suggest that, in the domain of natural images of the kind we have used, the L1 metric may better capture human notions of image similarity.
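
    A minimal sketch of the retrieval comparison described above, assuming NumPy and random patches as stand-ins for the study's image fragments and vector-quantized patterns: it finds the nearest candidate to a query under the L1 versus the L2 (Minkowski) norm, and the two norms can select different matches.

```python
# Minimal sketch: nearest-neighbour retrieval under L1 vs. L2 (Minkowski) norms.
# Random patches are placeholders for real image fragments.
import numpy as np

def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equally sized patches."""
    return np.sum(np.abs(a - b) ** p) ** (1.0 / p)

rng = np.random.default_rng(0)
query = rng.random((8, 8))             # query fragment
candidates = rng.random((100, 8, 8))   # candidate fragments

best_l1 = min(range(len(candidates)),
              key=lambda i: minkowski_distance(query, candidates[i], p=1))
best_l2 = min(range(len(candidates)),
              key=lambda i: minkowski_distance(query, candidates[i], p=2))

print("nearest under L1:", best_l1)
print("nearest under L2:", best_l2)    # may differ from the L1 match
```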

    The Unreasonable Effectiveness of Deep Features as a Perceptual Metric

    While it is nearly effortless for humans to quickly assess the perceptual similarity between two images, the underlying processes are thought to be quite complex. Despite this, the most widely used perceptual metrics today, such as PSNR and SSIM, are simple, shallow functions that fail to account for many nuances of human perception. Recently, the deep learning community has found that features of the VGG network trained on ImageNet classification are remarkably useful as a training loss for image synthesis. But how perceptual are these so-called "perceptual losses"? What elements are critical for their success? To answer these questions, we introduce a new dataset of human perceptual similarity judgments. We systematically evaluate deep features across different architectures and tasks and compare them with classic metrics. We find that deep features outperform all previous metrics by large margins on our dataset. More surprisingly, this result is not restricted to ImageNet-trained VGG features, but holds across different deep architectures and levels of supervision (supervised, self-supervised, or even unsupervised). Our results suggest that perceptual similarity is an emergent property shared across deep visual representations.
    Comment: Accepted to CVPR 2018; Code and data available at https://www.github.com/richzhang/PerceptualSimilarit
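
    A minimal usage sketch, assuming the released LPIPS package (pip install lpips) and PyTorch; the random tensors below stand in for real images scaled to [-1, 1], and the exact call pattern should be checked against the repository linked above.

```python
# Minimal sketch: perceptual distance between two images using deep features,
# via the LPIPS package (assumed installed alongside PyTorch).
import torch
import lpips

loss_fn = lpips.LPIPS(net='alex')          # AlexNet-based perceptual metric

img0 = torch.rand(1, 3, 64, 64) * 2 - 1    # (N, C, H, W), values in [-1, 1]
img1 = torch.clamp(img0 + 0.1 * torch.randn_like(img0), -1.0, 1.0)

with torch.no_grad():
    distance = loss_fn(img0, img1)         # lower = more perceptually similar
print(f"LPIPS distance: {distance.item():.4f}")
```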