
    A similarity-based approach to perceptual feature validation

    Which object properties matter most in human perception may well vary according to sensory modality, an important consideration for the design of multimodal interfaces. In this study, we present a similarity-based method for comparing the perceptual importance of object properties across modalities and show how it can also be used to perceptually validate computational measures of object properties. Similarity measures for a set of three-dimensional (3D) objects varying in shape and texture were gathered from humans in two modalities (vision and touch) and derived from a set of standard 2D and 3D computational measures (image and mesh subtraction, object perimeter, curvature, Gabor jet filter responses, and the Visual Difference Predictor (VDP)). Multidimensional scaling (MDS) was then performed on the similarity data to recover configurations of the stimuli in 2D perceptual/computational spaces. The two recovered dimensions corresponded to the two dimensions of variation in the stimulus set: shape and texture. In the human visual space, shape strongly dominated texture. In the human haptic space, shape and texture were weighted roughly equally. Weights varied considerably across subjects in the haptic experiment, indicating that different strategies were used. Maps derived from shape-dominated computational measures provided good fits to the human visual map. No single computational measure provided a satisfactory fit to the map derived from mean human haptic data, though good fits were found for individual subjects; a combination of measures with individually adjusted weights may be required to model the human haptic similarity judgments. Our method provides a high-level approach to perceptual validation, which can be applied in both unimodal and multimodal interface design.
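
    The core quantitative step described above is easy to reproduce: pairwise (dis)similarity data, whether from human ratings or from a computational measure, are fed into MDS to recover a low-dimensional configuration of the stimuli. The sketch below shows this step with scikit-learn; the 4x4 dissimilarity matrix is a made-up placeholder, not data from the study.

```python
# Minimal sketch of the MDS step: recover a 2D configuration of stimuli from a
# precomputed pairwise dissimilarity matrix. The matrix values are illustrative
# placeholders, not data from the study.
import numpy as np
from sklearn.manifold import MDS

# Hypothetical symmetric dissimilarity matrix for 4 stimuli (zero diagonal),
# e.g. averaged human ratings or a computational measure such as mesh subtraction.
dissimilarity = np.array([
    [0.0, 0.3, 0.7, 0.9],
    [0.3, 0.0, 0.6, 0.8],
    [0.7, 0.6, 0.0, 0.4],
    [0.9, 0.8, 0.4, 0.0],
])

# Metric MDS on the precomputed dissimilarities; n_components=2 matches the
# 2D perceptual/computational spaces recovered in the study.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissimilarity)  # shape (4, 2): stimulus positions in 2D
print(coords)
```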

    Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment

    We present a deep neural network-based approach to image quality assessment (IQA). The network is trained end-to-end and comprises ten convolutional layers and five pooling layers for feature extraction, and two fully connected layers for regression, which makes it significantly deeper than related IQA models. Unique features of the proposed architecture are that 1) with slight adaptations it can be used in a no-reference (NR) as well as in a full-reference (FR) IQA setting, and 2) it allows for joint learning of local quality and local weights, i.e., the relative importance of local quality to the global quality estimate, in a unified framework. Our approach is purely data-driven and does not rely on hand-crafted features or other types of prior domain knowledge about the human visual system or image statistics. We evaluate the proposed approach on the LIVE, CSIQ, and TID2013 databases as well as the LIVE In the Wild Image Quality Challenge database and show superior performance to state-of-the-art NR and FR IQA methods. Finally, cross-database evaluation shows a high ability to generalize between different databases, indicating a high robustness of the learned features.
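
    The joint learning of local quality and local weights amounts to weighted-average patch aggregation: one head predicts a quality score per patch, another predicts a non-negative weight, and the global estimate is the weight-normalized average. The PyTorch sketch below illustrates that idea only; the layer counts, patch size, and head names are simplified assumptions, not the paper's 10-conv/5-pool architecture.

```python
# Schematic sketch of weighted-average patch aggregation: each image patch gets a
# local quality score and a local weight, and the global quality estimate is the
# weight-normalized average. Reduced illustration, not the paper's architecture.
import torch
import torch.nn as nn

class PatchIQA(nn.Module):
    def __init__(self):
        super().__init__()
        # Small convolutional feature extractor (the paper uses a much deeper stack).
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.quality_head = nn.Linear(64, 1)  # local quality per patch
        self.weight_head = nn.Linear(64, 1)   # local weight per patch

    def forward(self, patches):
        # patches: (num_patches, 3, H, W) cropped from one image
        feats = self.features(patches)
        q = self.quality_head(feats).squeeze(-1)                     # (num_patches,)
        w = torch.relu(self.weight_head(feats)).squeeze(-1) + 1e-6   # keep weights positive
        return (w * q).sum() / w.sum()                               # weighted global quality

model = PatchIQA()
patches = torch.randn(16, 3, 32, 32)  # 16 hypothetical 32x32 patches from one image
print(model(patches).item())
```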

    The Unreasonable Effectiveness of Deep Features as a Perceptual Metric

    While it is nearly effortless for humans to quickly assess the perceptual similarity between two images, the underlying processes are thought to be quite complex. Despite this, the most widely used perceptual metrics today, such as PSNR and SSIM, are simple, shallow functions, and fail to account for many nuances of human perception. Recently, the deep learning community has found that features of the VGG network trained on ImageNet classification have been remarkably useful as a training loss for image synthesis. But how perceptual are these so-called "perceptual losses"? What elements are critical for their success? To answer these questions, we introduce a new dataset of human perceptual similarity judgments. We systematically evaluate deep features across different architectures and tasks and compare them with classic metrics. We find that deep features outperform all previous metrics by large margins on our dataset. More surprisingly, this result is not restricted to ImageNet-trained VGG features, but holds across different deep architectures and levels of supervision (supervised, self-supervised, or even unsupervised). Our results suggest that perceptual similarity is an emergent property shared across deep visual representations. Comment: Accepted to CVPR 2018; code and data available at https://www.github.com/richzhang/PerceptualSimilarity
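
    The underlying recipe for using deep features as a perceptual distance is to compare channel-normalized activations of the two images at several network layers and accumulate the differences. The sketch below does this with an off-the-shelf torchvision VGG-16 and uniform layer weights; the chosen layer indices and the absence of learned linear weights are simplifying assumptions, so it illustrates the idea rather than reproducing the paper's learned (LPIPS) metric.

```python
# Minimal sketch of "deep features as a perceptual distance": compare channel-normalized
# VGG-16 activations of two images at a few layers and average the squared differences.
# Uniform layer weights are an assumption; the paper additionally learns a weighting.
import torch
import torchvision.models as models

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()
layer_ids = {3, 8, 15, 22, 29}  # ReLU outputs of conv blocks 1-5 (assumed layer choice)

def deep_feature_distance(x, y):
    """x, y: image batches of shape (N, 3, H, W), ImageNet-normalized."""
    dist = 0.0
    with torch.no_grad():
        for i, layer in enumerate(vgg):
            x, y = layer(x), layer(y)
            if i in layer_ids:
                # Normalize each spatial feature vector across channels, then
                # accumulate the mean squared difference for this layer.
                xn = x / (x.norm(dim=1, keepdim=True) + 1e-10)
                yn = y / (y.norm(dim=1, keepdim=True) + 1e-10)
                dist = dist + ((xn - yn) ** 2).mean(dim=(1, 2, 3))
    return dist  # shape (N,): larger roughly means more perceptually different

a, b = torch.rand(1, 3, 224, 224), torch.rand(1, 3, 224, 224)
print(deep_feature_distance(a, b))
```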