A Perceptually Based Comparison of Image Similarity Metrics
The assessment of how well one image matches another forms a critical component both of models of human visual processing and of many image analysis systems. Two of the most commonly used norms for quantifying image similarity are L1 and L2, which are specific instances of the Minkowski metric. However, there is often not a principled reason for selecting one norm over the other. One way to address this problem is to examine whether one metric captures the perceptual notion of image similarity better than the other. Such a comparison can be used to derive inferences regarding the similarity criteria the human visual system uses, as well as to evaluate and design metrics for use in image-analysis applications. With this goal, we examined perceptual preferences for images retrieved on the basis of the L1 versus the L2 norm. These images were either small fragments without recognizable content, or larger patterns with recognizable content created by vector quantization. In both conditions the participants showed a small but consistent preference for images matched with the L1 metric. These results suggest that, in the domain of natural images of the kind we have used, the L1 metric may better capture human notions of image similarity.
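To make the L1 versus L2 distinction concrete, the following is a minimal sketch of nearest-match retrieval under the Minkowski metric. It is not the authors' experimental code: the function names and the 4-pixel grayscale fragments are hypothetical, chosen only so that the two norms select different matches.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length pixel sequences.

    p=1 gives the L1 (sum of absolute differences) norm,
    p=2 gives the L2 (Euclidean) norm.
    """
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def retrieve(query, candidates, p):
    """Return the index of the candidate closest to the query under the L_p norm."""
    return min(
        range(len(candidates)),
        key=lambda i: minkowski_distance(query, candidates[i], p),
    )


# Hypothetical flattened 4-pixel fragments, made up for illustration.
query = [0, 0, 0, 0]
candidates = [
    [4, 0, 0, 0],  # one large pixel error
    [2, 2, 2, 0],  # several smaller pixel errors
]

l1_choice = retrieve(query, candidates, p=1)
l2_choice = retrieve(query, candidates, p=2)
```

In this toy case L1 prefers the candidate with a single large error, while L2 penalizes that large error more heavily and prefers the candidate with several small ones. This difference in retrieved matches is the kind of disagreement a perceptual preference experiment can adjudicate.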
A Similarity Measure for Material Appearance
We present a model to measure the similarity in appearance between different
materials, which correlates with human similarity judgments. We first create a
database of 9,000 rendered images depicting objects with varying materials,
shape and illumination. We then gather data on perceived similarity from
crowdsourced experiments; our analysis of over 114,840 answers suggests that
indeed a shared perception of appearance similarity exists. We feed this data
to a deep learning architecture with a novel loss function, which learns a
feature space for materials that correlates with such perceived appearance
similarity. Our evaluation shows that our model outperforms existing metrics.
Last, we demonstrate several applications enabled by our metric, including
appearance-based search for material suggestions, database visualization,
clustering and summarization, and gamut mapping.
Comment: 12 pages, 17 figure
Exploring the structure of a real-time, arbitrary neural artistic stylization network
In this paper, we present a method which combines the flexibility of the
neural algorithm of artistic style with the speed of fast style transfer
networks to allow real-time stylization using any content/style image pair. We
build upon recent work leveraging conditional instance normalization for
multi-style transfer networks by learning to predict the conditional instance
normalization parameters directly from a style image. The model is successfully
trained on a corpus of roughly 80,000 paintings and is able to generalize to
paintings previously unobserved. We demonstrate that the learned embedding
space is smooth, contains a rich structure, and organizes semantic
information associated with paintings in an entirely unsupervised manner.
Comment: Accepted as an oral presentation at British Machine Vision Conference
(BMVC) 201
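As a rough illustration of the conditional instance normalization mentioned in the abstract above, the sketch below normalizes each channel of a feature map and then applies a per-channel scale and shift. It assumes the scale (gamma) and shift (beta) values would come from a style prediction network; here they are hypothetical constants, and the function names are illustrative rather than taken from the paper.

```python
import math


def instance_norm(channel, eps=1e-5):
    """Normalize one channel (a flat list of activations) to zero mean, unit variance."""
    mean = sum(channel) / len(channel)
    var = sum((v - mean) ** 2 for v in channel) / len(channel)
    return [(v - mean) / math.sqrt(var + eps) for v in channel]


def conditional_instance_norm(features, gammas, betas, eps=1e-5):
    """Apply per-channel scale and shift after instance normalization.

    features: list of channels, each a flat list of activations.
    gammas, betas: one value per channel; assumed here to be predicted
    from a style image by a separate network (not implemented).
    """
    return [
        [g * v + b for v in instance_norm(ch, eps)]
        for ch, g, b in zip(features, gammas, betas)
    ]


# Hypothetical two-channel feature map and style parameters.
features = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 20.0, 20.0]]
gammas = [2.0, 0.5]   # stand-ins for predicted scales
betas = [1.0, -1.0]   # stand-ins for predicted shifts

styled = conditional_instance_norm(features, gammas, betas)
```

After this step each output channel has a mean close to its beta and a standard deviation close to its gamma, so changing the style only requires changing these per-channel parameters while the rest of the network stays fixed.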
The Sound Manifesto
Computing practice today depends on visual output to drive almost all user
interaction. Other senses, such as audition, may be totally neglected, or used
tangentially, or used in highly restricted specialized ways. We have excellent
audio rendering through D-A conversion, but we lack rich general facilities for
modeling and manipulating sound comparable in quality and flexibility to
graphics. We need co-ordinated research in several disciplines to improve the
use of sound as an interactive information channel.
Incremental and separate improvements in synthesis, analysis, speech
processing, audiology, acoustics, music, etc. will not alone produce the
radical progress that we seek in sonic practice. We also need to create a new
central topic of study in digital audio research. The new topic will assimilate
the contributions of different disciplines on a common foundation. The key
central concept that we lack is sound as a general-purpose information channel.
We must investigate the structure of this information channel, which is driven
by the co-operative development of auditory perception and physical sound
production. Particular audible encodings, such as speech and music, illuminate
sonic information by example, but they are no more sufficient for a
characterization than typography is sufficient for a characterization of visual
information.
Comment: To appear in the conference on Critical Technologies for the Future
of Computing, part of SPIE's International Symposium on Optical Science and
Technology, 30 July to 4 August 2000, San Diego, C
Perceptually Motivated Shape Context Which Uses Shape Interiors
In this paper, we identify some of the limitations of current-day shape
matching techniques. We provide examples of how contour-based shape matching
techniques cannot provide a good match for certain visually similar shapes. To
overcome this limitation, we propose a perceptually motivated variant of the
well-known shape context descriptor. We identify that the interior properties
of the shape play an important role in object recognition and develop a
descriptor that captures these interior properties. We show that our method can
easily be augmented with any other shape matching algorithm. We also show from
our experiments that the use of our descriptor can significantly improve the
retrieval rates.