Perceptual elaboration paradigm (PEP): A new approach for investigating mental representations of language
To examine hemispheric differences in accessing a mental representation that embodies perceptual elements and their spatial relationships (i.e., perceptual elaboration and integration), we developed a cross-modal perceptual elaboration paradigm (PEP) in which an imagined percept, rather than a propositional concept, determined congruency. Three target image conditions allow researchers to test which mental representation is primarily accessed when the target is laterally presented. For example, the “Integrated” condition is congruent with both propositional and perceptual mental representations; therefore, results from both hemifield conditions (RVF/LH vs. LVF/RH) should be comparable. Similarly, the “Unrelated” condition is incongruent with both propositional and perceptual mental representations; therefore, results from both hemifield conditions should likewise be comparable. However, the “Unintegrated” condition is congruent with the propositional mental representation but not the perceptual mental representation. If either hemisphere initially accesses one representation over the other, differences will be revealed in behavioural or electroencephalography results.
This paradigm:
• is distinct from existing paired paradigms that emphasize semantic associations.
• is important given increasing evidence that discourse comprehension involves accessing perceptual information.
• allows researchers to examine the extent to which a mental representation of discourse can embody perceptual elaboration and integration.
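The logic of the three conditions can be summarized in a small sketch. The condition names come from the paradigm above; the `dissociates` helper is an illustrative name, not part of the paradigm:

```python
# Congruency of each target condition with the two candidate
# mental representations, as described in the paradigm.
congruency = {
    "Integrated":   {"propositional": True,  "perceptual": True},
    "Unintegrated": {"propositional": True,  "perceptual": False},
    "Unrelated":    {"propositional": False, "perceptual": False},
}

def dissociates(condition):
    # Only a condition whose two congruency predictions differ can reveal
    # which representation a hemisphere accesses first.
    c = congruency[condition]
    return c["propositional"] != c["perceptual"]

print([name for name in congruency if dissociates(name)])  # ['Unintegrated']
```

This makes explicit why only the “Unintegrated” condition is diagnostic: in the other two conditions, both candidate representations predict the same behavioural outcome.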
TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer
In this work, we address the problem of musical timbre transfer, where the
goal is to manipulate the timbre of a sound sample from one instrument to match
another instrument while preserving other musical content, such as pitch,
rhythm, and loudness. In principle, one could apply image-based style transfer
techniques to a time-frequency representation of an audio signal, but this
depends on having a representation that allows independent manipulation of
timbre as well as high-quality waveform generation. We introduce TimbreTron, a
method for musical timbre transfer which applies "image" domain style transfer
to a time-frequency representation of the audio signal, and then produces a
high-quality waveform using a conditional WaveNet synthesizer. We show that the
Constant Q Transform (CQT) representation is particularly well-suited to
convolutional architectures due to its approximate pitch equivariance. Based on
human perceptual evaluations, we confirmed that TimbreTron recognizably
transferred the timbre while otherwise preserving the musical content, for both
monophonic and polyphonic samples.
Comment: 17 pages, published as a conference paper at ICLR 2019
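The pitch equivariance the authors exploit follows from the CQT's geometrically spaced frequency bins: transposing a note multiplies its frequency by a constant, which translates its energy by a fixed number of bins, so a convolution slides over pitch the way it slides over time. A minimal sketch of that bin geometry (the `FMIN` and `BINS_PER_OCTAVE` values are illustrative, not taken from the paper):

```python
import math

BINS_PER_OCTAVE = 12   # illustrative; practical CQTs often use finer resolution
FMIN = 32.70           # Hz, roughly the pitch C1 (assumed reference frequency)

def cqt_bin(freq_hz):
    """Index of the geometrically spaced CQT bin nearest to freq_hz."""
    return round(BINS_PER_OCTAVE * math.log2(freq_hz / FMIN))

# Transposing up a perfect fifth (7 semitones) multiplies frequency by
# 2**(7/12), which shifts the CQT bin index by exactly 7 -- a translation.
a4 = 440.0
e5 = a4 * 2 ** (7 / 12)
print(cqt_bin(e5) - cqt_bin(a4))  # 7
```

The same 7-bin shift results from any starting pitch, which is the approximate pitch equivariance that makes the representation a good fit for convolutional architectures.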
Perceptual similarity between color images using fuzzy metrics
This is the author’s accepted version; a definitive version was subsequently published in Journal of Visual Communication and Image Representation, Volume 34, January 2016, Pages 230–235, https://doi.org/10.1016/j.jvcir.2015.04.003
In many applications in the computer vision field, measuring the similarity between (color) images is of paramount importance. However, commonly used pixelwise similarity measures such as Mean Absolute Error, Peak Signal-to-Noise Ratio, Mean Squared Error, and Normalized Color Difference do not match perceptual similarity well. Recently, a method for gray-scale image similarity that correlates quite well with perceptual similarity has been proposed and extended to color images. In this paper we use the basic ideas of this recent work to propose an alternative method, based on fuzzy metrics, for perceptual color image similarity. Experimental results employing a survey of observers show that the global performance of our proposal is competitive with the best state-of-the-art methods and that it offers some performance advantages for images with low correlation among some image channels.
Grecova, S.; Morillas Gómez, S. (2016). Perceptual similarity between color images using fuzzy metrics. Journal of Visual Communication and Image Representation, 34:230–235. doi:10.1016/j.jvcir.2015.04.003
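To give a flavor of the kind of fuzzy metric involved: a common stationary fuzzy-metric form compares channel values through a ratio smoothed by a constant K. The exact formulation used in the paper differs in detail; the function below is an assumed illustration, and the value K=1024 is a made-up choice:

```python
def fuzzy_similarity(p, q, K=1024):
    """Illustrative fuzzy-metric similarity between two RGB pixels in [0, 255]^3.

    Returns a value in (0, 1]; it equals 1 exactly when the pixels are
    identical. K (an assumed constant) controls how quickly similarity decays.
    """
    s = 1.0
    for a, b in zip(p, q):
        s *= (min(a, b) + K) / (max(a, b) + K)
    return s

print(fuzzy_similarity((10, 20, 30), (10, 20, 30)))              # 1.0
print(fuzzy_similarity((0, 0, 0), (255, 255, 255)) < 1.0)        # True
```

Unlike MSE-style measures, such a metric is bounded, symmetric, and multiplicative across channels, which is what lets channel interactions be modeled explicitly.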
GAN-based Image Compression with Improved RDO Process
GAN-based image compression schemes have shown remarkable progress lately due
to their high perceptual quality at low bit rates. However, there are two main
issues: 1) perceptual degradation of the reconstructed image in color,
texture, and structure, and 2) an inaccurate entropy model. In this
paper, we present a novel GAN-based image compression approach with improved
rate-distortion optimization (RDO) process. To achieve this, we utilize the
DISTS and MS-SSIM metrics to measure perceptual degradation in color, texture,
and structure. In addition, we adopt the discretized Gaussian-Laplacian-Logistic
mixture model (GLLMM) for entropy modeling to improve the accuracy in
estimating the probability distributions of the latent representation. During
the evaluation process, instead of evaluating the perceptual quality of the
reconstructed image via IQA metrics, we directly conduct the Mean Opinion Score
(MOS) experiment among different codecs, which fully reflects the actual
perceptual results of humans. Experimental results demonstrate that the
proposed method outperforms the existing GAN-based methods and the
state-of-the-art hybrid codec (i.e., VVC).
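A simplified sketch of what a discretized mixture entropy model computes: the probability mass of an integer-quantized latent is the mixture CDF evaluated over a unit interval, and its negative log is the estimated rate in bits. The three-component mixture below (one Gaussian, one Laplacian, one logistic component with made-up weights and scales) only illustrates the GLLMM idea, not the paper's actual model:

```python
import math

def gaussian_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def laplace_cdf(x, mu, b):
    if x < mu:
        return 0.5 * math.exp((x - mu) / b)
    return 1.0 - 0.5 * math.exp(-(x - mu) / b)

def logistic_cdf(x, mu, s):
    return 1.0 / (1.0 + math.exp(-(x - mu) / s))

def mixture_cdf(x, weights, params):
    # weights sum to 1; params is a list of (cdf, location, scale) triples
    return sum(w * cdf(x, mu, sc) for w, (cdf, mu, sc) in zip(weights, params))

def bits_for_latent(y, weights, params):
    """Estimated code length of the integer latent y under the discretized mixture."""
    p = mixture_cdf(y + 0.5, weights, params) - mixture_cdf(y - 0.5, weights, params)
    return -math.log2(p)

# Made-up mixture parameters for illustration only.
weights = [0.5, 0.3, 0.2]
params = [(gaussian_cdf, 0.0, 1.5), (laplace_cdf, 0.0, 1.0), (logistic_cdf, 0.0, 0.8)]
print(round(bits_for_latent(0, weights, params), 2))  # likely symbols are cheap
print(round(bits_for_latent(6, weights, params), 2))  # unlikely symbols cost more bits
```

A more expressive mixture fits the true latent distribution more tightly, which lowers the estimated (and hence actual) bitrate; that is the accuracy gain the GLLMM targets.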
Improved Image Partitioning for Compression and Representation using the Lab Color Space in the LAR Image Codec
The LAR codec is an advanced image compression method relying on a quadtree partitioning of the image. The partitioning strongly impacts the LAR codec's efficiency and enables both efficient compression and efficient representation. In order to increase the perceptual representation abilities without penalizing compression efficiency, we introduce and evaluate two partitioning criteria operating in the Lab color space. These criteria are compared against the original criterion, and their compression and robustness performance is analyzed.
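To illustrate the kind of quadtree partitioning the codec relies on, here is a generic sketch: a block is split recursively while it fails a homogeneity criterion. The single-channel dynamic-range criterion and threshold below are illustrative stand-ins for the Lab-space criteria the paper actually evaluates:

```python
def dynamic_range(img, x, y, size):
    vals = [img[j][i] for j in range(y, y + size) for i in range(x, x + size)]
    return max(vals) - min(vals)

def quadtree_partition(img, x, y, size, thresh, min_size, leaves):
    """Recursively split the square block at (x, y) while it is inhomogeneous.

    `img` is a square list-of-lists of scalar values (one channel here; the
    paper's criteria operate on Lab color vectors instead).
    """
    if size > min_size and dynamic_range(img, x, y, size) > thresh:
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                quadtree_partition(img, x + dx, y + dy, half, thresh, min_size, leaves)
    else:
        leaves.append((x, y, size))

flat = [[0] * 8 for _ in range(8)]  # homogeneous image: one leaf block
edge = [[255 if i < 4 and j < 4 else 0 for i in range(8)] for j in range(8)]

a, b = [], []
quadtree_partition(flat, 0, 0, 8, thresh=10, min_size=2, leaves=a)
quadtree_partition(edge, 0, 0, 8, thresh=10, min_size=2, leaves=b)
print(len(a), len(b))  # 1 4
```

The choice of criterion decides where small blocks are spent, which is why moving it into a perceptually uniform space like Lab can improve representation quality without changing the codec itself.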