On Using Backpropagation for Speech Texture Generation and Voice Conversion
Inspired by recent work on neural network image generation that relies on
backpropagation towards the network inputs, we present a proof-of-concept
system for speech texture synthesis and voice conversion based on two
mechanisms: approximate inversion of the representation learned by a speech
recognition neural network, and matching statistics of neuron activations
between different source and target utterances. Similar to image texture
synthesis and neural style transfer, the system works by optimizing a cost
function with respect to the input waveform samples. To this end we use a
differentiable mel-filterbank feature extraction pipeline and train a
convolutional CTC speech recognition network. Our system is able to extract
speaker characteristics from very limited amounts of target speaker data, as
little as a few seconds, and can be used to generate realistic speech babble or
reconstruct an utterance in a different voice.
Comment: Accepted to ICASSP 201
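The central mechanism of the abstract above, optimizing a cost function with respect to the input samples so that activation statistics match those of a target utterance, can be sketched with a toy one-layer feature extractor. The network, the statistic, and the optimization settings below are illustrative stand-ins, not the paper's actual convolutional CTC model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the speech network's feature extractor:
# a single linear layer followed by ReLU.
W = rng.standard_normal((16, 64)) * 0.1

def features(x):
    return np.maximum(W @ x, 0.0)

target = rng.standard_normal(64)      # stand-in "target utterance" samples
t_stat = features(target).mean()      # activation statistic to match

x = rng.standard_normal(64) * 0.01    # "waveform" being optimized
lr = 0.5
for _ in range(500):
    a = features(x)
    diff = a.mean() - t_stat
    # Gradient of the cost (mean(a) - t_stat)^2 w.r.t. x,
    # backpropagated through the ReLU mask to the inputs.
    grad = 2.0 * diff * (W.T @ (a > 0).astype(float)) / a.size
    x -= lr * grad

# The optimized input now reproduces the target's activation statistic.
assert abs(features(x).mean() - t_stat) < 1e-2
```

In the actual system the same loop runs through a differentiable mel-filterbank front end and a trained recognition network, with richer statistics than a single mean.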
A survey of exemplar-based texture synthesis
Exemplar-based texture synthesis is the process of generating, from an input
sample, new texture images of arbitrary size and which are perceptually
equivalent to the sample. The two main approaches are statistics-based methods
and patch re-arrangement methods. In the first class, a texture is
characterized by a statistical signature; then, a random sampling conditioned
to this signature produces genuinely different texture images. The second class
boils down to a clever "copy-paste" procedure, which stitches together large
regions of the sample. Hybrid methods try to combine ideas from both approaches
to avoid their hurdles. Recent approaches using convolutional neural
networks fit into this classification, some being statistical and others
performing patch re-arrangement in the feature space. They produce impressive
syntheses on various kinds of textures. Nevertheless, we found that most real
textures are organized at multiple scales, with global structures revealed at
coarse scales and highly varying details at finer ones. Thus, when confronted
with large natural images of textures the results of state-of-the-art methods
degrade rapidly, and the problem of modeling them remains wide open.
Comment: v2: Added comments and typo fixes. New section added to describe
FRAME. New method presented: CNNMR.
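The "statistical signature" idea from the survey above can be illustrated with the Gram matrix of a feature map, a standard choice in CNN-based texture synthesis: it records correlations between feature channels while discarding spatial layout, so two spatially rearranged versions of the same texture share one signature. A minimal numpy sketch, with random arrays standing in for real feature maps:

```python
import numpy as np

def gram(feats):
    """Gram matrix of a (channels, positions) feature map:
    channel-channel correlations averaged over spatial positions."""
    c, n = feats.shape
    return feats @ feats.T / n

rng = np.random.default_rng(1)
sample = rng.standard_normal((8, 100))
# Same content, different spatial arrangement.
shuffled = sample[:, rng.permutation(100)]

# The signature is invariant to spatial re-arrangement,
# which is why it captures "texture" rather than layout.
assert np.allclose(gram(sample), gram(shuffled))
```

Statistics-based synthesis then searches for a new image whose feature Gram matrices match the exemplar's, typically by gradient descent on the image.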
Texture Mixer: A Network for Controllable Synthesis and Interpolation of Texture
This paper addresses the problem of interpolating visual textures. We
formulate this problem by requiring (1) by-example controllability and (2)
realistic and smooth interpolation among an arbitrary number of texture
samples. To solve it we propose a neural network trained simultaneously on a
reconstruction task and a generation task, which can project texture examples
onto a latent space where they can be linearly interpolated and projected back
onto the image domain, thus ensuring both intuitive control and realistic
results. We show our method outperforms a number of baselines according to a
comprehensive suite of metrics as well as a user study. We further show several
applications based on our technique, which include texture brush, texture
dissolve, and animal hybridization.
Comment: Accepted to CVPR'1
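The project-interpolate-decode loop described in the abstract above can be sketched with linear maps standing in for the learned encoder and decoder; these toy matrices are stand-ins for the paper's trained network, chosen only to make the latent-space arithmetic concrete:

```python
import numpy as np

rng = np.random.default_rng(2)
E = rng.standard_normal((4, 16)) * 0.25  # toy "encoder" to a 4-d latent space
D = np.linalg.pinv(E)                    # toy "decoder" (pseudo-inverse)

def interpolate(x_a, x_b, alpha):
    """Blend two texture examples linearly in latent space, then decode."""
    z = (1.0 - alpha) * (E @ x_a) + alpha * (E @ x_b)
    return D @ z

x_a, x_b = rng.standard_normal(16), rng.standard_normal(16)
mid = interpolate(x_a, x_b, 0.5)

# Endpoints decode the projections of the inputs themselves, and (for
# this linear toy) the midpoint is the average of the decoded endpoints.
assert np.allclose(interpolate(x_a, x_b, 0.0), D @ (E @ x_a))
assert np.allclose(mid, 0.5 * (D @ (E @ x_a) + D @ (E @ x_b)))
```

The paper's contribution is making such interpolation look realistic for a nonlinear network, by training on reconstruction and generation tasks jointly.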
Texture Synthesis Through Convolutional Neural Networks and Spectrum Constraints
This paper presents a significant improvement for the synthesis of texture
images using convolutional neural networks (CNNs), making use of constraints on
the Fourier spectrum of the results. More precisely, the texture synthesis is
regarded as a constrained optimization problem, with constraints conditioning
both the Fourier spectrum and statistical features learned by CNNs. In contrast
with existing methods, the presented method inherits from previous CNN
approaches the ability to depict local structures and fine scale details, and
at the same time yields coherent large scale structures, even in the case of
quasi-periodic images. This is done at no extra computational cost. Synthesis
experiments on various images show a clear improvement compared to a recent
state-of-the-art method relying on CNN constraints only.
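A Fourier spectrum constraint of the kind used above can be enforced by a simple projection: keep the current image's phase but impose the exemplar's magnitude spectrum. In the constrained optimization this projection would alternate with the CNN-statistics updates; a 1-D numpy sketch of the projection step alone:

```python
import numpy as np

def project_spectrum(x, target_mag):
    """Project signal x onto the set of signals whose Fourier magnitude
    equals target_mag, keeping x's phase (the spectrum constraint)."""
    X = np.fft.fft(x)
    phase = np.exp(1j * np.angle(X))
    return np.real(np.fft.ifft(target_mag * phase))

rng = np.random.default_rng(3)
exemplar = rng.standard_normal(128)
target_mag = np.abs(np.fft.fft(exemplar))   # spectrum of the exemplar

# Start from noise; after projection the magnitude spectrum matches.
synth = project_spectrum(rng.standard_normal(128), target_mag)
assert np.allclose(np.abs(np.fft.fft(synth)), target_mag, atol=1e-8)
```

Matching the magnitude spectrum pins down the quasi-periodic, large-scale structure, while the CNN statistics supply the local detail.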