Style Transfer by Relaxed Optimal Transport and Self-Similarity
Style transfer algorithms strive to render the content of one image using the
style of another. We propose Style Transfer by Relaxed Optimal Transport and
Self-Similarity (STROTSS), a new optimization-based style transfer algorithm.
We extend our method to allow user-specified point-to-point or region-to-region
control over visual similarity between the style image and the output. Such
guidance can be used to either achieve a particular visual effect or correct
errors made by unconstrained style transfer. In order to quantitatively compare
our method to prior work, we conduct a large-scale user study designed to
assess the style-content tradeoff across settings in style transfer algorithms.
Our results indicate that for any desired level of content preservation, our
method provides higher quality stylization than prior work. Code is available
at https://github.com/nkolkin13/STROTSS
Comment: To Appear CVPR 2019, Webdemo Available at http://style.ttic.ed
Manifold Alignment for Semantically Aligned Style Transfer
Given a content image and a style image, the goal of style transfer is to
synthesize an output image by transferring the target style to the content
image. Currently, most of the methods address the problem with global style
transfer, assuming styles can be represented by global statistics, such as Gram
matrices or covariance matrices. In this paper, we make a different assumption
that local semantically aligned (or similar) regions between the content and
style images should share similar style patterns. Based on this assumption,
content features and style features are seen as two sets of manifolds and a
manifold alignment based style transfer (MAST) method is proposed. MAST is a
subspace learning method which learns a common subspace of the content and
style features. In the common subspace, content and style features with larger
feature similarity or the same semantic meaning are forced to be close. The
learned projection matrices are added with orthogonality constraints so that
the mapping can be bidirectional, which allows us to project the content
features into the common subspace, and then into the original style space. By
using a pre-trained decoder, promising stylized images are obtained. The method
is further extended to allow users to specify corresponding semantic regions
between content and style images or using semantic segmentation maps as
guidance. Extensive experiments show the proposed MAST achieves appealing
results in style transfer.
Comment: 10 page
Wasserstein Style Transfer
We propose Gaussian optimal transport for image style transfer in an
encoder/decoder framework. Optimal transport between Gaussian measures admits
closed-form Monge mappings from source to target distributions. Moreover,
interpolations between a content and a style image can be seen as geodesics in
the Wasserstein geometry. Using this insight, we show how to mix different
target styles using Wasserstein barycenters of Gaussian measures. Since
Gaussians are closed under Wasserstein barycenters, this allows simple style
transfer, style mixing, and interpolation. Moreover, we show how mixing
different styles can be achieved using other geodesic metrics between
Gaussians, such as the Fisher-Rao metric, while the transport of the content to
the new interpolated style is still performed with Gaussian OT maps. Our simple
methodology allows us to generate new stylized content interpolating between
many artistic styles. The metric used in the interpolation results in different
stylizations.
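As an illustration of the closed-form Monge map between Gaussians mentioned in this abstract, the following minimal NumPy sketch fits a Gaussian to each feature set and applies the standard map T(x) = mu_s + A (x - mu_c), with A = cov_c^{-1/2} (cov_c^{1/2} cov_s cov_c^{1/2})^{1/2} cov_c^{-1/2}. The function names, the (n, d) feature layout, and the eps regularizer are assumptions made for this example, not the authors' code; in the paper the features would come from an encoder, which is omitted here.

```python
import numpy as np

def sqrtm_psd(mat):
    """Square root of a symmetric positive semi-definite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T

def gaussian_ot_map(content, style, eps=1e-5):
    """Transport content features (n, d) onto the Gaussian fitted to style features (m, d)."""
    mu_c, mu_s = content.mean(axis=0), style.mean(axis=0)
    d = content.shape[1]
    # eps * I is an assumed regularizer that keeps the content covariance invertible.
    cov_c = np.cov(content, rowvar=False) + eps * np.eye(d)
    cov_s = np.cov(style, rowvar=False) + eps * np.eye(d)
    root_c = sqrtm_psd(cov_c)
    inv_root_c = np.linalg.inv(root_c)
    # A = cov_c^{-1/2} (cov_c^{1/2} cov_s cov_c^{1/2})^{1/2} cov_c^{-1/2}
    A = inv_root_c @ sqrtm_psd(root_c @ cov_s @ root_c) @ inv_root_c
    return (content - mu_c) @ A.T + mu_s
```

After the map, the transported features have (up to the regularizer) the same mean and covariance as the style features, which is what makes the Gaussian case usable without any iterative solver.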
Arbitrary Style Transfer via Multi-Adaptation Network
Arbitrary style transfer is a significant topic with research value and
application prospects. A desirable style transfer, given a content image and a
reference style painting, would render the content image with the color tone
and vivid stroke patterns of the style painting while simultaneously
maintaining the detailed content structure information. Style transfer
approaches typically first learn content and style representations of the
content and style references and then generate the stylized images guided by
these representations. In this paper, we propose the multi-adaptation network which
involves two self-adaptation (SA) modules and one co-adaptation (CA) module:
the SA modules adaptively disentangle the content and style representations,
i.e., content SA module uses position-wise self-attention to enhance content
representation and style SA module uses channel-wise self-attention to enhance
style representation; the CA module rearranges the distribution of style
representation based on content representation distribution by calculating the
local similarity between the disentangled content and style features in a
non-local fashion. Moreover, a new disentanglement loss function enables our
network to extract main style patterns and exact content structures to adapt to
various input images, respectively. Various qualitative and quantitative
experiments demonstrate that the proposed multi-adaptation network leads to
better results than the state-of-the-art style transfer methods.
A Sliced Wasserstein Loss for Neural Texture Synthesis
We address the problem of computing a textural loss based on the statistics
extracted from the feature activations of a convolutional neural network
optimized for object recognition (e.g. VGG-19). The underlying mathematical
problem is the measure of the distance between two distributions in feature
space. The Gram-matrix loss is the ubiquitous approximation for this problem
but it is subject to several shortcomings. Our goal is to promote the Sliced
Wasserstein Distance as a replacement for it. It is theoretically proven,
practical, simple to implement, and achieves results that are visually
superior for texture synthesis by optimization or training generative neural
networks.
Comment: 9 pages, 13 figures, under revie
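The sliced Wasserstein distance promoted in this abstract can be sketched in a few lines: project both feature sets onto random unit directions, sort each 1-D projection, and compare the sorted values. The sketch below is a generic NumPy version under stated assumptions (equal-sized (n, d) feature sets, squared differences, a hypothetical `n_proj` parameter); it is not taken from the paper's implementation.

```python
import numpy as np

def sliced_wasserstein(feats_a, feats_b, n_proj=64, seed=0):
    """Squared sliced Wasserstein-2 distance between two (n, d) feature sets of equal size."""
    rng = np.random.default_rng(seed)
    d = feats_a.shape[1]
    # Random unit directions, one per column.
    dirs = rng.normal(size=(d, n_proj))
    dirs /= np.linalg.norm(dirs, axis=0, keepdims=True)
    # In 1-D, optimal transport between equal-sized samples pairs sorted values.
    proj_a = np.sort(feats_a @ dirs, axis=0)
    proj_b = np.sort(feats_b @ dirs, axis=0)
    return float(np.mean((proj_a - proj_b) ** 2))
```

Because only sorting is involved, the value does not depend on the order of the feature vectors, which matches the idea of comparing distributions rather than spatial layouts.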
Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer
Artistic style transfer aims at migrating the style from an example image to
a content image. Currently, optimization-based methods have achieved great
stylization quality, but expensive time cost restricts their practical
applications. Meanwhile, feed-forward methods still fail to synthesize complex
style, especially when holistic global and local patterns exist. Inspired by
the common painting process of drawing a draft and revising the details, we
introduce a novel feed-forward method named Laplacian Pyramid Network
(LapStyle). LapStyle first transfers global style patterns in low-resolution
via a Drafting Network. It then revises the local details in high-resolution
via a Revision Network, which hallucinates a residual image according to the
draft and the image textures extracted by Laplacian filtering. Higher
resolution details can be easily generated by stacking Revision Networks with
multiple Laplacian pyramid levels. The final stylized image is obtained by
aggregating outputs of all pyramid levels. We also introduce a patch
discriminator to better learn local patterns adversarially. Experiments
demonstrate that our method can synthesize high quality stylized images in real
time, where holistic style patterns are properly transferred.
Comment: Accepted by CVPR 2021. Codes will be released soon on
https://github.com/PaddlePaddle/PaddleGAN
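To make the pyramid idea in this abstract concrete, here is a minimal sketch of a Laplacian pyramid: each level stores the residual between an image and an upsampled coarser copy, and summing the levels back up recovers the image. The 2x2 average pooling, nearest-neighbour upsampling, and all function names are assumptions chosen for brevity; LapStyle's Drafting and Revision Networks are learned models and are not reproduced here.

```python
import numpy as np

def downsample(img):
    """Halve resolution by 2x2 average pooling (height and width must be even)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double resolution by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_laplacian_pyramid(img, levels):
    """Return [residual_0, ..., residual_{levels-1}, coarse], finest level first."""
    pyramid = []
    current = img
    for _ in range(levels):
        coarse = downsample(current)
        pyramid.append(current - upsample(coarse))  # high-frequency residual
        current = coarse
    pyramid.append(current)  # low-resolution base image
    return pyramid

def reconstruct(pyramid):
    """Aggregate all levels, coarse to fine, back into a full-resolution image."""
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        current = upsample(current) + residual
    return current
```

In this picture, a drafting step would operate on the coarse base and a revision step would predict the residuals, with the final image obtained by the same coarse-to-fine aggregation.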
Interactive Neural Style Transfer with Artists
We present interactive painting processes in which a painter and various
neural style transfer algorithms interact on a real canvas. Understanding what
these algorithms' outputs achieve is then paramount to describe the creative
agency in our interactive experiments. We gather a set of paired
painting-pictures images and present a new evaluation methodology based on the
predictivity of neural style transfer algorithms. We point out some algorithms'
instabilities and show that they can be used to enlarge the diversity and
pleasing oddity of the images synthesized by the numerous existing neural style
transfer algorithms. This diversity of images was perceived as a source of
inspiration for human painters, portraying the machine as a computational
catalyst.
In the light of feature distributions: moment matching for Neural Style Transfer
Style transfer aims to render the content of a given image in the
graphical/artistic style of another image. The fundamental concept underlying
Neural Style Transfer (NST) is to interpret style as a distribution in the
feature space of a Convolutional Neural Network, such that a desired style can
be achieved by matching its feature distribution. We show that most current
implementations of that concept have important theoretical and practical
limitations, as they only partially align the feature distributions. We propose
a novel approach that matches the distributions more precisely, thus
reproducing the desired style more faithfully, while still being
computationally efficient. Specifically, we adapt the dual form of Central
Moment Discrepancy (CMD), as recently proposed for domain adaptation, to
minimize the difference between the target style and the feature distribution
of the output image. The dual interpretation of this metric explicitly matches
all higher-order centralized moments and is therefore a natural extension of
existing NST methods that only take into account the first and second moments.
Our experiments confirm that the strong theoretical properties also translate
to visually better style transfer, and better disentangle style from semantic
image content.
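The contrast drawn in this abstract, between matching only the first two moments and matching higher-order central moments, can be illustrated with a small NumPy sketch. Note the assumptions: this is a plain empirical (primal) moment comparison, not the dual form of CMD that the paper adapts, it omits the interval scaling factors of the original CMD definition, and the function names and `max_order` parameter are hypothetical.

```python
import numpy as np

def central_moments(feats, max_order):
    """Per-channel mean, then central moments of order 2..max_order, for (n, d) features."""
    mean = feats.mean(axis=0)
    centered = feats - mean
    return [mean] + [(centered ** k).mean(axis=0) for k in range(2, max_order + 1)]

def moment_discrepancy(feats_a, feats_b, max_order=5):
    """Sum of L2 gaps between same-order moments of two feature sets."""
    moments_a = central_moments(feats_a, max_order)
    moments_b = central_moments(feats_b, max_order)
    return float(sum(np.linalg.norm(ma - mb) for ma, mb in zip(moments_a, moments_b)))
```

Two feature sets that share mean and variance but differ in skewness give a near-zero value with `max_order=2` and a clearly positive one with `max_order=3`, which is the kind of mismatch that first- and second-moment methods cannot see.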
Frequency Domain Image Translation: More Photo-realistic, Better Identity-preserving
Image-to-image translation aims at translating a particular style of an image
to another. The synthesized images can be more photo-realistic and
identity-preserving by decomposing the image into content and style in a
disentangled manner. While existing models focus on designing specialized
network architecture to separate the two components, this paper investigates
how to explicitly constrain the content and style statistics of images. We
achieve this goal by transforming the input image into high frequency and low
frequency information, which correspond to the content and style, respectively.
We regulate the frequency distribution from two aspects: a) a spatial level
restriction to locally restrict the frequency distribution of images; b) a
spectral level regulation to enhance the global consistency among images. On
multiple datasets we show that the proposed approach consistently leads to
significant improvements on top of various state-of-the-art image translation
models.
Comment: 13 page
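The split into low-frequency and high-frequency information described in this abstract can be sketched with a Fourier-domain mask. The ideal circular low-pass filter, the `cutoff` parameter (in cycles per pixel), and the function name are assumptions for illustration only; the abstract does not say which filter the authors use.

```python
import numpy as np

def split_frequencies(img, cutoff):
    """Split a 2-D image into low- and high-frequency parts with an ideal circular low-pass mask."""
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies, cycles per pixel
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies, cycles per pixel
    mask = np.sqrt(fx ** 2 + fy ** 2) <= cutoff
    spectrum = np.fft.fft2(img)
    low = np.fft.ifft2(spectrum * mask).real
    high = img - low  # everything the low-pass filter removed
    return low, high
```

By construction `low + high` reproduces the input, so constraints can be placed on each part separately (the abstract associates the high-frequency part with content and the low-frequency part with style).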
Unbalanced Feature Transport for Exemplar-based Image Translation
Despite the great success of GANs in image translation with different
conditioned inputs such as semantic segmentation and edge maps, generating
high-fidelity realistic images with reference styles remains a grand challenge
in conditional image-to-image translation. This paper presents a general image
translation framework that incorporates optimal transport for feature alignment
between conditional inputs and style exemplars in image translation. The
introduction of optimal transport mitigates the constraint of many-to-one
feature matching significantly while building up accurate semantic
correspondences between conditional inputs and exemplars. We design a novel
unbalanced optimal transport to address the transport between features with
deviational distributions which exist widely between conditional inputs and
exemplars. In addition, we design a semantic-activation normalization scheme
that injects style features of exemplars into the image translation process
successfully. Extensive experiments over multiple image translation tasks show
that our method achieves superior image translation qualitatively and
quantitatively as compared with the state-of-the-art.
Comment: Accepted to CVPR 202