SEAN: Image Synthesis with Semantic Region-Adaptive Normalization
We propose semantic region-adaptive normalization (SEAN), a simple but
effective building block for Generative Adversarial Networks conditioned on
segmentation masks that describe the semantic regions in the desired output
image. Using SEAN normalization, we can build a network architecture that can
control the style of each semantic region individually, e.g., we can specify
one style reference image per region. SEAN is better suited to encode,
transfer, and synthesize style than the best previous method in terms of
reconstruction quality, variability, and visual quality. We evaluate SEAN on
multiple datasets and report better quantitative metrics (e.g. FID, PSNR) than
the current state of the art. SEAN also pushes the frontier of interactive
image editing. We can interactively edit images by changing segmentation masks
or the style for any given region. We can also interpolate styles from two
reference images per region.
Comment: Accepted as a CVPR 2020 oral paper. The interactive demo is available
at https://youtu.be/0Vbj9xFgoU
Interpreting Adversarially Trained Convolutional Neural Networks
We attempt to interpret how adversarially trained convolutional neural
networks (AT-CNNs) recognize objects. We design systematic approaches to
interpret AT-CNNs in both qualitative and quantitative ways and compare them
with normally trained models. Surprisingly, we find that adversarial training
alleviates the texture bias of standard CNNs when trained on object recognition
tasks, and helps CNNs learn a more shape-biased representation. We validate our
hypothesis from two aspects. First, we compare the salience maps of AT-CNNs and
standard CNNs on clean images and images under different transformations. The
comparison shows visually that the predictions of the two types of CNNs are
sensitive to dramatically different types of features. Second, to achieve
quantitative verification, we construct additional test datasets that destroy
either textures or shapes, such as style-transferred versions of clean data,
saturated images and patch-shuffled ones, and then evaluate the classification
accuracy of AT-CNNs and normal CNNs on these datasets. Our findings shed some
light on why AT-CNNs are more robust than normally trained ones and
contribute to a better understanding of adversarial training over CNNs from an
interpretation perspective.
Comment: To appear in ICML1
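The abstract above mentions patch-shuffled images as one way to destroy shape while keeping local texture. A minimal sketch of such a transform follows, assuming the image is a 2D list of pixel values and the patch grid is k x k; the function name `patch_shuffle` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def patch_shuffle(image, k, seed=None):
    """Split a 2D image (list of rows) into a k x k grid of patches and permute them.

    Illustrative only: the grid size and the sampling scheme are assumptions.
    """
    h, w = len(image), len(image[0])
    if h % k or w % k:
        raise ValueError("image height and width must be divisible by k")
    ph, pw = h // k, w // k

    # Cut out the patches in row-major grid order.
    patches = [
        [row[c * pw:(c + 1) * pw] for row in image[r * ph:(r + 1) * ph]]
        for r in range(k)
        for c in range(k)
    ]

    # Permute whole patches; pixels inside each patch keep their layout.
    random.Random(seed).shuffle(patches)

    # Reassemble the permuted patches into a full image.
    out = []
    for r in range(k):
        for y in range(ph):
            row = []
            for c in range(k):
                row.extend(patches[r * k + c][y])
            out.append(row)
    return out
```

Because only the patch positions change, local texture statistics are preserved while the global object outline is broken up, which is the property the abstract relies on.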
SurReal: enhancing Surgical simulation Realism using style transfer
Surgical simulation is an increasingly important element of surgical
education. Using simulation can be a means to address some of the significant
challenges in developing surgical skills with limited time and resources. The
photo-realistic fidelity of simulations is a key feature that can improve the
experience and transfer ratio of trainees. In this paper, we demonstrate how we
can enhance the visual fidelity of existing surgical simulation by performing
style transfer of multi-class labels from real surgical video onto synthetic
content. We demonstrate our approach on simulations of cataract surgery using
real data labels from an existing public dataset. Our results highlight the
feasibility of the approach and also the powerful possibility to extend this
technique to incorporate additional temporal constraints and to different
applications.
Unsupervised Learning of Artistic Styles with Archetypal Style Analysis
In this paper, we introduce an unsupervised learning approach to
automatically discover, summarize, and manipulate artistic styles from large
collections of paintings. Our method is based on archetypal analysis, which is
an unsupervised learning technique akin to sparse coding with a geometric
interpretation. When applied to deep image representations from a collection of
artworks, it learns a dictionary of archetypal styles, which can be easily
visualized. After training the model, the style of a new image, which is
characterized by local statistics of deep visual features, is approximated by a
sparse convex combination of archetypes. This enables us to interpret which
archetypal styles are present in the input image, and in which proportion.
Finally, our approach allows us to manipulate the coefficients of the latent
archetypal decomposition, and achieve various special effects such as style
enhancement, transfer, and interpolation between multiple archetypes.
Comment: Accepted at NIPS 2018, Montréal, Canad
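The abstract above describes approximating a style by a convex combination of archetypes. A minimal sketch of that step follows, assuming the style and each archetype are plain feature vectors. The solver here is projected gradient descent onto the probability simplex, which is an assumption for illustration and not necessarily the optimizer used in the paper; the names `project_to_simplex` and `fit_archetype_weights` are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {a : a_i >= 0, sum(a) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t  # keep the threshold from the largest valid index
    return [max(x - theta, 0.0) for x in v]


def fit_archetype_weights(style, archetypes, steps=2000):
    """Minimise ||style - sum_j a_j * archetypes[j]||^2 with a on the simplex."""
    k, d = len(archetypes), len(style)
    # Step size 1 / (2 * squared Frobenius norm) is a safe bound on 1 / L.
    frob_sq = sum(x * x for z in archetypes for x in z)
    lr = 1.0 / (2.0 * frob_sq) if frob_sq > 0 else 1.0
    a = [1.0 / k] * k  # start from the uniform mixture
    for _ in range(steps):
        recon = [sum(a[j] * archetypes[j][i] for j in range(k)) for i in range(d)]
        resid = [recon[i] - style[i] for i in range(d)]
        grad = [2.0 * sum(resid[i] * archetypes[j][i] for i in range(d))
                for j in range(k)]
        a = project_to_simplex([a[j] - lr * grad[j] for j in range(k)])
    return a
```

The simplex constraint tends to drive some weights to exactly zero, which is one way the sparsity mentioned in the abstract can arise. Editing the returned weights and re-mixing the archetypes corresponds to the manipulation of coefficients that the abstract describes.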