Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives
Over the past few years, adversarial training has become an extremely active
research topic and has been successfully applied to various Artificial
Intelligence (AI) domains. Because it is potentially crucial for the
development of the next generation of emotional AI systems, we herein provide a
comprehensive overview of the application of adversarial training to affective
computing and sentiment analysis. Various representative adversarial training
algorithms are explained and discussed accordingly, aimed at tackling diverse
challenges associated with emotional AI systems. Further, we highlight a range
of potential future research directions. We expect that this overview will help
facilitate the development of adversarial training for affective computing and
sentiment analysis in both the academic and industrial communities.
Controllable Multi-domain Semantic Artwork Synthesis
We present a novel framework for multi-domain synthesis of artwork from
semantic layouts. One of the main limitations of this challenging task is the
lack of publicly available segmentation datasets for art synthesis. To address
this problem, we propose a dataset, which we call ArtSem, that contains 40,000
images of artwork from 4 different domains with their corresponding semantic
label maps. We generate the dataset by first extracting semantic maps from
landscape photography and then applying our proposed conditional Generative Adversarial
Network (GAN)-based approach, which generates high-quality artwork from the semantic
maps without requiring paired training data. Furthermore, we propose an
artwork synthesis model that uses domain-dependent variational encoders for
high-quality multi-domain synthesis. The model is improved and complemented
with a simple but effective normalization method that jointly normalizes the
semantic and style information, which we call Spatially STyle-Adaptive
Normalization (SSTAN). In contrast to previous methods that only take semantic
layout as input, our model is able to learn a joint representation of both
style and semantic information, which leads to better generation quality for
synthesizing artistic images. Results indicate that our model learns to
separate the domains in the latent space, and thus, by identifying the
hyperplanes that separate the different domains, we can also perform
fine-grained control of the synthesized artwork. By combining our proposed
dataset and approach, we are able to generate user-controllable artwork that is
of higher quality than existing
Comment: 15 pages, accepted by CVMJ, to appear
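The abstract above mentions controlling the synthesized artwork through hyperplanes that separate domains in the latent space, but it does not say how a latent code is edited. One common way to use such a hyperplane, assumed here purely for illustration, is to move the code along the hyperplane's unit normal. The function names `shift_along_normal` and `signed_distance`, the example vectors, and the step size `alpha` are all hypothetical and are not taken from the paper.

```python
import math


def signed_distance(z, normal, bias=0.0):
    """Signed distance from latent code z to the hyperplane normal . z + bias = 0."""
    norm = math.sqrt(sum(n * n for n in normal))
    return (sum(zi * ni for zi, ni in zip(z, normal)) + bias) / norm


def shift_along_normal(z, normal, alpha):
    """Move z by alpha units along the unit normal of the hyperplane.

    Assumption: editing along the normal is one plausible control scheme;
    the source abstract does not specify the actual procedure.
    """
    norm = math.sqrt(sum(n * n for n in normal))
    return [zi + alpha * ni / norm for zi, ni in zip(z, normal)]


# Hypothetical 3-dimensional latent code and hyperplane normal.
z = [0.5, -1.0, 2.0]
normal = [3.0, 0.0, 4.0]

z_shifted = shift_along_normal(z, normal, alpha=1.5)
print(signed_distance(z, normal), signed_distance(z_shifted, normal))
```

Under this assumption, the signed distance to the hyperplane grows by exactly `alpha`, so the size of the step gives a continuous knob for how strongly the code leans toward one side of the boundary.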
Painterly Image Harmonization in Dual Domains
Image harmonization aims to produce visually harmonious composite images by
adjusting the foreground appearance to be compatible with the background. When
the composite image has a photographic foreground and a painterly background, the
task is called painterly image harmonization. There are only a few works on this
task, which are either time-consuming or weak in generating well-harmonized
results. In this work, we propose a novel painterly harmonization network
consisting of a dual-domain generator and a dual-domain discriminator, which
harmonizes the composite image in both the spatial domain and the frequency domain. The
dual-domain generator performs harmonization by using AdaIN modules in the
spatial domain and our proposed ResFFT modules in the frequency domain. The
dual-domain discriminator attempts to distinguish the inharmonious patches
based on the spatial feature and frequency feature of each patch, which can
enhance the ability of the generator in an adversarial manner. Extensive
experiments on the benchmark dataset show the effectiveness of our method. Our
code and model are available at
https://github.com/bcmi/PHDNet-Painterly-Image-Harmonization.
Comment: Accepted by AAAI202
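The abstract above names AdaIN modules as the spatial-domain component of the generator. As a rough illustration of what adaptive instance normalization does in general (not of the paper's network), the sketch below re-scales one channel of content activations so that its mean and standard deviation match those of a style channel. The function name `adain`, the `eps` constant, and the toy `foreground` and `background` values are assumptions made for this example.

```python
from statistics import fmean, pstdev


def adain(content, style, eps=1e-5):
    """Adaptive instance normalization for a single feature channel.

    content, style: flat lists of activations from one channel.
    Returns the content activations shifted and scaled to the style's
    mean and (population) standard deviation.
    """
    mu_c, mu_s = fmean(content), fmean(style)
    sigma_c = pstdev(content) + eps  # eps guards against division by zero
    sigma_s = pstdev(style)
    return [sigma_s * (x - mu_c) / sigma_c + mu_s for x in content]


# Toy activations standing in for a photographic foreground channel
# and a painterly background channel (hypothetical values).
foreground = [0.0, 1.0, 2.0, 3.0]
background = [10.0, 10.5, 11.0, 11.5, 12.0]

harmonized = adain(foreground, background)
print(fmean(harmonized), pstdev(harmonized))
```

In a real network the same statistics would be computed per channel over the spatial dimensions of a feature map; the flat-list version here only shows the arithmetic.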
Sparse Recovery and Representation Learning
This dissertation focuses on sparse representation and dictionary learning, with three related topics. First, in chapter 1, we study the problem of low-rank matrix recovery in the presence of prior information. We first study a necessary and sufficient condition, called the Null Space Property, for exact recovery of low-rank matrices from compressively sampled measurements using nuclear norm minimization. Here, we provide an alternative theoretical analysis of the bound on the number of random Gaussian measurements needed for the condition to be satisfied with high probability. We then study low-rank matrix recovery when prior information is available. We analyze an existing algorithm, provide the necessary and sufficient conditions for exact recovery, and show that the existing algorithm is limited in certain cases. We provide an alternative recovery algorithm that addresses this drawback and derive sufficient recovery conditions for it. In chapter 2, we study the problem of learning a sparsifying dictionary for a set of data, focusing on dictionaries that admit fast transforms. Inspired by the Fast Fourier Transform, we propose a learning algorithm involving unknown parameters for a linear transformation matrix. Empirically, our algorithm can produce dictionaries that provide lower numerical sparsity for the sparse representation of images than the Discrete Fourier Transform (DFT). Additionally, due to its structure, the learned dictionary can recover the original signal from the sparse representation in computations. In chapter 3, we study the representation learning problem in a more complex setting. We use the concept of dictionary learning and apply it in a deep generative model.
Motivated by an application in the computer gaming industry, where designers need an urban layout generation tool that allows fast generation and modification, we present a novel solution that synthesizes high-quality building placements using conditional generative latent optimization together with adversarial training. The capability of the proposed method is demonstrated in various examples. Inference runs in nearly real time, so it can help designers iterate quickly on their designs of virtual cities.
Creative Painting with Latent Diffusion Models
Artistic painting has achieved significant progress in recent years.
Using an autoencoder to connect the original images with compressed latent
spaces and a cross attention enhanced U-Net as the backbone of diffusion,
latent diffusion models (LDMs) have achieved stable and high-fidelity image
generation. In this paper, we focus on enhancing the creative painting ability
of current LDMs in two directions: textual condition extension and model
retraining with the Wikiart dataset. Through textual condition extension, users'
input prompts are expanded with rich contextual knowledge for deeper
understanding and explanation of the prompts. The Wikiart dataset contains 80K famous
artworks created over the past 400 years by more than 1,000 famous artists in
rich styles and genres. Through the retraining, we are able to ask these
artists to draw novel and creative paintings on modern topics. Direct
comparisons with the original model show that the creativity and artistry are
enriched.
Comment: 17 pages, 12 figures
Deep learning approaches to pattern extraction and recognition in paintings and drawings: an overview
This paper provides an overview of some of the most relevant deep learning approaches to pattern extraction and recognition in visual arts, particularly painting and drawing. Recent advances in deep learning and computer vision, coupled with the growing availability of large digitized visual art collections, have opened new opportunities for computer science researchers to assist the art community with automatic tools to analyse and further understand visual arts. Among other benefits, a deeper understanding of visual arts has the potential to make them more accessible to a wider population, ultimately supporting the spread of culture.