A Novel BiLevel Paradigm for Image-to-Image Translation
Image-to-image (I2I) translation is a pixel-level mapping that requires large
amounts of paired training data and often suffers from high diversity and
strong category bias across image scenes. To tackle these problems, we propose
a novel BiLevel (BiL) learning paradigm that alternates the learning of two
models at an instance-specific (IS) and a general-purpose (GP) level,
respectively. In each scene, the IS model learns to preserve the scene-specific
attributes. It is initialized by the GP model, which learns generalizable
translation knowledge from all scenes. This GP initialization gives the IS
model an efficient starting point, enabling fast adaptation to a new scene
with scarce training data. We conduct extensive I2I translation experiments on
human face and street view datasets. Quantitative results validate that our
approach significantly boosts the performance of classical I2I translation
models such as PG2 and Pix2Pix. Our visualization results show both higher
image quality and more appropriate instance-specific details; e.g., the
translated image of a person better preserves that person's identity.
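The abstract describes a two-level training scheme: a GP model accumulates translation knowledge across all scenes and serves as the initialization from which a per-scene IS model adapts with few examples. The sketch below illustrates one plausible form of that alternation in PyTorch; the model class, loss, loaders, and hyperparameters are illustrative assumptions, not the paper's exact procedure.

```python
# Minimal sketch of a BiLevel-style alternation, assuming a generic PyTorch
# translator module and per-scene loaders yielding (source, target) pairs.
import copy
import torch
import torch.nn.functional as F

def train_bilevel(gp_model, scene_loaders, gp_lr=1e-4, is_lr=1e-3, is_steps=20):
    gp_opt = torch.optim.Adam(gp_model.parameters(), lr=gp_lr)
    for scene_loader in scene_loaders:
        # General-purpose (GP) level: accumulate translation knowledge
        # shared across all scenes.
        for src, tgt in scene_loader:
            gp_opt.zero_grad()
            F.l1_loss(gp_model(src), tgt).backward()
            gp_opt.step()

        # Instance-specific (IS) level: clone the current GP weights as an
        # efficient starting point, then adapt to this scene's scarce data.
        is_model = copy.deepcopy(gp_model)
        is_opt = torch.optim.Adam(is_model.parameters(), lr=is_lr)
        for _ in range(is_steps):
            for src, tgt in scene_loader:
                is_opt.zero_grad()
                F.l1_loss(is_model(src), tgt).backward()
                is_opt.step()
        yield is_model  # scene-adapted translator for this scene
```

The key design point mirrored here is that the IS model never trains from scratch: it inherits the GP initialization, which is what makes fast adaptation with scarce paired data feasible.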
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up for a
possible solution accordingly
TET-GAN: Text Effects Transfer via Stylization and Destylization
Text effects transfer automatically makes text dramatically more impressive.
However, previous style transfer methods either model general styles, and thus
cannot handle the highly structured text effects that follow the glyph, or
require the manual design of subtle matching criteria for text effects. In
this paper, we exploit the powerful representation ability of deep neural
features for text effects transfer. For this purpose,
we propose a novel Texture Effects Transfer GAN (TET-GAN), which consists of a
stylization subnetwork and a destylization subnetwork. The key idea is to train
our network to accomplish both the objective of style transfer and style
removal, so that it can learn to disentangle and recombine the content and
style features of text effects images. To support the training of our network,
we propose a new text effects dataset with as many as 64 professionally
designed styles on 837 characters. We show that the disentangled feature
representations enable us to transfer or remove all these styles on arbitrary
glyphs using one network. Furthermore, the flexible network design empowers
TET-GAN to efficiently extend to a new text style via one-shot learning where
only one example is required. We demonstrate the superiority of the proposed
method in generating high-quality stylized text over the state-of-the-art
methods.
Comment: Accepted by AAAI 2019. Code and dataset will be available at http://www.icst.pku.edu.cn/struct/Projects/TETGAN.htm
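The core training idea in this abstract is the dual objective: a stylization subnetwork adds a text effect while a destylization subnetwork removes it, which forces the network to disentangle content from style. The sketch below shows one way such a joint reconstruction objective could look in PyTorch; the module names, batch fields, and losses are assumed placeholders (adversarial losses omitted for brevity), not the authors' released code.

```python
# Hedged sketch of a TET-GAN-style dual objective: learn both stylization
# (plain glyph + style example -> styled text) and destylization
# (styled text -> plain glyph) with a shared optimizer step.
import torch
import torch.nn.functional as F

def dual_objective_step(stylizer, destylizer, opt, batch):
    plain = batch["plain"]          # plain glyph image
    styled = batch["styled"]        # same glyph with the target text effect
    style_ref = batch["style_ref"]  # reference example of that text effect

    opt.zero_grad()
    # Stylization objective: reproduce the styled image from the plain
    # glyph and a reference example of the target text effect.
    style_loss = F.l1_loss(stylizer(plain, style_ref), styled)
    # Destylization objective: recover the plain glyph, forcing the
    # network to separate glyph content from text-effect style.
    destyle_loss = F.l1_loss(destylizer(styled), plain)
    (style_loss + destyle_loss).backward()
    opt.step()
    return style_loss.item(), destyle_loss.item()
```

Because both directions share the learned content/style representations, a new style can plausibly be attached with very few examples, which is consistent with the one-shot extension the abstract claims.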