Recovering Faces from Portraits with Auxiliary Facial Attributes
Recovering a photorealistic face from an artistic portrait is a challenging
task since crucial facial details are often distorted or completely lost in
artistic compositions. To handle this loss, we propose an Attribute-guided Face
Recovery from Portraits (AFRP) that utilizes a Face Recovery Network (FRN) and
a Discriminative Network (DN). FRN consists of an autoencoder with residual
block-embedded skip-connections and incorporates facial attribute vectors into
the feature maps of input portraits at the bottleneck of the autoencoder. DN
has multiple convolutional and fully-connected layers, and its role is to
enforce FRN to generate authentic face images with corresponding facial
attributes dictated by the input attribute vectors. For the preservation of
identities, we impose the recovered and ground-truth faces to share similar
visual features. Specifically, DN determines whether the recovered image looks
like a real face and checks if the facial attributes extracted from the
recovered image are consistent with given attributes. Our method can recover photorealistic
identity-preserving faces with desired attributes from unseen stylized
portraits, artistic paintings, and hand-drawn sketches. On large-scale
synthesized and sketch datasets, we demonstrate that our face recovery method
achieves state-of-the-art results.
Comment: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV)
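The abstract says that attribute vectors are incorporated into the feature maps at the autoencoder bottleneck, but not how. One common way to do this, assumed here purely for illustration, is to tile each attribute value over the spatial grid and append it as an extra channel. The function name `embed_attributes` and the toy shapes are hypothetical and not taken from the paper.

```python
def embed_attributes(feature_maps, attributes):
    """Append one constant-valued channel per attribute to a C x H x W tensor.

    feature_maps: nested list of shape [C][H][W] (bottleneck features).
    attributes:   flat list of A attribute values (e.g. 0/1 flags).
    Returns a new nested list of shape [C + A][H][W].
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    # Tile each scalar attribute over the full spatial grid.
    attribute_channels = [
        [[float(value)] * width for _ in range(height)] for value in attributes
    ]
    # Channel-wise concatenation: copied feature channels first, attributes last.
    copied = [[row[:] for row in channel] for channel in feature_maps]
    return copied + attribute_channels


# Toy 3 x 2 x 2 bottleneck and two hypothetical binary attributes.
features = [[[0.5, 0.1], [0.2, 0.9]] for _ in range(3)]
fused = embed_attributes(features, [1, 0])
print(len(fused), len(fused[0]), len(fused[0][0]))
```

Concatenation leaves the image features untouched and lets later layers decide how to use the attribute channels; other fusion schemes would fit the abstract's description equally well.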
Identity-preserving Face Recovery from Portraits
Recovering the latent photorealistic faces from their artistic portraits aids
human perception and facial analysis. However, a recovery process that can
preserve identity is challenging because the fine details of real faces can be
distorted or lost in stylized images. In this paper, we present a new
Identity-preserving Face Recovery from Portraits (IFRP) to recover latent
photorealistic faces from unaligned stylized portraits. Our IFRP method
consists of two components: Style Removal Network (SRN) and Discriminative
Network (DN). The SRN is designed to transfer feature maps of stylized images
to the feature maps of the corresponding photorealistic faces. By embedding
spatial transformer networks into the SRN, our method can compensate for
misalignments of stylized faces automatically and output aligned realistic face
images. The role of the DN is to enforce recovered faces to be similar to
authentic faces. To ensure identity preservation, we encourage the recovered
and ground-truth faces to share similar visual features via a distance measure
that compares features of recovered and ground-truth faces extracted from a
pre-trained VGG network. We evaluate our method on a large-scale synthesized
dataset of real and stylized face pairs and attain state-of-the-art results. In
addition, our method can recover photorealistic faces from previously unseen
stylized portraits, original paintings, and human-drawn sketches.
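The identity-preserving term described above compares features of the recovered and ground-truth faces. A minimal sketch of such a feature distance follows. It assumes the features have already been extracted and flattened into lists of numbers, and it uses a mean squared difference, which is an assumption: the abstract does not state which distance is used. The name `feature_distance` is hypothetical.

```python
def feature_distance(recovered_features, target_features):
    """Mean squared difference between two equally sized feature vectors."""
    if len(recovered_features) != len(target_features):
        raise ValueError("feature vectors must have the same length")
    squared = [
        (r - t) ** 2 for r, t in zip(recovered_features, target_features)
    ]
    return sum(squared) / len(squared)


# Identical features give zero distance; differing features give a penalty.
print(feature_distance([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(feature_distance([0.0, 0.0], [3.0, 4.0]))
```

In a real pipeline the two inputs would come from the same layer of a pre-trained network, so that the penalty reflects perceptual rather than pixel-level differences.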
Face Hallucination via Deep Neural Networks.
We first address aligned low-resolution (LR) face images (i.e., 16×16 pixels) by designing a discriminative generative network, named URDGN. URDGN is composed of two networks: a generative model and a discriminative model.
We introduce a pixel-wise L2 regularization term to the generative model and exploit the feedback of the discriminative network to make the upsampled face images more similar to real ones.
We present an end-to-end transformative discriminative neural network (TDN) devised for super-resolving unaligned tiny face images. TDN embeds spatial transformation layers to enforce local receptive fields to line up with similar spatial supports. To upsample noisy unaligned LR face images, we propose decoder-encoder-decoder networks. A transformative discriminative decoder network is employed to upsample and denoise LR inputs simultaneously. Then we project the intermediate HR faces to aligned and noise-free LR faces by a transformative encoder network. Finally, high-quality hallucinated HR images are generated by our second decoder. Furthermore, we present an end-to-end multiscale transformative discriminative neural network (MTDN) to super-resolve unaligned LR face images of different resolutions in a unified framework.
We propose a method that explicitly incorporates structural information of faces into the face super-resolution process by using a multi-task convolutional neural network (CNN). Our method uses not only low-level information (i.e., intensity similarity) but also middle-level information (i.e., face structure) to further explore spatial constraints of facial components from LR input images.
We demonstrate that supplementing residual images or feature maps with additional facial attribute information can significantly reduce the ambiguity in face super-resolution. To explore this idea, we develop an attribute-embedded upsampling network. In this manner, our method is able to super-resolve LR faces by a large upscaling factor while reducing the uncertainty of one-to-many mappings remarkably.
We further push the boundaries of hallucinating a tiny, non-frontal face image to understand how much of this is possible by leveraging the availability of large datasets and deep networks. To this end, we introduce a novel Transformative Adversarial Neural Network (TANN) to jointly frontalize very LR out-of-plane rotated face images (including profile views) and aggressively super-resolve them by 8×, regardless of their original poses and without using any 3D information. Besides recovering an HR face image from an LR version, this thesis also addresses the task of restoring realistic faces from stylized portrait images, which can also be regarded as face hallucination.
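The URDGN description above combines a pixel-wise L2 term with feedback from the discriminative network. The sketch below shows one way such a combined generator objective could be computed. The weight `adv_weight`, the use of a negative log-probability for the adversarial term, and the function name are assumptions made for illustration; the abstract specifies none of them.

```python
import math


def generator_loss(upsampled, target, disc_score, adv_weight=0.01):
    """Pixel-wise L2 term plus a weighted adversarial term.

    upsampled, target: flat lists of pixel intensities of equal length.
    disc_score: discriminator's probability in (0, 1] that `upsampled` is real.
    adv_weight: hypothetical trade-off weight (not from the abstract).
    """
    if len(upsampled) != len(target):
        raise ValueError("images must have the same number of pixels")
    pixel_l2 = sum((u - t) ** 2 for u, t in zip(upsampled, target)) / len(target)
    # The adversarial term shrinks as the discriminator is fooled more strongly.
    adversarial = -math.log(disc_score)
    return pixel_l2 + adv_weight * adversarial


fake = [0.4, 0.6, 0.5, 0.5]
real = [0.5, 0.5, 0.5, 0.5]
print(generator_loss(fake, real, disc_score=0.9) < generator_loss(fake, real, disc_score=0.5))
```

The L2 term keeps the output close to the ground truth, while the adversarial term rewards outputs that the discriminator accepts as real faces.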
An Assyrian King in Turin
The famous portrait of Sargon II (donated by P.E. Botta to his hometown, along
with another portrait of a courtier) is once again on display at the Museum of
Antiquity in Turin (Italy), together with a few small fragments of other
lesser-known Assyrian wall reliefs from Khorsabad and Nineveh. In this paper we
present some considerations on Botta's two reliefs, considering in particular
the representational codes used by artists in the composition of the Assyrian
royal portraits.
Face Hallucination With Finishing Touches
Obtaining a high-quality frontal face image from a low-resolution (LR) non-frontal face image is important for many facial analysis applications. However, mainstream methods focus either on super-resolving near-frontal LR faces or on frontalizing non-frontal high-resolution (HR) faces. It is desirable to perform both tasks seamlessly for daily-life unconstrained face images. In this paper, we present a novel Vivid Face Hallucination Generative Adversarial Network (VividGAN) for simultaneously super-resolving and frontalizing tiny non-frontal face images. VividGAN consists of coarse-level and fine-level Face Hallucination Networks (FHnet) and two discriminators, i.e., Coarse-D and Fine-D. The coarse-level FHnet generates a frontal coarse HR face and then the fine-level FHnet makes use of the facial component appearance prior, i.e., fine-grained facial components, to attain a frontal HR face image with authentic details. In the fine-level FHnet, we also design a facial component-aware module that adopts the facial geometry guidance as clues to accurately align and merge the frontal coarse HR face and prior information. Meanwhile, two-level discriminators are designed to capture both the global outline of a face image and detailed facial characteristics. The Coarse-D enforces the coarsely hallucinated faces to be upright and complete while the Fine-D focuses on the fine hallucinated ones for sharper details. Extensive experiments demonstrate that our VividGAN achieves photo-realistic frontal HR faces, reaching superior performance in downstream tasks, i.e., face recognition and expression classification, compared with other state-of-the-art methods.
From rule-based to learning-based image-conditional image generation
Visual contents, such as movies, animations, computer games, videos and photos, are
massively produced and consumed nowadays. Most of these contents are a combination
of materials captured from the real world and contents synthesized by computers. Particularly,
computer-generated visual contents are increasingly indispensable in modern entertainment
and production. The generation of visual contents by computers is typically conditioned on
real-world materials, driven by the imagination of designers and artists, or a combination
of both. However, creating visual contents manually is both challenging and labor-intensive.
Therefore, enabling computers to automatically or semi-automatically synthesize
needed visual contents becomes essential. Among all these efforts, a stream of research
is to generate novel images based on given image priors, e.g., photos and sketches. This
research direction is known as image-conditional image generation, which covers a wide
range of topics such as image stylization, image completion, image fusion, sketch-to-image
generation, and extracting image label maps. In this thesis, a set of novel approaches for
image-conditional image generation are presented.
The thesis starts with an exemplar-based method for facial image stylization in Chapter
2. This method involves a unified framework for facial image stylization based on a single
style exemplar. A two-phase procedure is employed, where the first phase searches a dense
and semantic-aware correspondence between the input and the exemplar images, and the
second phase conducts edge-preserving texture transfer. While this algorithm has the merit
of requiring only a single exemplar, it is constrained to face photos. To perform generalized
image-to-image translation, Chapter 3 presents a data-driven and learning-based method.
Inspired by the dual learning paradigm designed for natural language translation [115], a
novel dual Generative Adversarial Network (DualGAN) mechanism is developed, which
enables image translators to be trained from two sets of unlabeled images from two domains.
This is followed by another data-driven method in Chapter 4, which learns multiscale
manifolds from a set of images and then enables synthesizing novel images that mimic
the appearance of the target image dataset. The method is named Branched Generative
Adversarial Network (BranchGAN) and employs a novel training method that enables unconditioned
generative adversarial networks (GANs) to learn image manifolds at multiple
scales. As a result, we can directly manipulate and even combine latent manifold codes
that are associated with specific feature scales. Finally, to provide users more control over
image generation results, Chapter 5 discusses an upgraded version of iGAN [126] (iGANHD)
that significantly improves the art of manipulating high-resolution images by
utilizing the multi-scale manifold learned with BranchGAN.
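The dual-learning idea behind DualGAN, as described above, trains two translators between unlabeled image domains. One way to read it, sketched here as an assumption rather than as the thesis's exact formulation, is that translating a sample to the other domain and back should reproduce the original. The toy translators and the mean absolute error are hypothetical stand-ins for learned networks and their training loss.

```python
def reconstruction_error(sample, forward, backward):
    """Mean absolute error between a sample and its round-trip translation."""
    round_trip = backward(forward(sample))
    return sum(abs(a - b) for a, b in zip(sample, round_trip)) / len(sample)


def to_domain_b(xs):
    """Toy stand-in for a learned A-to-B translator."""
    return [2.0 * x + 1.0 for x in xs]


def to_domain_a(ys):
    """Toy stand-in for a learned B-to-A translator (exact inverse)."""
    return [(y - 1.0) / 2.0 for y in ys]


def bad_to_domain_a(ys):
    """Toy stand-in for a poorly trained B-to-A translator."""
    return [y / 2.0 for y in ys]


sample = [0.0, 1.0, 2.0]
print(reconstruction_error(sample, to_domain_b, to_domain_a))
print(reconstruction_error(sample, to_domain_b, bad_to_domain_a))
```

A consistent pair of translators yields zero round-trip error, so minimizing this quantity gives a training signal that needs no paired images.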