Age Progression and Regression with Spatial Attention Modules
Age progression and regression refer to aesthetically rendering a given
face image to present effects of face aging and rejuvenation, respectively.
Although numerous studies have been conducted on this topic, there are two
major problems: 1) multiple models are usually trained to simulate different
age mappings, and 2) the photo-realism of generated face images is heavily
influenced by the variation of training images in terms of pose, illumination,
and background. To address these issues, in this paper, we propose a framework
based on conditional Generative Adversarial Networks (cGANs) to achieve age
progression and regression simultaneously. Particularly, since face aging and
rejuvenation are largely different in terms of image translation patterns, we
model these two processes using two separate generators, each dedicated to one
age changing process. In addition, we exploit spatial attention mechanisms to
limit image modifications to regions closely related to age changes, so that
images with high visual fidelity could be synthesized for in-the-wild cases.
Experiments on multiple datasets demonstrate the ability of our model in
synthesizing lifelike face images at desired ages with personalized features
well preserved, and keeping age-irrelevant regions unchanged.
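
The abstract above does not state how the attention mask is applied. A common formulation, assumed here purely for illustration, blends the generator's output with the input image pixel by pixel, so that regions with a mask value near zero are left untouched. The function name `blend_with_attention` and the toy 2x2 values are hypothetical.

```python
def blend_with_attention(source, generated, mask):
    """Blend two images per pixel.

    Assumed rule: output = mask * generated + (1 - mask) * source,
    so mask = 1 takes the generated pixel and mask = 0 keeps the source.
    Images are plain nested lists of floats (rows of pixels).
    """
    return [
        [m * g + (1.0 - m) * s for s, g, m in zip(src_row, gen_row, mask_row)]
        for src_row, gen_row, mask_row in zip(source, generated, mask)
    ]


# Toy single-channel 2x2 example (values are made up).
source = [[0.2, 0.4], [0.6, 0.8]]
generated = [[1.0, 1.0], [0.0, 0.0]]
mask = [[1.0, 0.0], [0.5, 0.0]]

output = blend_with_attention(source, generated, mask)
```

With a mask of zero in the right-hand column, those pixels are copied from the source unchanged, which is the sense in which age-irrelevant regions are kept intact.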
Age Progression/Regression by Conditional Adversarial Autoencoder
"If I provide you a face image of mine (without telling you the actual age
when I took the picture) and a large amount of face images that I crawled
(containing labeled faces of different ages but not necessarily paired), can
you show me what I would look like when I am 80 or what I was like when I was
5?" The answer is probably a "No." Most existing face aging works attempt to
learn the transformation between age groups and thus would require the paired
samples as well as the labeled query image. In this paper, we look at the
problem from a generative modeling perspective such that no paired samples are
required. In addition, given an unlabeled image, the generative model can
directly produce the image with desired age attribute. We propose a conditional
adversarial autoencoder (CAAE) that learns a face manifold, traversing on which
smooth age progression and regression can be realized simultaneously. In CAAE,
the face is first mapped to a latent vector through a convolutional encoder,
and then the vector is projected to the face manifold conditional on age
through a deconvolutional generator. The latent vector preserves personalized
face features (i.e., personality) and the age condition controls progression
vs. regression. Two adversarial networks are imposed on the encoder and
generator, respectively, forcing them to generate more photo-realistic faces.
Experimental results demonstrate the appealing performance and flexibility of
the proposed framework by comparing with the state-of-the-art and ground truth.
Comment: Accepted by the IEEE Conference on Computer Vision and Pattern
Recognition (CVPR 2017).
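
The abstract above says the latent vector is projected to the face manifold "conditional on age" but does not say how the condition is supplied. A minimal sketch, assuming the age is encoded as a one-hot age-group label concatenated to the latent vector before it enters the generator: the helper names, the latent size of 3, and the 4 age groups are all hypothetical.

```python
def one_hot(index, size):
    """Return a one-hot list of length `size` with a 1.0 at `index`."""
    if not 0 <= index < size:
        raise ValueError("age group index out of range")
    return [1.0 if i == index else 0.0 for i in range(size)]


def generator_input(latent, age_group, num_age_groups):
    """Assumed conditioning: concatenate the latent vector with the age label."""
    return list(latent) + one_hot(age_group, num_age_groups)


# Hypothetical latent vector, standing in for the encoder's output for one face.
z = [0.3, -0.7, 0.1]
num_age_groups = 4

# Keep the identity (z) fixed and sweep the age label: one input per age group.
inputs = [generator_input(z, a, num_age_groups) for a in range(num_age_groups)]
```

Sweeping only the label while holding `z` fixed mirrors the abstract's description: the latent vector carries the personalized features, and the age condition alone selects progression or regression.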
Change blindness: eradication of gestalt strategies
Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al, 2003 Vision Research 43 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) there is further support for the argument that objects may be stored in and retrieved from a pre-attentional store during this task.
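
The reported statistic can be checked by hand. An F statistic with 1 and 4 degrees of freedom is the square of a t statistic with 4 degrees of freedom, and the t distribution with 4 degrees of freedom has a closed-form cumulative distribution function. The sketch below uses only that identity; the function names are illustrative and are not taken from the study.

```python
import math


def t4_cdf(t):
    """CDF of Student's t distribution with 4 degrees of freedom (closed form)."""
    u = t / math.sqrt(1.0 + t * t / 4.0)
    return 0.5 + 0.375 * u * (1.0 - u * u / 12.0)


def p_value_f_1_4(f_stat):
    """Upper-tail p-value for F(1, 4), via the two-tailed t(4) test."""
    t = math.sqrt(f_stat)
    return 2.0 * (1.0 - t4_cdf(t))


p = p_value_f_1_4(2.565)
print(round(p, 3))  # approximately 0.185, matching the reported p-value
```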
Conditional Image Synthesis by Generative Adversarial Modeling
In recent years, image synthesis has attracted growing interest. This work explores the recovery of details (low-level information) from high-level features. Generative adversarial nets (GANs) have led to an explosion of image synthesis work. Moving away from application-oriented alternatives, this work investigates the intrinsic drawbacks of GANs and derives corresponding improvements in a theoretical manner. Based on GAN, this work further investigates conditional image synthesis by incorporating an autoencoder (AE) into the GAN. The GAN+AE structure has been demonstrated to be an effective framework for image manipulation. This work emphasizes the effectiveness of the GAN+AE structure by proposing the conditional adversarial autoencoder (CAAE) for human facial age progression and regression. Instead of editing on the image level, i.e., explicitly changing the shape of the face, adding wrinkles, etc., this work edits the high-level features, which implicitly guide the recovery of images towards the expected appearance. While GAN+AE is prevalent in image manipulation, its drawbacks remain underexplored. For example, GAN+AE requires a weight to balance the effects of GAN and AE. An inappropriate weight would generate unstable results. This work provides insight into such instability, which is due to the interaction between GAN and AE. Therefore, this work proposes decoupled learning (GAN//AE) to avoid the interaction between them and achieve a robust and effective framework for image synthesis. Most existing works that use the GAN+AE structure could be easily adapted to the proposed GAN//AE structure to boost their robustness. Experimental results demonstrate the correctness of the provided derivation and the effectiveness of the proposed methods. In addition, this work extends conditional image synthesis to the traditional area of image super-resolution, which recovers the high-resolution image according to the low-resolution counterpart.
Diverging from this traditional routine, this work explores a new research direction, reference-conditioned super-resolution, in which a reference image containing desired high-resolution texture details is used in addition to the low-resolution image. We focus on transferring the high-resolution texture from reference images to the super-resolution process without the constraint of content similarity between reference and target images, which is a key difference from previous example-based methods.
Video based reconstruction system for mixed reality environments supporting contextualised non-verbal communication and its study
This thesis presents a system to capture, reconstruct and render the three-dimensional form of people and objects of interest in such detail that the spatial and visual aspects of non-verbal behaviour can be communicated. The system supports live distribution and simultaneous rendering in multiple locations, enabling the apparent teleportation of people and objects. Additionally, the system allows for the recording of live sessions and their playback in natural time with free-viewpoint. It utilises components of a video based reconstruction and a distributed video implementation to create an end-to-end system that can operate in real-time and on commodity hardware. The research addresses the specific challenges of spatial and colour calibration, segmentation and overall system architecture in order to overcome technical barriers and the requirement for domain-specific knowledge to set up the system and generate avatars of a consistently high quality. Applications of the system include, but are not limited to, telepresence, where the computer generated avatars used in Immersive Collaborative Virtual Environments can be replaced with ones that are faithful to the people they represent, and supporting researchers in their study of human communication such as gaze, inter-personal distance and facial expression. The system has been adopted in other research projects and is integrated with a mixed reality application where, during a live linkup, a three-dimensional avatar is streamed to multiple end-points across different countries.