Style and Pose Control for Image Synthesis of Humans from a Single Monocular View
Photo-realistic re-rendering of a human from a single image with explicit control over body pose, shape and appearance enables a wide range of applications, such as human appearance transfer, virtual try-on, motion imitation, and novel view synthesis. While significant progress has been made in this direction using learning-based image generation tools such as GANs, existing approaches yield noticeable artefacts: blurring of fine details, unrealistic distortions of body parts and garments, and severe changes of the textures. We therefore propose StylePoseGAN, a new method for synthesising photo-realistic human images with explicit control over pose and part-based appearance, in which we extend a non-controllable generator to accept separate conditioning on pose and appearance. Our network can be trained in a fully supervised way with human images to disentangle pose, appearance and body parts, and it significantly outperforms existing single-image re-rendering methods. Our disentangled representation opens up further applications such as garment transfer, motion transfer, virtual try-on, head (identity) swap and appearance interpolation. StylePoseGAN achieves state-of-the-art image generation fidelity on common perceptual metrics compared with the current best-performing methods and performs convincingly in a comprehensive user study.
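As a rough illustration of this kind of separate conditioning, the sketch below (PyTorch) feeds a spatial pose encoding and a global appearance style vector into a generator, with the style vector modulating the pose features AdaIN-style. This is not the authors' StylePoseGAN code; all module names, shapes and hyperparameters are assumptions made for illustration.

```python
# Minimal sketch of pose/appearance-conditioned generation. Illustrative
# only; names and shapes are assumptions, not the StylePoseGAN implementation.
import torch
import torch.nn as nn

class PoseEncoder(nn.Module):
    """Encodes a dense pose map (e.g. a DensePose-style IUV image) to features."""
    def __init__(self, in_ch=3, feat_ch=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, pose_map):
        return self.net(pose_map)

class AppearanceEncoder(nn.Module):
    """Encodes a (part-aligned) texture image into a global style vector."""
    def __init__(self, in_ch=3, style_dim=512):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(128, style_dim)

    def forward(self, tex):
        return self.fc(self.conv(tex).flatten(1))

class ConditionedGenerator(nn.Module):
    """Spatial pose features decide *where* things go; the appearance style
    vector modulates *what* they look like (AdaIN-style scale and shift)."""
    def __init__(self, feat_ch=256, style_dim=512):
        super().__init__()
        self.affine = nn.Linear(style_dim, 2 * feat_ch)
        self.to_rgb = nn.Sequential(
            nn.Upsample(scale_factor=8, mode="bilinear", align_corners=False),
            nn.Conv2d(feat_ch, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, pose_feat, style):
        scale, shift = self.affine(style).chunk(2, dim=1)
        h = pose_feat * (1 + scale[:, :, None, None]) + shift[:, :, None, None]
        return self.to_rgb(h)

pose_enc, app_enc, gen = PoseEncoder(), AppearanceEncoder(), ConditionedGenerator()
pose_map = torch.randn(1, 3, 256, 256)  # dense pose conditioning
texture = torch.randn(1, 3, 256, 256)   # appearance source
image = gen(pose_enc(pose_map), app_enc(texture))  # (1, 3, 256, 256)
```

Because pose and appearance enter through separate paths, either input can be swapped independently at test time, which is what makes applications such as garment transfer and head swap possible in this family of methods.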
DyNCA: Real-time Dynamic Texture Synthesis Using Neural Cellular Automata
Current Dynamic Texture Synthesis (DyTS) models in the literature can synthesize realistic videos. However, these methods require a slow iterative optimization process to synthesize a single fixed-size short video, and they do not offer any post-training control over the synthesis process. We propose Dynamic Neural Cellular Automata (DyNCA), a framework for real-time and controllable dynamic texture synthesis. Our method is built upon the recently introduced NCA models and can synthesize infinitely long, arbitrary-sized realistic texture videos in real time. We quantitatively and qualitatively evaluate our model and show that our synthesized videos appear more realistic than existing results, improving the SOTA DyTS performance by orders of magnitude. Moreover, our model offers several real-time, interactive video controls, including motion speed, motion direction, and an editing brush tool.
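For readers unfamiliar with NCA models, the sketch below shows the kind of update rule such frameworks build on: each cell perceives its neighbourhood through fixed filters and applies a small learned residual update, stochastically masked. Channel counts, filters and the fire rate are illustrative assumptions, not DyNCA's exact architecture.

```python
# Minimal sketch of a neural-cellular-automata update step, the building
# block that DyNCA extends for dynamic textures. Hyperparameters are
# illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NCAStep(nn.Module):
    def __init__(self, n_channels=12, hidden=96):
        super().__init__()
        # Fixed perception filters: identity plus Sobel x/y gradients.
        ident = torch.tensor([[0., 0, 0], [0, 1, 0], [0, 0, 0]])
        sobel_x = torch.tensor([[-1., 0, 1], [-2, 0, 2], [-1, 0, 1]]) / 8
        kernels = torch.stack([ident, sobel_x, sobel_x.t()])  # (3, 3, 3)
        self.register_buffer(
            "filters", kernels.repeat(n_channels, 1, 1).unsqueeze(1))
        # Learned per-cell update rule (1x1 convs == a shared tiny MLP).
        self.update = nn.Sequential(
            nn.Conv2d(3 * n_channels, hidden, 1), nn.ReLU(),
            nn.Conv2d(hidden, n_channels, 1),
        )
        self.n_channels = n_channels

    def forward(self, state, fire_rate=0.5):
        # Each cell perceives itself and its neighbourhood gradients...
        y = F.conv2d(state, self.filters, padding=1, groups=self.n_channels)
        # ...and proposes a residual update, applied stochastically.
        ds = self.update(y)
        mask = (torch.rand_like(state[:, :1]) < fire_rate).float()
        return state + ds * mask

# Iterating the step yields frames; because the rule is local and
# convolutional, it runs at any grid size for any number of steps.
nca = NCAStep()
state = torch.rand(1, 12, 128, 128)
frames = []
for _ in range(64):
    state = nca(state)
    frames.append(state[:, :3].clamp(0, 1))  # first 3 channels as RGB
```

This locality is what allows synthesis at arbitrary resolutions and unbounded length once the update rule has been trained.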
Two-Stream Convolutional Networks for Dynamic Texture Synthesis
This thesis introduces a two-stream model for dynamic texture synthesis. The model is based on pre-trained convolutional networks (ConvNets) that target two independent tasks: (i) object recognition, and (ii) optical flow regression. Given an input dynamic texture, statistics of filter responses from the object recognition and optical flow ConvNets encapsulate the per-frame appearance and dynamics of the input texture, respectively. To synthesize a dynamic texture, a randomly initialized input sequence is optimized to match the feature statistics of an example texture from each stream. In addition, the synthesis approach is applied to combine the texture appearance of one texture with the dynamics of another, generating entirely novel dynamic textures. Overall, the proposed approach generates high-quality samples that match both the frame-wise appearance and the temporal evolution of the input texture. Finally, a quantitative evaluation of the proposed dynamic texture synthesis approach is performed via a large-scale user study.
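A condensed sketch of this optimization loop is shown below, assuming PyTorch and, for brevity, a single VGG-19 feature layer for the appearance stream (the thesis matches statistics at several layers). The dynamics stream is indicated only as a placeholder comment, since the pre-trained optical-flow ConvNet (`flow_net` here) is an assumed component, not implemented.

```python
# Sketch of the two-stream optimization: a random video is optimized so
# that Gram-matrix feature statistics match those of an exemplar in an
# appearance stream (per frame) and a dynamics stream (across frames).
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

def gram(features):
    """Channel correlation matrix of a (B, C, H, W) feature map."""
    b, c, h, w = features.shape
    f = features.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

# Appearance stream: a truncated ImageNet-pretrained VGG-19
# (ImageNet input normalization omitted for brevity).
appearance_net = vgg19(weights="IMAGENET1K_V1").features[:22].eval()
for p in appearance_net.parameters():
    p.requires_grad_(False)

exemplar = torch.rand(12, 3, 128, 128)  # exemplar dynamic texture, T frames
target_app = gram(appearance_net(exemplar)).detach()

synth = torch.rand(12, 3, 128, 128, requires_grad=True)
opt = torch.optim.LBFGS([synth])

def closure():
    opt.zero_grad()
    # Appearance stream: match per-frame Gram statistics.
    loss = F.mse_loss(gram(appearance_net(synth)), target_app)
    # Dynamics stream (placeholder): would match Gram statistics of
    # optical-flow ConvNet features on consecutive frame pairs, e.g.
    # loss += F.mse_loss(gram(flow_net(synth[:-1], synth[1:])), target_dyn)
    loss.backward()
    return loss

for _ in range(20):
    opt.step(closure)
```

Taking the appearance targets from one exemplar and the dynamics targets from another is what yields the appearance/dynamics recombination described above.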
- …