Recycle-GAN: Unsupervised Video Retargeting
We introduce a data-driven approach for unsupervised video retargeting that
translates content from one domain to another while preserving the style native
to a domain, i.e., if the content of John Oliver's speech were to be transferred
to Stephen Colbert, then the generated content/speech should be in Stephen
Colbert's style. Our approach combines both spatial and temporal information
along with adversarial losses for content translation and style preservation.
In this work, we first study the advantages of using spatiotemporal constraints
over spatial constraints for effective retargeting. We then demonstrate the
proposed approach for the problems where information in both space and time
matters such as face-to-face translation, flower-to-flower, wind and cloud
synthesis, sunrise and sunset. Comment: ECCV 2018; please refer to the project webpage for videos -
http://www.cs.cmu.edu/~aayushb/Recycle-GA
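The spatiotemporal constraint described above can be illustrated with a toy sketch of a recycle-style loss: frames are translated to the target domain, predicted forward in time there, and mapped back to the source domain for comparison with the true next frame. The linear "generators" and identity predictor below are hypothetical stand-ins for the paper's deep networks, shown only to make the loss structure concrete:

```python
import numpy as np

def recycle_loss(frames_x, G_XtoY, G_YtoX, P_Y):
    """Toy recycle-style loss: translate frame t into domain Y,
    predict the next Y-frame temporally, translate it back to X,
    and penalize deviation from the true next X-frame."""
    loss = 0.0
    for t in range(len(frames_x) - 1):
        y_t = G_XtoY(frames_x[t])          # translate to target domain
        y_next_pred = P_Y(y_t)             # temporal prediction in Y
        x_next_pred = G_YtoX(y_next_pred)  # map back to source domain
        loss += np.mean((frames_x[t + 1] - x_next_pred) ** 2)
    return loss / (len(frames_x) - 1)

# Toy setup: the generators are exact inverses, the predictor is the
# identity, and the video is static, so the loss vanishes.
G_XtoY = lambda x: 2.0 * x
G_YtoX = lambda y: 0.5 * y
P_Y = lambda y: y
frames = [np.ones((4, 4)) for _ in range(5)]
print(recycle_loss(frames, G_XtoY, G_YtoX, P_Y))  # -> 0.0
```

With imperfect generators or a moving scene the loss becomes positive, which is what drives the networks toward temporally consistent translations.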
Shortcut-V2V: Compression Framework for Video-to-Video Translation based on Temporal Redundancy Reduction
Video-to-video translation aims to generate video frames of a target domain
from an input video. Despite its usefulness, the existing networks require
enormous computations, necessitating their model compression for wide use.
While there exist compression methods that improve computational efficiency in
various image/video tasks, a generally-applicable compression method for
video-to-video translation has not been studied much. In response, we present
Shortcut-V2V, a general-purpose compression framework for video-to-video
translation. Shortcut-V2V avoids full inference for every neighboring video
frame by approximating the intermediate features of a current frame from those
of the previous frame. Moreover, in our framework, a newly-proposed block
called AdaBD adaptively blends and deforms features of neighboring frames,
which makes more accurate predictions of the intermediate features possible. We
conduct quantitative and qualitative evaluations using well-known
video-to-video translation models on various tasks to demonstrate the general
applicability of our framework. The results show that Shortcut-V2V achieves
performance comparable to the original video-to-video translation model while
reducing computational cost by 3.2-5.7x and memory by 7.8-44x at test time. Comment: to be updated