Recycle-GAN: Unsupervised Video Retargeting

Authors: C Cao; C Liu; E Hsu; J Walker; N Kholgade; O Ronneberger; O Russakovsky; Qi-Xing Huang
Publication date: 15/08/2018

We introduce a data-driven approach for unsupervised video retargeting that translates content from one domain to another while preserving the style native to a domain, i.e., if the contents of John Oliver's speech were to be transferred to Stephen Colbert, then the generated content/speech should be in Stephen Colbert's style. Our approach combines both spatial and temporal information along with adversarial losses for content translation and style preservation. In this work, we first study the advantages of using spatiotemporal constraints over spatial constraints for effective retargeting. We then demonstrate the proposed approach on problems where information in both space and time matters, such as face-to-face translation, flower-to-flower translation, wind and cloud synthesis, and sunrise and sunset.

Comment: ECCV 2018; Please refer to project webpage for videos - http://www.cs.cmu.edu/~aayushb/Recycle-GA
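
The contrast the abstract draws between spatial and spatiotemporal constraints can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the listing: frames are single floats rather than images, `g_xy` and `g_yx` are hypothetical domain translators, `predict_y` is a hypothetical next-frame predictor in the target domain, and the squared-error form of both losses is one plausible choice rather than the authors' confirmed formulation.

```python
def cycle_loss(frames_x, g_xy, g_yx):
    """Spatial-only constraint: each frame should survive a round trip X -> Y -> X."""
    errors = [(x - g_yx(g_xy(x))) ** 2 for x in frames_x]
    return sum(errors) / len(errors)


def recycle_loss(frames_x, g_xy, g_yx, predict_y):
    """Spatiotemporal constraint (illustrative form, an assumption).

    Translate two consecutive frames to Y, predict the next frame there,
    map the prediction back to X, and compare it with the true next frame.
    """
    errors = []
    for t in range(2, len(frames_x)):
        y_prev = g_xy(frames_x[t - 2])
        y_curr = g_xy(frames_x[t - 1])
        x_next_hat = g_yx(predict_y(y_prev, y_curr))
        errors.append((frames_x[t] - x_next_hat) ** 2)
    return sum(errors) / len(errors)


# Toy setup (hypothetical): domain Y is domain X scaled by 2.
def g_xy(x):
    return 2.0 * x


def g_yx(y):
    return y / 2.0


def extrapolate(y_prev, y_curr):
    """Temporal model that continues linear motion."""
    return 2.0 * y_curr - y_prev


def repeat_last(y_prev, y_curr):
    """Temporal model that ignores motion and repeats the last frame."""
    return y_curr


frames = [0.0, 1.0, 2.0, 3.0]
spatial = cycle_loss(frames, g_xy, g_yx)
temporal_good = recycle_loss(frames, g_xy, g_yx, extrapolate)
temporal_bad = recycle_loss(frames, g_xy, g_yx, repeat_last)
```

In this toy case the spatial loss is zero regardless of the temporal model, while the spatiotemporal loss separates the predictor that follows the motion from the one that does not, which is the kind of extra signal a temporal constraint can provide.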

arXiv.org e-Print Archive

Crossref

Multimodal Scale Consistency and Awareness for Monocular Self-Supervised Depth Estimation

Authors: Arani Elahe; Chawla Hemang; Varma Arnav; Zonooz Bahram
Publication date: 03/03/2021

Dense depth estimation is essential to scene understanding for autonomous driving. However, recent self-supervised approaches on monocular videos suffer from scale inconsistency across long sequences. Utilizing data from the ubiquitously copresent global positioning systems (GPS), we tackle this challenge by proposing a dynamically-weighted GPS-to-Scale (g2s) loss to complement the appearance-based losses. We emphasize that the GPS is needed only during the multimodal training, and not at inference. The relative distance between frames captured through the GPS provides a scale signal that is independent of the camera setup and scene distribution, resulting in richer learned feature representations. Through extensive evaluation on multiple datasets, we demonstrate scale-consistent and scale-aware depth estimation during inference, improving the performance even when training with low-frequency GPS data.

Comment: Accepted at 2021 IEEE International Conference on Robotics and Automation (ICRA
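
A minimal sketch of how a GPS-derived distance could act as a scale signal alongside an appearance-based loss, as the abstract describes. The details are assumptions for illustration: the squared-difference penalty, the linear ramp in `dynamic_weight`, and all function and parameter names are hypothetical and are not confirmed by the listing.

```python
import math


def translation_norm(t):
    """Euclidean length of a 3-D translation vector."""
    return math.sqrt(sum(c * c for c in t))


def g2s_loss(pred_translations, gps_translations):
    """Mean squared gap between predicted and GPS-measured inter-frame distances.

    Only the magnitudes are compared, so the GPS frame and the camera frame
    do not need to be aligned (an assumption of this sketch).
    """
    gaps = [
        (translation_norm(g) - translation_norm(p)) ** 2
        for p, g in zip(pred_translations, gps_translations)
    ]
    return sum(gaps) / len(gaps)


def dynamic_weight(epoch, num_epochs):
    """Hypothetical schedule: ramps linearly from 0 to 1 over training."""
    if num_epochs <= 1:
        return 1.0
    return epoch / (num_epochs - 1)


def total_loss(appearance_loss, pred_translations, gps_translations, epoch, num_epochs):
    """Appearance loss plus the dynamically weighted GPS-to-scale term."""
    w = dynamic_weight(epoch, num_epochs)
    return appearance_loss + w * g2s_loss(pred_translations, gps_translations)


# Predicted ego-motion of length 1 m versus a GPS displacement of 3 m.
pred = [(1.0, 0.0, 0.0)]
gps = [(3.0, 0.0, 0.0)]
early = total_loss(1.0, pred, gps, epoch=0, num_epochs=10)
late = total_loss(1.0, pred, gps, epoch=9, num_epochs=10)
```

Because the GPS term enters only the training objective, nothing in this sketch is needed at inference time, consistent with the abstract's statement that GPS is used only during training.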

arXiv.org e-Print Archive