Learning image-to-image translation using paired and unpaired training samples
Image-to-image translation is a general name for a task where an image from
one domain is converted to a corresponding image in another domain, given
sufficient training data. Traditionally, different approaches have been proposed
depending on whether aligned image pairs or two sets of (unaligned) examples
from both domains are available for training. While paired training samples
might be difficult to obtain, the unpaired setup leads to a highly
under-constrained problem and inferior results. In this paper, we propose a new
general-purpose image-to-image translation model that is able to utilize both
paired and unpaired training data simultaneously. We compare our method with
two strong baselines and obtain both qualitatively and quantitatively improved
results. Our model also outperforms the baselines when trained with purely
paired or purely unpaired data. To our knowledge, this is the first work to
consider such a hybrid setup in image-to-image translation.
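The abstract does not specify the training objective, but the hybrid idea can be illustrated with a minimal sketch: a supervised reconstruction term on the paired batch combined with a cycle-consistency term on the unpaired batch. The generator names (G_xy, G_yx), the stand-in single-layer networks, the L1 terms, and the loss weights below are illustrative assumptions, not the authors' published method.

    # Minimal sketch (assumed formulation, not the paper's actual losses):
    # one objective that consumes paired and unpaired batches simultaneously.
    import torch
    import torch.nn as nn

    # Tiny stand-in generators; the paper's architectures are not given here.
    G_xy = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))  # domain X -> Y
    G_yx = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))  # domain Y -> X
    l1 = nn.L1Loss()

    def hybrid_loss(x_pair, y_pair, x_unpair, y_unpair,
                    lambda_sup=10.0, lambda_cyc=10.0):
        """Combine a supervised term (paired batch) with cycle terms (unpaired batch)."""
        # Paired samples give a direct translation target in each direction.
        sup = l1(G_xy(x_pair), y_pair) + l1(G_yx(y_pair), x_pair)
        # Unpaired samples are constrained by cycle-consistency: mapping to
        # the other domain and back should reconstruct the input.
        cyc = (l1(G_yx(G_xy(x_unpair)), x_unpair)
               + l1(G_xy(G_yx(y_unpair)), y_unpair))
        return lambda_sup * sup + lambda_cyc * cyc

    # Usage with random tensors standing in for image batches.
    x_p, y_p = torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64)
    x_u, y_u = torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64)
    hybrid_loss(x_p, y_p, x_u, y_u).backward()

In a real system the cycle branch would typically also carry adversarial losses; they are omitted here to keep the sketch self-contained.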
Low Light Video Enhancement using Synthetic Data Produced with an Intermediate Domain Mapping
Advances in low-light video RAW-to-RGB translation are opening up the
possibility of fast low-light imaging on commodity devices (e.g. smartphone
cameras) without the need for a tripod. However, it is challenging to collect
the required paired short-long exposure frames to learn a supervised mapping.
Current approaches require a specialised rig or the use of static videos with
no subject or object motion, resulting in datasets that are limited in size,
diversity, and motion. We address the data collection bottleneck for low-light
video RAW-to-RGB by proposing a data synthesis mechanism, dubbed SIDGAN, that
can generate abundant dynamic video training pairs. SIDGAN maps videos found
'in the wild' (e.g. internet videos) into a low-light (short, long exposure)
domain. By generating dynamic video data synthetically, we enable a recently
proposed state-of-the-art RAW-to-RGB model to attain higher image quality
(improved colour, reduced artifacts) and improved temporal consistency,
compared to the same model trained with only static real video data.
Comment: Accepted to ECCV 2020
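As a rough illustration of the two-stage idea described above, the sketch below maps 'in the wild' footage into an intermediate domain and from there into the low-light domain, yielding synthetic (short, long) exposure training pairs. The stage names, the Identity stand-ins for the learned mappings, and the intensity-scaling proxy for short exposure are all assumptions for illustration; in SIDGAN these mappings are learned with GANs.

    # Minimal sketch (assumed pipeline, not the authors' code) of synthesizing
    # a (short, long) exposure pair from web video via an intermediate domain.
    import torch
    import torch.nn as nn

    map_to_intermediate = nn.Identity()  # stage 1: web RGB -> intermediate domain
    map_to_lowlight = nn.Identity()      # stage 2: intermediate -> low-light domain

    def synthesize_pair(web_frames: torch.Tensor, exposure_ratio: float = 0.1):
        """Turn an 'in the wild' clip into a synthetic (short, long) exposure pair."""
        intermediate = map_to_intermediate(web_frames)
        long_exposure = map_to_lowlight(intermediate)
        # Crude short-exposure proxy: scale intensity by the exposure ratio
        # (an assumption for illustration; a learned mapping would replace this).
        short_exposure = long_exposure * exposure_ratio
        return short_exposure, long_exposure

    clip = torch.rand(8, 3, 128, 128)    # 8 frames standing in for a web clip
    short, long_ = synthesize_pair(clip)
    print(short.shape, long_.shape)      # both: torch.Size([8, 3, 128, 128])

The synthetic pairs produced this way would then serve as supervision for the downstream RAW-to-RGB model in place of hard-to-collect real short-long exposure frames.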