Joint learning of super-resolution (SR) and inverse tone-mapping (ITM) has
recently been explored as a way to convert legacy low-resolution (LR) standard
dynamic range (SDR) videos to high-resolution (HR) high dynamic range (HDR)
videos, driven by the growing demand for UHD HDR TV and broadcasting
applications. However, previous
CNN-based methods directly reconstruct the HR HDR frames from LR SDR frames,
and are only trained with a simple L2 loss. In this paper, we take a
divide-and-conquer approach in designing a novel GAN-based joint SR-ITM
network, called JSI-GAN, which is composed of three task-specific subnets: an
image reconstruction subnet, a detail restoration (DR) subnet and a local
contrast enhancement (LCE) subnet. We carefully design these subnets so that
each is trained for its intended purpose: the DR subnet learns a pair of
pixel-wise 1D separable filters for detail restoration, and the LCE subnet
learns a pixel-wise 2D local filter for contrast enhancement.
Moreover, to train JSI-GAN effectively, we propose a novel detail GAN loss
alongside the conventional GAN loss, which helps enhance both local details
and contrasts to reconstruct high-quality HR HDR results. When all subnets are
jointly trained, the predicted HR HDR results are of higher quality, with a
gain of at least 0.41 dB in PSNR over those generated by the
previous methods.

Comment: The first two authors contributed equally to this work. Accepted at
AAAI 2020. (Camera-ready version
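
To make the idea of pixel-wise filtering in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows what it means to apply a different 1D separable kernel pair, or a different 2D kernel, at every pixel of a single-channel image. The function names, the zero padding, the vertical-then-horizontal order, and the absence of any upscaling step are all assumptions made for illustration; in the paper the kernels would be predicted by the DR and LCE subnets, whereas here they are supplied by hand.

```python
def pad(img, r):
    """Zero-pad a 2D list-of-lists image by r pixels on every side (assumed padding)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * (w + 2 * r) for _ in range(h + 2 * r)]
    for y in range(h):
        for x in range(w):
            out[y + r][x + r] = float(img[y][x])
    return out


def apply_local_2d(img, kernels):
    """Filter each pixel with its own k x k kernel.

    kernels[y][x] is the k x k weight grid used for output pixel (y, x).
    """
    h, w = len(img), len(img[0])
    k = len(kernels[0][0])
    r = k // 2
    p = pad(img, r)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ker = kernels[y][x]
            out[y][x] = sum(
                ker[i][j] * p[y + i][x + j]
                for i in range(k)
                for j in range(k)
            )
    return out


def apply_local_1d(img, kernels, axis):
    """Filter each pixel with its own length-k 1D kernel.

    axis=0 filters vertically, axis=1 filters horizontally.
    """
    h, w = len(img), len(img[0])
    k = len(kernels[0][0])
    r = k // 2
    p = pad(img, r)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ker = kernels[y][x]
            if axis == 0:
                out[y][x] = sum(ker[i] * p[y + i][x + r] for i in range(k))
            else:
                out[y][x] = sum(ker[j] * p[y + r][x + j] for j in range(k))
    return out


def apply_separable(img, vert, horiz):
    """Apply per-pixel vertical kernels, then per-pixel horizontal kernels (assumed order)."""
    return apply_local_1d(apply_local_1d(img, vert, 0), horiz, 1)
```

A pair of length-k 1D kernels needs 2k weights per pixel instead of the k*k weights of a full 2D kernel, which is the usual motivation for separable filters. Note that when the kernels vary from pixel to pixel, the two-pass result generally differs from filtering with the outer product of the two kernels at each pixel; the two coincide when the kernels are the same everywhere.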