Image-to-image (I2I) translation is a challenging topic in computer vision.
We divide this problem into three tasks: strongly constrained translation,
normally constrained translation, and weakly constrained translation. The
constraint here indicates the extent to which the content or semantic
information in the original image is preserved. Although previous approaches
have achieved good performance on weakly constrained tasks, they fail to
fully preserve the content in strongly and normally constrained tasks such as
photo-realism synthesis, style transfer, and colorization. To
achieve content-preserving transfer in strongly constrained and normally
constrained tasks, we propose StyleFlow, a new I2I translation model that
consists of normalizing flows and a novel Style-Aware Normalization (SAN)
module. Owing to its invertible network structure, StyleFlow first projects
input images into a deep feature space in the forward pass, while the backward
pass applies the SAN module to perform a content-fixed feature transformation
and then maps the features back to image space. Our model supports both
image-guided
translation and multi-modal synthesis. We evaluate our model on several I2I
translation benchmarks, and the results show that the proposed model has
advantages over previous methods in both strongly constrained and normally
constrained tasks.
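
To make the forward/backward mechanism concrete, below is a minimal PyTorch
sketch of the flow of computation, not the paper's implementation: a
RealNVP-style affine coupling block stands in for the normalizing flow, and an
AdaIN-style re-normalization stands in for the SAN module, whose actual design
may differ. All names here (AffineCoupling, style_aware_renorm) are
illustrative assumptions.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One invertible affine coupling block (RealNVP-style).

    Splits channels in half; one half parameterizes a scale/shift
    applied to the other, so the inverse exists in closed form.
    """
    def __init__(self, channels):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels // 2, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),  # -> log_s, t
        )

    def forward(self, x):
        xa, xb = x.chunk(2, dim=1)
        log_s, t = self.net(xa).chunk(2, dim=1)
        yb = xb * torch.exp(torch.tanh(log_s)) + t  # tanh bounds the scale
        return torch.cat([xa, yb], dim=1)

    def inverse(self, y):
        ya, yb = y.chunk(2, dim=1)
        log_s, t = self.net(ya).chunk(2, dim=1)
        xb = (yb - t) * torch.exp(-torch.tanh(log_s))
        return torch.cat([ya, xb], dim=1)

def style_aware_renorm(content_feat, style_feat, eps=1e-5):
    """Stand-in for the SAN module: re-normalize the content features
    with the style features' channel statistics (AdaIN-style), leaving
    the spatial structure, i.e. the content, unchanged."""
    c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
    c_std = content_feat.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
    s_std = style_feat.std(dim=(2, 3), keepdim=True) + eps
    return (content_feat - c_mean) / c_std * s_std + s_mean

# Hypothetical usage: encode both images with the same flow, transform
# the features, then decode through the exact inverse of the flow.
flow = AffineCoupling(channels=8)
content = torch.randn(1, 8, 32, 32)
style = torch.randn(1, 8, 32, 32)

z_c = flow(content)                 # forward pass: image -> feature space
z_s = flow(style)
z_t = style_aware_renorm(z_c, z_s)  # content-fixed feature transformation
result = flow.inverse(z_t)          # backward pass: features -> image space
```

Because the coupling block is exactly invertible, any content not altered by
the feature transformation survives the round trip unchanged, which is the
property the abstract's content-preservation claim relies on.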