Image colorization is a challenging problem due to multi-modal uncertainty
and high ill-posedness. Directly training a deep neural network usually leads
to incorrect semantic colors and low color richness. While transformer-based
methods can deliver better results, they often rely on manually designed
priors, suffer from poor generalization ability, and introduce color bleeding
effects. To address these issues, we propose DDColor, an end-to-end method with
dual decoders for image colorization. Our approach includes a pixel decoder and
a query-based color decoder. The former restores the spatial resolution of the
image, while the latter utilizes rich visual features to refine color queries,
thus avoiding hand-crafted priors. Our two decoders work together to establish
correlations between color and multi-scale semantic representations via
cross-attention, significantly alleviating the color bleeding effect.
Additionally, a simple yet effective colorfulness loss is introduced to enhance
the color richness. Extensive experiments demonstrate that DDColor achieves
superior performance to existing state-of-the-art works both quantitatively and
qualitatively. The codes and models are publicly available at
https://github.com/piddnad/DDColor.Comment: ICCV 2023; Code: https://github.com/piddnad/DDColo