In the domain of unsupervised image-to-image translation using generative
adversarial models, CycleGAN has become the architecture of choice. One of
the primary drawbacks of this architecture is its relatively slow rate of
convergence. In this work, we use discriminator-driven explainability to speed
up the convergence of the generative model: saliency maps produced by the
discriminator mask the gradients of the generator during backpropagation,
following the work of Nagisetty et al. In addition, a saliency map is
introduced at the input, added onto a Gaussian noise mask, through an
interpretable latent variable based on Wang M.'s Mask CycleGAN. This fuses
explainability in both directions and employs the noise-added saliency map at
the input as evidence-based counterfactual filtering. The resulting
architecture converges substantially faster than a baseline CycleGAN while
preserving image quality.

Comment: 10 pages, 4 figures, ICVS TU Wien 202
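The core gradient-masking idea can be sketched as follows. This is an illustrative NumPy sketch, not the authors' implementation: the use of absolute input gradients of the discriminator as the saliency measure and a top-k binary mask (`keep_ratio`) are assumptions made for the example.

```python
import numpy as np

def saliency_mask(disc_input_grad, keep_ratio=0.5):
    """Binary mask keeping the fraction `keep_ratio` of pixels the
    discriminator is most sensitive to. Saliency here is taken as the
    absolute gradient of the discriminator output w.r.t. its input
    (a simple, common saliency definition; an assumption for this sketch)."""
    saliency = np.abs(disc_input_grad)
    threshold = np.quantile(saliency, 1.0 - keep_ratio)
    return (saliency >= threshold).astype(saliency.dtype)

def masked_generator_grad(gen_grad, disc_input_grad, keep_ratio=0.5):
    """Zero out generator gradients at pixels the discriminator deems
    unimportant, concentrating updates on salient regions."""
    return gen_grad * saliency_mask(disc_input_grad, keep_ratio)

# Toy example on a 4x4 "image" of gradients.
rng = np.random.default_rng(0)
gen_grad = rng.normal(size=(4, 4))   # generator gradient (stand-in)
disc_grad = rng.normal(size=(4, 4))  # discriminator input gradient (stand-in)
masked = masked_generator_grad(gen_grad, disc_grad, keep_ratio=0.25)
```

In a real training loop this masking would be applied to the generator's parameter or activation gradients during backpropagation (e.g. via backward hooks in a deep-learning framework), so that only gradients at discriminator-salient locations drive the update.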