Despite the proliferation of generative models, achieving fast sampling
during inference without compromising sample diversity and quality remains
challenging. Existing models such as Denoising Diffusion Probabilistic Models
(DDPM) deliver high-quality, diverse samples but are slowed by an inherently
high number of iterative steps. The Denoising Diffusion Generative Adversarial
Networks (DDGAN) attempted to circumvent this limitation by integrating a GAN
model for larger jumps in the diffusion process. However, DDGAN encountered
scalability limitations when applied to large datasets. To address these
limitations, we introduce a novel approach that tackles the problem by matching
implicit and explicit factors. More specifically, our approach involves
utilizing an implicit model to match the marginal distributions of noisy data
and the explicit conditional distribution of the forward diffusion. This
combination allows us to effectively match the joint denoising distributions.
Unlike DDPM but similar to DDGAN, we do not enforce a parametric distribution
for the reverse step, enabling us to take large steps during inference. Similar
to the DDPM but unlike DDGAN, we take advantage of the exact form of the
diffusion process. We demonstrate that our proposed method obtains comparable
generative performance to diffusion-based models and vastly superior results to
models with a small number of sampling steps