MCMC-Interactive Variational Inference
Leveraging well-established MCMC strategies, we propose MCMC-interactive
variational inference (MIVI), which not only estimates the posterior in a
time-constrained manner but also facilitates the design of MCMC transitions.
MIVI constructs a variational distribution followed by a short Markov chain
with learnable parameters, taking advantage of the complementary properties
of variational inference and MCMC to encourage mutual improvement. On one hand,
with the variational distribution locating high posterior density regions, the
Markov chain is optimized within the variational inference framework to
efficiently target the posterior despite a small number of transitions. On the
other hand, the optimized Markov chain with considerable flexibility guides the
variational distribution towards the posterior and alleviates its
underestimation of uncertainty. Furthermore, we prove the optimized Markov
chain in MIVI admits extrapolation, which means its marginal distribution gets
closer to the true posterior as the chain grows. Therefore, the Markov chain
can be used separately as an efficient MCMC scheme. Experiments show that MIVI
not only accurately and efficiently approximates the posteriors but also
facilitates the design of stochastic gradient MCMC and Gibbs sampling transitions.
Comment: 25 pages, 7 figures, 3 tables
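The core construction described above, a reparameterized draw from the variational distribution followed by a short Markov chain whose parameters would be optimized, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the target posterior is assumed to be a standard normal, the chain is a plain unadjusted Langevin sampler, and the step sizes stand in for the learnable transition parameters.

```python
import numpy as np

def grad_log_p(z):
    # Score of a toy target posterior: standard normal, so grad log p(z) = -z.
    return -z

def mivi_sample(mu, log_sigma, step_sizes, rng):
    """One draw: a reparameterized sample from the variational q, refined by a
    short Langevin chain whose step sizes play the role of the chain
    parameters MIVI would learn (hypothetical standalone helper)."""
    z = mu + np.exp(log_sigma) * rng.standard_normal()   # z0 ~ q(z)
    for eps in step_sizes:                               # short Markov chain
        z = z + eps * grad_log_p(z) + np.sqrt(2.0 * eps) * rng.standard_normal()
    return z

rng = np.random.default_rng(0)
# q is deliberately mis-specified (mean 2 instead of 0); the chain pulls
# its draws toward the target posterior, illustrating the mutual correction.
samples = np.array([mivi_sample(2.0, 0.0, [0.1, 0.1, 0.1], rng)
                    for _ in range(4000)])
```

With three steps of size 0.1, each sample's mean is contracted by roughly (1 - 0.1)^3 toward the target, which is the extrapolation property in miniature: more transitions move the marginal closer to the posterior.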
Beta Diffusion
We introduce beta diffusion, a novel generative modeling method that
integrates demasking and denoising to generate data within bounded ranges.
Using scaled and shifted beta distributions, beta diffusion utilizes
multiplicative transitions over time to create both forward and reverse
diffusion processes, maintaining beta distributions in both the forward
marginals and the reverse conditionals, given the data at any point in time.
Unlike traditional diffusion-based generative models relying on additive
Gaussian noise and reweighted evidence lower bounds (ELBOs), beta diffusion is
multiplicative and optimized with KL-divergence upper bounds (KLUBs) derived
from the convexity of the KL divergence. We demonstrate that the proposed KLUBs
are more effective for optimizing beta diffusion compared to negative ELBOs,
which can also be derived as the KLUBs of the same KL divergence with its two
arguments swapped. The loss function of beta diffusion, expressed in terms of
Bregman divergence, further supports the efficacy of KLUBs for optimization.
Experimental results on both synthetic data and natural images demonstrate the
unique capabilities of beta diffusion in generative modeling of range-bounded
data and validate the effectiveness of KLUBs in optimizing diffusion models,
thereby making them valuable additions to the family of diffusion-based
generative models and the optimization techniques used to train them.
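The KLUB losses above are built from KL divergences between beta distributions, and the negative ELBO corresponds to the same KL with its arguments swapped. A minimal sketch of that building block, assuming the standard closed-form KL between two beta distributions (this helper is illustrative, not the paper's loss):

```python
import numpy as np
from scipy.special import gammaln, digamma

def kl_beta(a1, b1, a2, b2):
    """Closed-form KL( Beta(a1,b1) || Beta(a2,b2) ). The asymmetry of KL is
    what distinguishes a KLUB from the swapped-argument (negative-ELBO) bound."""
    def log_B(a, b):
        # Log of the beta function B(a, b).
        return gammaln(a) + gammaln(b) - gammaln(a + b)
    return (log_B(a2, b2) - log_B(a1, b1)
            + (a1 - a2) * digamma(a1)
            + (b1 - b2) * digamma(b1)
            + (a2 - a1 + b2 - b1) * digamma(a1 + b1))
```

Swapping the two argument pairs generally changes the value, which is precisely why optimizing the KLUB direction and optimizing the negative-ELBO direction are different objectives.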
Re-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and Beyond
Although text-to-image diffusion models have made significant strides in
generating images from text, they are sometimes more inclined to generate
images that resemble their training data than images that match the provided
text. This limitation has hindered their usage in both 2D and 3D applications.
To address this problem, we explored the use of negative prompts but found that
the current implementation fails to produce desired results, particularly when
there is an overlap between the main and negative prompts. To overcome this
issue, we propose Perp-Neg, a new algorithm that leverages the geometrical
properties of the score space to address the shortcomings of the current
negative-prompt algorithm. Perp-Neg does not require any training or
fine-tuning of the model. Moreover, we experimentally demonstrate that Perp-Neg
provides greater flexibility in generating images by enabling users to edit out
unwanted concepts from the initially generated images in 2D cases. Furthermore,
to extend the application of Perp-Neg to 3D, we conducted a thorough
exploration of how Perp-Neg can be used in 2D to condition the diffusion model
to generate desired views, rather than being biased toward the canonical views.
Finally, we applied our 2D intuition to integrate Perp-Neg with the
state-of-the-art text-to-3D (DreamFusion) method, effectively addressing its
Janus (multi-head) problem.
Comment: Our project page is available at https://PerpNeg.github.io
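The geometric idea behind Perp-Neg can be sketched in a few lines: instead of subtracting the full negative-prompt score, only its component perpendicular to the positive-prompt score is removed, so overlap between the main and negative prompts is left untouched. This is an illustrative sketch of that projection, with assumed array inputs standing in for the diffusion model's score estimates:

```python
import numpy as np

def perp_neg(eps_pos, eps_neg, weight=1.0):
    """Combine a positive-prompt score with a negative-prompt score by
    subtracting only the part of the negative score that is perpendicular
    to the positive one (hypothetical helper; shapes and weighting assumed)."""
    pos = eps_pos.ravel()
    neg = eps_neg.ravel()
    # Orthogonal projection: split the negative score into components
    # parallel and perpendicular to the positive score.
    parallel = (neg @ pos) / (pos @ pos) * pos
    perpendicular = neg - parallel
    return (pos - weight * perpendicular).reshape(eps_pos.shape)
```

When the negative prompt fully overlaps the positive one, the perpendicular component vanishes and the positive score is returned unchanged, which is exactly the failure mode of naive negative prompting that this construction avoids.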