Measuring Cluster Stability for Bayesian Nonparametrics Using the Linear Bootstrap
Clustering procedures typically estimate which data points are clustered
together, a quantity of primary importance in many analyses. Because clustering
is often used as a preliminary step for dimensionality reduction or to
facilitate interpretation, finding robust and stable clusters is crucial for
appropriate downstream analysis. In the present work, we consider Bayesian
nonparametric (BNP) models, a particularly popular class of Bayesian clustering
models due to their flexibility. Because of their complexity, BNP posteriors
often cannot be computed exactly, and approximations must be employed.
Mean-field variational Bayes (MFVB) forms a posterior approximation by solving
an optimization problem and is widely used due to its speed. An exact BNP
posterior might vary dramatically when presented with different data, so the
stability and robustness of the clustering should be assessed.
A popular means of assessing stability is to apply the bootstrap: resample
the data and rerun the clustering for each simulated data set. The time cost
is thus often prohibitive, especially for the sort of exploratory analysis in
which clustering is typically used. We propose a fast and automatic
approximation to the full bootstrap called the "linear bootstrap", which can be
seen as a local perturbation of the data. In this work, we demonstrate how to
apply this idea to a data analysis pipeline consisting of an MFVB approximation
to a BNP clustering posterior for time-course gene expression data. We show
that, using auto-differentiation tools, the necessary calculations can be done
automatically, and that the linear bootstrap is a fast, if approximate,
alternative to the full bootstrap.

Comment: 9 pages, NIPS 2017 Advances in Approximate Bayesian Inference
Workshop
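The linear-bootstrap idea above can be sketched in a toy setting. The example below is our own minimal illustration, not the paper's pipeline: it uses the sample mean as the estimator, for which the sensitivity of the optimum to the bootstrap weights is available in closed form (in the paper's MFVB setting this derivative would come from auto-differentiation), and compares linear-bootstrap draws against the full bootstrap.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=500)
n = x.size

# M-estimator: the weighted sample mean solves sum_i w_i (x_i - theta) = 0.
theta_hat = x.mean()

# Linear bootstrap: first-order sensitivity of theta_hat to the bootstrap
# weights w, evaluated at w = 1 (implicit function theorem):
#   d theta / d w_i = (x_i - theta_hat) / n
dtheta_dw = (x - theta_hat) / n

B = 2000
# Draw multinomial bootstrap weights, then apply the linear approximation
#   theta_b  ~  theta_hat + dtheta_dw . (w - 1)
w = rng.multinomial(n, np.full(n, 1.0 / n), size=B)
theta_lin = theta_hat + (w - 1.0) @ dtheta_dw

# Full bootstrap for comparison: re-solve (here, re-average) every time.
theta_full = np.array([rng.choice(x, size=n, replace=True).mean()
                       for _ in range(B)])

print(theta_lin.std(), theta_full.std())
```

For the mean the estimator is linear in the weights, so the two spreads agree almost exactly; for nonlinear estimators such as an MFVB optimum, the linear bootstrap is only a first-order approximation.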
RoughSet-DDPM: An Image Super-Resolution Method Based on Rough set Denoising Diffusion Probability Model
Image super-resolution aims to generate high-resolution (HR) images from low-resolution (LR) inputs. Existing methods like autoregressive models, generative adversarial networks (GANs), and denoising diffusion probability models (DDPMs) have limitations in image quality or sampling efficiency. This paper proposes Rough Set-DDPM, a new super-resolution technique combining rough set theory and DDPMs. The rough set formulation divides the DDPM sampling sequence into optimal sub-columns by minimizing the roughness of sample sets. Particle swarm optimization identifies the sub-columns with the lowest roughness. Rough Set-DDPM applies iterative denoising on these optimal columns to output HR images. Experiments on the FFHQ dataset validate that Rough Set-DDPM improves DDPM sampling efficiency while maintaining image fidelity. Quantitative results show that Rough Set-DDPM requires fewer sampling steps and generates higher quality HR images compared to autoregressive models and GANs. By enhancing DDPM sampling, Rough Set-DDPM provides an effective approach to super-resolution that balances image quality and sampling speed. The key contributions include introducing rough sets to optimize DDPM sampling and demonstrating superior performance over existing methods.
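The paper's roughness objective and sub-column encoding are specific to its rough-set formulation, which the abstract does not detail. Purely as an illustration of the particle-swarm search step it describes, here is a generic PSO minimiser applied to a stand-in objective; the function, parameters, and objective are our assumptions, not the paper's:

```python
import numpy as np

def pso_minimize(f, dim, n_particles=30, iters=200, seed=0,
                 bounds=(-5.0, 5.0), w=0.7, c1=1.5, c2=1.5):
    """Minimal particle swarm optimisation (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pos = rng.uniform(lo, hi, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()                               # per-particle best
    pbest_val = np.array([f(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()         # swarm-wide best
    for _ in range(iters):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Standard velocity update: inertia + cognitive + social terms.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([f(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()

# Stand-in "roughness" objective: in the paper this would score a candidate
# sub-sequence of DDPM timesteps; here it is a toy quadratic with minimum 1.
roughness = lambda v: float(np.sum((v - 1.0) ** 2))
best, val = pso_minimize(roughness, dim=4)
print(best, val)
```

In the paper's setting the search space would instead encode candidate sub-columns of the DDPM sampling sequence, with the rough-set roughness measure as `f`.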
Posterior Covariance Information Criterion
We introduce an information criterion, PCIC, for predictive evaluation based
on quasi-posterior distributions. It is regarded as a natural generalisation of
the widely applicable information criterion (WAIC) and can be computed via a
single Markov chain Monte Carlo run. PCIC is useful in a variety of predictive
settings that are not well dealt with in WAIC, including weighted likelihood
inference and quasi-Bayesian prediction.
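Since PCIC is presented as a generalisation of WAIC computable from one MCMC run, it may help to recall how WAIC itself is computed from a matrix of pointwise log-likelihood draws. The sketch below is a standard WAIC computation with a toy Gaussian model whose posterior draws are simulated directly; the function name and model are ours, not from the paper:

```python
import numpy as np

def waic(loglik):
    """WAIC from an (S draws x N observations) matrix of pointwise
    log-likelihoods log p(y_i | theta_s)."""
    S = loglik.shape[0]
    # lppd: log pointwise predictive density, via a stable log-mean-exp.
    m = loglik.max(axis=0)
    lppd = np.sum(m + np.log(np.mean(np.exp(loglik - m), axis=0)))
    # p_waic: effective number of parameters, the posterior variance
    # of the pointwise log-likelihood, summed over observations.
    p_waic = np.sum(np.var(loglik, axis=0, ddof=1))
    return -2.0 * (lppd - p_waic)   # deviance scale

# Toy demo: N(theta, 1) model; posterior draws simulated directly
# (conjugate normal posterior) instead of running MCMC.
rng = np.random.default_rng(1)
y = rng.normal(0.0, 1.0, size=100)
theta = rng.normal(y.mean(), 1.0 / np.sqrt(y.size), size=(1000, 1))
loglik = -0.5 * np.log(2 * np.pi) - 0.5 * (y[None, :] - theta) ** 2
waic_val = waic(loglik)
print(waic_val)
```

PCIC replaces the posterior in this computation with a quasi-posterior, which is what allows weighted-likelihood and quasi-Bayesian settings that WAIC does not cover.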