
    Measuring Cluster Stability for Bayesian Nonparametrics Using the Linear Bootstrap

    Clustering procedures typically estimate which data points are clustered together, a quantity of primary importance in many analyses. Often used as a preliminary step for dimensionality reduction or to facilitate interpretation, finding robust and stable clusters is crucial for appropriate downstream analysis. In the present work, we consider Bayesian nonparametric (BNP) models, a particularly popular class of Bayesian clustering models due to their flexibility. Because of its complexity, the Bayesian posterior often cannot be computed exactly, and approximations must be employed. Mean-field variational Bayes (MFVB) forms a posterior approximation by solving an optimization problem and is widely used due to its speed. An exact BNP posterior might vary dramatically when presented with different data, so the stability and robustness of the clustering should be assessed. A popular means of assessing stability is the bootstrap: resample the data and rerun the clustering on each simulated data set. The time cost of this is often prohibitive, especially for the sort of exploratory analysis in which clustering is typically used. We propose a fast and automatic approximation to the full bootstrap, the "linear bootstrap", which can be seen as a local perturbation of the data. In this work, we demonstrate how to apply this idea to a data analysis pipeline consisting of an MFVB approximation to a BNP clustering posterior for time course gene expression data. We show that, using auto-differentiation tools, the necessary calculations can be done automatically, and that the linear bootstrap is a fast but approximate alternative to the full bootstrap.
    Comment: 9 pages, NIPS 2017 Advances in Approximate Bayesian Inference Workshop
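
    To make the idea concrete, here is a minimal sketch of the linear bootstrap for a toy weighted M-estimator, using JAX for the automatic differentiation the abstract mentions. The model, data, and function names are illustrative assumptions, not the authors' pipeline; the point is that one local sensitivity computation replaces re-optimizing for every bootstrap replicate.

```python
import jax
import jax.numpy as jnp
import numpy as np

def weighted_loss(theta, w, y):
    # Toy estimating problem: weighted squared loss for a location parameter.
    return jnp.sum(w * (y - theta) ** 2)

rng = np.random.default_rng(0)
y = jnp.asarray(rng.normal(loc=1.0, scale=2.0, size=200))
n = y.shape[0]
w1 = jnp.ones(n)                 # uniform weights = the observed data set
theta_hat = jnp.sum(w1 * y) / n  # optimum of weighted_loss at uniform weights

# Sensitivity of the optimum to the weights via the implicit function theorem:
# dtheta/dw = -H^{-1} J, with H = d2L/dtheta2 and J = d2L/(dtheta dw) at (theta_hat, 1).
H = jax.hessian(weighted_loss, argnums=0)(theta_hat, w1, y)
J = jax.jacfwd(jax.grad(weighted_loss, argnums=0), argnums=1)(theta_hat, w1, y)
dtheta_dw = -J / H

# Linear bootstrap: draw multinomial bootstrap weights, then apply the linear
# map instead of re-solving the optimization for each replicate.
counts = rng.multinomial(n, np.ones(n) / n, size=1000)
theta_boot = theta_hat + (jnp.asarray(counts, dtype=jnp.float32) - 1.0) @ dtheta_dw
print("linear-bootstrap SE:", jnp.std(theta_boot))
```

    In the paper's setting, theta would be the MFVB optimum for the BNP clustering posterior and the loss its variational objective; the Hessian-solve is then the dominant cost, paid once rather than once per bootstrap sample.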

    RoughSet-DDPM: An Image Super-Resolution Method Based on Rough Set Denoising Diffusion Probability Model

    Image super-resolution aims to generate high-resolution (HR) images from low-resolution (LR) inputs. Existing methods such as autoregressive models, generative adversarial networks (GANs), and denoising diffusion probability models (DDPMs) have limitations in image quality or sampling efficiency. This paper proposes RoughSet-DDPM, a new super-resolution technique combining rough set theory with DDPMs. The rough set formulation divides the DDPM sampling sequence into optimal sub-sequences by minimizing the roughness of the sample sets, and particle swarm optimization identifies the sub-sequences with the lowest roughness. RoughSet-DDPM then applies iterative denoising over these optimal sub-sequences to output HR images. Experiments on the FFHQ dataset validate that RoughSet-DDPM improves DDPM sampling efficiency while maintaining image fidelity: quantitative results show it requires fewer sampling steps and generates higher-quality HR images than autoregressive models and GANs. By enhancing DDPM sampling, RoughSet-DDPM provides an effective approach to super-resolution that balances image quality and sampling speed. The key contributions are introducing rough sets to optimize DDPM sampling and demonstrating superior performance over existing methods.
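
    As a rough illustration of the sampling side, the sketch below runs a denoiser over a chosen sub-sequence of timesteps, which is the step the paper's rough-set/PSO selection would feed into. The linear beta schedule, the deterministic (DDIM-style) update, and the placeholder denoiser are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def make_alpha_bar(T=1000, beta_min=1e-4, beta_max=0.02):
    # Standard linear beta schedule; alpha_bar_t = prod_{s<=t} (1 - beta_s).
    betas = np.linspace(beta_min, beta_max, T)
    return np.cumprod(1.0 - betas)

def sample_over_subsequence(eps_model, alpha_bar, steps, shape, seed=0):
    """Deterministic (DDIM-style, eta=0) sampling restricted to `steps`,
    a strictly decreasing sub-sequence of timesteps ending at 0."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)
    for t, t_prev in zip(steps[:-1], steps[1:]):
        ab_t, ab_prev = alpha_bar[t], alpha_bar[t_prev]
        eps = eps_model(x, t)                                 # predicted noise
        x0 = (x - np.sqrt(1.0 - ab_t) * eps) / np.sqrt(ab_t)  # predicted clean image
        x = np.sqrt(ab_prev) * x0 + np.sqrt(1.0 - ab_prev) * eps
    return x

# Hypothetical usage: per the abstract, `steps` would come from a PSO search
# for the sub-sequence with minimal roughness; here it is a uniform stride,
# and the denoiser is a stand-in lambda rather than a trained network.
alpha_bar = make_alpha_bar()
steps = list(range(999, -1, -50)) + [0]
x = sample_over_subsequence(lambda x, t: np.zeros_like(x), alpha_bar,
                            steps, shape=(1, 3, 64, 64))
```

    The efficiency gain the abstract claims comes from `steps` being much shorter than the full 1000-step chain; the rough-set criterion decides which steps survive.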

    Posterior Covariance Information Criterion

    We introduce an information criterion, PCIC, for predictive evaluation based on quasi-posterior distributions. It can be regarded as a natural generalisation of the widely applicable information criterion (WAIC) and can be computed via a single Markov chain Monte Carlo run. PCIC is useful in a variety of predictive settings that are not well handled by WAIC, including weighted likelihood inference and quasi-Bayesian prediction.
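
    For orientation, here is a minimal sketch of the WAIC computation that PCIC generalises, from the pointwise log-likelihood matrix of a single MCMC run. The abstract does not give the PCIC formula; `pcic_like` below is only a guess at its shape, replacing WAIC's functional-variance penalty with a posterior covariance between the evaluation and learning functions, and should be checked against the paper.

```python
import numpy as np
from scipy.special import logsumexp

def waic(loglik):
    """WAIC from loglik[s, i] = log p(y_i | theta_s) over posterior draws theta_s."""
    S = loglik.shape[0]
    lppd = logsumexp(loglik, axis=0) - np.log(S)  # log pointwise predictive density
    p_waic = loglik.var(axis=0, ddof=1)           # functional-variance penalty
    return -2.0 * np.sum(lppd - p_waic)

def pcic_like(eval_ll, learn_ll):
    """Hedged sketch, not the paper's formula: a WAIC-style criterion whose
    penalty is the posterior covariance between the evaluation log density
    (eval_ll) and the learning/quasi-likelihood function (learn_ll). When the
    two coincide, the covariance term reduces to WAIC's variance penalty."""
    S = eval_ll.shape[0]
    lppd = logsumexp(eval_ll, axis=0) - np.log(S)
    cov = np.sum((eval_ll - eval_ll.mean(axis=0)) *
                 (learn_ll - learn_ll.mean(axis=0)), axis=0) / (S - 1)
    return -2.0 * np.sum(lppd - cov)
```

    Both functions need only the pointwise log densities saved from one MCMC run, which matches the abstract's claim that PCIC, like WAIC, requires no refitting.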