434 research outputs found

Manual for mcclust.ext R package

Author: Wade Sara
Publication date: 14/05/2015

This R package provides post-processing tools for MCMC samples of partitions to summarize the posterior in Bayesian clustering models. Functions for point estimation are provided, giving a single representative clustering of the posterior. To characterize uncertainty in the point estimate, credible balls can also be computed.
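As a rough illustration of what post-processing partition samples can involve, the following is a minimal Python sketch, not the package's R API. It assumes the MCMC output is an integer array with one row of cluster labels per draw, uses Binder loss with equal costs as the criterion, and searches only among the sampled partitions; the function names are hypothetical.

```python
import numpy as np

def posterior_similarity(samples):
    """Co-clustering frequencies from MCMC partition samples.

    samples: (S, n) integer array; row s holds the cluster labels
    of the n items in draw s (assumed input layout).
    """
    samples = np.asarray(samples)
    # same[s, i, j] is True when items i and j share a cluster in draw s
    same = samples[:, :, None] == samples[:, None, :]
    return same.mean(axis=0)

def binder_point_estimate(samples):
    """Return the sampled partition with the smallest expected Binder loss.

    Assumption: equal misclassification costs, and the search is
    restricted to the draws themselves rather than all partitions.
    """
    samples = np.asarray(samples)
    psm = posterior_similarity(samples)
    best_idx, best_loss = 0, np.inf
    for s, labels in enumerate(samples):
        co = (labels[:, None] == labels[None, :]).astype(float)
        # Sum over all (i, j) and halve so each pair is counted once;
        # the diagonal contributes zero.
        loss = np.abs(co - psm).sum() / 2.0
        if loss < best_loss:
            best_idx, best_loss = s, loss
    return samples[best_idx], best_loss

if __name__ == "__main__":
    draws = np.array([[0, 0, 1], [0, 0, 1], [0, 1, 1]])
    estimate, loss = binder_point_estimate(draws)
    print(estimate, loss)
```

Restricting the search to sampled partitions keeps the sketch short. Other loss functions and search strategies are possible and are not shown here.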

Warwick Research Archives Portal Repository

Bayesian Cluster Analysis

Author: Wade Sara K
Publication date: 15/05/2023

Edinburgh Research Explorer

Shared Differential Clustering across Single-cell RNA Sequencing Datasets with the Hierarchical Dirichlet Process

Author: Bochkina Natalia
Liu Jinlu
Wade Sara
Publication date: 17/02/2024

Single-cell RNA sequencing (scRNA-seq) is a powerful technology that allows researchers to understand gene expression patterns at the single-cell level. However, analysing scRNA-seq data is challenging due to issues and biases in data collection. In this work, we construct an integrated Bayesian model that simultaneously addresses normalization, imputation and batch effects and also nonparametrically clusters cells into groups across multiple datasets. A Gibbs sampler based on a finite-dimensional approximation of the HDP is developed for posterior inference.

Edinburgh Research Explorer

Ultra-fast Deep Mixtures of Gaussian Process Experts

Author: Etienam Clement
Law Kody
Wade Sara
Publication date: 11/06/2020

Mixtures of experts have become an indispensable tool for flexible modelling in a supervised learning context, and sparse Gaussian processes (GP) have shown promise as a leading candidate for the experts in such models. In the present article, we propose to design the gating network for selecting the experts from such mixtures of sparse GPs using a deep neural network (DNN). This combination provides a flexible, robust, and efficient model which is able to significantly outperform competing models. We furthermore consider efficient approaches to computing maximum a posteriori (MAP) estimators of these models by iteratively maximizing the distribution of experts given allocations and allocations given experts. We also show that a recently introduced method called Cluster-Classify-Regress (CCR) is capable of providing a good approximation of the optimal solution extremely quickly. This approximation can then be further refined with the iterative algorithm.

arXiv.org e-Print Archive

Leveraging variational autoencoders for multiple data imputation

Author: Roskams-Hieter Breeshey
Wade Sara
Wells Jude
Publication date: 30/09/2022

Missing data persists as a major barrier to data analysis across numerous applications. Recently, deep generative models have been used for imputation of missing data, motivated by their ability to capture highly non-linear and complex relationships in the data. In this work, we investigate the ability of deep models, namely variational autoencoders (VAEs), to account for uncertainty in missing data through multiple imputation strategies. We find that VAEs provide poor empirical coverage of missing data, with underestimation and overconfident imputations, particularly for more extreme missing data values. To overcome this, we employ $\beta$-VAEs, which, viewed from a generalized Bayes framework, provide robustness to model misspecification. Assigning a good value of $\beta$ is critical for uncertainty calibration and we demonstrate how this can be achieved using cross-validation. In downstream tasks, we show how multiple imputation with $\beta$-VAEs can avoid false discoveries that arise as artefacts of imputation.

Comment: 17 pages, 3 main figures, 6 supplementary figures

arXiv.org e-Print Archive

Pseudo-marginal Bayesian inference for Gaussian process latent variable models

Author: Gadd C.
Shah A. A.
Wade Sara K
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/06/2021

A Bayesian inference framework for supervised Gaussian process latent variable models is introduced. The framework overcomes the high correlations between latent variables and hyperparameters by collapsing the statistical model through approximate integration of the latent variables. Using an unbiased pseudo estimate for the marginal likelihood, the exact hyperparameter posterior can then be explored using collapsed Gibbs sampling and, conditional on these samples, the exact latent posterior can be explored through elliptical slice sampling. The framework is tested on both simulated and real examples. When compared with the standard approach based on variational inference, this approach leads to significant improvements in the predictive accuracy and quantification of uncertainty, as well as a deeper insight into the challenges of performing inference in this class of models.

Edinburgh Research Explorer

Warwick Research Archives Portal Repository

Colombian Women’s Life Patterns: A Multivariate Density Regression Approach

Author: Antoniano-Villalobos Isadora
Cremaschi Andrea
Piccarreta Raffaella
Wade Sara K
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 12/01/2021

Women in Colombia face difficulties related to the patriarchal traits of their societies and the well-known conflict afflicting the country since 1948. In this critical context, our aim is to study the relationship between baseline socio-demographic factors and variables associated with fertility, partnership patterns, and work activity. To best exploit the explanatory structure, we propose a Bayesian multivariate density regression model, which can accommodate mixed responses with censored, constrained, and binary traits. The flexible nature of the model allows for nonlinear regression functions and non-standard features in the errors, such as asymmetry or multi-modality. The model has interpretable covariate-dependent weights constructed through normalization, allowing for combinations of categorical and continuous covariates. Computational difficulties for inference are overcome through an adaptive truncation algorithm combining adaptive Metropolis-Hastings and sequential Monte Carlo to create a sequence of automatically truncated posterior mixtures. For our study on Colombian women's life patterns, a variety of quantities are visualised and described, and in particular, our findings highlight the detrimental impact of family violence on women's choices and behaviors.

Comment: to appear in Bayesian Analysis

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della Ricerca - Bocconi

Edinburgh Research Explorer

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari