Generalized Bayesian Record Linkage and Regression with Exact Error Propagation
Record linkage (de-duplication or entity resolution) is the process of
merging noisy databases to remove duplicate entities. Once duplicates have
been removed, the downstream task is any inferential, predictive, or other
post-linkage analysis of the linked data. One goal of the downstream task is
to obtain a larger reference data set that supports more accurate statistical
analyses. In addition, inherent record linkage uncertainty is passed on to
the downstream task. Motivated by these considerations, we propose a
generalized Bayesian record linkage method and consider
multiple regression analysis as the downstream task. Records are linked via a
random partition model, which allows for a wide class to be considered. In
addition, we jointly model the record linkage and downstream task, which allows
one to account for the record linkage uncertainty exactly. Moreover, the
joint model induces a feedback mechanism that propagates information from the
proposed Bayesian record linkage model into the downstream task. This
feedback effect is essential for eliminating potential biases that can
otherwise jeopardize the downstream analysis. We apply our methodology to
multiple linear regression, and
illustrate empirically that the "feedback effect" is able to improve the
performance of record linkage.
Comment: 18 pages, 5 figures
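As a rough sketch of how linkage uncertainty can propagate into a regression, the toy code below averages ordinary least squares fits over hypothetical posterior draws of the linkage partition. This is an illustrative two-stage approximation, not the authors' joint model: the data, the partitions, and the helper `ols_on_partition` are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 6 records, some of which duplicate the same latent entity.
X = rng.normal(size=(6, 2))
beta_true = np.array([1.5, -0.7])
y = X @ beta_true + rng.normal(scale=0.1, size=6)

# Hypothetical posterior samples of the linkage partition: each inner list
# groups record indices believed to refer to the same entity.
partitions = [
    [[0, 1], [2], [3, 4], [5]],
    [[0], [1], [2], [3, 4], [5]],
]

def ols_on_partition(part):
    # Collapse each entity's duplicate records by averaging, then fit OLS.
    Xe = np.array([X[idx].mean(axis=0) for idx in part])
    ye = np.array([y[idx].mean() for idx in part])
    beta, *_ = np.linalg.lstsq(Xe, ye, rcond=None)
    return beta

# Propagate linkage uncertainty: one regression fit per posterior partition,
# so the pooled coefficient draws reflect uncertainty about the linkage.
draws = np.array([ols_on_partition(p) for p in partitions])
post_mean = draws.mean(axis=0)
```

Pooling the per-partition fits is what keeps linkage error from being silently discarded before the regression stage.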
Probabilistic Clustering of Time-Evolving Distance Data
We present a novel probabilistic clustering model for objects that are
represented via pairwise distances and observed at different time points. The
proposed method utilizes the information given by adjacent time points to find
the underlying cluster structure and obtain a smooth cluster evolution. This
approach allows the number of objects and clusters to differ at every time
point, and no identification of the objects across time points is needed.
Further, the model does not require the number of clusters to be specified in
advance -- they are instead determined automatically using a Dirichlet process
prior. We validate our model on synthetic data showing that the proposed method
is more accurate than state-of-the-art clustering methods. Finally, we use our
dynamic clustering model to analyze and illustrate the evolution of brain
cancer patients over time.
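The Dirichlet process prior mentioned above determines the number of clusters automatically; one way to see this is through its Chinese restaurant process representation. The sketch below is illustrative only and omits the paper's coupling of adjacent time points:

```python
import numpy as np

def crp_sample(n, alpha, rng):
    """Draw one partition of n objects from a Chinese restaurant process
    prior with concentration alpha; the number of clusters is random."""
    assignments = [0]   # first object opens cluster 0
    counts = [1]        # counts[k] = current size of cluster k
    for _ in range(1, n):
        # Join cluster k with probability proportional to its size,
        # or open a new cluster with probability proportional to alpha.
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = int(rng.choice(len(probs), p=probs))
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments

rng = np.random.default_rng(1)
labels = crp_sample(50, alpha=2.0, rng=rng)
num_clusters = len(set(labels))
```

Larger `alpha` favors more clusters; in a full model this prior would be combined with a likelihood on the pairwise distances.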
Analysis of paediatric visual acuity using Bayesian copula models with sinh-arcsinh marginal densities
We analyse paediatric ophthalmic data from a large sample of children aged between 3 and 8 years. We modify the Bayesian additive conditional bivariate copula regression model of Klein and Kneib [1] by using sinh-arcsinh marginal densities with location, scale and shape parameters that depend smoothly on a covariate. We perform Bayesian inference about the unknown quantities of our model using a specially tailored Markov chain Monte Carlo algorithm. We gain new insights into the processes that determine changes in visual acuity with age, including the nature of joint changes in both eyes as modelled with the age-related copula dependence parameter. We analyse posterior predictive distributions to identify children with unusual sight characteristics, distinguishing those who are bivariate, but not univariate, outliers. In this way we provide an innovative tool that enables clinicians to identify children with unusual sight who may otherwise be missed. We compare our simultaneous Bayesian method with the two-step frequentist generalized additive modelling approach of Vatter and Chavez-Demoulin [2].
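As a rough illustration of the sinh-arcsinh marginal family, the sampler below assumes the Jones and Pewsey form of the transform; the parameter names are mine, not the paper's, and the shape parameters are held fixed rather than depending smoothly on age as in the model above.

```python
import numpy as np

def rsinh_arcsinh(n, loc, scale, skew, tail, rng):
    """Sample from a sinh-arcsinh distribution by transforming standard
    normal draws: skew controls asymmetry, tail controls tailweight.
    skew=0 and tail=1 recover the normal(loc, scale) distribution."""
    z = rng.standard_normal(n)
    return loc + scale * np.sinh((np.arcsinh(z) + skew) / tail)

rng = np.random.default_rng(2)
# Hypothetical acuity-like draws with mild positive skew and heavier tails.
x = rsinh_arcsinh(10_000, loc=0.3, scale=0.1, skew=0.5, tail=0.8, rng=rng)
```

In the copula regression setting, a density of this form would serve as each eye's marginal, with the dependence between eyes captured separately by the copula.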
Forest modelling: the gamma shape mixture model and simulation of tree diameter distributions
Restricting exchangeable nonparametric distributions
Distributions over exchangeable matrices with infinitely many columns, such as the Indian buffet process, are useful in constructing nonparametric latent variable models. However, the distribution implied by such models over the number of features exhibited by each data point may be poorly suited for many modeling tasks. In this paper, we propose a class of exchangeable nonparametric priors obtained by restricting the domain of existing models. Such models allow us to specify the distribution over the number of features per data point, and can achieve better performance on data sets where the number of features is not well-modeled by the original distribution.
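As a crude illustration of restricting the domain of an exchangeable prior, the sketch below draws binary feature matrices from a standard Indian buffet process and keeps only draws in which every data point exhibits at most a fixed number of features. The rejection step is just a stand-in for the paper's restricted priors, and the cap `max_per_row` is an invented example parameter.

```python
import numpy as np

def ibp_sample(n, alpha, rng):
    """One binary feature matrix Z (n rows, random column count) from the
    Indian buffet process with concentration alpha."""
    rows, counts = [], []  # counts[k] = number of earlier rows with feature k
    for i in range(1, n + 1):
        # Take each existing dish k with probability counts[k]/i,
        # then sample Poisson(alpha/i) brand-new dishes.
        row = [1 if rng.random() < m / i else 0 for m in counts]
        new = int(rng.poisson(alpha / i))
        row.extend([1] * new)
        counts = [m + z for m, z in zip(counts, row[:len(counts)])]
        counts.extend([1] * new)
        rows.append(row)
    Z = np.zeros((n, len(counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

def restricted_ibp_sample(n, alpha, max_per_row, rng, tries=10_000):
    """Restrict the IBP's domain by rejection: keep only draws in which
    every data point exhibits at most max_per_row features."""
    for _ in range(tries):
        Z = ibp_sample(n, alpha, rng)
        if Z.shape[1] == 0 or Z.sum(axis=1).max() <= max_per_row:
            return Z
    raise RuntimeError("no accepted draw within the try budget")

rng = np.random.default_rng(3)
Z = restricted_ibp_sample(5, alpha=1.0, max_per_row=3, rng=rng)
```

Rejection is wasteful when the restriction is severe, which is one reason a directly restricted prior is attractive; the point here is only that the restricted model controls the per-row feature counts that the plain IBP leaves free.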