112 research outputs found

Bayesian solutions to the label switching problem

Author: A. Jasra
G. Celeux
G. McLachlan
G.K. Gerber
J. Diebolt
J. Geweke
J. Munkres
J.S. Liu
M. Hurn
M. Postman
M. Stephens
R.M. Neal
Publication venue: Teknillinen korkeakoulu
Publication date: 01/01/2008

The label switching problem, the unidentifiability of the permutation of clusters or, more generally, of latent variables, makes it difficult to interpret results computed with MCMC sampling. We introduce a fully Bayesian treatment of the permutations which performs better than alternatives. The method can be used to compute summaries of the posterior samples even for nonparametric Bayesian methods, for which no good solutions exist so far. Although the method is approximate in this case, the results are very promising. The summaries are intuitively appealing: a summarized cluster is defined as a set of points for which the likelihood of being in the same cluster is maximized.

Crossref

Aaltodoc Publication Archive

Some discussions of D. Fearnhead and D. Prangle's Read Paper "Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation"

Author: Andrieu C
Barthelme S
Chopin N
Cornebise J
Doucet A
Girolami M
Jasra A
Kosmidis I
Lee A
Marin J-M
Pudlo P
Robert CP
Sedki M
Singh SS
Publication date: 10/07/2019

This report is a collection of comments on the Read Paper of Fearnhead and Prangle (2011), to appear in the Journal of the Royal Statistical Society Series B, along with a reply from the authors.

Spiral - Imperial College Digital Repository

An Adaptive Interacting Wang-Landau Algorithm for Automatic Density Exploration

While statisticians are well-accustomed to performing exploratory analysis in the modeling stage of an analysis, the notion of conducting preliminary general-purpose exploratory analysis in the Monte Carlo stage (or more generally, the model-fitting stage) of an analysis is an area which we feel deserves much further attention. Towards this aim, this paper proposes a general-purpose algorithm for automatic density exploration. The proposed exploration algorithm combines and expands upon components from various adaptive Markov chain Monte Carlo methods, with the Wang-Landau algorithm at its heart. Additionally, the algorithm is run on interacting parallel chains, a feature that both decreases computational cost and stabilizes the algorithm, improving its ability to explore the density. Performance is studied in several applications. Through a Bayesian variable selection example, the authors demonstrate the convergence gains obtained with interacting chains. The ability of the algorithm's adaptive proposal to induce mode-jumping is illustrated through a trimodal density and a Bayesian mixture modeling application. Lastly, through a 2D Ising model, the authors demonstrate the ability of the algorithm to overcome the high correlations encountered in spatial models. Comment: 33 pages, 20 figures (the supplementary materials are included as appendices).

arXiv.org e-Print Archive

Base de publications de l'université Paris-Dauphine

Crossref

INRIA a CCSD electronic archive server

Oxford University Research Archive

HAL-Polytechnique

Oskar Bordeaux

Analysis of ChIP-seq data via Bayesian finite mixture models with a non-parametric component

Author: A. Jasra
A. Nobile
C. E. Antoniak
C. E. Rodriguez
D. Nix
G. Celeux
G. McLachlan
J. Diebolt
J. Wang
M. D. Escobar
M. Sperrin
M. Stephens
P. F. Kuan
P. Green
S. Richardson
V. Hower
Y. F. M. Ramos
Y. Zhang
Z. S. Qin
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

In large discrete data sets that require classification into signal and noise components, the distribution of the signal is often very bumpy and does not follow a standard distribution. The signal distribution is therefore further modelled as a mixture of component distributions. However, when the signal component is modelled as a mixture of distributions, we face the challenges of justifying the number of components and of the label switching problem (caused by multimodality of the likelihood function). To circumvent these challenges, we propose a non-parametric structure for the signal component. This new method is more efficient, giving more precise estimates and better classifications. We demonstrate the efficacy of the methodology using a ChIP-sequencing data set.

University of Essex Research Repository

Crossref

Brunel University Research Archive

Dynamic Mixture-of-Experts Models for Longitudinal and Discrete-Time Survival Data

Author: A Doucet
A Jasra
B Muthén
C.-J Kim
D Cox
D Gamerman
D Nott
F Li
G McLachlan
I Ntzoufras
J Franzén
J Geweke
J Ibrahim
J Singer
J Vaupel
K Carling
K Huynh
K Mosler
L Baum
M Jordan
M Villani
Matias Quiroz
Mattias Villani
P Allison
P Giordani
R E Kass
R Jacobs
R Kass
R Kohn
R Miller
S Frühwirth-Schnatter
S Richardson
T Jacobson
T Lancaster
T Shumway
X Xue
Y Qi
Publication venue: Elsevier BV
Publication date: 01/01/2013

Crossref

A Novel Test for Gene-Ancestry Interactions in Genome-Wide Association Data

Author: A Chua
A Helgadottir
A Jasra
A Price
A Tenesa
B Zanke
C Constantine
Chris C. Holmes
D Lee
D Pollard
E Vlachodimitropoulou
E Zeggini
G Schwarz
H Cordell
I Tomlinson
Ian P. Tomlinson
J Delgado
J Novembre
J Wurzelmann
Jean-Baptiste Cazier
Joanna L. Davies
K Frazer
L Amundadottir
L Farrer
M Brookes
Malcolm G. Dunlop
N Chatterjee
N Patterson
P Broderick
P Donnelly
Peristera Paschou
R Nelson
Richard S. Houlston
RS Houlston
S Purcell
S Tuupanen
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2012

Genome-wide association study (GWAS) data on a disease are increasingly available from multiple related populations. In this scenario, meta-analyses can improve power to detect homogeneous genetic associations, but if there exist ancestry-specific effects, via interactions on genetic background or with a causal effect that co-varies with genetic background, then these will typically be obscured. To address this issue, we have developed a robust statistical method for detecting susceptibility gene-ancestry interactions in multi-cohort GWAS based on closely-related populations. We use the leading principal components of the empirical genotype matrix to cluster individuals into “ancestry groups” and then look for evidence of heterogeneous genetic associations with disease or other trait across these clusters. Robustness is improved when there are multiple cohorts, as the signal from true gene-ancestry interactions can then be distinguished from gene-collection artefacts by comparing the observed interaction effect sizes in collection groups relative to ancestry groups. When applied to colorectal cancer, we identified a missense polymorphism in iron-absorption gene CYBRD1 that associated with disease in individuals of English, but not Scottish, ancestry. The association replicated in two additional, independently-collected data sets. Our method can be used to detect associations between genetic variants and disease that have been obscured by population genetic heterogeneity. It can be readily extended to the identification of genetic interactions on other covariates such as measured environmental exposures. We envisage our methodology being of particular interest to researchers with existing GWAS data, as ancestry groups can be easily defined and thus tested for interactions.

Public Library of Science (PLOS)

Crossref

University of Birmingham Research Portal

Directory of Open Access Journals

PubMed Central

Edinburgh Research Explorer

Oxford University Research Archive

Institute of Cancer Research Repository

FigShare

Assessing Order Effects in Online Community-based Health Forums

Author: A Abbasi
A Ben-Sasson
A Ghose
A L Holbrook
A Moreno
A R Dennis
A Tversky
B Gu
B Knäuper
B Liu
C Shah
C Speelman
C.-P Wei
D A Schweidel
D Falbel
D Fallis
D J Woltz
D L Lasorsa
E Agichtein
E M Quinn
F North
F.-Y Wang
G Adomavicius
G Burtch
G D Logan
G S Wig
H Hu
I Blohm
J A Krosnick
J D Culver
J Jeon
J Klayman
J Mueller
J Nasiry
J Otterbacher
J P D'
J Wang
K A Carlson
K Goh
K R T Larsen
Keith Frey
L Bowler
L Papke
L R Wilkas
L Yan
M Chung
M Galesic
M Jasra
M Lash
M Liu
M Siering
M Weimer
N Huang
N M Bradburn
N Rajesh
N Schwarz
P Briggs
P Goes
P Lavrakas
P Scullard
P V Singh
Pew Research
Q Cao
R A Poldrack
R Aggarwal
R M Hogarth
R Y K Lau
Reza Mousavi
S Fox
S G McFarland
S Loria
S M Mudambi
S Oh
S R Das
S Spangler
S Stieglitz
T Sahni
T Wang
T. S. Raghu
V Storey
W Dou
W Jabr
W Li
X Li
X Luo
X.-B Li
Y Lu
Y Yao
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/01/2015

Measuring the quality of health content in online health forums is a challenging task. The majority of the existing measures are based on evaluations by forum users and may not be reliable. We employed machine learning techniques, text mining methods, and Big Data platforms to construct four measures of textual quality that automatically determine the similarity of a given answer to professional answers. We then used them to assess the quality of 66,888 answers posted in the Yahoo! Answers Health section. All four measures of textual quality revealed a higher quality for asker-selected best answers, indicating that askers, to some extent, exercise sound judgment in selecting the best answers. We also studied the presence of order effects in online health forums. Our results suggest that the textual quality of the first answer positively influences the mean textual quality of the subsequent answers and negatively influences the quantity of subsequent answers.

Crossref

AIS Electronic Library (AISeL)