
6,639 research outputs found

Multimodal nested sampling: an efficient and robust alternative to MCMC methods for astronomical data analysis

Author: Alfano
Allanach
Basset
Beltran
Bennett
Bridges
Bryan
Dunkley
F. Feroz
Girshick
Hobson
Hobson
Jeffreys
Liddle
M. P. Hobson
MacKay
Marshall
Mukherjee
Niarchou
O'Ruanaidh
Shaw
Sivia
Skilling
Slosar
Trotta
Verde
Publication venue: 'Wiley'
Publication date: 23/07/2007

In performing a Bayesian analysis of astronomical data, two difficult problems often emerge. First, in estimating the parameters of some model for the data, the resulting posterior distribution may be multimodal or exhibit pronounced (curving) degeneracies, which can cause problems for traditional MCMC sampling methods. Second, in selecting between a set of competing models, calculation of the Bayesian evidence for each model is computationally expensive. The nested sampling method introduced by Skilling (2004) has greatly reduced the computational expense of calculating evidences and also produces posterior inferences as a by-product. This method has been applied successfully in cosmological applications by Mukherjee et al. (2006), but their implementation was efficient only for unimodal distributions without pronounced degeneracies. Shaw et al. (2007) recently introduced a clustered nested sampling method that is significantly more efficient in sampling from multimodal posteriors and also determines the expectation and variance of the final evidence from a single run of the algorithm, hence providing a further increase in efficiency. In this paper, we build on the work of Shaw et al. and present three new methods for sampling and evidence evaluation from distributions that may contain multiple modes and significant degeneracies; we also present an even more efficient technique for estimating the uncertainty on the evaluated evidence. These methods lead to a further substantial improvement in sampling efficiency and robustness, and are applied to toy problems to demonstrate the accuracy and economy of the evidence calculation and parameter estimation. Finally, we discuss the use of these methods in performing Bayesian object detection in astronomical datasets.

Comment: 14 pages, 11 figures, submitted to MNRAS, some major additions to the previous version in response to the referee's comment
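
As a rough illustration of the basic nested sampling idea the abstract refers to (Skilling 2004), the sketch below estimates the evidence for a toy problem. It is not the multimodal or clustered method of the paper: the 2-D Gaussian likelihood, the uniform prior on a square, the function names, and the use of plain rejection sampling from the prior to replace the worst live point are all assumptions chosen for brevity.

```python
import math
import random


def log_likelihood(theta):
    """Log-density of a standard 2-D Gaussian (hypothetical toy likelihood)."""
    x, y = theta
    return -0.5 * (x * x + y * y) - math.log(2.0 * math.pi)


def sample_prior(rng):
    """Draw one point from the assumed uniform prior on [-5, 5] x [-5, 5]."""
    return (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))


def nested_sampling(n_live=100, n_iter=700, seed=0):
    """Return an estimate of log(evidence) using basic nested sampling."""
    rng = random.Random(seed)
    live = [sample_prior(rng) for _ in range(n_live)]
    live_logl = [log_likelihood(p) for p in live]

    z = 0.0        # running evidence estimate
    x_prev = 1.0   # prior volume enclosed before this iteration
    for i in range(1, n_iter + 1):
        # The live point with the lowest likelihood defines the new contour.
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_min = live_logl[worst]

        # Expected shrinkage of the enclosed prior volume: X_i = exp(-i / N).
        x_i = math.exp(-i / n_live)
        z += math.exp(logl_min) * (x_prev - x_i)
        x_prev = x_i

        # Replace the worst point with a prior draw above the contour.
        # Plain rejection sampling is simple but becomes slow as X shrinks.
        while True:
            candidate = sample_prior(rng)
            candidate_logl = log_likelihood(candidate)
            if candidate_logl > logl_min:
                break
        live[worst] = candidate
        live_logl[worst] = candidate_logl

    # Add the contribution of the remaining live points.
    z += x_prev * sum(math.exp(l) for l in live_logl) / n_live
    return math.log(z)


log_z = nested_sampling()
print(f"estimated log-evidence: {log_z:.3f}")
print(f"analytic log-evidence (approx.): {math.log(0.01):.3f}")
```

For this toy setup the likelihood integrates to nearly 1 inside the square and the prior density is 1/100, so the estimate should land near log(0.01). The rejection step is the part that degrades on multimodal or degenerate posteriors, which is the inefficiency that more elaborate schemes aim to address.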

arXiv.org e-Print Archive

Identifying Mixtures of Mixtures Using Bayesian Estimation

Author: Frühwirth-Schnatter Sylvia
Grün Bettina
Malsiner-Walli Gertraud
Publication venue: 'Informa UK Limited'
Publication date: 20/06/2016

The use of a finite mixture of normal distributions in model-based clustering makes it possible to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and is generally achieved either by imposing constraints on the model or by using post-processing procedures. Within the Bayesian framework, we propose a different approach based on sparse finite mixtures to achieve identifiability. We specify a hierarchical prior whose hyperparameters are carefully selected to reflect the intended cluster structure. In addition, this prior allows the model to be estimated using standard MCMC sampling methods. In combination with a post-processing approach that resolves the label switching issue and results in an identified model, our approach makes it possible to simultaneously (1) determine the number of clusters, (2) flexibly approximate the cluster distributions in a semi-parametric way using finite mixtures of normals and (3) identify cluster-specific parameters and classify observations. The proposed approach is illustrated in two simulation studies and on benchmark data sets.

Comment: 49 page

arXiv.org e-Print Archive

FigShare

Machine Learning and Integrative Analysis of Biomedical Big Data.

Author: Choi Howard
Chung Neo Christopher
Mirza Bilal
Ping Peipei
Wang Jie
Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.

Directory of Open Access Journals

eScholarship - University of California

Spatial Guilds in the Serengeti Food Web Revealed by a Bayesian Group Model

Author: A Bodini
A Clauset
A Krause
A Sinclair
Andy P. Dobson
B Good
B Mau
BM Bolker
C Geyer
CS Elton
D Reagan
D Vesey-FitzGerald
DN Reed
E Rezende
E Thébault
Edward B. Baskerville
G Polis
G Schaller
H Jeffreys
H Jeffreys
H Jeong
H Kruuk
J Bascompte
J Dunne
J Lamprecht
J Memmott
J Teng
JE Cohen
JE Cohen
JJ Luczkovich
K McCann
L Camerano
L Talbot
Lauren Ancel Meyers
LM Talbot
M Girvan
M Murray
M Newman
M Poelchau
M Sales-Pardo
MA McCarthy
Mercedes Pascual
N Lartillot
N Martinez
N Rooney
P Hoff
R Casebeer
R Guimera
R Hansen
R Kass
R Williams
R Williams
RM May
S Allesina
S Allesina
S Cooper
S McNaughton
S McNaughton
S Pimm
SN de Visser
Stefano Allesina
T Burns
T Caro
T Duong
T Ferguson
T. Michael Anderson
TM Anderson
Trevor Bedford
V Novotny
Y Park
Y Wang
Z Yang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 11/01/2011

Food webs, networks of feeding relationships among organisms, provide fundamental insights into mechanisms that determine ecosystem stability and persistence. Despite long-standing interest in the compartmental structure of food webs, past network analyses of food webs have been constrained by a standard definition of compartments, or modules, that requires many links within compartments and few links between them. Empirical analyses have been further limited by low-resolution data for primary producers. In this paper, we present a Bayesian computational method for identifying group structure in food webs using a flexible definition of a group that can describe both functional roles and standard compartments. The Serengeti ecosystem provides an opportunity to examine structure in a newly compiled food web that includes species-level resolution among plants, allowing us to address whether groups in the food web correspond to tightly-connected compartments or functional groups, and whether network structure reflects spatial or trophic organization, or a combination of the two. We have compiled the major mammalian and plant components of the Serengeti food web from published literature, and we infer its group structure using our method. We find that network structure corresponds to spatially distinct plant groups coupled at higher trophic levels by groups of herbivores, which are in turn coupled by carnivore groups. Thus the group structure of the Serengeti web represents a mixture of trophic guild structure and spatial patterns, in contrast to the standard compartments typically identified in ecological networks. From data consisting only of nodes and links, the group structure that emerges supports recent ideas on spatial coupling and energy channels in ecosystems that have been proposed as important for persistence.

Comment: 28 pages, 6 figures (+ 3 supporting), 2 tables (+ 4 supporting

arXiv.org e-Print Archive

Directory of Open Access Journals

Modeling and analysis of residential flexibility: timing of white good usage

Author: Benoit Dries
Demeester Thomas
Develder Chris
Sadeghianpourhamami Nasrin
Strobbe Matthias
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016