35 research outputs found

Scaling analysis of MCMC algorithms

Author: Thiéry Alexandre H.

Markov Chain Monte Carlo (MCMC) methods have become a workhorse for modern scientific computations. Practitioners utilize MCMC in many different areas of applied science yet very few rigorous results are available for justifying the use of these methods. The purpose of this dissertation is to analyse random walk type MCMC algorithms in several limiting regimes that frequently occur in applications. Scaling limit arguments are used as a unifying method for studying the asymptotic complexity of these MCMC algorithms. Two distinct strands of research are developed: (a) We analyse and prove diffusion limit results for MCMC algorithms in high or infinite dimensional state spaces. Contrary to previous results in the literature, the target distributions that we consider do not have a product structure; this leads to Stochastic Partial Differential Equation (SPDE) limits. This proves among other things that optimal proposal results already known for product form target distributions extend to much more general settings. We then show how to use these MCMC algorithms in an infinite dimensional Hilbert space in order to imitate a gradient descent without computing any derivative. (b) We analyse the behaviour of the Random Walk Metropolis (RWM) algorithm when used to explore target distributions concentrating on the neighbourhood of a low dimensional manifold of ℝ^n. We prove that the algorithm behaves, after being suitably rescaled, as a diffusion process evolving on a manifold

Warwick Research Archives Portal Repository

Noisy gradient flow from a random walk in Hilbert space

Author: Pillai Natesh S.
Stuart Andrew M.
Thiéry Alexandre H.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/06/2014

Consider a probability measure on a Hilbert space defined via its density with respect to a Gaussian. The purpose of this paper is to demonstrate that an appropriately defined Markov chain, which is reversible with respect to the measure in question, exhibits a diffusion limit to a noisy gradient flow, also reversible with respect to the same measure. The Markov chain is defined by applying a Metropolis–Hastings accept–reject mechanism (Tierney, Ann Appl Probab 8:1–9, 1998) to an Ornstein–Uhlenbeck (OU) proposal which is itself reversible with respect to the underlying Gaussian measure. The resulting noisy gradient flow is a stochastic partial differential equation driven by a Wiener process with spatial correlation given by the underlying Gaussian structure. There are two primary motivations for this work. The first concerns insight into Monte Carlo Markov Chain (MCMC) methods for sampling of measures on a Hilbert space defined via a density with respect to a Gaussian measure. These measures must be approximated on finite dimensional spaces of dimension N in order to be sampled. A conclusion of the work herein is that MCMC methods based on prior-reversible OU proposals will explore the target measure in O(1) steps with respect to dimension N. This is to be contrasted with standard MCMC methods based on the random walk or Langevin proposals which require O(N) and O(N^(1/3)) steps respectively (Mattingly et al., Ann Appl Prob 2011; Pillai et al., Ann Appl Prob 22:2320–2356 2012). The second motivation relates to optimization. There are many applications where it is of interest to find global or local minima of a functional defined on an infinite dimensional Hilbert space. Gradient flow or steepest descent is a natural approach to this problem, but in its basic form requires computation of a gradient which, in some applications, may be an expensive or complex task. 
This paper shows that a stochastic gradient descent described by a stochastic partial differential equation can emerge from certain carefully specified Markov chains. This idea is well-known in the finite state (Kirkpatrick et al., Science 220:671–680, 1983; Cerny, J Optim Theory Appl 45:41–51, 1985) or finite dimensional context (German, IEEE Trans Geosci Remote Sens 1:269–276, 1985; German, SIAM J Control Optim 24:1031, 1986; Chiang, SIAM J Control Optim 25:737–753, 1987; J Funct Anal 83:333–347, 1989). The novelty of the work in this paper is that the emergence of the noisy gradient flow is developed on an infinite dimensional Hilbert space. In the context of global optimization, when the noise level is also adjusted as part of the algorithm, methods of the type studied here go by the name of simulated annealing; see the review (Bertsimas and Tsitsiklis, Stat Sci 8:10–15, 1993) for further references. Although we do not consider adjusting the noise level as part of the algorithm, the noise strength is a tuneable parameter in our construction and the methods developed here could potentially be used to study simulated annealing in a Hilbert space setting. The transferable idea behind this work is that conceiving of algorithms directly in the infinite dimensional setting leads to methods which are robust to finite dimensional approximation. We emphasize that discretizing, and then applying standard finite dimensional techniques in ℝ^N, to either sample or optimize, can lead to algorithms which degenerate as the dimension N increases.
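
As a rough illustration of the mechanism the abstract describes, the sketch below applies a Metropolis–Hastings accept–reject step to a proposal that is reversible with respect to a Gaussian reference measure N(0, C), on a finite dimensional approximation. It is a generic sketch under stated assumptions, not the authors' construction: the names `pcn_chain`, `phi`, `sqrt_cov` and `beta` are hypothetical, the target is assumed to have density proportional to exp(-phi(x)) with respect to N(0, C), and the paper's exact proposal scaling and noise parameter are not reproduced.

```python
import numpy as np


def pcn_chain(phi, sqrt_cov, x0, beta, n_steps, rng):
    """Metropolis-Hastings with a Gaussian-reversible autoregressive proposal.

    Assumptions (illustrative, not taken from the paper):
      phi      -- negative log-density of the target w.r.t. N(0, C)
      sqrt_cov -- a matrix S with S @ S.T == C
      beta     -- step size in (0, 1]
    """
    x = np.asarray(x0, dtype=float)
    phi_x = phi(x)
    samples = np.empty((n_steps, x.size))
    accepted = 0
    for k in range(n_steps):
        # Draw xi ~ N(0, C) and form a proposal that preserves N(0, C).
        xi = sqrt_cov @ rng.standard_normal(x.size)
        y = np.sqrt(1.0 - beta**2) * x + beta * xi
        phi_y = phi(y)
        # Because the proposal is reversible w.r.t. the Gaussian, the
        # acceptance ratio only involves the change in phi.
        if np.log(rng.uniform()) < phi_x - phi_y:
            x, phi_x = y, phi_y
            accepted += 1
        samples[k] = x
    return samples, accepted / n_steps


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 50
    # Hypothetical potential, chosen only for demonstration.
    phi = lambda x: 0.5 * float(np.sum((x - 1.0) ** 2))
    samples, rate = pcn_chain(phi, np.eye(dim), np.zeros(dim), 0.2, 2000, rng)
    print("acceptance rate:", rate)
```

Note that no gradient of `phi` is evaluated anywhere in the loop, which is the sense in which such a chain can mimic a noisy gradient flow without computing derivatives.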

Caltech Authors

Comparison of calling pipelines for whole genome sequencing: an empirical study demonstrating the importance of mapping and alignment

Author: Aguilera-Garcia Domingo
Betschart Raphael O
Blankenberg Stefan
Moch Holger
Thiéry Alexandre
Twerenbold Raphael
Zeller Tanja
Ziegler Andreas
Zoche Martin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/12/2022

Rapid advances in high-throughput DNA sequencing technologies have enabled the conduct of whole genome sequencing (WGS) studies, and several bioinformatics pipelines have become available. The aim of this study was the comparison of 6 WGS data pre-processing pipelines, involving two mapping and alignment approaches (GATK utilizing BWA-MEM2 2.2.1, and DRAGEN 3.8.4) and three variant calling pipelines (GATK 4.2.4.1, DRAGEN 3.8.4 and DeepVariant 1.1.0). We sequenced one genome in a bottle (GIAB) sample 70 times in different runs, and one GIAB trio in triplicate. The truth set of the GIABs was used for comparison, and performance was assessed by computation time, F1 score, precision, and recall. In the mapping and alignment step, the DRAGEN pipeline was faster than the GATK with BWA-MEM2 pipeline. DRAGEN showed systematically higher F1 score, precision, and recall values than GATK for single nucleotide variations (SNVs) and Indels in simple-to-map, complex-to-map, coding and non-coding regions. In the variant calling step, DRAGEN was fastest. In terms of accuracy, DRAGEN and DeepVariant performed similarly and both superior to GATK, with slight advantages for DRAGEN for Indels and for DeepVariant for SNVs. The DRAGEN pipeline showed the lowest Mendelian inheritance error fraction for the GIAB trios. Mapping and alignment played a key role in variant calling of WGS, with the DRAGEN outperforming GATK
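
For readers unfamiliar with the accuracy metrics named in this abstract, the following minimal sketch shows how precision, recall and F1 score are conventionally computed from counts of true positive, false positive and false negative variant calls against a truth set. The function name and the example counts are hypothetical and are not taken from the study.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from call counts against a truth set.

    tp -- calls that match the truth set
    fp -- calls absent from the truth set
    fn -- truth-set variants that were not called
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(precision_recall_f1(tp=90, fp=10, fn=30))
```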

ZORA

OCT-GAN: single step shadow and noise removal from optical coherence tomography images of the human optic nerve head

Author: Aung Tin
Boote Craig
Buist Martin L.
Cheong Haris
Chuangsuwanich Thanadet
Girard Michaël J. A.
Krishna Devalla Sripad
Schmetterer Leopold
Thiéry Alexandre H.
Tun Tin A.
Wang Xiaofei
Publication venue: 'The Optical Society'
Publication date: 01/03/2021

Speckle noise and retinal shadows within OCT B-scans occlude important edges, fine textures and deep tissues, preventing accurate and robust diagnosis by algorithms and clinicians. We developed a single process that successfully removed both noise and retinal shadows from unseen single-frame B-scans within 10.4 ms. Mean average gradient magnitude (AGM) for the proposed algorithm was 57.2% higher than current state-of-the-art, while mean peak signal to noise ratio (PSNR), contrast to noise ratio (CNR), and structural similarity index metric (SSIM) increased by 11.1%, 154% and 187% respectively compared to single-frame B-scans. Mean intralayer contrast (ILC) improvement for the retinal nerve fiber layer (RNFL), photoreceptor layer (PR) and retinal pigment epithelium (RPE) layers decreased from 0.362 ± 0.133 to 0.142 ± 0.102, 0.449 ± 0.116 to 0.0904 ± 0.0769, 0.381 ± 0.100 to 0.0590 ± 0.0451 respectively. The proposed algorithm reduces the necessity for long image acquisition times, minimizes expensive hardware requirements and reduces motion artifacts in OCT images

Online Research @ Cardiff

The genomic origins of the world’s first farmers

The precise genetic origins of the first Neolithic farming populations in Europe and Southwest Asia, as well as the processes and the timing of their differentiation, remain largely unknown. Demogenomic modeling of high-quality ancient genomes reveals that the early farmers of Anatolia and Europe emerged from a multiphase mixing of a Southwest Asian population with a strongly bottlenecked western hunter-gatherer population after the last glacial maximum. Moreover, the ancestors of the first farmers of Europe and Anatolia went through a period of extreme genetic drift during their westward range expansion, contributing highly to their genetic distinctiveness. This modeling elucidates the demographic processes at the root of the Neolithic transition and leads to a spatial interpretation of the population history of Southwest Asia and Europe during the late Pleistocene and early Holocene.

Repository for Publications and Research Data

VU Research Portal

PubMed Central

The University of Arizona

Leiden University Scholary Publications

Bern Open Repository and Information System (BORIS)

MPG.PuRe

Defining the importance of landscape metrics for large branchiopod biodiversity and conservation: the case of the Iberian Peninsula and Balearic Islands

Author: A Asem
A Lumbreras
A Mabidi
A Thiéry
A Waterkeyn
A-C Weibull
Alexandre Miró
B Samraoui
BJ O’Neill
BJ Robson
BV Timms
C Strobl
D Belk
D Boix
D Borcard
D Verdiell-Cubedo
Dani Boix
David Cunillera-Montcusí
David Verdiell-Cubedo
DC Rogers
DG Angeler
DL Hall
DM Griffith
E Eder
EC Underwood
ER Roeck De
F Amat
F Marrone
F Prunier
F Scanabissi
F Stoch
Florent Prunier
Francisco Amat
G Alfonso
G Mura
H Ben Naceur
I Zacharias
J García de Lomas
J Muñoz
Javier Ripoll
JB Gallego-Fernández
Joan Lluís Pretus
Jordi Sala
José Luis Pérez-Bote
Juan García-de-Lomas
Juan Rueda
K Martens
L Boven
L Brendonck
L Cancela da Fonseca
L Rhazi
L Rosati
Laura Serrano
Luís Cancela da Fonseca
M Alonso
M Broeck Van den
M Korn
M Logan
M Machado
M Nourisson
M Sahuquillo
Marc Ventura
Margarida Cristo
Margarida Machado
Margarita Florencio
Maria Rosa Miracle
María Sahuquillo
MC Peel
Miguel Alonso
MJ Crawley
MR Miracle
MS Ghomari
N Rabet
NH Euliss Jr
OE Sala
P Abellán
P Beja
PC Rodríguez-Flores
S Bagella
S Gascón
S Redón
Stéphanie Gascón
T Caro
T Hartel
T Hothorn
T Nhiwatiwa
T Siqueira
V Cottarelli
Z Horváth
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

The lack of distributional data for invertebrate taxa is one of the major impediments contributing to the low awareness of their conservation status. The present study sets a basic framework to understand the large branchiopod distribution in the Iberian Peninsula and Balearic Islands. Since the extensive surveys performed in the late 1980s, no study has updated the information for the whole study area. The present study fills the gap, gathering together all available information on large branchiopod distribution since 1995, and analysing the effect of human population density and several landscape characteristics on their distribution, taking into consideration different spatial scales (100 m, 1 km and 10 km). Overall, 28 large branchiopod taxa (17 anostracans, 7 notostracans and 4 spinicaudatans) are known to occur in the area. Approximately 30% of the sites hosted multiple species, with a maximum of 6 species. Significant positive co-occurring species pairs were found clustered together, forming 4 different associations of large branchiopod species. In general, species clustered in the same group showed similar responses to analysed landscape characteristics, usually showing a better fit at higher spatial scales.

Funding: Brazilian Conselho Nacional de Desenvolvimento Cientifico e Tecnologico-CNPq [401045/2014-5]; Spanish Ministry of Education, Culture and Sport [FPU014/06783].

Crossref

Universidade de Lisboa: Repositório.UL

Sapientia

Digital.CSIC

Structural, hystochemical and cytochemical characteristics of the stigma and style in Passiflora edulis f. flavicarpa (Passifloraceae)

Crossref