9,393 research outputs found

Genealogies of rapidly adapting populations

Author: Bedford
Brunet
Brunet
Cohen
Drummond
Hermisson
Krone
O. Hallatschek
R. A. Neher
Rouzine
Stephan
Tsimring
Yule
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 10/12/2012

The genetic diversity of a species is shaped by its recent evolutionary history and can be used to infer demographic events or selective sweeps. Most inference methods are based on the null hypothesis that natural selection is a weak or infrequent evolutionary force. However, many species, particularly pathogens, are under continuous pressure to adapt in response to changing environments. A statistical framework for inference from diversity data of such populations is currently lacking. Toward this goal, we explore the properties of genealogies in a model of continual adaptation in asexual populations. We show that lineages trace back to a small pool of highly fit ancestors, in which almost simultaneous coalescence of more than two lineages frequently occurs. While such multiple mergers are unlikely under the neutral coalescent, they create a unique genetic footprint in adapting populations. The site frequency spectrum of derived neutral alleles, for example, is non-monotonic and has a peak at high frequencies, whereas Tajima's D becomes more and more negative with increasing sample size. Since multiple merger coalescents emerge in many models of rapid adaptation, we argue that they should be considered as a null model for adapting populations. Comment: to appear in PNA
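
The two summary statistics named in this abstract can be sketched in a few lines. The block below is an illustrative sketch only, not code from the paper: it assumes the input is a list of derived-allele counts per site (a hypothetical input format), and it uses the standard textbook definition of Tajima's D. The function names are made up for this example.

```python
import math


def site_frequency_spectrum(derived_counts, n):
    """Unfolded SFS: entry i-1 is the number of sites whose derived allele
    appears in exactly i of the n sampled sequences (i = 1 .. n-1)."""
    sfs = [0] * (n - 1)
    for k in derived_counts:
        if 0 < k < n:  # ignore sites that are not segregating in the sample
            sfs[k - 1] += 1
    return sfs


def tajimas_d(derived_counts, n):
    """Tajima's D from per-site derived-allele counts in a sample of size n."""
    counts = [k for k in derived_counts if 0 < k < n]
    s = len(counts)  # number of segregating sites
    if n < 4 or s == 0:
        # The variance term is zero for n < 4, so D is undefined there.
        raise ValueError("need n >= 4 and at least one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    # Mean pairwise difference (pi) and Watterson's estimator.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in counts)
    theta_w = s / a1
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))
```

An excess of rare variants (many singletons) pushes D below zero, while an excess of intermediate-frequency variants pushes it above zero. The sketch does not simulate the adapting-population genealogies the abstract describes.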

arXiv.org e-Print Archive

A Strategy analysis for genetic association studies with known inbreeding

Author: Bertolino Francesco
Biino Ginevra
Cabras Stefano
Castellanos Maria Eugenia
Casula Laura
Del Giacco Stefano
Persico Ivana
Pirastu Mario
Pirastu Nicola
Sassu Alessandro
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Background: Association studies aim to identify the genetic variants related to a specific disease through statistical multiple hypothesis testing or segregation analysis in pedigrees. This type of study has been very successful for Mendelian monogenic disorders, but less successful in identifying genetic variants related to complex diseases, whose onset depends on interactions between different genes and the environment. Current technology makes it possible to genotype more than a million markers, and this number has increased rapidly in recent years with imputation based on template sets and whole genome sequencing. This type of data introduces a great amount of noise into the statistical analysis and usually requires a large number of samples. Current methods seldom take into account gene-gene and gene-environment interactions, which are fundamental, especially in complex diseases. In this paper we propose to use a non-parametric additive model, which accounts for interactions of unknown order, to detect the genetic variants related to diseases. Although this is not new to the current literature, we show that in an isolated population, where the most closely related subjects also share most of their genetic code, the use of additive models may be improved if the available genealogical tree is taken into account. Specifically, we form a sample of cases and controls with the highest inbreeding by means of the Hungarian method, and estimate the set of genes and environmental variables associated with the disease by means of Random Forest. Results: We have evidence, from statistical theory, simulations and two applications, that the procedure we build eliminates stratification between cases and controls and has enough precision to identify genetic variants responsible for a disease. This procedure has been successfully applied to beta-thalassemia, a well-known Mendelian disease, and to common asthma, where we have identified candidate genes that underlie susceptibility to asthma. Some of these candidate genes have also been found to be related to common asthma in the current literature. Conclusions: The data analysis approach, based on selecting the most related cases and controls along with the Random Forest model, is a powerful tool for detecting genetic variants associated with a disease in isolated populations. Moreover, this method also provides a prediction model that is accurate in estimating the unknown disease status and that can generally be used to build test kits for a wide class of Mendelian diseases.

Archivio istituzionale della ricerca - Università di Trieste

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

Archivio istituzionale della ricerca - Università di Cagliari

UnissResearch

Stochastic modelling, Bayesian inference, and new in vivo measurements elucidate the debated mtDNA bottleneck mechanism

Author: Aiken
Apostolova
Bacman
Beaumont
Bergstrom
Bogenhagen
Brand
Bredenoord
Burgstaller
Burgstaller
Bustin
Cao
Cao
Capps
Chae
Chinnery
Craven
Cree
de Vries
Deffieu
Detmer
Drost
Elson
Futuyma
Gardiner
Gilkerson
Gillespie
Greene
Hastings
Hill
Hoitzing
Jacobs
Jakobs
Jenuth
Johnston
Johnston
Johnston
Khrapko
Kimura
Kukat
Lawson
Lee
Lightowlers
Marchington
Marjoram
Monnot
Monnot
Mouli
Nunnari
Paull
Poe
Poovathingal
Poulton
Poulton
Rand
Rausenberger
Rossignol
Samuels
Sentelle
Steffann
Steinborn
Sunnåker
Tachibana
Toni
Treff
Twig
Twig
Wai
Wallace
Wallace
Wolff
Wonnapinij
Wonnapinij
Youle
Youle
Publication venue: 'eLife Sciences Publications, Ltd'
Publication date: 29/05/2015

Dangerous damage to mitochondrial DNA (mtDNA) can be ameliorated during mammalian development through a highly debated mechanism called the mtDNA bottleneck. Uncertainty surrounding this process limits our ability to address inherited mtDNA diseases. We produce a new, physically motivated, generalisable theoretical model for mtDNA populations during development, allowing the first statistical comparison of proposed bottleneck mechanisms. Using approximate Bayesian computation and mouse data, we find most statistical support for a combination of binomial partitioning of mtDNAs at cell divisions and random mtDNA turnover, meaning that the debated exact magnitude of mtDNA copy number depletion is flexible. New experimental measurements from a wild-derived mtDNA pairing in mice confirm the theoretical predictions of this model. We analytically solve a mathematical description of this mechanism, computing probabilities of mtDNA disease onset, efficacy of clinical sampling strategies, and effects of potential dynamic interventions, thus developing a quantitative and experimentally-supported stochastic theory of the bottleneck. Comment: Main text: 14 pages, 5 figures; Supplement: 17 pages, 4 figures; Total: 31 pages, 9 figure
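
To illustrate how binomial partitioning at cell divisions can spread heteroplasmy (the mutant fraction) across cells, here is a toy simulation. It is a sketch under strong assumptions and is not the authors' model: each mtDNA molecule is inherited by the tracked daughter with probability 1/2, and the cell then copies randomly chosen molecules until it is back at a fixed copy number. Random replication stands in here for the turnover mentioned in the abstract, and degradation is not modelled. All names and parameter values (`target`, `h0`, `divisions`) are hypothetical.

```python
import random


def divide_and_regrow(mutant, wild, target, rng):
    """One cell cycle: binomial partitioning, then random regrowth to `target`."""
    # Binomial partitioning: each molecule goes to the tracked daughter
    # independently with probability 1/2.
    m = sum(rng.random() < 0.5 for _ in range(mutant))
    w = sum(rng.random() < 0.5 for _ in range(wild))
    # Random regrowth: copy a uniformly chosen molecule until the target
    # copy number is restored (skipped if the daughter inherited nothing).
    while 0 < m + w < target:
        if rng.random() < m / (m + w):
            m += 1
        else:
            w += 1
    return m, w


def heteroplasmy_variance(n_cells=500, divisions=10, target=100, h0=0.5, seed=1):
    """Variance of heteroplasmy across independent cell lineages."""
    rng = random.Random(seed)
    start_mutant = round(h0 * target)
    values = []
    for _ in range(n_cells):
        m, w = start_mutant, target - start_mutant
        for _ in range(divisions):
            m, w = divide_and_regrow(m, w, target, rng)
        if m + w > 0:
            values.append(m / (m + w))
    mean = sum(values) / len(values)
    return sum((h - mean) ** 2 for h in values) / len(values)
```

Calling `heteroplasmy_variance(divisions=0)` returns zero, since every cell starts at the same heteroplasmy, and the variance grows once divisions are applied.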

arXiv.org e-Print Archive

Crossref

University of Birmingham Research Portal

PubMed Central

Spiral - Imperial College Digital Repository

Selection strategies for randomly partitioned genetic replicators

Author: Rondelez Yannick
Zadorin Anton S.
Publication venue: 'American Physical Society (APS)'
Publication date: 15/05/2019

The amplification cycle of many replicators (natural or artificial) involves the use of a host compartment, inside of which the replicator expresses phenotypic compounds necessary to carry out its genetic replication. For example, viruses infect cells, where they express their own proteins and replicate. In this process, the host cell boundary limits the diffusion of the viral protein products, thereby ensuring that phenotypic compounds, such as proteins, promote the replication of the genes that encoded them. This role of maintaining spatial co-localization, also called genotype-phenotype linkage, is a critical function of compartments in natural selection. In most cases, however, individual replicating elements do not distribute systematically among the hosts, but are randomly partitioned. Depending on the replicator-to-host ratio, more than one variant may thus occupy some compartments, blurring the genotype-phenotype linkage and affecting the effectiveness of natural selection. We derive selection equations for a variety of such random multiple occupancy situations, in particular considering the effect of replicator population polymorphism and internal replication dynamics. We conclude that the deleterious effect of random multiple occupancy on selection is relatively benign, and may even completely vanish in some specific cases. In addition, given that higher mean occupancy allows larger populations to be channeled through the selection process, and thus provide a better exploration of phenotypic diversity, we show that it may represent a valid strategy in both natural and technological cases. Comment: 36 pages, 7 figure
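
As a rough illustration of how the replicator-to-host ratio controls multiple occupancy, the sketch below assumes that the number of replicators per compartment is Poisson distributed with mean `lam`. The Poisson model is an assumption made for this example, since the abstract only says that replicators are randomly partitioned, and the function names are hypothetical.

```python
import math


def occupancy_pmf(k, lam):
    """Poisson probability that a compartment holds exactly k replicators."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def multiply_occupied_fraction(lam):
    """Among non-empty compartments, the fraction holding two or more replicators."""
    p_empty = occupancy_pmf(0, lam)
    p_single = occupancy_pmf(1, lam)
    return (1.0 - p_empty - p_single) / (1.0 - p_empty)
```

Under this assumption, a mean occupancy of 0.1 leaves fewer than 5% of the occupied compartments with more than one replicator, whereas a mean occupancy of 1 raises that share to roughly 42%.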

arXiv.org e-Print Archive

Bayesian modeling of recombination events in bacterial populations

Author: A Baldwin
A Baldwin
A Baldwin
A Rambaut
A Skalka
Adam Baldwin
C Fraser
Chris Dowson
CP Robert
CX Chan
D Falush
D Husmeier
D Posada
DJ Hand
E Mahenthiralingam
E Mahenthiralingam
EHL Aarts
Eshwar Mahenthiralingam
FM Cohan
J Corander
J Corander
J Corander
J Corander
J Felsenstein
J Hein
J Maynard Smith
JG Lawrence
JS Sinsheimer
Jukka Corander
JV Braun
M Arenas
M Hasegawa
MA Suchard
MJ Schervish
NC Grassly
P Marttinen
Pekka Marttinen
R Jain
RA Elton
S Sawyer
SA Sisson
VN Minin
VN Minin
William P Hanage
WJ Wiersinga
WP Hanage
X Didelot
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

Background: We consider the discovery of recombinant segments jointly with their origins within multilocus DNA sequences from bacteria representing heterogeneous populations of fairly closely related species. The currently available methods for recombination detection capable of probabilistic characterization of uncertainty have limited applicability in practice as the number of strains in a data set increases. Results: We introduce a Bayesian spatial structural model representing the continuum of origins over sites within the observed sequences, including a probabilistic characterization of uncertainty related to the origin of any particular site. To enable a statistically accurate and practically feasible approach to the analysis of large-scale data sets representing a single genus, we have developed a novel software tool (BRAT, Bayesian Recombination Tracker) implementing the model and the corresponding learning algorithm, which is capable of identifying the posterior optimal structure and estimating the marginal posterior probabilities of putative origins over the sites. Conclusion: A multitude of challenging simulation scenarios and an analysis of real data from seven housekeeping genes of 120 strains of genus Burkholderia are used to illustrate the possibilities offered by our approach. The software is freely available for download at URL http://web.abo.fi/fak/mnf//mate/jc/software/brat.html

Crossref

Online Research @ Cardiff

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Warwick Research Archives Portal Repository

On optimality of kernels for approximate Bayesian computation using sequential Monte Carlo

Author: Barnes C
Cornebise J
Filippi S
Stumpf MPH
Publication date: 01/01/2011

Approximate Bayesian computation (ABC) has gained popularity over the past few years for the analysis of complex models arising in population genetics, epidemiology and systems biology. Sequential Monte Carlo (SMC) approaches have become work-horses in ABC. Here we discuss how to construct the perturbation kernels that are required in ABC SMC approaches, in order to construct a sequence of distributions that start out from a suitably defined prior and converge towards the unknown posterior. We derive optimality criteria for different kernels, which are based on the Kullback-Leibler divergence between a distribution and the distribution of the perturbed particles. We show that for many complicated posterior distributions, locally adapted kernels tend to show the best performance. We find that the added moderate cost of adapting kernel functions is easily regained in terms of the higher acceptance rate. We demonstrate the computational efficiency gains in a range of toy examples which illustrate some of the challenges faced in real-world applications of ABC, before turning to two demanding parameter inference problems in molecular biology, which highlight the huge increases in efficiency that can be gained from the choice of optimal kernels. We conclude with a general discussion of the rational choice of perturbation kernels in ABC SMC settings.
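
The role a perturbation kernel plays in ABC SMC can be seen in a minimal one-parameter sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy simulator (sample mean of normal draws), the uniform prior bounds, the tolerance schedule, and the kernel itself, a Gaussian whose variance is twice the weighted variance of the previous population. That kernel is one commonly used choice and is not claimed to be any of the optimal kernels the abstract refers to.

```python
import math
import random


def simulate(theta, rng, n=20):
    """Toy simulator: sample mean of n draws from Normal(theta, 1)."""
    return sum(rng.gauss(theta, 1.0) for _ in range(n)) / n


def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def abc_smc(s_obs, epsilons, n_particles=200, lo=-10.0, hi=10.0, seed=0):
    """ABC SMC for one parameter with a Uniform(lo, hi) prior."""
    rng = random.Random(seed)
    prior_density = 1.0 / (hi - lo)
    particles, weights, sigma = [], [], None
    for t, eps in enumerate(epsilons):
        new_particles, new_weights = [], []
        while len(new_particles) < n_particles:
            if t == 0:
                theta = rng.uniform(lo, hi)  # first population: sample the prior
            else:
                # Resample a particle from the previous population, then
                # perturb it with the Gaussian kernel.
                parent = rng.choices(particles, weights=weights)[0]
                theta = rng.gauss(parent, sigma)
                if not lo <= theta <= hi:
                    continue  # zero prior density, reject
            if abs(simulate(theta, rng) - s_obs) > eps:
                continue  # simulated summary too far from the observation
            if t == 0:
                w = 1.0
            else:
                # Importance weight: prior density over the kernel mixture density.
                mix = sum(wj * normal_pdf(theta, pj, sigma)
                          for pj, wj in zip(particles, weights))
                w = prior_density / mix
            new_particles.append(theta)
            new_weights.append(w)
        total = sum(new_weights)
        particles = new_particles
        weights = [w / total for w in new_weights]
        # Adapt the kernel: variance is twice the weighted population variance.
        mean = sum(w * p for p, w in zip(particles, weights))
        var = sum(w * (p - mean) ** 2 for p, w in zip(particles, weights))
        sigma = math.sqrt(2.0 * var)
    return particles, weights
```

For example, `abc_smc(3.0, [2.0, 1.0, 0.5])` returns 200 weighted particles whose weighted mean should lie near the observed summary of 3.0. A wider kernel explores more but lowers the acceptance rate, which is the trade-off the abstract addresses.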

CiteSeerX

UCL Discovery