Coalescent simulation has become an indispensable tool in population genetics, and many complex evolutionary scenarios have been incorporated into the basic algorithm. Despite many years of intense interest in spatial structure, however, there are no available methods to simulate the ancestry of a sample of genes that occupy a spatial continuum. This is mainly due to the severe technical problems encountered by the classical model of isolation by distance. A recently introduced model solves these technical problems and provides a solid theoretical basis for the study of populations evolving in continuous space. We present a detailed algorithm to simulate the coalescent process in this model, and provide an efficient implementation of a generalised version of this algorithm as a freely available Python module.

IST Austria: PubRep (Institute of Science and Technology)

Genetic relatedness through the lens of tree sequences

Author: Gorjanc Gregor
Kelleher Jerome
Lehmann Brieuc
Ralph Peter L.
Tsambos Georgia
Publication date: 28/03/2022

Edinburgh Research Explorer

A general and efficient representation of ancestral recombination graphs

Author: Gorjanc Gregor
Ignatieva Anastasia
Kelleher Jerome
Koskela Jere
Wohns Anthony W
Wong Yan
Publication date: 03/11/2023

Edinburgh Research Explorer

Measuring the degree of starshape in genealogies - summary statistics and demographic inference

Author: Kelleher Jerome
Lohse Konrad
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/08/2009

Crossref

Edinburgh Research Explorer

Bayesian inference of ancestral recombination graphs

Author: Balding David
Chan Yao-ban
Kelleher Jerome
Koskela Jere
Mahmoudi Ali
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2022

We present a novel algorithm, implemented in the software ARGinfer, for probabilistic inference of the Ancestral Recombination Graph under the Coalescent with Recombination. Our Markov Chain Monte Carlo algorithm takes advantage of the Succinct Tree Sequence data structure that has allowed great advances in simulation and point estimation, but not yet probabilistic inference. Unlike previous methods, which employ the Sequentially Markov Coalescent approximation, ARGinfer uses the Coalescent with Recombination, allowing more accurate inference of key evolutionary parameters. We show using simulations that ARGinfer can accurately estimate many properties of the evolutionary history of the sample, including the topology and branch lengths of the genealogical tree at each sequence site, and the times and locations of mutation and recombination events. ARGinfer approximates posterior probability distributions for these and other quantities, providing interpretable assessments of uncertainty that we show to be well calibrated. ARGinfer is currently limited to tens of DNA sequences of several hundreds of kilobases, but has scope for further computational improvements to increase its applicability.

PubMed Central

UCL Discovery

Warwick Research Archives Portal Repository

Oxford University Research Archive

A community-maintained standard library of population genetic models

Author: Adrion Jeffrey R.
Baumdicker Franz
Carlson Jedidiah
Cartwright Reed A.
Cole Christopher B.
Dukler Noah
Durvasula Arun
Galloway Jared G.
Gladstein Ariella L.
Gower Graham
Gravel Simon
Gronau Ilan
Gutenkunst Ryan N.
Kelleher Jerome
Kern Andrew D.
Kim Bernard Y.
Kyriazis Christopher C.
Lohmueller Kirk E.
McKenzie Patrick
Messer Philipp W.
Noskova Ekaterina
Ortega-Del Vecchyo Diego
Racimo Fernando
Ragsdale Aaron P.
Ralph Peter L.
Schrider Daniel R.
Siepel Adam
Struck Travis J.
Tsambos Georgia
Publication venue: 'eLife Sciences Publications, Ltd'
Publication date: 01/01/2020

The explosion in population genomic data demands ever more complex modes of analysis, and increasingly, these analyses depend on sophisticated simulations. Recent advances in population genetic simulation have made it possible to simulate large and complex models, but specifying such models for a particular simulation engine remains a difficult and error-prone task. Computational genetics researchers currently re-implement simulation models independently, leading to inconsistency and duplication of effort. This situation presents a major barrier to empirical researchers seeking to use simulations for power analyses of upcoming studies or sanity checks on existing genomic data. Population genetics, as a field, also lacks standard benchmarks by which new tools for inference might be measured. Here, we describe a new resource, stdpopsim, that attempts to rectify this situation. Stdpopsim is a community-driven open source project, which provides easy access to a growing catalog of published simulation models from a range of organisms and supports multiple simulation engine backends. This resource is available as a well-documented Python library with a simple command-line interface. We share some examples demonstrating how stdpopsim can be used to systematically compare demographic inference methods, and we encourage a broader community of developers to contribute to this growing resource.

Copenhagen University Research Information System

The University of Arizona

Efficient ancestry and mutation simulation with msprime 1.0

Author: Baumdicker Franz
Bisschop Gertjan
Eldon Bjarki
Ellerman Castedo E.
Galloway Jared G.
Gladstein Ariella L.
Goldstein Daniel
Gorjanc Gregor
Gower Graham
Gravel Simon
Guo Bing
Jeffery Ben
Kelleher Jerome
Kern Andrew D.
Koskela Jere
Kretzschmar Warren W.
Lohse Konrad
Matschiner Michael
Nelson Dominic
Pope Nathaniel S.
Quinto-Cortés Consuelo D.
Ragsdale Aaron P.
Ralph Peter L.
Rodrigues Murillo F.
Saunack Kumar
Sellinger Thibaut
Thornton Kevin
Tsambos Georgia
van Kemenade Hugo
Wohns Anthony W.
Wong H. Yan
Zhu Sha
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/09/2021

Stochastic simulation is a key tool in population genetics, since the models involved are often analytically intractable and simulation is usually the only way of obtaining ground-truth data to evaluate inferences. Because of this, a large number of specialized simulation programs have been developed, each filling a particular niche, but with largely overlapping functionality and a substantial duplication of effort. Here, we introduce msprime version 1.0, which efficiently implements ancestry and mutation simulations based on the succinct tree sequence data structure and the tskit library. We summarize msprime’s many features, and show that its performance is excellent, often many times faster and more memory efficient than specialized alternatives. These high-performance features have been thoroughly tested and validated, and built using a collaborative, open source development model, which reduces duplication of effort and promotes software quality via community engagement.

Copenhagen University Research Information System

PubMed Central

Edinburgh Research Explorer

eScholarship - University of California

Warwick Research Archives Portal Repository

htsget: a protocol for securely streaming genomic data

Author: Alback C.H.
Birney Ewan
Davies Robert
Glazer David
Gonzalez Cristina Y.
Gourtovaia Marina
Jackson David K.
Keane Thomas M.
Kelleher Jerome
Kemp Aaron
Lin Mike
Marshall John
Nowak Andrew
Senf Alexander
Tovar-Corona Jaime M.
Vikhorev Alexander
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2019

Summary: Standardized interfaces for efficiently accessing high-throughput sequencing data are a fundamental requirement for large-scale genomic data sharing. We have developed htsget, a protocol for secure, efficient and reliable access to sequencing read and variation data. We demonstrate four independent client and server implementations, and the results of a comprehensive interoperability demonstration.
Availability and implementation: http://samtools.github.io/hts-specs/htsget.html
Supplementary information: Supplementary data are available at Bioinformatics online.

Enlighten

Efficient ancestry and mutation simulation with msprime 1.0

Author: Baumdicker Franz
Bisschop Gertjan
Eldon Bjarki
Ellerman Castedo E.
Galloway Jared G.
Gladstein Ariella L.
Goldstein Daniel
Gorjanc Gregor
Gower Graham
Gravel Simon
Guo Bing
Jeffery Ben
Kelleher Jerome
Kern Andrew D.
Koskela Jere
Kretzschmar Warren W.
Lohse Konrad
Matschiner Michael
Nelson Dominic
Pope Nathaniel S.
Quinto-Cortés Consuelo D.
Ragsdale Aaron P.
Ralph Peter L.
Rodrigues Murillo F.
Saunack Kumar
Sellinger Thibaut
Thornton Kevin
Tsambos Georgia
van Kemenade Hugo
Wohns Anthony W.
Wong H. Yan
Zhu Sha
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 01/09/2021

Edinburgh Research Explorer