19 research outputs found

    MSMC and MSMC2: the multiple sequentially markovian coalescent

    The Multiple Sequentially Markovian Coalescent (MSMC) is a population genetic method and software for inferring demographic history and population structure through time from genome sequences. Here we describe the main program MSMC and its successor MSMC2. We go through all the necessary steps of processing genomic data, from BAM files all the way to generating plots of inferred population size and separation histories. Some background on the methodology itself is provided, as well as Bash scripts and Python source code to run the necessary programs. The reader is also referred to community resources such as a mailing list and GitHub repositories for further advice.

    Optimality regions and fluctuations for Bernoulli last passage models

    We study the sequence alignment problem and its independent version, the discrete Hammersley process with an exploration penalty. We obtain rigorous upper bounds for the number of optimality regions in both models near the soft edge. At zero penalty the independent model becomes exactly solvable, and we identify cases for which the law of the last passage time converges to a Tracy-Widom law.

    Papuan mitochondrial genomes and the settlement of Sahul

    New Guineans represent one of the oldest locally continuous populations outside Africa, harboring among the greatest linguistic and genetic diversity on the planet. Archeological and genetic evidence suggest that their ancestors reached Sahul (present day New Guinea and Australia) by at least 55,000 years ago (kya). However, little is known about this early settlement phase or about dispersal and population structuring over the subsequent period. Here we report 379 complete Papuan mitochondrial genomes from across Papua New Guinea, which allow us to reconstruct the phylogenetic and phylogeographic history of northern Sahul. Our results support the arrival of two groups of settlers in Sahul within the same broad time window (50–65 kya), each carrying a different set of maternal lineages and settling Northern and Southern Sahul separately. Strong geographic structure in northern Sahul remains visible today, indicating limited dispersal over time despite major climatic, cultural, and historical changes. However, following a period of isolation lasting nearly 20 ky after initial settlement, environmental changes postdating the Last Glacial Maximum stimulated diversification of mtDNA lineages and greater interactions within and beyond Northern Sahul, to Southern Sahul, Wallacea and beyond. Later, in the Holocene, populations from New Guinea, in contrast to those of Australia, participated in early interactions with Asian populations arriving from Island Southeast Asia and continuing into Oceania.

    Approximate Bayesian computation with deep learning supports a third archaic introgression in Asia and Oceania

    Since anatomically modern humans dispersed Out of Africa, the evolutionary history of Eurasian populations has been marked by introgressions from presently extinct hominins. Some of these introgressions have been identified using sequenced ancient genomes (Neanderthal and Denisova). Other introgressions have been proposed for still unidentified groups using the genetic diversity present in current human populations. We built a demographic model based on deep learning in an Approximate Bayesian Computation framework to infer the evolutionary history of Eurasian populations, including past introgression events in Out of Africa populations, fitting the current genetic evidence. In addition to the reported Neanderthal and Denisovan introgressions, our results support a third introgression in all Asian and Oceanian populations from an archaic population. This population is either related to the Neanderthal-Denisova clade or diverged early from the Denisova lineage. We propose the use of deep learning methods for clarifying situations with high complexity in evolutionary genomics. M.M. was supported by the European Union through the European Regional Development Fund (Project No. 2014-2020.4.01.16-0030). For J.B., this study was made possible by grant BFU2016-77961-P (AEI/FEDER, UE) awarded by the Agencia Estatal de Investigación (MINECO, Spain) and by the support of Secretaria d’Universitats i Recerca del Departament d’Economia i Coneixement de la Generalitat de Catalunya (GRC 2017 SGR 702). Part of the “Unidad de Excelencia María de Maeztu”, funded by the MINECO (ref: MDM-2014-0370). O.L. was supported by a Ramón y Cajal grant from the Spanish Ministerio de Economia y Competitividad (MINECO) with reference RYC-2013-14797, a BFU2015-68759-P (MINECO/FEDER) grant and the support of Secretaria d’Universitats i Recerca del Departament d’Economia i Coneixement de la Generalitat de Catalunya (GRC 2017 SGR 937).