42 research outputs found

    Safe and complete contig assembly via omnitigs

    Contig assembly is the first stage that most assemblers solve when reconstructing a genome from a set of reads. Its output consists of contigs, a set of strings that are promised to appear in any genome that could have generated the reads. Since the introduction of contigs 20 years ago, assemblers have tried to obtain longer and longer contigs, but the following question was never solved: given a genome graph G (e.g. a de Bruijn graph or a string graph), what are all the strings that can be safely reported from G as contigs? In this paper we finally answer this question, and also give a polynomial time algorithm to find them. Our experiments show that these strings, which we call omnitigs, are 66% to 82% longer on average than the popular unitigs, and 29% of dbSNP locations have more neighbors in omnitigs than in unitigs. Comment: Full version of the paper in the proceedings of RECOMB 201

    Parallel effects of the inversion In(3R)Payne on body size across the North American and Australian clines in Drosophila melanogaster

    Chromosomal inversions are thought to play a major role in climatic adaptation. In D. melanogaster, the cosmopolitan inversion In(3R)Payne exhibits latitudinal clines on multiple continents. As many fitness traits show similar clines, it is tempting to hypothesize that In(3R)P underlies observed clinal patterns for some of these traits. In support of this idea, previous work in Australian populations has demonstrated that In(3R)P affects body size but not development time or cold resistance. However, similar data from other clines of this inversion are largely lacking; finding parallel effects of In(3R)P across multiple clines would considerably strengthen the case for clinal selection. Here, we have analysed the phenotypic effects of In(3R)P in populations originating from the endpoints of the latitudinal cline along the North American east coast. We measured development time, egg‐to‐adult survival, several size‐related traits (femur and tibia length, wing area and shape), chill coma recovery, oxidative stress resistance and triglyceride content in homokaryon lines carrying In(3R)P or the standard arrangement. Our central finding is that the effects of In(3R)P along the North American cline match those observed in Australia: standard arrangement lines were larger than inverted lines, but the inversion did not influence development time or cold resistance. Similarly, In(3R)P did not affect egg‐to‐adult survival, oxidative stress resistance and lipid content. In(3R)P thus seems to specifically affect size traits in populations from both continents. This parallelism strongly suggests an adaptive pattern, whereby the inversion has captured alleles associated with growth regulation and clinal selection acts on size across both continents

    Geographic Variation in Genomic Signals of Admixture Between Two Closely Related European Sepsid Fly Species.

    The extent of interspecific gene flow and its consequences for the initiation, maintenance, and breakdown of species barriers in natural systems remain poorly understood. Interspecific gene flow by hybridization may weaken adaptive divergence, but can be overcome by selection against hybrids, which may ultimately promote reinforcement. An informative step towards understanding the role of gene flow during speciation is to describe patterns of past gene flow among extant species. We investigate signals of admixture between allopatric and sympatric populations of the two closely related European dung fly species Sepsis cynipsea and S. neocynipsea (Diptera: Sepsidae). Based on microsatellite genotypes, we first inferred a baseline demographic history using Approximate Bayesian Computation. We then used genomic data from pooled DNA of natural and laboratory populations to test for past interspecific gene flow based on allelic configurations discordant with the inferred population tree (ABBA-BABA test with D-statistic). Comparing the detected signals of gene flow with the contemporary geographic relationship among interspecific pairs of populations (sympatric vs. allopatric), we made two contrasting observations. At one site in the French Cevennes, we detected an excess of past interspecific gene flow, while at two sites in Switzerland we observed lower signals of past gene flow among populations in sympatry compared to allopatric populations. These results suggest that the species boundaries between these two species depend on the past and/or present eco-geographic context in Europe, which indicates that there is no uniform link between contemporary geographic proximity and past interspecific gene flow in natural populations. Supplementary information: The online version contains supplementary material available at 10.1007/s11692-023-09612-5

    Geographic Variation in Genomic Signals of Admixture Between Two Closely Related European Sepsid Fly Species

    The extent of interspecific gene flow and its consequences for the initiation, maintenance, and breakdown of species barriers in natural systems remain poorly understood. Interspecific gene flow by hybridization may weaken adaptive divergence, but can be overcome by selection against hybrids, which may ultimately promote reinforcement. An informative step towards understanding the role of gene flow during speciation is to describe patterns of past gene flow among extant species. We investigate signals of admixture between allopatric and sympatric populations of the two closely related European dung fly species Sepsis cynipsea and S. neocynipsea (Diptera: Sepsidae). Based on microsatellite genotypes, we first inferred a baseline demographic history using Approximate Bayesian Computation. We then used genomic data from pooled DNA of natural and laboratory populations to test for past interspecific gene flow based on allelic configurations discordant with the inferred population tree (ABBA–BABA test with D-statistic). Comparing the detected signals of gene flow with the contemporary geographic relationship among interspecific pairs of populations (sympatric vs. allopatric), we made two contrasting observations. At one site in the French Cevennes, we detected an excess of past interspecific gene flow, while at two sites in Switzerland we observed lower signals of past gene flow among populations in sympatry compared to allopatric populations. These results suggest that the species boundaries between these two species depend on the past and/or present eco-geographic context in Europe, which indicates that there is no uniform link between contemporary geographic proximity and past interspecific gene flow in natural populations
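    The ABBA-BABA test named in this abstract can be illustrated with a minimal sketch. It assumes the common allele-frequency formulation of Patterson's D for a four-population tree (((P1, P2), P3), O), which is well suited to pooled data; the function name, argument layout, and toy frequencies below are hypothetical and are not taken from the paper, whose exact pipeline is not described here.

```python
from typing import Sequence


def d_statistic(p1: Sequence[float], p2: Sequence[float],
                p3: Sequence[float], p4: Sequence[float]) -> float:
    """Patterson's D from per-site derived-allele frequencies.

    p1, p2: sister populations; p3: candidate donor; p4: outgroup.
    All four sequences must list the same sites in the same order.
    (Hypothetical helper for illustration only.)
    """
    if not (len(p1) == len(p2) == len(p3) == len(p4)):
        raise ValueError("all populations must cover the same sites")

    abba = 0.0  # expected weight of sites where P2 and P3 share the derived allele
    baba = 0.0  # expected weight of sites where P1 and P3 share the derived allele
    for a, b, c, d in zip(p1, p2, p3, p4):
        abba += (1 - a) * b * c * (1 - d)
        baba += a * (1 - b) * c * (1 - d)

    total = abba + baba
    if total == 0:
        raise ValueError("no informative sites")
    # D > 0 suggests excess allele sharing between P2 and P3, D < 0 between P1 and P3.
    return (abba - baba) / total


# Toy frequencies (made up): P2 shares the derived allele with P3 more often than P1 does.
example = d_statistic([0.1, 0.0, 0.5], [0.6, 0.4, 0.5], [0.9, 1.0, 0.8], [0.0, 0.0, 0.0])
```

    In practice, significance is usually assessed by block jackknife or bootstrap over genomic windows rather than from the point estimate alone.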

    Cold adaptation drives population genomic divergence in the ecological specialist, Drosophila montana

    Funding: UK Natural Environment Research Council (Grant Number(s): NE/L501852/1, NE/P000592/1); Academy of Finland (Grant Number(s): 267244, 268214, 322980), Ella ja Georg Ehrnroothin Säätiö. Detecting signatures of ecological adaptation in comparative genomics is challenging, but analysing population samples with characterised geographic distributions, such as clinal variation, can help identify genes showing covariation with important ecological variation. Here, we analysed patterns of geographic variation in the cold-adapted species Drosophila montana across phenotypes, genotypes and environmental conditions and tested for signatures of cold adaptation in population genomic divergence. We first derived the climatic variables associated with the geographic distribution of 24 populations across two continents to trace the scale of environmental variation experienced by the species, and measured variation in the cold tolerance of the flies of six populations from different geographic contexts. We then performed pooled whole genome sequencing of these six populations, and used Bayesian methods to identify SNPs where genetic differentiation is associated with both climatic variables and the population phenotypic measurements, while controlling for effects of demography and population structure. The top candidate SNPs were enriched on the X and fourth chromosomes, and they also lay near genes implicated in other studies of cold tolerance and population divergence in this species and its close relatives. We conclude that ecological adaptation has contributed to the divergence of D. montana populations throughout the genome and in particular on the X and fourth chromosomes, which also showed highest interpopulation FST. This study demonstrates that ecological selection can drive genomic divergence at different scales, from candidate genes to chromosome-wide effects. Publisher PDF. Peer reviewed

    Drosophila Evolution over Space and Time (DEST): A New Population Genomics Resource

    Drosophila melanogaster is a leading model in population genetics and genomics, and a growing number of whole-genome data sets from natural populations of this species have been published over the last years. A major challenge is the integration of disparate data sets, often generated using different sequencing technologies and bioinformatic pipelines, which hampers our ability to address questions about the evolution of this species. Here we address these issues by developing a bioinformatics pipeline that maps pooled sequencing (Pool-Seq) reads from D. melanogaster to a hologenome consisting of fly and symbiont genomes and estimates allele frequencies using either a heuristic (PoolSNP) or a probabilistic variant caller (SNAPE-pooled). We use this pipeline to generate the largest data repository of genomic data available for D. melanogaster to date, encompassing 271 previously published and unpublished population samples from over 100 locations in >20 countries on four continents. Several of these locations have been sampled at different seasons across multiple years. This data set, which we call Drosophila Evolution over Space and Time (DEST), is coupled with sampling and environmental metadata. A web-based genome browser and web portal provide easy access to the SNP data set. We further provide guidelines on how to use Pool-Seq data for model-based demographic inference. Our aim is to provide this scalable platform as a community resource which can be easily extended via future efforts for an even more extensive cosmopolitan data set. Our resource will enable population geneticists to analyze spatiotemporal genetic patterns and evolutionary dynamics of D. melanogaster populations in unprecedented detail. We thank four reviewers and the handling editor for helpful comments on previous versions of our manuscript. We are grateful to the members of the DrosEU and DrosRTEC consortia for their long-standing support, collaboration, and for discussion.
DrosEU was funded by a Special Topic Networks (STN) grant from the European Society for Evolutionary Biology (ESEB). M.K. was supported by the Austrian Science Foundation (grant no. FWF P32275); J.G. by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (H2020-ERC-2014-CoG-647900) and by the Spanish Ministry of Science and Innovation (BFU-2011-24397); T.F. by the Swiss National Science Foundation (SNSF grants PP00P3_133641, PP00P3_165836, and 31003A_182262) and a Mercator Fellowship from the German Research Foundation (DFG), held as an EvoPAD Visiting Professor at the Institute for Evolution and Biodiversity, University of Münster; AOB by the National Institutes of Health (R35 GM119686); M.K. by Academy of Finland grant 322980; V.L. by Danish Natural Science Research Council (FNU) (grant no. 4002-00113B); FS by Deutsche Forschungsgemeinschaft (DFG) (grant no. STA1154/4-1), Project 408908608; J.P. by the Deutsche Forschungsgemeinschaft Projects 274388701 and 347368302; A.U. by FPI fellowship (BES-2012-052999); ET by Israel Science Foundation (ISF) (grant no. 1737/17); M.S.V., M.S.R. and M.J. by a grant from the Ministry of Education, Science and Technological Development of the Republic of Serbia (451-03-68/2020-14/200178); A.P., K.E. and M.T. by a grant from the Ministry of Education, Science and Technological Development of the Republic of Serbia (451-03-68/2020-14/200007); and TM by NSERC grant RGPIN-2018-05551. The authors acknowledge Research Computing at The University of Virginia for providing computational resources and technical support that have contributed to the results reported within this publication (https://rc.virginia.edu, last accessed September 6, 2021)

    Drosophila evolution over space and time (DEST): A new population genomics resource

    Drosophila melanogaster is a leading model in population genetics and genomics, and a growing number of whole-genome datasets from natural populations of this species have been published over the last years. A major challenge is the integration of disparate datasets, often generated using different sequencing technologies and bioinformatic pipelines, which hampers our ability to address questions about the evolution of this species. Here we address these issues by developing a bioinformatics pipeline that maps pooled sequencing (Pool-Seq) reads from D. melanogaster to a hologenome consisting of fly and symbiont genomes and estimates allele frequencies using either a heuristic (PoolSNP) or a probabilistic variant caller (SNAPE-pooled). We use this pipeline to generate the largest data repository of genomic data available for D. melanogaster to date, encompassing 271 previously published and unpublished population samples from over 100 locations in >20 countries on four continents. Several of these locations have been sampled at different seasons across multiple years. This dataset, which we call Drosophila Evolution over Space and Time (DEST), is coupled with sampling and environmental meta-data. A web-based genome browser and web portal provide easy access to the SNP dataset. We further provide guidelines on how to use Pool-Seq data for model-based demographic inference. Our aim is to provide this scalable platform as a community resource which can be easily extended via future efforts for an even more extensive cosmopolitan dataset. Our resource will enable population geneticists to analyze spatio-temporal genetic patterns and evolutionary dynamics of D. melanogaster populations in unprecedented detail. DrosEU is funded by a Special Topic Networks (STN) grant from the European Society for Evolutionary Biology (ESEB). MK (M. Kapun) was supported by the Austrian Science Foundation (grant no. FWF P32275); JG by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (H2020-ERC-2014-CoG-647900) and by the Spanish Ministry of Science and Innovation (BFU-2011-24397); TF by the Swiss National Science Foundation (SNSF grants PP00P3_133641, PP00P3_165836, and 31003A_182262) and a Mercator Fellowship from the German Research Foundation (DFG), held as an EvoPAD Visiting Professor at the Institute for Evolution and Biodiversity, University of Münster; AOB by the National Institutes of Health (R35 GM119686); MK (M. Kankare) by Academy of Finland grant 322980; VL by Danish Natural Science Research Council (FNU) grant 4002-00113B; FS by Deutsche Forschungsgemeinschaft (DFG) grant STA1154/4-1, Project 408908608; JP by the Deutsche Forschungsgemeinschaft Projects 274388701 and 347368302; AU by FPI fellowship (BES-2012-052999); ET by Israel Science Foundation (ISF) grant 1737/17; MSV, MSR and MJ by a grant from the Ministry of Education, Science and Technological Development of the Republic of Serbia (451-03-68/2020-14/200178); AP, KE and MT by a grant from the Ministry of Education, Science and Technological Development of the Republic of Serbia (451-03-68/2020-14/200007); and TM by NSERC grant RGPIN-2018-05551. Peer reviewed

    Isolation of a natural DNA virus of Drosophila melanogaster, and characterisation of host resistance and immune responses

    Drosophila melanogaster has played a key role in our understanding of invertebrate immunity. However, both functional and evolutionary studies of host-virus interaction in Drosophila have been limited by a dearth of native virus isolates. In particular, despite a long history of virus research, DNA viruses of D. melanogaster have only recently been described, and none have been available for experimental study. Here we report the isolation and comprehensive characterisation of Kallithea virus, a large double-stranded DNA virus, and the first DNA virus to have been reported from wild populations of D. melanogaster. We find that Kallithea virus infection is costly for adult flies, reaching high titres in both sexes and disproportionately reducing survival in males, and movement and late fecundity in females. Using the Drosophila Genetic Reference Panel, we quantify host genetic variance for virus-induced mortality and viral titre and identify candidate host genes that may underlie this variation, including Cdc42-interacting protein 4. Using full transcriptome sequencing of infected males and females, we examine the transcriptional response of flies to Kallithea virus infection and describe differential regulation of virus-responsive genes. This work establishes Kallithea virus as a new tractable model to study the natural interaction between D. melanogaster and DNA viruses, and we hope it will serve as a basis for future studies of immune responses to DNA viruses in insects.

    Combining experimental evolution with next-generation sequencing: a powerful tool to study adaptation from standing genetic variation

    Evolve and resequence (E&R) is a new approach to investigate the genomic responses to selection during experimental evolution. By using whole genome sequencing of pools of individuals (Pool-Seq), this method can identify selected variants in controlled and replicable experimental settings. Reviewing the current state of the field, we show that E&R can be powerful enough to identify causative genes and possibly even single-nucleotide polymorphisms. We also discuss how the experimental design and the complexity of the trait could result in a large number of false positive candidates. We suggest experimental and analytical strategies to maximize the power of E&R to uncover the genotype–phenotype link and serve as an important research tool for a broad range of evolutionary questions. C Schlötterer, R Kofler, E Versace, R Tobler and SU Fransse