12 research outputs found

    Asian wild rice is a hybrid swarm with extensive gene flow and feralization from domesticated rice

    The domestication history of rice remains controversial, with multiple studies reaching different conclusions regarding its origin(s). These studies have generally assumed that populations of living wild rice, O. rufipogon, are descendants of the ancestral population that gave rise to domesticated rice, but relatively little attention has been paid to the origins and history of wild rice itself. Here, we investigate the genetic ancestry of wild rice by analyzing a diverse panel of rice genomes consisting of 203 domesticated and 435 wild rice accessions. We show that most modern wild rice is heavily admixed with domesticated rice through both pollen- and seed-mediated gene flow. In fact, much presumed wild rice may simply represent different stages of feralized domesticated rice. In line with this hypothesis, many presumed wild rice varieties show remnants of the effects of selective sweeps in previously identified domestication genes, as well as evidence of recent selection in flowering genes possibly associated with the feralization process. Furthermore, there is a distinct geographical pattern of gene flow from aus, indica, and japonica varieties into colocated wild rice. We also show that admixture from aus and indica is more recent than gene flow from japonica, possibly consistent with an earlier spread of japonica varieties. We argue that wild rice populations should be considered a hybrid swarm, connected to domesticated rice by continuous and extensive gene flow

    Genomic population structure of freshwater-resident and anadromous ide (Leuciscus idus) in north-western Europe

    Climate change experts largely agree that future climate change and associated rises in oceanic water levels over the upcoming decades will affect marine salinity levels. The subsequent effects on fish communities in estuarine ecosystems, however, are less clear. One species that is likely to become increasingly affected by changes in salinity is the ide (Leuciscus idus). The ide is a stenohaline freshwater fish that primarily inhabits rivers, with frequent anadromous behavior when sea salinity does not exceed 15‰. Unlike most other anadromous Baltic Sea fish species, the ide has yet to be subjected to large‐scale stocking programs, and thus provides an excellent opportunity for studying the natural population structure across the current salinity gradient in the Danish Belts. To explore this, we used Genotyping‐by‐Sequencing to determine the genomic population structure of both freshwater resident and anadromous ide populations in the western Baltic Sea region, and relate the results to the current salinity gradient and the demographic history of ide in the region. The sample sites separate into four clusters, with all anadromous populations in one cluster and the freshwater resident populations in the remaining three. Results demonstrate a high level of differentiation between sites hosting freshwater resident populations, but little differentiation among anadromous populations. Thus, ide exhibit the genomic population structure of both a typical freshwater species and a typical anadromous species. In addition to providing a first insight into the population structure of north‐western European ide, our data also (1) provide indications of a single illegal introduction by man; (2) suggest limited genetic effects of heavy pollution in the past; and (3) indicate possible historical anadromous behavior in a now isolated freshwater population.

    Genomic characterization of a South American Phytophthora hybrid mandates reassessment of the geographic origins of Phytophthora infestans

    As the oomycete pathogen causing potato late blight disease, Phytophthora infestans triggered the famous 19th-century Irish potato famine and remains the leading cause of global commercial potato crop destruction. But the geographic origin of the genotype that caused this devastating initial outbreak remains disputed, as does the New World center of origin of the species itself. Both Mexico and South America have been proposed, generating considerable controversy. Here, we readdress the pathogen’s origins using a genomic data set encompassing 71 globally sourced modern and historical samples of P. infestans and the hybrid species P. andina, a close relative known only from the Andean highlands. Previous studies have suggested that the nuclear DNA lineage behind the initial outbreaks in Europe in 1845 is now extinct. Analysis of P. andina’s phased haplotypes recovered eight haploid genome sequences, four of which represent a previously unknown basal lineage of P. infestans closely related to the famine-era lineage. Our analyses further reveal that clonal lineages of both P. andina and historical P. infestans diverged earlier than modern Mexican lineages, casting doubt on recent claims of a Mexican center of origin. Finally, we use haplotype phasing to demonstrate that basal branches of the clade comprising Mexican samples are occupied by clonal isolates collected from wild Solanum hosts, suggesting that modern Mexican P. infestans diversified on Solanum tuberosum after a host jump from a wild species and that the origins of P. infestans are more complex than was previously thought

    Estimating IBD tracts from low coverage NGS data

    Motivation: The amount of IBD in an individual depends on the relatedness of the individual's parents. However, it can also provide information regarding mating system, past history and effective size of the population from which the individual has been sampled. Results: Here, we present a new method for estimating inbreeding IBD tracts from low coverage NGS data. Contrary to other methods that use genotype data, the one presented here uses genotype likelihoods to take the uncertainty of the data into account. We benchmark it under a wide range of biologically relevant conditions and show that the new method provides a marked increase in accuracy even at low coverage. Availability and implementation: The methods presented in this work were implemented in C/C++ and are freely available for non-commercial use from https://github.com/fgvieira/ngsF-HMM. Contact: [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.

    Estimating inbreeding coefficients from NGS data: impact on genotype calling and allele frequency estimation

    Most methods for next-generation sequencing (NGS) data analyses incorporate information regarding allele frequencies using the assumption of Hardy–Weinberg equilibrium (HWE) as a prior. However, many organisms including those that are domesticated, partially selfing, or with asexual life cycles show strong deviations from HWE. For such species, and especially for low-coverage data, it is necessary to obtain estimates of inbreeding coefficients (F) for each individual before calling genotypes. Here, we present two methods for estimating inbreeding coefficients from NGS data based on an expectation-maximization (EM) algorithm. We assess the impact of taking inbreeding into account when calling genotypes or estimating the site frequency spectrum (SFS), and demonstrate a marked increase in accuracy on low-coverage highly inbred samples. We demonstrate the applicability and efficacy of these methods in both simulated and real data sets.
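
    To illustrate why the inbreeding coefficient matters for genotype calling, the sketch below uses the textbook genotype prior under inbreeding, P(Aa) = 2pq(1 - F), and combines it with genotype likelihoods by Bayes' rule. This is not the EM estimator described in the abstract; the function names and the example likelihoods are hypothetical, and F and the allele frequency p are assumed to be known.

```python
def genotype_priors(p, F):
    """Prior probabilities of carrying 0, 1 or 2 copies of an allele with
    frequency p, for an individual with inbreeding coefficient F.
    With F = 0 this reduces to Hardy-Weinberg proportions."""
    q = 1.0 - p
    return [
        (1 - F) * q * q + F * q,  # 0 copies (homozygous for the other allele)
        (1 - F) * 2 * p * q,      # 1 copy (heterozygous)
        (1 - F) * p * p + F * p,  # 2 copies (homozygous)
    ]


def genotype_posteriors(likelihoods, p, F):
    """Posterior genotype probabilities: likelihood times prior, normalised."""
    priors = genotype_priors(p, F)
    weighted = [lik * prior for lik, prior in zip(likelihoods, priors)]
    total = sum(weighted)
    return [w / total for w in weighted]


def call_genotype(likelihoods, p, F):
    """Return the genotype (0, 1 or 2 allele copies) with the highest posterior."""
    posteriors = genotype_posteriors(likelihoods, p, F)
    return max(range(3), key=lambda g: posteriors[g])


# Hypothetical, weakly informative likelihoods, as expected at low coverage.
likelihoods = [0.2, 0.3, 0.5]
hwe_call = call_genotype(likelihoods, p=0.5, F=0.0)     # HWE prior favours the heterozygote
inbred_call = call_genotype(likelihoods, p=0.5, F=0.9)  # strong inbreeding favours a homozygote
```

    With the same data, the call changes from heterozygous under the HWE prior to homozygous once strong inbreeding is assumed, which is the kind of effect the abstract refers to for low-coverage, highly inbred samples.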

    ngsTools: methods for population genetics analyses from next-generation sequencing data

    Next-generation sequencing technologies produce short reads that are either de novo assembled or mapped to a reference genome. Genotypes and/or single-nucleotide polymorphisms are then determined from the read composition at each site, which become the basis for many downstream analyses. However, for low sequencing depths, e.g. , there is considerable statistical uncertainty in the assignment of genotypes because of random sampling of homologous base pairs in heterozygotes and sequencing or alignment errors. Recently, several probabilistic methods have been proposed to account for this uncertainty and make accurate inferences from low quality and/or coverage sequencing data. We present ngsTools, a collection of programs to perform population genetics analyses from next-generation sequencing data. The methods implemented in these programs do not rely on single-nucleotide polymorphism or genotype calling and are particularly suitable for low sequencing depth data
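
    The uncertainty at low depth can be made concrete with a simple genotype likelihood calculation. The sketch below assumes a basic diploid model in which each read samples one of the two alleles at random and a sequencing error is equally likely to produce any of the three other bases. It is an illustration only, not the model implemented in ngsTools, and the function name and example reads are hypothetical.

```python
def genotype_likelihoods(bases, errors, ref, alt):
    """Likelihood of the observed read bases for genotypes carrying
    0, 1 or 2 copies of the alt allele.

    bases  : observed base at this site in each read
    errors : per-read probability that the base call is wrong
    """
    likelihoods = []
    for copies in (0, 1, 2):
        lik = 1.0
        for base, err in zip(bases, errors):
            # Probability of seeing this base if the read came from each allele.
            p_from_ref = 1 - err if base == ref else err / 3
            p_from_alt = 1 - err if base == alt else err / 3
            # The read samples the alt allele with probability copies / 2.
            lik *= (1 - copies / 2) * p_from_ref + (copies / 2) * p_from_alt
        likelihoods.append(lik)
    return likelihoods


# Two reads, both showing the reference base, each with a 1% error rate.
liks = genotype_likelihoods(["A", "A"], [0.01, 0.01], ref="A", alt="G")
het_vs_hom_ref = liks[1] / liks[0]
```

    With only two reads, the heterozygous genotype remains roughly a quarter as likely as the homozygous reference genotype, because both reads may have sampled the same chromosome. This is why methods that work on likelihoods rather than called genotypes are attractive at low depth.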

    The power of inbreeding: NGS-based GWAS of rice reveals convergent evolution during rice domestication

    Low-coverage whole-genome sequencing is an effective strategy for genome-wide association studies in humans, due to the availability of large reference panels for genotype imputation. However, it is unclear whether this strategy can be utilized in other species without reference panels. Using simulations, we show that this approach is even more relevant in inbred species such as rice (Oryza sativa L.), which are effectively haploid, allowing easy haplotype construction and imputation-based genotype calling, even without the availability of large reference panels. We sequenced 203 rice varieties with well-characterized phenotypes from the United States Department of Agriculture Rice Mini-Core Collection at an average depth of 1.5× and used the data for mapping three traits. For the first two traits, amylose content and seed length, our approach leads to direct identification of the previously identified causal SNPs in the major-effect loci. For the third trait, pericarp color, an important trait that underwent selection during domestication, we identified a new major-effect locus. Although known loci can explain color variation in the varieties of two main subspecies of Asian domesticated rice, japonica and indica, the new locus identified is unique to another domesticated rice subgroup, aus, and together with existing loci, can fully explain the major variation in pericarp color in aus. Our discovery of a unique genetic basis of white pericarp in aus provides an example of convergent evolution during rice domestication and suggests that aus may have a domestication history independent of japonica and indica.

    The origin and evolution of maize in the Southwestern United States

    The origin of maize (Zea mays mays) in the US Southwest remains contentious, with conflicting archaeological data supporting either coastal [1-4] or highland [5,6] routes of diffusion of maize into the United States. Furthermore, the genetics of adaptation to the new environmental and cultural context of the Southwest is largely uncharacterized [7]. To address these issues, we compared nuclear DNA from 32 archaeological maize samples spanning 6,000 years of evolution to modern landraces. We found that the initial diffusion of maize into the Southwest about 4,000 years ago is likely to have occurred along a highland route, followed by gene flow from a lowland coastal maize beginning at least 2,000 years ago. Our population genetic analysis also enabled us to differentiate selection during domestication for adaptation to the climatic and cultural environment of the Southwest, identifying adaptation loci relevant to drought tolerance and sugar content.