5,473 research outputs found

    Population genomics of domestic and wild yeasts

    The natural genetics of an organism is determined by the distribution of sequences of its genome. Here we present one- to four-fold, with some deeper, coverage of the genome sequences of over seventy isolates of the domesticated baker's yeast, _Saccharomyces cerevisiae_, and its closest relative, the wild _S. paradoxus_, which has never been associated with human activity. These were collected from numerous geographic locations and sources (including wild, clinical, baking, wine, laboratory and food spoilage). These sequences provide an unprecedented view of the population structure, natural (and artificial) selection and genome evolution in these species. Variation in gene content, SNPs, indels, copy numbers and transposable elements provides insights into the evolution of different lineages. Phenotypic variation broadly correlates with global genome-wide phylogenetic relationships; however, there is no correlation with source. _S. paradoxus_ populations are well delineated along geographic boundaries, while the variation among worldwide _S. cerevisiae_ isolates shows less differentiation and is comparable to a single _S. paradoxus_ population. Rather than one or two domestication events leading to the extant baker's yeasts, the population structure of _S. cerevisiae_ shows a few well-defined, geographically isolated lineages and many different mosaics of these lineages, supporting the notion that human influence provided the opportunity for outbreeding and production of new combinations of pre-existing variation.

    Strong signature of natural selection within an FHIT intron implicated in prostate cancer risk

    Previously, a candidate gene linkage approach on brother pairs affected with prostate cancer identified a locus of prostate cancer susceptibility at D3S1234 within the fragile histidine triad gene (FHIT), a tumor suppressor that induces apoptosis. Subsequent association tests on 16 SNPs spanning approximately 381 kb surrounding D3S1234 in Americans of European descent revealed significant evidence of association for a single SNP within intron 5 of FHIT. In the current study, resequencing and genotyping within a 28.5 kb region surrounding this SNP further delineated the association with prostate cancer risk to a 15 kb region. Multiple SNPs in sequences under evolutionary constraint within intron 5 of FHIT defined several related haplotypes with an increased risk of prostate cancer in European-Americans. Strong associations were detected for a risk haplotype defined by SNPs 138543, 142413, and 152494 in all cases (Pearson's χ² = 12.34, df = 1, P = 0.00045) and for the homozygous risk haplotype defined by SNPs 144716, 142413, and 148444 in cases that shared 2 alleles identical by descent with their affected brothers (Pearson's χ² = 11.50, df = 1, P = 0.00070). In addition to highly conserved sequences encompassing SNPs 148444 and 152413, population studies revealed strong signatures of natural selection for a 1 kb window covering the SNP 144716 in two human populations, the European American (π = 0.0072, Tajima's D = 3.31, 14 SNPs) and the Japanese (π = 0.0049, Fay & Wu's H = 8.05, 14 SNPs), as well as in chimpanzees (Fay & Wu's H = 8.62, 12 SNPs). These results strongly support the involvement of the FHIT intronic region in an increased risk of prostate cancer. © 2008 Ding et al.
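
    The two P values quoted in this abstract can be checked against the reported Pearson statistics. The following is a minimal sketch, not the authors' analysis: it assumes only that each statistic follows a chi-square distribution with 1 degree of freedom, and the helper name `chi2_sf_df1` is made up for illustration.

```python
import math


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    # With df = 1 the statistic is a squared standard normal variable, so
    # P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


# Statistics as quoted in the abstract (hypothetical helper, standard library only).
for stat in (12.34, 11.50):
    print(f"chi2 = {stat:.2f}, df = 1 -> P ~ {chi2_sf_df1(stat):.5f}")
```

    Both results should land close to the reported 0.00045 and 0.00070, up to rounding.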

    Genome-wide tests for introgression between cactophilic Drosophila implicate a role of inversions during speciation

    K.L. was funded by a junior research fellowship from the National Environmental Research Council, UK (NE/I020288/1, NBAF659). Models of speciation-with-gene-flow have shown that the reduction in recombination between alternative chromosome arrangements can facilitate the fixation of locally adaptive genes in the face of gene flow and contribute to speciation. However, it has proven frustratingly difficult to show empirically that inversions have reduced gene flow and arose during or shortly after the onset of species divergence rather than represent ancestral polymorphisms. Here, we present an analysis of whole genome data from a pair of cactophilic fruit flies, Drosophila mojavensis and D. arizonae, which are reproductively isolated in the wild and differ by several large inversions on three chromosomes. We found an increase in divergence at rearranged compared to colinear chromosomes. Using the density of divergent sites in short sequence blocks, we fit a series of explicit models of species divergence in which gene flow is restricted to an initial period after divergence and may differ between colinear and rearranged parts of the genome. These analyses show that D. mojavensis and D. arizonae have experienced postdivergence gene flow that ceased around 270 KY ago and was significantly reduced in chromosomes with fixed inversions. Moreover, we show that these inversions most likely originated around the time of species divergence, which is compatible with theoretical models that posit a role of inversions in speciation with gene flow. Publisher PDF. Peer reviewed.

    High levels of nucleotide diversity and fast decline of linkage disequilibrium in rye (Secale cereale L.) genes involved in frost response

    Background: Rye (_Secale cereale_ L.) is the most frost tolerant cereal species. As an outcrossing species, rye exhibits high levels of intraspecific diversity, which makes it well-suited for allele mining in genes involved in the frost responsive network. For investigating genetic diversity and the extent of linkage disequilibrium (LD) we analyzed eleven candidate genes and 37 microsatellite markers in 201 lines from five Eastern and Middle European rye populations. Results: A total of 147 single nucleotide polymorphisms (SNPs) and nine insertion-deletion polymorphisms were found within 7,639 bp of DNA sequence from eleven candidate genes, resulting in an average SNP frequency of 1 SNP/52 bp. Nucleotide and haplotype diversity of candidate genes were high with average values π = 5.6 × 10⁻³ and Hd = 0.59, respectively. According to an analysis of molecular variance (AMOVA), most of the genetic variation was found between individuals within populations. Haplotype frequencies varied markedly between the candidate genes. _ScCbf14_, _ScVrn1_, and _ScDhn1_ were dominated by a single haplotype, while the other 8 genes (_ScCbf2_, _ScCbf6_, _ScCbf9b_, _ScCbf11_, _ScCbf12_, _ScCbf15_, _ScIce2_, and _ScDhn3_) had a more balanced haplotype frequency distribution. Intra-genic LD decayed rapidly, within approximately 520 bp on average. Genome-wide LD based on microsatellites was low. Conclusions: The Middle European population did not differ substantially from the four Eastern European populations in terms of haplotype frequencies or in the level of nucleotide diversity. The low LD in rye compared to self-pollinating species promises a high resolution in genome-wide association mapping. SNPs discovered in the promoters or coding regions that lead to non-synonymous substitutions are suitable candidates for association mapping.
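
    The SNP frequency quoted in the Results follows directly from the two counts given there. A small sketch of the arithmetic, using only the figures in the abstract (the variable names are illustrative):

```python
# Figures quoted in the abstract above.
n_snps = 147        # SNPs found across the eleven candidate genes
sequence_bp = 7639  # total length of sequence surveyed, in base pairs

# Average spacing between SNPs, and the same quantity expressed per kilobase.
bp_per_snp = sequence_bp / n_snps
snps_per_kb = 1000 * n_snps / sequence_bp

print(f"about 1 SNP per {bp_per_snp:.0f} bp ({snps_per_kb:.1f} SNPs per kb)")
```

    The spacing rounds to the reported 1 SNP per 52 bp.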

    Local-scale patterns of genetic variability, outcrossing, and spatial structure in natural stands of Arabidopsis thaliana

    As Arabidopsis thaliana is increasingly employed in evolutionary and ecological studies, it is essential to understand patterns of natural genetic variation and the forces that shape them. Previous work focusing mostly on global and regional scales has demonstrated the importance of historical events such as long-distance migration and colonization. Far less is known about the role of contemporary factors or environmental heterogeneity in generating diversity patterns at local scales. We sampled 1,005 individuals from 77 closely spaced stands in diverse settings around Tübingen, Germany. A set of 436 SNP markers was used to characterize genome-wide patterns of relatedness and recombination. Neighboring genotypes often shared mosaic blocks of alternating marker identity and divergence. We detected recent outcrossing as well as stretches of residual heterozygosity in largely homozygous recombinants. As has been observed for several other selfing species, there was considerable heterogeneity among sites in diversity and outcrossing, with rural stands exhibiting greater diversity and heterozygosity than urban stands. Fine-scale spatial structure was evident as well. Within stands, spatial structure correlated negatively with observed heterozygosity, suggesting that the high homozygosity of natural A. thaliana may be partially attributable to nearest-neighbor mating of related individuals. The large number of markers and extensive local sampling employed here afforded unusual power to characterize local genetic patterns. Contemporary processes such as ongoing outcrossing play an important role in determining distribution of genetic diversity at this scale. Local "outcrossing hotspots" appear to reshuffle genetic information at surprising rates, while other stands contribute comparatively little. Our findings have important implications for sampling and interpreting diversity among A. thaliana accessions. Financial support came from an NIH Ruth Kirschstein NRSA Postdoctoral Fellowship (KB), a Human Frontiers Science Program Postdoctoral Fellowship (RAL), grants DFG ERA-PG ARelatives and FP6 IP AGRON-OMICS (contract LSHG-CT-2006-037704), from a Gottfried Wilhelm Leibniz Award of the DFG, and the Max Planck Society (DW).

    Single-nucleotide polymorphism, linkage disequilibrium and geographic structure in the malaria parasite Plasmodium vivax: prospects for genome-wide association studies

    Background: The ideal malaria parasite populations for initial mapping of genomic regions contributing to phenotypes such as drug resistance and virulence, through genome-wide association studies, are those with high genetic diversity, allowing for numerous informative markers, and rare meiotic recombination, allowing for strong linkage disequilibrium (LD) between markers and phenotype-determining loci. However, levels of genetic diversity and LD in field populations of the major human malaria parasite _P. vivax_ remain little characterized. Results: We examined single-nucleotide polymorphisms (SNPs) and LD patterns across a 100-kb chromosome segment of _P. vivax_ in 238 field isolates from areas of low to moderate malaria endemicity in South America and Asia, where LD tends to be more extensive than in holoendemic populations, and in two monkey-adapted strains (Salvador-I, from El Salvador, and Belem, from Brazil). We found varying levels of SNP diversity and LD across populations, with the highest diversity and strongest LD in the area of lowest malaria transmission. We found several clusters of contiguous markers with rare meiotic recombination and characterized a relatively conserved haplotype structure among populations, suggesting the existence of recombination hotspots in the genome region analyzed. Both silent and nonsynonymous SNPs revealed substantial between-population differentiation, which accounted for ~40% of the overall genetic diversity observed. Although parasites clustered according to their continental origin, we found evidence for substructure within the Brazilian population of _P. vivax_. We also explored between-population differentiation patterns revealed by loci putatively affected by natural selection and found marked geographic variation in frequencies of nucleotide substitutions at the _pvmdr-1_ locus, putatively associated with drug resistance. Conclusion: These findings support the feasibility of genome-wide association studies in carefully selected populations of _P. vivax_, using relatively low densities of markers, but underscore the risk of false positives caused by population structure at both local and regional levels. See commentary: http://www.biomedcentral.com/1741-7007/8/90

    Epidemic clones, oceanic gene pools and eco-LD in the free living marine pathogen Vibrio parahaemolyticus

    We investigated global patterns of variation in 157 whole genome sequences of Vibrio parahaemolyticus, a free-living and seafood-associated marine bacterium. Pandemic clones, responsible for recent outbreaks of gastroenteritis in humans, have spread globally. However, there are oceanic gene pools, one located in the oceans surrounding Asia and another in the Mexican Gulf. Frequent recombination means that most isolates have acquired the genetic profile of their current location. We investigated the genetic structure in the Asian gene pool by calculating the effective population size in two different ways. Under standard neutral models, the two estimates should give similar answers, but we found a thirty-fold difference. We propose that this discrepancy is caused by the subdivision of the species into a hundred or more ecotypes which are maintained stably in the population. To investigate the genetic factors involved, we used 51 unrelated isolates to conduct a genome-wide scan for epistatically interacting loci. We found a single example of strong epistasis between distant genome regions. A majority of strains had a type VI secretion system associated with bacterial killing. The remaining strains had genes associated with biofilm formation and regulated by c-di-GMP signaling. All strains had one or the other of the two systems and no isolate had complete complements of both systems, although several strains had remnants. Further top-down analysis of patterns of linkage disequilibrium within frequently recombining species will allow a detailed understanding of how selection acts to structure the pattern of variation within natural bacterial populations.

    Assessing the Performance of the Haplotype Block Model of Linkage Disequilibrium

    Several recent studies have suggested that linkage disequilibrium (LD) in the human genome has a fundamentally “blocklike” structure. However, thus far there has been little formal assessment of how well the haplotype block model captures the underlying structure of LD. Here we propose quantitative criteria for assessing how blocklike LD is and apply these criteria to both real and simulated data. Analyses of several large data sets indicate that real data show a partial fit to the haplotype block model; some regions conform quite well, whereas others do not. Some improvement could be obtained by genotyping higher marker densities but not by increasing the number of samples. Nonetheless, although the real data are only moderately blocklike, our simulations indicate that, under a model of uniform recombination, the structure of LD would actually fit the block model much less well. Simulations of a model in which much of the recombination occurs in narrow hotspots provide a much better fit to the observed patterns of LD, suggesting that there is extensive fine-scale variation in recombination rates across the human genome