49 research outputs found
DNA sequence diversity and the efficiency of natural selection in animal mitochondrial DNA
Selection is expected to be more efficient in species that are more diverse because both the efficiency of natural selection and DNA sequence diversity are expected to depend upon the effective population size. We explore this relationship across a data set of 751 mammal species for which we have mitochondrial polymorphism data. We introduce a method by which we can examine the relationship between our measure of the efficiency of natural selection, the nonsynonymous relative to the synonymous nucleotide site diversity (πN/πS), and synonymous nucleotide diversity (πS), avoiding the statistical non-independence between the two quantities. We show that these two variables are strongly negatively and linearly correlated on a log scale. The slope is such that as πS doubles, πN/πS is reduced by 34%. We show that the slope of this relationship differs between the two phylogenetic groups for which we have the most data, rodents and bats, and that it also differs between species with high and low body mass, and between those with high and low mass-specific metabolic rate
Population genetic analysis of bi-allelic structural variants from low-coverage sequence data with an expectation-maximization algorithm
Background Population genetics and association studies usually rely on a set of known variable sites that are then genotyped in subsequent samples, because it is easier to genotype than to discover the variation. This is also true for structural variation detected from sequence data. However, the genotypes at known variable sites can only be inferred with uncertainty from low coverage data. Thus, statistical approaches that infer genotype likelihoods, test hypotheses, and estimate population parameters without requiring accurate genotypes are becoming popular. Unfortunately, the current implementations of these methods are intended to analyse only single nucleotide and short indel variation, and they usually assume that the two alleles in a heterozygous individual are sampled with equal probability. This is generally false for structural variants detected with paired ends or split reads. Therefore, the population genetics of structural variants cannot be studied, unless a painstaking and potentially biased genotyping is performed first. Results We present svgem, an expectation-maximization implementation to estimate allele and genotype frequencies, calculate genotype posterior probabilities, and test for Hardy-Weinberg equilibrium and for population differences, from the numbers of times the alleles are observed in each individual. Although applicable to single nucleotide variation, it aims at bi-allelic structural variation of any type, observed by either split reads or paired ends, with arbitrarily high allele sampling bias. We test svgem with simulated and real data from the 1000 Genomes Project. Conclusions svgem makes it possible to use low-coverage sequencing data to study the population distribution of structural variants without having to know their genotypes. Furthermore, this advance allows the combined analysis of structural and nucleotide variation within the same genotype-free statistical framework, thus preventing biases introduced by genotype imputation
Discovery and population genomics of structural variation in a songbird genus
Structural variation (SV) constitutes an important type of genetic mutations providing the raw material for evolution. Here, we uncover the genome-wide spectrum of intra- and interspecific SV segregating in natural populations of seven songbird species in the genus Corvus. Combining short-read (N = 127) and long-read re-sequencing (N = 31), as well as optical mapping (N = 16), we apply both assembly- and read mapping approaches to detect SV and characterize a total of 220,452 insertions, deletions and inversions. We exploit sampling across wide phylogenetic timescales to validate SV genotypes and assess the contribution of SV to evolutionary processes in an avian model of incipient speciation. We reveal an evolutionary young (~530,000 years) cis-acting 2.25-kb LTR retrotransposon insertion reducing expression of the NDP gene with consequences for premating isolation. Our results attest to the wealth and evolutionary significance of SV segregating in natural populations and highlight the need for reliable SV genotyping
Genome-wide fine-scale recombination rate variation in Drosophila melanogaster
Estimating fine-scale recombination maps of Drosophila from population genomic data is a challenging problem, in particular because of the high background recombination rate. In this paper, a new computational method is developed to address this challenge. Through an extensive simulation study, it is demonstrated that the method allows more accurate inference, and exhibits greater robustness to the effects of natural selection and noise, compared to a well-used previous method developed for studying fine-scale recombination rate variation in the human genome. As an application, a genome-wide analysis of genetic variation data is performed for two Drosophila melanogaster populations, one from North America (Raleigh, USA) and the other from Africa (Gikongoro, Rwanda). It is shown that fine-scale recombination rate variation is widespread throughout the D. melanogaster genome, across all chromosomes and in both populations. At the fine-scale, a conservative, systematic search for evidence of recombination hotspots suggests the existence of a handful of putative hotspots each with at least a tenfold increase in intensity over the background rate. A wavelet analysis is carried out to compare the estimated recombination maps in the two populations and to quantify the extent to which recombination rates are conserved. In general, similarity is observed at very broad scales, but substantial differences are seen at fine scales. The average recombination rate of the X chromosome appears to be higher than that of the autosomes in both populations, and this pattern is much more pronounced in the African population than the North American population. The correlation between various genomic features—including recombination rates, diversity, divergence, GC content, gene content, and sequence quality—is examined using the wavelet analysis, and it is shown that the most notable difference between D. melanogaster and humans is in the correlation between recombination and diversity
The determinants of genetic diversity in butterflies
This is the final version. Available on open access from Nature Research via the DOI in this recordUnder the neutral theory, genetic diversity is expected to increase with population size. While comparative analyses have
consistently failed to find strong relationships between census population size and genetic diversity, a recent study across
animals identified a strong correlation between propagule size and genetic diversity, suggesting that r-strategists that produce
many small offspring, have greater long-term population sizes. Here we compare genome-wide genetic diversity across 38
species of European butterflies (Papilionoidea), a group that shows little variation in reproductive strategy. We show that
genetic diversity across butterflies varies over an order of magnitude and that this variation cannot be explained by differences
in current abundance, propagule size, host or geographic range. Instead, neutral genetic diversity is negatively correlated
with body size and positively with the length of the genetic map. This suggests that genetic diversity is determined both by
differences in long-term population size and the elect of selection on linked sites.Biotechnology & Biological Sciences Research Council (BBSRC)European Research CouncilNatural Environmental Research Council (NERC)Institute of Evolutionary Biology at Edinburgh Universit