52 research outputs found
Bayesian inference of natural selection from allele frequency time series
The advent of accessible ancient DNA technology now allows the direct
ascertainment of allele frequencies in ancestral populations, thereby enabling
the use of allele frequency time series to detect and estimate natural
selection. Such direct observations of allele frequency dynamics are expected
to be more powerful than inferences made using patterns of linked neutral
variation obtained from modern individuals. We develop a Bayesian method to
make use of allele frequency time series data and infer the parameters of
general diploid selection, along with allele age, in non-equilibrium
populations. We introduce a novel path augmentation approach, in which we use
Markov chain Monte Carlo to integrate over the space of allele frequency
trajectories consistent with the observed data. Using simulations, we show that
this approach has good power to estimate selection coefficients and allele age.
Moreover, when applying our approach to data on horse coat color, we find that
ignoring a relevant demographic history can significantly bias the results of
inference. Our approach is made available in a C++ software package.
Comment: 27 page
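
To make the data type in this abstract concrete, here is a minimal sketch of the forward model that produces an allele frequency time series: a Wright-Fisher population under general diploid selection with a changing population size. It is not the authors' path augmentation MCMC or their C++ package. The fitness parameterization (1, 1 + h*s, 1 + s), the function names, and the sampling scheme are all assumptions made for illustration.

```python
import random


def next_frequency(x, s, h):
    """Expected frequency of allele A after one generation of selection.

    Assumed genotype fitnesses: aa = 1, Aa = 1 + h*s, AA = 1 + s.
    """
    w_AA, w_Aa, w_aa = 1.0 + s, 1.0 + h * s, 1.0
    mean_w = x * x * w_AA + 2 * x * (1 - x) * w_Aa + (1 - x) ** 2 * w_aa
    return (x * x * w_AA + x * (1 - x) * w_Aa) / mean_w


def simulate_trajectory(x0, s, h, pop_sizes, rng):
    """Simulate allele frequencies with binomial drift.

    pop_sizes holds one diploid population size per generation, so a
    non-constant demographic history is just a non-constant list.
    """
    trajectory = [x0]
    x = x0
    for n in pop_sizes:
        p = next_frequency(x, s, h)
        # Binomial(2n, p) draw, written with the standard library only.
        copies = sum(1 for _ in range(2 * n) if rng.random() < p)
        x = copies / (2 * n)
        trajectory.append(x)
    return trajectory


def sample_time_series(trajectory, times, sample_size, rng):
    """Draw (generation, derived-allele count, sample size) observations."""
    return [
        (t, sum(1 for _ in range(sample_size) if rng.random() < trajectory[t]), sample_size)
        for t in times
    ]


if __name__ == "__main__":
    rng = random.Random(1)
    sizes = [200] * 50 + [50] * 20 + [400] * 30  # hypothetical bottleneck
    traj = simulate_trajectory(0.1, 0.05, 0.5, sizes, rng)
    print(sample_time_series(traj, [0, 25, 60, 100], 20, rng))
```

An inference method of the kind described would treat the unobserved trajectory as a latent variable and the sampled counts as data; the sketch covers only the simulation side.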
Inferring evolutionary histories of pathway regulation from transcriptional profiling data
One of the outstanding challenges in comparative genomics is to interpret the
evolutionary importance of regulatory variation between species. Rigorous
molecular evolution-based methods to infer evidence for natural selection from
expression data are at a premium in the field, and to date, phylogenetic
approaches have not been well-suited to address the question in the small sets
of taxa profiled in standard surveys of gene expression. We have developed a
strategy to infer evolutionary histories from expression profiles by analyzing
suites of genes of common function. In a manner conceptually similar to
molecular evolution models in which the evolutionary rates of DNA sequence at
multiple loci follow a gamma distribution, we modeled expression of the genes
of an a priori-defined pathway with rates drawn from an inverse gamma
distribution. We then developed a fitting strategy to infer the parameters of
this distribution from expression measurements, and to identify gene groups
whose expression patterns were consistent with evolutionary constraint or rapid
evolution in particular species. Simulations confirmed the power and accuracy
of our inference method. As an experimental testbed for our approach, we
generated and analyzed transcriptional profiles of four Saccharomyces
yeasts. The results revealed pathways with signatures of constrained and
accelerated regulatory evolution in individual yeasts and across the phylogeny,
highlighting the prevalence of pathway-level expression change during the
divergence of yeast species. We anticipate that our pathway-based phylogenetic
approach will be of broad utility in the search to understand the evolutionary
relevance of regulatory change.
Comment: 30 pages, 12 figures, 2 tables; contact authors for supplementary table
The population genomic legacy of the second plague pandemic
Human populations have been shaped by catastrophes that may have left long-lasting signatures in their genomes. One notable example is the second plague pandemic that entered Europe in ca. 1347 CE and repeatedly returned for over 300 years, with typical village and town mortality estimated at 10%–40%. It is assumed that this high mortality affected the gene pools of these populations. First, local population crashes reduced genetic diversity. Second, a change in frequency is expected for sequence variants that may have affected survival or susceptibility to the etiologic agent (Yersinia pestis). Third, mass mortality might alter the local gene pools through its impact on subsequent migration patterns. We explored these factors using the Norwegian city of Trondheim as a model, by sequencing 54 genomes spanning three time periods: (1) prior to the plague striking Trondheim in 1349 CE, (2) the 17th–19th century, and (3) the present. We find that the pandemic period shaped the gene pool by reducing long-distance immigration, in particular from the British Isles, and inducing a bottleneck that reduced genetic diversity. Although we also observe an excess of large FST values at multiple loci in the genome, these are shaped by reference biases introduced by mapping our relatively low-coverage, degraded DNA to the reference genome. This implies that attempts to detect selection using ancient DNA (aDNA) datasets that vary by read length and depth of sequencing coverage may be particularly challenging until methods have been developed to account for the impact of differential reference bias on test statistics.
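
The reference-bias point can be illustrated with a toy calculation. The sketch below computes a simple two-population FST from allele frequencies (heterozygosity-based, equal weighting) and shows that shifting the reference-allele frequency in one sample inflates FST even when the true frequencies are identical. The estimator, the bias model, and all function names are assumptions made for illustration; they are not the statistics or pipeline used in the study.

```python
def heterozygosity(p):
    """Expected heterozygosity at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1, p2):
    """FST = (H_T - H_S) / H_T for two equally weighted populations."""
    p_bar = (p1 + p2) / 2.0
    h_total = heterozygosity(p_bar)
    if h_total == 0.0:
        return 0.0  # site is monomorphic overall; define FST as 0
    h_within = (heterozygosity(p1) + heterozygosity(p2)) / 2.0
    return (h_total - h_within) / h_total


def apply_reference_bias(p_ref, bias):
    """Toy bias model (an assumption): a fraction `bias` of non-reference
    observations is lost, pushing the reference-allele frequency toward 1."""
    return p_ref + bias * (1.0 - p_ref)


if __name__ == "__main__":
    true_p = 0.5  # same reference-allele frequency in both samples
    for bias in (0.0, 0.1, 0.3):
        observed = apply_reference_bias(true_p, bias)
        print(bias, round(fst_two_populations(true_p, observed), 4))
```

With no bias the two samples give an FST of zero; any nonzero bias in only one sample yields a positive value, which is the kind of spurious signal the abstract warns about.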
A global catalog of whole-genome diversity from 233 primate species.
The rich diversity of morphology and behavior displayed across primate species provides an informative context in which to study the impact of genomic diversity on fundamental biological processes. Analysis of that diversity provides insight into long-standing questions in evolutionary and conservation biology and is urgent given the severe threats these species are facing. Here, we present high-coverage whole-genome data from 233 primate species representing 86% of genera and all 16 families. This dataset was used, together with fossil calibration, to create a nuclear DNA phylogeny and to reassess evolutionary divergence times among primate clades. We found within-species genetic diversity across families and geographic regions to be associated with climate and sociality, but not with extinction risk. Furthermore, mutation rates differ across species, potentially influenced by effective population sizes. Lastly, we identified extensive recurrence of missense mutations previously thought to be human specific. This study will open a wide range of research avenues for future primate genomic research.
The landscape of tolerated genetic variation in humans and primates.
Personalized genome sequencing has revealed millions of genetic differences between individuals, but our understanding of their clinical relevance remains largely incomplete. To systematically decipher the effects of human genetic variants, we obtained whole-genome sequencing data for 809 individuals from 233 primate species and identified 4.3 million common protein-altering variants with orthologs in humans. We show that these variants can be inferred to have nondeleterious effects in humans based on their presence at high allele frequencies in other primate populations. We use this resource to classify 6% of all possible human protein-altering variants as likely benign and impute the pathogenicity of the remaining 94% of variants with deep learning, achieving state-of-the-art accuracy for diagnosing pathogenic variants in patients with genetic diseases.
Analyses of pig genomes provide insight into porcine demography and evolution
For 10,000 years pigs and humans have shared a close and complex relationship. From domestication to modern breeding practices, humans have shaped the genomes of domestic pigs. Here we present the assembly and analysis of the genome sequence of a female domestic Duroc pig (Sus scrofa) and a comparison with the genomes of wild and domestic pigs from Europe and Asia. Wild pigs emerged in South East Asia and subsequently spread across Eurasia. Our results reveal a deep phylogenetic split between European and Asian wild boars ∼1 million years ago, and a selective sweep analysis indicates selection on genes involved in RNA processing and regulation. Genes associated with immune response and olfaction exhibit fast evolution. Pigs have the largest repertoire of functional olfactory receptor genes, reflecting the importance of smell in this scavenging animal. The pig genome sequence provides an important resource for further improvements of this important livestock species, and our identification of many putative disease-causing variants extends the potential of the pig as a biomedical model.